A few months ago I wrote about Prof. Patrick Sturgis’ Cathie Marsh Lecture, which dealt with the same topic as the then-forthcoming preliminary findings of the Polling Inquiry. A couple of days later I went to the release of those preliminary findings, at which Prof. Sturgis again featured heavily (being, as he is, the Chair of the Inquiry), and this is my selective summary of the key points as well as some of my own thoughts. I thought I’d publish them now because it’s almost a year since the general election and because we have elections and a referendum (plus, naturally, accompanying polling) approaching. If you’re interested in more detail, then you can read the full Polling Inquiry report here.
The first thing to note from the release of the preliminary findings is that the extent to which the polling companies got their prediction of the general election result wrong was not especially out of line with the normal magnitude of error in estimations of results based on polls. True, the underestimation of the Conservative share of the vote seems to be getting worse over time, which is a matter for the attention of the polling companies, but it was not dramatically worse in 2015 than it had been for other elections. Thus, since the results weren’t especially bad, the problem lay in part with the story that was told. In other words, this wasn’t just a problem of numbers but also one of narrative. If the results had been equally inaccurate but predicted a Conservative victory, then the hot water that the polling companies found themselves in after election day would have been decidedly lukewarm. This is something that the polling companies are aware of but, I presume, they’re also aware that it might appear a bit churlish for them to harp on about the problem being the story rather than their numbers (which had demonstrable problems). Nevertheless, it’s interesting that one of the first points to emerge from the preliminary findings of an inquiry into the error in the 2015 polls is that it wasn’t that unusual.
Error there certainly was, though, so it was worth the Inquiry moving on to consider what might have caused it. The first step was to list the things that they’re pretty confident weren’t major contributing factors, which eliminated postal voting, voter registration, overseas voters, question wording or framing, differential turnout misreporting, and mode of interview. The first three items on that list are no great surprise but the latter three might have been expected to be more of an issue. The amount of academic research on how to word questions in surveys is indicative of how important it can be. However, the Inquiry found no evidence that asking people how they’d vote in different ways made anything more than a modest systematic difference to the results. Of more relevance outside polling and academic circles, the elimination of differential turnout misreporting as an explanation rules out one of the more public arguments made after May the 7th. In other words, it seems that all the talk of ‘lazy Labour voters’ (who said they’d turn up on polling day and cast their votes for Ed Miliband’s party, but ended up not doing so) was wide of the mark. Similarly, and finally, the oft-cited issue of whether you survey people by phone or over the internet seems not to have made a difference in 2015.
So, if it wasn’t any of the above stuff, what the heck was going on? Well, it seems that the polling companies had too many Labour voters in their samples. This, as Prof. Sturgis was careful to joke, might seem like a rather obvious answer: “Why did you think Labour were going to win the general election?” “Um, because we asked lots of Labour voters.” Of course, there was much data presented that supported this conclusion. In particular, analyses of British Election Study (BES) and British Social Attitudes (BSA) survey data, which resulted from something more akin to random probability sampling, revealed predicted results that were much closer to the actual election outcome. Indeed, when only the keen early respondents in the BES and BSA data (who are more like the (keen) respondents to online quota sample surveys) were analysed, the predicted election result was closer to the figures proffered by the polling companies. In particular, the polling companies appear to have had too many people at the younger end of the oldest age group in their samples. Because age is positively related to turning out, it seems that these ‘younger older’ people meant that the polling companies underestimated how many older people would turn out to vote and thus underestimated the Conservative vote (because older people are more likely than younger people to vote Conservative). Similarly, and at the other end of the age spectrum, the polling companies also appear to have had too many keen younger voters in their samples. This led to an overestimation of the number of younger people who would turn out to vote and thus an overestimation of the Labour vote (because younger people are more likely than older people to vote Labour).
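That mechanism can be sketched with some toy arithmetic. The numbers below are illustrative assumptions of my own, not figures from the Inquiry or the BES/BSA data: two stylised age groups with made-up turnout and party-support rates, where skewing the sample towards keen younger respondents pulls the turnout-weighted prediction towards Labour.

```python
# Toy electorate: two age groups with invented population shares,
# turnout rates, and Con/Lab support among those who vote.
groups = {
    # name: (population share, turnout rate, Con share, Lab share among voters)
    "younger": (0.30, 0.45, 0.30, 0.45),
    "older":   (0.70, 0.75, 0.45, 0.30),
}

def predicted_shares(composition):
    """Turnout-weighted Con and Lab shares, given each group's weight in the sample."""
    voters = {g: composition[g] * groups[g][1] for g in groups}
    total = sum(voters.values())
    con = sum(voters[g] * groups[g][2] for g in groups) / total
    lab = sum(voters[g] * groups[g][3] for g in groups) / total
    return round(con, 3), round(lab, 3)

# A representative sample reproduces the population composition...
representative = predicted_shares({"younger": 0.30, "older": 0.70})
# ...but a sample with too many keen younger respondents (and too few
# genuinely older ones) lowers predicted Con share and raises Lab share.
skewed = predicted_shares({"younger": 0.45, "older": 0.55})
print(representative, skewed)
```

With these made-up rates, the skewed composition narrows the predicted Conservative lead by a few points without changing a single respondent's stated vote intention, which is roughly the shape of the 2015 miss.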
With the main cause of the polling miss identified, the obvious next step is to consider what can be done about it. There are two options: improve the samples (to make them more representative of the population) or improve the weighting (especially but not only in relation to predicting likelihood of turning out). Those options aren’t mutually exclusive and, as an example, YouGov (who I’ve worked for in the past and who have co-funded my PhD research) have made it clear that they will be addressing both of those points. There are, of course, multiple ways to improve samples and weighting and, helpfully, the event hinted at some tentative recommendations. These suggested that although changes to the methodologies used by the polling companies will be needed to improve their samples, it will not be necessary for them to move to random probability sampling (which is appropriate for academic research but not necessarily for fast-turnaround polling). There may also be recommendations relating to the British Polling Council’s regulations on transparency and to the reporting and interpretation of polls. Crucially, it was emphasised that there is no silver bullet to eliminate this problem; it is only possible to reduce, rather than remove, the risk of future polling misses.
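For readers unfamiliar with what the weighting half of that trade-off involves, here is a minimal sketch of cell weighting: each respondent's group is up- or down-weighted by the ratio of its known population share to its share of the sample. The age bands and targets are invented for the example; polling companies typically weight on several variables at once (rim weighting, or "raking") rather than this single-variable version.

```python
from collections import Counter

# Invented population targets for three age bands (e.g. from census data).
population_targets = {"18-34": 0.28, "35-54": 0.34, "55+": 0.38}

# A toy sample of 100 respondents that over-represents younger people.
sample = ["18-34"] * 40 + ["35-54"] * 35 + ["55+"] * 25

counts = Counter(sample)
n = len(sample)

# Weight for each group = population share / sample share,
# so under-represented groups get weights above 1.
weights = {g: population_targets[g] / (counts[g] / n) for g in counts}

# After weighting, the sample's composition matches the targets exactly.
weighted_share = {g: counts[g] * weights[g] / n for g in counts}
```

The catch, and the reason weighting alone can't fix 2015-style misses, is that the weighted-up older respondents still have to stand in for the older people who never joined the sample; if the ones who did join are atypical (the "younger older" problem above), matching the demographic targets doesn't make the estimate right.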
The lack of a quick fix was a nice note to end on and I reckon the polling companies will be working on improving their results via as many (financially viable) routes as are available to them. From my perspective, the emphasis should very much be on improving the samples rather than focusing on improving weighting. This is for both a technical reason and a principled reason. In the former case, weighting of results should only ever be the last step in a process that is designed to make results as representative as possible before then. In other words, weighting should be a means to tweak results rather than to make them significantly more representative. Following this logic, it is fair to argue that the goal should be to make the results as accurate as possible as early in the process as possible. This points towards the recruitment of more representative panels of respondents, and not just in terms of demographics (though they are important). This, in turn, leads me to the point of principle: in so far as polls profess to give us an insight into the views of the public, they should be based on samples that represent that public as accurately as possible. In particular, this means that there need to be a lot more people who are less politically engaged in the panels that polling samples are drawn from. Of course polling companies are commercial bodies and do not have unlimited resources, but I think this is something that they should prioritise.
Focussing on recruiting less politically engaged people to polling samples could even bear financial fruit in the future, not only by making polling results less liable to be wide of the mark but also by creating the possibility of putting questions to samples of such people. It’s difficult to recruit less politically engaged people to answer polls but I’m not yet at the stage of thinking it should be given up on. That said, there’s also a risk that asking those people lots of questions about politics could rapidly transform them into being more politically engaged. Clearly this is not the purpose of polling companies, and it would place a burden on them to continue recruiting less politically engaged people, but it’s hardly a negative externality. This brings me to the concept of the public good. I think it’s useful to have information about what people in the population think about politics and the government available more than once every five years. Yes, polls can be misused and abused. Yes, politicians can pay too much attention to them. Yes, they can become the focus of too much media coverage (which can risk presenting them as absolute truth). However, I also think that they can provide a complement to other worthwhile expressions of public opinion such as petitions, letter-writing, public meetings, protests, strikes, and direct actions. Crucially, the more they provide an outlet for people who are less inclined to do any of those things (i.e. less politically engaged people) the more they are a complement to those other means of expression. Thus, to my mind, the prize is not just polls that tell us something about public opinion, but polls that can also offer an outlet to those who might not otherwise say anything.
This puts me in mind of my travels around the U.S. in 2004. I came back convinced that the Kerry-Edwards ticket was a nigh-on guaranteed victory. It was only after George W. Bush won his second term that I realised that my travels around the U.S. had only been to New York City, Chicago, San Francisco, and Los Angeles. Hardly Republican strongholds.
The issue of whether polls are true taps into a wider, and fundamental, academic debate in the social sciences (and, no doubt, other fields) about what, if anything, we can know. I obviously do not, and cannot, resolve that debate here but I’m largely of the opinion that polls tell us something true while being very far from telling us the whole truth. This means we need to be very careful, every time we see a poll, to establish what the something true that it’s telling us is, and be cautious not to generalise beyond that.