For the past 90 years, election prognosticators have had one tool in their toolbox: surveys. After the third consecutive American presidential election in which this methodology underestimated support for Donald Trump, there are reasons to doubt it makes sense any more. Calling a small sample of people and asking them what they are going to do seems anachronistic in a world in which tech behemoths mine billions of online data points to predict consumer actions—behemoths that often know consumers better than they know themselves.
Consider the problems with polls. First, people do not answer them. Survey response rates have dropped to as low as 2%.
Second, people mess with them. Younger people are particularly likely to enjoy giving faulty answers. One academic study found a correlation in survey data between being adopted and various problematic behaviours; when it was found that 19% of those who said they were adopted were just joking, the study was retracted.
Third, people lie to pollsters to protect their self-image, something called social-desirability bias. The level of deception can be shocking. Research shows that roughly four in ten people who did not vote in an election will report in surveys that they did. People are also known to dramatically over-report how much sex they have, their tendency to give to charity and their academic accomplishments.
These non-respondents, tricksters and liars probably all played into Mr Trump’s better-than-expected performance. There is some evidence suggesting that his supporters are less likely to answer surveys, more likely to mess with them and less likely to admit in them that they support him.
Does this make surveys useless for predicting what will happen in an election? Not entirely. Noisy and flawed as they can be, surveys do contain some useful information, particularly in helping us understand where support for candidates might change. I compared Mr Trump’s actual performance to that predicted by FiveThirtyEight, a poll aggregator. About half of the state-level change in support for Mr Trump between 2020 and 2024 could be predicted by surveys. Surveys correctly noted that his vote share was going to rocket in Kentucky, New York and Massachusetts. And there are trends that were simply overlooked. Some of the seemingly surprising patterns of the election, such as Mr Trump’s strong performance in highly Hispanic districts, would have been less surprising to those who were paying close attention to the surveys that largely predicted that shift.
That said, polls clearly struggle to predict fully what will happen in an election. And, in an era in which internet-trawling humanity produces more than 400 terabytes of data every day, it is increasingly odd to pin hopes on the responses of a few thousand people who happen to pick up the phone, and who may or may not be honest with pollsters, or with themselves.
For the past 15 years, I have been studying Google searches. I and others have found that search data are often far more predictive than surveys. Google searches for “vote” and “voting” can predict who will actually turn out to vote, not just those who say they will—just as searches related to suicide can predict suicide rates better than survey reports of suicidal ideation. Google searches revealed where racism was highest in America and predicted the early rise of Mr Trump. And in April 2020 I used them to discover a new symptom of covid-19: eye pain, a finding confirmed months later by health researchers.
There is already some evidence that search data could provide rich predictive power around elections, and not necessarily by making straightforward queries. Stuart Gabriel of UCLA and I found, for example, that the order in which candidates are searched on Google, across many presidential cycles, is itself an indicator. People who search “Trump Harris debate” are more likely to support Mr Trump than people who search “Harris Trump debate”. What is most fascinating about this indicator is that, again, the data can reveal something that the searcher may not even realise. Ostensibly undecided voters may signal their support based on which candidate they include first when searching.
In Mississippi, a Trump stronghold, more than 65% of searches with both candidates’ names included “Trump” first, the highest share in any state. In Vermont, a Harris stronghold, 58% of searches with both candidates’ names included “Harris” first, also the highest share in any state. Overall, 24 of the 26 states most likely to include “Trump” first in their two-name Google searches went for Mr Trump. Nineteen of the 25 states most likely to include “Harris” first in theirs went for Ms Harris. And we have seen four elections in a row for which adding this indicator would have improved state-level predictions compared with just averaging polls.
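For readers who want to see the arithmetic behind such a figure, here is a minimal sketch in Python of how a name-order share could be computed from a list of search queries. It is an illustration only, not the method used in the research described above: the function name `first_named_share` and the sample queries are hypothetical, and real search data are available only in aggregated form.

```python
def first_named_share(queries, a="trump", b="harris"):
    """Return the share of two-name queries that mention `a` before `b`.

    Queries that name only one candidate, or neither, are ignored.
    Returns None when no query names both.
    """
    a_first = 0
    both = 0
    for query in queries:
        text = query.lower()
        pos_a = text.find(a)
        pos_b = text.find(b)
        if pos_a == -1 or pos_b == -1:
            continue  # not a two-name search
        both += 1
        if pos_a < pos_b:
            a_first += 1
    return a_first / both if both else None


# Made-up queries for illustration only; these are not real search data.
sample = [
    "Trump Harris debate",
    "Harris Trump debate",
    "trump harris polls",
    "Harris rally schedule",  # names one candidate, so it is skipped
]
share = first_named_share(sample)
print(f"Share of two-name searches naming Trump first: {share:.0%}")
```

Computed state by state, a share like this could then sit alongside a polling average as one more input to a forecast.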
We are still early in the study of how online data can help understand and predict human behaviour. But it is abundantly clear that 2024’s elections will be among the last in which surveys alone are used to predict the results.
Seth Stephens-Davidowitz is a data scientist and author who formerly worked at Google.
© 2025, The Economist Newspaper Limited. All rights reserved. From The Economist, published under licence. The original content can be found on www.economist.com