More polls have come in showing Trump close to or even ahead of Hillary Clinton. Are they to be believed? Rather than breaking down each poll individually as in my last post, I thought I’d put together a rubric for judging all future presidential polls.
Minority Vote Share
The easiest way to spot a suspicious poll is to check the minority vote. Nate Cohn points out a particularly egregious example:
Rather conservatively, Hillary Clinton should be getting 90% of the black vote and 75% of the Hispanic vote. Polls that show any more of the minority vote going to Trump are inflating his performance. That’s not necessarily a sign of fraud; more likely it reflects the difficulty of getting high-quality samples of demographic subgroups, especially non-English-speaking Hispanics.
Composition of the Electorate* (See Update for Corrections)
Polls generally don’t break down what portion of the electorate each demographic group makes up. However, you can infer it by comparing a poll’s demographic breakdown with its topline results.
In 2012, President Obama lost white voters by 20 points but beat Mitt Romney by 4 points overall on the back of 82% of the nonwhite vote. To win, Trump would almost certainly have to beat Romney’s margin among white voters significantly. While some may question the presumption that Clinton will motivate as many minority voters to turn out as the so-called Obama coalition did in 2012, there are a lot of factors in Clinton’s favor. For one, the minority share of the electorate grows by about 2 percentage points every four years due to higher population growth relative to whites. Secondly, there’s a good chance that Hispanics will increase their Democratic vote share compared to 2012 in response to the unique vulgarity of Trump’s call for mass deportations.
Here’s an example of a poll with a questionable racial composition, showing Trump with a 2-point advantage over Clinton despite a lower share of the white vote than Romney.
|Source|White Vote|Overall Result|
|2012 (Roper Center)|R 59, D 39|D+4|
|2016 (NBC/WSJ Poll)|R 57, D 33|R+2|
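To make the inference concrete, here is a rough sketch of the arithmetic behind backing out the white share of the electorate from the two sets of numbers above. The topline margin is treated as a weighted average of the white and nonwhite margins. Two things are my assumptions rather than anything the polls report: that the nonwhite vote splits 82-18 (ignoring third parties), and that the 2016 nonwhite margin matches 2012. The function name `implied_white_share` is just for illustration.

```python
def implied_white_share(white_margin, nonwhite_margin, overall_margin):
    """Solve overall = w * white + (1 - w) * nonwhite for w.

    All margins are Democratic minus Republican, in percentage points.
    Returns w, the implied white share of the electorate (0 to 1).
    """
    if white_margin == nonwhite_margin:
        raise ValueError("margins must differ to identify the white share")
    return (overall_margin - nonwhite_margin) / (white_margin - nonwhite_margin)


# ASSUMPTION: nonwhite voters split 82-18, with no third-party vote,
# and the same nonwhite margin holds in 2016. Neither is reported by the polls.
NONWHITE_MARGIN = 82 - 18

# 2012: whites R 59, D 39; overall D+4
share_2012 = implied_white_share(39 - 59, NONWHITE_MARGIN, 4)

# 2016 NBC/WSJ: whites R 57, D 33; overall R+2
share_2016 = implied_white_share(33 - 57, NONWHITE_MARGIN, -2)

print(f"2012 implied white share: {share_2012:.1%}")
print(f"2016 implied white share: {share_2016:.1%}")
```

Under these assumptions the 2012 numbers imply a white share of about 71%, while the NBC/WSJ numbers imply 75%. The sketch ignores undecided voters, so treat the output as a sanity check, not an estimate.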
A pollster might defend their racial composition of the electorate by claiming that Trump’s candidacy is so unique that the normal demographics of race don’t apply. Maybe Trump’s racial dog whistles will bring out racist whites who usually don’t vote, decreasing the minority percentage.
Luckily, the evidence so far points against that interpretation. Despite Trump’s claims to the contrary, he does not seem to be expanding the party. While turnout in the Republican primaries was much higher than in past years, almost all the new primary voters voted in past general elections. There’s also no relationship between primary turnout and general election results. Plus, if the racist vote share does increase, minority turnout will likely go up in response as well.
The Center for American Progress’s report on the demographics of the 2012 election finds that undercounting the minority vote share is a consistent problem for pollsters:
“Prior to the election, many prominent national surveys were drawing likely voter samples that projected the minority share of voters to remain static or even decline relative to 2008. Gallup estimated minority voters around 22 percent, Washington Post/ABC around 23 percent, and the Pew Research Center around 24 percent. Virtually no pollsters had the minority share reaching the actual 28 percent.”
In 2016, it seems that many polling outfits haven’t learned their lesson.
Another way of looking at the composition of the electorate is by party identification:
This is not appropriate. The composition of the electorate by party ID varies based on the candidates in the race and is not predictably fixed year to year like race. Given the pollster is acting in good faith, claiming a poll is biased by the Party ID structure is spurious.
It’s also important to note that demographic variables aren’t independent of one another. For instance, if the assumed racial composition of the electorate seems wrong, this will also be reflected in Party ID. You cannot add up multiple overlapping errors in a poll’s electorate composition.
Gender will be particularly salient in 2016 since we have one female candidate and one misogynist candidate. On one hand, it’s likely that a female presidential candidate will suffer at the polls due to sexism. On the other hand, it’s likely that Trump will suffer for comments like: “Women–you have to treat them like shit.”
How these countervailing factors will balance out largely depends on the respective strength of partisanship, sexism, and anti-sexism. Republicans will have to weigh their partisan loyalty against their desire to condemn sexism. Sexist Democrats will have to weigh their partisan affiliation against their desire not to vote for a woman. (This is not an election year that will make you feel good about American politics.)
In 2012, Obama won 55% of women and 45% of men. Thus far, it appears that the partisan and misogynist forces may have a slight advantage over the anti-sexist forces. The latest Washington Post poll (Trump +2) gives Trump 57% of the male vote and 38% of the female vote, with 8.5% undecided.
As the Democratic primary drags on, many Sanders supporters refuse to support Clinton in a general election. All the evidence from past divisive primaries suggests that Sanders voters will eventually line up behind Clinton. But that is not guaranteed, and there is some suggestive evidence that this time is different.
Tracking the opinions of Sanders’ supporters in future polls will be critical.
One way of tracking this would be to look at the portion of Democrats voting for Clinton. But Bernie’s unique electoral coalition makes that method misleading. Many Bernie supporters are independents, hence Sanders’ poor performance in closed primaries.
To properly assess how Sanders’ supporters are voting, the poll must specifically break down that group of voters. The write-up of the latest NBC/WSJ poll illustrates the point:
“While Democrats are backing Clinton by an 83 percent-to-9 percent clip, just 66 percent of Democratic primary voters preferring Sanders support Clinton in a matchup against Trump.”
As the Republican race has been over for a month now, Republicans have had time to unite around Trump. The continuing contentious Democratic primary will advantage Trump in the weeks to come.
Update 6/27 (Original 5/23)
Demographic Breakdowns Based on Exit Polls are Bad
Unfortunately, some evidence has come to light that renders some of my analysis here incorrect. Nate Cohn has written a long feature in the NYT demonstrating that the white vote in 2012 was larger than what was found in exit polls.
From the article:
New analysis by The Upshot shows that millions more white, older working-class voters went to the polls in 2012 than was found by exit polls on Election Day. This raises the prospect that Mr. Trump has a larger pool of potential voters than generally believed.
The wider path may help explain why Mr. Trump is competitive in early general election surveys against Hillary Clinton.
Exit polls are only one source of information about the racial composition of the electorate, and they are a relatively bad one. The census (which asks if you voted), government voter file data, and post-election polls are better sources of information for finding out who voted.
Why are exit polls bad? They use a cluster sampling technique that emphasizes partisan accuracy, not racial accuracy. Exit polls select precincts that are representative of a state based on party ID, attempting to get some precincts that are heavily Democratic, some that are heavily Republican, and some that swing between the two. Nothing is done to ensure these precincts are also representative of the state’s racial composition.
Unfortunately, precincts tend to be very biased representations of the electorate. People tend to live near others of their own race, for reasons that arise naturally (check out the Schelling model), from people’s racism and classism, and from U.S. government policy. As a result, voter precincts are very homogeneous. Getting an accurate sample would require targeting precincts on the basis of race or sampling a very large number of precincts. (Perhaps it can’t even be done by random sampling.) But the status quo of exit polls just tries to ensure the accuracy of partisanship.
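A toy simulation can show why a handful of homogeneous precincts gives a shaky read on racial composition. This is only a sketch under made-up assumptions, not a model of how real exit polls work: every precinct is the same size, each is either 90% or 4% minority, and the precincts are picked at random rather than by party ID. All names and numbers here are hypothetical. It compares the spread of the estimated minority share from 20 precincts of 50 interviews each against the spread from 1,000 voters drawn from the whole state.

```python
import random
import statistics


def make_precincts(rng, n_precincts=1000, minority_heavy_prob=0.28):
    """Hypothetical, highly segregated state: each precinct's minority share is 90% or 4%."""
    return [0.90 if rng.random() < minority_heavy_prob else 0.04
            for _ in range(n_precincts)]


def cluster_estimate(rng, precincts, n_clusters=20, per_cluster=50):
    """Estimate the minority share by interviewing voters in a few chosen precincts."""
    chosen = rng.sample(precincts, n_clusters)
    hits = sum(rng.random() < share for share in chosen for _ in range(per_cluster))
    return hits / (n_clusters * per_cluster)


def simple_random_estimate(rng, precincts, n_voters=1000):
    """Estimate the minority share from voters drawn across the whole state.

    Precincts are assumed equal-sized, so picking a random precinct and then
    a random voter inside it is equivalent to picking a random voter.
    """
    hits = sum(rng.random() < rng.choice(precincts) for _ in range(n_voters))
    return hits / n_voters


def compare(trials=200, seed=0):
    """Return (spread of cluster estimates, spread of simple random estimates)."""
    rng = random.Random(seed)
    precincts = make_precincts(rng)
    cluster = [cluster_estimate(rng, precincts) for _ in range(trials)]
    simple = [simple_random_estimate(rng, precincts) for _ in range(trials)]
    return statistics.stdev(cluster), statistics.stdev(simple)


cluster_sd, simple_sd = compare()
print(f"cluster sample spread: {cluster_sd:.3f}")
print(f"simple random sample spread: {simple_sd:.3f}")
```

With the same number of interviews, the cluster estimates should swing far more from sample to sample. Note that this toy shows noise, not a systematic lean in one direction; the direction of the error in real exit polls is a finding from the NYT analysis, not something this sketch demonstrates.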
The census, voter files, and post-election polling come with their own problems, but the NYT models show that exit polls almost certainly overstate the size of the minority electorate. Therefore, my analysis of the racial composition of the electorate overstates the amount of possible bias for Clinton.
The difference isn’t huge, but it is significant. Rather than 39% of whites voting for Democrats in 2012, the Upshot estimates that number at 41%. Furthermore, whites likely made up a larger portion of the electorate than the exit poll estimate of 72%, but the Upshot doesn’t make that number easy to find.
The bottom line is that the white portion of the electorate is likely higher than numbers based on exit polls suggest, and 2016 general election polls are probably not showing the widespread bias that the Center for American Progress (and I) thought they were. Since all demographic information on the composition of the electorate comes with problems, it doesn’t seem appropriate to judge polls for bias on this compositional criterion. There’s just too much uncertainty.
Trump tries to adjust poll based on Party ID
What I wrote a month ago:
“The composition of the electorate by party ID varies based on the candidates in the race and is not predictably fixed year to year like race. Given the pollster is acting in good faith, claiming a poll is biased by the Party ID structure is spurious.”
On the other hand, Trump seems to believe the pollster did violate my assumption:
Draw your own conclusions.
Couldn’t help but notice that the second tweet speaks of “poles.” I’m an absolutely atrocious speller and even I can do better than that.