Pollsters Ignored Their “Check Assumptions” Lights

Back in 2000 and again in 2004, I enjoyed a small piece of influence through political opinion poll analysis. Statistics is an intriguing science, all the more because it tries to quantify and predict human behavior. But that same human behavior also skews how people think, analysts included, and in 2008 and 2012 it caused me to miss important trends in American politics. I was embarrassingly wrong in predicting those Presidential elections, especially in missing the energy of Obama’s 2008 run. So I backed off, paid more attention to my regular job and family, and paid less attention to statistics. Others took over as the poll mavens, especially Nate Silver, who turned his devotion to baseball statistics into political prominence alongside Obama’s success. But Silver made the same mistake I did, and in his case the embarrassment is greater because, as a professional statistician, he really ought to have known better. Silver let his enthusiasm for Democrat opinion cause him to ignore warning signs until it was too late to avoid a face plant.

Let’s have a quick review of how polls saw the 2016 Presidential Election, and also how polls work, and finally how predictive analysis is created.

Hillary Clinton announced her decision to run for the White House on April 12, 2015. This is important because Clinton already enjoyed significant name recognition, and with the roles of First Lady, Senator and Secretary of State on her resume, she would start as an obvious front-runner for the Democrats’ nomination. Nate Silver gave her a 59.9% chance of winning the party nomination at the beginning (I’m using Silver here for two reasons – first, his projections are built from aggregates of major national polls, and second, Silver was the most prominent poll analyst quoted in the media). She enjoyed media support through the end of 2015 as the presumptive front-runner, but by the end of October 2015 Clinton’s lead over Sanders in Silver’s chart was down to 46.8% to 26.1%, notable not for Sanders’ strength but for Hillary’s weakness. By February 2016, Silver put the race at 49.6% Clinton to 39.1% Sanders – note that Hillary’s campaign was failing to win over most of the undecideds, losing them to Sanders by more than four to one. By April 23, 2016 Silver had the race at 49.6% Clinton to 41.5% Sanders. Two factors are apparent here: first, Hillary appeared to have a lead bigger than Sanders could close; second, Sanders had more momentum than Clinton and had enjoyed higher energy for some months. By the end of June, Silver showed the race at 55.4% Clinton to 36.5% Sanders, essentially a done deal for the Democratic Party nomination.

http://projects.fivethirtyeight.com/election-2016/national-primary-polls/democratic/

Donald Trump announced his candidacy for the office of the President on June 16, 2015. At that time Silver counted his support at a 3.6% chance of winning the GOP nomination. Let’s stop there and consider that this meant the polls showed Hillary Clinton’s chances of winning her party’s nomination were more than sixteen times greater than Donald Trump’s chances of winning his party’s nomination. Part of this was due to the large number of candidates for the Republican nod, but also Donald Trump – while known as a face and name – was unknown as a political contender, so he had to establish his bona fides with both the GOP and the voters. Trump’s campaign quickly gained support, however, as he passed the 20% threshold on July 26, 2015, and the 40% threshold on March 21, 2016. This means that Donald Trump did not reach 40% support until after his Super Tuesday wins in Alabama, Arkansas, Georgia, Massachusetts, Tennessee, Virginia and Vermont. On March 22, Trump claimed another 58 delegates by winning the Arizona primary. By the end of May, Trump had essentially locked up the GOP nomination.

http://projects.fivethirtyeight.com/election-2016/national-primary-polls/republican/

Both Clinton and Trump finished the win-the-nomination part of their campaigns with damage, however. Trump’s problems were obvious – to energize his base, Trump attacked establishment Republicans and demographics aligned with opponents of populist theory, and this cost him nationally in polls. In early June, polls showed Trump’s support at 38.1%, compared to 42.1% for Clinton. But Clinton had obvious problems, too. The way Clinton won the Democrats’ nomination left many Sanders supporters convinced the primary had been rigged, which may be one reason Trump made similar claims as the General Election reached its resolution. But also, given the many demographic groups Trump had – allegedly – attacked, a four-point lead for Clinton was a clear warning sign that something was not as described.

Call it a poll version of that annoying “check engine” light on your dashboard. Until you have someone get under the hood, you don’t know what exactly has gone wrong, but you can’t ignore it unless you don’t mind spending hours on the side of the road beside your smoking vehicle, at the mercy of passing traffic. There is science behind a poll that is put together and analyzed properly, but laziness or assumptions in your data or procedures can invalidate your conclusions, and make you look a fool in public.

By the way, Nate Silver uses an aggregate of polls, but he is also guilty of some subjectivity in his source selection. For example, Silver’s aggregate shows Clinton had a wire-to-wire lead over Trump in polling, with Trump never enjoying a lead in the aggregate polling at any time:

http://projects.fivethirtyeight.com/2016-election-forecast/national-polls/

Real Clear Politics, however, which also uses an aggregate of polls, showed Donald Trump with an aggregate lead on May 24 and from July 25 through July 28 of this year.

http://www.realclearpolitics.com/epolls/2016/president/us/general_election_trump_vs_clinton-5491.html

That’s not to say one aggregate is ‘better’ than the other, but to illustrate the fact that any aggregate is subjective and contains implicit bias. Ironically, Silver was aware of this bias and tried to correct for it – he calls this “trend line adjustment” – but in the end Silver’s own bias still influenced his conclusions.

http://www.huffingtonpost.com/entry/nate-silver-election-forecast_us_581e1c33e4b0d9ce6fbc6f7f
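
To make the point about source selection concrete, here is a minimal Python sketch. Every pollster name and number in it is invented for illustration, and a plain average is only an assumption; it is not the method used by Silver or by Real Clear Politics. It shows only that the same pool of polls can produce a different leader depending on which polls the aggregator chooses to include.

```python
from statistics import mean

# Hypothetical polls (invented numbers): Clinton lead in points.
# A negative value means Trump leads in that poll.
POLLS = {
    "Pollster A": 4.0,
    "Pollster B": 1.0,
    "Pollster C": -2.0,
    "Pollster D": -3.0,
    "Pollster E": 5.0,
}


def aggregate_lead(polls, included):
    """Simple unweighted average of the lead across the included pollsters."""
    return mean(polls[name] for name in included)


# One aggregator includes every poll; another includes only three of them.
all_sources = aggregate_lead(POLLS, POLLS.keys())
subset = aggregate_lead(POLLS, ["Pollster B", "Pollster C", "Pollster D"])

print(f"all five polls: Clinton {all_sources:+.1f}")
print(f"three-poll subset: Clinton {subset:+.1f}")
```

With these made-up inputs the full set shows a small Clinton lead while the subset shows a Trump lead, which is the kind of disagreement seen between the two aggregates above.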

It’s important to remember that Silver was wrong about Trump winning the GOP nomination. After Trump won the GOP nomination, Silver admitted “we basically got the Republican race wrong.”

http://fivethirtyeight.com/features/why-republican-voters-decided-on-trump/

There is no evidence that Silver went back to find what he overlooked in his initial analyses, which could have corrected his results in the General Campaign. But there is, at least, evidence that Silver knew something in the numbers was wrong. Just before the final day of the election, Silver put out his “final election update”, giving Clinton a 71% chance of winning.

http://fivethirtyeight.com/features/final-election-update-theres-a-wide-range-of-outcomes-and-most-of-them-come-up-clinton/?ex_cid=2016-forecast

This ran contrary to far more aggressive posts from the New York Times, which gave Clinton an 82% probability of winning,

http://www.nytimes.com/elections/forecast/president

from the Princeton Election Consortium, which gave Clinton a 93% chance to win the White House,

http://election.princeton.edu/2016/11/08/final-mode-projections-clinton-323-ev-51-di-senate-seats-gop-house/

from left-leaning pundit Larry Sabato, who did not offer a probability but called for Clinton to win 347 Electoral Votes,

http://ijr.com/2016/08/667335-famed-election-predictor-with-97-100-track-record-reveals-his-trump-vs-hillary-2016-results/

and of course from the Huffington Post, which posted that Clinton had a 98% chance to win the Oval Office.

http://elections.huffingtonpost.com/2016/forecast/president

Anyone who turned on ABC, NBC, CBS, CNN, or Fox was also flooded with assurances that Clinton was poised to win by large margins. That all of these analysts were wrong, and to such a large degree, is amusing given their hubris, but concerning given their prominence in media coverage of the election.

The last week of the election, Nate Silver’s concerns about the polling data caused him to scale back his probability for Clinton (he initially had Clinton at 89%, but as the election approached he walked it back to 71%), while Ryan Grim of the Huffington Post kept Clinton at a 98% chance to win. This led to some ill-advised words on Twitter between the two men about each other’s methodology.

http://www.vox.com/2016/11/6/13542328/nate-silver-huffpo-polls

Ironically, while Silver was correct that weighting Clinton’s advantage beyond anything supported by poll data was foolish, he failed to properly test the underlying assumptions installed in his own model.

I found it intriguing to notice that neither Gallup nor Pew published polls for the Presidential election, each focusing instead on issues rather than candidates. A business reason was provided,

Here’s Why Gallup Won’t Poll the 2016 Election

but given the long history and prominence Gallup and Pew enjoyed in polling Presidential races, the reason given rings false. A more likely explanation is the difficulty in addressing behavior changes in the voting public. In addition to the shift from landline phones to cell phones, voters are more likely to discuss opinions online than in a phone interview, but there is no statistically sound means to randomly contact respondents online, and the results of online polls vary as widely as the opinions they report. Pew observed that online polls are “non-probability” polls, which by definition eliminates the random nature of polls, and therefore calls into question any political conclusion presented by such a poll.

Q/A: What the New York Times’ polling decision means

Pew also posted an article yesterday about why the polls were essentially wrong, but was wrong to pretend weighting mistakes were not a big part of the blunder.

Why 2016 election polls missed their mark

Forbes boasted that analysts predicting a Hillary win “used the most advanced aggregating and analytical modeling techniques available”

http://www.forbes.com/sites/startswithabang/2016/11/09/the-science-of-error-how-polling-botched-the-2016-election/#4d6c04257da8

but that is a false claim on its face. What happened was not a “statistical error”, but human error. Weighting for party affiliation or other demographics is risky at best and often leads to unreliable results. To see what I mean, let’s start with the exit poll from the 2012 Presidential Election, by party affiliation, gender, race, and age:

Party Affiliation: Democrats 38%, Republicans 32%, Independents 29%

Gender: Women 53%, Men 47%

Race: White 72%, African American 13%, Hispanic 10%, Asian 3%, Other 2%

Age: 45-64 38%, 30-44 27%, 18-29 19%, 65 & over 16%

How Groups Voted in 2012

And from 1984 through 2014:

Party Affiliation: Democrats 38.6%, Republicans 32.6%, Independents 27.5%

Gender: Women 53%, Men 47%

Race: White 76%, African American 13%, Hispanic 7%, Asian 2%, Other 1%

Age: 45-64 33%, 30-44 28%, 18-29 14%, 65 & over 25%

http://www.electproject.org/home/voter-turnout/demographics

How Groups Voted

Any poll with demographics different from these numbers is fiddling with the numbers out of clear bias. Without wasting time going through them one by one, this skewing invalidates polls from ABC News, the Wall Street Journal, Fox News, NBC News, CNN, and CBS. If you want to check for yourself, simply find one of their polls and drill down to the demographics, which are usually included at the end of the topline detail.

Weighting is not supposed to produce the “right” answer, but to line information up according to known population demographics. Sadly, a lot of polls screw up the results by trying to sell a message, rather than accurately report the current situation. This is not an attempt to “rig” an election, I believe, but simple human laziness and a habit of using assumptions instead of due diligence.
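
As a rough illustration of what lining a sample up with known population demographics means, here is a minimal Python sketch. The target shares are the 2012 exit-poll gender split quoted above; the sample counts, the support figures, and the function names are all invented for illustration and do not come from any real poll or any pollster's actual method.

```python
# Known population shares: the 2012 exit-poll gender split quoted above.
TARGET_SHARES = {"women": 0.53, "men": 0.47}

# Hypothetical raw sample (invented): respondents per group, and the
# fraction of each group that backs some candidate "A".
SAMPLE_COUNTS = {"women": 400, "men": 600}
SUPPORT_FOR_A = {"women": 0.55, "men": 0.45}


def poststratification_weights(sample_counts, target_shares):
    """Weight per group = target population share / observed sample share."""
    total = sum(sample_counts.values())
    return {
        group: target_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


def weighted_support(sample_counts, support, weights):
    """Share backing candidate A after each respondent is weighted."""
    weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
    weighted_yes = sum(
        sample_counts[g] * weights[g] * support[g] for g in sample_counts
    )
    return weighted_yes / weighted_total


weights = poststratification_weights(SAMPLE_COUNTS, TARGET_SHARES)
raw = sum(SAMPLE_COUNTS[g] * SUPPORT_FOR_A[g] for g in SAMPLE_COUNTS) / sum(
    SAMPLE_COUNTS.values()
)
adjusted = weighted_support(SAMPLE_COUNTS, SUPPORT_FOR_A, weights)
print(f"raw: {raw:.3f}, weighted: {adjusted:.3f}")
```

In this made-up sample, women are under-represented, so each woman's answer counts for more than one respondent and each man's for less. The headline number moves from 49.0% to 50.3% without a single new interview, which is why the choice of target demographics matters so much.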

This becomes ever more salient when you realize that the aggregates used by analysts like Silver and Grim incorporate these biased reports, which invalidates their own analyses. Aggregation is really just group-think, even if some people publish such results with impressive names like “meta-sampling”. Everything that goes into an analysis should be tested for its own veracity, and while this is very difficult for a national report, at the very least you should be candid if you are trusting someone else’s report as a source for your own analysis. Yes, Silver claims he ‘unskews’ polls by other agencies, but that’s kind of like a guy admitting someone spit into your drink but he scooped it out and it’s fine for you to drink. If you know the source is biased, it does not belong in your own work, none of it.

One last thought on polling. The Presidential Election is not a national race, no matter what the media tells you. It’s actually fifty-one different races, whose results are summed up to produce the champion, in this case the President-Elect of the United States. So the polls you ought to have watched are the state polls, especially according to the respective electoral vote value of each state. Most media ignored the state-level polling, and when it was reported it was usually just from a single source that the media found reliable. I will be publishing a report on the accuracy of the state polls for the 2016 Election when I have all the data, but for now it’s important to know the limits of what analysts even can tell you, and keep in mind that most media people are there to sell you entertainment, not facts.
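
The idea of summing separate state races can be sketched in a few lines of Python. The electoral-vote counts below are the 2016 allocations for five states; the poll margins are invented placeholders, not real 2016 polling, and the function name is hypothetical. A full version would cover all fifty states plus the District of Columbia, with 270 of 538 electoral votes needed to win.

```python
from collections import Counter

# state: (electoral votes, candidate A's lead in points; negative = B leads)
# Electoral votes are the 2016 allocations; the leads are invented.
STATE_POLLS = {
    "Florida": (29, -1.0),
    "Pennsylvania": (20, 2.0),
    "Ohio": (18, -3.0),
    "Michigan": (16, 0.5),
    "Wisconsin": (10, 0.0),
}


def tally_electoral_votes(state_polls):
    """Award each state's electoral votes to whoever leads its poll."""
    totals = Counter()
    for votes, lead_a in state_polls.values():
        if lead_a > 0:
            totals["A"] += votes
        elif lead_a < 0:
            totals["B"] += votes
        else:
            totals["tossup"] += votes
    return totals


totals = tally_electoral_votes(STATE_POLLS)
print(dict(totals))
```

Because each state is winner-take-all in this sketch, a half-point lead in Michigan is worth as much as a landslide there, which is why a single national number can hide the real state of the race.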

  • Retired military

    You said “Hillary Clinton announced her decision to run for the White House on April 12, 2015”

    If I remember right she made the announcement like 6 different times and tried to reinvent herself (for those folks who hadn’t been paying attention for the past 30 years) into something different each time.

    “Most media ignored the state-level polling, and when it was reported it was usually just from a single source that the media found reliable. ”

    I contend the media ignored anything which did not fit into their own world view or would be good for Trump. Anyone who said that Texas was in play for Hillary should have had their head examined, even with all the illegals in the state trying to vote.

    “Sadly, a lot of polls screw up the results by trying to sell a message, rather than accurately report the current situation”

    Shhh Don’t tell Pennywit this. He lives by the polls. He also doesn’t have time for the internals since it would mean devoting time to looking at them but he would like a breakdown of the voting on election day by race, sex, age, religion and ethnicity.

  • most members of the media are dumber than a bag of hammers to begin with …

  • Scalia

    Excellent piece, DJ, but I take issue with your denial that rigging was going on. When that many polling groups with that much experience ignore the plain data before them, something is seriously askew. As Allan Lichtman (who was one of the few who called it right) said, the pressure from liberals was tremendous to get him to change his prediction. They were so pathologically opposed to Trump, they were willing to skew data in order to advance their narrative that Hillary’s election was inevitable. Perhaps some of them didn’t want to be an outlier, but that too is dishonest.

    Regardless of the motivation, skepticism toward the mainline polling groups will remain high except for those who had the integrity to call this one accurately (e.g. LA Times).

    • Retired military

      Shhh You are breaking Pennywit’s heart

  • yetanotherjohn

    I’m wondering if a better way of dealing with this is margin of error. Start with a poll of appropriate size (e.g. 1000 registered adults). Take several known demographics to compare this to. For example, I bet you could with a little digging get the number of males/females registered to vote by state. Use the average voting numbers as another control.
    Now compare your sample with these known demographic patterns. If the registered voters in a state were exactly 50/50 M/F and your sample came out 60/40 M/F, what margin of error would you need to have gotten a 60/40 out of a 50/50 pool (math is beyond me but that’s why I’m not a statistician)? Now instead of adjusting your results to match what you think you should have gotten (e.g. 50/50), increase your margin of error.

    If you are polling more men than women (e.g. 60/40) you know you are somehow missing connecting to women. Why assume that the women you did connect to have the same opinions as the women you didn’t (or whatever demographic marker is out of kilter)? By adjusting margin of error, you’re accepting the fact that those you aren’t reaching may all be of the same opinion, and thus what may be the true state of the race.

    Take a simple example: you know the state is 50/50 M/F and your poll returns 600 males and 400 females. Further, to keep the math easy, assume that your poll came back as 300 males for Trump, 300 males for Clinton, 200 females for Trump and 200 females for Clinton. Classically, the pollster would weight the sample and arrive at 50% for Trump, 50% for Clinton with a +/- 3% margin (or whatever the correct margin is for a 1000 sample).

    Now further suppose that the 100 females not contacted were 100% for Trump or 100% for Clinton. Not likely, given the other 1000 polled, but assume the thought exercise. So instead of a true 50/50 race, it is actually 600 for Trump vs 500 for Clinton or 500 for Trump vs 600 for Clinton. So instead of a 50% to 50% race it is 54.5% to 45.5%. So now present your margin of error as 4.5% instead of 3% (or even wider to account for the randomness of the sample).

    Now the chances of all 100 missing females feeling the same way are minuscule, and there might be statistical math that would let you narrow down some from that 4.5%. Doing the same thing for age, party affiliation, race, etc. may lead to multiple and cumulative margin of error adjustments. But looking at a poll and seeing the margin of error would immediately tell you how far out of “normal” the sample was. If there is a social effect (e.g. those who support Trump don’t want to talk on the phone to any supposedly liberal pollster from the NYT), then your margin of error will capture those people.

    Admittedly, the poll saying the race is 50/50 with a 3% margin of error would be read the same by most people as a poll saying the race is 50/50 with a 4.5% margin of error. And the reality is that it is probably the same and the 100 women you missed would have the same 50/50 split as the 400 women you contacted. But a semi-sophisticated consumer wouldn’t have to drill down and re-weight to demographics you think are “better”. You could notice how much wobble in the poll is purely statistical (e.g. a 3% MOE) and how much is because the poll seems to be contacting a non-standard poll sample.

  • You miss the elephant in the room. Conservatives as a rule, and Republicans to a lesser extent don’t trust the press and the pollsters and thus don’t respond to them nor answer their questions. Which is entirely reasonable given what we saw in this election cycle.

  • fustian24

    Modern polling is what I call second order baloney.

    First order baloney is when you just make up your answer. Punditry is an example of first order baloney.

    Second order baloney is when you measure something, then manipulate it based on a “model”. Climate “science” is an example of second order baloney as is polling.

    In theory accurate polling could be done as long as the polling company was willing to pay to poll enough people and they could manage to pull off actual random samples.

    But they can’t (or they don’t).

    So they try to fix their too small, non-random samples by weighting them the way they think they ought to be weighted.

    Once they put their thumbs on the scale like that, they are admitting that they don’t have enough data to make an accurate prediction and it’s just guesswork after that.

    So this raises the obvious question: if they aren’t actually gathering enough data to make an accurate prediction, what are the polls for, anyway?

    I suspect you can probably get information about trends from this kind of data. You might not know what the actual percentages are but the day to day relative data might have some information in it. Maybe there is some small value there.

    But, if you are a pollster, I’m guessing the real reason you are in business these days is because you get paid to shape opinion. Most polling these days is probably done in the service of propaganda.

  • PBunyan

    The vast majority of the polling this year, just like all the “news” broadcasts (including FOX News), was nothing more than psych-ops to suppress Trump’s support.

    They were tremendously successful but unfortunately for the global Marxist oligarchs, their candidate was so fundamentally flawed and despised by so many that even after suppressing probably about 15-20% of Trump’s support she still lost.

    I can’t for the life of me understand why they didn’t run someone else unless they figured a totally corrupt career criminal would be easier to control.

  • A few days before the election, Nate Silver warned that Clinton’s chance at winning had become fragile. The professional pollster who really goofed is Larry Sabato.