Math Year 2013: Virginia governor's race post-mortem: Grades the five pollsters from the final week

Monday, November 11, 2013

Virginia governor's race post-mortem:
Grades the five pollsters from the final week

All the polls in the Virginia governor's race from mid-July to November showed Terry McAuliffe leading Ken Cuccinelli. In the final week the median lead was 7%, and my system, which also factors in the size of the sample, made McAuliffe a 99.4% favorite to win. He won, but by a much smaller margin, leading many people to ask why the polls were so wrong.

The best explanation is a well-known tendency for third party candidates to do better in polls than they do in the actual election. The Libertarian candidate Robert Sarvis was getting good numbers for a third party candidate, with the last five polls giving him 13%, 12%, 10%, 8% and 4%. The median was 10% and the average 9.4%. His actual share was 6.6%, and a lot of his alleged support appears to have gone to the Republican Cuccinelli when the curtains on the booths were drawn.
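As a quick check of the arithmetic above, a few lines of Python reproduce the median and average of Sarvis's last five poll numbers. The variable names are mine, for illustration only.

```python
from statistics import mean, median

# Sarvis's share in the last five polls, in percent (from the text above)
sarvis_polls = [13, 12, 10, 8, 4]
sarvis_actual = 6.6  # his actual share on election day, in percent

sarvis_median = median(sarvis_polls)
sarvis_mean = mean(sarvis_polls)

print(f"median {sarvis_median}%, average {sarvis_mean}%")
print(f"median overstated him by {sarvis_median - sarvis_actual:.1f} points")
```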

Many websites give the margin of the lead instead of the "margin of error" these days, and that makes some sense, since the "margin of error" is the 95% confidence interval for the actual result, which really should be measured after the undecided and no preference numbers are removed from the tally. If all we care about is the final lead, the Emerson College Polling Society bathed themselves in glory by predicting a 2% lead for McAuliffe, very close to the actual 2.5% final margin, when everyone else overstated the lead.
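To make the "lead only" comparison concrete, here is a small sketch that subtracts the actual lead from each pollster's predicted lead. The raw percentages are the ones listed for each pollster in this post; the function name lead_errors is hypothetical.

```python
# Final-week raw poll numbers from this post, in percent: (McAuliffe, Cuccinelli)
polls = {
    "Emerson College": (42, 40),
    "PPP": (50, 43),
    "Rasmussen": (43, 36),
    "Newport College": (45, 38),
    "Quinnipiac": (46, 40),
}
ACTUAL_LEAD = 47.97 - 45.47  # actual result, about 2.5 points


def lead_errors(polls, actual_lead):
    """Return {pollster: predicted lead minus actual lead}, in points."""
    return {name: (mc - cu) - actual_lead for name, (mc, cu) in polls.items()}


errors = lead_errors(polls, ACTUAL_LEAD)
# Print pollsters from smallest to largest absolute error on the lead
for name, err in sorted(errors.items(), key=lambda kv: abs(kv[1])):
    print(f"{name:16s} lead error {err:+.1f}")
```

By this measure alone Emerson comes out on top, which is exactly why the lead is not the whole story.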

But looking at the 95% confidence intervals after undecideds are removed actually makes Emerson College look like the worst polling company of the final five. Here's why.

Emerson College
Raw numbers: sample of 874, McAuliffe 42%, Cuccinelli 40%, Sarvis 13%
Removing undecided: sample of 830, McAuliffe 44.2%, Cuccinelli 42.1%, Sarvis 13.7%
Total missed percentage from reality: 14.25% (5th place out of 5)
95% confidence interval (aka margin of error) for each candidate:
McAuliffe 47.59% to 40.83% (missed 47.97%)
Cuccinelli 45.46% to 38.75% (barely missed 45.47%)
Sarvis 16.02% to 11.35% (massively missed 6.56%)

Emerson's intervals missed all three actual vote shares, only barely in the case of the front runners, but by a wide margin when looking at Sarvis.
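For readers who want to check the numbers, here is a minimal sketch of the steps applied to Emerson: drop the undecideds, rescale, add up the misses, and build a 95% interval for each candidate. It assumes the usual normal approximation, p plus or minus 1.96 times sqrt(p(1-p)/n); that formula is my assumption about how the intervals were produced, and the helper names drop_undecided and interval_95 are hypothetical.

```python
from math import sqrt

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval


def drop_undecided(sample, raw_pcts):
    """Rescale raw percentages so the named candidates sum to 100%.

    Returns (decided_sample, rescaled_percentages).
    """
    decided_total = sum(raw_pcts.values())
    decided_sample = round(sample * decided_total / 100)
    rescaled = {name: 100 * pct / decided_total for name, pct in raw_pcts.items()}
    return decided_sample, rescaled


def interval_95(pct, n):
    """Return (low, high) in percent using p +/- 1.96 * sqrt(p(1-p)/n)."""
    p = pct / 100
    half_width = Z_95 * sqrt(p * (1 - p) / n)
    return 100 * (p - half_width), 100 * (p + half_width)


actual = {"McAuliffe": 47.97, "Cuccinelli": 45.47, "Sarvis": 6.56}
emerson_raw = {"McAuliffe": 42, "Cuccinelli": 40, "Sarvis": 13}

n, shares = drop_undecided(874, emerson_raw)
total_miss = sum(abs(shares[c] - actual[c]) for c in actual)
for c in actual:
    low, high = interval_95(shares[c], n)
    verdict = "captured" if low <= actual[c] <= high else "missed"
    print(f"{c}: {low:.2f}% to {high:.2f}% ({verdict} {actual[c]}%)")
print(f"decided sample {n}, total miss {total_miss:.2f} points")
```

Under that assumption the sketch lands on the Emerson figures listed above: a decided sample of 830, a total miss of 14.25 points, and three misses out of three.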

PPP
Raw numbers: sample of 870, McAuliffe 50%, Cuccinelli 43%, Sarvis 4%
Removing undecided: sample of 844, McAuliffe 51.5%, Cuccinelli 44.3%, Sarvis 4.1%
Total missed percentage from reality: 7.15% (second best)
95% confidence interval (aka margin of error) for each candidate:
McAuliffe 54.92% to 48.17% (missed 47.97%)
Cuccinelli 47.67% to 40.98% (captured 45.47%)
Sarvis 5.47% to 2.78% (missed 6.56%)

PPP was the only company to underestimate Sarvis, so they were very close to the real numbers for Cuccinelli and overestimated McAuliffe. Only one company had a smaller total miss across all three candidates.

Rasmussen
Raw numbers: sample of 1002, McAuliffe 43%, Cuccinelli 36%, Sarvis 12%
Removing undecided: sample of 912, McAuliffe 47.3%, Cuccinelli 39.6%, Sarvis 13.2%
Total missed percentage from reality: 13.25% (4th place out of 5)
95% confidence interval (aka margin of error) for each candidate:
McAuliffe 50.49% to 44.01% (captured 47.97%)
Cuccinelli 42.73% to 36.39% (missed 45.47%)
Sarvis 15.38% to 10.99% (massively missed 6.56%)

Like Emerson, the big overestimate of Sarvis skews these numbers badly, but they did capture McAuliffe's numbers.

Newport College
Raw numbers: sample of 1028, McAuliffe 45%, Cuccinelli 38%, Sarvis 10%
Removing undecided: sample of 965, McAuliffe 48.4%, Cuccinelli 40.9%, Sarvis 10.8%
Total missed percentage from reality: 9.22% (3rd best)
95% confidence interval (aka margin of error) for each candidate:
McAuliffe 51.54% to 45.23% (captured 47.97%)
Cuccinelli 43.96% to 37.76% (missed 45.47%)
Sarvis 12.71% to 8.80% (missed 6.56%)

Like several other polls, Newport captured the McAuliffe number but underestimated Cuccinelli and overestimated Sarvis.

Quinnipiac
Raw numbers: sample of 1606, McAuliffe 46%, Cuccinelli 40%, Sarvis 8%
Removing undecided: sample of 1510, McAuliffe 48.9%, Cuccinelli 42.6%, Sarvis 8.5%
Total missed percentage from reality: 5.83% (best of 5)
95% confidence interval (aka margin of error) for each candidate:
McAuliffe 51.46% to 46.41% (captured 47.97%)
Cuccinelli 45.05% to 40.06% (missed 45.47%)
Sarvis 9.92% to 7.10% (missed 6.56%)

Coming off their great call of the New York City Democratic comptroller primary, Quinnipiac is again the best polling company of those that polled in the last week, though it must be said they are the best of a bad bunch. Their downfall in the 95% confidence intervals was the size of their sample. Big samples mean smaller margins of error, but not necessarily more captures of the real numbers.
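The full grading can be sketched in one pass over all five polls. This uses the raw percentages and decided sample sizes listed in this post, and again assumes the normal-approximation interval; the function name grade and the tuple layout are hypothetical choices of mine.

```python
from math import sqrt

ACTUAL = (47.97, 45.47, 6.56)  # McAuliffe, Cuccinelli, Sarvis, in percent

# pollster: (decided sample from the post, raw McAuliffe, Cuccinelli, Sarvis)
POLLS = {
    "Emerson College": (830, 42, 40, 13),
    "PPP": (844, 50, 43, 4),
    "Rasmussen": (912, 43, 36, 12),
    "Newport College": (965, 45, 38, 10),
    "Quinnipiac": (1510, 46, 40, 8),
}


def grade(n, *raw):
    """Return (total miss in points, captures out of 3, widest half-width)."""
    decided = sum(raw)
    shares = [100 * r / decided for r in raw]  # undecideds removed
    total_miss = sum(abs(s - a) for s, a in zip(shares, ACTUAL))
    # 196 = 1.96 * 100, so half-widths come out in percentage points
    half_widths = [196 * sqrt(s / 100 * (1 - s / 100) / n) for s in shares]
    captures = sum(abs(s - a) <= h for s, a, h in zip(shares, ACTUAL, half_widths))
    return total_miss, captures, max(half_widths)


grades = {name: grade(*row) for name, row in POLLS.items()}
ranking = sorted(grades, key=lambda name: grades[name][0])
for place, name in enumerate(ranking, start=1):
    miss, captures, widest = grades[name]
    print(f"{place}. {name:16s} miss {miss:5.2f}  captured {captures}/3  +/-{widest:.2f}")
```

The last column shows the sample size effect: Quinnipiac's widest interval is about 2.5 points either side, against roughly 3.4 for Emerson, so a similar-sized miss is more likely to fall outside it.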

I don't change my model very often, and even though the candidate my system favored actually won, I was a little embarrassed by assigning such a huge Confidence of Victory number (99.4%) to a race that turned out to be so close. The next time there is a third party candidate polling with significant support but no real chance to win, I'm going to figure out a fair way to re-assign the numbers, with the assumption that a Libertarian will lose support that will go Republican and a Green will lose support that will go Democratic.