Monday, September 23, 2013

VA governor's race:
Late September update

In Virginia, a state Barack Obama won by 4% in 2012 and 6% in 2008, the statewide races feature Tea Party favorites running on the Republican ticket. In the governor's race, both the Democrat Terry McAuliffe and the Republican Ken Cuccinelli have large negative ratings, but the basic math of the electorate says one thing clearly.

The Republicans are the minority party in the land. They've lost two presidential elections in a row, they don't hold the Senate, and they got fewer votes across the country in the last congressional election, holding onto the House majority in large part due to gerrymandering.

For all that, the needle is moving slightly for Cuccinelli if we give all polls the same weighting. Here are the Confidence of Victory numbers for recent polls, given by company name and date of poll. All give McAuliffe the advantage.

Washington Post 9/22 89.1%
Marist 9/19 90.3%
Harper[R] 9/16 94.2%
Roanoke 9/15 63.5%
Quinnipiac 9/15 84.9%

The good news for Cuccinelli: McAuliffe isn't cruising above 95% anymore and Quinnipiac, still crowing over calling the NYC comptroller race all by themselves with their new likely voter model, says the race is getting close, but still isn't really close.

The bad news for Cuccinelli: You see those two big dives downward in the graph, the one in May and the second in July?  The first was a Washington Post poll and the second was a Roanoke poll. The only polls that have had Cuccinelli ahead now have him behind.

The other bad news for Cuccinelli: My system may say 85% Confidence of Victory, but the track record so far says almost no underdog wins unless the Confidence of Victory splits about 60%-40%. The biggest upset against my system since 2008 was Stringer beating Spitzer for NYC Comptroller when Spitzer had 68.9% Confidence of Victory, the election when only Quinnipiac got the numbers right. (They have crowed loudly about that, but they had De Blasio cruising over 40% of decided voters and he only just beat that margin.)

The election is still about seven weeks off, which means my system is not built to make a prediction, but the small gains Cuccinelli has made are not what he needs. Romney was in a similar situation and saw some small gains after Obama's poor showing in the first debate, but those faded quickly enough. Cuccinelli needs something bigger than we've seen so far to close the gap, and appealing to the Republican "base", such as it is, is a step in the wrong direction right now.

More updates when I get more data.

Saturday, September 14, 2013

Virginia Gubernatorial Race:
McAuliffe (D) vs. Cuccinelli (R)

There was a recent article in The Washington Post by Ben Shapiro discussing the Virginia gubernatorial race. Several pundits were asked their opinion of the race. They quote one poll - Quinnipiac from last month showing a six point McAuliffe lead - and several pundits saying it leans towards McAuliffe or, in the words of often quoted Charlie Cook, it's a 50-50 race.

Not one poll aggregator is asked how it's going.

I could say that pundits are a lower life form. I could call them some scatological name or some rude word that describes a male or female sex organ.

But seriously, if you didn't get the memo from 2012, there is no lower life form than pundit. Pond scum should write strongly worded letters to the editor whenever they are compared to pundits.

Here's what the polls have said with remarkable consistency since early May. McAuliffe has a very big lead. A Washington Post poll said Cuccinelli had a lead on May 2, but since a week after that on May 9, only a mid-July Roanoke poll gave him the lead. Within days of that report, PPP and Quinnipiac both gave McAuliffe a substantial lead.

For those who have prejudices against some polls as being "liberal leaning", consider Rasmussen polls, which erred substantially towards Romney in the 2012 election cycle. In early June, they read a 3% lead for McAuliffe. In early September, they said it was a 7% lead for the Democrat. In my Confidence of Victory system, that 7% lead means McAuliffe is about a 140 to 1 favorite to win if the election were held when the poll was taken.

And now for my personal provisos. Nate Silver thinks he can give a percentage for what that number means today given the election happens November 5.

I make no such claims and I never will.

Here's the situation. Right now, it looks like a gimme for McAuliffe, but there are seven and a half weeks to go. What no one can predict, neither the hard working aggregators nor the lazy arrogant overpaid and over-quoted pundits, is what will move the needle in that period. As of right now, what Cuccinelli needs is an outright stunner, something along the lines of Anthony Weiner's penis pics or Mitt Romney's comments about the lazy scum-sucking 47% of the population.

Heck, maybe he can even get McAuliffe to admit he doesn't like the Soggy Bottom Boys. But anything less than that and he has a concession speech to write, the sooner the better.

More reports when there is more information.

Wednesday, September 11, 2013

Results from the Democratic mayoral race in New York City.

Almost all the ballots have been counted in the New York City municipal elections held yesterday. My last post here was on Sunday, but there was a late poll from Public Policy Polling that changed my numbers. After working on how I was going to consider the "median result" from a set of four polls in a three person race where there was a 40% threshold, my last prediction on Monday evening was posted to Twitter.

Final call on NYC Dem Mayor's race, different from  
55% De Blasio 1st ballot 
30% De Blasio/Thompson 
15% De Blasio/Quinn

Professor Wang's last prediction was a 90% probability of De Blasio on the first ballot.

The current vote count has Bill De Blasio at 40.3% and his closest rival Bill Thompson at 26.2%. Because his margin over the 40% threshold is so slim, we will have to wait for absentee ballots and an automatic recount for a result this close. The reported estimate is next Monday for a conclusive result.

Here on the blog, I made no mention of the comptroller's race, but I did mention it on Twitter in a two part tweet.

NYC mayoral primary today. and I agree that De Blasio first ballot win is most likely as is Spitzer for Comptroller. [1/2]

If Spitzer loses the Comptroller race, full credit should go , the only organization to favor his rival Stringer. [2/2]

There were several polls of this race as you can see at this link to the Real Clear Politics polling page. There was nothing that made this look like a close race until late August, with Eliot Spitzer's hopes for a political comeback looking very strong while Manhattan borough president Scott Stringer seemed unable to make any headway. But then Quinnipiac had a poll that said the race was tied, then another Quinnipiac poll had a small lead of 2% for Stringer on September 1, then a commanding 7% lead a week later.

The actual result was Stringer by 4%, 52% to 48%.

The reason my system - and Professor Wang's - favored Spitzer was that no other pollster agreed with Quinnipiac, not even once. After the first Quinnipiac poll that showed a close race, Siena, Marist and PPP all released polls, all of them favoring Spitzer.

So my system failed to predict this race. Often, when I get a race wrong, I try to find ways to adjust things to improve, but if I were given a similar situation tomorrow, I wouldn't change a thing. Here are my reasons.

1. I never weigh in more than a single poll from any polling company and always the most recent.

2. Because Quinnipiac only gets one result counted, they had one poll out of three in the final mix, along with PPP and Siena.

3. In a three poll situation, I take the median result, not the average of the three. Quinnipiac was the outlier, but outliers aren't often the closest to correct. Obviously, it will happen sometimes, but it is rare enough that I do not want to second guess myself in a situation like this. Polls miss the mark, sometimes by making bad assumptions, sometimes just by the fact that randomness is involved. For example, of the three polls Quinnipiac had the largest sample size, but that does not always (or even often) mean the most accurate. Looking at the mayoral polling and factoring out the undecided, Quinnipiac had De Blasio at 46% of the people who had a preference, while PPP had him at 42% and Siena had him at 39%. In this case, it's the low outlier that was closest.
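The "factoring out the undecided" step in the third reason can be sketched in a few lines of Python. This is only an illustration, not the code behind the numbers on this blog; the function name `share_of_decided` is made up for the example, and the inputs are the De Blasio and None of the Above figures from the Quinnipiac (9/1) and Siena (8/28) polls listed in the September 8 post, which reproduce the 46% and 39% quoted above.

```python
def share_of_decided(candidate_pct, undecided_pct):
    """Candidate's share among respondents who stated a preference."""
    return 100.0 * candidate_pct / (100.0 - undecided_pct)

# De Blasio %, None of the Above % (figures from the September 8 post)
quinnipiac = share_of_decided(43, 7)
siena = share_of_decided(32, 17)

print(round(quinnipiac), round(siena))
```

With the 40% threshold in mind, the same raw poll can look quite different once the undecided respondents are removed from the denominator.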

So this means at least one more report on the mayor's race. It is widely agreed that the Republican primary winner Joe Lhota will be very hard pressed to win in the general election. De Blasio got 260,000 votes in the Democratic primary, while Lhota finished first with over 50% in the Republican primary with about 30,000 votes, a total that would have given him sixth place in the Democratic race.

And while I did not use their poll, I want to congratulate Quinnipiac for being alone with the correct result in the comptroller's race. It is a field filled with randomness, but they had Stringer with a chance or the lead three separate times in barely two weeks when no one else did. That's a record they can point to with pride.

Sunday, September 8, 2013

New York City Mayoral Race: Democratic Primary
The Weekend before the election.

On Tuesday, New York City voters will go to the polls to decide the Democratic and Republican candidates to be the next mayor. The conventional wisdom is the Democratic nominee holds a huge advantage over whichever candidate the Republicans put forward, so the Democratic primary is being given the lion's share of interest.

Since mid-August, just after Anthony Weiner's habit of sending portraits of his penis to strangers became common knowledge, the front runner has been Bill De Blasio. It is widely agreed that the ads featuring his 15-year-old son Dante, he of the remarkable adolescent baritone and even cooler Afro, have made quite the impact.

This Sunday, a new poll from Marist has been published which agrees with the general trend if not the exact numbers.

Candidate recent % (previous %)

Marist - 9/6 recent, 8/14 previous
De Blasio 36% (24%)
Thompson 20% (16%)
Quinn 20% (24%)
Other 16% (21%)
None of the Above 8% (15%)

Quinnipiac - 9/1 recent, 8/12 previous
De Blasio 43% (30%)
Thompson 20% (22%)
Quinn 18% (24%)
Other 12% (17%)
None of the Above 7% (7%)

Siena - 8/28 recent, 8/7 previous
De Blasio 32% (14%)
Thompson 18% (16%)
Quinn 17% (25%)
Other 16% (19%)
None of the Above 17% (26%)

The agreement is across the board. De Blasio had a great September, Quinn took a beating and Thompson improved enough to probably be the favorite for second place if that matters. De Blasio is seen as the progressive candidate and the candidate for the boroughs other than Manhattan. Even the fact that De Blasio is a Red Sox fan may not be enough to stop him.

Of the three polls, Quinnipiac has been the one that has found the most support for De Blasio and Siena has lagged, but that may be due to Siena being first to poll in each case. That leaves Marist, last to poll and the median poll for De Blasio support both mid-August and early September.

If we accept Marist as the most reliable because of being the median, there are three outcomes that look possible on Tuesday.

De Blasio gets more than 40% on the first ballot: 34%
A De Blasio-Thompson run-off: 36%
A De Blasio-Quinn run-off: 30%

If I were putting a wager on this three way outcome, I'd put my buck on De Blasio-Thompson in a run-off, though any of these three results can hardly be called an upset. The only thing that would make me doubt my sanity or the validity of my methods is De Blasio finishing third. There hasn't been a poll in more than three weeks that shows that result to be even close to plausible.

I'll be back on Tuesday evening to report the actual vote.

Wednesday, September 4, 2013

Rounding - standard method and statistician's method, a.k.a. Gaussian rounding or banker's rounding.

I do not like to tell tales out of school. I consider the relationship of teacher and student to be one of confidentiality, most especially on the teachers' part. But I will say this.

The students I teach have a heck of a time with rounding.

I've taught some of the remedial classes at community college, like arithmetic and pre-algebra, and the students in these classes often do not get the idea of rounding, either rounding to a nearest decimal - like tenths or hundredths or thousandths - or rounding to the nearest thousand or million or rounding to a certain number of significant digits. In later classes like statistics or even calculus or linear algebra, I see there are more than a few students who haven't grasped rounding up and rounding down; they usually just round down every time, also known as truncating.

For me, this is a major pedagogical hurdle. When I teach, I try to put myself in the mindset of when I didn't know how to do a thing and remember the things that helped me learn it. I'm not going to say I learned rounding in mere seconds, but it feels like I did. I'm sure I stumbled with it back some time in primary school, but after a few mistakes the mechanical rule fell into place and made sense.

Look at the digit that is going to vanish, the one to the right of the last place you are rounding to. If it is a 5, 6, 7, 8 or 9, add one to the digit you are rounding to. If it is a 0, 1, 2, 3 or 4, the last digit you are going to use remains the same. The first method is called rounding up and the second version is called rounding down.

Example: 4/7 = 0.571428571428..., a decimal place then the six digit pattern 571428 repeating forever.

Round 4/7 to the nearest tenth.
The digit in the tenths place is 5, so the answer is going to be either .5 or .6; because the next digit (in some texts called the decision digit) is a 7, we add 1 to 5 and the answer is .6.

Round 4/7 to the nearest hundredth.
If we truncate to the hundredths place, we get .57, so the answer is going to be either .57 or .58; because the next digit is a 1, we leave .57 as it is.

Round 4/7 to the nearest thousandth.
If we truncate to the thousandths place, we get .571, so the answer is going to be either .571 or .572; because the next digit is a 4, we leave .571 as it is.
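
The three examples above can be checked with Python's `decimal` module, whose `ROUND_HALF_UP` mode matches the school rule for positive numbers. This is a minimal sketch; the helper name `round_half_up` is made up for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

four_sevenths = Decimal(4) / Decimal(7)  # 0.571428571428...

def round_half_up(value, places):
    """Round to the given number of decimal places, sending a 5 upward."""
    quantum = Decimal(10) ** -places  # 0.1, 0.01, 0.001, ...
    return value.quantize(quantum, rounding=ROUND_HALF_UP)

for places in (1, 2, 3):
    print(places, round_half_up(four_sevenths, places))
```

The loop prints .6, .57 and .571 (with a leading zero), the same answers as the hand calculation.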

Okay, I expect that this is not news to many of my readers, though it may have been a while since you thought about it.

I am chagrined that I did not know about other methods given my advanced years, but a statistics text I am using for the first time has a math skills pre-test and uses a slightly different method, known by several names. I first heard of it as "banker's rounding", though doing more research, I understand that "statistician's method" or "Gaussian rounding" are more common and likely more accurate.

Let's say for argument's sake we are rounding off to the nearest dollar, getting rid of the pesky pennies. The method you likely learned in school, which here is called "Traditional", will simply look at the tenths position.

If we have between $2.00 and $2.49, this will "round down" to $2.00 exactly.

If instead the total is between $2.50 and $2.99, we "round up" to $3.00.

The new method agrees with the old method almost exactly, with the only contentious case being the half dollar. Technically, $2.51 should go to $3.00, because that's the closest value. (It's 49 cents away from $3 and 51 cents away from $2.) Likewise $2.49 should round down to $2.00, since that is the closest value. But $2.50 presents a philosophical dilemma, since it is exactly 50 cents away from $3.00 and 50 cents away from $2.00.

This new method says to round a number of the form x.5 to the nearest even number. That means half the time we round x.5 up and half the time we round down. The row in yellow shows the only disagreement between 1 and 3. 1.5 rounds up traditionally to 2, and in the new method rounds to the nearest even number, which is still 2. But 2.5 rounds traditionally to 3, and in the new method rounds to 2.
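
To see the two rules side by side, here is a small Python sketch using the `decimal` module's `ROUND_HALF_UP` (traditional) and `ROUND_HALF_EVEN` (statistician's) modes. The helper name `to_dollars` and the sample amounts are chosen just for this illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def to_dollars(amount, mode):
    """Round a dollar amount given as a string to whole dollars."""
    return Decimal(amount).quantize(Decimal("1"), rounding=mode)

for amount in ["1.50", "2.49", "2.50", "2.51", "3.50"]:
    traditional = to_dollars(amount, ROUND_HALF_UP)
    half_even = to_dollars(amount, ROUND_HALF_EVEN)
    print(amount, traditional, half_even)
```

Of these five amounts, only $2.50 gets a different answer: 3 traditionally, 2 under the new method. Python's built-in `round()` also rounds halves to the nearest even number, so `round(2.5)` gives 2.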

Why bother? Think of what we are changing in the sum of rounded numbers, assuming that all numbers are equally likely to show up. Let's say we had the list of numbers as follows:

1.00, 1.01, 1.02, ..., 2.98, 2.99, 3.00

These 201 entries add up to 402, which means the average is 402/201 = 2.

If we round them using the standard method, we will get

50 x 1 = 50
100 x 2 = 200
51 x 3 = 153

This adds up to 403, and 403/201 = 2.004975124..., which is to say that adding, then rounding will not give the same answer as rounding, then adding.

If we round this set using statistician's rounding, this is what happens.

50 x 1 = 50
101 x 2 = 202
50 x 3 = 150

This adds up to 402, and 402/201 = 2 exactly, the same as the average of the unrounded numbers.
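
The tallies above can be reproduced with a short Python sketch. It assumes the list is the 201 values from 1.00 to 3.00 in steps of 0.01, which is consistent with the counts and the total of 402; the variable names are made up for the illustration.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Assumed list: 1.00, 1.01, ..., 3.00 (201 entries)
values = [Decimal(n) / 100 for n in range(100, 301)]

def rounded(mode):
    """Round every value to a whole number using the given rounding mode."""
    return [int(v.quantize(Decimal("1"), rounding=mode)) for v in values]

standard = rounded(ROUND_HALF_UP)
statistician = rounded(ROUND_HALF_EVEN)

print(len(values), sum(values))
print(Counter(standard), sum(standard))
print(Counter(statistician), sum(statistician))
```

The standard method overshoots the true total by one because both 1.50 and 2.50 go up, while the statistician's method sends one up and one down.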

As a teacher who knows my students already have a difficult time with rounding, this presents a problem. I do not want to "dumb down" the curriculum, but I also don't want to add in extra problems when I don't have to. Searching Wikipedia for this method, I see that this is the default rounding rule in IEEE 754 for floating point operations.

Some but certainly not all of my students may see this in their careers. I would hope that people who go into programming would have a better grasp of math, but having worked for nearly two decades in the field, I know that is not always the case. More than once, I came onto a project at a large computer company that involved something like higher math and I was the only programmer on a large team who knew the right method. Sometimes it was something slightly esoteric, like group theory and the symmetries of the square. Another time, I was the only person who really understood how sine and cosine worked.

One of my favorite expressions I learned from my father is "You learn something new every day, if you aren't careful." Well, I wasn't careful and I learned something new yesterday. Now I have to decide how it should apply to the classes I teach.

It would be so much easier if I did what I was told and didn't give a rat's ass, but as I am now two score and seventeen years old, I get the feeling the "I don't give a rat's ass" option is not open to me.

Tuesday, September 3, 2013

The New York City Mayor's race:
One week from election day

There was a flurry of polls in the New York mayoral race in mid August and then nothing until last week. Now, we have four new polls, but two of them are from the same company, Quinnipiac, so I only count three, using the most recent Quinnipiac, one from amNew York-News 12 and a third from Siena, the polling company hired by The New York Times.

Both Quinnipiac and Siena polled mid-month and late month, so I will give their numbers in the form

Candidate recent% (previous %)

Quinnipiac - 9/1 recent, 8/12 previous
De Blasio 43% (30%)
Thompson 20% (22%)
Quinn 18% (24%)
Other 12% (17%)
None of the Above 7% (7%)

Siena - 8/28 recent, 8/7 previous
De Blasio 32% (14%)
Thompson 18% (16%)
Quinn 17% (25%)
Other 16% (19%)
None of the Above 17% (26%)

Both of these polls agree that August was a very good month for De Blasio and a bad month for almost everyone else. We also have a poll from amNew York-News12, who only polled once in August.

amNew York-News12 - 8/27 recent, no previous
De Blasio 29%
Thompson 24%
Quinn 17%
Other 17%
None of the Above 13%

The "median" poll can be a little difficult to judge in a multi-candidate contest. If we consider the race De Blasio vs. Thompson, the Siena poll is the median with a 14% lead.  In this case, De Blasio has about a 27% chance to cross the 40% threshold and win outright, and the Siena poll says the race for second is relatively close, but still gives Thompson a 65% Confidence of Victory over Quinn.
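
Picking the median poll for the De Blasio vs. Thompson match-up can be sketched in Python as follows. This only illustrates the median step using the three recent polls quoted in this post; it does not compute the Confidence of Victory numbers, and the dictionary layout is made up for the example.

```python
from statistics import median

# De Blasio and Thompson percentages from the three recent polls
polls = {
    "Quinnipiac": {"De Blasio": 43, "Thompson": 20},
    "Siena": {"De Blasio": 32, "Thompson": 18},
    "amNew York-News12": {"De Blasio": 29, "Thompson": 24},
}

# De Blasio's lead over Thompson in each poll
leads = {name: p["De Blasio"] - p["Thompson"] for name, p in polls.items()}

median_lead = median(leads.values())
median_poll = next(name for name, lead in leads.items() if lead == median_lead)

print(leads)
print(median_poll, median_lead)
```

With an odd number of polls the median is always one of the actual polls, which is why a single company can be named as the median.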

Here are the Confidence of Victory numbers from all three polls.

Quinnipiac
De Blasio outright win: 99.9%
De Blasio-Thompson run-off: 0.08%
De Blasio-Quinn run-off: 0.02%

Siena
De Blasio outright win: 27.2%
De Blasio-Thompson run-off: 47.3%
De Blasio-Quinn run-off: 25.5%

amNew York-News 12
De Blasio outright win: 0.1%
De Blasio-Thompson run-off: 99.9% 
De Blasio-Quinn run-off: 0%

There should be at least one more poll from Marist by the end of the week, but unless Siena is right, Quinn is pretty much out of the race.