# Presidential polls, pointless pundits and potent probability

## Numerical models should spell the end of innumerate punditry…and elections?

It probably hasn’t escaped your notice that all those dick-swinging pundits with their big narratives, opining that the US election was ‘too close to call’, were flat wrong. Their fables-cum-journalism might’ve been fun to read, but not only is it clearly silly to analyse the behaviour of complex, heterogeneous systems using language (it makes me inwardly wince and smirk to read, for example, that the monoculture of ‘Ohio’ thinks somethingorother in its assimilated communo-brain): it’s been superseded. Statistician Nate Silver predicted the outcome of the vote down to the last state, proving that numbers continue to be the best system for determining which of two things is larger.

Mark Henderson makes an excellent case that this is an illustration of the failure of politicians and the media to embrace numerical methods where appropriate. I’d like to play Devil’s advocate and suggest something rather more left-field: given that Silver’s statistical model got it smack on, did we need to run the election at all?

I’ve written before about the idea of using small opinion polls to replace universal voting; statistically, except in cases where it really is too close to call, you can be ludicrously sure that your answer is correct with a relatively small sample. For example, imagine you want to find out whether or not a particular policy or president has national majority support. Imagine also that support is actually 70%. If you poll 10,000 people on the issue, your chance of getting the election outcome wrong is so tiny that it nearly defies analogy (it’s about the same as the odds of winning the lottery 58 weeks in a row).
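If you’d like to check that sort of claim yourself, here is a minimal sketch in Python. It assumes an idealised poll: every respondent is picked completely at random, everyone answers, and a tie counts as a wrong call. The function names are mine, invented for illustration. Because the probability is far too small to fit in an ordinary floating-point number, the sketch works with logarithms and reports the answer as a power of ten.

```python
import math


def log_binom_pmf(k, n, p):
    """Natural log of the binomial probability of exactly k 'yes' answers out of n."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def log10_prob_wrong_call(n, true_support):
    """log10 of the chance that at most half of n random respondents say 'yes'.

    Assumes 0 < true_support < 1 and treats a tie as a wrong call
    (both are assumptions of this sketch).
    """
    logs = [log_binom_pmf(k, n, true_support) for k in range(n // 2 + 1)]
    biggest = max(logs)
    # log-sum-exp: factor out the largest term so the sum does not underflow to zero
    total = biggest + math.log(sum(math.exp(x - biggest) for x in logs))
    return total / math.log(10)


if __name__ == "__main__":
    exponent = log10_prob_wrong_call(10_000, 0.7)
    print(f"chance of a wrong call is roughly 10^{exponent:.0f}")
```

The exact exponent depends on whether you sum the binomial terms directly, as here, or use a normal approximation, so treat the printed figure as an order-of-magnitude illustration rather than a definitive number.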

We’ve also got to consider that current electoral systems are error-prone. The very reason we have recounts is that the results aren’t always right first time, so technically we should quote election results with some (probably quite small) margin of error too.

Silver’s final pre-election analysis suggested that Obama had victory sealed with a probability of 91%: that’s perhaps not good enough to decide on a matter so weighty as who will lead America for the next four years, and so the polls might need to have more participants than those used by the statistical pundits in the US…but is it really worthwhile, in pursuit of spurious precision, to go so far as to make the election universal? How sure do we need to be about which way the nation has spoken? And, given the uncertainty in universal polls, how sure can we really be at the moment?
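To put rough numbers on ‘how sure’, here is a hedged Python sketch that turns a target level of certainty into a sample size. It uses the normal approximation to a simple random sample and assumes everyone selected actually votes. The 51% support figure in the usage example is made up to represent a tight race; it is not taken from any 2012 poll. Note too that a 91% forecast probability from a model is not the same thing as 91% sampling certainty, so the 0.09 below is only a loose analogy. The function name `sample_size_needed` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_needed(true_support, max_error):
    """Smallest n for which the chance of calling the wrong winner is <= max_error.

    Normal approximation: the sample share is roughly Normal with mean
    true_support and standard deviation sqrt(p * (1 - p) / n).
    """
    if not 0.5 < true_support < 1:
        raise ValueError("true_support must be above 0.5 and below 1")
    if not 0 < max_error < 0.5:
        raise ValueError("max_error must be between 0 and 0.5")
    z = NormalDist().inv_cdf(1 - max_error)  # one-sided critical value
    spread = math.sqrt(true_support * (1 - true_support))
    return math.ceil((z * spread / (true_support - 0.5)) ** 2)


if __name__ == "__main__":
    # Hypothetical tight race: 51% true support
    for max_error in (0.09, 0.01, 0.001, 1e-6):
        n = sample_size_needed(0.51, max_error)
        print(f"wrong-call chance <= {max_error}: poll about {n:,} people")
```

Under these assumptions, the required sample grows with the square of the critical value and shrinks with the square of the winning margin, so it is the closeness of the race, far more than the level of certainty demanded, that drives the numbers up.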

One possibly relevant level of certainty which occurs to me is your probabilistic conviction, as a voter, that the politician or policy you’re voting for is the right one. I’d argue that, in the fuzzy world of politics and, in particular, people and parties (who have been known to renege on their pre-poll promises), our precision in these matters is fairly low. Is there any point being sure of the outcome of an election to higher than that level of certainty? Might that even make Nate Silver’s statistical pre-election analysis good enough?

Consider the electoral Devil advocated. Comments appreciated.

1. #### Bekah says (15:06 08/11/2012)

I suspect, regardless of the logic, the knee-jerk reaction would be a huge No! Other than maybe some people with maths backgrounds, I think most people would hate the idea of not being allowed to contribute their opinion and would associate it with the pre-universal-suffrage days, even though it would be a totally random sample rather than based on any criteria. You'd have to teach everyone a lot of statistics before it would be considered...

2. #### Luke Surl says (16:59 08/11/2012)

Human acceptance reasons aside, the election itself can be considered as a census which can be used to calibrate future models. Nate Silver used data from previous elections (and exit polls which wouldn't occur without elections) to convert poll numbers to decent predictions.
Without such a census every four years or so, such models would probably drift from reality within two election cycles.

3. #### Luke Surl says (17:09 08/11/2012)

Not to mention that the poll figures themselves were, prior to Silver getting hold of them, adjusted for response rate/likely turnout etc. using data from previous elections/exit polls. They weren't simply the raw results of random samples.

4. #### Statto says (17:21 08/11/2012)

I’m not sure that follows… The kinds of reasons that polling values sometimes differ systematically from election results either cease to apply, or aren’t relevant, if the mini-votes were known to replace universal ones.

For example, there’s the psychological difference between giving a non-binding voting intention over the phone and actually penning your irrevocable vote in the polling booth: if the polls were the actual elections, I’m not sure it’s obvious such differences would manifest? For example, there’s no pollster on the end of the phone to impress when you’re doing a proper secret ballot, and this vote is binding, not a whimsical chat.

Another potential source of bias is turnout…maybe people who vote and people who will talk on the phone aren’t perfectly overlapping groups? You’d need to account for that if trying to translate results between systems, but it’s not clear to me which, if either, system returns the ‘correct’ result. Neither can really be called a census, if indeed that’s what we’re aiming for.

Are there other sources of bias I’ve missed which might have a morally-relevant impact on the outcome?

5. #### Luke Surl says (18:31 08/11/2012)

Isn't the outcome of a census election by all those who turn up to vote the 'correct' result by definition? I'm not sure what else you're trying to assess from your sampling strategy.

There are a bunch of reasons you can anticipate that might bias even a binding sample (how readily people answer their phones for one!), and an unknown number of ones that you will not have anticipated. Declaring that you can negate all these using your sampling strategy, or factor them out in your post-processing, is presumptuous, especially as your last set of census election data to compare with becomes 4, 8, 12 years old.

6. #### Statto says (19:02 08/11/2012)

The outcome of an election by all those who turn up is only correct by definition if that’s how you choose to define your democratic process! Why are we trying to perfectly emulate the systematic errors of the current model in devising a new one? Every system has its biases: the equivalent of some people failing to answer the phone in current elections is that turnout decreases by one percentage point per kilometer from the polling station. I’d contend that a new system shouldn’t be morally bound to massage the stats to capture that…and, indeed, is the fact that we don’t currently apply any corrections to counts to make up for that wrong in itself?

The reason Silver and other pollsters need to apply statistical trickery is to shoe-horn results from one kind of survey into another: if we took one kind of survey and it was binding, there would be no need to do this. And the key point is that sample size rapidly ceases to be an important source of error, regardless of your chosen method of polling.

I’d probably go for some more general, fluffy definition of democracy, like trying to best gauge the will of the population. My proposal to reduce sample size is mechanism-agnostic, and I think democracy probably should be too. I’m not proposing any overly-detailed sampling strategy: it should be completely random.

I think you’ve got hung up on reproducing the results of the current system, but I don’t think it’s some paragon of virtue. It’s purely the needlessly massive sample size that I’m suggesting, and that Nate Silver et al. demonstrated, we don’t need.

7. #### Luke Surl says (19:48 08/11/2012)

Hmm, if you effectively made voting like jury service, randomly choosing voters from the electoral roll and making damn sure they replied, then yes, you could produce a result that is statistically valid within a quantifiable uncertainty range. [You could steadily increase sample size if the result was tight.] It's somewhat different from having today's pollsters choose the election!

This system, dating from Athenian democracy, is called sortition, and its Wikipedia article has the pros and cons (http://en.wikipedia.org/wiki/Sortition). That seems like a pretty decent article so I'll leave the discussion with that link.

