# Simulating the Presidential Polls

One of the problems I have with media descriptions of polling is that they take a single poll and make no attempt to compare it to similar polls. So if a poll shows Obama with a 2 point lead and has a standard error of 2 points, they assume that Obama and McCain are tied. And rather than do any kind of simple trend analysis, they run with the naive point of view and try to make a story out of it.

The problem is that if Obama indeed had a 2 point lead and the error in that analysis was about 2 percent, then McCain would be winning a lot more polls than he is at present. By my calculations, in that case Obama would win about 76% of the polls and McCain about 24%. Instead, Obama is winning the overwhelming majority of polls. For example, by my count, of the last 30 polls Obama has won 26, lost 2, and tied 2 (a ratio of 90.0%, counting each tie as half a win). Over the last 70 polls, by my count, Obama has won 62, lost 5, and tied 3, for a ratio of 90.71%. This winning ratio is not well explained by a 2 percent lead with a 2 percent standard deviation.
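As a rough check on the 76% figure, here is a minimal closed-form sketch in Python. It rests on an assumption that is not stated above: each candidate's share is treated as an independent normal draw with a 2 point standard deviation, so the lead itself has a standard deviation of 2√2 (about 2.83 points). That assumption happens to reproduce the numbers quoted here, but it is not necessarily how the calculation was actually done. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_probability(lead, sd_per_candidate):
    """Chance that one poll shows the leader ahead.

    Assumption (not from the original text): both candidates' shares are
    independent normal draws with the same standard deviation, so the
    lead has standard deviation sd_per_candidate * sqrt(2).
    """
    sd_of_lead = sd_per_candidate * math.sqrt(2.0)
    return normal_cdf(lead / sd_of_lead)


# A 2 point lead with a 2 point standard deviation: roughly 0.76
print(round(win_probability(2.0, 2.0), 3))

# A 3.7 point lead with the same standard deviation: roughly 0.90
print(round(win_probability(3.7, 2.0), 3))
```

Under this reading, a lead of about 3.7 points is what it takes to push the expected win rate up to the 90% range that the polls actually show.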

I have written a polling simulator named sim_polls.pl, whose source code is available here. If you run it with no parameters, it assumes a 2 percent advantage to Obama and a 2 percent standard deviation. You can pass different parameters to the program to simulate different advantages and different standard deviations of the advantage and then calculate how many polls a candidate should win given those variables. Any randomness is assumed to be normally distributed. Any systematic errors are ignored. These are all naive assumptions, but necessary at this point.
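For readers who want to experiment without the Perl script, the idea can be sketched in Python. This is not a port of sim_polls.pl. The function name, parameter names, and the choice to draw each candidate's share independently are all assumptions made for illustration, and systematic errors are ignored just as described above.

```python
import random


def simulate_polls(lead=2.0, sd=2.0, polls_per_run=70, runs=10000, seed=None):
    """Monte Carlo sketch: count how often the leader comes out ahead.

    Assumptions (not taken from sim_polls.pl): each candidate's share is
    an independent normal draw with standard deviation `sd`, and the two
    means differ by `lead`. Only the difference between the draws matters,
    so the trailing candidate's mean is set to zero.
    """
    rng = random.Random(seed)
    greater_total = lesser_total = 0
    greater_max = lesser_max = 0
    for _ in range(runs):
        greater = 0
        for _ in range(polls_per_run):
            leader = rng.gauss(lead, sd)
            trailer = rng.gauss(0.0, sd)
            if leader > trailer:
                greater += 1
        # Exact ties are essentially impossible with floats, so every
        # poll the leader does not win is counted for the trailer.
        lesser = polls_per_run - greater
        greater_total += greater
        lesser_total += lesser
        greater_max = max(greater_max, greater)
        lesser_max = max(lesser_max, lesser)
    total = runs * polls_per_run
    return {
        "greater_total": greater_total,
        "lesser_total": lesser_total,
        "greater_pct": 100.0 * greater_total / total,
        "lesser_pct": 100.0 * lesser_total / total,
        "greater_max": greater_max,
        "lesser_max": lesser_max,
    }


# Fewer runs than the default, to keep the example quick.
result = simulate_polls(lead=2.0, sd=2.0, runs=2000, seed=1)
for name, value in result.items():
    print(f"{name}: {value}")
```

With a 2 point lead and a 2 point standard deviation, the leader's share of poll wins in this sketch should land near 76%. Individual runs will vary with the seed.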

Sample results include:

    ./sim_polls.pl --verbose 3.7 2

    This program does a Monte Carlo simulation of the polling of
    the current election to determine if the media gabbing about
    the polls is in any way representative of the data presented.

    Currently - 8-24-2008 - using polls listed on pollster.com,
    Barack Obama is leading by a score of 62-3-5 over the last 70
    polls. This is a ratio of 90.71%.

    Despite this, people continue to insist that the race is dead
    even. This program simulates the distribution of 10000 occurrences
    of 70 polls by assuming that Obama has a 3.7 percent lead
    with a 2 percent standard deviation. These popularity
    profiles are assumed to be normally distributed.

    Greater total: 633265
    Lesser total : 66735
    Greater pct : 90.4664285714286
    Lesser pct : 9.53357142857143
    Greater max : 70
    Lesser max : 19