This is labeled as "Part 1" because if I have more email correspondence with pollsters, I will add it in further posts. Of course, if pollsters are afraid to respond, I will write about that, too. Sorry, Mr. Cooper... but this is my version of "Keeping Them Honest." :-)
Much has been written about the skewing or "oversampling" of democrat respondents in recent polls. I wrote about how I believe this is a form of psychological voter suppression being employed by the media: http://loudmouthelephant.blogspot.com/2012/09/voter-suppression-look-no-further-than.html. In short, the oversampling of democrats in polls allows the results to be skewed in favor of Obama, promoting a "Romney can't win this race" narrative in an effort to get republican voters to stay home since their efforts would be futile. CNN wrote a "debunk" piece of my theory here: http://politicalticker.blogs.cnn.com/2012/09/28/analysis-polling-criticism-unfounded/. And yes, I was skeptical of these polls before the usual conservative pundits got on board.
The interesting thing about CNN's article: they didn't actually debunk the theory. They simply said, "well, there is no 'conspiracy' (though I would never call this a conspiracy - more like an effort); look at the results of these polls," and they simply cited MORE skewed polls! Ugh. If they had given some examples of how the polling sample best reflects the electorate or recent party affiliation trends, their case would have held more water. They didn't, and in fact, they then released another ridiculously over-sampled pro-democrat poll: http://i2.cdn.turner.com/cnn/2012/images/10/01/rel11a.pdf (37% D - 29% R). I have found no information anywhere that shows this is how the country's party affiliation is spread. Shame on CNN for skewing another poll. Shame on CNN and their attempt at debunking my theory :-) Using the very thing being questioned as the rebuttal of the question is purely circular logic, and it's quite silly. I digress...
Moving on, I have kept an updated database of every poll I can analyze: http://loudmouthelephant.blogspot.com/2012/09/lme-presidential-poll-tracker-database.html
Today, while doing my latest round of poll deboggling, I came across this poll: http://weaskamerica.com/2012/10/02/horse-races/
It shows that though Mitt Romney is leading 50.7% to 35.3% among independents, somehow Obama has a 10.5 point lead in Nevada?! What?! I had to question these results, so I emailed the polling company We Ask America. The following email chain is the whole reason I am writing this post (personal information has been removed). Keep in mind, as you will see by my tone, I'm cordial and respectful. This was not a "gotcha" attack; I just wanted to understand the numbers:
My name is Michael XXXXXXX. I'm the chief contributor for the blog The Elephant in the Room (www.loudmouthelephant.com). I'm writing to learn of the sample you used for your Nevada poll this morning. The poll's results claim a 10.5 percentage point lead for Obama (52.5% - 42%). Though you did not release the information, my estimates for the voter demographic breakdown show the poll sampled approximately: 42% D - 28% R - 30% I. Is this true? Could you please email me the voting demographic breakdown? I can be reached at firstname.lastname@example.org. Thank you.
The First Response:
Our breakdown of Party ID in this poll was actually: 38% D - 37% R - 26% Ind. (rounding errors result in 101%). The real difference in Nevada is the relatively high number of Republicans who are supporting Barack Obama. That group is heavily concentrated in women from 45-64, a key voting demographic, and moves the numbers despite Romney having a sizable lead among Independents. As we've written before, we find those self-assigned party affiliation numbers to be a bit unreliable and use 60+ other criteria in our weighting.
We Ask America™ Polls
Naturally, I wrote back:
Thank you for getting back to me. I truly appreciate it.
I'm having a bit of trouble understanding the results of your poll. You stated that the poll's breakdown was 38% D, 37% R, and 26% I. I understand you said you weight things differently, but regardless, on sheer party affiliation, I'm not getting this to line up closely.
Below is my math; please let me know where I'm wrong (I've rounded properly):
Out of the poll's reported 1,078 likely voters:
38% D = 410
37% R = 399
26% I = 280 (I know this doesn't total 1,078, but that creates only a small issue).
Based on the "which party voted for whom" part of your results seen here: http://weaskamerica.com/2012/10/02/horse-races/ - the race should look like:
D = 410 x 86.4% = 354 votes
R = 399 x 18.7% = 75 votes
I = 280 x 35.3% = 99 votes
Total Obama votes = 528 votes / 1,078 Sample = 48.98%
D = 410 x 11.8% = 48 votes
R = 399 x 78.2% = 312 votes
I = 280 x 50.7% = 142 votes
Total Romney votes = 502 votes / 1,078 Sample = 46.57%
This is nowhere close to the 52.5% O - 42% R results your poll shows. In fact, if there was an error, and it was really 38% D, 26% R and 37% I that was polled, the results would still be 50.8% Obama to 43.6% Romney. I cannot figure out how you got to the 10.5 point Obama lead you guys are claiming. Perhaps I'm wrong, and I'm open to hearing how. Please let me know.
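For readers who want to rerun the arithmetic in the email above, here is a minimal Python sketch. The names (`PARTY_SHARE`, `SUPPORT`, `topline`) are illustrative and not from any polling tool; the shares and crosstab percentages are the ones quoted in the email, and the sketch assumes rounding to whole respondents at each step, as the email does. It only reproduces the simple party-ID calculation and says nothing about how We Ask America's own weighting works.

```python
SAMPLE = 1078  # likely voters as originally reported by the pollster

# Party ID shares from the pollster's first reply (they sum to 101%)
PARTY_SHARE = {"D": 0.38, "R": 0.37, "I": 0.26}

# Candidate support within each party, from the published crosstabs
SUPPORT = {
    "Obama": {"D": 0.864, "R": 0.187, "I": 0.353},
    "Romney": {"D": 0.118, "R": 0.782, "I": 0.507},
}


def topline(party_share, support, sample):
    """Return each candidate's percentage of the sample, using only party ID."""
    # Respondents per party, rounded to whole people
    counts = {party: round(sample * share) for party, share in party_share.items()}
    result = {}
    for candidate, by_party in support.items():
        # Votes per party, rounded to whole votes, then summed
        votes = sum(round(counts[party] * by_party[party]) for party in counts)
        result[candidate] = round(100 * votes / sample, 2)
    return result


print(topline(PARTY_SHARE, SUPPORT, SAMPLE))
```

Because the three party counts add up to 1,089 rather than 1,078, the percentages are slightly overstated for both candidates; the gap between them is what matters for the comparison.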
The Second Response:
Here's the problem: our weighting formula is extraordinarily complex and involves more than 5,500 fields of data that we feel help us hone in on the real picture. You're basing your analysis only on the numbers you've seen, while we're basing ours on a slug of data that has proven correct more times than not in the past. One of the problems all pollsters are having now is dealing with Party Affiliation in an atmosphere where people are shifting their loyalties like cars change lanes on a freeway. As I've written before, we don't use these self-described affiliations in our weighting. You're trying to line up numbers without having the full array of data. Plus, I've asked my data guy to look into the number of responses. We conducted 24 polls last night…this wouldn't be the first time we messed up the response numbers on our public polls. I double checked the result percentages, but not the responses. When you're a two-man operation, this stuff happens.
I'll get back to you.
Which he did while I was at lunch:
Here's an update:
I did indeed mess up the number of responses: it was actually 1,151 (I'll fix it online shortly). That doesn't help you much, though. This might:
Before weighting, the results showed the presidential race with a five point spread for Obama, and a four point spread in favor of Heller in the Senate race. Our weighting moved the numbers as we reported.
Could it be wrong? Absolutely. We'll go back into the field soon to test it again. But we're not giving up on our proprietary weighting system. With it, we were recognized as the most accurate pollster in the nation during the primaries by two independent groups.
Keep in touch; your ideas and review are refreshing and honest. (end of email, no sig)
I wrote back:
First, thank you again for corresponding with me. It’s nice to know that you’re willing to reply to an outsider. Also, I hope my email didn’t come off as facetious. I have no ill will, and I am not trying to be nasty or demeaning. I hope I didn’t come off like that. I simply keep track of the polls on my blog, and I break down the numbers as they’re reported and run them through my calculator.
It’s interesting to know that you use 5,500 fields for your polls. I’m really curious as to how your weighting algorithms work, but, as I’m sure it’s proprietary information, you probably keep your secret formula close to the chest. Either way, and again, I’m not trying to be “snippy,” I just find it hard to believe that with a 38/37/26 D/R/I spread, somehow the numbers come out to 52.5%-42% given the breakdown of how each party voted. I do understand that people’s affiliations change, but, for the sake of the “who voted for whom,” wouldn’t that be static for this poll? I mean, wouldn’t you report the “who voted for whom” as they are listed? Why the additional weights? For example, in your poll, 86.4% of democrats voted for Obama while 35.3% of independents did. How can it be weighted (with the 2% of republicans that voted for the President), that with the 38/37/26 spread, the total Obama vote comes out to 52.5%? Using purely “how they voted weights,” I came up with the aforementioned 48.98%. A result of 52.5% jumps up the “weighting” of your poll by 7.4% for Obama. Conversely, with Mitt Romney’s voting demographic, your “weighting” drops Mitt Romney’s results by 9.9% (your 42.0% result vs my 46.57% result). How can this be? What exists that, on top of the “how they voted part,” some other information shows that Romney’s results would drop by 10% while Obama’s would jump by about 7.5%?
I would never question the proprietary knowledge of your polls, and I’m not a pollster myself. I’m just curious as to what exists that causes you to use these weights. I also find it interesting that you’re a two-man operation and you’ve made it to RCP (that’s where I found your poll). That’s really pretty cool, and I had no idea about the size and scope of your operation. I own the blog The Elephant in the Room, but I essentially run a two-man show, too. I’m just an economist who typically, though not always, writes about economic/political issues. I get thousands of visitors a day, and many people recently have been flocking to my un-skewed polls database.
Thank you again for getting back to me. If you write back, that would be great. If not, I understand.
He wrote back:
There is no way for me to give you the answer you want without providing you our weighting matrix (and you've been a gentleman about not asking for it). I realize that you'll not find that to be a satisfactory answer and that it has resulted in you trying to put a jigsaw puzzle together without all the pieces.
While we've rechecked our processes on this poll, my data guy reminded me that this one may be an outlier, and that I questioned him about it when the results came in. Therefore, we're going to look at the possibility of getting back into Nevada soon to test it out. In the meantime, the upcoming debates could throw all these results to the wind.
We Ask America™ Polls
My Final Response:
Is there anything you can give or show that explains how your company reaches these results without showing your proprietary algorithms? I think you can understand my concern: I use your 38/37/26 DRI spread and come up with nothing close to your results. Your poll doesn't explain how, and it essentially says "trust us." What prevents a polling company from doing a 33/33/33 DRI spread with equal "who voted for whom" results and publishing them as "Obama trounces Romney - trust us?" Again, I'm not being facetious, and that has never been my attempt; but I hope you can see where I'm coming from. I "numbers check" every poll, and I'm not sure what kind of weighting possibly exists to create the results you have. Is there anything you can share at all?
I'm sure he will write back. So what did I learn? First, I truly appreciate Gregg getting back to me. He was polite and respectful, and he took nothing personally. He certainly didn't have to be. I've confronted many reporters and bloggers, and many times I'm met with nastiness. This was not the case. In this tense, hyper-partisan world, he took no cheap shots, and he accused me of taking none. Secondly, it appears there is some sort of weighting system involved in polls. I don't know if this truly makes sense, though. Why does a weighting system exist? If you sample people as they are polled and find a large skew in the sample (say, a plus-10 democrat sample), that's what you found. Does it represent the population as a whole? Probably not. I've checked my math over and over. I used We Ask America's numbers. I used their "who voted for whom" information. I used simple math, and what I found is a much different story. Will I be able to get at the heart of their weighting? No. My "weighting" thesis showed, as you can see in my original email to Gregg, a sampling mix of 42% D - 28% R - 30% I. You can check my math, too. Please tell me where I'm wrong.
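To make that check easier, here is a small Python sketch of the calculation behind the 42/28/30 estimate. The names (`SUPPORT`, `ESTIMATED_MIX`, `implied_topline`) are made up for illustration; the crosstab percentages are the ones We Ask America published, and dividing by the sum of the shares is an assumption added so that a mix that does not total exactly 100 still works. This is only the simple party-ID arithmetic, not the pollster's proprietary weighting.

```python
# Candidate support within each party (percent), from the published crosstabs
SUPPORT = {
    "Obama": {"D": 86.4, "R": 18.7, "I": 35.3},
    "Romney": {"D": 11.8, "R": 78.2, "I": 50.7},
}

# Party mix (percent) estimated in the original email to the pollster
ESTIMATED_MIX = {"D": 42, "R": 28, "I": 30}


def implied_topline(mix, support=SUPPORT):
    """Weight each party's candidate support by that party's share of the sample."""
    total = sum(mix.values())  # normalize in case the shares do not sum to 100
    return {
        candidate: round(sum(mix[p] * pct for p, pct in by_party.items()) / total, 1)
        for candidate, by_party in support.items()
    }


print(implied_topline(ESTIMATED_MIX))
```

With the 42/28/30 mix, the implied result is about 52.1% Obama to 42.1% Romney, which is close to the published 52.5% to 42%. Swap in any other mix to see what topline the same crosstabs would imply.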
This is nothing against Gregg personally, but I can't sit and accept "the poll is correct, our numbers work, so just believe us." I never have, and I never will. I will always investigate. Any further correspondence I receive from pollsters will be posted as well. Please share your thoughts below. Thank you.