Dear US Pollsters: Data is for understanding complexity, not divining certainty.
As someone whose work revolves around surveys, I have always been struck by how right they can be and, at the same time, how incomplete, almost regardless of statistical significance, robust sampling, or anything else.
Ask a sizeable group of people 10 to 20 well-designed questions and there will be some interesting, maybe even compelling, convergence. However, there will also be divergence: anomalies, inconsistencies, gaps. That’s where it gets interesting.
I am always more interested in those anomalies, inconsistencies, and oddities than in the rather pat sense of convergence. People will say, ‘Ah, but 85% of respondents said this, and that’s within a 95% confidence interval. That’s what we should pay attention to, because we (think we) can measure certainty.’
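To make that concrete, here is a minimal sketch (in Python, with purely illustrative numbers) of what that 95% figure actually covers. The margin of error quantifies sampling variability alone, under the assumptions of a simple random sample and honest answers; nonresponse bias, coverage gaps, and question framing all live outside it.

```python
import math

# Illustrative numbers: suppose 850 of 1,000 respondents chose an option (85%).
p, n = 0.85, 1000
z = 1.96  # ~95% confidence under the normal approximation

# Margin of error for a simple random sample. This captures sampling
# error only; it says nothing about nonresponse, coverage, or framing.
moe = z * math.sqrt(p * (1 - p) / n)
print(f"85% +/- {moe:.1%}")  # about 85% +/- 2.2%
```

The tidy interval is real, but it measures a narrow kind of uncertainty; the anomalies it brackets out are usually where the story is.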
I’ve always been punk rock. The bland consensus of the masses is less interesting than the outliers, the strange, the data point that, when you really look at it, makes you wonder about all the rest.
I love data. I love how it tells stories and how we can use it to dig into nearly every aspect of performance. Yet the tendency to treat data as a means of calculating certainty is misguided. Data’s real value lies in surfacing difference and diversity. It is a tool that helps us ask better questions, peel away layers of assumption, and begin to reach a much more nuanced understanding of complex issues.
Let’s hope pollsters get with this and start applying their skills to understanding the complexity of the electorate rather than trying to put everyone into simplified boxes.