Can NoSQL Solve Polling Dilemma?
The stunning upset of Eric Cantor in the Virginia congressional race was stupendous largely because the pollsters got it so wrong: by “18 standard-deviation points,” said one analyst. It was “pollster malfeasance” by the polling firm McLaughlin & Associates, and it should get the equivalent of the “pollster death penalty,” said a commenter on Frank Luntz’s New York Times piece.
But to the man or woman on the street in Virginia, it was not that much of an upset: people did not like Cantor. In a similar scenario, David Cameron is up against Ed Miliband, and one columnist wrote that political analysts can’t grasp “the sheer scale of Ed Miliband’s political incompetence and stupidity.”
Why are pollsters so tone-deaf to those emotions? Well, the problem right off the bat is that polls are highly structured: Do you like this or that? Are we headed in the right direction: all the time, some of the time, none of the time? The speed at which polls are taken (we are stamping our feet for a response before the last data point is entered) has created an over-reliance on structured answers.
Structured answers have made it easier to tabulate and analyze information, but only for answers that fit a specific structure. Asked “Do you prefer chocolate or vanilla ice cream?” I would answer chocolate … unless I want to put toppings on my ice cream, or unless I put ice cream on pie, or unless it is my son’s home-made vanilla that is to die for. At the end of a year I have had at least three times as much vanilla as chocolate!! (And actually more, because my son makes it every weekend.)
The narrative holds the key, and the truth. In healthcare, companies like M*Modal are helping healthcare providers analyze physician notes to determine the best course of patient care. M*Modal can do this efficiently because it has created a linguistic engine that analyzes the words used to determine context, and creates rich, hierarchical XML metadata that then flows into MarkLogic for search and analysis.
Could pollsters do the same? Absolutely. They could be mining comments, conducting live interviews (which they should be doing anyway, but that is for someone else to write about), having conversations! The recipe is pretty simple: transcribe notes, then run the text through vocabularies and thesauri so that we know “He’s an idiot, moron, obtuse, incompetent, shill” maps to a strong negative.
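To make that recipe concrete, here is a minimal sketch in Python of the vocabulary-and-thesaurus step. Everything in it is an assumption for illustration: the tiny THESAURUS and CONCEPT_SCORES tables, the score thresholds, and the function names are all hypothetical, and a real linguistic engine would be far richer than a word lookup.

```python
import re

# Hypothetical thesaurus: maps individual words to a shared concept.
# A real vocabulary would be far larger and context-aware.
THESAURUS = {
    "idiot": "insult",
    "moron": "insult",
    "obtuse": "insult",
    "incompetent": "insult",
    "shill": "insult",
    "honest": "praise",
    "capable": "praise",
}

# Hypothetical weights: how strongly each concept leans negative or positive.
CONCEPT_SCORES = {"insult": -2, "praise": 2}


def score_comment(text):
    """Sum the concept weights of every known word in a transcribed comment."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(CONCEPT_SCORES[THESAURUS[w]] for w in words if w in THESAURUS)


def label(score):
    """Turn a numeric score into a coarse sentiment label (thresholds are arbitrary)."""
    if score <= -4:
        return "strong negative"
    if score < 0:
        return "negative"
    if score == 0:
        return "neutral"
    if score < 4:
        return "positive"
    return "strong positive"


comment = "He's an idiot, moron, obtuse, incompetent, shill"
print(label(score_comment(comment)))
```

The point of the sketch is only that free-text answers can be reduced to something tabulated alongside structured ones; it ignores negation, sarcasm, and context, which is where a proper linguistic engine earns its keep.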
Humans being what they are, no poll will ever be 100 percent accurate. But 18 standard-deviation points is pretty much in the realm of flipping a coin. And with the amount of coin being spent on polling, I am fairly certain clients want better odds of gaining true insights into public opinion.