Monday, November 20, 2006

"Democrats Gain On GOP In New Race: Who's Got The Best Database" (UPDATED)

InformationWeek:
Pundits have ascribed the sweeping Democratic victories in this year's midterm elections to a number of factors, including dissatisfaction with the war in Iraq. But Democrats had at least one more thing going for them: microtargeting. That's politico-speak for their mining of vast databases to target likely Democratic voters. The Republicans have built and mined their own "Voter Vault" databases since 2002, helping drive the 2004 presidential victory. While the Democrats had similar databases, until these latest elections they were rife with problems, such as incorrect address fields that had Florida residents living in the city of Fort and the state of Lauderdale, and data errors that resulted in more names being listed for Colorado than there are state residents.

The Democratic National Committee spent $8 million this time around on a multiterabyte relational database from Netezza. Instead of assembling an Oracle database, EMC storage, and IBM servers, Netezza's Performance Server stores, filters, and processes terabytes of data within a single Linux-based appliance, installed in hours rather than weeks and at lower cost, says Gus Bickford, a consultant who helped implement the DNC database.

Between 60% and 70% of the system's data came from InfoUSA, which sells data on voters' income, age, address, home value, telephone numbers, vehicles, bankruptcy filings, mail order purchases, marital status, and more--including such "lifestyle" information as whether they like auto racing or motivational speakers. The rest came from commercial and public databases.

One scenario for microtargeting goes like this: Female cat owners tend to vote for Democrats, as do the majority of married women with children. So if you see a woman at the polls with a ring on her finger, a toddler in her arms, and cat dander on her jacket, you probably know for whom she's voting. Using the data it acquired, chances are the Democratic machine also knows who this lady is and has bombarded her with specially crafted phone calls, mail, and TV ads.

Leading up to previous elections, the DNC's data cleansing was unorganized and often manual. This time around, the Dems bought software from Business Objects unit Firstlogic that allows for the creation of rules to automatically scan and correct data, making sure addresses and phone numbers are formatted correctly or people's nicknames are recognized.

Meantime, Harold Ickes, former deputy chief of staff for President Clinton, set up a separate database, called Catalist, for America Votes, a coalition of Democratic groups that targeted elections in battleground states. One source says the rival DNC and Catalist data-gathering efforts will come together eventually.

The DNC's voter file contains 300 million records with up to 900 fields per record, everything from voting history to purchasing power to whether the voter has a hunting license. It can handle 30 to 40 queries at once, automatically cleans up dirty addresses, and crunches numbers up to 20 times faster than it did in the past. Lists will get rebuilt three times a year and could quadruple in size in two or three years as voters move and new data flows in.

Ken Strasma, president of Strategic Telemetry, a microtargeting company that works with the Democrats, says the technique may have tipped the U.S. Senate races in Virginia and Montana by identifying voters in bright red counties that may have otherwise been overlooked. However, the Democrats still lag the Republicans in volume of data and in experience. "It's good for us, just as it is in any industry, to go out and make our universe larger," consultant Bickford says. After all, the 2008 presidential race is just around the corner.

UPDATE: Keith Goodman continues his series of posts on microtargeting with "GOP Microtargeting Doesn't Live up to the Hype (Republican Perspective)":
In my final pre-election post, I wrote that the Republicans faced a major problem in 2006 with their microtargeting and 72-hour plan:

One of the interesting features of microtargeting is that the intelligence is a snapshot in time. That is, microtargeting is built from a large poll that is, in most cases, conducted MONTHS before an election.

If the environment is static, the microtargeting will be pretty accurate on Election Day. But what happens when conditions change between the date the microtargeting is completed and the election? This is precisely the problem facing the Republican 72-hour plan.


Now that the election is over, Republicans have started talking about this problem. One of my favorite quotes from this article, referring to the 2005 VA governor's race:

What I found was frightening and mind-boggling. Voters who were microtargeted and estimated to be pro-life at a 75%-90% confidence rate (meaning 75%-90% chance that a voter will vote for a candidate who supports the pro-life position), were actually measured at an average rate of 45%-50%—suggesting that half of our get-out-the-vote turnout universe would be voting for the Democrat candidate for governor, Tim Kaine.

Now, this is no disgruntled peon. His bio from the article:

Mr. Stutts served as the national 72-Hour/GOTV director during the 2004 Bush-Cheney re-election campaign. He is president of Phillip Stutts & Company, a political consulting firm.


Yet more evidence that Republican microtargeting and the 72-hour plan are not as great as advertised. In fact, this Republican perspective makes it highly likely that Democratic microtargeting was far more effective in 2006.
