Garbage In, Garbage Out. We all know the idea when we hear it, yet “GIGO” remains an overlooked topic in this Big Data era. I could not resist sharing this piece by Nate Silver on his website, fivethirtyeight.com, which explains some of the factors that made sports the first real industry to catch fire in this arena. So as we look at tonight's Oscar predictions and other data-driven predictions in our respective industries, remember the three big reasons, according to Mr. Silver, why sports produces such accurate models while other fields can fail. Can we apply these rules to our own garbage data?
- Sports has awesome data
- In sports, we know the rules
- Sports offers fast feedback and clear marks of success
Do Oscar predictions built on big data and social media (for the sake of this argument, let’s call this a form of big data) fit these criteria?
1. The Oscars have some good historical data. A more detailed description of the fivethirtyeight.com model can explain this further, but with plenty of records of past winners, past voters and their correlations, the Oscars pass the first test.
2. This is where Oscar predictions fall a little flat. The rules of voting are pretty subjective in their own right, and while the Grammys may have a few more surprises, the 6,000 Academy voters fall into many categories with many differing motives. This makes the “rules” of voting a little suspect.
3. At least the Oscars win on this front: #Oscars might even tell you before the 5-second TV lag does!
So just remember, when looking at big data for your predictions: do you really know the quality of the data you are viewing?