Like Google Flu Trends, Can Mining Automobile Forums Reveal Defects?

Could mining car forums for motorists' reports of vehicle problems help find defects before they turn fatal?

Big Data is a tool that appears poised to help solve many thorny problems.

Google Flu Trends uses algorithms built on search queries that correlate closely with flu activity to estimate flu outbreaks before the Centers for Disease Control can report them, by mapping the Google searches of people with flu symptoms.

GM’s ignition-switch defect has to date been linked to 51 deaths in related accidents, and exploding Takata air bags have turned up in Hondas and in vehicles from other manufacturers. In the wake of both, mining data from car forums where motorists can report such incidents is now seen as another early step toward saving lives and flagging defects even before car manufacturers learn about them.

Flagging such comments and incidents from car forums and other social platforms might serve as an early warning system for potentially catastrophic events: a kind of social media version of the canary in the coal mine.

How might this data mining work?

By using so-called “smoke words,” derived from the campfire “smoke signals,” data mining experts can look for words or phrases that are substantially more prevalent in posts about defects.
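
As a rough illustration of what "substantially more prevalent" could mean in practice, here is a minimal Python sketch. It assumes you already have some posts hand-labeled as defect reports and some that are not, and it ranks words by how much more often they appear in the defect group. The function names, the add-one smoothing, and the min_ratio cutoff are all assumptions made for this example, not the method used in any particular study.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def smoke_word_candidates(defect_posts, other_posts, min_ratio=2.0):
    """Return (word, ratio) pairs for words relatively more common in defect posts.

    ratio = smoothed word rate in defect posts / smoothed word rate in other posts.
    The min_ratio cutoff is an arbitrary illustrative threshold.
    """
    defect_counts = Counter(w for p in defect_posts for w in tokenize(p))
    other_counts = Counter(w for p in other_posts for w in tokenize(p))
    vocab = set(defect_counts) | set(other_counts)
    # Add-one smoothing so words absent from one group do not divide by zero.
    defect_total = sum(defect_counts.values()) + len(vocab)
    other_total = sum(other_counts.values()) + len(vocab)
    scored = []
    for word in defect_counts:
        defect_rate = (defect_counts[word] + 1) / defect_total
        other_rate = (other_counts[word] + 1) / other_total
        ratio = defect_rate / other_rate
        if ratio >= min_ratio:
            scored.append((word, ratio))
    # Highest ratio first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

With only a handful of posts, filler words such as "and" can sneak onto the list, so a real analysis would need far more data and most likely a stop-word filter.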

That’s different from sentiment analysis, the practice of monitoring and mining online comments for consumer feelings about a product or topic, says journalist Dina Kraft, formerly of the Transparency Policy Project at the Ash Center for Democratic Governance and Innovation at the Harvard Kennedy School.

Sentiment posts and defect reports typically read very differently. People expressing sentiment about a product generally use emotive words, as in “this product is great, fantastic, the best I’ve ever seen.” Forum writers reporting defects are more succinct, according to a Virginia Tech study of smoke words. So data miners looking for potential defects might see something like “Car engine turned off.”

In the Virginia Tech smoke words study, mechanical engineering graduate students with concentrations in vehicle safety analyzed and categorized sample comments and came up with a list of smoke words, which were then incorporated into text-analytics algorithms. Once a component or defect comes to light, the study’s authors reported, it can be investigated and, if found to be unsafe, corrected or removed from the vehicle.
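
To give a feel for how a smoke-word list might plug into a text-analytics step, here is a small Python sketch that scores each forum post by the share of its words that are smoke words and flags the posts above a cutoff. The word list, the function names, and the 0.15 threshold are all hypothetical, chosen only for illustration; they are not the list or the algorithm from the Virginia Tech study.

```python
import re

# Hypothetical smoke words for illustration only; not the study's list.
SMOKE_WORDS = {"stalled", "failed", "exploded", "shut", "leak", "fire"}

def smoke_score(post, smoke_words=SMOKE_WORDS):
    """Return the fraction of a post's words that are smoke words (0.0 if empty)."""
    tokens = re.findall(r"[a-z']+", post.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in smoke_words)
    return hits / len(tokens)

def flag_posts(posts, threshold=0.15):
    """Return (score, post) pairs at or above the threshold, highest score first."""
    scored = [(smoke_score(post), post) for post in posts]
    flagged = [(score, post) for score, post in scored if score >= threshold]
    return sorted(flagged, key=lambda pair: -pair[0])
```

Under these assumptions, a terse report like "Engine stalled and airbag failed" would be flagged, while an emotive review like "This car is great, the best I've ever seen" would not. Flagged posts would still need a human reviewer before anyone drew conclusions about a defect.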

This seems like a good idea not only for predicting auto-related safety issues but for other industrial safety situations as well. What if chemical plant safety reports, first-aid accounts, or other EHS (Environmental Health and Safety) reports were scoured for smoke words? Could more serious OIIs (Occupational Illnesses and Injuries) be prevented or lessened? How about applying such a smoke-word analysis to other industrial occurrences that could endanger employee health and safety?



