Using big data to play Jeopardy®: why big data analytics matters

By: Spence

It may seem like some researchers within IBM just decided to have a fun time, but in reality the technology that lets IBM's Watson system play a popular television game show illustrates sophisticated data and text analytics. These are technologies, it turns out, that jStart has been engaged with for over a year and a half.

How Watson Uses Big Data

Think about it for a second: what would a machine have to do to replicate what a human does when answering a question? Quite a bit, it turns out. First, the person must hear the question and process it into concepts that can be understood (natural language processing, in computer parlance). Then the person must consult their stored knowledge to see whether they can find enough information to make at least an educated guess. Once the person has determined their level of confidence in the answer, they need to decide whether they are confident enough to ring in and put some money on the line. If they do ring in, they must turn the answer into speech and, in Jeopardy's case, frame it as a question. Now imagine a computer doing all of those things. It would definitely be, as some IBM researchers put it in an understated way, "non-trivial".

What's interesting from the jStart perspective is that Watson uses sophisticated data and text analytics to understand the answer Jeopardy supplies, then applies high-level statistical analysis to determine how confident it is in its response and, when that confidence meets a threshold, to place a bet that it is right. Since 2009, jStart has been working with IBM researchers on text analytics as well as data analytics. From those collaborative efforts, the team developed its own technology, called BigSheets, as an attempt to apply sophisticated analysis to Big Data scale challenges.
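
To make the confidence-threshold idea concrete, here is a minimal sketch in Python. It is not Watson's actual implementation: the Candidate class, the decide_response function, the 0.5 threshold, and the sample confidence values are all hypothetical, chosen only to illustrate the decision described above (score the candidate answers, pick the most confident one, and ring in only if it clears a threshold).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """A hypothetical candidate answer with an estimated confidence in [0, 1]."""
    answer: str
    confidence: float


def decide_response(candidates: List[Candidate], threshold: float = 0.5) -> Optional[str]:
    """Return a Jeopardy-style response, or None to stay silent.

    The threshold is an illustrative assumption, not a documented Watson value.
    """
    # Pick the candidate with the highest estimated confidence, if any exist.
    best = max(candidates, key=lambda c: c.confidence, default=None)
    # Stay silent when there is no candidate or the best one is too uncertain.
    if best is None or best.confidence < threshold:
        return None
    # Jeopardy responses must be phrased as a question.
    return f"What is {best.answer}?"


if __name__ == "__main__":
    scored = [Candidate("Toronto", 0.14), Candidate("Chicago", 0.82)]
    print(decide_response(scored))
```

Raising the threshold makes the player more cautious: it rings in less often but is wrong less often when it does, which is the trade-off any contestant, human or machine, has to manage.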

Watson Goes Up Against 'The Humans' on Feb. 15th, 2011

Watson during a trial run with former Jeopardy champions

jStart is engaged with a number of clients in proof-of-concept prototypes leveraging IBM Big Data and sophisticated analytics tooling. We're also exploring how to connect this tooling to even more sophisticated analytics systems developed by IBM's newly acquired analytics company, SPSS. How can these tools be used to solve real problems that exist today? Based on jStart engagements and experiences, we've put together a number of case studies and scenarios illustrating just how these technologies can be used today.

For more details on jStart's involvement in Big Data, you can take a look at our Big Data page, as well as our Text Analytics and Data Analytics pages.

