University of Southern California, Annenberg Innovation Lab Video

IBM Big Data solutions help USC gain quick and efficient data analysis

Published on 24-Sep-2012

"If we, for instance, are looking at a political debate, we could be analyzing 200 tweets a second, so that’s a pretty awesome task that obviously only natural language processing tools, like the kind that IBM builds, can do." - Professor Jonathan Taplin, Director of the USC Annenberg Innovation Lab

Customer:
University of Southern California, Annenberg Innovation Lab

Industry:
Education

Deployment country:
United States

Solution:
Big Data, Big Data & Analytics, Big Data & Analytics: Operations/Fraud/Threats

Overview

USC Annenberg Innovation Lab is using IBM InfoSphere Streams - part of IBM's big data platform - to conduct sophisticated analytics and natural language recognition to gauge positive and negative opinions shared in millions of public tweets.

Business need:
USC’s Annenberg Innovation Lab is a “think-and-do tank,” says Professor Jonathan Taplin, director of the USC Annenberg Innovation Lab. To understand what insights predictive analytics could offer, the lab launched a research project that analyzed social media conversations related to new movie releases. According to Taplin, one of the biggest challenges in conducting predictive analytics on social conversations is the amount of data.

Solution:
The Annenberg Innovation Lab uses IBM InfoSphere Streams software, an IBM big data solution, running on IBM servers. The software helps process, filter and analyze the millions of Twitter messages and Facebook posts in near real time and uses built-in natural language processing (NLP) capabilities to determine whether each message is positive or negative. Students also use InfoSphere Streams software to archive data, making it easier to search for information from a particular timeframe.
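The filter, classify and archive steps described above can be illustrated with a minimal Python sketch. It does not use InfoSphere Streams or its NLP capabilities: the word lists, the `classify` and `process` functions, and the sample messages are all hypothetical stand-ins chosen for illustration, and a simple word count takes the place of real natural language processing.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical word lists; a production system would use a trained NLP model.
POSITIVE = {"great", "love", "awesome", "win"}
NEGATIVE = {"bad", "hate", "boring", "lose"}


def classify(text):
    """Label a message positive, negative or neutral by counting word-list hits."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def process(messages, keyword):
    """Keep messages that mention the keyword and tally their labels per minute."""
    buckets = defaultdict(Counter)
    for timestamp, text in messages:
        if keyword.lower() not in text.lower():
            continue  # filter step: drop messages that are off topic
        minute = timestamp.replace(second=0, microsecond=0)
        buckets[minute][classify(text)] += 1
    return buckets


# Made-up sample messages as (timestamp, text) pairs.
sample = [
    (datetime(2012, 9, 24, 20, 0, 5), "Love the debate tonight!"),
    (datetime(2012, 9, 24, 20, 0, 40), "This debate is boring"),
    (datetime(2012, 9, 24, 20, 1, 10), "Watching a movie instead"),
    (datetime(2012, 9, 24, 20, 1, 30), "Great answer in the debate"),
]

result = process(sample, "debate")
for minute in sorted(result):
    print(minute, dict(result[minute]))
```

Because the tallies are keyed by minute, the counts for a particular timeframe can be looked up directly, which loosely mirrors the archive-and-search use described above.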

Benefits:
- Helped students predict political debate winners based on online conversations
- Demonstrated the impact of a TV ad campaign within a day of airing
- Expected to enable nations to monitor the “global pulse” and gain early notice regarding emerging health crises or civil unrest

Video Transcript


Professor Jonathan Taplin, Director of the USC Annenberg Innovation Lab

[00:12] The Annenberg Lab is what we call a “think and do tank.” It’s a kind of a digital laboratory where corporations like IBM, Verizon, Intel, lots of different companies, come together to try and think about what are the solutions for the next three to five years in the media sphere and how we can provide all sorts of new tools for people to understand social networks, to understand participatory culture, and understand how to make cities and other polities work better.

[00:53] The problem is that there’s too much data. We have what is called the Twitter fire hose coming into our lab, and, you know, if we, for instance, are looking at a political debate, we could be analyzing 200 tweets a second. That’s a pretty awesome task that obviously only natural language processing tools, like the kind that IBM builds, can do. We really started to jump into popular culture. The movie opens in the theaters, but that’s really only about 20% of the revenue that a movie takes in. What happens is there is what marketers in the movie business call the white space, which is this weird time between the time that the movie opened in the theaters and just before it comes out in home video. And at that point, you can begin to understand a combination of factors: the movie performance, the social buzz, the social conversation that’s still going on about the movie, the combination of advertising inputs and stuff like that. And that, we think, can be incredibly predictive.

[01:58] One of the coolest things that we’ve developed with the help of IBM, using a new product called Streams, is real-time analytics. To be able to look at a debate on a dashboard in real time as it's happening and see somebody made a mistake, and you see literally the sentiment change in seconds, is really powerful. It's like a million-person focus group. What if you could analyze a TV show like that? We took the Oscars broadcast, and we captured in the course of three hours about two million tweets. Those two million tweets during the Oscars told you exactly, at every second, what people thought about the programming, what they thought about the advertising, every piece of it, in real time. Where we want to go with this, obviously, is to be able to give TV producers that kind of amazing instant feedback. We think these are really powerful tools that will change the nature of a business that really, in terms of metrics and measurement, hasn’t changed since the 1950s.

Products and services used

IBM products and services that were used in this case study.

Software:
InfoSphere Streams
