Published on 29-Jul-2010
"By collecting data over time from a representative, carefully weighted sample of the population, we are developing a much better understanding of Danish food preferences. This information has significant value, and Text Analytics for Surveys has played a pivotal role in the success of our project." - Lars Aarup, Head of FDB Analysis, FDB
FDB is a not-for-profit organization with a broad constituency of some 1.7 million members, or nearly one-third of Denmark’s population. In addition to generating revenue from its 800 retail outlets and advertising operations, the organization sponsors far-reaching education and health awareness campaigns.
FDB needed a way to parse and categorize huge volumes of unstructured data from its ongoing survey of Danish food consumption patterns, which already includes information on 300,000 meals from a representative sample of the population.
FDB selected IBM SPSS Text Analytics for Surveys to complement its existing use of IBM SPSS Statistics. The organization recently added IBM SPSS Decision Trees to its toolkit.
“With some 52,000 responses a year to deal with, there is no way we could do this work without SPSS Text Analytics for Surveys.” —Lars Aarup, Head of FDB Analysis, FDB
• Enables FDB to develop a clear understanding of Danish food preferences
• Ability to handle anomalies in free-text responses facilitates accurate categorization
• Instrumented survey results can be used to foster greater health awareness
• Program can be adapted to meet specific business needs
• Commercial application of results includes more effective
If you happen to live in Copenhagen, your consumption of rolls for breakfast is distressingly low — at least in light of the national average for roll eaters in Denmark. If you’re a resident of Sydjylland to the west, by contrast, you know a good roll when you see one. Copenhagen is similarly finicky when it comes to medister, a Danish specialty food consisting of a thick, spicy sausage made of minced pork stuffed into pig intestines, while Fyn folk can’t get enough of it. These and other dietary statistics are now available via the interactive and engaging “Food-O-meter” recently launched by FDB, the cooperative member organization behind the largest retailer of fast-moving consumer goods in Denmark. The mad-O-meter (mad means “food” in Danish) invites the visitor to Vælg tema (select a theme) and then click on different regions of the map to display related bar charts and line graphs. FDB, which already had IBM SPSS Statistics in its toolkit, credits IBM SPSS Text Analytics for Surveys for the orderly data behind the scenes.
It’s a lot of data. Every day for the past year and a half, FDB has been asking 143 different people around the country what they eat during the day. The respondents provide the information in two different formats: free-text email — “I had scrambled eggs and bacon, milk, and coffee for breakfast” — and a listing of the specific ingredients that went into each meal. The mad-O-meter reflects just a tiny fraction of the enormous data set that has been amassed so far by FDB.
Text Analytics for Surveys to the rescue
Given the infinite variety of individual dietary choices, a traditional “check the box” questionnaire would be the size of a phone book, and nobody would fill it out. This makes Lars Aarup, head of FDB Analysis, even happier about the availability of a powerful program like Text Analytics for Surveys to handle open-ended responses. Using a survey methodology designed by Aarup himself — with sampling and weighting services provided by YouGov Zapera, a research and consulting organization — FDB Analysis has already collected data on more than 300,000 meals.
Initially, Aarup and his team considered managing all this data with desktop spreadsheets and database tools. Then one of Aarup’s colleagues made the happy discovery of Text Analytics for Surveys, with its open-ended text variables and unique identifiers. It was the perfect solution: Text Analytics for Surveys enables FDB to efficiently categorize all the free-text responses and then integrate the results of the text analysis with other, quantifiable data. “With this program, we can transform enormous quantities of unstructured survey data into quantitative data without having to read the text responses word for word,” says Aarup. “The program automates the process while still allowing us to intervene manually to refine the results; for example, we have adapted it to pull out only consonants, giving us the root form of the Danish word and making categorization easier. With some 52,000 responses a year to deal with, there is no way we could do this work without Text Analytics for Surveys.”
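The consonant-only adaptation Aarup mentions can be pictured with a small Python sketch. This is an illustration only, not the Text Analytics for Surveys implementation: the function names, the vowel list and the sample words are assumptions made for the example. The idea is that spelling variants differing only in their vowels collapse to the same consonant "skeleton" and can then be categorized together.

```python
# Hypothetical illustration of consonant-only matching.
# Not the SPSS product's implementation; names and vowel list are assumptions.
DANISH_VOWELS = set("aeiouyæøå")


def consonant_skeleton(word: str) -> str:
    """Lowercase the word and keep only its consonant letters."""
    return "".join(
        ch for ch in word.lower() if ch.isalpha() and ch not in DANISH_VOWELS
    )


def group_by_skeleton(words):
    """Group words that share the same consonant skeleton."""
    groups = {}
    for word in words:
        groups.setdefault(consonant_skeleton(word), []).append(word)
    return groups


if __name__ == "__main__":
    # "mælk" (milk) and two vowel-only misspellings fall into one group.
    print(group_by_skeleton(["mælk", "maelk", "melk", "kaffe"]))
```

A skeleton like this is deliberately crude: unrelated words can share consonants, which is presumably why the manual refinement Aarup describes remains part of the process.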
The program deals effectively with a broad range of anomalies inherent in free-text survey responses. “Egg” in Danish is properly spelled “æg,” but “agg” and “aeg” pop up frequently in the emails. “Fish” could refer to mackerel, tuna, eel — the list is endless. By applying the proper rules in Text Analytics for Surveys, FDB Analysis can ensure that every item drops neatly into the correct category. Says Aarup: “We have so many items and misspellings. It was only the ability of Text Analytics for Surveys to handle open-ended questions that made our project successful.”
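As a rough picture of what such rules do, the following Python sketch maps misspellings to a canonical term and then maps terms to categories. The rule tables, category labels and function name are hypothetical; the actual rule set used by FDB Analysis is not described in this case study.

```python
import re

# Hypothetical rules for illustration; the real FDB rule set is not published here.
# Misspellings observed in responses, mapped to the correct Danish spelling.
SPELLING_VARIANTS = {"agg": "æg", "aeg": "æg"}

# Canonical terms mapped to a category (makrel = mackerel, tun = tuna, ål = eel).
CATEGORY_OF = {"æg": "egg", "makrel": "fish", "tun": "fish", "ål": "fish"}


def categorize(response: str) -> dict:
    """Count how many words in a free-text response fall into each category."""
    counts = {}
    # [^\W\d_]+ matches runs of letters, including æ, ø and å.
    for token in re.findall(r"[^\W\d_]+", response.lower()):
        term = SPELLING_VARIANTS.get(token, token)
        category = CATEGORY_OF.get(term)
        if category is not None:
            counts[category] = counts.get(category, 0) + 1
    return counts


if __name__ == "__main__":
    print(categorize("Aeg og makrel til frokost"))
```

Words with no matching rule are simply ignored in this sketch; in practice they would be the candidates for a new rule.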
The value of knowledge
FDB is a not-for-profit organization with a broad constituency of some 1.7 million members, or nearly one-third of Denmark’s population. In addition to generating revenue from its 800 retail outlets and advertising operations, the organization sponsors far-reaching education and health awareness campaigns. FDB’s wealth of dietary data is likely to prove especially valuable in these latter areas. In fact, Aarup is currently touring the research institutes in Denmark to discuss possible joint projects related to health profiling, and a pilot project is underway. This is not to say that FDB’s findings lack commercial value. Aarup was recently approached by Coop (the massive retail conglomerate owned by FDB) and Carlsberg, the well-known Danish brewery. With the World Cup championships approaching, Coop and Carlsberg anticipate a dramatic spike in beer consumption and wondered how best to stock the store shelves. What do soccer fans like to eat while they’re enjoying a cold one from Carlsberg? The answer from FDB Analysis: potato chips, peanuts, popcorn, pretzels, and tasty morsels of fried, salted pig skin. Putting the beer and snacks next to each other is a good way to fill the market basket.
Other fascinating insights derive from combining FDB’s consumption data with related sales data for certain food items. Looking only at the sales data, one would logically conclude that Sydjylland residents drink remarkably less Coca-Cola and other soda. But thanks to FDB Analysis, the truth has come out: Because Germany has much lower taxes than Denmark on sugar, thrifty Danes simply hop across the border to purchase their high-fructose beverages. They actually drink more soda in Sydjylland than anywhere else in Denmark: some eight percent more than the average, to be precise.
Putting the data to work
The commercial and public service value of FDB’s food consumption analysis has yet to be fully exploited, but the potential is significant. FDB’s Republica A/S, the leading publisher of weekly advertising circulars in Denmark, could use the information to help Coop stores develop more effective advertisements. Educational material designed for the country’s public schools could foster more healthful food choices and ultimately a healthier population. Store selection and customer satisfaction could be enhanced across the Coop retail chain. Aarup is already looking forward to the next step in FDB’s ongoing food consumption research, having just purchased the IBM SPSS Decision Trees module. “Using multiple regression, I can enter all the data and get a clear picture of how various items are interconnected,” he says.
“Take coffee consumption: What makes a person want a cup of coffee? Are there any demographic factors at play? By creating classification and decision trees, this product will help us better identify groups, discover relationships between groups, and predict future events.” Aarup adds that, because the product makes it possible to present categorical results in an intuitive manner, these results can be more clearly explained to non-technical audiences. But for now, Text Analytics for Surveys is the star at FDB Analysis. “It is our job to gather knowledge that can be used for consumption policy, with the end goal of promoting healthy, eco-friendly lifestyle choices,” Aarup concludes. “By collecting data over time from a representative, carefully weighted sample of the population, we are developing a much better understanding of Danish food preferences. This information has significant value, and Text Analytics for Surveys has played a pivotal role in the success of our project.”
Footnotes and legal information
SPSS Inc., an IBM Company Headquarters,
233 S. Wacker Drive, 11th floor
Chicago, Illinois 60606
SPSS is a registered trademark and the other SPSS products named are trademarks
of SPSS Inc., an IBM Company. © 2010 SPSS Inc., an IBM Company.
All Rights Reserved.
IBM and the IBM logo are trademarks of International Business Machines
Corporation in the United States, other countries or both. For a complete list of IBM
trademarks, see www.ibm.com/legal/copytrade.shtml.
Other company, product and service names may be trademarks or service marks of others.
References in this publication to IBM products or services do not imply that IBM
intends to make them available in all countries in which IBM operates.
Any references in this information to non-IBM Web sites are provided for convenience
only and do not in any manner serve as an endorsement of those Web sites. The
materials at those Web sites are not part of the materials for this IBM product and use
of those Web sites is at your own risk.
© Copyright IBM Corporation 2010