Matching and searching data using InfoSphere Big Match for Hadoop

With InfoSphere® Big Match for Hadoop, you can efficiently derive and standardize master data, compare members, resolve members into entities, and do probabilistic searches.

To make the most of big data, you have to start with data you trust. But the sheer volume and complexity of big data mean that the traditional manual methods of discovering, governing, and correcting information are no longer feasible.

Transaction details, multichannel interactions, social media, syndicated data from sources such as loyalty cards, and other customer-related information are powerful new tools for creating a complete picture of customers’ preferences and demands. They are the keys to understanding and predicting customer behavior.

Scalability and performance in resolving master data records can become an issue, particularly when you load large lead lists and match those lists against known customers and prospects, or when you match records against large watch lists to combat threats and fraud. Analyzing large volumes of stored data enables organizations to discover previously hidden patterns and insights that allow them to optimize processes and profitability.

Consider an example. Social media postings might tell a hotel chain that a high proportion of its business guests have families that are too large for standard reward rooms. The hotel can then respond by providing reward privileges on larger suites for these high-value customers, potentially increasing their loyalty.

The Big Match capability runs as a set of MapReduce applications and HBase extensions within the Hadoop framework to derive, compare, and link large volumes of records, for example, one billion records or more. In general, the applications can run either automatically as you load data into your HBase tables or as batch processes on HBase tables that are already populated.
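The derive, compare, and link steps can be illustrated with a small, self-contained sketch. This is not the Big Match implementation or its API: it is plain Python that runs in memory, and the record fields, weights, threshold, and function names are all assumptions chosen for illustration. It shows the general idea only: standardize each record, score pairs of records field by field, and group records whose score reaches a threshold into one entity.

```python
from itertools import combinations

# Hypothetical sample records; field names are illustrative only.
RECORDS = [
    {"id": 1, "name": "Jon  Smith ", "phone": "(555) 010-1234"},
    {"id": 2, "name": "jon smith", "phone": "555-010-1234"},
    {"id": 3, "name": "Mary Jones", "phone": "555 010 9999"},
]

# Assumed per-field weights and link threshold (not product defaults).
WEIGHTS = {"name": 3.0, "phone": 4.0}
LINK_THRESHOLD = 5.0


def derive(record):
    """Standardize raw fields into comparable values."""
    return {
        "id": record["id"],
        # Lowercase and collapse runs of whitespace.
        "name": " ".join(record["name"].lower().split()),
        # Keep digits only.
        "phone": "".join(ch for ch in record["phone"] if ch.isdigit()),
    }


def compare(a, b):
    """Score a pair: add a field's weight on agreement, subtract it otherwise."""
    score = 0.0
    for field, weight in WEIGHTS.items():
        score += weight if a[field] == b[field] else -weight
    return score


def link(records):
    """Group records whose pairwise score reaches the threshold (union-find)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if compare(a, b) >= LINK_THRESHOLD:
            parent[find(b["id"])] = find(a["id"])

    entities = {}
    for r in records:
        entities.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(entities.values())


derived = [derive(r) for r in RECORDS]
entities = link(derived)
print(entities)  # [[1, 2], [3]]
```

Comparing every pair of records, as this sketch does, is practical only for small inputs. At the volumes described above, the work has to be partitioned across the cluster, which is why the product runs these steps as MapReduce applications over HBase tables.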

With the web-based Big Match Console, you can create and configure the algorithms that you want to use to process the data in your HBase tables. The Big Match applications use the algorithms you configure as instructions for how to derive, compare, and link the records in your tables.

Note: In the Big Match topics, content that is specific to a fix pack version is identified by change markers.

Last updated: 25 Jun 2015