Option D: Consider The Big Index API

The Big Index API is well suited for working with extremely large, distributed database repositories. It is available as a push connector component of the Watson Explorer Engine Large Database connector and provides an optimized configuration API for working with large databases.

The Large Database connector enables Watson Explorer Engine applications to crawl large database repositories and index the information that they contain. It is designed to be used with IBM InfoSphere Data Replication's Change Data Capture (CDC), a replication solution that captures database changes as they happen and delivers them to target databases and other applications. Additionally, the Large Database connector is designed to be installed in a CDC environment that provides a JMS service provider to handle database messaging between CDC implementations and Watson Explorer Engine instances.
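The flow described above (changes captured at the source, delivered as messages, then applied to an index) can be pictured with a minimal sketch. This is an illustration only, not the Big Index API or the CDC message format: an in-memory queue stands in for the JMS provider, a dictionary stands in for the search index, and the names ChangeEvent and apply_changes are hypothetical.

```python
import queue
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured row change (hypothetical shape, not the CDC wire format)."""
    op: str                     # "insert", "update", or "delete"
    key: str                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents; None for deletes


def apply_changes(messages: queue.Queue, index: dict) -> int:
    """Drain the queue and apply each change to the index.

    Returns the number of events applied.
    """
    applied = 0
    while True:
        try:
            event = messages.get_nowait()
        except queue.Empty:
            break  # nothing left to consume
        if event.op == "delete":
            index.pop(event.key, None)
        elif event.op in ("insert", "update"):
            index[event.key] = event.row
        else:
            raise ValueError(f"unknown operation: {event.op}")
        applied += 1
    return applied


# The queue stands in for the messaging layer between the replication
# source and the indexing side.
messages = queue.Queue()
for event in (
    ChangeEvent("insert", "42", {"name": "Ada"}),
    ChangeEvent("update", "42", {"name": "Ada Lovelace"}),
    ChangeEvent("insert", "43", {"name": "Grace"}),
    ChangeEvent("delete", "43"),
):
    messages.put(event)

index = {}
applied = apply_changes(messages, index)
```

Because changes are pushed as they occur, the index is updated incrementally rather than by re-crawling the whole repository, which is the property that makes this style of connector attractive for very large databases.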

The Large Database connector is not appropriate for every data repository environment. It will not resolve issues where extremely large amounts of data are contained in databases that are not intended to handle large amounts of data. Additionally, if your database tables are smaller than 10 GB, you will probably not need to incur the infrastructure overhead of the Large Database connector, and can use one of the other database connectors that are included with Watson Explorer Engine.
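The sizing guidance above can be written down as a rough check. This is a sketch of the rule of thumb only: the function name, the return labels, and the treatment of a table that is exactly 10 GB are assumptions, and the result is a starting point for evaluation rather than a recommendation.

```python
SIZE_THRESHOLD_GB = 10  # rule-of-thumb threshold taken from the guidance above


def suggest_connector(table_sizes_gb):
    """Suggest a connector family from table sizes given in gigabytes.

    Returns "standard" when every table is below the threshold, otherwise
    "large-database". Both labels are hypothetical names for this sketch.
    """
    if any(size >= SIZE_THRESHOLD_GB for size in table_sizes_gb):
        return "large-database"
    return "standard"


small_site = suggest_connector([0.5, 2.0, 7.5])
large_site = suggest_connector([3.0, 250.0])
```

Table size is only one input. The requirement for a CDC environment and a JMS service provider still applies whenever the Large Database connector is chosen.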

The advantage of using the Big Index API through the Large Database connector is that you have a preconfigured API ready to meet your big data needs.

The tradeoff is that the Big Index API requires a CDC replication solution. As stated earlier, it may not be suitable for all databases, and it requires significant knowledge of the nature of big data repositories.

For more information about the Big Index API available through the Watson Explorer Engine Large Database connector, see the following resources on the IBM Knowledge Center: