Client query optimization using global indexes

[Version 8.6 and later] The purpose of using a global index in a client query is to run the query only on the applicable partitions, which avoids unnecessary remote calls.

However, a global index does not guarantee a performance improvement. For example, if the partitions that the MapGlobalIndex.findPartitions() method returns exceed a certain percentage of all partitions, such as 90%, the resource consumption of using the global index might defeat its purpose.
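
The trade-off described above can be expressed as a simple check before the client decides how to run the query. The following is a minimal sketch only: the GlobalIndexHeuristic class, the worthTargeting method, and the 90% cutoff are hypothetical names and values chosen for illustration, not part of the product API. In a real client, the first argument would be the size of the collection that MapGlobalIndex.findPartitions() returns.

```java
// Hypothetical helper for illustration; not part of the ObjectGrid API.
final class GlobalIndexHeuristic {

    // Assumed cutoff. The text mentions 90% only as an example value.
    static final int MAX_USEFUL_PERCENT = 90;

    // Returns true when the matched partitions do not exceed the cutoff,
    // that is, when targeting only those partitions is likely to pay off.
    static boolean worthTargeting(int matchedPartitions, int totalPartitions) {
        if (totalPartitions <= 0 || matchedPartitions < 0) {
            throw new IllegalArgumentException("invalid partition counts");
        }
        // Integer arithmetic avoids floating-point rounding at the boundary.
        return (long) matchedPartitions * 100
                <= (long) totalPartitions * MAX_USEFUL_PERCENT;
    }

    public static void main(String[] args) {
        System.out.println(worthTargeting(3, 100));   // prints true
        System.out.println(worthTargeting(95, 100));  // prints false
    }
}
```

When the check returns false, the application might skip the targeted approach and query all partitions directly.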

When you run queries from a data grid client, you must specify the partitions to query if the participating maps are partitioned. In a large partitioned ObjectGrid environment, an application usually must run queries in parallel against all partitions to get complete query results. For example, if there are 100 partitions, the application must run the same query on all 100 partitions and merge the individual results to get the complete query result. This scenario usually uses large amounts of system resources.

If an equality predicate in the query has a corresponding HashIndex plug-in defined, the client query can take advantage of a global index by enabling global index on that HashIndex plug-in. The client can then use the MapGlobalIndex API to find the partitions that hold the attribute value that is used in the predicate.
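
Conceptually, a lookup of this kind maps an attribute value to the partitions that hold matching entries. The following toy model illustrates the idea only; it is not how the product implements a global index. The ToyGlobalIndex class and its record and partitionsFor methods are hypothetical, and returning the union of partitions for several attribute values is an assumption made for this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model for illustration; not the product's implementation.
final class ToyGlobalIndex {

    // attribute value -> IDs of the partitions that hold matching entries
    private final Map<Object, Set<Integer>> partitionsByValue = new HashMap<>();

    // Records that an entry with this attribute value lives in the partition.
    void record(Object attributeValue, int partitionId) {
        partitionsByValue
                .computeIfAbsent(attributeValue, k -> new TreeSet<>())
                .add(partitionId);
    }

    // Returns the union of partitions that hold any of the given values.
    Set<Integer> partitionsFor(Object... attributeValues) {
        Set<Integer> result = new TreeSet<>();
        for (Object value : attributeValues) {
            result.addAll(
                    partitionsByValue.getOrDefault(value, Collections.emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        ToyGlobalIndex index = new ToyGlobalIndex();
        index.record(1, 4);   // employeeCode 1 is found in partition 4
        index.record(1, 9);   // ... and in partition 9
        index.record(2, 4);   // employeeCode 2 is found in partition 4
        System.out.println(index.partitionsFor(1)); // prints [4, 9]
    }
}
```

In this model, a query with the predicate employeeCode = 1 needs to visit only partitions 4 and 9 rather than every partition.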

For a simple query that contains only one equality predicate, you can replace the query with a call to the MapGlobalIndex.findValues() method because the results are equivalent. The MapGlobalIndex.findValues() method is also more efficient.

For example, the following query returns all employees whose employeeCode equals 1 when the ?1 parameter is set to 1. The query uses the index that is defined over the employeeCode field.

SELECT e FROM EmpBean e where e.employeeCode = ?1

The following example is the HashIndex configuration that is used for the query:

<bean id="MapIndexPlugin"
  className="com.ibm.websphere.objectgrid.plugins.index.HashIndex">
  <property name="Name" type="java.lang.String" value="employeeCODE"
    description="index name" />
  <property name="AttributeName" type="java.lang.String" value="employeeCode"
    description="attribute name" />
  <property name="GlobalIndexEnabled" type="boolean" value="true"
    description="true for global index" />
</bean>

The indexed attribute is employeeCode, which is the attribute that is used in the predicate of the query. Global index is enabled on that index so that the MapGlobalIndex index proxy is available.

The previous query is a simple query with only one equality predicate, and the attribute in that predicate is indexed with global index enabled. The query result is therefore equivalent to the result of the MapGlobalIndex.findValues() method. In this case, it is more efficient to use the MapGlobalIndex.findValues() method than to run a query.

// In the client ObjectGrid process.
// m is assumed to be the ObjectMap for the map that holds the EmpBean entries.
MapGlobalIndex mapGlobalIndexCODE = (MapGlobalIndex) m.getIndex("employeeCODE", false);
// Integer.valueOf replaces the deprecated new Integer(int) constructor.
Object attribute1 = Integer.valueOf(1);
Object[] attributes = new Object[] {attribute1};
Set empBeanSet = mapGlobalIndexCODE.findValues(attributes);

// The returned empBeanSet is equivalent to the result of the following query:
// SELECT e FROM EmpBean e where e.employeeCode = ?1

For a more complex query, such as the following example, the application can first use the MapGlobalIndex.findPartitions() method to find the applicable partitions, and then run the query on those partitions only.

SELECT e FROM EmpBean e where e.employeeCode = ?1 and e.age > ?2

The following code demonstrates this approach.

// In the client ObjectGrid process.
// m is assumed to be the ObjectMap for the map that holds the EmpBean entries.
MapGlobalIndex mapGlobalIndexCODE = (MapGlobalIndex) m.getIndex("employeeCODE", false);
// Integer.valueOf replaces the deprecated new Integer(int) constructor.
Object attribute1 = Integer.valueOf(1);
Object[] attributes = new Object[] {attribute1};
Collection partitions = mapGlobalIndexCODE.findPartitions(attributes);
// The returned partitions are a subset of all partitions.
Iterator partitionsIter = partitions.iterator();
String query = "SELECT e FROM EmpBean e where e.employeeCode = ?1 and e.age > ?2";
ObjectQuery oQuery = session.createObjectQuery(query);
// Set the first query parameter to attribute1, the same value that is
// passed to mapGlobalIndexCODE.findPartitions.
oQuery.setParameter(1, attribute1);
// The second parameter is the age.
Integer age = Integer.valueOf(50);
oQuery.setParameter(2, age);

Set completeQueryResultSet = new HashSet();
// The following loop shows the serial query pattern: it runs the query on
// one partition at a time.
// Production code should use a parallel query pattern to run the query on
// all applicable partitions in parallel.
while (partitionsIter.hasNext()) {
    Integer pid = (Integer) partitionsIter.next();
    oQuery.setPartition(pid);
    Iterator queryResultIter = oQuery.getResultIterator();
    while (queryResultIter.hasNext()) {
        completeQueryResultSet.add(queryResultIter.next());
    }
}
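
The comments in the previous code recommend a parallel query pattern for production use. The following is a minimal sketch of one way to fan out over the applicable partitions and merge the results, using only standard Java concurrency classes. The ParallelPartitionQuery class, the queryAll method, and the queryOnePartition callback are hypothetical names introduced for illustration. The callback stands in for the work that is done for one partition, and this sketch assumes that each task would create its own session and ObjectQuery, set the parameters and the partition, and return the results, rather than sharing one query object across threads.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

// Hypothetical helper for illustration; not part of the ObjectGrid API.
final class ParallelPartitionQuery {

    // Runs queryOnePartition for every partition ID on a fixed-size thread
    // pool and merges the per-partition results into one set.
    static <T> Set<T> queryAll(Collection<Integer> partitionIds,
                               IntFunction<Collection<T>> queryOnePartition,
                               int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Collection<T>>> tasks = new ArrayList<>();
            for (Integer pid : partitionIds) {
                final int id = pid;
                tasks.add(() -> queryOnePartition.apply(id));
            }
            Set<T> merged = new HashSet<>();
            // invokeAll blocks until every task has completed.
            for (Future<Collection<T>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while querying", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("partition query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in callback: a real client would run the ObjectQuery here.
        Set<String> result = queryAll(List.of(2, 5), pid -> List.of("emp-" + pid), 2);
        System.out.println(result.size()); // prints 2
    }
}
```

The partition IDs that are passed to queryAll would be the values returned by MapGlobalIndex.findPartitions(), so the fan-out still touches only the applicable partitions.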