Client query optimization using global indexes
The purpose of using a global index in a client query is to run the query on applicable partitions only. By doing so, you can avoid unnecessary remote calls.
However, a global index does not guarantee a performance improvement. For example, the partitions that are returned from the MapGlobalIndex.findPartitions() method might exceed a certain percentage of all partitions, such as 90%. In this scenario, the resource consumption of using the global index might defeat its purpose.
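The trade-off described above can be checked before the query runs. The following is a minimal sketch, not part of the product API: the class name GlobalIndexHeuristic, the method worthRestricting, and the 0.9 cutoff are hypothetical, and the collection passed in is assumed to be the result of a MapGlobalIndex.findPartitions() call.

```java
import java.util.Collection;

// Hypothetical helper (not part of the ObjectGrid API): decides whether the
// partitions found through a global index are few enough to justify
// restricting the query to them.
final class GlobalIndexHeuristic {
    private GlobalIndexHeuristic() {}

    /**
     * @param candidatePartitions partition IDs, assumed to come from
     *                            MapGlobalIndex.findPartitions()
     * @param totalPartitions     total number of partitions of the map
     * @param maxFraction         assumed cutoff, for example 0.9 for 90%
     * @return true if the candidates do not exceed maxFraction of all partitions
     */
    static boolean worthRestricting(Collection<?> candidatePartitions,
                                    int totalPartitions,
                                    double maxFraction) {
        if (totalPartitions <= 0) {
            throw new IllegalArgumentException("totalPartitions must be positive");
        }
        double fraction = (double) candidatePartitions.size() / totalPartitions;
        return fraction <= maxFraction;
    }
}
```

When the method returns false, the application might skip the global index lookup for similar queries and run the query against all partitions instead. The right cutoff depends on the environment and should be measured rather than assumed.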
When you run queries from a data grid client, you must set the partition if the participating maps are partitioned. In a large partitioned ObjectGrid environment, an application usually must run the same query concurrently against all partitions to get complete query results. For example, if there are 100 partitions, the application must run the same query on all 100 partitions and merge the results to get the complete query result. This scenario usually uses large amounts of system resources.
If any equality predicate in the query has a corresponding HashIndex plug-in defined, then you can enable the global index on that HashIndex plug-in. The client can then use the MapGlobalIndex API to find partitions by the attribute value that is used in the predicate.
In a simple query that contains only one equality predicate, the query can be replaced by the MapGlobalIndex.findValues() method because the results are equivalent. The MapGlobalIndex.findValues() method is the more efficient of the two.
The following query selects the EmpBean objects whose employeeCode equals 1. The query uses the index that is defined over the employeeCode field.
SELECT e FROM EmpBean e where e.employeeCode = ?1
<bean id="MapIndexPlugin"
    className="com.ibm.websphere.objectgrid.plugins.index.HashIndex">
    <property name="Name" type="java.lang.String" value="employeeCODE"
        description="index name" />
    <property name="AttributeName" type="java.lang.String" value="employeeCode"
        description="attribute name" />
    <property name="GlobalIndexEnabled" type="boolean" value="true"
        description="true for global index" />
</bean>
The indexed attribute is employeeCode, which is used in the predicate of the query. The global index is enabled on that index so that the MapGlobalIndex index proxy is available.
// in client ObjectGrid process; m is assumed to be the ObjectMap for the EmpBean map
MapGlobalIndex mapGlobalIndexCODE = (MapGlobalIndex)m.getIndex("employeeCODE", false);
Object attribute1 = Integer.valueOf(1);
Object[] attributes = new Object[] {attribute1};
Set empBeanSet = mapGlobalIndexCODE.findValues(attributes);
// the returned empBeanSet is equivalent to query result from the following query:
// SELECT e FROM EmpBean e where e.employeeCode = ?1
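When the query contains additional predicates, as in the following query, the client can use the MapGlobalIndex.findPartitions() method to find the applicable partitions and run the query on those partitions only.
SELECT e FROM EmpBean e where e.employeeCode = ?1 and e.age > ?2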
// in client ObjectGrid process
MapGlobalIndex mapGlobalIndexCODE = (MapGlobalIndex)m.getIndex("employeeCODE", false);
Object attribute1 = Integer.valueOf(1);
Object[] attributes = new Object[] {attribute1};
Collection partitions = mapGlobalIndexCODE.findPartitions(attributes);
// the returned partitions are a subset of all partitions.
Iterator partitionsIter = partitions.iterator();
String query = "SELECT e FROM EmpBean e where e.employeeCode = ?1 and e.age > ?2";
ObjectQuery oQuery = session.createObjectQuery(query);
// set the first query parameter to attribute1, the same value that was
// passed to mapGlobalIndexCODE.findPartitions
oQuery.setParameter(1, attribute1);
// the 2nd parameter is age
Integer age = Integer.valueOf(50);
oQuery.setParameter(2, age);
Set completeQueryResultSet = new HashSet();
// The following code shows the serial query pattern: it runs the query on
// one partition at a time.
// Production code should use the parallel query pattern to run the query on
// all applicable partitions in parallel.
while (partitionsIter.hasNext()) {
    Integer pid = (Integer)partitionsIter.next();
    oQuery.setPartition(pid);
    Iterator queryResultIter = oQuery.getResultIterator();
    while (queryResultIter.hasNext()) {
        completeQueryResultSet.add(queryResultIter.next());
    }
}
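The parallel query pattern that the comments in the preceding code recommend might be sketched as follows. This is an illustration under stated assumptions, not product code: the class ParallelPartitionQuery and the method runOnPartitions are hypothetical, and the per-partition query is passed in as a function so that the sketch stays independent of the ObjectGrid classes. In a real client, that function is assumed to create its own session and ObjectQuery, call setPartition with the given partition ID, and collect the results, because sharing one query object across threads is not assumed to be safe.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntFunction;

// Hypothetical helper (not part of the ObjectGrid API): runs one query task
// per applicable partition in parallel and merges the results into one set.
final class ParallelPartitionQuery {
    private ParallelPartitionQuery() {}

    static <T> Set<T> runOnPartitions(Collection<Integer> partitions,
                                      IntFunction<Collection<T>> queryOnePartition,
                                      int threads) {
        Set<T> merged = new HashSet<>();
        if (partitions.isEmpty()) {
            return merged;
        }
        int poolSize = Math.max(1, Math.min(threads, partitions.size()));
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // One task per partition ID, for example the IDs that
            // MapGlobalIndex.findPartitions() returned.
            List<Callable<Collection<T>>> tasks = new ArrayList<>();
            for (Integer pid : partitions) {
                final int partitionId = pid;
                tasks.add(() -> queryOnePartition.apply(partitionId));
            }
            // invokeAll blocks until every task has completed.
            for (Future<Collection<T>> result : pool.invokeAll(tasks)) {
                merged.addAll(result.get());
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while querying partitions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A partition query failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would pass the partition IDs from findPartitions together with a function that runs the query on a single partition. The merged set then plays the role of completeQueryResultSet in the serial example.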