Tuning serialization performance

WebSphere® eXtreme Scale uses multiple Java™ processes to hold data. These processes serialize the data; that is, they convert the data (Java object instances) to bytes, and back to objects again, as needed to move the data between client and server processes. Marshalling the data is the most expensive operation and must be addressed by the application developer when designing the schema, configuring the data grid, and interacting with the data-access APIs.

The default Java serialization and copy routines are relatively slow and can consume 60 to 70 percent of the processor in a typical setup. The following sections describe options for improving serialization performance.

Deprecated feature: The ObjectTransformer interface has been replaced by the DataSerializer plug-ins, which you can use to efficiently store arbitrary data in WebSphere eXtreme Scale so that existing product APIs can efficiently interact with your data.

Write an ObjectTransformer for each BackingMap

An ObjectTransformer can be associated with a BackingMap. Your application can have a class that implements the ObjectTransformer interface and provides implementations for the following operations:

Copying values
Serializing and inflating keys to and from streams
Serializing and inflating values to and from streams

The application does not need to copy keys because keys are considered immutable.

Note: The ObjectTransformer is only invoked when the ObjectGrid knows about the data that is being transformed. For example, when DataGrid API agents are used, the agents themselves as well as the agent instance data or data returned from the agent must be optimized using custom serialization techniques. The ObjectTransformer is not invoked for DataGrid API agents.
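
The three responsibilities listed above can be sketched in plain Java. This is a minimal illustration, not the product API: the Account type, the AccountTransformerSketch class, and the TransformerDemo helper are hypothetical, and the method names and signatures are assumptions modeled on the operations described in this section. A real implementation would implement the product's ObjectTransformer interface, so check the exact signatures against the API documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical value type used only for this illustration.
final class Account {
    final String owner;
    final long balanceCents;

    Account(String owner, long balanceCents) {
        this.owner = owner;
        this.balanceCents = balanceCents;
    }
}

// Sketch of the copy, serialize, and inflate operations. The method names and
// signatures are assumptions; this class does not implement the product interface.
final class AccountTransformerSketch {

    // Copy a value so that callers cannot mutate the stored instance.
    Object copyValue(Object value) {
        Account a = (Account) value;
        return new Account(a.owner, a.balanceCents);
    }

    // Keys are treated as immutable, so they are serialized but never copied.
    void serializeKey(Object key, ObjectOutputStream out) throws IOException {
        out.writeUTF((String) key);
    }

    Object inflateKey(ObjectInputStream in) throws IOException {
        return in.readUTF();
    }

    // Write only the fields, avoiding default serialization of the value object.
    void serializeValue(Object value, ObjectOutputStream out) throws IOException {
        Account a = (Account) value;
        out.writeUTF(a.owner);
        out.writeLong(a.balanceCents);
    }

    // Read the fields back in the same order in which they were written.
    Object inflateValue(ObjectInputStream in) throws IOException {
        String owner = in.readUTF();
        long balanceCents = in.readLong();
        return new Account(owner, balanceCents);
    }
}

// Small driver that serializes a value and inflates it again.
final class TransformerDemo {
    static Account roundTrip(Account original) {
        AccountTransformerSketch transformer = new AccountTransformerSketch();
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Closing the stream flushes buffered data into the byte array.
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                transformer.serializeValue(original, out);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) transformer.inflateValue(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing fields explicitly in a fixed order keeps class descriptors and reflective field discovery out of the per-entry path, which is where the default routines tend to spend their time.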

Using entities

When using the EntityManager API with entities, the ObjectGrid does not store the entity objects directly into the BackingMaps. The EntityManager API converts the entity object to Tuple objects. Entity maps are automatically associated with a highly optimized ObjectTransformer. Whenever the ObjectMap API or EntityManager API is used to interact with entity maps, the entity ObjectTransformer is invoked.

Custom serialization

In some cases, objects must be modified to use custom serialization, for example by implementing the java.io.Externalizable interface, or by implementing the writeObject and readObject methods in classes that implement the java.io.Serializable interface. Use custom serialization techniques when the objects are serialized using mechanisms other than the ObjectGrid API or EntityManager API methods.

For example, when objects or entities are stored as instance data in a DataGrid API agent, or the agent returns objects or entities, those objects are not transformed using an ObjectTransformer. The agent will, however, automatically use the ObjectTransformer when using the EntityMixin interface. See DataGrid agents and entity based Maps for further details.
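
As an illustration of custom serialization with java.io.Externalizable, the following sketch shows a result object of the kind an agent might return. The AgentResult class and the ExternalizableDemo helper are hypothetical names introduced for this example; only the java.io types are real APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical result type that controls its own wire format.
final class AgentResult implements Externalizable {
    private String regionName;
    private int matchCount;

    // Externalizable requires a public no-argument constructor for inflation.
    public AgentResult() {
    }

    AgentResult(String regionName, int matchCount) {
        this.regionName = regionName;
        this.matchCount = matchCount;
    }

    String getRegionName() {
        return regionName;
    }

    int getMatchCount() {
        return matchCount;
    }

    // Write only the fields that are needed, in a fixed order.
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(regionName);
        out.writeInt(matchCount);
    }

    // Read the fields back in exactly the order in which they were written.
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        regionName = in.readUTF();
        matchCount = in.readInt();
    }
}

// Helper that serializes and inflates an object through standard Java streams.
final class ExternalizableDemo {
    static byte[] toBytes(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the class writes its own fields, the stream carries the class name and the field data but not the per-field metadata that default serialization records.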

Byte arrays

When using the ObjectMap or DataGrid APIs, the key and value objects are serialized whenever the client interacts with the data grid and when the objects are replicated. To avoid the overhead of serialization, use byte arrays instead of Java objects. Byte arrays are much cheaper to store in memory because the JVM has fewer objects to search during garbage collection, and they can be inflated only when needed. Previously, byte arrays were suitable only if you did not need to access the objects using queries or indexes, because the data was stored as bytes and could be accessed only through its key. However, beginning in V7.1.1, indexes and queries are more practical because serializer plug-ins lift the copy-to-bytes restrictions that previously existed.

[Version 8.6 and later] In V8.6, when you use eXtreme IO (XIO), you have access to eXtreme data format (XDF), which is a default, built-in serializer. Use XDF to serialize and store keys and values in the data grid in a language-independent format. For more information about XDF, see Configuring data grids to use eXtreme data format (XDF).

WebSphere eXtreme Scale can automatically store data as byte arrays using the CopyMode.COPY_TO_BYTES map configuration option, or it can be handled manually by the client. This option will store the data efficiently in memory and can also automatically inflate the objects within the byte array for use by queries and indexes on demand.
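
When the client handles byte arrays manually, the pattern looks roughly like the following sketch. The Customer type and CustomerCodec class are hypothetical, and a java.util.HashMap stands in for the grid map so that the example is self-contained; no product API is used here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical value type used only for this illustration.
final class Customer {
    final String name;
    final int age;

    Customer(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Converts Customer objects to and from a compact byte[] form.
final class CustomerCodec {
    static byte[] toBytes(Customer customer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeUTF(customer.name);
                out.writeInt(customer.age);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Inflate the object only when the application actually needs it.
    static Customer fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();
            int age = in.readInt();
            return new Customer(name, age);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class ByteArrayStoreDemo {
    public static void main(String[] args) {
        // Assumption: a plain HashMap stands in for the data grid map.
        Map<String, byte[]> store = new HashMap<>();
        store.put("c-1", CustomerCodec.toBytes(new Customer("Ada", 36)));

        // The stored value stays as bytes until it is read and inflated.
        Customer customer = CustomerCodec.fromBytes(store.get("c-1"));
        System.out.println(customer.name + " " + customer.age);
    }
}
```

The trade-off of the manual approach is that the stored values are opaque to the grid, which is why the automatic copy mode and serializer plug-ins are needed when queries or indexes must see inside the data.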

A MapSerializerPlugin plug-in can be associated with a BackingMap plug-in when you use the COPY_TO_BYTES or COPY_TO_BYTES_RAW copy modes. This association allows data to be stored in serialized form in memory, rather than the native Java object form. Storing serialized data conserves memory and improves replication and performance on the client and server. You can use a DataSerializer plug-in to develop high-performance serialization streams that can be compressed, encrypted, evolved, and queried.