IBM InfoSphere Streams Version 4.1.0

Writing operators that support consistent regions

When you develop operators that are included in consistent regions, you must follow programming guidelines and use certain interfaces provided by InfoSphere® Streams.

The logical division of the output streams of operators is established after the drain() callback in the SPL StateHandler interface returns. Any tuple or punctuation that is submitted to output ports before or during the drain method call is processed by the consistent region before the successful establishment of the consistent state. If the drain callback is not implemented by an operator, the logical division is established automatically by the InfoSphere Streams instance.

If an application requires that the operators in a consistent region cannot lose their internal states, the operators must persist their internal states after a drain, and restore their internal states during a reset. To persist primitive operator states, you can use the InfoSphere Streams checkpoint data store and the SPL StateHandler interface. This interface provides the checkpoint(Checkpoint & ckpt) and the reset(Checkpoint & ckpt) callbacks. By serializing and deserializing the state to and from the Checkpoint class, the operator state gets automatically persisted and restored to and from the InfoSphere Streams configurable checkpoint backend.

If any operator in a consistent region fails at run time, the controller of a consistent region detects the failure and triggers the reset of the region. Primitive operators that implement the reset() callback from the StateHandler can then restore their state to a state previously persisted during a checkpoint() callback. SPL logic state variables are automatically serialized on checkpoint and deserialized on reset in primitive and Custom operators.

A reset forces the processing of all tuples and punctuations that are submitted to the output ports of an operator before the invocation of the reset() callback. The reset starts after all the PEs that participate in a consistent region are healthy. Tuple replay starts after the successful reset of all operators in the region.