Optimizing message flow throughput

Each message flow that you design must provide a complete set of processing for messages received from a particular source. This design can result in very complex message flows with large numbers of nodes, which add performance overhead and can create bottlenecks. You can increase the number of message flows that process your messages to enable parallel processing and therefore improve throughput.

About this task

The operation mode of your integration nodes can affect the number of message flows that you can use; see IBM Integration features and Restrictions that apply in each operation mode for more information.

You can also consider the way in which the actions taken by the message flow are committed, and the order in which messages are processed.

The web user interface enables you to view message flow statistics and accounting data, including the message flow throughput and the CPU and elapsed times for transactions in the flow. See Viewing message flow statistics and accounting data for more information about how you can access the statistical data.

Consider the following options for optimizing message flow throughput:

Multiple threads processing messages in a single message flow

When you deploy a message flow, the integration node automatically starts an instance of the message flow for each input node that it contains. This behavior is the default. If you have a message flow that handles a very large number of messages, you can enable more messages to be processed by allocating more instances of the flow.

You can update the Additional Instances property of the deployed message flow in the BAR file; the integration node starts additional copies of the message flow on separate threads, providing parallel processing. This option is the most efficient way of handling this situation if you are not concerned about the order in which messages are processed.
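The effect of additional instances can be illustrated with a minimal Python sketch. It is not integration node code: the function name `process_messages` and the `additional_instances` parameter are hypothetical, and the sketch only assumes that one thread exists by default and that each additional instance adds one more worker thread.

```python
from concurrent.futures import ThreadPoolExecutor


def process_messages(messages, handler, additional_instances=0):
    """Run handler over messages on 1 + additional_instances threads.

    Results are returned in input order, but the handler calls can run
    (and finish) in any order when more than one thread is available.
    """
    if additional_instances < 0:
        raise ValueError("additional_instances must be zero or greater")
    # Assumption for this sketch: one default thread plus the extras.
    thread_count = 1 + additional_instances
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(handler, messages))


if __name__ == "__main__":
    print(process_messages(["a", "b", "c"], str.upper, additional_instances=2))
```

With `additional_instances=0` the sketch processes one message at a time; larger values let several messages be in flight at once, which is why processing order is no longer guaranteed.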

If the message flow receives messages from a WebSphere MQ queue, you can influence the order in which messages are processed by setting the Order Mode property of the MQInput node:

If you set Order Mode to By User ID, the node ensures that messages from a specific user (identified by the UserIdentifier field in the MQMD) are processed in guaranteed order. A second message from one user is not processed by an instance of the message flow if a previous message from this user is currently being processed by another instance of the message flow.
If you set Order Mode to By Queue Order, the node processes one message at a time to preserve the order in which the messages are read from the queue. Therefore, this node behaves as though you have set the Additional Instances property of the message flow to zero.
If you set Order Mode to User Defined, you can order messages by any message element, by setting an XPath or ESQL expression in the Order field location property. The node ensures that messages with the same value in the order field message element are processed in guaranteed order. A second message with the same value in the order field message element is not processed by an instance of the message flow if a previous message with the same value is currently being processed by another instance of the message flow.
If the order field is missing from a message, an exception is raised, and the message is rolled back. NULL and empty values are processed separately, in parallel.

If you set Order Mode to By User ID or User Defined, and the message flow uses transformation nodes, it is advisable to set the Parse Timing to Immediate.
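The per-key ordering guarantee of the By User ID and User Defined modes can be approximated with the following Python sketch. It is an illustration only and does not describe how the MQInput node is implemented: it assigns each order-field value to a fixed worker "lane", which is stricter than the node's behavior but gives the same guarantee that messages with the same value are processed in order. The function name, the dictionary message format, and the `instances` parameter are assumptions made for the example, and the special handling of NULL and empty values is not modeled.

```python
import queue
import threading
import zlib


def process_in_key_order(messages, order_field, handler, instances=3):
    """Process dict messages in parallel while keeping per-key order.

    Messages with the same order_field value always go to the same
    worker lane, so they are handled one after another in arrival order.
    """
    lanes = [queue.Queue() for _ in range(instances)]
    results = []
    results_lock = threading.Lock()

    def worker(lane):
        while True:
            message = lane.get()
            if message is None:  # sentinel: no more work for this lane
                return
            outcome = handler(message)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker, args=(lane,)) for lane in lanes]
    for thread in threads:
        thread.start()
    try:
        for message in messages:
            if order_field not in message:
                # Mirrors the documented behavior: a missing field is an error.
                raise KeyError(f"order field {order_field!r} is missing")
            key = str(message[order_field])
            lane_index = zlib.crc32(key.encode("utf-8")) % instances
            lanes[lane_index].put(message)
    finally:
        for lane in lanes:
            lane.put(None)
        for thread in threads:
            thread.join()
    return results
```

Messages for different users can still overtake one another in `results`; only the relative order of messages that share an order-field value is preserved.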

Multiple copies of the message flow in an integration node

You can also deploy several copies of the same message flow to different integration servers in the same integration node. This option has similar effects to increasing the number of processing threads in a single message flow.

This option also removes the ability to determine the order in which the messages are processed, because, if there is more than one copy of the message flow active in the integration node, each copy can be processing a message at the same time, from the same queue. The time taken to process a message might vary, and multiple message flows accessing the same queue might therefore read messages from the input source in a random order. The order of messages produced by the message flows might not correspond to the order of the original messages.

Ensure that the applications that receive messages from these message flows can tolerate out-of-order messages. Additionally, ensure that the input nodes in these message flows are suitable for deployment to different processes.
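The loss of ordering can be shown with a small deterministic simulation in Python. This is a sketch under simplifying assumptions: each copy of the flow takes the next message from the queue as soon as it is free, processing times are known in advance, and the function name `completion_order` is hypothetical.

```python
import heapq


def completion_order(durations, copies):
    """Return message indexes in the order their processing finishes.

    durations[i] is the processing time of message i; messages are read
    from the queue in index order by whichever copy becomes free first.
    """
    free_at = [(0.0, copy_id) for copy_id in range(copies)]
    heapq.heapify(free_at)
    finished = []
    for index, duration in enumerate(durations):
        start, copy_id = heapq.heappop(free_at)  # earliest available copy
        end = start + duration
        finished.append((end, index))
        heapq.heappush(free_at, (end, copy_id))
    return [index for _, index in sorted(finished)]


if __name__ == "__main__":
    print(completion_order([5, 1, 1], copies=1))  # [0, 1, 2]
    print(completion_order([5, 1, 1], copies=2))  # [1, 2, 0]
```

With one copy, output order matches input order. With two copies, the slow first message finishes last, so a receiving application sees the messages out of order.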

Copies of the message flow in multiple integration nodes

You can deploy several copies of the same message flow to different integration nodes. This option requires changes to your configuration, because you must ensure that applications that supply messages to the message flow can put their messages to the right input queue or port. You can often make these changes when you deploy the message flow by setting the message flow's configurable properties.

The scope of the message flow

You might find that, in some circumstances, you can split a single message flow into several different flows to reduce the scope of work that each message flow performs. If you do split your message flow, be aware that the separate message flows cannot run in the same unit of work. If your message flow has transactional aspects (for example, it updates multiple databases), this option is not a suitable solution.

The following two examples show when it might be beneficial to split a message flow:

In a message flow that uses a RouteToLabel node, the input queue might grow, indicating that work is arriving faster than it is being processed and that more processing capacity might be needed. You can use another copy of the message flow in a second integration server, but this option is not appropriate if you want all of the messages to be handled in the order in which they appear on the queue. You can consider splitting out each branch of the message flow that starts with a Label node by providing an input queue and input node for each branch. This option might be appropriate because, after the RouteToLabel node routes a message to the relevant Label node, that message has some level of independence from all other messages.
You might also need to provide another input queue and input node to complete any common processing that the Label node branches connect to when unique processing has been done.
If you have a message flow that processes very large messages that take a considerable time to process, you might be able to:
1. Create other copies of the message flow that use a different input queue (you can set this option up in the message flow itself, or you can update this property when you deploy the message flow).
2. Set up WebSphere MQ queue aliases to redirect messages from some applications to the alternative queue and message flow.
You can also modify the original message flow to check the size of each input message and immediately pass large messages on to a new message flow. The new message flow replicates the function of the original message flow, but processes only large messages.
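The size check that redirects large messages can be sketched in Python as follows. The function name, the queue names `FLOW.IN` and `FLOW.LARGE.IN`, and the threshold value are all hypothetical; in a real flow, the check and the redirection would be done by nodes in the original message flow.

```python
def choose_queue(message_body: bytes,
                 threshold_bytes: int,
                 standard_queue: str = "FLOW.IN",
                 large_queue: str = "FLOW.LARGE.IN") -> str:
    """Return the name of the queue that should receive this message.

    Messages larger than threshold_bytes go to the queue served by the
    flow dedicated to large messages; all others stay on the standard
    queue. Both queue names are placeholders for this sketch.
    """
    if threshold_bytes < 0:
        raise ValueError("threshold_bytes must be zero or greater")
    if len(message_body) > threshold_bytes:
        return large_queue
    return standard_queue


if __name__ == "__main__":
    one_mib = 1024 * 1024
    print(choose_queue(b"small payload", one_mib))
    print(choose_queue(b"x" * (one_mib + 1), one_mib))
```

Keeping the decision in one small function makes the threshold easy to tune without touching the processing logic of either flow.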

The frequency of commits

If a message flow receives input messages on a WebSphere MQ queue, you can improve its throughput for some message flow scenarios by modifying its default properties after you have added it to a BAR file. (These options are not available if the input messages are received by other input nodes; commits in those message flows are performed for each message.)

The following properties control the frequency with which the message flow commits transactions:

commitCount. This property represents the number of messages processed from the input queue per message flow thread, before an MQCMIT is issued.
commitInterval. This property represents the time interval that elapses before an MQCMIT is started.
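How a count threshold and a time threshold can interact is sketched below in Python. This is a simplified model, not the integration node's implementation: it assumes that a commit becomes due when either the number of uncommitted messages reaches the count or the interval has elapsed since the first uncommitted message, and the class name `CommitTracker` and its methods are hypothetical.

```python
import time


class CommitTracker:
    """Decide when a batch of processed messages should be committed."""

    def __init__(self, commit_count, commit_interval, clock=time.monotonic):
        self.commit_count = commit_count        # messages per commit
        self.commit_interval = commit_interval  # seconds (assumed unit)
        self._clock = clock
        self._pending = 0
        self._batch_started = None
        self.commits = []  # size of each committed batch, for inspection

    def message_processed(self):
        """Record one processed message; return True if a commit happened."""
        if self._pending == 0:
            self._batch_started = self._clock()
        self._pending += 1
        return self.commit_if_due()

    def commit_if_due(self):
        """Commit the pending batch if either threshold has been reached."""
        if self._pending == 0:
            return False
        elapsed = self._clock() - self._batch_started
        if self._pending >= self.commit_count or elapsed >= self.commit_interval:
            self.commits.append(self._pending)  # stands in for the real commit
            self._pending = 0
            return True
        return False
```

A larger count reduces the number of commits per message, while the interval bounds how long a partly filled batch can stay uncommitted when few messages arrive.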