Group nodes and aggregation nodes

Group nodes and aggregation nodes provide alternative ways to collate related requests and responses. Use these nodes to generate several requests from one input message, to coordinate the responses to those requests, and to combine the information from the responses so that processing can continue.

The group nodes let you create static and dynamic fan-out/fan-in aggregation scenarios that are stateless and that do not require an IBM® MQ queue manager to be available to the integration server. The GroupScatter node creates a new group and enables downstream nodes to send requests that are marked as part of that group. The downstream nodes that are currently supported for a group are the MQOutput node (which must have the “Request” attribute set), the HTTPAsyncRequest node, the CallableFlowAsyncInvoke node, and the RESTAsyncRequest node.
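The fan-out/fan-in pattern described above can be sketched conceptually as follows. This is a minimal illustration in plain Python, not the product's API: the names `call_backend` and `scatter_gather` are hypothetical, and the group state is held only in memory to mirror the stateless behavior of the group nodes.

```python
from concurrent.futures import ThreadPoolExecutor
import uuid


def call_backend(name, payload):
    # Hypothetical stand-in for an asynchronous request to a backend service.
    return f"{name} handled {payload}"


def scatter_gather(payload, backends):
    # Fan-out: create a new group and mark every request as part of it.
    group_id = str(uuid.uuid4())
    replies = {}  # Held in memory only; lost if the process stops.
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        futures = {
            name: pool.submit(call_backend, name, payload)
            for name in backends
        }
        # Fan-in: wait for every reply that belongs to the group.
        for name, future in futures.items():
            replies[name] = future.result()
    # Combine the replies so that processing can continue.
    return {"group_id": group_id, "replies": replies}


result = scatter_gather("order-42", ["stock", "pricing"])
```

In this sketch, one input produces two requests, and the combined result is available only after both replies arrive. Nothing is written to disk, so an interrupted run loses the partial group.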

The aggregation nodes provide a similar capability, but they use storage queues that are controlled by IBM MQ. To use the aggregation nodes, you must therefore install IBM MQ on the same computer as your integration server. For more information about the queues that are required by aggregation nodes, see Configuring the storage of events for aggregation nodes.

Note: The group nodes are not a drop-in replacement for the aggregation nodes. If you want to migrate from the aggregation nodes to the group nodes, you must redesign your flows.

There is no group node that is equivalent to the AggregateRequest node. Instead, the supported output nodes specify directly, through their node properties, whether they are participating in a group. Custom transports can still be enabled by using Environment overrides.

Testing has shown that the group nodes provide better overall performance than the aggregation nodes, with reduced message processing times and CPU costs.

Use the group nodes if you have the following requirements:

You want a stateless integration server.
You want high-performance aggregations that scale to large numbers of branches.
You do not have access to, or do not want to use, IBM MQ.
You will deploy all flows that orchestrate the aggregations to a single integration server (this requirement does not apply to the backend services).

Use the aggregation nodes if you have the following requirements:

You require that all parts of the aggregation, including control data, received replies, and sent requests, are recoverable if the integration server shuts down.
You want to deploy the flows that orchestrate the aggregations across multiple integration servers within the same integration node.
Your backend replies are large, which might cause excessive memory consumption if they are not stored on disk while waiting for the aggregation to complete.
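The two lists of requirements above can be read as a simple decision rule. The following sketch encodes that rule in plain Python as an illustration only. The `Requirements` class, its field names, and `suggest_node_family` are hypothetical and are not part of the product.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    # Hypothetical flags that mirror the requirement lists above.
    needs_recovery_after_shutdown: bool = False
    spans_multiple_integration_servers: bool = False
    large_backend_replies: bool = False
    mq_available: bool = True


def suggest_node_family(req):
    # Any of these requirements points toward the aggregation nodes.
    needs_aggregation = (
        req.needs_recovery_after_shutdown
        or req.spans_multiple_integration_servers
        or req.large_backend_replies
    )
    if needs_aggregation:
        # The aggregation nodes depend on IBM MQ storage queues.
        return "aggregation" if req.mq_available else "conflict"
    # Otherwise the stateless group nodes fit.
    return "group"


choice = suggest_node_family(Requirements(large_backend_replies=True))
```

A result of "conflict" in this sketch means that a requirement calls for the aggregation nodes while IBM MQ is unavailable, so the requirements need to be revisited.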