IBM Integration Bus, Version 9.0.0.5 Operating Systems: AIX, HP-Itanium, Linux, Solaris, Windows, z/OS

The Splitter pattern

A common large-message scenario that results in large memory requirements is to take an input message that contains a batch of records, and then create an individual output message for each record. The processing loops over the repeating records and, for each record, builds a small output tree that is propagated to the rest of the message flow.

This type of scenario can also be referred to as the read large -> propagate many scenario. To support this type of processing, the ESQL language has the PROPAGATE statement. This statement allows a Compute node to propagate the current set of Output message trees to the named terminal (or named label). The following example message flow demonstrates this behavior:

Figure 1. Diagram of a message flow using the Propagate statement.
When the downstream nodes have been called, processing returns to the Compute node, and the next line of ESQL after the PROPAGATE statement is run.
DECLARE msgCount INT 0;
WHILE msgCount < 5 DO
  SET OutputRoot.Properties = InputRoot.Properties;
  SET OutputRoot.MQMD = InputRoot.MQMD;
  SET OutputRoot.XMLNSC.TestCase.MsgNumber = msgCount;
  SET msgCount = msgCount + 1;
  PROPAGATE;
END WHILE;
RETURN FALSE;
This ESQL propagates five output messages, where each output message contains the message number, and the message headers are populated each time around the loop, as shown in the following example:
Message
----- Properties
    *****
----- MQMD
    *****
----- XMLNSC
        TestCase
            MsgNumber

This propagation works because the PROPAGATE statement clears all the Output message trees (OutputRoot, OutputLocalEnvironment, and OutputExceptionList) when the propagation is complete. When these message trees are cleared, any parsers that were created within them are reset and freed for reuse. Therefore, no matter how many times the PROPAGATE statement is called, the same memory is reused for each output message. If all the output messages are approximately the same size, then such a propagation scenario does not cause any memory growth, irrespective of the number of propagations. This process results in the path shown in the following example:

Figure 2. Diagram of propagation to an MQOutputNode
Tip: The Environment tree is not cleared by the PROPAGATE call because it is not considered an Output message tree.
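
The reuse described in this section can be pictured with a small Java model. This is a sketch under stated assumptions, not broker code: the Parser and ParserPool classes and the split() method are hypothetical names invented for illustration and are not part of the IBM Integration Bus API. The model assumes only what the text states, namely that clearing the Output trees after each propagation frees their parsers for reuse.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only: none of these types belong to the IBM Integration Bus API. */
class ParserReuseSketch {

    /** Hypothetical stand-in for a parser that owns one output message tree. */
    static final class Parser {
        String payload;               // stands in for the parsed tree
    }

    /** Hypothetical pool: hands out a free parser, or creates one when none is free. */
    static final class ParserPool {
        private final Deque<Parser> free = new ArrayDeque<>();
        private int created = 0;

        Parser acquire() {
            Parser p = free.poll();   // null when no parser is free
            if (p == null) {
                p = new Parser();
                created++;
            }
            return p;
        }

        void release(Parser p) {
            p.payload = null;         // reset the parser so that it can be reused
            free.push(p);
        }

        int created() {
            return created;
        }
    }

    /** Builds one output per record and returns the number of parsers that were created. */
    static int split(int records, boolean clearAfterPropagate) {
        ParserPool pool = new ParserPool();
        for (int i = 0; i < records; i++) {
            Parser out = pool.acquire();
            out.payload = "MsgNumber=" + i;   // build the small output tree
            // ... the downstream nodes would run here ...
            if (clearAfterPropagate) {
                pool.release(out);            // models clearing the Output trees
            }
        }
        return pool.created();
    }

    public static void main(String[] args) {
        System.out.println("cleared after each propagation: " + split(1000, true));
        System.out.println("never cleared: " + split(1000, false));
    }
}
```

In this model, clearing after each propagation needs one parser for all 1000 records, whereas never clearing needs 1000. The real broker manages parsers internally; the pool here is only a way to visualize why the number of propagations does not affect memory use when the trees are cleared.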

What changes were made to the Java™ API?

The Java API has some similar functionality to ESQL:
  1. The clearMessage() method takes a boolean parameter to indicate if the root element is to be deleted as well.
    • If the old clearMessage() method is called, then the default is NOT to delete the root element.
    • By calling clearMessage(true), the API code is giving the broker permission to delete the root element and reset any parsers that are associated with the message.
    • If, after the root element is deleted, elements are still in scope (because they were attached elsewhere) then the owning parser is NOT reset.
    • Any MbElement references to the message trees in the message should not be used after calling clearMessage(true).
  2. The MbOutputTerminal.propagate() method now has a boolean parameter that indicates whether the message objects in the assembly are to be cleared.
    • When true is specified, the clearMessage(true) is called on the three message objects in the MbMessageAssembly that was propagated.
    • False is the default, and gives the same behavior as before, where the message objects in the MbMessageAssembly are not cleared.
    • If any of these MbMessage objects are read-only, then clearMessage() is not called on them.
    • When MbOutputTerminal.propagate(MbMessageAssembly, true) is called, there is no need to call clearMessage() on the MbMessage objects that were propagated.

Although it might appear that the functionality of #2 supersedes that of #1, it is possible that some implementations build a temporary MbMessage object that is not propagated. By providing a clearMessage(boolean) method, these MbMessage objects can also be cleared completely.

MbOutputTerminal.propagate(MbMessageAssembly, true) has the same behavior as ESQL PROPAGATE, so any number of iterations should use the same amount of parser resources throughout. When you use Java, there might be some JVM heap growth because Java objects are not freed until a garbage collection cycle runs.

.NET and the C API do not yet have this support available, which means that a large message splitter scenario in the .NETCompute node or C Plugin node leads to large memory growth. Both the .NETCompute and C Plugin interfaces support deleting elements by using MbElement.delete() and cniDelete(). Therefore, the root element in any message can be accessed, and a delete can be issued on it.

Because deleting fields can free parser resources for reuse, deletion can be used as a manual approach that stops memory growth after each propagation. That is, after propagation, the root element of each message can be accessed and deleted before cnoClearMessage() is called or the NbMessage is deleted.

The .NET and C language APIs support multiple propagations, meaning that the .NETCompute and C Plugin nodes can attempt to split a large message. In these languages, a new message object can be created on each iteration of the loop, and then the message (NbMessage or CciMessage) is propagated. This process uses the NbOutputTerminal.propagate() or cniPropagate() methods. These propagate methods do not have the same behavior as ESQL PROPAGATE: the message trees and parsers are NOT reset for reuse. The difference is due to the scope and ownership of the objects that are involved. In a Compute node, the Output trees are owned by the broker code and all references to these trees can be managed, and so the trees are cleared. However, for the .NETCompute and plugin nodes, the message flow developer's code owns the message objects. Therefore, the propagate methods cannot clear them, because the caller might still be using them or referencing them.

Each API offers a method of clearing the resources that are used by a message object:
  • In Java, MbMessage has a clearMessage() method. MbMessage needs to have a clearMessage() call on it so that the associated C++ objects and message bitstream are deleted. For example, if your Java node (plugin/JCN) is creating a new output message, then you would have lines like:
    MbMessage newMsg = null;
    try
    {
      newMsg = createMessage(inputFileBytes);
      MbMessageAssembly outputAssembly = new MbMessageAssembly(assembly, newMsg);
      MbOutputTerminal outTerm = getOutputTerminal("out");
      if (outTerm != null)
      {
         outTerm.propagate(outputAssembly);
      }
    }
    In this case, if the createMessage() method is called, a buffer is allocated ready to serialize the message tree. As indicated previously, for every createMessage() issued, an associated clearMessage() needs to be called. This needs to be done after the "outTerm.propagate" and regardless of whether an exception is thrown or not. For example:
    MbMessage newMsg = null;
    try
    {
      newMsg = createMessage(inputFileBytes);
      MbMessageAssembly outputAssembly = new MbMessageAssembly(assembly, newMsg);
      MbOutputTerminal outTerm = getOutputTerminal("out");
      if (outTerm != null)
      {
         outTerm.propagate(outputAssembly);
      }
    }
    finally
    {
       if (newMsg != null)
       {
         newMsg.clearMessage();
       }
    }
  • In .NET, the NbMessage can be deleted.
  • In C, the cniDeleteMessage() method can be called.

Although these methods erase the message object and its associated buffers, they do not erase the parse trees or reset the parsers for reuse. A parser and its parse tree can have wider scope than a single message object because message tree fields could be detached and attached between them.

Therefore, by default a large message splitter pattern in either of the .NETCompute or plugin nodes causes a large amount of memory to be used if multiple propagations take place within a loop. This behavior is due to new parsers that are created for each new message object that is constructed within the loop.
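
The create-and-clear pairing that is described for Java earlier in this section can be tried out with a self-contained sketch. The Message class, the propagate() method, and the splitAll() method are hypothetical stand-ins invented for illustration; they are not the MbMessage or MbOutputTerminal classes, and the counter only models the buffers that the text says are released by a clear call. The point of the sketch is the shape of the loop: the clear call sits in a finally block, so it runs whether or not the propagation throws an exception.

```java
/** Toy model only: none of these types belong to the IBM Integration Bus API. */
class ClearInFinallySketch {

    /** Hypothetical stand-in for a message object that owns a buffer. */
    static final class Message {
        static int liveBuffers = 0;   // buffers that are allocated and not yet cleared
        private boolean cleared = false;

        Message() {
            liveBuffers++;
        }

        void clear() {
            if (!cleared) {
                cleared = true;
                liveBuffers--;
            }
        }
    }

    /** Hypothetical downstream call that fails for every third record. */
    static void propagate(int record) {
        if (record % 3 == 0) {
            throw new IllegalStateException("downstream failure on record " + record);
        }
    }

    /** Splits the records one at a time and returns how many propagations failed. */
    static int splitAll(int records) {
        int failures = 0;
        for (int i = 1; i <= records; i++) {
            Message msg = null;
            try {
                msg = new Message();
                propagate(i);
            } catch (IllegalStateException e) {
                failures++;           // the sketch carries on; a real node might rethrow
            } finally {
                if (msg != null) {
                    msg.clear();      // runs whether or not propagate() threw
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = splitAll(9);
        System.out.println("failures=" + failures + ", liveBuffers=" + Message.liveBuffers);
    }
}
```

With nine records, three propagations fail in this model, yet no buffer is left allocated at the end of the loop. As the preceding paragraphs explain, clearing the message object in this way releases the object and its buffers only; it does not by itself reset the parsers.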


        Last updated: 2016-08-12 11:20:23