IBM Integration Bus, Version 9.0.0.5 Operating Systems: AIX, HP-Itanium, Linux, Solaris, Windows, z/OS


Handling large input messages

Although a message flow developer cannot usually change the size of the input data or the model that leads to large memory usage, they can influence how much of a message is actively stored in memory at the same time.

In previous examples, fully parsed message trees were discussed. A standard full parse of a message tree parses a bitstream from start to finish, and fully inflates the message tree in memory. When this process produces a large message tree, large message handling techniques can be implemented to reduce memory usage.

The large message handling techniques presented here work on the principle that any large message tree is large because it has many repeating records. As such, the logic that is implemented by a message flow deals with each record one at a time and undertakes the necessary processing on that record. Because only one record is ever viewed at a time, only that one record needs to be held in memory. This process is achieved by the following sequence:
  1. Use partial parsing to parse the bitstream into a message tree.
  2. Parse the first repeating record with a reference variable (see References & navigating the message tree for more information).
  3. Move to the next repeating record and delete the parent field of the previous record.
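
The sequence above can be modeled with a small, self-contained Java sketch. It does not use the IBM Integration Bus API: the RecordStreamSketch class, its methods, and the use of strings as records are hypothetical stand-ins, chosen only to show that moving to the next record and then deleting the previous one keeps the number of inflated records constant.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a partially parsed message tree; NOT the IBM API.
final class RecordStreamSketch {
    private final Iterator<String> bitstream;                  // records not yet parsed
    private final Deque<String> inflated = new ArrayDeque<>(); // records held in the tree
    private int peak = 0;                                      // most records held at once

    RecordStreamSketch(List<String> rawRecords) {
        this.bitstream = rawRecords.iterator();
    }

    // Step 1 and 2: parse on demand, inflating only the next repeating record.
    boolean moveToNext() {
        if (!bitstream.hasNext()) {
            return false;
        }
        inflated.addLast(bitstream.next());
        peak = Math.max(peak, inflated.size());
        return true;
    }

    String current() {
        return inflated.peekLast();
    }

    // Step 3: delete the record that was processed before the current one.
    void deletePrevious() {
        if (inflated.size() > 1) {
            inflated.removeFirst();
        }
    }

    int peak() {
        return peak;
    }

    // Process every record one at a time; returns the total character count.
    long processAll() {
        long totalChars = 0;
        boolean more = moveToNext();
        while (more) {
            totalChars += current().length(); // the per-record processing
            more = moveToNext();              // move to the next repeating record
            deletePrevious();                 // then delete the previous record
        }
        return totalChars;
    }

    public static void main(String[] args) {
        RecordStreamSketch tree =
            new RecordStreamSketch(Collections.nCopies(1000, "rec"));
        System.out.println(tree.processAll()); // prints 3000
        System.out.println(tree.peak());       // prints 2
    }
}
```

In this model the peak is 2 rather than 1 because the next record is inflated just before the previous one is deleted. The point is that the peak does not grow with the number of records in the input.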

If an input message is large because it contains, for example, an embedded video or audio file, the large content is a single field rather than many repeating fields, so you cannot use these methods to reduce memory usage. Because the whole structure must be populated in the message tree, there is typically less capacity to optimize processing. However, you can still take advantage of the Configuration Recommendations to Reduce Memory Usage Best Practice PDF document, or Configuration Recommendations to Reduce Memory Usage in the main product documentation.

When the parent field of a record is deleted after the record has been processed, the elements that were used for that record become available for the parser to use again. As a result, when the next repeating record is fully parsed, it reuses the same underlying message tree field objects instead of creating new ones. With this technique, only the memory for one record is ever used, irrespective of how many repeating records are in the input message.
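
The reuse of field objects can be illustrated with a minimal Java model. This is a sketch under stated assumptions, not the parser's actual implementation: FieldPoolSketch, Field, and the free list are hypothetical names. It only shows that returning deleted fields to the parser bounds the number of allocated objects by the size of one record.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a parser that recycles deleted field objects; NOT the IBM API.
final class FieldPoolSketch {
    static final class Field {
        String value;
    }

    private final Deque<Field> free = new ArrayDeque<>(); // deleted fields awaiting reuse
    private int created = 0;                              // Field objects ever allocated

    // Parse one field: reuse a freed object if one exists, otherwise allocate.
    Field parseField(String value) {
        Field field = free.pollFirst();
        if (field == null) {
            field = new Field();
            created++;
        }
        field.value = value;
        return field;
    }

    // Deleting a record hands its field objects back to the parser.
    void deleteRecord(List<Field> record) {
        free.addAll(record);
        record.clear();
    }

    // Parse the given number of records and return how many objects were allocated.
    static int allocations(int records, int fieldsPerRecord, boolean deleteAfterUse) {
        FieldPoolSketch parser = new FieldPoolSketch();
        List<List<Field>> tree = new ArrayList<>(); // records that stay in the tree
        for (int r = 0; r < records; r++) {
            List<Field> record = new ArrayList<>();
            for (int f = 0; f < fieldsPerRecord; f++) {
                record.add(parser.parseField("r" + r + "f" + f));
            }
            if (deleteAfterUse) {
                parser.deleteRecord(record); // fields become reusable
            } else {
                tree.add(record);            // fields stay allocated
            }
        }
        return parser.created;
    }

    public static void main(String[] args) {
        System.out.println(allocations(1000, 5, true));  // prints 5
        System.out.println(allocations(1000, 5, false)); // prints 5000
    }
}
```

With deletion, the allocation count equals the number of fields in one record. Without it, the count grows with the number of records.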

Large message handling depends on being able to delete message tree fields, and the following methods are currently supported:
  1. For ESQL, use the DELETE FIELD statement.
  2. For .NET, use the NbElement delete() method.
  3. For Java™, use the MbElement delete() method.
  4. For C, use the cniDelete() method.

This technique also depends on the message tree never being fully parsed by any other part of IBM® Integration Bus. If at any point all of the repeating records are in memory at the same time, deleting instances of these records does not reduce memory usage, because the largest footprint for the message tree was already allocated. The following are examples of IBM Integration Bus functions that might lead to a message tree being fully parsed, and would therefore negate the benefits of large message handling:
  • Complete or Immediate parsing on any input node.
  • A trace node with ${Root} or ${Body} in it.
  • Attaching the flow debugger, because the content of message trees is sent to the debugger.
  • Serializing a large message tree with a code page, encoding, or validation options that differ from those it was created with.
  • Copying the message tree to a different domain, such as with SET OutputRoot.XMLNSC = InputRoot.DFDL. This behavior forces both the input and output trees to be in memory in their entirety at the same time.

In summary, large message handling techniques can reduce how many message tree fields are in memory at the same time. Deleting fields is not limited to repeating fields; it can be applied to any message tree field. However, because the DELETE FIELD statement incurs an overhead when the field is removed from the tree hierarchy, a message flow developer might not want to delete every field that is ever referenced.

Tip: Large message handling relies on the input trees not being parsed at all, and on their being copied to a modifiable output tree for record handling. Take care that the InputRoot tree is never fully parsed, upstream as well as downstream. Avoid situations where the InputRoot tree is handled correctly from the Compute node onwards, but is then accidentally parsed when processing returns to the nodes before the Compute node.

        Last updated: 2016-08-12 11:20:23