IBM Integration Bus, Version 9.0.0.8 Operating Systems: AIX, HP-Itanium, Linux, Solaris, Windows, z/OS


Working with large XML messages

The tree representation of an XML message is typically bigger than the input bit stream. Manipulating a large message tree can require a large amount of storage, but you can code ESQL statements that reduce the storage load on the broker.

When an input bit stream is parsed and a logical tree is created, the tree representation of an XML message is typically bigger, and in some cases much bigger, than the corresponding bit stream.
The reasons for this expansion include the following factors:
  • The addition of the pointers that link the objects together
  • Translation of character data into Unicode, which can double the size
  • The inclusion of field names that might have been implicit in the bit stream
  • The presence of control data that is associated with operation of the broker

Manipulating a large message tree can require a large amount of storage. If you design a message flow that handles large messages that are made up of repeating structures, you can code ESQL statements that reduce the storage load on the broker. These statements support both random and sequential access to the message, but assume that you do not need access to the whole message at one time.

These ESQL statements cause the broker to perform limited parsing of the message and to keep in storage, at any one time, only the part of the message tree that reflects a single record. If your processing requires you to retain information from record to record (for example, to calculate a total price from a repeating structure of items in an order), you can either declare, initialize, and maintain ESQL variables, or you can save values in another part of the message tree; for example, in the local environment.
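
The two ways of carrying a value from record to record can be sketched as follows. This is a minimal illustration rather than part of the example later in this topic: the reference itemRef, the variable runningTotal, and the field Variables.RunningTotal are hypothetical names, itemRef is assumed to already point at the current record, and the second option assumes that the Compute mode property of the Compute node includes LocalEnvironment so that OutputLocalEnvironment can be modified.

```esql
-- Option 1: keep the running value in an ESQL variable.
-- Declare and initialize it once, before the record loop starts.
DECLARE runningTotal FLOAT 0.0e0;

-- Inside the loop, itemRef (hypothetical) points at the current record.
SET runningTotal = runningTotal
  + CAST(itemRef.UnitPrice AS FLOAT) * CAST(itemRef.Quantity AS FLOAT);

-- Option 2: keep the running value in the local environment tree.
-- COALESCE supplies zero the first time, when the field does not yet exist.
SET OutputLocalEnvironment.Variables.RunningTotal =
  COALESCE(CAST(OutputLocalEnvironment.Variables.RunningTotal AS FLOAT), 0.0e0)
  + CAST(itemRef.UnitPrice AS FLOAT) * CAST(itemRef.Quantity AS FLOAT);
```

A variable is the simpler choice when the value is needed only inside the same node; the local environment is useful when a later node in the flow also needs the value.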

This technique reduces the memory that is used by the broker to the memory that is needed to hold the full input and output bit streams, plus the memory that is needed for the message trees of just one record. The technique provides memory savings even when the message contains only a few repeats. The broker achieves this by using partial parsing, and by converting specified parts of the message tree to and from the corresponding part of the bit stream.
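
Converting one part of a tree to and from a bit stream might look like the following sketch. It assumes a hypothetical reference recordRef that points at a single, fully built record folder in an XMLNS tree, and it uses CCSID 1208 (UTF-8) purely as an illustrative choice; the variable name recordBits is also an assumption.

```esql
-- Serialize only the folder that recordRef (hypothetical) points at,
-- not the whole message.
DECLARE recordBits BLOB
  ASBITSTREAM(recordRef OPTIONS FolderBitStream CCSID 1208);

-- Later, parse the same bytes back into a tree.
-- The options and CCSID must match those used to create the bit stream.
CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNS')
  PARSE(recordBits OPTIONS FolderBitStream CCSID 1208);
```

The FolderBitStream option is what restricts both operations to a single folder, so only that folder has to exist as a tree at any one time.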

To use these techniques in your Compute node, take any of the following steps.
  • Copy the body of the input message as a bit stream to a special folder in the output message. This action creates a modifiable copy of the input message that is not parsed and therefore uses a minimum amount of memory.
  • Avoid any inspection of the input message, which avoids the need to parse the message.
  • Use a loop and a reference variable to step through the message one record at a time. For each record, use the following processes:
    • Use normal transforms to build a corresponding output subtree in a second special folder.
    • Use the ASBITSTREAM function to generate a bit stream for the output subtree. The generated bit stream is stored in a BitStream element that is placed in the position in the output subtree that corresponds to its required position in the final bit stream.
    • Use the DELETE statement to delete both the current input and output record message trees when you have completed their manipulation.
  • When you have completed the processing of all records, detach the special folders so that they do not appear in the output bit stream.

You can vary these techniques to suit the processing that is required for your messages.

The following ESQL code provides an example of one implementation, and is a modification of the ESQL example in Transforming a complex message. It uses a single SET statement with nested SELECT functions to transform a message that contains nested, repeating structures.
-- Copy the MQMD header
  SET OutputRoot.MQMD = InputRoot.MQMD;

-- Create a special folder in the output message to hold the input tree
-- Note : SourceMessageTree is the root element of an XML parser
  CREATE LASTCHILD OF OutputRoot.XMLNS.Data DOMAIN 'XMLNS' NAME 'SourceMessageTree';

-- Copy the input message to a special folder in the output message
-- Note : This is a root to root copy which will therefore not build trees
  SET OutputRoot.XMLNS.Data.SourceMessageTree = InputRoot.XMLNS;

-- Create a special folder in the output message to hold the output tree
  CREATE FIELD OutputRoot.XMLNS.Data.TargetMessageTree;

-- Prepare to loop through the purchased items
  DECLARE sourceCursor REFERENCE TO OutputRoot.XMLNS.Data.SourceMessageTree.Invoice;
  DECLARE targetCursor REFERENCE TO OutputRoot.XMLNS.Data.TargetMessageTree;
  DECLARE resultCursor REFERENCE TO OutputRoot.XMLNS.Data;
  DECLARE grandTotal   FLOAT     0.0e0;

-- Create a block so that it's easy to abandon processing
  ProcessInvoice: BEGIN
  -- If there are no Invoices in the input message, there is nothing to do
    IF NOT LASTMOVE(sourceCursor) THEN
      LEAVE ProcessInvoice;
    END IF;

  -- Loop through the invoices in the source tree
  InvoiceLoop : LOOP
    -- Inspect the current invoice and create a matching Statement
    SET targetCursor.Statement = THE (SELECT 'Monthly' AS (XML.Attribute)Type,
                                             'Full' AS (0x03000000)Style[1],
                                             I.Customer.FirstName AS Customer.Name,
                                             I.Customer.LastName AS Customer.Surname,
                                             I.Customer.Title AS Customer.Title,
                                             (SELECT
                                               FIELDVALUE(II.Title) AS Title,
                                               CAST(II.UnitPrice AS FLOAT) * 1.6 AS Cost,
                                               II.Quantity AS Qty
                                             FROM I.Purchases.Item[] AS II
                                             WHERE II.UnitPrice > 0.0) AS Purchases.Article[],
                                             (SELECT
                                               SUM( CAST(II.UnitPrice AS FLOAT) *
                                                    CAST(II.Quantity  AS FLOAT) *
                                                    1.6                          )
                                             FROM I.Purchases.Item[] AS II) AS Amount,
                                             'Dollars' AS Amount.(XML.Attribute)Currency
                                           FROM sourceCursor AS I
                                           WHERE I.Customer.LastName <> 'White');

    -- Turn the current Statement into a bit stream
    DECLARE StatementBitStream BLOB
      ASBITSTREAM(targetCursor.Statement OPTIONS FolderBitStream);

    -- If the SELECT produced a result
    -- (that is, it was not filtered out by the WHERE clause),
    -- process the Statement
    IF StatementBitStream IS NOT NULL THEN
      -- Create a field to hold the bit stream in the result tree
      CREATE LASTCHILD OF resultCursor
        TYPE  XML.BitStream
        NAME  'StatementBitStream'
        VALUE StatementBitStream;

      -- Add the current Statement's Amount to the grand total
      -- Note that the cast is necessary because of the behavior
      -- of the XML syntax element
      SET grandTotal = grandTotal
        + CAST(targetCursor.Statement.Amount AS FLOAT);
    END IF;

    -- Delete the real Statement tree leaving only the bit stream version
    DELETE FIELD targetCursor.Statement;

    -- Step onto the next Invoice,
    -- removing the previous invoice and any
    -- text elements that might have been
    -- interspersed with the Invoices

    REPEAT
      MOVE sourceCursor NEXTSIBLING;
      DELETE PREVIOUSSIBLING OF sourceCursor;
    UNTIL (FIELDNAME(sourceCursor) = 'Invoice')
       OR (LASTMOVE(sourceCursor) = FALSE)
    END REPEAT;

    -- If there are no more invoices to process, abandon the loop 
    IF NOT LASTMOVE(sourceCursor) THEN
     LEAVE InvoiceLoop;
    END IF;

  END LOOP InvoiceLoop;
  END ProcessInvoice;

  -- Remove the temporary source and target folders
  DELETE FIELD OutputRoot.XMLNS.Data.SourceMessageTree;
  DELETE FIELD OutputRoot.XMLNS.Data.TargetMessageTree;

  -- Finally add the grand total
  SET resultCursor.GrandTotal = grandTotal;

This ESQL code produces the following output message:
<Data>
 <Statement Type="Monthly" Style="Full">
  <Customer>
   <Name>Andrew</Name>
   <Surname>Smith</Surname>
   <Title>Mr</Title>
  </Customer>
  <Purchases>
   <Article>
    <Title>The XML Companion</Title>
    <Cost>4.472E+1</Cost>
    <Qty>2</Qty>
   </Article> 
   <Article>
    <Title>A Complete Guide to DB2 Universal Database</Title>
    <Cost>6.872E+1</Cost>
    <Qty>1</Qty>
   </Article> 
   <Article>
    <Title>JAVA 2 Developers Handbook</Title>
    <Cost>9.5984E+1</Cost>
    <Qty>1</Qty>
   </Article>
  </Purchases>
  <Amount Currency="Dollars">2.54144E+2</Amount>
 </Statement>
 <GrandTotal>2.54144E+2</GrandTotal>
</Data>

ac67176_.htm | Last updated Friday, 21 July 2017