IBM Support

IT17612: MQ-JMS: An unexpected byte-order-mark character is visible in messages decoded from CCSID 17584

APAR status

  • Closed as program error.

Error description

  • A WebSphere MQ classes for JMS V7.5.0.5 application consumes a
    message from a queue, which had been generated and put to the
    queue by a Siebel application.  The message is put to the queue
    with the following declared character encoding configuration:
    
      MQMD Format:  MQSTR  (MQFMT_STRING)
      MQMD CodedCharSetId: 17584
      MQMD Encoding: 546 (0x222)
    
    The body of the message consists of XML character data.  When
    the message is consumed by the receiving application, and its
    character content is passed to an XML parser, the XML parser
    throws a parsing error.
    
    Previously, the application had been using the WebSphere MQ
    classes for JMS V7.0.1.11 where the problem was not seen, and
    the XML parser was able to process the message successfully.
    
    Examining the byte sequence at the start of the message body on
    the queue before being consumed by the JMS application, the
    bytes of the message body were of the form:
    
    
      0 1 2 3  4 5 6 7  8 9 A B  C D E F
      fffe3c00 3f007800 6d006c00 20007600 : ..<.?.x.m.l. .v.
      65007200 73006900 6f006e00 3d002200 : e.r.s.i.o.n.=.".
      31002e00 30002200 20006500 6e006300 : 1...0.". .e.n.c.
      6f006400 69006e00 67003d00 22005500 : o.d.i.n.g.=.".U.
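
    The dump can be read directly: the first two bytes (ff fe) are
    a UTF-16 little-endian byte-order-mark, and the bytes after it
    are the start of an XML declaration in UTF-16LE. The Java
    sketch below is illustrative only. The class and method names
    are hypothetical and are not part of the IBM MQ classes for
    JMS. It converts the four dump rows back into bytes, checks for
    the byte-order-mark, and decodes the bytes that follow it.

```java
import java.nio.charset.StandardCharsets;

final class HexDumpDecode {
    // The four 16-byte rows of the dump, copied with their spacing intact.
    static final String DUMP_HEX =
          "fffe3c00 3f007800 6d006c00 20007600 "
        + "65007200 73006900 6f006e00 3d002200 "
        + "31002e00 30002200 20006500 6e006300 "
        + "6f006400 69006e00 67003d00 22005500";

    // Converts a hex string (whitespace ignored) into the bytes it represents.
    static byte[] hexToBytes(String hex) {
        String compact = hex.replaceAll("\\s+", "");
        byte[] out = new byte[compact.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(compact.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // True when the data starts with the UTF-16 little-endian BOM, 0xFF 0xFE.
    static boolean startsWithUtf16LeBom(byte[] b) {
        return b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE;
    }

    // Decodes the body as UTF-16LE, skipping a leading BOM when one is present.
    static String decodeBody(byte[] b) {
        int offset = startsWithUtf16LeBom(b) ? 2 : 0;
        return new String(b, offset, b.length - offset, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] body = hexToBytes(DUMP_HEX);
        System.out.println("BOM present: " + startsWithUtf16LeBom(body));
        System.out.println("Text: " + decodeBody(body));
    }
}
```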
    

Local fix

  • Configure the message producing application to generate a
    message body which is encoded in an alternative character
    encoding scheme, such as:
    
      UTF-8  (CCSID 1208)
    

Problem summary

  • ****************************************************************
    USERS AFFECTED:
    This issue affects users of the IBM MQ classes for JMS who have
    applications that are consuming messages where the message body
    is declared to be of type MQSTR, with the character encoding
    declaration:
    
      CCSID:  1200, 13488, 17584
      Encoding: 546 (0x222)
    
    where the message body contains a byte-order-mark at the start
    of the data.
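
    The Encoding value 0x222 (decimal 546) packs three sub-fields,
    one hexadecimal digit each, for binary integers, packed
    decimals and floating point numbers. The lowest digit describes
    integers, and the value 2 there means reversed (little-endian)
    byte order. The sketch below shows one way to test for this.
    The constants are redefined locally with the values of the
    corresponding MQENC_* constants so that the example is
    self-contained; the class name is hypothetical.

```java
final class MqEncodingFields {
    // Local copies of the MQENC_* mask values (assumption: defined here
    // only so that the sketch compiles without the MQ client libraries).
    static final int INTEGER_MASK = 0x0000000F;
    static final int DECIMAL_MASK = 0x000000F0;
    static final int FLOAT_MASK   = 0x00000F00;

    // Value of the integer sub-field that means reversed (little-endian).
    static final int INTEGER_REVERSED = 0x00000002;

    // True when the integer sub-field of the Encoding value is little-endian.
    static boolean isLittleEndianInteger(int encoding) {
        return (encoding & INTEGER_MASK) == INTEGER_REVERSED;
    }

    public static void main(String[] args) {
        int encoding = 0x222;
        System.out.println("decimal value: " + encoding);
        System.out.println("integer field: 0x" + Integer.toHexString(encoding & INTEGER_MASK));
        System.out.println("decimal field: 0x" + Integer.toHexString(encoding & DECIMAL_MASK));
        System.out.println("float field:   0x" + Integer.toHexString(encoding & FLOAT_MASK));
        System.out.println("little-endian integers: " + isLittleEndianInteger(encoding));
    }
}
```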
    
    
    Platforms affected:
    MultiPlatform
    
    ****************************************************************
    PROBLEM DESCRIPTION:
    With the code change associated with MQ APAR IV40180:
    
    http://www.ibm.com/support/docview.wss?uid=swg1IV40180
    
    when an IBM MQ classes for JMS application consumes a message
    which is declared to be character encoded using CCSID 1200,
    13488 or 17584, and the Encoding field is declared to have
    little-endian integer encoding (0x222), the IBM MQ classes for
    JMS map this character encoding scheme to the Java Charset
    named:
    
        UTF-16LE
    
    This differs from the currently defined IBM global standards
    (external to IBM MQ), where these CCSID values are all declared
    to be big-endian encoded, as per the IBM Globalization
    documentation:
    
    1200:
    https://www.ibm.com/software/globalization/ccsid/ccsid1200.html
    Name: "UTF-16 BE with IBM PUA"
          "Data is big endian order"
    
    13488:
    https://www.ibm.com/software/globalization/ccsid/ccsid13488.html
    Name: "Unicode 2.0, UTF-16 BE with IBM PUA"
          "Data is big endian order"
    
    17584:
    https://www.ibm.com/software/globalization/ccsid/ccsid17584.html
    Name: "Unicode 3.0, UTF-16 BE with IBM PUA"
          "Data is big endian order"
    
    
    It was observed that when viewing the message on the queue prior
    to consumption by the JMS application, the bytes of this
    particular message's body also started with a byte-order-mark:
    
        '0xFF 0xFE'
    
    The Java Charset 'UTF-16LE' does not treat a leading
    byte-order-mark as an encoding signature. As a result, this
    message's byte-order-mark was decoded as a character (U+FEFF)
    at the start of the document, and was included in the
    "java.lang.String" object returned to the application by the
    JMS method call:
    
      javax.jms.TextMessage.getText()
    
    This in turn resulted in the application's XML parser failing to
    correctly parse the XML document.
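
    The effect can be reproduced outside of MQ with the standard
    Java Charset. The sketch below is illustrative and uses
    hypothetical names. It decodes a short byte sequence that
    starts with 0xFF 0xFE using UTF-16LE, and shows that the first
    character of the resulting String is U+FEFF. The
    stripLeadingBom helper is one defensive option an application
    could apply to the text before parsing it; it is not part of
    the fix described in this APAR.

```java
import java.nio.charset.StandardCharsets;

final class LeadingBomDemo {
    static final char BOM = '\uFEFF';

    // Decodes with the plain UTF-16LE charset, which keeps a BOM as a character.
    static String decodeAsUtf16Le(byte[] body) {
        return new String(body, StandardCharsets.UTF_16LE);
    }

    // Returns the text without a leading U+FEFF; other text is returned unchanged.
    static String stripLeadingBom(String text) {
        if (text != null && !text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) {
        // BOM followed by '<' and '?' in little-endian byte order.
        byte[] body = {(byte) 0xFF, (byte) 0xFE, 0x3C, 0x00, 0x3F, 0x00};
        String text = decodeAsUtf16Le(body);
        System.out.println("first char is U+FEFF: " + (text.charAt(0) == BOM));
        System.out.println("after stripping: " + stripLeadingBom(text));
    }
}
```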
    
    
    Prior to the MQ APAR IV40180, a message's character data
    declared to be encoded using CCSID 17584 (with message Encoding
    value 0x222) would be decoded using the Java Charset name:
    
      'UnicodeLittle'
    
    which permitted the byte-order-mark to be present in the bytes
    of the message body.  The issue with this Java Charset is that
    there was no mapping present in the IBM MQ classes for Java/JMS
    to map it back to an IBM CCSID, which meant that while messages
    could be received and decoded using this Java Charset, those
    same messages could then not be sent back to the queue manager
    using the IBM MQ classes for JMS API.
    
    The code change associated with APAR IV40180 was included in the
    MQ versions:
    
      7.0.1.12
      7.1.0.4
      7.5.0.3
    
    resulting in the observed change of behaviour when moving from
    an earlier version of the IBM MQ classes for JMS to one of the
    above fix pack levels, or later.
    
    By mapping CCSID 17584 to "UTF-16LE" as APAR IV40180 did, a
    byte-order-mark present in the message on the queue is decoded
    as a character in the Java String object, which is incorrect.
    It should be noted, however, that CCSID 17584 is currently
    officially declared as always being big-endian ordered without
    a byte-order-mark.
    
    
    In this same scenario, when the JVM system property was defined:
    
      -Dcom.ibm.mq.cfg.CCSID.MapUtf16ByteOrderByCCSID=YES
    
    then all the byte ordering was reversed, resulting in corrupted
    character data as the IBM MQ classes for JMS mapped CCSID 17584
    to the encoding scheme CCSID 1200, resulting in the use of the
    big-endian Java Charset "UTF-16".
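
    The kind of corruption caused by reversed byte ordering can be
    illustrated in isolation. The sketch below is not the MQ code
    path: it uses the explicit Java Charset UTF-16BE, and a
    hypothetical class name, only to show what happens to
    little-endian data without a byte-order-mark when each pair of
    bytes is read in the opposite order.

```java
import java.nio.charset.StandardCharsets;

final class ByteOrderReversalDemo {
    // Encodes text as UTF-16LE (no BOM), then decodes the bytes as big-endian.
    static String decodeLittleEndianBytesAsBigEndian(String text) {
        byte[] littleEndian = text.getBytes(StandardCharsets.UTF_16LE);
        return new String(littleEndian, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String wrong = decodeLittleEndianBytesAsBigEndian("<?xml");
        // '<' is 0x3C 0x00 on the wire; read as big-endian it becomes U+3C00.
        System.out.printf("first char: U+%04X%n", (int) wrong.charAt(0));
        System.out.println("matches original: " + wrong.equals("<?xml"));
    }
}
```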
    

Problem conclusion

  • The default encoding mapping for CCSIDs:
    
      1200
      13488
      17584
    
    where the message's integer "Encoding" value is defined to use
    little-endian encoding (0x222), has now been mapped to the Java
    Charset named:
    
      x-UTF16LE-BOM
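
    The behaviour of a byte-order-mark-aware little-endian Charset
    can be checked with a few lines of Java. The sketch below
    assumes an OpenJDK-based JVM, where this Charset is registered
    under the name "x-UTF-16LE-BOM" (note the extra hyphen compared
    with the spelling above); other JVMs may use a different name.
    The class name is hypothetical. A leading 0xFF 0xFE is consumed
    as an encoding signature rather than returned as a character,
    and data without a byte-order-mark is read as little-endian.

```java
import java.nio.charset.Charset;

final class BomAwareDecodeDemo {
    // Assumption: the name under which OpenJDK registers this charset.
    static final String CHARSET_NAME = "x-UTF-16LE-BOM";

    static String decode(byte[] body) {
        return new String(body, Charset.forName(CHARSET_NAME));
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xFF, (byte) 0xFE, 0x3C, 0x00, 0x3F, 0x00};
        byte[] withoutBom = {0x3C, 0x00, 0x3F, 0x00};
        System.out.println("with BOM:    " + decode(withBom));
        System.out.println("without BOM: " + decode(withoutBom));
    }
}
```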
    
    In addition, due to the complexity of the use of CCSID
    1200/13488/17584 with IBM MQ, a new property has been defined
    which controls which Java Charset is used to decode the bytes
    of the message on the queue, irrespective of the integer
    encoding value specified on the message.
    
    This property has the name:
    
        com.ibm.mq.cfg.CCSID.MapCcsid1200ToSpecificCharset
    
    and can be set as a JVM argument.
    
    For example, if the IBM MQ classes for JMS are to be configured
    to interpret a CCSID 1200 message's bytes using the Java
    'UnicodeLittle' encoding, you would use the command line JVM
    argument:
    
    
      -Dcom.ibm.mq.cfg.CCSID.MapCcsid1200ToSpecificCharset=UnicodeLittle
    
    Note that this property has no effect when sending a message
    from the IBM MQ classes for JMS back to MQ, so care is needed
    when using it.  If you use this property, and specify a Java
    Charset name which your running JVM recognises but is not one
    which the IBM MQ classes for JMS recognise, your application
    will be able to receive the message, but not send it back to MQ,
    as the IBM MQ classes for JMS will not be able to map the
    message's declared "JMS_IBM_Character_Set" property back into a
    CCSID value.
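
    Because a mistyped or unsupported Charset name is easy to
    introduce on a command line, an application could check the
    configured value at startup. The sketch below is a hypothetical
    helper, not an IBM MQ API. It reads the system property named
    above and reports whether the running JVM supports that Charset
    name. It cannot tell whether the IBM MQ classes for JMS can map
    the name back to a CCSID, so it covers only the receiving side
    of the caveat described above.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

final class CharsetPropertyCheck {
    static final String PROPERTY =
        "com.ibm.mq.cfg.CCSID.MapCcsid1200ToSpecificCharset";

    // True when the name is a legal charset name that this JVM supports.
    static boolean jvmSupports(String charsetName) {
        if (charsetName == null || charsetName.isEmpty()) {
            return false;
        }
        try {
            return Charset.isSupported(charsetName);
        } catch (IllegalCharsetNameException e) {
            return false; // the name contains characters that are not allowed
        }
    }

    public static void main(String[] args) {
        String configured = System.getProperty(PROPERTY);
        if (configured == null) {
            System.out.println(PROPERTY + " is not set");
        } else if (jvmSupports(configured)) {
            System.out.println("JVM supports charset: " + configured);
        } else {
            System.out.println("JVM does not support charset: " + configured);
        }
    }
}
```

    The check can be run with the same -D argument that will be
    passed to the application's JVM.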
    
    ---------------------------------------------------------------
    The fix is targeted for delivery in the following PTFs:
    
    Version    Maintenance Level
    v7.5       7.5.0.9
    v8.0       8.0.0.9
    v9.0 CD    9.0.5
    v9.0 LTS   9.0.0.4
    
    The latest available maintenance can be obtained from
    'WebSphere MQ Recommended Fixes'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037
    
    If the maintenance level is not yet available information on
    its planned availability can be found in 'WebSphere MQ
    Planned Maintenance Release Dates'
    http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
    ---------------------------------------------------------------
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT17612

  • Reported component name

    WMQ BASE MULTIP

  • Reported component ID

    5724H7241

  • Reported release

    750

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-10-23

  • Closed date

    2018-02-13

  • Last modified date

    2018-02-13

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WMQ BASE MULTIP

  • Fixed component ID

    5724H7241

Applicable component levels



Document information

More support for: WebSphere MQ
APAR / Maintenance

Software version: 7.5

Reference #: IT17612

Modified date: 13 February 2018

