DB2 Version 9.7 for Linux, UNIX, and Windows

XMLPARSE scalar function

Read syntax diagramSkip visual syntax diagram
                                             .-STRIP WHITESPACE----.      
>>-XMLPARSE--(--DOCUMENT--string-expression--+---------------------+--)-><
                                             '-PRESERVE WHITESPACE-'      

The schema is SYSIBM. The function name cannot be specified as a qualified name.

The XMLPARSE function parses the argument as an XML document and returns an XML value.

DOCUMENT
Specifies that the character string expression to be parsed must evaluate to a well-formed XML document that conforms to XML 1.0, as modified by the XML Namespaces recommendation (SQLSTATE 2200M).
string-expression
Specifies an expression that returns a character string or BLOB value. If a parameter marker is used, it must explicitly be cast to one of the supported data types.
STRIP WHITESPACE or PRESERVE WHITESPACE
Specifies whether or not whitespace in the input argument is to be preserved. If neither is specified, STRIP WHITESPACE is the default.
STRIP WHITESPACE
Specifies that text nodes containing only whitespace characters up to 1000 bytes in length will be stripped, unless the nearest containing element has the attribute xml:space='preserve'. If any text node begins with more that 1000 bytes of whitespace, an error is returned (SQLSTATE 54059).

The whitespace characters in the CDATA section are also affected by this option. DTDs may have DOCTYPE declarations for elements, but the content models of elements are not used to determine if whitespace is stripped or not.

PRESERVE WHITESPACE
Specifies that all whitespace is to be preserved, even when the nearest containing element has the attribute xml:space='default'.

The data type of the result is XML. If the result of string-expression can be null, the result can be null; if the result of string-expression is null, the result is the null value.

Note:
  1. Encoding of the input string: The input string may contain an XML declaration that identifies the encoding of the characters in the XML document. If the string is passed to the XMLPARSE function as a character string, it will be converted to the code page at the database server. This code page may be different from the originating code page and the encoding identified in the XML declaration.

    Therefore, applications should avoid direct use of XMLPARSE with character string input and should send strings containing XML documents directly using host variables to maintain the match between the external code page and the encoding in the XML declaration. If XMLPARSE must be used in this situation, a BLOB type should be specified as the argument to avoid code page conversion.

  2. Handling of DTDs: External document type definitions (DTDs) and entities must be registered in a database. Both internal and external DTDs are checked for valid syntax. During the parsing process, the following actions are also performed:
    • Default values that are defined by the internal and external DTDs are applied.
    • Entity references and parameter entities are replaced by their expanded forms.
    • If an internal DTD and an external DTD define the same element, an error is returned (SQLSTATE 2200M).
    • If an internal DTD and an external DTD define the same entity or attribute, the internal definition is chosen.
    After parsing, internal DTDs and entities, as well as references to external DTDs and entities, are not preserved in the stored representation of the value.
  3. Character conversion in non-UTF-8 databases: Code page conversion occurs when an XML document is parsed into a non-Unicode database server, if the document is passed in from a host variable or parameter marker of a character data type, or from a character string literal. Parsing an XML document using a host variable or parameter marker of type XML, BLOB or FOR BIT DATA (CHAR FOR BIT DATA or VARCHAR FOR BIT DATA) prevents code page conversion. When a character data type is used, care must be taken to ensure that all characters in the XML document have a matching code point in the target database code page, otherwise substitution characters may be introduced. The configuration parameter enable_xmlchar can be used to help ensure the integrity of XML data stored in a non-Unicode database. Setting this parameter to "NO" blocks the insertion of XML documents from character data types. The BLOB and FOR BIT DATA data types are still allowed, as documents passed into a database using these data types avoid code page conversion.

Example

Using the PRESERVE WHITESPACE option preserves the white space characters in the XML document inserted into the table, including the white space characters in the description element.

INSERT INTO PRODUCT VALUES ('100-103-99','Tool bag',14.95,NULL,NULL,NULL,
XMLPARSE( DOCUMENT 
  '<produce xmlns="http://posample.org" pid="100-103-99">
    <description>
     <name>Tool bag</name>
     <details>
      Super Deluxe tool bag:
      - 26 inches long, 12 inches wide
      - Curved padded handle
      - Locking latch
      - Reinforced exterior pockets
     </details>
     <price>14.95</price>
     <weight>3 kg</weight>
    </description>
  </product>' PRESERVE WHITESPACE ));
Running the following select statement
SELECT XMLQUERY ('$d/*:product/*:description/*:details' PASSING DESCRIPTION as "d" )
FROM PRODUCT WHERE PID = '100-103-99' ;
returns the details element with the white space characters:
<details xmlns="http://posample.org">
  Super Deluxe tool bag:
  - 26 inches long, 12 inches wide 
  - Curved padded handle
  - Locking latch
  - Reinforced exterior pockets
</details>