Example: Use XMLTABLE to handle multiple values

Sometimes a path expression refers to an item that has multiple values.

The path expressions in the COLUMNS clause must not produce more than one item per row. In the sample documents, notice that the employee Mary Jones has two phone numbers. If you need to query this data and return a relational table with each employee's name and phone number, the query you would write might look like this:

SELECT X.*
  FROM emp,
       XMLTABLE ('$d/dept/employee' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH 'name/first',
                      lastname  VARCHAR(25)  PATH 'name/last',
                      phone     VARCHAR(12)  PATH 'phone') AS X

When run against the sample documents, this query fails since there are two values for phone. A different solution is needed.

Return only first value

One way to deal with this issue is to return only one of the multiple phone numbers. If you need summarized information for each employee, having just one phone number might be enough. Returning only one occurrence of the phone element can be done with a positional predicate in the XPath expression for the column phone.

Square brackets in XPath are used to specify predicates. To obtain the first phone element for an employee, use a positional predicate, written either as [1] or [fn:position()=1]. The first notation of [1] is an abbreviated version of the second.

SELECT X.*
  FROM emp,
       XMLTABLE ('$d/dept/employee' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH 'name/first',
                      lastname  VARCHAR(25)  PATH 'name/last',
                      phone     VARCHAR(12)  PATH 'phone[1]') AS X

Return multiple values as XML

Another option to return multiple phone numbers for a single employee is to return an XML sequence of phone elements. To achieve this, the generated phone column needs to be of type XML, which allows you to return an XML value as the result of the XPath expression.

SELECT X.*
  FROM emp,
       XMLTABLE ('$d/dept/employee' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH 'name/first',
                      lastname  VARCHAR(25)  PATH 'name/last',
                      phone     XML          PATH 'phone') AS X

The result of this query is:

FIRSTNAME    LASTNAME   PHONE
 ----------- ---------- ------------------
John         Doe        -
Peter        Pan        <phone>905-416-5004</phone>
Mary         Jones      <phone>905-403-6112</phone><phone>647-504-4546</phone>

The XML value returned in the phone column for Mary Jones is not a well-formed XML document since there is no single root element. This value can still be processed by DB2®, but you won't be able to insert it into an XML column or parse it with an XML parser. Combining multiple phone numbers into a single VARCHAR or XML value may require additional code in your application to use the individual numbers.

Return multiple columns

Another solution is to return each phone number as a separate VARCHAR value by producing a fixed number of result phone columns. This example uses positional predicates to return phone numbers in two columns.

SELECT X.*
  FROM emp,
       XMLTABLE ('$d/dept/employee' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH 'name/first',
                      lastname  VARCHAR(25)  PATH 'name/last',
                      phone     VARCHAR(12)  PATH 'phone[1]',
                      phone2    VARCHAR(12)  PATH 'phone[2]') AS X

An obvious drawback to this approach is that a variable number of items is being mapped to a fixed number of columns. One employee may have more phone numbers than anticipated. Others may have fewer which results in null values. If every employee has exactly one office phone and one cell phone, then producing two columns with corresponding names might be very useful.

Return one row for each value

Instead of returning the phone numbers in separate columns, you can also use XMLTABLE to return them in separate rows. In this case, you need to return one row for each phone number instead of one row for each employee. This may result in repeated information in the columns for the first and last names.

SELECT X.*
  FROM emp,
       XMLTABLE ('$d/dept/employee/phone' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH '../name/first',
                      lastname  VARCHAR(25)  PATH '../name/last',
                      phone     VARCHAR(12)  PATH '.') AS X

The result of this query is:

FIRSTNAME    LASTNAME   PHONE
 ----------- ---------- ------------------
Peter        Pan        905-416-5004
Mary         Jones      905-403-6112
Mary         Jones      647-504-4546

In this result, there is no row for John Doe since he has no phone number.

Handling non-existent path values

The previous example did not return a row for employee John Doe because the row-xquery expression iterates over all the phone elements and there is no phone element for the employee John Doe. As a result, the employee element for John Doe is never processed.

To resolve this issue, you need to use an SQL UNION of two XMLTABLE functions.

SELECT X.*
  FROM emp,
       XMLTABLE ('$d/dept/employee/phone' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH '../name/first',
                      lastname  VARCHAR(25)  PATH '../name/last',
                      phone     VARCHAR(12)  PATH '.') AS X
UNION
SELECT Y.*, CAST(NULL AS VARCHAR(12))
  FROM emp,
       XMLTABLE ('$d/dept/employee[fn:not(phone)]' PASSING doc AS "d"
                 COLUMNS
                      firstname VARCHAR(20)  PATH 'name/first',
                      lastname  VARCHAR(25)  PATH 'name/last') AS Y

The $d/dept/employee[fn:not(phone)] row expression in the second XMLTABLE returns all employees with no phone numbers, adding the employee rows that were omitted in the first XMLTABLE.