XML data indexing

An XML index can be used to improve the efficiency of queries on XML documents that are stored in an XML column.

In contrast to traditional relational indexes, where index keys are composed of one or more table columns that you specify, an XML index uses a particular XML pattern expression to index paths and values in XML documents stored within a single column. The data type of that column must be XML.

Instead of providing access to the beginning of a document, index entries in an XML index provide access to nodes within the document by creating index keys based on XML pattern expressions. Because multiple parts of a XML document can satisfy an XML pattern, DB2® might generate multiple index keys when it inserts values for a single document into the index.

You create an XML index using the CREATE INDEX statement, and drop an XML index using the DROP INDEX statement. The GENERATE KEY USING XMLPATTERN clause you include with the CREATE INDEX statement specifies what you want to index.

Some of the keywords used with the CREATE INDEX statement for indexes on non-XML columns do not apply to indexes over XML data.

Example: You want to create an XML index in the INFO column of the sample CUSTOMER table. The following document shows the format of documents in the INFO column.

<customerinfo xmlns="http://posample.org" Cid="1000">
<name>Kathy Smith</name> 
<addr country="Canada"> 
<street>5 Rosewood</street> 
<city>Toronto</city> 
<prov-state>Ontario</prov-state> 
<pcode-zip>M6W-1E6</pcode-zip> 
</addr> 
<phone type="work">416-555-1358</phone> 
</customerinfo>
Users of the CUSTOMER table often retrieve customer information using the customer ID. You might use an index like this one to make that retrieval more efficient:
Figure 1. Example of XML index creation
CREATE UNIQUE INDEX CUST_CID_XMLIDX ON CUSTOMER(INFO)       1 
 GENERATE KEY USING XMLPATTERN                              2 
'declare default element namespace "http://posample.org"; 
/customerinfo/@Cid'                                       
 AS SQL VARCHAR(4)                                          3 
Notes to Figure 1:
1 The XML index is defined on the INFO column of the CUSTOMER table. INFO must be of the XML data type.
2 The GENERATE KEY USING XMLPATTERN clause provides information about what you want to index. This clause is called an XML index specification. The XML index specification contains an XML pattern clause. The XML pattern clause in this example indicates that you want to index the values of the Cid attribute of each customerinfo element, and that the namespace for all unqualified elements is http://posample.org. The namespace declaration is necessary because the XML documents have a namespace of http://posample.org.
3 AS SQL VARCHAR(4) indicates that indexed values are stored as VARCHAR(4) values.