Indexes on patterns

You can create a pattern matching index on a time series to speed pattern matching queries.

If you do not create pattern matching indexes, pattern matching queries run sequential scans to find matching data.

A pattern matching index is used during a pattern matching query if the query and the index match in the following ways:

A pattern matching index processes time series values by a Discrete Fourier Transformation (DFT) into vector points, calculates bounding boxes around the vector points, and indexes the bounding boxes with an R-tree index structure. During a query, the pattern match index evaluates the bounding boxes for the target time series against the query pattern.

Before you create a pattern matching index, you must understand your data and the type of search queries that you plan to run. Consider the following factors:

For each time series instance, you can create one whole pattern match index and one subsequence pattern match index for each numeric column in the TimeSeries row type.

When you create a pattern matching index, an internal table is created to contain the bounding boxes. The internal table names have the following format: TS_Index_table_name_col_position, where table_name is the name of the time series table that is indexed, and col_position is the numeric position of the column in the TimeSeries data type that is indexed. The internal table TS_INDEX_MAPPING tracks the pattern matching indexes.

Index maintenance

To extend the time range of an existing index, run the TSCreatePatternIndex function with a new beginning point and ending point and with identical index properties of the existing index. You can extend the index time range in either direction, or both directions, but the existing and new time ranges must overlap.

To change the index properties of an existing index, replace an existing index with a new index by running the TSCreatePatternIndex function with the scope parameter set to 1.

To drop a particular index, run the TSDropPatternIndex function.

To drop all pattern matching indexes and the associated internal tables, run the TSPIndexCleanup function.

Storage requirements

The following formula gives an approximate number of disk pages for a pattern matching index:

disk pages = entries/((pagesize - 88)/(colsize +16))*.6
entries
The number of entries in the index depends on the search method.
Whole pattern match: entries=elements/partitionlen, where elements is the number of time series elements that are indexed and partitionlen is the value of the partitionlen index property.
Subsequence pattern match: entries=((elements/aggwin)-slidewin)/4, where elements is the number of time series elements that are indexed, aggwin is the value of the aggwin index property, and slidewin is the value of the slidewin index property.
pagesize
The page size of the dbspace, in bytes.
colsize
The size of the column in the TimeSeries subtype, in bytes.

The .6 multiplier is because R-tree index leaf pages are usually just over half full.


Copyright© 2018 HCL Technologies Limited