SYSPROC.SYSTS_CREATE

You can call the SYSPROC.SYSTS_CREATE stored procedure to create a text search index. This stored procedure enables a text column for text search indexing. The text search index can then be used in SQL queries that contain the CONTAINS or SCORE functions.

The text search index is created on one of the text search servers that is listed in the QSYS2.SYSTEXTSERVERS table. The text search index is not updated synchronously when the DB2® table is updated. Instead, a log of changes to the DB2 table column is captured by triggers and placed in a staging table.
Note: This stored procedure only defines the text search index. The text search index does not contain any data until after the first invocation of the SYSPROC.SYSTS_UPDATE stored procedure for the new text search index. You create the text search index after the table is initially populated. By creating the text search index after the table is initially populated, you avoid the firing of change triggers before an initial index update.

Prerequisites

Before the SYSPROC.SYSTS_CREATE stored procedure call, verify the following prerequisites:
  • DB2 text search functions were started by invoking the SYSPROC.SYSTS_START stored procedure and at least one text search server is running.
  • The table includes a column that is defined as primary key, unique index, or ROWID.
  • The QSYS2.SYSTEXTSERVERS table contains at least one entry.

Authorization

The privileges held by the authorization ID of the statement must include at least one of the following privileges:

The privileges held by the authorization ID of the statement must include at least one of the following privileges:

  • The following system authorities:
    • *USE to the Create Logical File (CRTLF) command
    • *CHANGE to the data dictionary if the library into which the text index is created is an SQL schema with a data dictionary
  • Administrative authority

The privileges held by the authorization ID of the statement must also include at least one of the following privileges:

  • For the referenced table:
    • The INDEX privilege on the table
    • The system authority *EXECUTE on the library containing the table
  • Administrative authority
  • If SQL names are specified, and a user profile exists that has the same name as the library into which the text index is created, and that name is different from the authorization ID of the statement, then the privileges held by the authorization ID of the statement must include at least one of the following privileges:
    • The system authority *ADD to the user profile with that name
    • Administrative authority

If a distinct type is referenced, the privileges held by the authorization ID of the statement must include at least one of the following privileges:

  • For each distinct type identified in the statement:
    • The USAGE privilege on the distinct type, and
    • The system authority *EXECUTE on the library containing the distinct type
  • Administrative authority

If the column that the text search index is built over contains a field procedure, the authorization ID must have one of

  • *EXECUTE authority to the program and
  • *EXECUTE authority on the library containing the program

Or

  • Administrative authority

For information about the system authorities corresponding to SQL privileges, see GRANT (Table or View Privileges).

Syntax

Read syntax diagramSkip visual syntax diagramSYSTS_CREATE(indexSchemanull,indexName,textSource,options)

The schema qualifier is SYSPROC.

Parameters

indexSchema
Identifies the schema of the text search index. If this parameter is null, the value of the CURRENT SCHEMA special register for the invoker is used. This value must be a valid SQL name.
Note: Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords that can be used.

The data type of this parameter is VARCHAR(128).

indexName
Identifies the name of the text search index. The name of the text search index with the index schema uniquely identifies the text search index in the DB2 subsystem. You must specify a non-null value for this parameter. This value must be a valid SQL name.
Note: Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords that can be used.

The data type for this parameter is VARCHAR(128).

textSource
Identifies the table and column specification for the document text source. This parameter can include user-defined functions. You must specify a non-null value for this parameter.

The data type for this parameter is VARCHAR(1024).

textSource
Read syntax diagramSkip visual syntax diagram tableSchema. tableName(  text-column-namefunction-schema. function-name( text-column-name) )
tableSchema
Identifies the schema of the table on which the text search index is created.
Note: Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords that can be used.
tableName
Identifies the name of the text table that contains the column that the external text search index is created on.
Notes:
  • Views and logical files are not supported.
  • An alias must point to a table or a single member of a physical file.
  • Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords.
text-column-name
Identifies the name of the column that contains the text that is used for creating the text search index. This column must be of type CHAR, CHAR FOR BIT DATA, BINARY, VARCHAR, VARCHAR FOR BIT DATA, VARBINARY, CLOB, DBCLOB, BLOB, XML, GRAPHIC, or VARGRAPHIC. If the data type is not one of these data types, you can specify an external function that returns a supported data type.
Notes:
  • Only one text search index is allowed for a column. If a text search index exists for the column, SQLCODE-20427 is returned.
  • Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords.
function-schema. function-name
Identifies the schema and the name of a built-in or user-defined function. The function can be used to modify a text document stored in the column. The function can also be used to access text documents in a column that is not of a supported data type. Or the function can be used to access a document that is stored elsewhere. The function has one input parameter for the text column data type. For example, an integer that serves as a foreign key to the document content in another table. The function returns a value of one of the OmniFind Text Search for DB2 for i supported data types. The function transforms the text column content to the indexed document content.
Notes:
  • Cast functions and functions with more than one argument are not allowed.
  • Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords that can be used.
options
A character string that specifies the various options that are available for this stored procedure.

The data type for this parameter is VARCHAR(32000).

options
Read syntax diagramSkip visual syntax diagram text-default-information update-characteristics index-configuration-options
text-default-information
Read syntax diagramSkip visual syntax diagramCCSIDccsidLANGUAGE languageFORMAT format
text-default-information
Specifies the coded character set identifier used when indexing binary text documents. Also specifies the language that is used when processing documents, and the format of text documents in the column.
CCSID ccsid
Specifies the coded character set identifier that is used for a text search index in a column with a binary data type. The default value is 1208 (UTF-8) and is taken from the QSYS2.SYSTEXTDEFAULTS table. All the CCSIDs that are supported for conversion to UTF-8 by IBM® i conversion services are allowed for this parameter.

This parameter is ignored for a text search index in a column with a non-binary data type. Text columns inherit the CCSID from the table specification. The ccsid value is ignored when the format value is set to INSO.

LANGUAGE language
Specifies the language that OmniFind Text Search Server for DB2 for i uses for the linguistic processing of text documents. The default value is en_US (English). If you specify the value for this parameter as AUTO, OmniFind Text Search Server for DB2 for i tries to determine the language.
Important: If the language of the documents is not English, do not use the default value of en_US. Change the value to the language of the documents; otherwise, linguistic processing does not work as expected.
FORMAT format
Identifies the format of text documents in the column, such as HTML. The OmniFind Text Search Server for DB2 for i needs to know the format, or content type, of the text documents that you intend to index and search. If you do not specify the format parameter, the default value is taken from the FORMAT column in the QSYS2.SYSTEXTDEFAULTS table. The supported format values are TEXT, HTML, XML, and INSO.

The format value INSO allows OmniFind Text Search Server for DB2 for i to determine the format. In this case, the ccsid value is ignored. If the OmniFind Text Search Server for DB2 for i cannot determine the document format, then a document error is noted in the job log during processing by the SYSPROC.SYSTS_UPDATE stored procedure.

Note: If you do not specify the format parameter while creating the index over an XML data type column, the default value is XML. If you specify the format parameter as TEXT or INSO, the XML search capability is not available over this index. In addition, a warning message appears in the job log.
update-characteristics
Read syntax diagramSkip visual syntax diagramUPDATE FREQUENCYNONE< update-frequency >UPDATE MINIMUMminchanges
update-characteristics
Specifies the frequency of automatic updates to the text search index. Also specifies the minimum number of changes to text documents before the text search index is updated incrementally at the specified time.
UPDATE FREQUENCY update-frequency
Specifies when to make automatic updates to the text search index. The default value is NONE. This option might be useful for a text column in which there are no further changes. The format of the update-frequency option supports two different formats.
update-frequency (Format 1)
Read syntax diagramSkip visual syntax diagramNONED(*0..6, 0..6)H(*0..23, 0..23)M(*0..59, 0..59)
NONE
If NONE is specified, then no further index updates are made. The update must be started manually. This option might be useful for a text column in which no further changes are planned.
D
Specifies the day or days of the week when the index is updated. An asterisk (*) specifies all days. 0 specifies Sunday.
H
Specifies the hour or hours when the index is updated. An asterisk (*) specifies all hours.
M
Specifies the minute or minutes when the index is updated. An asterisk (*) cannot be specified. The minimum update frequency is 5 minutes.
Example: This example specifies that the index update is to run every 30 minutes.
UPDATE FREQUENCY D(*) H(*) M(0,30)
update-frequency (Format 2, chronological)
Read syntax diagramSkip visual syntax diagram< minute>< hour>< dayOfMonth>< monthOfYear>< dayOfWeek>

The format of the update-frequency (chronological) option is a list of the five values separated by a blank space. The five values represent the minutes, hours, days of the month, months of the year, and days of the week beginning with Sunday.

If you specify an interval of values or an asterisk (*), you can specify a step value by using a forward slash (/) at the end of the defined interval.

Example: This example specifies that the index update is to run every quarter hour (0,15,30,45) on the even hours between 8 a.m. and 6:45 p.m. (8-18/2 is equivalent to 8,10,12,14,16,18), from Monday to Friday every month of the year (* * 1-5).

0,15,30,45 8-18/2 * * 1-5
minute
Specifies the minutes of the hour when the text search index is to be updated. You can specify an asterisk (*) for an interval of every 5 minutes, or you can specify an integer 0 - 59. You cannot repeat values. The minimum update frequency is 5 minutes. A value of 1,4,8 is not valid.
update-frequency (minute)
Read syntax diagramSkip visual syntax diagram*/ 0...59, 0...59 - 0...59/ 0...590...59
hour
Specifies the hours of the day when the text search index is to be updated. You can specify an asterisk (*) for every hour, or you can specify an integer 0 - 23. You cannot repeat values.
update-frequency (hour)
Read syntax diagramSkip visual syntax diagram*/ 0...23, 0...23 - 0...23/ 0...230...23
dayOfMonth
Specifies the days of the month when the text search index is to be updated. You can specify an asterisk (*) for every day, or you can specify an integer 1 - 31. You cannot repeat values.
update-frequency (dayOfMonth)
Read syntax diagramSkip visual syntax diagram*/ 1...31, 1...31 - 1...31/ 1...311...31
monthOfYear
Specifies the months of the year when the text search index is to be updated. You can specify an asterisk (*) for every month, or you can specify an integer 1 - 12. You cannot repeat values.
update-frequency (monthOfYear)
Read syntax diagramSkip visual syntax diagram*/ 1...12, 1...12 - 1...12/ 1...121...12
dayOfWeek
Specifies the days of the week when the text search index is to be updated. You can specify an asterisk (*) for every day, or you can specify an integer 0 - 7. Both 0 and 7 are valid values for Sunday. You cannot repeat values.
update-frequency (dayOfWeek)
Read syntax diagramSkip visual syntax diagram*/ 0...7, 0...7 - 0...7/ 0...70...7
UPDATE MINIMUM minchanges
Specifies the minimum number of record changes made to the underlying table before the text search index is updated incrementally at the time specified in the update-frequency option. The value must be an integer 1 - 2147483647. The default value is taken from the UPDATEMINIMUM column in the QSYS2.SYSTEXTDEFAULTS table.

This option is ignored when you update the text search index, unless you specify the USING UPDATE MINIMUM option in the SYSPROC.SYSTS_UPDATE stored procedure.

index-configuration-options
Read syntax diagramSkip visual syntax diagramINDEX CONFIGURATION(, optionvalue)
ROW_COLUMN_ACCESS
Read syntax diagramSkip visual syntax diagramNOT SECUREDSECURED
index-configuration-options
Specifies additional index-specific values as option value pairs. You must enclose string values in single quotation marks. A single quotation mark within a string value must be represented by two consecutive single quotation marks.
CJKSEGMENTATION
Specifies the segmentation method to use when you index documents for CJK (Chinese, Japanese, Korean) languages. The supported values are MORPHOLOGICAL and NGRAM. If the CJKSEGMENTATION value is not specified, the default value is used. The default value is specified by the DEFAULTNAME value in the QSYS2.SYSTEXTDEFAULTS table.
COMMENT
Specifies a comment that is stored in the REMARKS column of the QSYS2.SYSTEXTINDEXES administration table and as the description of the OmniFind Text Search Server for DB2 for i collection.

The value for this option is a string value that is less than or equal to 512 bytes.

IGNOREEMPTYDOCS
Specifies whether to represent empty documents in the text search index. Empty documents are those documents with an empty string or a null value.

The supported values for this option are 0 (zero) and 1. The default value is 1.

If this option is set to 1, empty documents are not represented in the text search index. If you use this option and change the document content to empty, the next incremental update deletes the documents from the text search index.

KEYCOLUMN
Specifies the name of a unique column to be used as the key column in the text index. The key column is used to associate data in the text index to a document or row in the base table. The specified column must have a primary key constraint or unique index. If KEYCOLUMN is not specified, the ROWID column from the table is used, if one exists. Otherwise, the primary key defined on the table is used.
SERVER
Specifies the ID or alias name of the server to be used to store the text search index. If an ID is used, the value is an integer that must exist in the SERVERID column of the QSYS2.SYSTEXTSERVERS catalog. If an alias name is used, the value is a string that must exist in the ALIASNAME column of the QSYS2.SYSTEXTSERVERS catalog. If SERVER is not specified, the default is to select the server with the fewest text search indexes from the servers in the QSYS2.SYSTEXTSERVERS table where parameter SERVERSTATUS is set to 0 (zero), which means that the server is available.
UPDATEAUTOCOMMIT
Specifies how often a commit operation is performed when fetching documents during an index update. A value of 0 (zero) means that a commit operation occurs only at the end of processing.

The value must be an integer between 0 (zero) and 2147483647. The default value is 100.

Performance tip: The value of UPDATEAUTOCOMMIT can have a substantial impact on the performance of index updates. The commit operation that takes place at the specified interval ensures a consistent checkpoint from which to restart the index update, if it is interrupted. However, the commit also temporarily suspends the update process. Increasing the UPDATEAUTOCOMMIT value (or setting it to 0) can substantially improve the update performance, especially the initial update. The value you specify must balance the need for performance with the need for recoverability, based on the frequency of the index updates.

ROW_COLUMN_ACCESS
Specifies whether the text search index is considered secure for row access control and column access control.
NOT_SECURED
Specifies that the text search index is considered not secure for row access control and column access control. This is the default.

The based on table for the index must not have an active permission or Mask.

SECURED
Specifies that the index is considered secure for row access control and column access control.

A text search index must be defined as secured to be built over a table with an active permission or column Mask. If a function is referenced to access or modify data in a masked column, the function must be defined as secured. The authorization ID must have DB_SECADMIN authority to create a text search index with the SECURED attribute.

When a text search index is built over sensitive information there are additional considerations:

The database administrator needs to be aware that the data specified as key column(s) for the text search index will be stored in a staging table in QSYS2 and sent to the text search server using network protocols.

The database administrator needs to be aware that data indexed by a text search index is sent to the text search server using network protocols and stored outside of DB2 on the text search server.

Default values for the options parameter

When you install OmniFind Text Search for DB2 for i, the QSYS2.SYSTEXTDEFAULTS table is created and populated with default values for the options parameter of the SYSPROC.SYSTS_CREATE stored procedure.

The following table lists the options, default values, and descriptions of the options.

Table 1. Default values for the options parameter
Option Default value Description
CCSID 1208 Specifies the coded character set identifier that is used when binary text documents are indexed.
CJKSEGMENTATION NGRAM Specifies the segmentation method to use when you index documents for CJK (Chinese, Japanese, Korean) languages.
LANGUAGE en_US Specifies the language used to process text documents.
FORMAT TEXT Identifies the format of text documents in the column. The default format is plain text unless the data type is XML.
UPDATEFREQUENCY NONE Indicates that no automatic updates are scheduled.
UPDATEMINIMUM 1 If at least one document changed since the last index update, the SYSPROC.SYSTS_UPDATE stored procedure starts processing.
IGNOREEMPTYDOCS 1 Specifies that empty documents (documents with an empty string or a null value) are not represented in the text search index. The metadata fields for these documents are not available for search.
UPDATEAUTOCOMMIT 100 Specifies how often a commit operation is performed when documents are fetched during an index update.
MINIMUMUPDATEINTERVAL 5 Specifies the intervals for the UPDATEFREQUENCY option. Intervals cannot be shorter than 5 minutes.
ROW_COLUMN_ACCESS NOT_SECURED Specifies whether the index is created as secured or not.
USEREXITTHREADS 0 Reserved