Creating a Text Search Collection

This procedure creates an empty search collection. An SQL schema will be created on the system to contain information about the collection. The schema will contain DB2® objects necessary to track and index objects.

SYSPROC.SYSTS_CRTCOL and SYSPROC.SYSTS_CREATE_COLLECTION

Authorization

SYSTS_CRTCOL and SYSTS_CREATE_COLLECTION will be shipped with *EXECUTE authority granted to public.

These procedures will not adopt any additional authority and will run under the invoking profile.

In order to create a text search collection the invoker must have:
  • Authority to create a DB2 schema
  • Authority/ability to create a text search index

The DB2 Objects created as part of the collection, including the administrative stored procedures are created with public authority *EXCLUDE

The user profile creating the collection owns all objects in the collection. A user may grant authority to specific procedure to other users in order to allow another user to administer and search the text search collection.

Syntax

Read syntax diagramSkip visual syntax diagramSYSTS_CRTCOL(collection_name ,options,asp_device_name)

Syntax

Read syntax diagramSkip visual syntax diagramSYSTS_CREATE_COLLECTION(collection_name ,options,asp_device_name)

The schema qualifier is SYSPROC.

Parameters

collection_name
Name of Collection. This name uniquely identifies the collection and must be a non null string. A schema of the name specified for collection name will be created to hold the associated DB2 objects.
Note: Enclose names in double quotation marks if the names conflict with SQL keywords or OmniFind keywords that can be used.

The collection name parameter will follow the SQL rules for schema names.

The collection name must not match the name of an existing user profile.

The data type of this parameter is VARCHAR(128).

options
A character string that specifies the various options that are available for this stored procedure.

The data type for this parameter is VARCHAR(32000).

options
Read syntax diagramSkip visual syntax diagram text-default-information update-characteristics index-configuration-options
text-default-information
Read syntax diagramSkip visual syntax diagramLANGUAGE languageFORMAT format
text-default-information
Specifies the language that is used when processing documents, and the format of text documents in the column.
LANGUAGE language
Specifies the language that OmniFind Text Search Server for DB2 for i uses for the linguistic processing of text documents. The default value is en_US (English). If you specify the value for this parameter as AUTO, OmniFind Text Search Server for DB2 for i tries to determine the language.
Important: If the language of the documents is not English, do not use the default value of en_US. Change the value to the language of the documents; otherwise, linguistic processing does not work as expected.
FORMAT format
Identifies the format of text documents to be indexes, such as TEXT or INSO. The OmniFind Text Search Server for DB2 for i needs to know the format, or content type, of the text documents that you intend to index and search. If you do not specify the format parameter, the default is TEXT.

The format value INSO allows OmniFind Text Search Server for DB2 for i to determine the format. If the OmniFind Text Search Server for DB2 for i cannot determine the document format, then a document error is logged in the job log during processing by the UPDATE stored procedure.

update-characteristics
Read syntax diagramSkip visual syntax diagramUPDATE FREQUENCYNONE< update-frequency >
update-characteristics
Specifies the frequency of automatic updates to the text search collection. The update process for a text search collection involves both indexing the text data, and crawling system objects to detect new or changed data.
UPDATE FREQUENCY update-frequency
Specifies when to make automatic updates to the text search collection. The default value is NONE.
update-frequency (Format 1)
Read syntax diagramSkip visual syntax diagramNONED(*0..6, 0..6)H(*0..23, 0..23)M(*0..59, 0..59)
NONE
If NONE is specified, then no further index updates are made. The update must be started manually.
D
Specifies the day or days of the week when the index is updated. An asterisk (*) specifies all days. 0 specifies Sunday.
H
Specifies the hour or hours when the index is updated. An asterisk (*) specifies all hours.
M
Specifies the minute or minutes when the index is updated. An asterisk (*) cannot be specified. The minimum update frequency is 5 minutes.
Example: This example specifies that the index update is to run every 30 minutes.
UPDATE FREQUENCY D(*) H(*) M(0,30)
update-frequency (Format 2, chronological)
Read syntax diagramSkip visual syntax diagram< minute>< hour>< dayOfMonth>< monthOfYear>< dayOfWeek>

The format of the update-frequency (chronological) option is a list of the five values separated by a blank space. The five values represent the minutes, hours, days of the month, months of the year, and days of the week beginning with Sunday.

If you specify an interval of values or an asterisk (*), you can specify a step value by using a forward slash (/) at the end of the defined interval.

Example: This example specifies that the index update is to run every quarter hour (0,15,30,45) on the even hours between 8 a.m. and 6:45 p.m. (8-18/2 is equivalent to 8,10,12,14,16,18), from Monday to Friday every month of the year (* * 1-5).

0,15,30,45 8-18/2 * * 1-5
minute
Specifies the minutes of the hour when the text search index is to be updated. You can specify an asterisk (*) for an interval of every 5 minutes, or you can specify an integer 0 - 59. You cannot repeat values. The minimum update frequency is 5 minutes. A value of 1,4,8 is not valid.
update-frequency (minute)
Read syntax diagramSkip visual syntax diagram*/ 0...59, 0...59 - 0...59/ 0...590...59
hour
Specifies the hours of the day when the text search index is to be updated. You can specify an asterisk (*) for every hour, or you can specify an integer 0 - 23. You cannot repeat values.
update-frequency (hour)
Read syntax diagramSkip visual syntax diagram*/ 0...23, 0...23 - 0...23/ 0...230...23
dayOfMonth
Specifies the days of the month when the text search index is to be updated. You can specify an asterisk (*) for every day, or you can specify an integer 1 - 31. You cannot repeat values.
update-frequency (dayOfMonth)
Read syntax diagramSkip visual syntax diagram*/ 1...31, 1...31 - 1...31/ 1...311...31
monthOfYear
Specifies the months of the year when the text search index is to be updated. You can specify an asterisk (*) for every month, or you can specify an integer 1 - 12. You cannot repeat values.
update-frequency (monthOfYear)
Read syntax diagramSkip visual syntax diagram*/ 1...12, 1...12 - 1...12/ 1...121...12
dayOfWeek
Specifies the days of the week when the text search index is to be updated. You can specify an asterisk (*) for every day, or you can specify an integer 0 - 7. Both 0 and 7 are valid values for Sunday. You cannot repeat values.
update-frequency (dayOfWeek)
Read syntax diagramSkip visual syntax diagram*/ 0...7, 0...7 - 0...7/ 0...70...7
index-configuration-options
Read syntax diagramSkip visual syntax diagramINDEX CONFIGURATION(, optionvalue)
index-configuration-options
Specifies additional index-specific values as option value pairs. You must enclose string values in single quotation marks. A single quotation mark within a string value must be represented by two consecutive single quotation marks.
CJKSEGMENTATION
Specifies the segmentation method to use when you index documents for CJK (Chinese, Japanese, Korean) languages. The supported values are MORPHOLOGICAL and NGRAM. If the CJKSEGMENTATION value is not specified, the default value is used. The default value is specified by the DEFAULTNAME value in the QSYS2.SYSTEXTDEFAULTS table.
SERVER
Specifies the ID or alias name of the server to be used to store the text search index. If an ID is used, the value is an integer that must exist in the SERVERID column of the QSYS2.SYSTEXTSERVERS catalog. If an alias name is used, the value is a string that must exist in the ALIASNAME column of the QSYS2.SYSTEXTSERVERS catalog. If SERVER is not specified, the default is to select the server with the fewest text search indexes from the servers in the QSYS2.SYSTEXTSERVERS table where parameter SERVERSTATUS is set to 0 (zero), which means that the server is available.
UPDATEAUTOCOMMIT
Specifies how often a commit operation is performed when fetching documents during an index update. A value of 0 (zero) means that a commit operation occurs only at the end of processing.

The value must be an integer between 0 (zero) and 2147483647. The default value is 100.

Performance tip: The value of UPDATEAUTOCOMMIT can have a substantial impact on the performance of index updates. The commit operation that takes place at the specified interval ensures a consistent checkpoint from which to restart the index update, if it is interrupted. However, the commit also temporarily suspends the update process. Increasing the UPDATEAUTOCOMMIT value (or setting it to 0) can substantially improve the update performance, especially the initial update. The value you specify must balance the need for performance with the need for recoverability, based on the frequency of the index updates.

asp_device_name
This parameter, if specified and not null, determines which Auxiliary storage pool the collection is created into. This parameter is optionally available to match the CREATE SCHEMA capability to create the DB2 objects on a specific ASP device.

If a value is supplied for this parameter, the value must name a disk pool in the primary asp group of the current namespace, or a basic ASP unit if the namespace is the system ASP only.

The data type for this parameter is VARCHAR(10).

Examples

  1. CALL SYSTS_CRTCOL(‘mycollection');

    A collection MYCOLLECTION is created.

  2. CALL SYSTS_CRTCOL(‘”mycollection”', “UPDATE FREQUENCY D(*) H(*) M(0)”)

    A collection mycollection (lower case not including the delimiters) is created.

    The text search collection will have an update frequency of every day, at the top of every hour.

  3. CALL SYSTS_CRTCOL(‘ur_collection',‘UPDATE FREQUENCY NONE ‘ || ‘ LANGUAGE zh_CN INDEX CONFIGURATION( ‘ || ‘ CJKSEGMENTATION MORPHOLOGICAL) ‘, ‘23')

    A collection UR_COLLECTION is created.

    The collection has no update frequency.

    The collection's language is simplified Chinese, using dictionary linguistic analysis (morphological).

    The SQL schema is created in basic ASP #23.

    When the text search collection is created, procedures are created in the DB2 schema to administer the collection.

To create a collection from IBM® Navigator for i, follow these steps.

  1. From IBM Navigator for i, expand IBM i Management > System > All Tasks.
  2. On the right panel, select System > Omnifind > Create Collection.