Creating a lexical dictionary for enterprise search collections

After you create an XML file that specifies your custom vocabulary, you must convert the XML file to a lexical dictionary.

About this task

To create a lexical dictionary, use the eslexicaldictbuilder command-line tool, which Watson Content Analytics provides in the ES_INSTALL_ROOT/bin directory.

The input to the tool is the XML file that lists your custom terminology, and the output is a lexical dictionary. The dictionary file name must have the .dic suffix, for example, c:\mydictionaries\mylexical.dic.

The default location for both files is the directory from which the script is invoked. If a dictionary with the same name already exists, the script produces an error.

The maximum size of a .dic file is 8 MB.
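
The constraints above (the .dic suffix, no existing file with the same name, and the 8 MB limit) can be checked from a shell before and after you run the tool. The following is a minimal sketch, not part of Watson Content Analytics: the function names check_dic_target and check_dic_size are hypothetical, the suffix check accepts both .dic and .DIC as an assumption, and 8 MB is assumed to mean 8 * 1024 * 1024 bytes.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; they are not part of Watson Content Analytics.

# Assumption: "8 MB" is read as 8 * 1024 * 1024 bytes.
MAX_DIC_BYTES=$((8 * 1024 * 1024))

# Succeed only if the output path ends in .dic and no file exists there yet.
# Assumption: both .dic and .DIC are treated as valid suffixes.
check_dic_target() {
    case "$1" in
        *.dic|*.DIC) ;;
        *) echo "error: $1 does not end in .dic" >&2; return 1 ;;
    esac
    if [ -e "$1" ]; then
        echo "error: $1 already exists" >&2
        return 1
    fi
}

# Succeed only if the dictionary file exists and is within the size limit.
check_dic_size() {
    if [ ! -f "$1" ]; then
        echo "error: $1 not found" >&2
        return 1
    fi
    size=$(wc -c < "$1" | tr -d '[:space:]')
    if [ "$size" -gt "$MAX_DIC_BYTES" ]; then
        echo "error: $1 is $size bytes; limit is $MAX_DIC_BYTES" >&2
        return 1
    fi
}
```

For example, you might call check_dic_target with the intended output path before you build the dictionary, and check_dic_size with the generated file afterward.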

Procedure

To create a lexical dictionary that helps improve the precision of search results:

  1. On the master server, log in as the default administrator. This user ID was specified when Watson Content Analytics was installed.
  2. Enter the following command for your operating system, where XML_file is the fully qualified path to the XML file that contains your custom dictionary elements and DIC_file is the fully qualified path where the lexical dictionary is to be created.
    AIX® or Linux
    eslexicaldictbuilder.sh XML_file DIC_file

    Windows
    eslexicaldictbuilder.bat XML_file DIC_file
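
On AIX or Linux, the command in step 2 can be wrapped in a small shell function that fails early when the input is missing. This is a sketch under stated assumptions: build_lexical_dict and the BUILDER override are hypothetical names, ES_INSTALL_ROOT is assumed to be set as an environment variable for the administrator, and the tool is assumed to signal failure with a nonzero exit status.

```shell
#!/bin/sh
# Hypothetical wrapper around eslexicaldictbuilder.sh; not part of the product.

# Usage: build_lexical_dict XML_file DIC_file
build_lexical_dict() {
    xml_file=$1
    dic_file=$2

    # Assumption: ES_INSTALL_ROOT is set in the environment.
    # BUILDER is a hypothetical override, useful for dry runs with a stub.
    builder=${BUILDER:-$ES_INSTALL_ROOT/bin/eslexicaldictbuilder.sh}

    if [ ! -f "$xml_file" ]; then
        echo "error: XML file not found: $xml_file" >&2
        return 1
    fi
    if [ ! -x "$builder" ]; then
        echo "error: builder not executable: $builder" >&2
        return 1
    fi

    # Assumption: the tool exits with a nonzero status when it fails.
    "$builder" "$xml_file" "$dic_file" || return 1
    echo "created $dic_file"
}
```

A call such as build_lexical_dict /home/esadmin/mylexical.xml /home/esadmin/mylexical.dic (illustrative paths) then runs the same command that step 2 describes.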

What to do next

After you create a lexical dictionary, use the administration console to add the dictionary to the system and associate it with one or more enterprise search collections. You cannot associate a lexical dictionary with a content analytics collection.

Only the generated .dic file is uploaded to the system. Keep the source XML file in an access-controlled environment with an appropriate backup strategy in place, because you need this XML file to update your lexical dictionary.