Creating a boost word dictionary

After you create or update a list of boost words in an XML file, you must convert the XML file to a boost word dictionary.

About this task

To create a boost word dictionary, use the esboosttermdictbuilder command-line tool, which is supplied with Watson Content Analytics. The tool is in the ES_INSTALL_ROOT/bin directory.

The input to the tool is the XML file that lists your boost words, and the output from the tool is a case-sensitive boost word dictionary. The dictionary file name must have the .dic extension, for example, c:\mydictionaries\productboostwords.dic.

The default location for both files is the directory from which the script is invoked. If a dictionary with the same name already exists, the script produces an error.

The maximum size of a DIC file is 8 MB.
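The size limit can be checked before you upload a dictionary. The following is a minimal sketch, not part of the product: the function name dic_within_limit is hypothetical, and it assumes that 1 MB means 1,048,576 bytes, which the documentation does not specify.

```shell
#!/bin/sh
# Hypothetical helper, not supplied with Watson Content Analytics.
# Assumption: 8 MB is interpreted as 8 * 1024 * 1024 bytes.
MAX_DIC_BYTES=$((8 * 1024 * 1024))

# Succeeds (exit status 0) when the file is no larger than the limit.
dic_within_limit() {
    # wc -c prints the byte count; tr removes padding that some platforms add.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -le "$MAX_DIC_BYTES" ]
}
```

For example, `dic_within_limit productboostwords.dic || echo "Dictionary exceeds 8 MB" >&2` prints a warning only when the file is too large.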

Procedure

To create a boost word dictionary:

  1. On the master server, log in as the default Watson Content Analytics administrator.
  2. Enter the following command, where XML_file is the fully qualified path to the XML file that contains the list of boost words and DIC_file is the fully qualified path to the boost word dictionary to create. To use a synonym dictionary as well, add the fully qualified path to the synonym dictionary (SYNDIC_file) after the boost word dictionary path. Specifying a synonym dictionary is optional.
    AIX® or Linux
    esboosttermdictbuilder.sh XML_file DIC_file SYNDIC_file
    Windows
    esboosttermdictbuilder.bat XML_file DIC_file SYNDIC_file
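
The AIX or Linux command in step 2 can be wrapped in a small script that stops early when the target dictionary already exists, because the tool reports an error in that case. This is a sketch under stated assumptions, not a supplied utility: the function name build_boost_dict and its return codes are hypothetical, and it assumes that ES_INSTALL_ROOT is set as an environment variable that points to the installation directory.

```shell
#!/bin/sh
# Hypothetical wrapper around esboosttermdictbuilder.sh (AIX or Linux).
# Assumption: ES_INSTALL_ROOT is exported in the administrator's environment.
build_boost_dict() {
    xml_file=$1
    dic_file=$2
    syn_file=${3:-}    # optional synonym dictionary

    if [ ! -f "$xml_file" ]; then
        echo "XML file not found: $xml_file" >&2
        return 2
    fi

    # The tool produces an error if the dictionary exists, so check first.
    if [ -e "$dic_file" ]; then
        echo "Dictionary already exists: $dic_file" >&2
        return 3
    fi

    # Pass the synonym dictionary only when one was named.
    if [ -n "$syn_file" ]; then
        "$ES_INSTALL_ROOT/bin/esboosttermdictbuilder.sh" "$xml_file" "$dic_file" "$syn_file"
    else
        "$ES_INSTALL_ROOT/bin/esboosttermdictbuilder.sh" "$xml_file" "$dic_file"
    fi
}
```

Call the function with fully qualified paths, for example `build_boost_dict /home/esadmin/productboostwords.xml /home/esadmin/productboostwords.dic`, where both paths are placeholders for your own files.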

What to do next

After you create a boost word dictionary, use the administration console to add the dictionary to the system and associate it with one or more enterprise search or content analytics collections.

Only the generated DIC file is uploaded to the system. Ensure that the source XML file is kept in an access-controlled environment, with the appropriate backup strategy in place. You need this XML file to update your boost word dictionary.