After you create or update a list of user-defined stop words in an XML file, you must convert the XML file to a stop word dictionary.
To create a stop word dictionary, use the command line tool called esstopworddictbuilder, which is provided with Watson Content Analytics. The tool is in the ES_INSTALL_ROOT/bin directory.
The input to the tool is the XML file that lists the stop words, and the output from the tool is a case-sensitive stop word dictionary. The dictionary file must have the suffix DIC, for example, c:\mydictionaries\productstopwords.dic.
The default location for both files is the directory from which the script is invoked. If a dictionary with the same name already exists, the script produces an error.
The maximum size of a DIC file is 8 MB.
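The constraints above (DIC suffix, no existing dictionary with the same name, 8 MB maximum) can be checked before and after you run the tool. The following is a minimal sketch, not part of Watson Content Analytics: the function names are hypothetical, the suffix comparison is assumed to be case-insensitive, and 8 MB is assumed to mean 8 * 1024 * 1024 bytes. The sketch does not invoke esstopworddictbuilder itself.

```python
from pathlib import Path

# Assumption: "8 MB" is interpreted as 8 * 1024 * 1024 bytes.
MAX_DIC_BYTES = 8 * 1024 * 1024


def check_output_path(dic_path):
    """Return a list of problems with the intended dictionary path (empty if none)."""
    path = Path(dic_path)
    problems = []
    # Assumption: the DIC suffix is compared without regard to case.
    if path.suffix.lower() != ".dic":
        problems.append("output file must have the .dic suffix")
    # The tool reports an error if a dictionary with the same name exists.
    if path.exists():
        problems.append("a dictionary with this name already exists")
    return problems


def check_dic_size(dic_path):
    """Return True if the generated dictionary is within the size limit."""
    return Path(dic_path).stat().st_size <= MAX_DIC_BYTES
```

Call check_output_path with the intended output path before you build the dictionary, and call check_dic_size on the generated file before you upload it.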
To create a stop word dictionary:
Only the generated DIC file is uploaded to the system. Keep the source XML file in an access-controlled environment and back it up regularly. You need this XML file to update your stop word dictionary.