Custom lexical dictionaries for enterprise search collections

You can improve the precision of search results by creating a custom lexical dictionary and associating it with a collection.

Watson Content Analytics includes dictionaries to parse and tokenize content. You can create a custom dictionary to ensure that the parser uses your enterprise-specific vocabulary when preparing content for the index. Your custom lexical dictionary is used to analyze content in addition to the provided dictionaries.

Restriction: You can associate a lexical dictionary only with enterprise search collections. A content analytics collection cannot use a lexical dictionary; instead, you can associate it with a custom user dictionary that defines words and equivalent terms. For text mining, the custom user dictionary is used to analyze content in addition to the provided dictionaries.

To create a custom lexical dictionary, domain or subject matter experts must first define, in an XML file, the enterprise-specific terminology that you want to use for parsing content. They then use the ES_INSTALL_ROOT/bin/eslexicaldictbuilder tool to create a dictionary file (.dic file) from that XML file.
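
The preparation step can be sketched in Python, as an illustration only. The element names lexicalDictionary and term are placeholders invented for this sketch, not the schema that the tool expects, and the tool's command-line arguments are not shown because this topic does not document them. Check the product reference for both before you build a real dictionary. The sketch deduplicates a term list, serializes it as XML, and resolves the tool location under an installation root.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path


def build_term_xml(terms):
    """Return an XML string that lists each non-empty term once, sorted.

    The element names are hypothetical placeholders, not the real schema.
    """
    root = ET.Element("lexicalDictionary")  # placeholder root element
    unique_terms = sorted({t.strip() for t in terms if t.strip()})
    for term in unique_terms:
        entry = ET.SubElement(root, "term")  # placeholder entry element
        entry.text = term
    return ET.tostring(root, encoding="unicode")


def builder_tool_path(install_root):
    """Return the location of eslexicaldictbuilder under install_root."""
    return Path(install_root) / "bin" / "eslexicaldictbuilder"


if __name__ == "__main__":
    # Sample terms are made up for the illustration.
    print(build_term_xml(["flux capacitor", "warp core", "warp core "]))
    install_root = os.environ.get("ES_INSTALL_ROOT")
    if install_root:
        print(builder_tool_path(install_root))
    else:
        print("ES_INSTALL_ROOT is not set")
```

Generating the XML file from a reviewed term list, rather than editing it by hand, makes it easier for subject matter experts to maintain the vocabulary and rebuild the .dic file when terminology changes.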

To use the custom dictionary, an administrator must use the administration console to upload the dictionary to the system and associate the dictionary with one or more enterprise search collections.