Custom boost word dictionaries

You can define specific terms or multi-word terms that raise or lower the rank value of the document in which the term appears.

Each term in the boost dictionary is associated with a boost factor that can range from -10 to +10. The terms that you particularly want to see in your result documents are allocated a higher boost factor, while those that you do not want to have appear at all, or in combination with higher boosted terms, are given a lower value. The values -1, 0, and 1 have no boost effect.
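
The range and the neutral values described above can be sketched as a small check. This is only an illustration, not product code: the function name classify_boost_factor is hypothetical, and the sketch assumes that boost factors are whole numbers.

```python
# Hypothetical helper; assumes boost factors are integers in [-10, 10].
NEUTRAL_FACTORS = {-1, 0, 1}  # stated above as having no boost effect


def classify_boost_factor(factor: int) -> str:
    """Return 'raise', 'lower', or 'neutral' for a boost factor."""
    if not -10 <= factor <= 10:
        raise ValueError(f"boost factor {factor} is outside the range -10 to +10")
    if factor in NEUTRAL_FACTORS:
        return "neutral"
    return "raise" if factor > 0 else "lower"
```

A check of this kind could be run over a term list before the dictionary is built, so that out-of-range or ineffective factors are caught early.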

If a query term that is listed in the boost dictionary appears in a retrieved document, the rank value of that document is raised or lowered according to the boost factor of the term. The resulting boost is relative because other ranking factors also affect it. Thus, if the term X is assigned the boost factor B1 and the term Y is assigned B2, and B1 > B2, then boost(X) >= boost(Y).

Boost words are often multi-word terms, such as product names like WebSphere® Application Server. Multi-word terms in the boost word dictionary are identified in user queries without having to be enclosed in quotation marks.
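
One way to picture how an unquoted multi-word term might be recognized in a query is a longest-match scan over the query tokens. The following is a minimal sketch under stated assumptions, not the matching logic of the product: the function find_boost_terms is hypothetical, and whitespace tokenization and case-insensitive comparison are assumptions made for the illustration.

```python
from __future__ import annotations


def find_boost_terms(query: str, dictionary: dict[str, int]) -> list[tuple[str, int]]:
    """Return (term, boost factor) pairs found in the query, longest match first."""
    tokens = query.lower().split()
    # Normalize dictionary keys the same way as the query (assumed behavior).
    normalized = {" ".join(k.lower().split()): v for k, v in dictionary.items()}
    max_len = max((len(k.split()) for k in normalized), default=0)

    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate that starts at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in normalized:
                matches.append((candidate, normalized[candidate]))
                i += n  # skip past the matched term
                break
        else:
            i += 1  # no term starts at this token
    return matches
```

With a dictionary that contains both "WebSphere Application Server" and "server", the query install websphere application server on linux yields only the three-word term in this sketch, because the longer match consumes the token "server".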

Boost word dictionaries are language independent.

Compound terms in Germanic languages are also correctly identified in queries. A compound term is the combination of two or more words that is used as a single word. Lexicalized compounds like Reisebüro (travel agency) are not considered to be compounds.

Compound terms in a query are broken up into the individual terms that make up the compound. If boost values exist for the individual terms of a compound, the retrieved documents are ranked accordingly, although the value assigned is lower than it would be if the term appeared on its own in the document (and not as part of a compound). This broadens the search scope, which is useful when only a few documents contain the full compound.

For example, the query term Versicherungspolice (insurance policy) returns documents that contain the compound terms Lebensversicherungspolice (life insurance policy) and Haftpflichtversicherungspolice (third-party insurance policy). If the word Police (policy) exists in the boost word dictionary, the document containing the compound query term Versicherungspolice is assigned a boost value.
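
The compound behavior described above can be sketched as follows. Everything in this example is an assumption made for illustration: the function effective_boost, the COMPONENT_DAMPING value of 0.5, and the hand-written component table are hypothetical, and the product's actual decomposition and weighting are not specified here. The sketch only shows the stated principle that a boost inherited from a component of a compound is weaker than a boost for a term that matches on its own.

```python
from __future__ import annotations

# Hypothetical damping for boosts inherited from a compound's components.
COMPONENT_DAMPING = 0.5


def effective_boost(
    query_term: str,
    dictionary: dict[str, int],
    components: dict[str, list[str]],
) -> float:
    """Return the boost for a query term; 0.0 means no boost effect."""
    if query_term in dictionary:
        # The term itself is listed: use its full boost factor.
        return float(dictionary[query_term])
    # Otherwise look at the parts of the compound (hand-supplied here).
    part_boosts = [dictionary[p] for p in components.get(query_term, []) if p in dictionary]
    if not part_boosts:
        return 0.0
    # Use the strongest component boost, reduced in magnitude.
    return COMPONENT_DAMPING * max(part_boosts, key=abs)
```

With Police listed at a factor of 6 and Versicherungspolice split into Versicherung and Police, the sketch assigns the compound a boost of 3.0, which is lower than the 6.0 that Police receives on its own.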

You must list the terms and their boost values in an XML file, and then convert that file to a boost word dictionary so that it can be added to the system and associated with a collection.

You can select which boost word dictionary to use in the administration console. One boost word dictionary can be selected for each collection. A boost word dictionary can be shared by several enterprise search and content analytics collections.