Most people think terminology is just about words and definitions. After all, that's what a dictionary contains. But today, more detailed information needs to be recorded about terms to support the development of global products. This information is so structured and granular that conventional text-management tools are inadequate and we need to turn to data-management tools.
XML is revolutionizing the way we manage information resources so that we can reuse and 'repurpose' content. But as single units of text referring to singular concepts, terms can be repurposed at an even higher level than longer chunks of text like sentences, paragraphs, or topics. Terms have many properties including a part of speech, a gender, a canonical form, an inflection pattern, a subject field, product identifiers, variants (acronyms and abbreviations), a standardization label, a confidence index, a usage label, a definition, a context, a concept identifier, a usage note, synonyms, related terms, and, last but not least, a translation. The International Organization for Standardization has identified more than 200 different recordable properties of terms (ISO 12620). Of course, you don't need to record them all, but if you do record certain types of metadata, new ways to use terminology emerge.
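To make the idea of structured, granular term data concrete, here is a minimal sketch of a concept-oriented record in Python. It covers only a handful of the properties listed above. The class names (`Term`, `ConceptEntry`), the field names, and the sample values are illustrative assumptions, not part of ISO 12620 or of any particular terminology tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """One term in one language (hypothetical structure for illustration)."""
    text: str
    language: str                        # language code such as "en" or "fr"
    part_of_speech: Optional[str] = None
    gender: Optional[str] = None
    usage_label: Optional[str] = None    # e.g. "preferred" or "deprecated"
    variants: List[str] = field(default_factory=list)  # acronyms, abbreviations


@dataclass
class ConceptEntry:
    """One concept with all the terms that refer to it, in any language."""
    concept_id: str
    subject_field: str
    definition: str
    terms: List[Term] = field(default_factory=list)

    def terms_in(self, language: str) -> List[str]:
        """Return the text of every term recorded for the given language."""
        return [t.text for t in self.terms if t.language == language]


# Sample data invented for this sketch.
entry = ConceptEntry(
    concept_id="C0001",
    subject_field="computing",
    definition="A program that translates source code into machine code.",
    terms=[
        Term("compiler", "en", part_of_speech="noun"),
        Term("compilateur", "fr", part_of_speech="noun", gender="masculine"),
    ],
)

french_terms = entry.terms_in("fr")
```

Because each property is a separate field rather than free text, a tool can select, filter, or export exactly the pieces it needs, which is what makes this kind of reuse possible.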
The conventional uses of terminology include product glossaries, Web sites for looking up terms, and bilingual or multilingual dictionaries for translators. Even these three uses require different data and structures. The part-of-speech value, for example, is not usually needed in product glossaries but is typically mandatory in translators' dictionaries. Queryable Web sites offer the flexibility to deliver different layouts and quantities of information. Subject field values are also becoming increasingly important for meeting reuse objectives.
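As a rough illustration of how a single record can feed different outputs, the sketch below renders the same data once as a glossary line and once as a translator's dictionary line. The record layout, the function names, and the sample values are all assumptions made for this example.

```python
# One term record, invented for this sketch.
record = {
    "term": "firewall",
    "part_of_speech": "noun",
    "subject_field": "network security",
    "definition": "A system that filters network traffic according to a set of rules.",
    "translations": {"fr": "pare-feu", "de": "Firewall"},
}


def glossary_line(rec):
    """Product glossary view: the term and its definition only."""
    return f"{rec['term']}: {rec['definition']}"


def translator_line(rec, target_lang):
    """Translator's dictionary view: part of speech, subject field, translation."""
    translation = rec["translations"][target_lang]
    return (
        f"{rec['term']} ({rec['part_of_speech']}) "
        f"[{rec['subject_field']}] -> {translation}"
    )


glossary_view = glossary_line(record)
translator_view = translator_line(record, "fr")
```

The glossary view ignores the part of speech entirely, while the translator's view depends on it, which mirrors the difference in requirements described above.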
Some search engines can use thesaurus-like lists of terms from a concept-oriented terminology database to increase successful matches of search queries. Text authoring tools use terms to enhance their spell-checking functions. Terminology resources have also proven useful in text mining, which identifies trends and patterns in information needs. The quality of machine translations depends heavily on the quality of the dictionaries used. Finally, term extraction tools are more effective if they can use lists of "known" terms as exclusion dictionaries, thereby delivering only the "unknown" terms to the end user.
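The exclusion-dictionary idea can be sketched in a few lines of Python. This is a simplified illustration, assuming that candidate terms arrive as plain strings and that a case-insensitive match is enough. The function name `unknown_terms` and the sample lists are hypothetical, and real extraction tools would likely also normalize inflected forms.

```python
def unknown_terms(candidates, known_terms):
    """Return candidates not found in the exclusion list, ignoring case.

    Duplicates are dropped and the original order is preserved.
    """
    known = {term.casefold() for term in known_terms}
    seen = set()
    result = []
    for candidate in candidates:
        key = candidate.casefold()
        if key not in known and key not in seen:
            seen.add(key)
            result.append(candidate)
    return result


# Sample data invented for this sketch.
extracted = ["Hard disk", "solid-state cache", "hard disk", "Solid-State Cache"]
exclusion_list = ["hard disk"]

new_terms = unknown_terms(extracted, exclusion_list)
```

Here only the candidates that are absent from the exclusion list reach the end user, so reviewers spend their time on terms that are new to the database.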