Globalize your On Demand Business - LanguageWare Technology

LanguageWare is the new generation IBM linguistic platform. It was designed from the ground up to address the demands posed by today's global applications.

LanguageWare technology

The efficiency and diversity of processing stem from LanguageWare's cross-linguistic architecture. By identifying fundamental linguistic phenomena that span several languages and applying appropriate formal models (such as state machines, formal rule systems, logic and statistical tools), LanguageWare removes the overhead of traditional layered language engineering approaches.

Lexical data is represented in state-of-the-art finite state devices that exploit graph metrics and statistical properties of morphological state transition networks from the point of view of random networks theory. The result is a highly efficient, compact and uniform multi-lingual representation of lexical resources. Because of this cross-linguistic architecture, IBM LanguageWare facilitates the creation of multi-lingual text analytics by transparently handling the challenges associated with many different languages. For example:

  • In Arabic, the semantic component of words may be masked by prefixes and suffixes.

  • For highly inflected languages, like Russian, lemmatization is particularly important to achieve high recall in information retrieval.

  • For non-white spaced languages, such as Chinese or Japanese, text segmentation into words is crucial.

  • For compounding languages, such as German, segmentation of words into their constituents is particularly important for information retrieval.

  • For English, part-of-speech ambiguity and inflectional irregularities need to be resolved.
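
The segmentation cases above (compounding languages and languages written without spaces) can be illustrated with a very small sketch. It is not LanguageWare's implementation: it assumes a plain trie as the simplest possible finite-state lexicon, a hypothetical `segment` function, and a toy word list. It also ignores German linking elements such as the "s" in "Arbeitszeit".

```python
class Trie:
    """A minimal trie, used here as the simplest kind of finite-state lexicon."""

    END = ""  # end-of-word marker; can never collide with a single character

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.add(word)

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node[self.END] = True  # mark this state as final

    def prefixes(self, text, start):
        """Yield every end index e such that text[start:e] is a lexicon word."""
        node = self.root
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                return
            if self.END in node:
                yield i + 1


def segment(text, lexicon):
    """Split text into lexicon words using the fewest parts; None if impossible."""
    text = text.lower()
    n = len(text)
    best = [None] * (n + 1)  # best[i] = best segmentation of text[i:]
    best[n] = []
    for start in range(n - 1, -1, -1):
        for end in lexicon.prefixes(text, start):
            rest = best[end]
            if rest is None:
                continue
            candidate = [text[start:end]] + rest
            if best[start] is None or len(candidate) < len(best[start]):
                best[start] = candidate
    return best[0]


# Toy lexicon (illustrative only).
lexicon = Trie(["donau", "dampf", "schiff", "fahrt", "schifffahrt"])
parts = segment("Donaudampfschifffahrt", lexicon)
```

Preferring the fewest parts keeps "schifffahrt" whole when the lexicon lists it, instead of splitting it further into "schiff" and "fahrt". The same function works unchanged on unspaced text in other scripts, given a suitable word list.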

In recognition of the inherent complexity of natural language, and the ultimate need for adaptation and customization, LanguageWare uses a data-driven approach as a key design characteristic. The behaviour of the system can therefore be significantly modified by customizing the LanguageWare resources, which allows the customer to inject domain-specific knowledge into the LanguageWare analysis process. The information that may be stored in these resources and leveraged during analysis is not limited to traditional lexical data. The resources are open and easily extensible, and can include morphology, morphotactics, synonyms, ontologies, taxonomies, relationships, constraints and parsing rules, among others.
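
As a rough illustration of what "data-driven" means here, the sketch below keeps the analysis code fixed and changes its output only by adding a domain resource. The resource format (plain dictionaries), the `annotate` function and the sample entries are all hypothetical and are not LanguageWare's actual resource model.

```python
def annotate(tokens, resources):
    """Look each token up in the resources; later resources override earlier ones."""
    annotations = []
    for token in tokens:
        key = token.lower()
        for resource in reversed(resources):  # domain data takes precedence
            if key in resource:
                annotations.append({"text": token, **resource[key]})
                break
        else:
            # No resource knows this token.
            annotations.append({"text": token, "lemma": key, "type": "unknown"})
    return annotations


# A general-purpose resource and a domain-specific one (both illustrative).
general = {
    "ran": {"lemma": "run", "type": "verb"},
    "aspirin": {"lemma": "aspirin", "type": "noun"},
}
medical = {
    "aspirin": {"lemma": "acetylsalicylic acid", "type": "drug"},
    "asa": {"lemma": "acetylsalicylic acid", "type": "drug"},
}

tokens = ["Aspirin", "ASA", "ran"]
baseline = annotate(tokens, [general])
customized = annotate(tokens, [general, medical])
```

With only the general resource, "ASA" is unknown and "Aspirin" is an ordinary noun; adding the medical resource maps both to the same drug entry without any change to `annotate`.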

To expose this customizability to users, LanguageWare also provides a comprehensive development environment, the LanguageWare Resource Workbench. This Eclipse-based application simplifies domain customization by providing tooling that enables LanguageWare resources to be compiled from any domain data. The benefit of the Workbench is that it vastly simplifies the process of creating, updating and managing language resources and building them into your analysis process.

The Workbench also allows rules and grammars to be created through a simple drag-and-drop, examples-based interface and applied during the analysis process. It is designed to allow complete customization of the analysis process without requiring the customer to write a single line of code. The Workbench provides an entire development environment in which advanced analytics can be easily developed by domain specialists. It creates analysis engines that are based on Apache UIMA and can be exported to deployable archives, which can be installed and run in any Apache UIMA container.

Today LanguageWare supports a wide range of linguistic functions, including language identification, text segmentation, text normalization, entity extraction, relationship extraction, fact extraction, semantic disambiguation and more general graph mining. All these features are provided as component capabilities that can be easily integrated into any application environment.
