Custom text processing

You can apply custom text processing algorithms to text by adding a custom annotator to Annotation Administration Console.

Annotation Administration Console supports the Unstructured Information Management Architecture (UIMA), a framework for creating, discovering, composing, and deploying text analysis functions. Application developers create and test analysis algorithms for the text to be analyzed, and then package them in a processing engine archive (.pear file) that includes all of the resources required to run the analysis. To analyze text with your custom text analysis algorithms, you must add the .pear file to the system.
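Before you add an archive to the system, it can help to confirm that the file is a readable archive with the expected contents. The following Java sketch is one possible pre-upload check, not part of the product. It assumes the standard UIMA PEAR layout, in which a .pear file is a ZIP archive that carries its installation descriptor at metadata/install.xml. The class name PearCheck and the method looksLikePear are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration only; it is not a product or UIMA API.
final class PearCheck {
    // Location of the installation descriptor in the standard UIMA PEAR
    // layout. This path is an assumption about the archive being checked.
    static final String INSTALL_DESCRIPTOR = "metadata/install.xml";

    // Returns true if the archive opens as a ZIP file and contains the
    // installation descriptor. Throws an IOException (for example, a
    // ZipException) if the file cannot be read as a ZIP archive.
    static boolean looksLikePear(Path archive) throws IOException {
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            return zip.getEntry(INSTALL_DESCRIPTOR) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PearCheck.java <archive.pear>");
            return;
        }
        boolean ok = looksLikePear(Path.of(args[0]));
        System.out.println(ok ? "Installation descriptor found"
                              : "Installation descriptor not found");
    }
}
```

This check confirms only that the descriptor is present. It does not validate the descriptor or the resources that the archive references.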

The analysis logic component in a text analysis engine is called an annotator. Each annotator performs specific linguistic analysis tasks. A text analysis engine can contain any number of annotators, or it can be a composite of several text analysis engines, each of which contains its own custom annotators. The text analysis engine is included in the .pear file.

The information produced by the annotators is referred to as the analysis results. Analysis results are written to a data structure called a common analysis structure (CAS).
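The relationship between annotators, composite engines, and the common analysis structure can be illustrated with a small, self-contained Java model. This sketch deliberately does not use the UIMA API. Every type in it (AnalysisStructure, Annotator, RegexAnnotator, AggregateEngine, AnnotatorSketch) is a hypothetical stand-in that shows only the flow of data: each annotator reads the document text and writes its results to one shared structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Entry point of the sketch. All types here are hypothetical, not UIMA classes.
final class AnnotatorSketch {
    // Runs a composite engine of two simple annotators over the text.
    static AnalysisStructure analyze(String text) {
        Annotator engine = new AggregateEngine(List.of(
                new RegexAnnotator("Number", "\\d+"),
                new RegexAnnotator("CapitalizedWord", "\\b[A-Z][a-z]+\\b")));
        AnalysisStructure cas = new AnalysisStructure(text);
        engine.process(cas);
        return cas;
    }

    public static void main(String[] args) {
        AnalysisStructure cas = analyze("Order 42 shipped to Berlin");
        for (AnalysisStructure.Annotation a : cas.annotations()) {
            System.out.println(a.type() + ": " + cas.coveredText(a));
        }
    }
}

// Stand-in for a common analysis structure: the document text plus the
// analysis results that annotators have added so far.
final class AnalysisStructure {
    // One analysis result: a typed span of the document text.
    record Annotation(String type, int begin, int end) {}

    private final String text;
    private final List<Annotation> annotations = new ArrayList<>();

    AnalysisStructure(String text) { this.text = text; }

    String text() { return text; }
    void add(Annotation a) { annotations.add(a); }
    List<Annotation> annotations() { return List.copyOf(annotations); }
    String coveredText(Annotation a) { return text.substring(a.begin(), a.end()); }
}

// Stand-in for the analysis logic component of an engine.
interface Annotator {
    void process(AnalysisStructure cas);
}

// An annotator that marks every match of a regular expression.
final class RegexAnnotator implements Annotator {
    private final String type;
    private final Pattern pattern;

    RegexAnnotator(String type, String regex) {
        this.type = type;
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public void process(AnalysisStructure cas) {
        Matcher m = pattern.matcher(cas.text());
        while (m.find()) {
            cas.add(new AnalysisStructure.Annotation(type, m.start(), m.end()));
        }
    }
}

// A composite engine: it runs its delegates in order on the same structure.
final class AggregateEngine implements Annotator {
    private final List<Annotator> delegates;

    AggregateEngine(List<Annotator> delegates) { this.delegates = List.copyOf(delegates); }

    @Override
    public void process(AnalysisStructure cas) {
        for (Annotator delegate : delegates) {
            delegate.process(cas);
        }
    }
}
```

Because the composite engine implements the same interface as a single annotator, engines can be nested to any depth, which mirrors the composite arrangement described above. A real annotator would be written against the UIMA framework and packaged in a .pear file.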

When you configure text processing options for a collection, you do the following tasks: