Mapping XML elements to the common analysis structure

If the content to be analyzed includes XML documents with meaningful markup, and you want to use this markup to annotate text, you can map the XML elements to the common analysis structure.

About this task

To enable custom text analysis processes to access specific XML elements, or to map several XML elements to a common type for use in semantic search, you can create custom mapping files. The mapping files must adhere to the Unstructured Information Management Architecture (UIMA) framework for text analysis.

When you add mapping files to a collection that uses a custom text analysis engine, you enable XML elements in source documents to be mapped to annotations in the common analysis structure. These annotations can then be used by your custom text analysis engine.

For example, you can map the content of <addressee> and <customer> elements to Person annotations in the common analysis structure. These annotations can then be accessed by your custom annotators, which might detect additional information (for example, they might detect the gender of the Person).
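The idea behind this example can be sketched in a few lines of Python. This is only a conceptual illustration of what an element-to-annotation mapping does, not the product's mapping file format or the UIMA API. The names ELEMENT_TO_TYPE, Annotation, and map_elements are hypothetical and exist only for this sketch, which assumes that an annotation is a type name plus begin and end character offsets into the document text.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical mapping for illustration: XML element name -> annotation type.
ELEMENT_TO_TYPE = {"addressee": "Person", "customer": "Person"}


@dataclass
class Annotation:
    type_name: str  # for example "Person"
    begin: int      # offset of the first covered character
    end: int        # offset just past the last covered character


def map_elements(xml_source, mapping):
    """Return (document_text, annotations) for an XML string.

    The document text is the XML content with the markup removed. Each
    element whose name appears in `mapping` yields one annotation that
    spans the text inside that element, including nested elements.
    """
    parts = []
    annotations = []
    offset = 0

    def emit(text):
        # Append a text fragment and advance the running offset.
        nonlocal offset
        if text:
            parts.append(text)
            offset += len(text)

    def walk(element):
        begin = offset
        emit(element.text)
        for child in element:
            walk(child)
            emit(child.tail)  # text between this child and the next node
        if element.tag in mapping:
            annotations.append(Annotation(mapping[element.tag], begin, offset))

    walk(ET.fromstring(xml_source))
    return "".join(parts), annotations


if __name__ == "__main__":
    sample = (
        "<letter><addressee>Ann Lee</addressee> wrote to "
        "<customer>Bob Ray</customer>.</letter>"
    )
    text, found = map_elements(sample, ELEMENT_TO_TYPE)
    for ann in found:
        print(ann.type_name, ann.begin, ann.end, text[ann.begin:ann.end])
```

In this sketch, both element names resolve to the same Person type, so a downstream annotator could treat addressees and customers uniformly without knowing which XML element each annotation came from.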

Procedure

To map XML elements to the common analysis structure:

  1. Expand the collection that you want to configure and click Actions > Text processing options.
  2. In the Map XML elements to the common analysis structure area of the Text Processing Options page, click Add Mapping.
  3. On the Map XML Elements to the Common Analysis Structure page, type a descriptive display name for the mapping file.
  4. Specify the location of the file. If the mapping file is on your local system, you can browse to locate the file. If the mapping file is on the server where Annotation Administration Console is installed, you must type the fully qualified path.
  5. Click OK. The mapping file that you added is shown on the Text Processing Options page.