UIMA Software Development Kit

The UIMA Software Development Kit includes APIs and tools with which you can create annotators (analysis algorithms including the type system description) and embed these annotators in analysis engines.
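To make the idea of an annotator concrete, the following plain-Java sketch labels spans of document text with a type name and begin/end character offsets. It is an illustration only and does not use the UIMA API: the classes AnnotatorSketch, Span, and RegexAnnotator, the RoomNumber type name, and the sample text are all hypothetical. In a real UIMA annotator, the types would be declared in a type system description and the results stored by the framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Plain-Java illustration only: these classes are hypothetical
// and are NOT part of the UIMA API.
public class AnnotatorSketch {

    // An annotation: a type name plus begin/end character offsets
    // into the document text (end is exclusive).
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        // Returns the piece of the document that this annotation covers.
        String coveredText(String documentText) {
            return documentText.substring(begin, end);
        }
    }

    // Hypothetical annotator that labels every regex match with one type.
    static final class RegexAnnotator {
        private final String type;
        private final Pattern pattern;

        RegexAnnotator(String type, String regex) {
            this.type = type;
            this.pattern = Pattern.compile(regex);
        }

        // Scans the document text and returns one Span per match.
        List<Span> process(String documentText) {
            List<Span> spans = new ArrayList<>();
            Matcher matcher = pattern.matcher(documentText);
            while (matcher.find()) {
                spans.add(new Span(type, matcher.start(), matcher.end()));
            }
            return spans;
        }
    }

    public static void main(String[] args) {
        String text = "Meet in room 12-345 or room 67-890.";
        RegexAnnotator rooms = new RegexAnnotator("RoomNumber", "\\b\\d{2}-\\d{3}\\b");
        for (Span span : rooms.process(text)) {
            System.out.println(span.type + " [" + span.begin + "," + span.end + ") "
                    + span.coveredText(text));
        }
    }
}
```

The sketch keeps the annotation separate from the text and stores only offsets, so several annotators can label the same document without modifying it.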

The UIMA documentation includes a tutorial-style guide that helps you build these components. The Software Development Kit also includes utilities for testing your annotators and viewing their results, and a small-scale semantic search engine for indexing your analysis results. You can then run more advanced semantic searches against the information that is stored in the index.

The UIMA Software Development Kit does not provide any pre-configured annotators. However, any custom annotators that you develop by using UIMA and then integrate into Watson Explorer Content Analytics build upon the results of the provided base annotators, so you can use the base annotator package in your UIMA environment. Refer to the UIMA documentation for information about how to include language detection and tokenization functionality before you run the custom text analysis algorithms in your UIMA environment.

After you develop and test your analysis engines by using the UIMA Software Development Kit, you must create a PEAR (Processing Engine Archive) file to run your algorithms on a collection. This archive file includes all of the resources that are required to deploy your custom analysis functionality as text analysis engines. The UIMA documentation provided in the Software Development Kit describes how to create the archive.
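As a quick sanity check before you deploy an archive, the following sketch assumes that a PEAR file is a ZIP-format archive that carries its installation descriptor at metadata/install.xml, as the UIMA PEAR documentation describes. The PearCheck class and its method name are hypothetical and use only the standard java.util.zip classes, not a UIMA packaging API. The check confirms only that the descriptor entry exists, not that the archive is complete or valid.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

// Hypothetical helper; uses only the JDK, not the UIMA API.
public class PearCheck {

    // Entry name of the installation descriptor inside a PEAR archive
    // (assumption taken from the UIMA PEAR packaging documentation).
    static final String INSTALL_DESCRIPTOR = "metadata/install.xml";

    // Returns true if the archive contains the installation descriptor entry.
    static boolean hasInstallDescriptor(Path archive) throws IOException {
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            return zip.getEntry(INSTALL_DESCRIPTOR) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PearCheck <archive.pear>");
            return;
        }
        Path archive = Paths.get(args[0]);
        System.out.println(hasInstallDescriptor(archive)
                ? "Found " + INSTALL_DESCRIPTOR
                : "Missing " + INSTALL_DESCRIPTOR);
    }
}
```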

As an alternative to manually developing annotators with the UIMA Software Development Kit, you can use Content Analytics Studio to develop and deploy custom text analytics for Watson Explorer Content Analytics applications. Content Analytics Studio is a complete development environment for building, customizing, and testing dictionaries, rules, and UIMA annotators. This environment eliminates the need for specialist knowledge of the underlying natural language processing and UIMA technologies, and it enables you to develop text analysis engines without writing any code. Content Analytics Studio is a separately installable component of Watson Explorer Content Analytics.