Creating Sentiment and Named Entity Annotations

You can add sentiment and named entity annotations to documents indexed by Watson Explorer Engine by using the IBM Content Analytics Annotation converter.

About this task

The Sentiment and Named Entity Annotators in the Annotation Administration Console include a default set of expressions or words that you can use as provided to annotate text when it is analyzed. You can customize those annotators to specify additional expressions and words, such as positive expressions, negative expressions, and words that are to be annotated as persons, locations, or organizations. The following procedure describes how to add and configure the converter for sentiment and named entity annotations.

Procedure

  1. Install and start the Annotation Administration Console as described in Installing Annotation Administration Console.
  2. Create a collection in the Annotation Administration Console as described in Creating a collection, and configure it to create sentiment and named entity annotations as described in Configuring sentiment analysis and Named Entity Recognition annotator.

    You may wish to configure the collection to create only sentiment or named entity annotations. A separate Annotation Administration Console collection must be created for each different combination of annotator options. For example, if you want to add sentiment metadata to some Watson Explorer Engine collections, and named entity information to others, you must create one Annotation Administration Console collection to identify sentiment annotations, and another Annotation Administration Console collection to identify named entity information.

  3. In the Watson Explorer Engine administration tool, add the IBM Content Analytics Annotation converter to the converter list for the collection that you want to annotate, and configure the following options:
    • Annotation analysis URL - URL to the Annotation Administration Console. Both hostname and port are required. The default port is 8393
    • Annotation Collection ID - ID of the annotation collection that you configured in the Annotation Administration Console
    • Annotation Type - Set this option to Sentiment and Named Entities
    • User Name - The user name used to connect to the Annotation Administration Console
    • Password - The password used to connect to the Annotation Administration Console
    • Exclude Contents By Default - When enabled, the Content List field defines which Watson Explorer Engine input contents will be annotated. When disabled (the default), the Content List field defines which Watson Explorer Engine input contents will not be annotated
    • Content List - The list of Watson Explorer Engine contents that will or will not be annotated, depending on the setting of the Exclude Contents By Default field
    • Logging Configuration - Log4j configuration for the converter. The default configuration sets the logging level to OFF
  4. Configure any other appropriate search collection options and start indexing the collection. The annotators add <content> elements of the following types to the indexed documents:

    Sentiment:

    • <content name="positive">VALUE</content>
    • <content name="negative">VALUE</content>

    Named Entities:

    • <content name="person">VALUE</content>
    • <content name="location">VALUE</content>
    • <content name="organization">VALUE</content>
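
Example

The following Python sketch illustrates two points from the procedure: how the Exclude Contents By Default and Content List options combine, and how the <content> elements added in step 4 might be grouped by name once a document has been annotated. It is not part of the product. The function names, the <document> wrapper element, the sample values, and the assumption that Content List entries are matched by exact content name are all hypothetical and used for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Content names that the annotators add, as listed in step 4.
ANNOTATION_NAMES = {"positive", "negative", "person", "location", "organization"}


def should_annotate(content_name, content_list, exclude_by_default=False):
    """Model the Exclude Contents By Default / Content List semantics.

    Assumption: entries in content_list are matched by exact content name.
    exclude_by_default=True  -> only contents named in content_list are annotated.
    exclude_by_default=False -> contents named in content_list are skipped (default).
    """
    listed = content_name in content_list
    return listed if exclude_by_default else not listed


def collect_annotations(document_xml):
    """Group annotation values by content name for a single document."""
    root = ET.fromstring(document_xml)
    found = defaultdict(list)
    for content in root.iter("content"):
        name = content.get("name")
        if name in ANNOTATION_NAMES and content.text:
            found[name].append(content.text.strip())
    return dict(found)


# Hypothetical annotated document; the wrapper element and values are illustrative.
SAMPLE = """
<document>
  <content name="title">Quarterly report</content>
  <content name="positive">strong growth</content>
  <content name="person">Jane Doe</content>
  <content name="organization">Example Corp</content>
</document>
"""

if __name__ == "__main__":
    # With the default setting, a listed content is skipped.
    print(should_annotate("title", {"title"}))
    # With Exclude Contents By Default enabled, only listed contents are annotated.
    print(should_annotate("title", {"title"}, exclude_by_default=True))
    print(collect_annotations(SAMPLE))
```

In this sketch, contents that are not annotation types (such as the hypothetical title content) are ignored when the annotations are collected, so only the sentiment and named entity values remain.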