IBM Support

IBM Content Analytics with Enterprise Search does not index header information from Microsoft PowerPoint

Technote (troubleshooting)


In Microsoft PowerPoint, notes can be added to each slide. These notes pages can have a header and footer, like a Word document.

IBM Content Analytics with Enterprise Search (ICAwES) text extraction does not extract this header and footer text.

In addition, Microsoft PowerPoint has a slide master feature in which text can also be stored. ICAwES 3.0 Fix Pack 1 includes a fix for slide master text, but with that fix text extraction extracts only the slide master footer text, not the header text.


Slide notes are ignored by default, but a workaround is available.

Resolving the problem

To extract this information, apply the following workaround:

  1. Create the directory '$ES_INSTALL_ROOT/lib/com/ibm/es/oze/parser/outsidein/'.
  2. Copy the attached file into the directory that you created.
  3. Modify the classpath parameter in the '$ES_INSTALL_ROOT/configurations/interfaces/stellent__interface.ini' file to include the 'lib' directory:
    • classpath=lib,es.indexservice.jar
  4. Restart the parse and index services, then re-crawl/re-parse/re-index the documents.
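
Steps 1 to 3 can be sketched as a small shell function for AIX or Linux. This is only an illustration under stated assumptions, not an IBM-supplied script: the function name apply_workaround is hypothetical, the attachment path is a placeholder because the technote does not name the attached file, and the sketch assumes the .ini file already contains a line that starts with "classpath=". It keeps a one-time backup of the .ini file before editing it. Step 4 (restart and re-index) is not covered.

```shell
#!/bin/sh
# Hypothetical helper covering steps 1-3 of the workaround.
#   $1: the ICAwES installation root (the value of $ES_INSTALL_ROOT)
#   $2: path to the file attached to the technote (placeholder; the
#       technote does not give the file name)
apply_workaround() {
    root="$1"
    attachment="$2"
    target="$root/lib/com/ibm/es/oze/parser/outsidein"
    ini="$root/configurations/interfaces/stellent__interface.ini"

    # Step 1: create the directory, including any missing parents.
    mkdir -p "$target" || return 1

    # Step 2: copy the attached file into the new directory.
    cp "$attachment" "$target/" || return 1

    # Step 3: add 'lib' to the classpath parameter.
    # Keep a one-time backup of the original .ini file.
    [ -f "$ini.bak" ] || cp "$ini" "$ini.bak" || return 1

    # Only edit when 'lib' is not already a classpath entry, so that
    # running the function twice does not add 'lib' twice.
    # Assumption: a 'classpath=' line exists; if not, nothing changes.
    if ! grep -Eq '^classpath=(.*,)?lib(,|$)' "$ini"; then
        sed 's|^classpath=|classpath=lib,|' "$ini" > "$ini.tmp" &&
            mv "$ini.tmp" "$ini" || return 1
    fi
}

# Example call (the attachment path is a placeholder):
#   apply_workaround "$ES_INSTALL_ROOT" /path/to/attached-file
```

After running it, check that the classpath line in the .ini file reads as shown in step 3 before continuing with step 4.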

Document information

More support for: Watson Content Analytics

Software version: 3.0

Operating system(s): AIX, Linux, Windows

Reference #: 1623435

Modified date: 04 April 2014