Indexing .dotx and .xltx files with IBM Content Analytics with Enterprise Search.

Technote (FAQ)


Question

How do I crawl and index .dotx and .xltx files with IBM Content Analytics with Enterprise Search?

Answer

(1) Create the following file, where <cid> is the collection ID:

<$ES_NODE_ROOT>/master_config/<cid>.stellent/stellenttypes.cfg

Add the following lines to the file:

accept DEFAULT
accept FI_WINWORDTEMPLATE2007 dotx
accept FI_WINWORDTEMPLATE2010 dotx
accept FI_EXCELTEMPLATE2007 xltx
accept FI_EXCELTEMPLATE2010 xltx
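
If you prefer to script step (1), the following is a minimal Python sketch rather than an IBM-provided tool. The function name write_stellent_types is hypothetical, and the sketch assumes that the <cid>.stellent directory already exists under master_config; it stops with an error rather than creating a directory for a mistyped collection ID.

```python
from pathlib import Path

# Lines taken verbatim from step (1).
STELLENT_TYPES = [
    "accept DEFAULT",
    "accept FI_WINWORDTEMPLATE2007 dotx",
    "accept FI_WINWORDTEMPLATE2010 dotx",
    "accept FI_EXCELTEMPLATE2007 xltx",
    "accept FI_EXCELTEMPLATE2010 xltx",
]


def write_stellent_types(node_root: str, cid: str) -> Path:
    """Write <node_root>/master_config/<cid>.stellent/stellenttypes.cfg.

    Hypothetical helper: node_root is the value of ES_NODE_ROOT and
    cid is the collection ID.
    """
    target_dir = Path(node_root) / "master_config" / f"{cid}.stellent"
    if not target_dir.is_dir():
        # Assumption: the directory is created by the product, not by us.
        raise FileNotFoundError(f"directory not found: {target_dir}")
    target = target_dir / "stellenttypes.cfg"
    target.write_text("\n".join(STELLENT_TYPES) + "\n", encoding="utf-8")
    return target
```

A call might look like write_stellent_types(os.environ["ES_NODE_ROOT"], "col_12345"), assuming ES_NODE_ROOT is set in the environment; col_12345 is a made-up collection ID.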

(2) Modify <$ES_NODE_ROOT>/master_config/<cid>.indexservice/mimetypes.xml and add the lines below to the <ExtensionMapping> section:

<MappedMimetype Name="application/vnd.openxmlformats-officedocument.wordprocessingml.template">
  <Extension>.dotx</Extension>
</MappedMimetype>
<MappedMimetype Name="application/vnd.openxmlformats-officedocument.spreadsheetml.template">
  <Extension>.xltx</Extension>
</MappedMimetype>
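
The edit in step (2) can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under stated assumptions, not the product's own tooling: the function name add_extension_mappings is hypothetical, and the SAMPLE document is a simplified stand-in, because the full layout of mimetypes.xml is not shown here. Only the <ExtensionMapping>, <MappedMimetype>, and <Extension> names come from the step above.

```python
import xml.etree.ElementTree as ET

# MIME type -> extension pairs from step (2).
NEW_MAPPINGS = {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.template": ".dotx",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.template": ".xltx",
}


def add_extension_mappings(root: ET.Element) -> int:
    """Append missing <MappedMimetype> entries; return how many were added."""
    if root.tag == "ExtensionMapping":
        section = root
    else:
        section = root.find(".//ExtensionMapping")
    if section is None:
        raise ValueError("no <ExtensionMapping> section found")
    # Skip MIME types that are already mapped so the edit can be re-run safely.
    existing = {m.get("Name") for m in section.findall("MappedMimetype")}
    added = 0
    for mimetype, extension in NEW_MAPPINGS.items():
        if mimetype in existing:
            continue
        mapped = ET.SubElement(section, "MappedMimetype", Name=mimetype)
        ET.SubElement(mapped, "Extension").text = extension
        added += 1
    return added


# Hypothetical, simplified stand-in for mimetypes.xml (root name is made up).
SAMPLE = "<Mimetypes><ExtensionMapping/></Mimetypes>"
sample_root = ET.fromstring(SAMPLE)
added_count = add_extension_mappings(sample_root)
```

To apply this to the real file, you would load it with ET.parse, call the function on the root element, and write the tree back. Back up the file first, because ElementTree discards XML comments by default when parsing.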


(3) Modify <$ES_NODE_ROOT>/master_config/<cid>.indexservice/parser_config.xml and add the lines below to the section under <ParserMapping><ParserName>stellent</ParserName>:

<Mimetype>application/vnd.openxmlformats-officedocument.wordprocessingml.template</Mimetype>
<Mimetype>application/vnd.openxmlformats-officedocument.spreadsheetml.template</Mimetype>
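
Step (3) can be sketched the same way in Python. The function name add_stellent_mimetypes and the SAMPLE document are hypothetical. The sketch assumes that each <Mimetype> element is a direct child of the <ParserMapping> element whose <ParserName> is stellent, which matches the description in step (3) but should be checked against your actual parser_config.xml.

```python
import xml.etree.ElementTree as ET

# MIME types from step (3).
TEMPLATE_MIMETYPES = [
    "application/vnd.openxmlformats-officedocument.wordprocessingml.template",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.template",
]


def add_stellent_mimetypes(root: ET.Element) -> int:
    """Add missing <Mimetype> entries to the stellent <ParserMapping>."""
    for mapping in root.iter("ParserMapping"):
        name = (mapping.findtext("ParserName") or "").strip()
        if name != "stellent":
            continue
        existing = {(m.text or "").strip() for m in mapping.findall("Mimetype")}
        added = 0
        for mimetype in TEMPLATE_MIMETYPES:
            if mimetype not in existing:
                ET.SubElement(mapping, "Mimetype").text = mimetype
                added += 1
        return added
    raise ValueError("no <ParserMapping> with <ParserName>stellent</ParserName>")


# Hypothetical, simplified stand-in for parser_config.xml (root name is made up).
SAMPLE = (
    "<ParserConfig>"
    "<ParserMapping><ParserName>other</ParserName></ParserMapping>"
    "<ParserMapping><ParserName>stellent</ParserName>"
    "<Mimetype>application/msword</Mimetype></ParserMapping>"
    "</ParserConfig>"
)
config_root = ET.fromstring(SAMPLE)
added_total = add_stellent_mimetypes(config_root)
```

Running the function a second time adds nothing, so it is safe to re-run after a partial edit.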


After making these changes, restart the crawler and the parser/indexer from the administration console so that the target files are crawled and indexed.

Related information

Associating document types with the text extractor

Document information


More support for:

Watson Content Analytics

Software version:

3.0

Operating system(s):

AIX, Linux, Linux on System z, Windows

Reference #:

1643106

Modified date:

2013-07-08
