IBM Support

Content Search Services is unable to initialize the language tokenizer, preventing all Content Based Retrieval indexing

Troubleshooting


Problem

After installing Content Search Services on an NFS-mounted drive, every attempt to submit documents for Content Based Retrieval indexing fails with an error stating that Content Search Services cannot initialize the tokenizer.

Symptom

Every attempt to index fails with the same error in the CE trace log:

<message>IQQP0009W The parser cannot parse the document {A4066CC9-64BF-45C1-A6AA-FECDA7793E0A}. The document will not be indexed. Document Status:404 Error details:The tokenizer cannot be initialized with the UIMA descriptor /app/IBM/ContentSearchServices/CSS_Server/resource/uima//aggregate_indexing_default.xml and data directory /app/IBM/ContentSearchServices/CSS_Server/resource/uima/:/app/IBM/ContentSearchServices/CSS_Server/config/dictionaries/.</message>

Cause

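The mount point of the NFS-mounted drive has incorrect permissions set: the user who runs Content Search Services does not have read and execute permission on it.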

Environment

AIX using an NFS mount to store Content Search Services indexes.

Diagnosing The Problem

On the CSS server, the error in the trace*.log will be similar to this:

<message>IQQI0005E The document with ID {A4066CC9-64BF-45C1-A6AA-FECDA7793E0A} cannot be indexed.
Causes of the problem:
IQQP6000E The tokenizer cannot be initialized with the UIMA descriptor /app/IBM/ContentSearchServices/CSS_Server/resource/uima//aggregate_indexing_default.xml and data directory /app/IBM/ContentSearchServices/CSS_Server/resource/uima/:/app/IBM/ContentSearchServices/CSS_Server/config/dictionaries/.
IQQG0020E org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "com.ibm.es.nuvo.tokenizer.annotators.FrostWrapperAnnotator" failed. (Descriptor: file:/app/IBM/ContentSearchServices/CSS_Server/resource/uima/langware.xml)
IQQG0020E java.lang.NullPointerException
java.io.File.<init>(File.java:233)
com.ibm.dltj.CapMatrix.getRequestedFiles(Unknown Source)
com.ibm.dltj.CapMatrix.setDataSpec(Unknown Source)
com.ibm.dltj.CapMatrix.setDataSpec(Unknown Source)
com.ibm.dltj.CapMatrix.<init>(Unknown Source)
com.ibm.dltj.uima_annotator.lex_analysis.DictionariesCacheManager.<init>(DictionariesCacheManager.java:77)
...
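
When several trace files have to be checked, a small scan for the message ID can save time. The following is only a sketch: the class name TokenizerErrorScan and its method are hypothetical, and the only detail taken from the log excerpt above is the message ID IQQP6000E.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TokenizerErrorScan {

    // Message ID copied from the log excerpt above; everything else is illustrative.
    private static final String MESSAGE_ID = "IQQP6000E";

    // Returns the 1-based numbers of the lines that contain the message ID.
    public static List<Integer> matchingLines(List<String> lines) {
        List<Integer> hits = new ArrayList<Integer>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(MESSAGE_ID)) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java TokenizerErrorScan <trace log file>");
            return;
        }
        // Read the whole trace file into memory, one entry per line.
        List<String> lines = new ArrayList<String>();
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        try {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                lines.add(line);
            }
        } finally {
            in.close();
        }
        // Print each hit as file:line: text.
        for (int n : matchingLines(lines)) {
            System.out.println(args[0] + ":" + n + ": " + lines.get(n - 1));
        }
    }
}
```

Run it once per trace file, passing the file path as the only argument. A hit only shows that the tokenizer error was logged; the permission checks below are still needed to confirm the cause.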

Steps to confirm the permissions on the mounted drive:

1) Check to see if the Content Search Services installation is located on an NFS mounted drive using:

df -k

and look for the path to the Content Search Services installation.

2) Check to see if read/execute permission to the mounted drive is granted to the user who runs Content Search Services, starting at the top level directory.



If the permissions are incorrect, an ls -al of the mounted drive reports "Permission denied" for the parent directory entry (..).

For example:

$ ls -al /ibm
ls: 0653-345 /ibm/..: Permission denied.
total 0
drwxr-xr-x    4 root     system          256 Aug 16 11:08 .
drwxr-xr-x    2 root     system          256 Aug 16 11:07 lost+found
drwxr-xr-x    2 root     system          256 Aug 16 11:08 FileNet
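
The same check can be approximated from Java, run as the user who runs Content Search Services. This is a sketch under assumptions: the class name PermissionWalk is hypothetical, and it assumes that the denial ls reports for ".." is also visible to java.io.File when it tests the path "<directory>/..". For each directory from the given path up to the root, it tests read and execute permission on the directory itself and on its ".." entry.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class PermissionWalk {

    // Walks from the given directory up to the file system root. A directory is
    // reported as OK only if the current user can read and traverse (execute)
    // both the directory itself and its ".." entry.
    public static List<String> check(File start) {
        List<String> report = new ArrayList<String>();
        for (File dir = start.getAbsoluteFile(); dir != null; dir = dir.getParentFile()) {
            boolean self = dir.canRead() && dir.canExecute();
            // java.io.File does not collapse "..", so this tests the literal path "<dir>/..".
            File dotDot = new File(dir, "..");
            boolean parent = dotDot.canRead() && dotDot.canExecute();
            report.add((self && parent ? "OK" : "DENIED") + " " + dir.getPath());
        }
        return report;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java PermissionWalk <absolute path to check>");
            return;
        }
        for (String line : check(new File(args[0]))) {
            System.out.println(line);
        }
    }
}
```

Pass an absolute path, such as the Content Search Services installation directory. A relative path is resolved against "user.dir", which may itself be unresolved on an affected mount. Any line marked DENIED points at a directory whose permissions should be inspected with ls -al.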

3) Create the following Test.java program on the same drive where Content Search Services is installed.

public class Test {
   public static void main(String[] args)  {
       System.out.println(System.getProperty("user.dir"));
   }
}

Compile the program above with this command:
/usr/java6/bin/javac Test.java

Run the test using this command:
/usr/java6/bin/java Test

The output of the test on the NFS mount with incorrect permissions will be ".".

Copying the same Java program to a local drive such as /tmp and running it there should print the local directory path instead of "." as the value of "user.dir".
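
A slightly extended variant of the test can state the verdict directly instead of leaving the comparison to the reader. This is a sketch, not part of the original procedure: the class name UserDirCheck and the method isResolved are hypothetical, and the check relies only on the observation above that an affected mount yields "." while a healthy directory yields an absolute path.

```java
import java.io.File;

public class UserDirCheck {

    // True when the value is an absolute path, which is what the JVM normally
    // reports for "user.dir". On the affected mount the value is "." instead.
    public static boolean isResolved(String userDir) {
        return userDir != null && new File(userDir).isAbsolute();
    }

    public static void main(String[] args) {
        String userDir = System.getProperty("user.dir");
        System.out.println("user.dir = " + userDir);
        System.out.println(isResolved(userDir)
                ? "Working directory resolved normally."
                : "Working directory not resolved; check the mount point permissions.");
    }
}
```

Compile and run it in the same way as Test.java, once from the NFS-mounted drive and once from a local directory, and compare the two results.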

Resolving The Problem


Unmount the drive, change the permissions on the mount point so that the user who runs Content Search Services has read and execute permission, and then remount the drive.

For example:
# umount /ibm

# ls -al /ibm
total 8
drwxr-x---    2 root     system          256 Aug 16 11:07 .
drwxr-xr-x   34 root     system         4096 Aug 16 11:07 ..

# chmod 755 /ibm
# mount /ibm

$ ls -al /ibm
total 8
drwxr-xr-x    4 root     system          256 Aug 16 11:08 .
drwxr-xr-x   34 root     system         4096 Aug 16 11:07 ..
drwxr-xr-x    2 root     system          256 Aug 16 11:07 lost+found
drwxr-xr-x    2 root     system          256 Aug 16 11:08 FileNet

After the remount completes, retry indexing; the problem should be resolved.

[{"Product":{"code":"SSNVNV","label":"FileNet Content Manager"},"Business Unit":{"code":"BU053","label":"Cloud & Data Platform"},"Component":"Content Search Services","Platform":[{"code":"PF002","label":"AIX"}],"Version":"5.1.0;5.2.0","Edition":"","Line of Business":{"code":"LOB45","label":"Automation"}}]

Document Information

Modified date:
17 June 2018

UID

swg21682174