IBM Support

Troubleshooting / Monitoring Analyses for Unified Governance


Problem

Analyses started from Information Analyzer or Discovery fail or do not run properly.

Symptom

Analyses that are run with Information Analyzer Thin Client or the discover command line fail, do not run properly, or stay in an "In progress" state for a very long time.

Environment

Information Server 11.7.0.0

Diagnosing The Problem

  1. Check the error message in Information Analyzer Thin Client or examine the log files:
    • On the services tier, check the Application Server log files.
    • On the engine tier, check the log files in the ASBNode/logs directory whose names start with the prefix odfengine.
  2. Check the status of the "ODFEngine" operating system service on the engine tier as follows:
    Linux:
    service ODFEngine status
    AIX:
    lssrc -s ODFEngine
    Windows:
    Use Administrative Tools -> Services and check the status of the service "IBM Information Server Open Discovery Framework".
  3. In addition to checking error messages, you can use a command-line tool on the engine tier to monitor running analyses, view finished analyses, cancel running analyses, and perform health checks.
    Start the tool by running the following command on the engine tier:
    Linux/AIX:
    /opt/IBM/InformationServer/ASBNode/bin/ODFAdmin.sh
    Windows:
    C:\IBM\InformationServer\ASBNode\bin\ODFAdmin.bat
    The tool is interactive and might take some time to start the first time it is run. Type "help" to display introductory help text. Typical commands are shown in the following list:
    • a -l: Show all running analyses.
    • a -c <number>: Cancel a running analysis. This command resets the state of the data set in Information Analyzer Thin Client.
    • a -d <number>: Show details of a single analysis.
    • e -h: Run a health check of the system.
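
The log check in step 1 can be sketched as a small shell function for Linux/AIX. This is a minimal sketch, not an IBM-provided tool: the function name scan_odf_logs, the default directory (derived from the installation path used in this document), and the search pattern are all assumptions that you might need to adapt.

```shell
#!/bin/sh
# Sketch: print lines that mention errors or exceptions in the odfengine* logs.
# ASSUMPTIONS: the function name, the default log directory, and the search
# pattern are illustrative; they are not taken from IBM documentation.
scan_odf_logs() {
    log_dir="${1:-/opt/IBM/InformationServer/ASBNode/logs}"
    # -n prints line numbers, -i ignores case, -E enables extended patterns.
    # Passing /dev/null as an extra file makes grep always print file names,
    # even when only one log file matches the odfengine* prefix.
    grep -n -i -E 'error|exception' "$log_dir"/odfengine* /dev/null 2>/dev/null
}
```

Calling scan_odf_logs without an argument uses the default directory; pass a different directory as the first argument if Information Server is installed elsewhere.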

Resolving The Problem

  1. Depending on the error message you find in the Information Analyzer Thin Client or in the logs, take the following actions:
    1. Column Analysis / Data Quality Analysis fails with a DataStage error like:
      "Event 14: pxbridge(2),0: JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError
      Event 15: pxbridge(2),0: JVMDUMP032I JVM requested System dump using '/opt/IBM/InformationServer/Server/Projects/ANALYZERPROJECT/core.[...].dmp' in response to an event
      Resolution: Increase the heap size of the DataStage Java stage by raising the value of the IIS property com.ibm.iis.ia.engine.javaStage.heapSize. Run the following command on the services tier:
      Linux/AIX:
      /opt/IBM/InformationServer/ASBServer/bin/iisAdmin.sh -set -key com.ibm.iis.ia.engine.javaStage.heapSize -value 2048
      Windows:
      C:\IBM\InformationServer\ASBServer\bin\iisAdmin.bat -set -key com.ibm.iis.ia.engine.javaStage.heapSize -value 2048
    2. Column Analysis / Data Quality Analysis fails with a message "Discovery service with id com.ibm.infosphere.ia.CADataStageJobService is not registered"
      Resolution: Restart the ODFEngine service as described in step 2 below and try again.
    3. Column Analysis / Data Quality Analysis fails with the following error message in the WAS log: "The ODF runtimes ''[DataStage]'' could not be reached. Please check that all ODF instances are running". This can happen in particular in a "horizontal" WAS cluster environment.
      Resolution: This message indicates that the ODFEngine service on the engine tier is not running or is not reachable. Start the service as described in step 2 below and try again.
      If the service is running and you still see the issue, check the ODFEngine logs. If you see an error message like "org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 5000", check the setting "odf.zookeeper.connect" in the configuration file ASBNode/conf/odf-properties. It should point to the computer where the shared open source services (Zookeeper, Solr, Kafka) are installed. After changing the value, restart the ODFEngine service.
    4. Column Analysis / Data Quality Analysis fails with Kafka / Zookeeper related error messages in the WAS log like "Unable to connect to zookeeper server within timeout: 5000".
      Resolution: Check that the Zookeeper and Kafka services are up and running as described here: http://www-01.ibm.com/support/docview.wss?uid=swg21977649
  2. Restart the ODFEngine service, in particular if the status check in the section "Diagnosing The Problem" shows that it is not running, by issuing the following commands:
    Linux:
    service ODFEngine stop
    service ODFEngine start
    AIX:
    stopsrc -s ODFEngine
    startsrc -s ODFEngine
    Windows:
    Use the Administrative Tools -> Services tool to stop and start the service named "IBM Information Server Open Discovery Framework". Alternatively, you can use these commands:
    net stop InfoSrvODF
    net start InfoSrvODF
  3. If an analysis in Information Analyzer Thin Client stays in the "In progress" state for a very long time, use the command 'a -c <number>' of the ODFAdmin tool (see section "Diagnosing The Problem") to cancel it.
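
The Linux and AIX restart commands from step 2 can be wrapped in one function that picks the right pair for the current platform. This is only a sketch: the function name restart_odfengine and the dispatch on the output of uname -s are assumptions, and the service commands themselves are the ones listed in step 2.

```shell
#!/bin/sh
# Sketch: restart the ODFEngine service with the commands listed in step 2.
# ASSUMPTIONS: the function name and the uname-based platform detection are
# illustrative. Run as a user that is allowed to control the service.
restart_odfengine() {
    case "$(uname -s)" in
        Linux)
            service ODFEngine stop
            service ODFEngine start
            ;;
        AIX)
            stopsrc -s ODFEngine
            startsrc -s ODFEngine
            ;;
        *)
            # Windows and other platforms are not handled by this sketch.
            echo "Unsupported platform: $(uname -s)" >&2
            return 1
            ;;
    esac
}
```

After calling restart_odfengine, check the service status again with the status command from the section "Diagnosing The Problem" before rerunning the analysis.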

Document Information

Modified date:
16 June 2018

UID

swg22011542