IZ85924: WATCHDOG CONTINUES TO RESTART OS AGENT

APAR status

  • Closed as program error.

Error description

  • Severity: 2
    Approver: sm
    Reported Release:622
    Compid: 5724C040U Tivoli OMEGAMON XE for UNIX
    Abstract: Watchdog continues to restart OS agent
    
    Environment:
      TEMS/TEPS -- AIX 6.1 -- ITM 6.2.2 FP02
      TEMA -- AIX V6.1 -- ITM 6.2.2 FP02 Unix OS Agent
      TEMA is an HACMP node with two kuxagent instances (the
    customer received TCT and was advised that this is now
    supported), and both agents are experiencing this problem.
    
    Problem Description:
      cinfo -r/-R does not show "running" because of the system's
    name resolution setting (the result of the hostname command and
    the hostname in the RunInfo file do not match).
    
    Watchdog uses cinfo -r or -R to check for a running agent.
    Normally, if "cinfo -r" indicates the agent is not running, the
    watchdog will try to start it.  If, however, a PID has been
    collected indicating that the process is running but "cinfo -r"
    indicates it is not running, watchdog treats this as an
    "unhealthy" agent and will try to stop it before starting it.
    As a result, it keeps restarting the OS agent. In the end, the
    OS agent is no longer running.
    
    Detailed Recreation Procedure:
      Change the name resolution setting so that the result of the
    hostname command and the hostname in the RunInfo file do not
    match.
    
    Related Files and Output:
    
      #./cinfo -r
    *********** Wed Aug  4 16:26:00 JST 2010 ******************
    User: root Groups: system bin sys security cron audit lp sapinst ncoadmin
    Host name : dcssap1      Installer Lvl:06.22.02.00
    CandleHome: /opt/tivoli/itm
    ***********************************************************
    Host     Prod  PID     Owner  Start  ID    ..Status
    dcssap0  ux    897176                None
    
    Watchdog is using cinfo -r or -R to check for running agent.
    Since it does not get the correct status from cinfo -r/-R,
    Watchdog continues to restart the agent. In the end, OS Agent
    is not running anymore.
    
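The mismatch visible in the sample output (banner host name dcssap1, table row host dcssap0) can be checked mechanically. The following is a minimal sketch, not part of the product: the function name `check_cinfo_hosts` is hypothetical, and the column layout (Host, Prod, PID as the first three fields after the "Host Prod ..." header) is assumed from the sample above.

```shell
#!/bin/sh
# Sketch: report agent rows in "cinfo -r" output whose Host column differs
# from the given local host name. The column layout is an assumption taken
# from the sample output in this APAR.

# Usage: check_cinfo_hosts LOCAL_HOST   (reads "cinfo -r" output on stdin)
# Prints one line per mismatching row; exit status is 1 if any was found.
check_cinfo_hosts() {
    awk -v local_host="$1" '
        # The table header is the line whose first two fields are Host and Prod.
        $1 == "Host" && $2 == "Prod" { in_table = 1; next }
        # After the header, each row is: host, product code, PID, ...
        in_table && NF >= 3 && $1 != local_host {
            printf "MISMATCH prod=%s pid=%s runinfo_host=%s local_host=%s\n", $2, $3, $1, local_host
            bad = 1
        }
        END { exit bad }
    '
}
```

A possible invocation, assuming cinfo is run from its own directory as in the sample, is `./cinfo -r | check_cinfo_hosts "$(hostname)"`. Any MISMATCH line suggests the condition described in this APAR.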

Local fix

  • Disable Watchdog
    

Problem summary

  • When the Agent Management Services watchdog utility invokes the
    "cinfo -r" command to determine if an agent is running and
    "cinfo" has a problem resolving the hostname, it can report that
    the agent is not running when it is.  This will result in the
    watchdog utility restarting the agent each time it checks
    availability.
    

Problem conclusion

  • Code was updated to not restart the agent if the watchdog
    utility is able to verify the process is running in this
    scenario.
    
    
    The fix for this APAR is going to be available in the following
    maintenance packages:
    
          | LA interim fix | 6.2.2.3-TIV-ITM_LINUX-IF0001
          | LA interim fix | 6.2.2.3-TIV-ITM_UNIX-IF0001
          | fix pack | 6.2.2-TIV-ITM-FP0004
    
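The change in behaviour described in the problem conclusion can be illustrated with a small decision function. This is only a sketch of the logic as worded in this APAR, not the watchdog's actual code: the function name `watchdog_action`, its string arguments, and the returned action names are all hypothetical.

```shell
#!/bin/sh
# Sketch of the availability decision described in this APAR.
# All names here are illustrative assumptions, not product interfaces.
#   $1  cinfo_status : "running" or "not_running" (what "cinfo -r" reported)
#   $2  pid_alive    : "yes" or "no" (whether a collected PID is a live process)
#   $3  fixed        : "yes" models the behaviour after the fix, "no" before it

watchdog_action() {
    cinfo_status=$1
    pid_alive=$2
    fixed=$3
    if [ "$cinfo_status" = "running" ]; then
        echo "none"              # agent reported as running: nothing to do
    elif [ "$pid_alive" = "no" ]; then
        echo "start"             # not running and no live process: start it
    elif [ "$fixed" = "yes" ]; then
        echo "none"              # after the fix: live process, so no restart
    else
        echo "stop_then_start"   # before the fix: treated as "unhealthy"
    fi
}
```

With a host name mismatch, `watchdog_action not_running yes no` yields `stop_then_start` on every availability check, which is the restart loop reported here, while `watchdog_action not_running yes yes` yields `none`.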

Temporary fix

  • Watchdog can be disabled with the following steps:
    
    Edit the lz.ini or ux.ini file and comment out the line that
    starts with KCA_CAP_DIR.  (Note that the value on the right
    might differ slightly depending on the release.)
    
    #  KCA_CAP_DIR=$CANDLEHOME$/config/CAP:/opt/IBM/CAP
    
    Then add the line:
    
    KCA_CAP_DIR=
    
    Stop and restart the OS Agent.
    
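The edit to the .ini file described above could be scripted along the following lines. This is a minimal sketch under stated assumptions: the function name `disable_watchdog` is hypothetical, the path to the .ini file must be supplied by the caller, and the script is meant to be run once per file (a second run would overwrite the backup).

```shell
#!/bin/sh
# Sketch: comment out every active KCA_CAP_DIR line in an agent .ini file
# and append an empty KCA_CAP_DIR= assignment.
# The original file is kept as FILE.bak.

# Usage: disable_watchdog /path/to/ux.ini   (or lz.ini)
disable_watchdog() {
    ini=$1
    cp "$ini" "$ini.bak" || return 1
    {
        # Prefix active KCA_CAP_DIR lines with "# "; copy other lines as-is.
        awk '/^KCA_CAP_DIR=/ { print "# " $0; next } { print }' "$ini.bak"
        # Add the empty assignment that disables the watchdog.
        echo 'KCA_CAP_DIR='
    } > "$ini"
}
```

On a typical installation the file is expected under the config directory of CandleHome (an assumption; verify the location on your system). After the edit, the OS agent still has to be stopped and restarted as described above, for example with `itmcmd agent stop ux` followed by `itmcmd agent start ux` for the UNIX OS agent.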

Comments

APAR Information

  • APAR number

    IZ85924

  • Reported component name

    ITM AGENT UNIX

  • Reported component ID

    5724C040U

  • Reported release

    622

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-09-28

  • Closed date

    2010-11-30

  • Last modified date

    2011-04-06

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    ITM AGENT UNIX

  • Fixed component ID

    5724C040U

Applicable component levels

  • R622 PSY

       UP



Document information


More support for:

Tivoli OMEGAMON XE for Distributed Systems

Software version:

622

Reference #:

IZ85924

Modified date:

2011-04-06
