IV18119: UNIX OS SHOULD BE MORE RESILIENT ON DAEMON FAILURES

APAR status

  • Closed as program error.

Error description

  • Environment: The Monitoring Agent for UNIX OS 6.2.3
    
    Problem Description: The UNIX OS agent stops when one of the
    daemons used to gather attributes (data providers) fails
    to initialize or crashes.
      Example daemons: mount_stat and aixdp_daemon
    
    For product resiliency, the UNIX OS agent behavior should be
    changed so that the rest of the agent continues processing
    when a daemon fails.
    
    Additional Keywords:
    IV18116
    

Local fix

Problem summary

  • The Monitoring Agent for UNIX OS stops when one of the daemons
    used to gather attributes fails to initialize or crashes. This
    happens, for example, with mount_stat and aixdp_daemon.
    
    See the Problem conclusion section for details of the fix,
    including a new attribute group that reports on the health of
    internal daemons.
    

Problem conclusion

  • The code was changed to allow the rest of the agent to continue
    processing in the event of a failing daemon. A new attribute
    group named "Data Collection Status" is introduced to report on
    the health of internal daemons comprising the UNIX OS Agent.
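
    The behavior described above can be illustrated with a minimal
    sketch. It is not the agent's actual implementation: the real
    data providers are separate daemons, while this sketch models
    them as in-process callables. The names collect_all and
    CollectorResult are hypothetical. The numeric codes are taken
    from the Status attribute described in this APAR.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Any, Callable, Dict


class Status(IntEnum):
    # Codes as listed for the Status attribute in this APAR.
    RUNNING = 1
    FAILED = 2


@dataclass
class CollectorResult:
    # Hypothetical record: one entry per data collector.
    status: Status
    data: Any = None


def collect_all(collectors: Dict[str, Callable[[], Any]]) -> Dict[str, CollectorResult]:
    """Run every collector; a failure in one does not stop the others."""
    results: Dict[str, CollectorResult] = {}
    for name, collect in collectors.items():
        try:
            results[name] = CollectorResult(Status.RUNNING, collect())
        except Exception:
            # Record the failure instead of letting it end the whole run.
            results[name] = CollectorResult(Status.FAILED)
    return results


def broken_collector() -> Any:
    # Stand-in for a daemon that fails to initialize.
    raise RuntimeError("daemon failed to initialize")


results = collect_all({
    "mount_stat": lambda: ["/", "/home"],
    "aixdp_daemon": broken_collector,
})
```

    In this sketch the failed collector is recorded with a Failed
    status while the data from the healthy collector is still
    returned, which mirrors the intent of the fix.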
    
    The Data Collection Status table view of the UNIX workspace
    provides specific details.
    
    Data Collection Status attributes:
    Use Data Collection Status attributes to monitor the health of
    the internal data collectors of the UNIX OS agent.
    
    Name - The name of the data collector. Valid entries are up to
    48 letters or numbers.
    
    Operating System Level - The version of the operating system
    where the UNIX OS agent is running. Valid values include Not
    Available (-1) and Not Collected (-2).
    
    Status - The status of the data collector. Valid values include
    Disabled (3), Failed (2), Running (1), Not Available (-1), and
    Not Collected (-2).
    
    System Name - The host name of the monitored system. The form
    should be hostname:agent_code. Examples include spark:KUX or
    deux.raleigh.ibm.com:KUX. In workspace queries, this attribute
    should be set equal to the value $NODE$ in order to populate the
    workspace with data. This attribute is generally not included in
    situations, unless there is a need to customize the situation
    for a specific managed system.
    
    Timestamp - The date and time the agent collects information as
    set on the monitored system.
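
    As an illustration of the attribute values above, the following
    sketch decodes a Status code and splits a System Name of the
    form hostname:agent_code. The helper names (CollectorStatus,
    ManagedSystem, parse_system_name, needs_attention) are
    hypothetical and are not part of the product. Treating only
    Failed as needing attention is an assumption made for this
    example.

```python
from enum import IntEnum
from typing import NamedTuple


class CollectorStatus(IntEnum):
    # Values listed for the Status attribute. The APAR says "valid values
    # include", so other codes may exist; unknown codes raise ValueError here.
    DISABLED = 3
    FAILED = 2
    RUNNING = 1
    NOT_AVAILABLE = -1
    NOT_COLLECTED = -2


class ManagedSystem(NamedTuple):
    hostname: str
    agent_code: str


def parse_system_name(value: str) -> ManagedSystem:
    """Split 'hostname:agent_code' at the last colon."""
    hostname, sep, agent_code = value.rpartition(":")
    if not sep or not hostname or not agent_code:
        raise ValueError(f"expected hostname:agent_code, got {value!r}")
    return ManagedSystem(hostname, agent_code)


def needs_attention(status_code: int) -> bool:
    """Assumption for this sketch: only a Failed collector needs attention."""
    return CollectorStatus(status_code) is CollectorStatus.FAILED


system = parse_system_name("deux.raleigh.ibm.com:KUX")
```

    With the example value from the System Name description,
    system.hostname is deux.raleigh.ibm.com and system.agent_code
    is KUX.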
    
    
    The fix for this APAR will be included in the following
    maintenance vehicles:
        | fix pack | 6.2.3-TIV-ITM-FP0002
        | release  | 6.3.0-TIV-ITM
    

Temporary fix

Comments

APAR Information

  • APAR number

    IV18119

  • Reported component name

    ITM AGENT UNIX

  • Reported component ID

    5724C040U

  • Reported release

    623

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-03-23

  • Closed date

    2012-05-18

  • Last modified date

    2013-01-04

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    ITM AGENT UNIX

  • Fixed component ID

    5724C040U

Applicable component levels

  • R623 PSY

       UP



Document information


More support for:

Tivoli OMEGAMON XE for Distributed Systems

Software version:

623

Reference #:

IV18119

Modified date:

2013-01-04
