IBM Support

OA27965: AGENTS NEVER COME BACK ONLINE AFTER MULTIPLE RTEMS FAILOVERS.

A fix is available

APAR status

  • Closed as program error.

Error description

  • Approver:  MS
    
    Description:
    
    When the RTEMS reconnects, it retransmits its entire Node Status
    to the Hub.
    If the RTEMS does not yet know that the agent is off-line, the
    RTEMS will retransmit that the agent is still on-line.
    If that agent has already connected to another TEMS in the
    meantime, though, the on-line status with the new TEMS will be
    overwritten with an on-line status at the old TEMS.
    Because the Hub thinks that the agent has switched back to the
    old TEMS, when the agent's status at the old TEMS finally
    changes to off-line and the status is transmitted to the Hub,
    the off-line status will be honoured.
    
    The problem occurs only when all of the following hold: a) the
    RTEMS loses its connection to the Hub, b) an agent switches to
    another TEMS around the same time, c) the status of the agent
    remains on-line at the original TEMS after the switch, and d)
    that status is retransmitted before the agent goes off-line at
    the original TEMS.
    

Local fix

  • No workaround available.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All TEMS users.                              *
    ****************************************************************
    * PROBLEM DESCRIPTION: An Agent can appear offline after it    *
    *                      has switched to another RTEMS.          *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    If an RTEMS loses its connection with the HTEMS, it will
    rebroadcast its entire node status table to the HTEMS after
    the connection is reestablished.  If an agent has switched
    away from the RTEMS before the rebroadcast, there is a chance
    the rebroadcast will cause the HTEMS to believe the agent has
    switched back to the original RTEMS.  When the agent does not
    send a heartbeat to the RTEMS, the RTEMS then marks the agent
    as offline.  The agent actually remains online, but connected
    to the TEMS it switched to.
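
The sequence above can be illustrated with a small model. This is only a sketch under stated assumptions: the class NaiveHub, the method receive, and the names RTEMS_A, RTEMS_B and agent1 are hypothetical and are not taken from the product code. The rule that an off-line report is honoured only from the TEMS the hub believes the agent is connected to is inferred from the error description, not confirmed by it.

```python
# Hypothetical model of the pre-fix behaviour; names are illustrative only.
class NaiveHub:
    """Hub that applies node status updates in arrival order."""

    def __init__(self):
        # agent name -> (tems name, "online" or "offline")
        self.node_status = {}

    def receive(self, tems, agent, status):
        current = self.node_status.get(agent)
        # Assumption: an offline report is honoured only when it comes from
        # the TEMS the hub currently associates with the agent.
        if status == "offline" and current is not None and current[0] != tems:
            return
        self.node_status[agent] = (tems, status)


hub = NaiveHub()
hub.receive("RTEMS_A", "agent1", "online")   # agent1 starts on RTEMS_A
# RTEMS_A loses its hub connection; agent1 switches to RTEMS_B.
hub.receive("RTEMS_B", "agent1", "online")
# RTEMS_A reconnects and rebroadcasts its stale node status table.
hub.receive("RTEMS_A", "agent1", "online")
# agent1 sends no heartbeat to RTEMS_A, so RTEMS_A reports it offline.
hub.receive("RTEMS_A", "agent1", "offline")

final_status = hub.node_status["agent1"]
print(final_status)
```

In this model the hub ends up recording agent1 as offline at RTEMS_A, even though the agent is still online at RTEMS_B.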
    

Problem conclusion

  • The code was modified to allow the HTEMS to know which TEMS
    an agent last sent a heartbeat to.  This allows the HTEMS to
    more accurately determine where an agent is actually
    connected. The HTEMS will ignore online status requests for
    an agent when the online status is older than the latest one
    received.
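
The idea behind the fix can be sketched as a comparison of heartbeat times. This is an illustration, not the actual implementation: the class Hub, the method receive, the integer heartbeat values and the names RTEMS_A, RTEMS_B and agent1 are all hypothetical, and the real data format used between the RTEMS and the HTEMS is not described in this APAR.

```python
# Hypothetical model of the post-fix behaviour; names are illustrative only.
class Hub:
    """Hub that remembers the latest heartbeat time seen for each agent."""

    def __init__(self):
        # agent name -> {"tems": str, "status": str, "heartbeat": int}
        self.node_status = {}

    def receive(self, tems, agent, status, heartbeat):
        """Apply a status report; return False if the report was ignored."""
        current = self.node_status.get(agent)
        if current is not None:
            # Ignore an online report that is older than the latest one.
            if status == "online" and heartbeat < current["heartbeat"]:
                return False
            # Assumption: an offline report from a TEMS the agent is not
            # currently associated with is not honoured.
            if status == "offline" and tems != current["tems"]:
                return False
        self.node_status[agent] = {
            "tems": tems,
            "status": status,
            "heartbeat": heartbeat,
        }
        return True


hub = Hub()
hub.receive("RTEMS_A", "agent1", "online", heartbeat=100)
hub.receive("RTEMS_B", "agent1", "online", heartbeat=200)  # agent switched
# RTEMS_A reconnects and rebroadcasts its older online status.
stale = hub.receive("RTEMS_A", "agent1", "online", heartbeat=100)
# RTEMS_A later reports agent1 offline after missing its heartbeat.
late_offline = hub.receive("RTEMS_A", "agent1", "offline", heartbeat=100)

final_status = hub.node_status["agent1"]
print(stale, late_offline, final_status["tems"], final_status["status"])
```

Here both reports from RTEMS_A are ignored, so the hub keeps agent1 as online at RTEMS_B.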
    

Temporary fix

Comments

APAR Information

  • APAR number

    OA27965

  • Reported component name

    MGMT SERVER DS

  • Reported component ID

    5608A2800

  • Reported release

    620

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-02-15

  • Closed date

    2009-02-25

  • Last modified date

    2009-04-01

  • APAR is sysrouted FROM one or more of the following:

    IZ35329

  • APAR is sysrouted TO one or more of the following:

Modules/Macros

  • KFAAUTOX KFACOM   KFAOMTEC KFAPRB   KFAXCF
    KGL      KGLBASE  KGLCRTST KGLOPCRY KGL01P1  KGL01P2  KSMOMS
    

Fix information

  • Fixed component name

    MGMT SERVER DS

  • Fixed component ID

    5608A2800

Applicable component levels

  • R620 PSY UA46076

       UP09/03/03 P F903

Fix is available

  • Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.

Document Information

Modified date:
01 April 2009