IBM Support

PK86823: APPLICATION SERVER NOT AUTOMATICALLY RESTARTED

APAR status

  • Closed as program error.

Error description

  • A customer noticed that application servers are not restarted
    automatically by the monitoring policy when they are shut down
    because the nodeagent cannot reach them.
    
    It was found that when the network is slow, the notification
    listener can be dropped.
    
    This APAR keeps the notification listener from being dropped
    even if it is slow to respond.
    

Local fix

  • The following JVM parameters can be used to try to improve the
    performance of the getLocalHost call:
    
    -Djava.net.preferIPv4Stack=true -Dcom.ibm.cacheLocalHost=true
    -DODC.BBEnabled=false
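
When adding these parameters to a JVM that already has generic arguments, it helps to avoid duplicate or conflicting -D entries. The following is a minimal sketch in Python, not an IBM tool: the function name merge_jvm_args is hypothetical, and it assumes that no existing argument contains embedded spaces.

```python
# Flags quoted from the "Local fix" text above.
LOCAL_FIX_FLAGS = [
    "-Djava.net.preferIPv4Stack=true",
    "-Dcom.ibm.cacheLocalHost=true",
    "-DODC.BBEnabled=false",
]


def merge_jvm_args(existing, flags=LOCAL_FIX_FLAGS):
    """Return `existing` with each flag appended, replacing any earlier
    setting of the same -D property.

    Assumption: arguments are separated by whitespace and none of them
    contains embedded spaces.
    """
    tokens = existing.split()
    for flag in flags:
        key = flag.split("=", 1)[0]  # e.g. "-Dcom.ibm.cacheLocalHost"
        # Drop any earlier value of the same property, then add the new one.
        tokens = [t for t in tokens if t.split("=", 1)[0] != key]
        tokens.append(flag)
    return " ".join(tokens)
```

For example, merge_jvm_args("-Xmx512m -Dcom.ibm.cacheLocalHost=false") keeps -Xmx512m, replaces the existing cacheLocalHost setting, and appends the other two flags.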
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of IBM WebSphere Application      *
    *                  Server Network Deployment edition v6.1      *
    ****************************************************************
    * PROBLEM DESCRIPTION: In some circumstances, WebSphere        *
    *                      Application Servers do not restart      *
    *                      automatically when a failure occurs.    *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    The nodeagent tries to send notifications to all of its
    registered listeners. If, due to network delays, the discovery
    listener doesn't respond quickly enough, then the nodeagent
    removes that listener and does not attempt to contact it
    anymore. As a result, the nodeagent does not start the process
    that watches the application servers and restarts them in case
    of failure. If the application servers subsequently crash, they
    are not restarted.
    
    When a discovery listener is removed, you may see a message
    like the following in the logs:
    
    [06/21/09 09:35:32:127 EST] 0000020c NotificationD E
    ADME0005E: The following notification listener was removed
    because it did not handle a notification within
    websphere.discovery.process.foundms: 300000
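
To check whether a listener has been removed, the logs can be searched for the ADME0005E message ID shown above. The following is a minimal Python sketch under stated assumptions: the function name find_listener_removals is hypothetical, the log text is assumed to be already read into a string, and each entry is assumed to begin with a bracketed timestamp like the one in the sample, with any wrapped lines belonging to the preceding entry.

```python
import re

# Message ID quoted from the sample log entry above.
LISTENER_REMOVED_ID = "ADME0005E"

# Assumed entry prefix, e.g. "[06/21/09 09:35:32:127 EST]".
ENTRY_START = re.compile(
    r"^\[\d{1,2}/\d{1,2}/\d{2,4} \d{1,2}:\d{2}:\d{2}:\d{3} \w+\]"
)


def find_listener_removals(log_text):
    """Return the log entries (wrapped lines joined) that contain ADME0005E."""
    entries = []
    current = None
    for line in log_text.splitlines():
        if ENTRY_START.match(line):
            # A new timestamped entry starts; store the previous one.
            if current is not None:
                entries.append(current)
            current = line.strip()
        elif current is not None and line.strip():
            # Continuation line: append it to the current entry.
            current += " " + line.strip()
    if current is not None:
        entries.append(current)
    return [e for e in entries if LISTENER_REMOVED_ID in e]
```

Each returned string is one complete entry, so the timestamp of the removal can be read directly from its prefix.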
    

Problem conclusion

  • The code was modified so that the discovery listener is never
    dropped, no matter how long it takes to respond. This ensures
    that the nodeagent always monitors the application servers
    so they can be restarted in case of failure.
    
    Note that a similar change was already made for version 7.0 as
    part of APAR PK72749.
    
    The fix for this APAR is currently targeted for inclusion in
    fix pack 6.1.0.29.  Please refer to the Recommended Updates
    page for delivery information:
    http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
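
The difference between the old and new behavior can be illustrated with a toy model. This Python sketch is not IBM's implementation: the class NotificationDispatcher, the never_drop parameter, and the timeout value are all hypothetical, and each listener simply reports a simulated response time in milliseconds instead of being timed.

```python
class NotificationDispatcher:
    """Toy model of a dispatcher that drops listeners that respond slowly."""

    def __init__(self, timeout_ms, never_drop=()):
        self.timeout_ms = timeout_ms
        # Listener names exempt from removal (models the post-fix behavior).
        self.never_drop = set(never_drop)
        # name -> callable(event) returning simulated elapsed milliseconds
        self.listeners = {}

    def register(self, name, handler):
        self.listeners[name] = handler

    def notify(self, event):
        """Deliver `event` to every listener and return the names dropped."""
        dropped = []
        for name, handler in list(self.listeners.items()):
            elapsed_ms = handler(event)
            if elapsed_ms > self.timeout_ms and name not in self.never_drop:
                del self.listeners[name]
                dropped.append(name)
        return dropped


# Before the fix: a slow discovery listener is removed and never contacted again.
before = NotificationDispatcher(timeout_ms=1000)
before.register("discovery", lambda event: 5000)
dropped_before = before.notify("process.found")

# After the fix: the discovery listener is exempt, however slow it is.
after = NotificationDispatcher(timeout_ms=1000, never_drop={"discovery"})
after.register("discovery", lambda event: 5000)
dropped_after = after.notify("process.found")
```

In the first case the discovery listener is missing from before.listeners after one slow response; in the second it stays registered and keeps receiving notifications.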
    

Temporary fix

  • A temporary fix was made available on 21 May 2009; awaiting a
    response from the customer.
    

Comments

APAR Information

  • APAR number

    PK86823

  • Reported component name

    WEBS APP SERV N

  • Reported component ID

    5724H8800

  • Reported release

    61A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-05-18

  • Closed date

    2009-08-17

  • Last modified date

    2009-08-17

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WEBS APP SERV N

  • Fixed component ID

    5724H8800

Applicable component levels

  • R61A PSY

       UP

  • R61H PSY

       UP

  • R61I PSY

       UP

  • R61P PSY

       UP

  • R61S PSY

       UP

  • R61W PSY

       UP

  • R61Z PSY

       UP

Document Information

Modified date:
28 December 2021