IBM Support

PM80341: Containers do not retry failed connections to catalog servers, so those container servers never become eligible for placement.

APAR status

  • Closed as program error.

Error description

  • Containers send one-way ORB requests to the catalog server to
    become eligible for placement.  Because the communication is
    one-way rather than two-way, a request can be lost without
    generating any failure.
    
    As a result, the containers appear to start (the CWOBJ1001I
    message can be seen), but placement of shards on those
    containers (CWOBJ1511I message) does not occur.  The container
    shows no failures and continues to run.  However, the container
    server is not an active part of the grid.
    
    In addition, the container is not displayed in the
    showPlacement output, and the count reported as "Total known
    containers" in the showPlacement output is lower than the
    number of containers that actually started.
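
The symptom described above amounts to a difference between two sets: the containers you started and the containers the catalog service knows about. The following is a minimal sketch of that comparison, not part of the product. The class name, method name, and container names are hypothetical, and it assumes you have already collected both lists yourself (for example, from your start scripts and from the showPlacement output).

```java
import java.util.Set;
import java.util.TreeSet;

public class PlacementCheck {

    /**
     * Returns the containers that were started but are absent from the
     * set of containers known to the catalog service (hypothetical helper).
     */
    public static Set<String> missingFromPlacement(Set<String> started, Set<String> known) {
        Set<String> missing = new TreeSet<>(started); // sorted copy, inputs are not modified
        missing.removeAll(known);
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative names only; substitute the names of your own container servers.
        Set<String> started = Set.of("container0", "container1", "container2");
        Set<String> known = Set.of("container0", "container2");
        System.out.println(missingFromPlacement(started, known)); // prints [container1]
    }
}
```

A non-empty result suggests that at least one container started without becoming eligible for placement, which matches the behavior this APAR describes.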
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  WebSphere eXtreme Scale users               *
    ****************************************************************
    * PROBLEM DESCRIPTION: No tolerance for connectivity issues    *
    *                      when containers attempt to tell the     *
    *                      catalog service to prepare for shard    *
    *                      placement.                              *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    Intermittent connectivity issues might occur while a container
    attempts to contact the catalog service to prepare for shard
    placement. Previously, if a communication error was reported,
    the container did not try to contact the catalog service again.
    

Problem conclusion

  • WebSphere eXtreme Scale container servers now retry
    communication with the catalog service when preparing to
    become eligible for shard placement.
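
The APAR does not describe how the retry is implemented. As a general illustration only, a bounded retry loop around an acknowledged request might look like the sketch below. The class, method, and parameter names are hypothetical, and the simulated catalog service stands in for a real call; none of this is the WebSphere eXtreme Scale API.

```java
import java.util.function.BooleanSupplier;

public class RegistrationRetry {

    /**
     * Invokes the attempt until it reports success or maxAttempts is reached,
     * pausing delayMillis between attempts.
     * Returns the number of attempts used, or -1 if no attempt succeeded.
     */
    public static int retryUntilAcknowledged(BooleanSupplier attempt, int maxAttempts, long delayMillis) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i; // request was acknowledged
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status and give up
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated catalog service: the first two requests are lost, the third is acknowledged.
        int used = retryUntilAcknowledged(() -> ++calls[0] >= 3, 5, 100L);
        System.out.println("acknowledged after " + used + " attempt(s)"); // prints "acknowledged after 3 attempt(s)"
    }
}
```

The key design point is that the caller needs some form of acknowledgement to decide whether to retry, which a purely one-way request cannot provide.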
    

Temporary fix

Comments

APAR Information

  • APAR number

    PM80341

  • Reported component name

    WS EXTREME SCAL

  • Reported component ID

    5724X6702

  • Reported release

    850

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-01-10

  • Closed date

    2013-01-29

  • Last modified date

    2013-01-29

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WS EXTREME SCAL

  • Fixed component ID

    5724X6702

Applicable component levels

  • R850 PSY

       UP

  • R860 PSY

       UP

Document Information

Modified date:
29 January 2013