IBM Support

IC81233: IN HADR SUPERASYNC MODE IT IS POSSIBLE FOR STANDBY AGENTS TO APPEAR HUNG AND NOT RECEIVE ANY LOG RECORDS.

APAR status

  • Closed as program error.

Error description

  • In HADR superasync synchronization mode, the standby hangs
    indefinitely if the steps below are executed.
    
    Note: Because the standby is hung, any action on the standby,
    including takeover, also hangs.
    
    1. An HADR pair is set up in superasync mode with logfilsiz X.
    2. The HADR primary generates a few log files.
    3. The standby is deactivated.
    4. Transactions are run on the primary to advance the log
    extent number.
    5. The primary is deactivated.
    6. logfilsiz on the primary is changed to Y.
    7. START HADR ON DB hadrdb AS PRIMARY BY FORCE is issued.
    8. A few more log files are written with the new logfilsiz.
    9. The standby is activated.
    
    After step 9, the standby appears hung. This can be confirmed
    in the output of db2pd -hadr: if the log extent number for the
    standby does not advance, this hang may be the cause.
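
    The check described above can be sketched in Python. This is a
    minimal illustration, not an IBM tool: the function names are
    hypothetical, and the log file names are assumed to be copied by
    hand from successive db2pd -hadr samples (no db2pd output format is
    parsed here). It relies only on the DB2 log file naming convention
    SNNNNNNN.LOG.

```python
import re
from typing import Sequence

# DB2 log extents are named S followed by a 7-digit number, e.g. S0000718.LOG.
LOG_NAME = re.compile(r"^S(\d{7})\.LOG$")


def extent_number(log_file_name: str) -> int:
    """Return the extent number encoded in a DB2 log file name."""
    match = LOG_NAME.match(log_file_name.strip())
    if match is None:
        raise ValueError(f"not a DB2 log file name: {log_file_name!r}")
    return int(match.group(1))


def standby_looks_hung(primary_samples: Sequence[str],
                       standby_samples: Sequence[str]) -> bool:
    """Hypothetical helper: True if the primary advanced but the standby did not.

    Each argument is a time-ordered list of log file names observed for
    that side. A standby that stays on the same extent while the primary
    moves ahead is consistent with (but does not prove) the hang.
    """
    if len(primary_samples) < 2 or len(standby_samples) < 2:
        raise ValueError("need at least two samples per side")
    primary = [extent_number(name) for name in primary_samples]
    standby = [extent_number(name) for name in standby_samples]
    primary_advanced = primary[-1] > primary[0]
    standby_advanced = standby[-1] > standby[0]
    return primary_advanced and not standby_advanced
```

    A True result only indicates that the standby is not keeping up;
    the db2diag.log messages are still needed to attribute the stall to
    this APAR.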
    
    db2diag.log will contain messages similar to the following. Note
    that the next extent header the standby receives after the extent
    header for log 718 is for log 747 instead of 719. This
    out-of-order extent header, sent after the logfilsiz change,
    causes the hang on the standby.
    
    2012-01-05-04.46.03.978538-300 I391774E443        LEVEL: Info
    PID    : 18870                TID  : 140736989226752 PROC : db2sysc
    INSTANCE: db2inst1              NODE : 000
    EDUID  : 34                  EDUNAME: db2hadrs (HDRLNX)
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleXhdrMsg, probe:10175
    DATA #1 : <preformatted>
    Received HDR_MSG_XHDR, ExtNum 718, firstlsn 00036DEF8000 ExtSize 5000, PageCount 5000, state 0x18211
    
    2012-01-05-04.46.06.062622-300 I392218E448        LEVEL: Info
    PID    : 18870                TID  : 140736989226752 PROC : db2sysc
    INSTANCE: db2inst1              NODE : 000
    EDUID  : 34                  EDUNAME: db2hadrs (HDRLNX)
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleXhdrMsg, probe:10175
    DATA #1 : <preformatted>
    Received HDR_MSG_XHDRCLOSE, ExtNum 718, firstlsn 00036DEF8000 ExtSize 5000, PageCount 5000, state 0x18211
    
    2012-01-05-04.46.06.079451-300 I393037E443        LEVEL: Info
    PID    : 18870                TID  : 140736989226752 PROC : db2sysc
    INSTANCE: db2inst1              NODE : 000
    EDUID  : 34                  EDUNAME: db2hadrs (HDRLNX)
    FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrHandleXhdrMsg, probe:10175
    DATA #1 : <preformatted>
    Received HDR_MSG_XHDR, ExtNum 747, firstlsn 000391560000 ExtSize 6700, PageCount 6700, state 0x10001
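
    The out-of-order extent header shown in these messages can be
    searched for mechanically. The following Python sketch is an
    illustration only: the function name is hypothetical, and it assumes
    the "Received HDR_MSG_XHDR, ExtNum N" text appears on one line, as
    in the excerpt above. It reports every place where a received
    extent header does not follow its predecessor by exactly one.

```python
import re

# Matches extent-open messages only; the comma directly after XHDR
# excludes HDR_MSG_XHDRCLOSE messages.
XHDR_OPEN = re.compile(r"Received HDR_MSG_XHDR, ExtNum (\d+)")


def extent_header_gaps(diag_text: str) -> list:
    """Hypothetical helper: return (previous, current) pairs of extent
    numbers where the current header is not previous + 1."""
    extents = [int(number) for number in XHDR_OPEN.findall(diag_text)]
    return [(prev, cur)
            for prev, cur in zip(extents, extents[1:])
            if cur != prev + 1]


# Message text taken from the db2diag.log excerpt in this APAR.
sample = """
Received HDR_MSG_XHDR, ExtNum 718, firstlsn 00036DEF8000 ExtSize 5000, PageCount 5000, state 0x18211
Received HDR_MSG_XHDRCLOSE, ExtNum 718, firstlsn 00036DEF8000 ExtSize 5000, PageCount 5000, state 0x18211
Received HDR_MSG_XHDR, ExtNum 747, firstlsn 000391560000 ExtSize 6700, PageCount 6700, state 0x10001
"""

print(extent_header_gaps(sample))  # [(718, 747)]
```

    An empty result means no gap was found in the supplied text; it
    does not rule out the problem if the relevant messages were not
    logged or have already been rotated out.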
    

Local fix

  • A local workaround would be:
    1) Kill the instance on the standby server.
    2) Copy all the log files from the primary's active log path to
    the standby's active log path.
    3) Start the server on the standby side.
    4) Start HADR on the standby database.
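
    Step 2 of the workaround can be sketched in Python as follows.
    This is a hedged illustration, not an IBM-supplied procedure: the
    function name is hypothetical, and it assumes that the primary's
    active log path is reachable as a local directory on the standby
    host (for example, because the files were already transferred
    there). It should only be run after step 1, while the standby
    instance is down; steps 3 and 4 remain manual.

```python
import re
import shutil
from pathlib import Path

# DB2 log extents are named S followed by a 7-digit number, e.g. S0000718.LOG.
LOG_NAME = re.compile(r"^S\d{7}\.LOG$")


def copy_log_extents(primary_log_dir: str, standby_log_dir: str) -> list:
    """Hypothetical helper: copy every log extent file from one
    directory to another and return the copied file names in order."""
    source = Path(primary_log_dir)
    target = Path(standby_log_dir)
    copied = []
    for entry in sorted(source.iterdir()):
        # Skip anything that is not a log extent (control files, subdirectories).
        if entry.is_file() and LOG_NAME.match(entry.name):
            shutil.copy2(entry, target / entry.name)
            copied.append(entry.name)
    return copied
```

    Existing files with the same name in the target directory are
    overwritten, so the directory arguments should be checked carefully
    before the function is called.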
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * DB2 LUW HADR users using SUPERASYNC on all platforms         *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * Without the fix, customers could hit the problem described   *
    * in the error description                                     *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * We recommend upgrading to v97fp6                             *
    *                                                              *
    * A local workaround would be:                                 *
    * 1) Kill the instance on standby server                       *
    * 2) Copy all the logs from primary's active log path to       *
    * standby's active log path                                    *
    * 3) Start the server on standby side                          *
    * 4) Start HADR on standby database.                           *
    ****************************************************************
    

Problem conclusion

  • After DB2 Version 9.7 Fix Pack 6 (v97fp6) is applied, the problem
    described in the error description no longer occurs.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC81233

  • Reported component name

    DB2 FOR LUW

  • Reported component ID

    DB2FORLUW

  • Reported release

    970

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2012-02-06

  • Closed date

    2012-12-06

  • Last modified date

    2012-12-06

  • APAR is sysrouted FROM one or more of the following:

    IC80790

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DB2 FOR LUW

  • Fixed component ID

    DB2FORLUW

Applicable component levels

  • R970 PSN

       UP



Document information

More support for: DB2 for Linux, UNIX and Windows

Software version: 9.7

Reference #: IC81233

Modified date: 06 December 2012