IBM Support

IC89281: LOGICAL RECOVERY ROLLFORWARD ERRORS ON SDS NODES WHEN PRIMARY NODE IS PERFORMING LRU (FOREGROUND POSSIBLY) WRITES

APAR status

  • Closed as program error.

Error description

  • The test case submitted for this problem showed two types of
    recovery failure on the SDS node: one fatal (it aborts the SDS
    node) and one non-fatal (it only marks an index bad on the SDS
    node).  The test is purposely configured for very aggressive
    LRU writing, using LRU_MIN_DIRTY/LRU_MAX_DIRTY together with a
    very small buffer pool and high write activity on the primary.
    The goal is to stress the SDS code that prevents the primary of
    an SDS cluster from flushing a modified buffer before the log
    position on the SDS node has advanced past the point where that
    page was modified.  If that mechanism fails, the SDS node can
    see the already-modified version of the page, so its attempt to
    apply the corresponding log record fails.  Depending on the
    type of log record, the failure either crashes the node or
    marks an index bad.
    
    Here are the details of the assertion failures (AFs) seen while
    running this test:
    
    1) marks index bad
    
    14:25:42  Assert Warning: Error during recovery left index
    inconsistent.
    14:25:42   Who: Session(19, informix@machine, 0,
    7000000103c16c8)
                    Thread(40, xchg_1.1, 7000000103875d8, 1)
                    File: rskey.c Line: 1645
    14:25:42   Results: Index 'stores7:"informix".customer# 100_1'
    is now unusable
    14:25:42   Action: Run 'oncheck -cI stores7:"informix".customer#
    100_1'
    14:25:42  Raw hex dump of stack located in
    /spare2/jrenaut/dumps/af.41039c5.rawstk
    14:25:42  Stack for thread: 40 xchg_1.1
    
     base: 0x0700000011a51000
      len:   69632
       pc: 0x000000010005bd48
      tos: 0x0700000011a60370
    state: running
       vp: 1
    
    (oninit)afstack
    (oninit)afhandler
    (oninit)kybad
    (oninit)dodelitem
    (oninit)plogredo
    (oninit)rlogm_redo
    (oninit)next_recvr
    (oninit)prod_loop2
    (oninit)producer_thread
    (oninit)startup
    
    14:25:42  dodelitem: Node 0xe8, match 1
    14:25:42  Could not delete item, rowid 0x69311, key:
    KEY:71632:
    14:25:42  Node
    Node 0xe8  Prev 0xe7  Next 0xe9
    KEY:71631:
    
    
    Rowids:    69310*
    
    2) HINSERT failure aborts SDS node
    
    07:37:58  Assert Failed: Logical log replay error.
    07:37:58   Who: Session(19, informix@machine, 0,
    7000000103c35b8)
                    Thread(47, xchg_1.7, 70000001038a728, 5)
                    File: rsprecvr.c Line: 7116
    07:37:58   Results: The secondary server cannot continue.
    07:37:58   Action: Reestablish the secondary server.
    07:37:58  Raw hex dump of stack located in
    /spare2/dumps/af.41788b1.rawstk
    07:37:58  Stack for thread: 47 xchg_1.7
    
     base: 0x0700000011b24000
      len:   69632
       pc: 0x000000010005bd48
      tos: 0x0700000011b33e30
    state: running
       vp: 5
    
    (oninit)afstack
    (oninit)afhandler
    (oninit)rollfwd_error
    (oninit)rlogm_redo
    (oninit)next_recvr
    (oninit)prod_loop2
    (oninit)producer_thread
    (oninit)startup
    
    07:37:54  Log record (OLDRSAM:HINSERT) failed, partnum 0x1002a9
    rowid 0x11d iserrno 126
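
    The flush-ordering rule described at the start of this error
    description can be illustrated with a small model.  The sketch
    below is not Informix code: the class names, the integer log
    positions and the ack() call are all hypothetical, chosen only
    to show the invariant that a dirty page must stay in the
    primary's buffer pool until every SDS node has replayed the log
    past that page's last modification.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical buffer-pool page on the primary."""
    page_id: int
    last_mod_lsn: int = 0   # log position of the most recent change
    dirty: bool = False


class Primary:
    """Toy model of a primary that gates LRU flushes on SDS replay progress."""

    def __init__(self, sds_names):
        self.next_lsn = 1
        self.pages = {}
        # Highest log position each SDS node has reported as applied.
        self.sds_applied = {name: 0 for name in sds_names}

    def modify(self, page_id):
        """Log a change to a page and mark the buffer dirty."""
        page = self.pages.setdefault(page_id, Page(page_id))
        page.last_mod_lsn = self.next_lsn
        page.dirty = True
        self.next_lsn += 1
        return page.last_mod_lsn

    def ack(self, name, lsn):
        """Record that an SDS node has replayed the log up to lsn."""
        self.sds_applied[name] = max(self.sds_applied[name], lsn)

    def can_flush(self, page_id):
        """A dirty page is flushable only once the slowest SDS node has
        replayed the log record that last modified it."""
        page = self.pages[page_id]
        slowest = min(self.sds_applied.values(), default=page.last_mod_lsn)
        return page.dirty and slowest >= page.last_mod_lsn

    def lru_flush(self):
        """Flush every page that passes the gate; return the flushed ids."""
        flushed = []
        for page in self.pages.values():
            if self.can_flush(page.page_id):
                page.dirty = False
                flushed.append(page.page_id)
        return flushed


if __name__ == "__main__":
    primary = Primary(["sds1", "sds2"])
    lsn = primary.modify(0xE8)
    print(primary.lru_flush())   # page is held back: no node has replayed yet
    primary.ack("sds1", lsn)
    primary.ack("sds2", lsn)
    print(primary.lru_flush())   # both nodes are past the change: page is flushed
```

    In this model, the failures reported above correspond to a
    flush that bypasses can_flush(): the SDS node would then read a
    page that already contains the change and fail when replaying
    the matching log record.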
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All users                                                    *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See Error Description                                        *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Update to IDS-11.50.xC10                                     *
    ****************************************************************
    

Problem conclusion

  • Problem Fixed In IDS-11.50.xC10
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC89281

  • Reported component name

    INFORMIX SERVER

  • Reported component ID

    5725A3900

  • Reported release

    B50

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2012-12-26

  • Closed date

    2017-06-15

  • Last modified date

    2017-06-15

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    INFORMIX SERVER

  • Fixed component ID

    5725A3900

Applicable component levels

  • RA10 PSN

       UP

  • RA10 PSY

       UP

  • RB10 PSN

       UP

  • RB10 PSY

       UP

  • RB50 PSN

       UP

  • RB50 PSY

       UP

  • RB70 PSN

       UP

  • RB70 PSY

       UP



Document information

More support for: Informix Servers

Software version: B50

Reference #: IC89281

Modified date: 15 June 2017

