IBM Support

IV56153: DEADLOCK IN JFS2 INTERNAL SNAPSHOT CODE CAN CAUSE HANG APPLIES TO AIX 6100-09

A fix is available

APAR status

  • Closed as program error.

Error description

  • A deadlock can occur when working with internal JFS2
    snapshots. As a result, commands on the associated JFS2
    filesystem can hang.
    The kernel stacks of both threads involved in the deadlock
    contain the function siWriterReadSMap, with stacks similar to:
    
    (0)> f 190
    pvthread+00BE00 STACK:
    [000E47F8]e_block_thread+000298 ()
    [000E5368]e_sleep_thread+0000E8 (??, ??, ??)
    [002A76D8]bmAssign+000778 (??, ??, ??, ??, ??, ??)
    [002A6960]bmRead+0000A0 (??, ??, ??, ??, ??, ??)
    [00303BD4]xtSearch+0005D4 (??, ??, ??, ??, ??)
    [002C4F88]siWriterReadSMap+000348 (F10001023D4CA880, 0000000001ACC0A0, 0FFFFFFFF4FA7B48, 0000000000000000)
    [002C4720]siCOWLookupSMap+000060 (??, ??, ??, ??, ??)
    [002CCC70]siCOW+000270 (??, ??)
    [002BBB9C]j2PagerService+00051C (??)
    [002B7954]j2PagerThread+0001F4 (??)
    [00387774]threadentry+000094 (??, ??, ??, ??)
    [kdb_read_mem] no real storage @ FFFFFFFFFFF8D30
    
    and
    0> f 81
    [00574454]complex_lock_sleep_ppc+0001D4 (0000000000574454, 8000000000001032, 0000000088024024, 0FFFFFFFF42B7670 [??])
    [005763C0]lock_write_ppc+0001A0 (??)
    [002C5048]siWriterReadSMap+000408 (F10001023D4CA880, 00000000032C8BA0, 0FFFFFFFF42B7B48, 0000000000000000)
    [002C4720]siCOWLookupSMap+000060 (??, ??, ??, ??, ??)
    [002CCC70]siCOW+000270 (??, ??)
    [002BBB9C]j2PagerService+00051C (??)
    [002B7954]j2PagerThread+0001F4 (??)
    [00387774]threadentry+000094 (??, ??, ??, ??)
    [kdb_read_mem] no real storage @ FFFFFFFFFFF8D30
    
    The code changes in APAR IV29780 (abstract: "Deadlock hang in
    snapshot code doing chdir and remove operations") introduced
    the race condition that can lead to this deadlock.
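
    The stack signature described above can be checked mechanically. The following is a minimal Python sketch, not an IBM tool: it assumes the kdb stack output has been saved as text, and the helper names (frame_functions, matches_signature) are hypothetical. The sample frames are copied from the stacks above.

```python
import re

# Matches a kdb frame such as "[002C4F88]siWriterReadSMap+000348 (...)".
# Lines like "[kdb_read_mem] no real storage" do not match because the
# bracketed part must be a hexadecimal address.
FRAME_RE = re.compile(
    r"\[([0-9A-Fa-f]+)\]([A-Za-z_][A-Za-z0-9_]*)\+([0-9A-Fa-f]+)"
)

def frame_functions(stack_text):
    """Return the function names found in a kdb stack, top frame first."""
    return [m.group(2) for m in FRAME_RE.finditer(stack_text)]

def matches_signature(stack_text, function="siWriterReadSMap"):
    """True when the stack contains the function named in this APAR."""
    return function in frame_functions(stack_text)

# Excerpts of the two stacks shown in the error description.
STACK_190 = """
[000E5368]e_sleep_thread+0000E8 (??, ??, ??)
[002A76D8]bmAssign+000778 (??, ??, ??, ??, ??, ??)
[002C4F88]siWriterReadSMap+000348 (F10001023D4CA880,
0000000001ACC0A0,0FFFFFFFF4FA7B48, 0000000000000000)
[002C4720]siCOWLookupSMap+000060 (??, ??, ??, ??, ??)
[kdb_read_mem] no real storage @ FFFFFFFFFFF8D30
"""

STACK_81 = """
[005763C0]lock_write_ppc+0001A0 (??)
[002C5048]siWriterReadSMap+000408 (F10001023D4CA880,
00000000032C8BA0,
0FFFFFFFF42B7B48, 0000000000000000)
[002C4720]siCOWLookupSMap+000060 (??, ??, ??, ??, ??)
"""

if __name__ == "__main__":
    for name, stack in (("f 190", STACK_190), ("f 81", STACK_81)):
        print(name, matches_signature(stack))
```

    A hang is a candidate for this APAR only when both blocked threads match; a single matching stack is not sufficient evidence.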
    

Local fix

  • As a workaround, use external snapshots instead of internal
    snapshots.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:
    * Systems running the 6100-09 Technology Level with the
    * bos.mp64 fileset below the 6.1.9.15 level.
    ****************************************************************
    * PROBLEM DESCRIPTION:
    * A deadlock can occur when working with internal JFS2
    * snapshots. As a result, commands on the associated JFS2
    * filesystem can hang. The kernel stacks of both threads
    * involved contain the function siWriterReadSMap; see the
    * error description above for the full stacks.
    * The code changes in APAR IV29780 introduced the race
    * condition that can lead to this deadlock.
    ****************************************************************
    * RECOMMENDATION:
    * Install APAR IV56153.
    ****************************************************************
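
    The "USERS AFFECTED" condition can be expressed as a version comparison. The following is a minimal Python sketch under stated assumptions: the installed bos.mp64 level is supplied as a dotted string (for example, as read from `lslpp -L bos.mp64` output), and the function names are hypothetical.

```python
FIXED_LEVEL = "6.1.9.15"  # bos.mp64 level named under USERS AFFECTED

def parse_level(level):
    """Convert a dotted fileset level such as '6.1.9.15' to a tuple of ints."""
    return tuple(int(part) for part in level.split("."))

def is_affected(installed_level, fixed_level=FIXED_LEVEL):
    """True when bos.mp64 is on the 6.1.9 stream but below the fixed level.

    Levels on other technology levels (for example 6.1.8.x) return False
    here because they are covered by different APARs.
    """
    installed = parse_level(installed_level)
    fixed = parse_level(fixed_level)
    same_stream = installed[:3] == fixed[:3]
    return same_stream and installed < fixed

if __name__ == "__main__":
    for level in ("6.1.9.0", "6.1.9.15", "6.1.9.100"):
        print(level, is_affected(level))
```

    Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "6.1.9.100" as lower than "6.1.9.15".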
    

Problem conclusion

  • Change locking serialization
    

Temporary fix

Comments

  • 6100-07 - use AIX APAR IV58198
    6100-08 - use AIX APAR IV46121
    6100-09 - use AIX APAR IV56153
    7100-01 - use AIX APAR IV57482
    7100-02 - use AIX APAR IV56225
    7100-03 - use AIX APAR IV56142
    7100-04 - use AIX APAR IV56171
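
    The list above maps each technology level to its APAR. As a minimal Python sketch, the lookup could be written as follows; the function name is hypothetical, and the input is assumed to be a level string that begins with the release and technology level (for example "6100-09" or "6100-09-02-1412").

```python
# Technology level to APAR, copied from the Comments list above.
APAR_BY_LEVEL = {
    "6100-07": "IV58198",
    "6100-08": "IV46121",
    "6100-09": "IV56153",
    "7100-01": "IV57482",
    "7100-02": "IV56225",
    "7100-03": "IV56142",
    "7100-04": "IV56171",
}

def apar_for(oslevel):
    """Return the APAR for a level string, or None if the level is not listed.

    Only the first two dash-separated fields (release and technology
    level) are used; service pack and build fields are ignored.
    """
    key = "-".join(oslevel.strip().split("-")[:2])
    return APAR_BY_LEVEL.get(key)

if __name__ == "__main__":
    print(apar_for("6100-09-02-1412"))
```

    A result of None means the level is not in the list above, not that the system is unaffected.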
    

APAR Information

  • APAR number

    IV56153

  • Reported component name

    AIX 610 STD EDI

  • Reported component ID

    5765G6200

  • Reported release

    610

  • Status

    CLOSED PER

  • PE

    Yes

  • HIPER

    No

  • Submitted date

    2014-03-03

  • Closed date

    2014-03-03

  • Last modified date

    2016-05-10

  • APAR is sysrouted FROM one or more of the following:

    IV46121

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX 610 STD EDI

  • Fixed component ID

    5765G6200

Applicable component levels

  • R610 PSY U866370

       UP14/05/21 I 1000



Document information

More support for: AIX Standard Edition

Software version: 610

Operating system(s): AIX

Reference #: IV56153

Modified date: 10 May 2016