IBM Support

IZ85671: ABEND AT FIND_STALE_FR+000258 APPLIES TO AIX 7100-00

A fix is available

APAR status

  • Closed as program error.

Error description

  • The NFS v4 server crashes when it performs cleanup following
    a failed file open. The stack trace looks like:
    KDB(0)> f
    pvthread+00AD00 STACK:
     0001BF00 abend_trap+000000 ()
     0486C558 find_stale_fr+000258 (??, ??)
     04867B4C fr_gc+0001EC (??, ??)
     048681D8 node_gc+0001D8 (??)
     048641BC wi_do_job+00011C (??)
     0486F230 sm4_cleanup+000170 (??, ??, ??)
     0033E9B0 procentry+000010 (??, ??, ??, ??)
    
    From the file record, you can get the p_fh and identify the
    thread that allocated it. That thread should still be in the
    middle of the open. Its stack trace looks like:
    KDB(0)> f 4786
    pvthread+12B200 STACK:
     0052BE60 slock+000480 (00000000000D3770, 8000000000001032 ??)
     00009558 .simple_lock+000058 ()
     04884460 sm4_cleanup_fr+0000C0 (??, ??)
     04882DAC end_open_no_or+00026C (??, ??, ??, ??, ??, ??, ??, ??)
     0488CCA8 sm4_end_open+000408 (??, ??)
     048AD6D0 open_nocreate+000450 (??, ??, ??, ??)
     048AFA8C rfs4_open+00026C (??)
     047F44CC rfs4_dispatch_i+00002C (??)
     047F4CB8 rfs4_dispatch+000558 (??, ??)
     04731388 svc_getreq+0008C8 (??)
     0033F4DC threadentry+00005C (??, ??, ??, ??)
    
    Thread pvthread+12B200 turned off all state on the fr but had
    not yet taken the fr off the hash list. The cleanup thread then
    came in, found that fr on the hash list, failed the RAS check,
    and abended the machine.
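
    The interleaving described above can be illustrated with a toy
    model. This is only a sketch: FileRecord, ConsistencyError and
    find_stale are hypothetical names, not the AIX kernel structures
    or routines, and the model assumes the cleanup pass treats any
    record that is still on the hash list but has no active state as
    a consistency violation.

```python
class FileRecord:
    """Hypothetical stand-in for the server's file record (fr)."""

    def __init__(self, name):
        self.name = name
        self.active = True  # models the state the opening thread turns off


class ConsistencyError(Exception):
    """Models the failed RAS check that abends the machine."""


def find_stale(hash_list):
    """Toy cleanup pass: every record on the hash list must still be active."""
    for fr in hash_list:
        if not fr.active:
            raise ConsistencyError(f"{fr.name}: inactive record on hash list")
    return [fr.name for fr in hash_list]


# The record is linked on the hash list while the open is in progress.
fr = FileRecord("fr0")
hash_list = [fr]

# The open fails: the opening thread turns off state first and has not
# yet unlinked the record from the hash list.
fr.active = False

# The cleanup thread runs in that window and trips the check.
try:
    find_stale(hash_list)
    crashed = False
except ConsistencyError:
    crashed = True
```

    In this model the check fails only because the cleanup pass runs
    between the two teardown steps of the failed open.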
    

Local fix

Problem summary

  • The NFS v4 server crashes in find_stale_fr when it performs
    cleanup following a failed file open. The thread handling the
    failed open turns off all state on the fr before taking the fr
    off the hash list, so the cleanup thread can find the fr on the
    hash list, fail the RAS check, and abend the machine. The KDB
    stack traces for both threads are listed under Error description.
    

Problem conclusion

  • Do not clear the flag until the fr has been taken off the hash list.
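
    The ordering stated in this conclusion can be sketched as
    follows. This is a minimal illustration, not the actual fix:
    cleanup_check, teardown_fixed and the interleave hook are
    hypothetical, and the record is modelled as a plain dictionary.
    The hook stands in for a cleanup pass that happens to run
    between the two teardown steps.

```python
def cleanup_check(hash_list):
    """Toy cleanup pass: fails if a listed record has lost its flag."""
    for fr in hash_list:
        if not fr["flag"]:
            raise RuntimeError(f"{fr['name']}: on hash list without its flag")


def teardown_fixed(fr, hash_list, interleave=lambda: None):
    """Failed-open teardown in the order the conclusion describes."""
    hash_list.remove(fr)  # 1. take the record off the hash list first
    interleave()          # a cleanup pass may run at this point
    fr["flag"] = False    # 2. only then clear the flag


fr = {"name": "fr0", "flag": True}
hash_list = [fr]

# Run a cleanup pass in the window between the two steps; it no longer
# sees the record, so the check cannot fail on it.
teardown_fixed(fr, hash_list, interleave=lambda: cleanup_check(hash_list))
```

    With this order, a cleanup pass that walks the hash list never
    encounters a record whose flag has already been cleared.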
    

Temporary fix

Comments

  • 6100-04 - use AIX APAR IZ87676
    6100-05 - use AIX APAR IZ87632
    6100-06 - use AIX APAR IZ85308
    7100-00 - use AIX APAR IZ85671
    

APAR Information

  • APAR number

    IZ85671

  • Reported component name

    AIX V7.1

  • Reported component ID

    5765H4000

  • Reported release

    710

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2010-09-24

  • Closed date

    2010-09-24

  • Last modified date

    2013-04-17

  • APAR is sysrouted FROM one or more of the following:

    IZ85172

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    AIX V7.1

  • Fixed component ID

    5765H4000

Applicable component levels

  • R710 PSY U832951

       UP10/11/29 I 1000

PTF to Fileset Mapping

Document Information

Modified date:
17 April 2013