IC68543: BACKUPSTGPOOL CAN DEADLOCK ON VOLUME WRITE ERROR.

APAR status

  • Closed as program error.

Error description

  • Client backup sessions start after a BACKUP STGPOOL process has
    begun. During the BACKUP STGPOOL process, a volume write error is
    encountered. The function that the BACKUP STGPOOL thread calls to
    handle the volume write error requests a lock on the volume root.
    This lock request is incompatible with the locking done by the
    client sessions, so the BACKUP STGPOOL thread deadlocks. The hang
    clears once the RESOURCETIMEOUT interval is reached. This is a
    timing window issue while BACKUP STGPOOL is processing.
    Customer/L2 Diagnostics:
    Show Lock
    LockDesc: Type=36001(as volume root), NameSpace=0,
    SummMode=ixLock, Key=''
      Holder: (asutil.c:2029 Thread 137011) Tsn=0:57577335,
    Mode=ixLock   <----- This Tsn and the last waiter below have the
    same thread
      Waiter: (asutil.c:2029 Thread 137058) Tsn=0:57595884,
    Mode=sLock
      Waiter: (asutil.c:2029 Thread 137166) Tsn=0:57806140,
    Mode=isLock
      Waiter: (asutil.c:2029 Thread 137182) Tsn=0:57828868,
    Mode=sLock
      Waiter: (asutil.c:2029 Thread 137197) Tsn=0:57852297,
    Mode=sixLock <---- incompatible lock
      Waiter: (asutil.c:2029 Thread 124158) Tsn=0:57931295,
    Mode=ixLock   <---- waiter has same thread as holder Tsn
    Show TXN
    Tsn=0:57577335, Resurrected=False, InFlight=False,
    Distributed=False,
    Persistent=True, Addr 1b699638
      ThreadId=124158, Timestamp=04/29/10 09:22:01,
    Creator=afcputil.c(4724)
      Participants=3, summaryVote=ReadOnly
      EndInFlight True, endThreadId 124158, tmidx 0 0,
    processBatchCount 0,
    mustAbort False.
        Participant DB: voteReceived=False, ackReceived=False
          DB: Txn 112b87898, ReadOnly(NO), connP=1163bae78,
    applHandle=1007,
    ...
    Thread 124158, Parent 124153: AfBackupPoolThread, Storage
    1734402, AllocCnt
    45740 HighWaterAmt 1740466
     tid=7fe, ptid=3cf9, det=1, zomb=0, join=0, result=0, sess=0
      Awaiting cond waitP->waiting (0x150ef30c0), using mutex
    TMV->mutex
    (0x110a06678), at tmlock.c(721)
      Stack trace:
        0x0900000000293a1c _cond_wait_global
        0x0900000000294514 _cond_wait
        0x0900000000294fbc pthread_cond_wait
        0x0000000100007c74 pkWaitConditionTracked
        0x00000001000c3f64 tmLockTracked
        0x00000001003c0658 AsLockVolRootTracked  <----- waiter
    requesting ixLock (Thread 124158, Tsn=0:57931295)
        0x00000001003f2e9c AsSetVolWriteError <------ write error on
    volume invokes AsSetVolWriteError, which requests the ixLock
        0x00000001003d665c FlushVolume
        0x00000001003e1a10 AsPrepareOutput
        0x00000001003c9400 AsPrepareTxn
        0x000000010022e6c0 ssPrepareTxn
        0x00000001000bd62c CollectVotes
        0x00000001000bcbcc tmEndX
        0x000000010065ce6c CopyBatch
        0x0000000100656730 CopyVolume
        0x0000000100655b30 AfCopyVolume
        0x000000010042373c AfBackupPoolThread
        0x0000000100009b28 StartThread
    slot -> 31:  (last waiter above)
    Tsn=0:57931295, Resurrected=False, InFlight=True,
    Distributed=False,
    Persistent=True, Addr 1b670ed8
      ThreadId=124158, Timestamp=04/29/10 10:22:43,
    Creator=asvol.c(2110)
      Participants=1, summaryVote=ReadOnly
      EndInFlight False, endThreadId 124158, tmidx 0 0,
    processBatchCount 0,
    mustAbort False.
        Participant DB: voteReceived=False, ackReceived=False
          DB: Txn 113f10278, ReadOnly(YES), connP=113b6fb78,
    applHandle=1016,
    openTbls=1:
          DB: --> OpenP=116a034d8 for table=NA.Path.
    *NOTE*
    None
    Platforms affected:
    TSM 5.4, 5.5, 6.1, 6.2 Unix/Linux/Windows
    
    Initial Impact: Medium
    Additional Keywords: hang hung deadlock ZZ61
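
The lock queue shown in the Show Lock output above can be modelled with a short sketch. It assumes the server's lock manager uses the standard multi-granularity compatibility rules for isLock, ixLock, sLock, sixLock and xLock, and that waiters are granted strictly in arrival order; neither assumption is stated in this APAR. The names COMPATIBLE, compatible and grantable are hypothetical and are not server functions.

```python
# Standard multi-granularity lock compatibility table.
# ASSUMPTION: the TSM lock manager follows these textbook rules;
# the APAR does not document its actual compatibility matrix.
COMPATIBLE = {
    "isLock":  {"isLock", "ixLock", "sLock", "sixLock"},
    "ixLock":  {"isLock", "ixLock"},
    "sLock":   {"isLock", "sLock"},
    "sixLock": {"isLock"},
    "xLock":   set(),
}


def compatible(requested, held):
    """True if a lock in mode `requested` can coexist with mode `held`."""
    return held in COMPATIBLE[requested]


def grantable(holders, waiters):
    """Return the queue positions that could be granted right now.

    ASSUMPTION: strict first-in, first-out granting, so nothing queued
    behind a blocked waiter is granted even if it is compatible.
    """
    granted = list(holders)
    positions = []
    for pos, (tsn, mode) in enumerate(waiters):
        if all(compatible(mode, held) for _, held in granted):
            granted.append((tsn, mode))
            positions.append(pos)
        else:
            break  # head of the queue is blocked; stop here
    return positions


# Holder and waiters copied from the Show Lock output in this APAR.
holders = [("0:57577335", "ixLock")]
waiters = [
    ("0:57595884", "sLock"),
    ("0:57806140", "isLock"),
    ("0:57828868", "sLock"),
    ("0:57852297", "sixLock"),
    ("0:57931295", "ixLock"),  # same thread (124158) as the holder Tsn
]

blocked_all = grantable(holders, waiters) == []
```

Under these assumptions no waiter can be granted: the first client sLock request conflicts with the held ixLock, and the final ixLock request, although compatible with the holder, is queued behind it. Because the holding transaction and the last waiter belong to the same thread, the holder cannot finish until the waiter is granted, which matches the hang described above.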
    

Local fix

  • 1) Determine and correct the cause of the volume write errors.
    2) Do not run client backup sessions while the BACKUP STGPOOL
    process is running.
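
Before applying the local fix, it may help to confirm that a hang matches the signature in the error description: a lock holder and a lock waiter whose transactions belong to the same thread. The following sketch cross-references saved Show Lock and Show TXN text to find such pairs. It assumes the output has the same layout as the excerpt in this APAR; the function name find_self_waits and the regular expressions are illustrative, not part of the product.

```python
import re

# Patterns written against the layout shown in this APAR (an assumption;
# other server levels may format the output differently).
LOCK_RE = re.compile(r"(Holder|Waiter): \([^)]*\) Tsn=(\d+:\d+), Mode=(\w+)")
TXN_RE = re.compile(r"Tsn=(\d+:\d+), Resurrected=.*?ThreadId=(\d+)")


def find_self_waits(show_output):
    """Return (thread, holder_tsn, waiter_tsn) where one thread owns both."""
    # The output is wrapped at a fixed width, so collapse all whitespace.
    text = " ".join(show_output.split())
    owner = dict(TXN_RE.findall(text))  # Tsn -> owning ThreadId
    locks = LOCK_RE.findall(text)
    holders = [tsn for role, tsn, _ in locks if role == "Holder"]
    waiters = [tsn for role, tsn, _ in locks if role == "Waiter"]
    hits = []
    for h in holders:
        for w in waiters:
            if h != w and h in owner and owner.get(w) == owner[h]:
                hits.append((owner[h], h, w))
    return hits


# Trimmed excerpt of the diagnostics quoted in this APAR.
SAMPLE = """
LockDesc: Type=36001(as volume root), NameSpace=0,
SummMode=ixLock, Key=''
  Holder: (asutil.c:2029 Thread 137011) Tsn=0:57577335,
Mode=ixLock
  Waiter: (asutil.c:2029 Thread 137197) Tsn=0:57852297,
Mode=sixLock
  Waiter: (asutil.c:2029 Thread 124158) Tsn=0:57931295,
Mode=ixLock
Tsn=0:57577335, Resurrected=False, InFlight=False,
Distributed=False,
Persistent=True, Addr 1b699638
  ThreadId=124158, Timestamp=04/29/10 09:22:01,
Tsn=0:57931295, Resurrected=False, InFlight=True,
Distributed=False,
Persistent=True, Addr 1b670ed8
  ThreadId=124158, Timestamp=04/29/10 10:22:43,
"""

self_waits = find_self_waits(SAMPLE)
```

A non-empty result only flags candidates for closer inspection; the stack trace of the reported thread should still be compared with the one in the error description.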
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All Tivoli Storage Manager server users.     *
    ****************************************************************
    * PROBLEM DESCRIPTION: See error description.                  *
    ****************************************************************
    * RECOMMENDATION: Apply fixing level when available. This      *
    *                 problem is currently projected to be fixed   *
    *                 in levels 6.1.4.0 and 6.2.2.0. Note that     *
    *                 this is subject to change at the             *
    *                 discretion of IBM.                           *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Sun Solaris,
    Linux and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC68543

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    61A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-05-10

  • Closed date

    2010-05-13

  • Last modified date

    2010-05-13

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R61A PSY

       UP

  • R61H PSY

       UP

  • R61L PSY

       UP

  • R61S PSY

       UP

  • R61W PSY

       UP

  • R62A PSY

       UP

  • R62H PSY

       UP

  • R62L PSY

       UP

  • R62S PSY

       UP

  • R62W PSY

       UP



Document information


More support for:

Tivoli Storage Manager

Software version:

61A

Reference #:

IC68543

Modified date:

2010-05-13
