IBM Support

IC89590: ANR0551E..LOCK CONFLICT WHEN NODE REPLICATION RUNS CONCURRENTLY TO OTHER OPERATIONS FROM SAME CLIENT

APAR status

  • Closed as program error.

Error description

  • Running a "REPLICATE NODE" process at the same time as a Tivoli
    Storage Manager client operation (for example, a backup or
    restore) for the same node can deadlock the client session. The
    following errors are logged to the activity log:
    
    
    11/08/12   21:11:09      ANR9999D_1607799673
    xiBuildNodeDef(xirepl.c:1248)Thread<10231>: Error 1020 from
    admGetNodeConvState nodeName=NODE1
    (SESSION: 4890, PROCESS: 112)
    11/08/12   21:11:09      ANR9999D Thread<10231> issued message
    9999 from: (SESSION:   4890, PROCESS: 112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x000000010000d0b8 StdPutText (SESSION: 4890, PROCESS: 112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x000000010000db34 OutDiagToCons     (SESSION: 4890, PROCESS:
    112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x0000000100008d9c outDiagfExt    (SESSION: 4890, PROCESS: 112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x0000000100bdf454 xiBuildNodeDef    (SESSION: 4890, PROCESS:
    112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x0000000100663eb4 smReplQueryNode (SESSION: 4890, PROCESS: 112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x000000010063089c DoNodeHandshake   (SESSION: 4890, PROCESS:
    112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x000000010062f2e4 NrReplicationThread  (SESSION: 4890, PROCESS:
    112)
    11/08/12   21:11:09      ANR9999D Thread<10231>
    0x0000000100020ae0 StartThread   (SESSION: 4890, PROCESS: 112)
    
    11/08/12   21:11:09      ANR0551E The client operation failed
    for session XXXXXX for node NODE1 (AIX) - lock conflict.
    (SESSION: 4890)
    
    
    The problem occurs because a single thread uses two different
    transactions to access node information.
    
    
    
    Customer/L2 Diagnostics:
    Using the serverperf.pl script to automate the collection of
    "show locks" and "show thread" output, we can see the following:
    
    LockDesc: Type=17001(admin node name), NameSpace=0,
    SummMode=sLock, Key='BCC47'
      Holder: (admutil.c:4667 Thread 898045) Tsn=0:247788814,
    Mode=sLock
      Waiter: (admutil.c:4667 Thread 898048) Tsn=0:247788827,
    Mode=sixLock
      Waiter: (admutil.c:4667 Thread 898045) Tsn=0:247788896,
    Mode=sLock
    
    Thread 898045, Parent 898044: NrReplicationThread, Storage
    374846, AllocCnt 279 HighWaterAmt 407315
     tid=e0fd, ptid=dffc, det=0, zomb=0, join=1, result=0,
    sess=579875
      Awaiting cond waitP->waiting (0x16cd96030), using mutex
    TMV->mutex (0x110c80bb8), at tmlock.c(753)
      Stack trace:
        0x09000000004f3ae0 _cond_wait_global
        0x09000000004f4678 _cond_wait
        0x09000000004f5360 pthread_cond_wait
        0x00000001000076f4 pkWaitConditionTracked
        0x00000001000bda1c tmLockTracked
        0x00000001000d2e1c AdmLockNode
        0x000000010026eea4 admGetNodeIdForNodeNameExt
        0x000000010025b59c admCheckProxyNode
        0x0000000100268234 admGetNodeExtAttrs
        0x0000000100bdeb80 xiBuildNodeDef
        0x0000000100663eb4 smReplQueryNode
        0x000000010063089c DoNodeHandshake
        0x000000010062f2e4 NrReplicationThread
        0x0000000100020ae0 StartThread
    
    
    
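    Note that Thread 898045 is using two transactions
    (Tsn=0:247788814, which holds the sLock, and Tsn=0:247788896,
    which is waiting for it), and the sixLock request from Thread
    898048 is queued between them.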
    
    
    
    Tivoli Storage Manager Versions Affected:
    Tivoli Storage Manager Server V6.3 and later
    
    
    
    Initial Impact:
    Medium
    
    
    
    Additional Keywords:
    TSM DEADLOCK THREAD NODE REPLICATION CLIENT BACKUP TXN
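
The pattern called out in the diagnostics above (one thread listed as both holder and waiter of the same lock under different transaction numbers) can be checked mechanically in collected "show locks" output. The following is a minimal sketch, not an IBM tool: the function names, the regular expression, and the assumption that every entry follows the "Holder:/Waiter: (file:line Thread N) Tsn=..., Mode=..." layout shown above are illustrative assumptions.

```python
import re
from dataclasses import dataclass

@dataclass
class LockEntry:
    role: str    # "Holder" or "Waiter"
    thread: int  # server thread number
    tsn: str     # transaction number as printed, e.g. "0:247788814"
    mode: str    # lock mode, e.g. "sLock" or "sixLock"

# Assumed entry layout, taken from the sample listing only.
ENTRY_RE = re.compile(
    r"(Holder|Waiter):\s*\(\S+\s+Thread\s+(\d+)\)\s*Tsn=(\S+?),\s*Mode=(\w+)"
)

def parse_lock_entries(text):
    """Extract holder/waiter entries from 'show locks' text."""
    # Collapse whitespace first so entries wrapped across lines still match.
    flat = " ".join(text.split())
    return [LockEntry(role, int(thread), tsn, mode)
            for role, thread, tsn, mode in ENTRY_RE.findall(flat)]

def self_blocked_threads(entries):
    """Threads that hold the lock under one Tsn and wait for it under another."""
    held = {}
    for e in entries:
        if e.role == "Holder":
            held.setdefault(e.thread, set()).add(e.tsn)
    blocked = {e.thread for e in entries
               if e.role == "Waiter"
               and e.thread in held
               and e.tsn not in held[e.thread]}
    return sorted(blocked)

# Lock listing copied from the error description.
SAMPLE = """
LockDesc: Type=17001(admin node name), NameSpace=0,
SummMode=sLock, Key='BCC47'
  Holder: (admutil.c:4667 Thread 898045) Tsn=0:247788814,
Mode=sLock
  Waiter: (admutil.c:4667 Thread 898048) Tsn=0:247788827,
Mode=sixLock
  Waiter: (admutil.c:4667 Thread 898045) Tsn=0:247788896,
Mode=sLock
"""

if __name__ == "__main__":
    print(self_blocked_threads(parse_lock_entries(SAMPLE)))
```

A thread reported by `self_blocked_threads` is a candidate for the condition described in this APAR; the "show thread" stack trace for that thread would still need to be reviewed to confirm it.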
    

Local fix

  • Do not run the REPLICATE NODE command concurrently with a client
    operation for the same node.
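
Why keeping the two operations apart avoids the conflict can be illustrated with a toy lock-queue model. This is a sketch under stated assumptions, not the server's lock manager: it assumes that a request is granted only when it is compatible with all current holders and no earlier request is still queued, which is inferred from the lock listing in the error description. The class, method, and transaction names are hypothetical; the thread numbers are the ones from that listing.

```python
from collections import deque

# Simplified compatibility table (assumption): only sLock with sLock coexist.
COMPATIBLE = {("sLock", "sLock")}

class ToyLock:
    def __init__(self):
        self.holders = []       # (thread, txn, mode) tuples that own the lock
        self.waiters = deque()  # queued requests in arrival order

    def request(self, thread, txn, mode):
        """Grant if compatible with every holder and nothing is queued ahead."""
        compatible = all((held_mode, mode) in COMPATIBLE
                         for _, _, held_mode in self.holders)
        if compatible and not self.waiters:
            self.holders.append((thread, txn, mode))
            return "granted"
        self.waiters.append((thread, txn, mode))
        return "waiting"

    def stuck_threads(self):
        """Threads waiting while another of their own transactions holds the lock."""
        holding = {thread for thread, _, _ in self.holders}
        return sorted({thread for thread, _, _ in self.waiters
                       if thread in holding})

def simulate(other_thread_active):
    """Replay the sequence from the lock listing, with or without Thread 898048."""
    lock = ToyLock()
    results = [lock.request(898045, "txn-A", "sLock")]
    if other_thread_active:
        # A request from another thread arrives between the two transactions.
        results.append(lock.request(898048, "txn-B", "sixLock"))
    results.append(lock.request(898045, "txn-C", "sLock"))
    return results, lock.stuck_threads()

if __name__ == "__main__":
    print("concurrent:", simulate(True))
    print("serialized:", simulate(False))
```

In this model the second request from Thread 898045 only has to wait when another thread's request is already queued, and that waiter can never be served because Thread 898045 itself holds the lock under its first transaction. Without the intervening request, both shared locks are granted immediately.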
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All Tivoli Storage Manager Server users of   *
    *                 the REPLICATE NODE command.                  *
    ****************************************************************
    * PROBLEM DESCRIPTION: See ERROR DESCRIPTION.                  *
    ****************************************************************
    * RECOMMENDATION: Apply fixing level when available. This      *
    *                 problem is currently projected to be fixed   *
    *                 in level 6.3.4. Note that this is subject    *
    *                 to change at the discretion of IBM.          *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC89590

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    63A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2013-01-17

  • Closed date

    2013-03-05

  • Last modified date

    2013-03-05

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63A PSY

       UP

  • R63H PSY

       UP

  • R63L PSY

       UP

  • R63S PSY

       UP

  • R63W PSY

       UP

Document Information

Modified date:
05 March 2013