IBM Support

IT03234: "DEFINE PATH" AND "UPDATE PATH" MAY FAIL WITH ANR2033E DUE TO A LOCK CONFLICT

APAR status

  • Closed as program error.

Error description

  • A Tivoli Storage Manager server DEFINE PATH or UPDATE PATH
    command can fail due to a lock conflict.
    This can happen if the DEFINE PATH or UPDATE PATH command is
    issued while mount activity is in progress. Before the DEFINE
    PATH or UPDATE PATH process fails, it appears to hang. While it
    hangs, other processes and sessions can also hang while
    waiting for a mount point.
    
    Customer/L2 Diagnostics:
    Example from Tivoli Storage Manager server actlog for the
    DEFINE PATH command:
    
    MM/DD/YYYY 12:14:54 ANR2017I Administrator ADMIN issued command:
      DEFINE PATH STA_1 DRIVE_1 SRCType=server destt=drive
      library=LIB_1 device=/dev/rmt1  (SESSION: 10)
    ......
    MM/DD/YYYY 12:45:46 ANR0538I A resource waiter has been aborted.
    MM/DD/YYYY 12:45:46 ANR0538I A resource waiter has been aborted.
    MM/DD/YYYY 12:45:46 ANR2033E DEFINE PATH: Command failed - lock
      conflict. (SESSION: 10)
    
    The following example shows a lock conflict for the DEFINE PATH
    command, as seen in SHOW THREADS output and in a server trace
    taken with the trace flags INSTR NA MONITOR_DETAIL MONITOR.
    
    The failure is a deadlock: the resource monitor thread aborted
    the lock request after a long wait.
    Three threads are involved in the deadlock:
    
    1. Thread 152919, which is the DEFINE PATH thread.
       It is waiting for the NA universe lock (sixLock):
    
    From SHOW THREADS:
    
    Thread 152919, Parent 152918: SmAdminCommandThread, Storage
    66979, AllocCnt 171 HighWaterAmt 135143
     tid=3b57, ptid=1656, det=0, zomb=0, join=0, result=0, sess=0
      Awaiting cond waitP->waiting (0x110ce8f50), using mutex
    TMV->mutex (0x110da08a8), at tmlock.c(785)
      Stack trace:
        0x09000000004e2370 _cond_wait_global
        0x09000000004e2efc _cond_wait
        0x09000000004e3bec pthread_cond_wait
        0x00000001000079f4 pkWaitConditionTracked
        0x000000010004dde4 tmLockTracked
        0x0000000100410708 NaLockUniverse
        0x00000001003fd648 naDefinePath
        0x0000000100d89808 AdmDefinePath
        0x000000010039d904 AdmCommandLocal
        0x000000010039bb7c admCommand
        0x0000000100b00494 SmAdminCommandThread
        0x000000010000c264 StartThread
    
    From server trace:
    
    12:14:54.296 [152919][nautil.c][195][NaLockUniverse]:Acquiring
    NA universe lock (sixLock).
    12:45:46.211 [152919][nautil.c][207][NaLockUniverse]:Lock
    acquisition (sixLock) failed for NA universe lock.
    12:45:46.211 [152919][output.c][7531][PutConsoleMsg]:ANR2033E
    DEFINE PATH: Command failed - lock conflict.~
    
    2. Thread 22, the monitor thread. It acquired the NA universe
       lock (sLock) but is waiting for the device class latch:
    
    From SHOW THREADS:
    
    Thread 22, Parent 1: StatusMonitorThread, Storage 305692146,
    AllocCnt 3675550 HighWaterAmt 306420668
     tid=516, ptid=1, det=1, zomb=0, join=0, result=0, sess=0
      Awaiting cond latchP->sFree (0x110ce6fd0), using mutex
    PVRV->mutex (0x1113892c8), at latch.c(256)
      Stack trace:
        0x09000000004e2370 _cond_wait_global
        0x09000000004e2efc _cond_wait
        0x09000000004e3bec pthread_cond_wait
        0x00000001000079f4 pkWaitConditionTracked
        0x00000001002d8fc4 AcquireLatchSpecific
        0x00000001003bb688 pvrIsFileDevClassId
        0x00000001003bb1f0 pvrMonitorDevices
        0x0000000100118f5c StatusMonitorThread
        0x000000010000c264 StartThread
    
    From server trace:
    
    12:14:54.212 [22][nautil.c][154][NaTryLockUniverse]:Acquiring NA
    universe lock (sLock).
    12:14:54.212 [22][nautil.c][195][NaLockUniverse]:Acquiring NA
    universe lock (sLock).
    12:14:54.212 [22][napthcmd.c][1887][naOpenQueryPathEx]:Setting
    search bounds with destName = SL8500_ACS0:SL8500_T1A_032
    12:14:54.320 [22][nautil.c][154][NaTryLockUniverse]:Acquiring NA
    universe lock (sLock).
    .......
    12:14:56.123 [22][latch.c][238][AcquireLatchSpecific]:LATCH
    Acquire ATTEMPTED for latch 15c7d1b0, mode 0, using mutex
    11389188 by requestor pvrclass.c(22).
    (Note: at this point thread 22 holds the NA universe sLock and
    is waiting for the latch.)
    13:16:47.468 [22][latch.c][294][AcquireLatchSpecific]:LATCH
    Acquire SUCCESSFUL for latch 15c7d1b0, mode 0, using mutex
    11389188 by requestor pvrclass.c(22).
    
    3. Thread 40, which handles the mount point request. It acquired
       the device class latch but is waiting for the NA universe
       lock (sLock):
    
    From SHOW THREADS:
    
    Thread 40, Parent 1: AsMPAgent, Storage 529320, AllocCnt 4806464
    HighWaterAmt 600854
     tid=1728, ptid=1, det=1, zomb=0, join=0, result=0, sess=0
      Holding mutex ASV->mpQueueMutex (0x115d1f928), acquired at
    asvolmnt.c(2337)
      Holding mutex libP->driveListMutex (0x111d102a8), acquired at
    mmsdrive.c(7187)
      Awaiting cond waitP->waiting (0x110cece50), using mutex
    TMV->mutex (0x110da08a8), at tmlock.c(785)
      Stack trace:
        0x09000000004e2370 _cond_wait_global
        0x09000000004e2efc _cond_wait
        0x09000000004e3bec pthread_cond_wait
        0x00000001000079f4 pkWaitConditionTracked
        0x000000010004dde4 tmLockTracked
        0x0000000100410708 NaLockUniverse
        0x0000000100406ba8 naObtainPathAttr
        0x0000000100409cc8 naGetDevice
        0x0000000100445724 IPRA.$MmsAcquireDrivePath
        0x0000000100448b98 MmsCheckDrivesForMP
        0x000000010045ec50 pvrAcquireMountPoint
        0x000000010083e17c TestSwMpReq
        0x000000010083bb58 TestMpReq
        0x00000001008403a4 AsMPAgent
        0x000000010000c264 StartThread
    
    From server trace:
    
    12:14:56.084 [40][latch.c][238][AcquireLatchSpecific]:LATCH
    Acquire ATTEMPTED for latch 15c7d1b0, mode 1, using mutex
    11389188 by requestor pvrmp.c(938)(40).
    12:14:56.084 [40][latch.c][294][AcquireLatchSpecific]:LATCH
    Acquire SUCCESSFUL for latch 15c7d1b0, mode 1, using mutex
    11389188 by requestor pvrmp.c(938)(40).
    ....
    12:14:56.089 [40][nautil.c][195][NaLockUniverse]:Acquiring NA
    universe lock (sLock).
    12:45:46.210 [40][nautil.c][207][NaLockUniverse]:Lock
    acquisition (sLock) failed for NA universe lock.
    
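    Monitor thread 22 holds the NA universe sLock, and the DEFINE
    PATH thread (152919) is already waiting for the sixLock.
    Thread 40 requested the NA universe sLock after the sixLock
    request, so its request could not be granted. Because thread 40
    holds the device class latch that thread 22 is waiting for,
    none of the three threads could proceed and the deadlock
    occurred.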
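
The circular wait described above can be checked mechanically. The following Python sketch is illustrative only and is not part of the server code: it encodes the three threads from the trace as a wait-for graph (an edge from A to B means that A cannot proceed until B does) and searches it for a cycle. The node labels and the function name find_cycle are assumptions made for this example.

```python
# Wait-for graph reconstructed from the trace in this APAR.
# Labels are descriptive only; they are not server identifiers.
WAITS_FOR = {
    # sixLock request is blocked by the sLock that thread 22 holds
    "152919 (DEFINE PATH)": ["22 (StatusMonitorThread)"],
    # latch 15c7d1b0 is held by thread 40
    "22 (StatusMonitorThread)": ["40 (AsMPAgent)"],
    # sLock request is stuck behind the pending sixLock request
    "40 (AsMPAgent)": ["152919 (DEFINE PATH)"],
}


def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle is closed
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(" -> ".join(find_cycle(WAITS_FOR)))
```

Removing any one edge, for example by having thread 22 not wait for the latch while it holds the sLock, leaves the graph acyclic, and find_cycle then returns None.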
    
    The root cause is that threads 22 and 40 acquire the device
    class latch and the NA universe lock in opposite order.
    This is the reason for the deadlock.
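
A common way to rule out this class of deadlock is to have every thread acquire the two resources in one agreed order. This APAR does not describe how the server fix was implemented, so the following Python sketch is only a generic illustration of consistent lock ordering. The names device_class_latch, na_universe_lock, LOCK_ORDER and run_with_both are hypothetical, and plain mutexes stand in for the shared and intent lock modes (sLock, sixLock) that the server uses.

```python
import threading

# Stand-ins for the two resources named in this APAR. Plain mutexes are a
# simplification; the server uses lock modes such as sLock and sixLock.
device_class_latch = threading.Lock()
na_universe_lock = threading.Lock()

# Hypothetical global order: the latch is always taken before the lock.
LOCK_ORDER = (device_class_latch, na_universe_lock)


def run_with_both(action):
    """Acquire both resources in the global order, run action, then
    release them in reverse order."""
    acquired = []
    try:
        for lock in LOCK_ORDER:
            lock.acquire()
            acquired.append(lock)
        return action()
    finally:
        for lock in reversed(acquired):
            lock.release()


def worker(counter, rounds):
    """Increment a shared counter while holding both resources."""
    def bump():
        counter["n"] += 1
    for _ in range(rounds):
        run_with_both(bump)


def demo(rounds=1000):
    """Run two competing threads; return (final count, any thread stuck)."""
    counter = {"n": 0}
    threads = [
        threading.Thread(target=worker, args=(counter, rounds))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=30)
    return counter["n"], any(t.is_alive() for t in threads)


if __name__ == "__main__":
    print(demo())
```

Because both workers go through run_with_both, neither can hold one resource while waiting for a thread that holds the other, so the opposite-order pattern seen with threads 22 and 40 cannot arise in this sketch.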
    
    Tivoli Storage Manager Versions Affected:
    Tivoli Storage Manager server 6.2, 6.3, and 7.1
    on all supported platforms
    
    Initial Impact:
    low
    
    Additional Keywords:
    TSM zz62 zz63 zz71 deadlock lock
    

Local fix

  • Issue DEFINE PATH (or DELETE/DEFINE PATH, or
    DELETE/DEFINE/UPDATE DATAMOVER) only while no mount activity
    is in progress.
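
The workaround above could be scripted along the following lines. This is a minimal sketch under stated assumptions, not an IBM-supplied procedure. The caller passes in run_command, a hypothetical callable that sends one administrative command to the server and returns its output as text (for example a wrapper around the dsmadmc administrative client). The sketch assumes that QUERY MOUNT reports message ANR2034E ("No match found") when nothing is mounted; verify this on your server level before relying on it. QUERY MOUNT may not show mount requests that are still pending, and a mount can begin between the check and the command, so this only narrows the window for the lock conflict.

```python
import time


def mounts_idle(query_mount_output):
    """Treat the server as idle only if QUERY MOUNT reported no match.

    Assumption: an empty QUERY MOUNT result contains message ANR2034E.
    """
    return "ANR2034E" in query_mount_output


def define_path_when_idle(run_command, define_cmd, attempts=5,
                          wait_seconds=60, sleep=time.sleep):
    """Issue define_cmd only after QUERY MOUNT shows no mount activity.

    run_command is a hypothetical caller-supplied function that takes an
    administrative command string and returns the server output as text.
    """
    for _ in range(attempts):
        if mounts_idle(run_command("QUERY MOUNT")):
            return run_command(define_cmd)
        sleep(wait_seconds)  # mounts still active; wait and check again
    raise RuntimeError("mount activity did not stop; command not issued")
```

The sleep parameter is injectable so that the helper can be exercised without real delays. The helper raises an error instead of issuing the command when mounts never go idle, which keeps the command out of the window in which the lock conflict was observed.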
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users.                     *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See ERROR DESCRIPTION.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 6.3.6 and 7.1.1.200 and      *
    * 7.1.3. Note that this is subject to change at the discretion *
    * of IBM.                                                      *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT03234

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    63A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2014-07-22

  • Closed date

    2015-01-26

  • Last modified date

    2015-01-26

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63A PSY UP

  • R63H PSY UP

  • R63L PSY UP

  • R63S PSY UP

  • R63W PSY UP

  • R71A PSY UP

  • R71H PSY UP

  • R71L PSY UP

  • R71S PSY UP

  • R71W PSY UP



Document information

More support for: Tivoli Storage Manager

Software version: 63A

Reference #: IT03234

Modified date: 26 January 2015