IBM Support

IT10703: TIVOLI STORAGE MANAGER SERVER CAN HANG WHEN USING SIMULTANEOUS WRITES

APAR status

  • Closed as program error.

Error description

  • The Tivoli Storage Manager server can appear to hang during
    simultaneous writes when the destination storage pools use
    different device classes and all drives in the copy pool are in
    use. Most commands are still accepted and run, but commands and
    sessions that need library or drive access hang.
    
    Tivoli Storage Manager Versions Affected:
    Tivoli Storage Manager 6.3.x and 7.1.x on all platforms
    
    Customer/L2 Diagnostics (If Applicable)
    
    A dump of the dsmsvc.exe process from a Windows server shows a
    stack similar to the following:
    
    db2sys64!sqloxult_app+0x70
    db2app64!SQLFreeStmt+0x42f
    adsmdll!ResetTxnDesc+0x292
    [c:\build\src\276\srv6.3.5.10\extracts\rdb\dbitxn.c @ 1816]
    adsmdll!dbiEndTxn+0x338
    [c:\build\src\276\srv6.3.5.10\extracts\rdb\dbitxn.c @ 798]
    adsmdll!DoEndFuncCallbacks+0x147
    [c:\build\src\276\srv6.3.5.10\extracts\tm\tmtxn.c @ 2035]
    adsmdll!tmEndX+0x2cc
    [c:\build\src\276\srv6.3.5.10\extracts\tm\tmtxn.c @ 1042]
    adsmdll!naTestPath+0x5d9
    [c:\build\src\276\srv6.3.5.10\extracts\na\napath.c @ 785]
    adsmdll!MmsAcquireDrivePath+0x31b
    [c:\build\src\276\srv6.3.5.10\extracts\pvr\mmsdrive.c @ 7704]
    adsmdll!GetDcCounts+0x216
    [c:\build\src\276\srv6.3.5.10\extracts\pvr\pvrmp.c @ 7130]
    adsmdll!pvrAcquireMountPoint+0x55d
    [c:\build\src\276\srv6.3.5.10\extracts\pvr\pvrmp.c @ 1118]
    adsmdll!TestSwMpReq+0x138
    [c:\build\src\276\srv6.3.5.10\extracts\ss\asvolmnt.c @ 5195]
    adsmdll!TestMpReq+0x3e
    [c:\build\src\276\srv6.3.5.10\extracts\ss\asvolmnt.c @ 3972]
    adsmdll!AsMPAgent+0x13d
    [c:\build\src\276\srv6.3.5.10\extracts\ss\asvolmnt.c @ 2286]
    adsmdll!startThread+0x124
    [c:\build\src\276\srv6.3.5.10\extracts\nt\pkthread.c @ 3244]
    msvcr100!endthreadex+0x43
    msvcr100!endthreadex+0xdf
    kernel32!BaseThreadInitThunk+0xd
    ntdll!RtlUserThreadStart+0x1d
    
    The problem is in the AsMPAgent thread. A trace taken with trace
    classes "as mms pvr" shows that pvrAcquireMountPoint is called
    repeatedly and that the mount limit is exceeded, so a mount
    point is not granted.
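
As a rough aid for reading such a trace, the following Python sketch lists the requests that were denied because the mount limit was exceeded. It assumes trace records in the wrapped format of the excerpts in this description, where a record starts with "[" and may continue on the following lines. The names join_records, denied_requests and SAMPLE are hypothetical and are not part of any Tivoli Storage Manager tooling.

```python
import re

# Matches the "Attempting to acquire" record emitted by pvrAcquireMountPoint.
ATTEMPT = re.compile(
    r"Attempting to acquire \d+ MP\(s\): devclass (\d+), "
    r"priority \d+, session (\d+)"
)
DENIED = "Mountlimit is exceeded"


def join_records(text):
    """Rejoin wrapped trace lines; a new record starts with '['."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def denied_requests(text):
    """Return (session, devclass) for each attempt that was denied."""
    denied = []
    current = None  # most recent (session, devclass) attempt seen
    for record in join_records(text):
        match = ATTEMPT.search(record)
        if match:
            current = (int(match.group(2)), int(match.group(1)))
        elif DENIED in record and current is not None:
            denied.append(current)
    return denied


# Shortened sample built from records quoted in this APAR.
SAMPLE = """\
[pvrmp.c][958][pvrAcquireMountPoint]:Attempting to acquire 1
MP(s): devclass 11, priority 8, session 66, process 0, readOnly
0, retry 0.
[pvrmp.c][7491][CreateMp]:Mount point 0 allocated for device
class = 11
[pvrmp.c][958][pvrAcquireMountPoint]:Attempting to acquire 1
MP(s): devclass 12, priority 8, session 66, process 0, readOnly
0, retry 0.
[pvrmp.c][1169][pvrAcquireMountPoint]:Mountlimit is exceeded -
Mountpoint not granted
"""

if __name__ == "__main__":
    print(denied_requests(SAMPLE))
```

A long run of denials for the same device class, each preceded by a successful acquire in another device class, would be consistent with the behavior described in this APAR.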
    
    In this example, a VTL library is used. Data is stored in a
    primary pool that uses device class DEVPRIM (device class 11)
    and is simultaneously written to a copy pool that uses device
    class DEVCOPY (device class 12). The two device classes have
    separate library definitions, each with 24 drives. A mount point
    must be obtained in each device class.
    
    First, an attempt is made to obtain a mount point in device
    class 11 (primary), which succeeds:
    
    [pvrmp.c][958][pvrAcquireMountPoint]:Attempting to acquire 1
    MP(s): devclass 11, priority 8, session 66, process 0, readOnly
    0, retry 0.
    [pvrmp.c][989][pvrAcquireMountPoint]:Acquiring 1 MP(s) in class
    DEVPRIM
    [mmsdrive.c][2798][MmsCheckDrivesForMP]:Checking Available
    drives in library TSMLIBPRIM for devType LTO with device class
    format 0.
    ..
    [pvrmp.c][7355][CreateMp]:Allocating mount point, devClass = 11.
    [pvrmp.c][7357][CreateMp]:CreateMp mpStatus = 0 sessId = 66
    procId = 0 priority = 8 count = 1.
    [pvrmp.c][7491][CreateMp]:Mount point 0 allocated for device
    class = 11
    
    After this completes, an attempt is made to obtain a mount point
    in device class 12 (copy):
    
    [pvrmp.c][958][pvrAcquireMountPoint]:Attempting to acquire 1
    MP(s): devclass 12, priority 8, session 66, process 0, readOnly
    0, retry 0.
    [pvrmp.c][989][pvrAcquireMountPoint]:Acquiring 1 MP(s) in class
    DEVCOPY
    [mmsdrive.c][2798][MmsCheckDrivesForMP]:Checking Available
    drives in library TSMLIBCOPY for devType LTO with device class
    format 0.
    
    At this point all drives are in use, so the mount limit is
    exceeded and the mount point request is denied:
    
    [pvrmp.c][1120][pvrAcquireMountPoint]:MPs in device class
    library TSMLIBCOPY are:
    [pvrmp.c][1123][pvrAcquireMountPoint]:  reserved = 0, open = 24,
    opening = 0, idle = 0                    <== 24 drives open
    [pvrmp.c][1128][pvrAcquireMountPoint]:  idle with dismount
    failures = 0, dismounting = 0, dismounted = 0
    [pvrmp.c][1130][pvrAcquireMountPoint]:  waiting for idle
    dismount = 0, sync dismounts = 0.
    [mmsdrive.c][7277][MmsReleaseDriveList]:Releasing DriveList in
    library TSMLIBCOPY
    [pvrmp.c][1169][pvrAcquireMountPoint]:Mountlimit is exceeded -
    Mountpoint not granted                 <==Mountpoint denied
    [pvrmp.c][1910][pvrReleaseMountPoint]:Releasing mount point for
    devClass=11.
    [pvrmp.c][1942][pvrReleaseMountPoint]:Destroying mount point 0
    (devClass = 11).
    [pvrmp.c][7668][DestroyMp]:Destroying mount point 0
    (devClass=11).
    [pvrmp.c][958][pvrAcquireMountPoint]:Attempting to acquire 1
    MP(s): devclass 11, priority 8, session 44, process 0, readOnly
    0, retry 0.
    [pvrmp.c][989][pvrAcquireMountPoint]:Acquiring 1 MP(s) in class
    DEVPRIM
    
    At this point, the mount point in the primary pool is released.
    The AsMPAgent thread, which services mount point requests, keeps
    a list of threads waiting for a mount point. It attempts to
    acquire a mount point in the primary pool for the next request
    in the list, and then one in the copy pool. Once the mount point
    is denied in the copy pool and the mount point in the primary
    pool is released, the thread immediately reacquires the mutex,
    which stops other threads from releasing their mount points. The
    AsMPAgent thread continues through the list of waiting requests
    until it reaches the end, then starts again at the beginning.
    The result is a loop that tries to acquire a mount point in the
    copy pool, a request that can never be serviced because the
    existing threads cannot release their mount points.
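
The loop described above can be illustrated with a small, single-threaded Python model. This is only a sketch under stated assumptions, not the server's implementation: DeviceClass, agent_pass and simulate are hypothetical names, the mutex is not modeled with real threads, and the holders_can_release flag stands in for whether threads that hold copy pool mount points are able to release them between passes. The drive counts (24 per library) and session numbers (66 and 44) are taken from the trace.

```python
from dataclasses import dataclass


@dataclass
class DeviceClass:
    name: str
    mount_limit: int
    open_mounts: int = 0

    def try_acquire(self) -> bool:
        """Grant a mount point unless the mount limit is reached."""
        if self.open_mounts >= self.mount_limit:
            return False
        self.open_mounts += 1
        return True

    def release(self) -> None:
        self.open_mounts -= 1


def agent_pass(primary, copy, waiting):
    """One pass over the waiting list, as the agent thread would make.

    A session is served only if it gets a mount point in both
    device classes; otherwise the primary mount point is given back.
    """
    granted = []
    for session in waiting:
        if not primary.try_acquire():
            continue
        if copy.try_acquire():
            granted.append(session)
        else:
            primary.release()  # copy denied: release the primary again
    return granted


def simulate(passes, holders_can_release):
    primary = DeviceClass("DEVPRIM", mount_limit=24)
    copy = DeviceClass("DEVCOPY", mount_limit=24, open_mounts=24)
    waiting = [66, 44]
    served = []
    for _ in range(passes):
        # Assumption: when holders are not blocked, one existing
        # writer frees a copy pool drive before each pass.
        if holders_can_release and copy.open_mounts > 0:
            copy.release()
        for session in agent_pass(primary, copy, waiting):
            waiting.remove(session)
            served.append(session)
    return served, waiting


if __name__ == "__main__":
    print(simulate(5, holders_can_release=False))
    print(simulate(5, holders_can_release=True))
```

With holders_can_release set to False, no session is ever served, however many passes are made. This mirrors the situation in which the agent thread holds the mutex and the existing threads cannot release their mount points.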
    
    This problem does not occur in a library sharing environment. It
    is most likely to be seen on the Windows platform, because
    threading is handled differently in UNIX and Linux environments.
    
    
    Initial Impact: High
    
    Additional Keywords:
    TSM hang loop vtl mountpoint exceed
    

Local fix

  • Increase the number of drives and mount points available.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users.                     *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 6.3.6.0, 7.1.3.100 and       *
    * 7.1.4. Note that this is subject to change at the discretion *
    * of IBM.                                                      *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms: Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT10703

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    63W

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2015-08-14

  • Closed date

    2015-11-03

  • Last modified date

    2015-11-03

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63W PSY

       UP

  • R71W PSY

       UP

Document Information

Modified date:
26 September 2021