IC68110: DISK STORAGE POOL WITH CACHE=YES AFFECTS PERFORMANCE OF CLIENT BACKUP AND STORAGE POOL MIGRATION.

Subscribe

You can track all active APARs for this component.

APAR status

  • Closed as program error.

Error description

  • A Tivoli Storage Manager client backing up to a DISK storage
    pool defined with CACHE=yes may suffer from performance
    degradation. The client session progesses slowly or may appear
    to hang with little or no change in bytes sent/received when
    queried with the QUERY SESSION F=D command. For example :
    
                 Sess Number: 1234
                 Comm. Method: Tcp/Ip
                   Sess State: Run
                    Wait Time: 0 S
                   Bytes Sent: 1.0 K
                  Bytes Recvd: 1.1 K
                    Sess Type: Node
                     Platform: DP Oracle Win64
                  Client Name: MYDB_ORC
          Media Access Status:
                    User Name:
    Date/Time First Data Sent:
       Proxy By Storage Agent:
    
    Similarly, storage pool migration from a DISK storage pool
    defined with CACHE=yes may also suffer from performance
    degradation. The migration process progesses slowly or may
    appear to hang with the "Current Physical File" set to "None"
    when queried with the QUERY PROCESS command. For example :
    
    Process Number : nn
    Process Description : Migration
      Status : Disk Storage Pool DISKPOOL, Moved Files: 12345,
      Moved Bytes: 12,345,678,901, Unreadable Files: 0,
      Unreadable Bytes: 0. Current Physical File (bytes): None
      Current output volume: TAPE01
    
    Customer/L2 Diagnostics:
    A server trace (DS DF DISK BITVECTOR) shows the client threads
    deallocating disk block range for slow backup to disk pool with
    cache. For example :
    
    <timestamp> [11][bv.c][1871][bvRangeFcn]:bvDealloc: for
     DSKV0000039330 start 32104735, count 12992, txnId 0:8013960.
    <timestamp> [11][bvtxn.c][1730][BvParticipate]:DSKV0000039330:
     2459 is now participating in txnId 0:8013960.
    <timestamp> [11][dsalloc.c][3665][DeallocRange]:Deallocating
     disk block range (32117791,12992) for Segment
     (DiskId=39330, Start=32117791).
    <timestamp> [11][bv.c][850][bvOpenX]:For DSKV0000039330,
      txnId=0:8013960.
    <timestamp> [11][bv.c][6779][GetAlreadyOpenHandle]:already have
     handleP=000000002580F818
    
    A server instrument trace will show the client session threads
    spending most time in the "Unknown" operation. For example :
    
    Thread 11 SessionThread (Win Thread ID xxxx)
      Session ID 1234 MYDB_ORC (DP Oracle Win64)
    Operation       Count  Tottime Avgtime Mintime Maxtime InstTput
    ----------------------------------------------------------------
    ...
    Tm Lock Wait       30    4.256   0.142  0.000    1.279
    Acquire XLatch      1    0.000   0.000  0.000    0.000
    Unknown               6754.502
    ----------------------------------------------------------------
    Total                 6765.863                             0.0
    
    A "show txnt" output shows the client thread shows the session
    holding many locks type 93001. For example :
    
    slot -> xx:
    Tsn=0:xxx, Resurrected=False, InFlight=True,
    ...
      Locks held by Tsn=0:xxx :
        Type=93001(bv Supernode), NameSpace=0, SummMode=xLock,
           Mode=xLock, Key='a:b'
        Type=93001(bv Supernode), NameSpace=0, SummMode=xLock,
           Mode=xLock, Key='c:d'
    
    Similarly, a server trace (af bf df tm dbstmt) will show selects
    on the DS_OVERFLOW table for the slow migration process.
    For example :
    
    
    <timestamp> [nn][tbcli.c][4130][RdbPrepareAndExecuteStmt]:
    Executing statement 'SELECT RANGESTART,RANGECOUNT FROM
    "TSMDB1"."DS_OVERFLOW"  WHERE (VOLID=? AND START=? AND
    RANGENUM=?) FETCH FIRST 1 ROW ONLY FOR READ ONLY WITH UR --nn'
    submitted from dsalloc.c(710).
    <timestamp> [nn][tbcli.c][4635][RdbFetchAndGetNextResponse]:
    Performing fetch for dsalloc.c(710) : 'SELECT RANGESTART,
    RANGECOUNT FROM "TSMDB1"."DS_OVERFLOW"  WHERE (VOLID=? AND
    START=? AND RANGENUM=?) FETCH FIRST 1 ROW ONLY FOR READ ONLY
    WITH UR --nn'
    <timestamp> [nn][tmlock.c][596][tmLockTracked]:Tid=0:xxx,
      Type=93000(bv MSNodes), NameSpace=0, Key=(nil) ,Mode=sLock
     (UnCond) from bvutil.c:4237
    <timestamp> [nn][tmlock.c][1945][GrantLock]:Granted lock:
     Tsn=0:xxx, Type=93000(bv MSNodes), NameSpace=0, Key='',
     Mode=sLock, SummMode=sLock
    <timestamp> [nn][tmlock.c][1037][tmUnlock]:Tid=0:xxx,
      Type=93000, NameSpace=0, Key=(nil)
    <timestamp> [nn][tmlock.c][2048][UngrantLock]:Ungranted lock:
     Tsn=0:xxx, Type=93000(bv MSNodes), NameSpace=0, Key='',
     Mode=sLock, SummMode=noLock
    
    Other symptoms include deadlock situations resulting in
    following messages in server activity log :
    
    ANR0379W A server database deadlock situation has been
     encountered; the lock request for the bf aggregate
     (superbitfile) id lock, <bitfile id> will be denied to resolve
     the deadlock.
    ANR0530W Transaction failed for session nnn for node <nodename>
     (platform) - internal server error detected.
    
    Transactions involved in deadlock will show the Supernode lock
    held and DF.CachedBitfiles table accessed. For example :
    
    Tsn=0:nnn, Resurrected=False, InFlight=True, Distributed=False,
      Persistent=True, Addr 7c12ecc8
      ThreadId=492, Timestamp=..., Creator=dferase.c(712)
      Participants=5, summaryVote=ReadOnly
      EndInFlight False, endThreadId 492, tmidx 0 0,
      processBatchCount 0,  mustAbort False.
    ...
         DB: --> OpenP=0000000080892118 for table=DF.CachedBitfiles.
    ...
      Locks held by Tsn=0:nnn :
        Type=93001(bv Supernode), NameSpace=0, SummMode=xLock,
         Mode=xLock, Key=...
        Type=93001(bv Supernode), NameSpace=0, SummMode=xLock,
         Mode=xLock, Key=...
        Type=47052(df bitfile id), NameSpace=0, SummMode=xLock,
         Mode=xLock, Key=...
        Type=47051(df bitfile root), NameSpace=0, SummMode=ixLock,
          Mode=ixLock, Key=''
        Type=46001(bf aggregate (superbitfile) id), NameSpace=0,
         SummMode=xLock, Mode=xLock, Key=...
    
    TSM Versions Affected:
    TSM Servers V6.1 on all platforms.
    TSM servers V6.2 are not affected by this.
    
    Additional Keywords:
    zz61 IC65323 slow hung
    

Local fix

  • Turn off the disk pool cache with following commands :
    1) UPDATE STG <poolname> cache=no
    2) MOVE DATA <volname> stg=<nextpool> ..for all disk volumes
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All Tivoli Storage Manager server users.     *
    ****************************************************************
    * PROBLEM DESCRIPTION: See error description.                  *
    ****************************************************************
    * RECOMMENDATION: Apply fixing level when available. This      *
    *                 problem is currently projected to be fixed   *
    *                 in levels 6.1.5. Note that this is           *
    *                 subject to change at the discretion of       *
    *                 IBM.                                         *
    *                                                              *
    *                 There is an associated Flash for this APAR   *
    *                 which is titled:  "Tivoli Storage Manager    *
    *                 Server information concerning APAR           *
    *                 IC68110".  You must read this Flash          *
    *                 before upgrading to service level            *
    *                 6.1.5 or later.  This Flash will be          *
    *                 published when service level 6.1.5 is        *
    *                 made available.                              *
    ****************************************************************
    *
    Note, this flash has change, it is now:
    www.ibm.com/support/docview.wss?uid=swg21431893
    Reverting from version 6.1.5.0 to a previous 6.1.x server level
    requires a manual reconfiguration of random disk volumes
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Sun Solaris, Linux, and
    Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC68110

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    61W

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-04-22

  • Closed date

    2010-05-27

  • Last modified date

    2013-01-28

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R61A PSY

       UP

  • R61H PSY

       UP

  • R61L PSY

       UP

  • R61S PSY

       UP

  • R61W PSY

       UP



Rate this page:

(0 users)Average rating

Add comments

Document information


More support for:

Tivoli Storage Manager

Software version:

61W

Reference #:

IC68110

Modified date:

2013-01-28

Translate my page

Machine Translation

Content navigation