IC67566: MULTIPLE CONCURRENT NDMP NODE BACKUPS APPEAR TO HANG WHEN TOC IS SET TO YES OR PREFERRED

APAR status

  • Closed as program error.

Error description

  • When multiple NAS backup operations are started in parallel,
    they may appear to hang. QUERY PROCESS shows no activity, and
    no further BACKUP NODE commands can be started.
    The processes are not actually hung.
    When a NAS backup runs with TOC=YES or TOC=PREFERRED, TSM
    creates a temporary table in DB2 and inserts a large number of
    object records into it. At a certain point, a RUNSTATS
    operation is started to improve access performance for the
    temporary table.
    The inserts into the temporary tables and the RUNSTATS
    processes then contend for DB2 locks.
    
    
    Tivoli Storage Manager Versions Affected:
    6.1   , 6.2
    
    Customer/L2 Diagnostics:
    
    "Show thread" output shows several threads that appear to hang
    in "semop". Some are BACKUP NODE threads, and usually at least
    one is a RUNSTATS thread:
    
    Thread 64, Parent 1: RdbMonitorStatsThread, Storage 8334, AllocCnt 578 HighWaterAmt 102541
     tid=2b40, ptid=1, det=1, zomb=0, join=0, result=0, sess=0
      Stack trace:
        0x0900000000247820 semop
        0x09000000022c5e24 .sqlccrecv_fdprpro_clone_1
        0x00000001151d1a50 *UNKNOWN*
        0x0000000000010000 *UNKNOWN*
        0x09000000022c59f0 sqljcReceive__FP10sqljCmnMgr
        0x09000000022e9f64 sqljrDrdaArCall__FP14db2UCinterfaceP9UCstpInfo
        0x09000000022ab7d4 sqleproc__FPcP7sqlcharP5sqldaT3P5sqlca
        0x09000000022ef9dc sqlerInvokeKnownProcedure__FUiP5sqldaP5sqlca
        0x0900000001c35d94 db2RunstatsSqlda
        0x0900000001c34690 db2Runstats
        0x000000010007baf8 RdbRunstatOneTable
        0x000000010007b620 RdbRunstatAllTables
        0x000000010007b310 TbRunstatAllTbl
        0x0000000100625508 RdbMonitorStatsThread
        0x0000000100009bb4 StartThread
    
    Thread 2877, Parent 2876: SmAdminCommandThread, Storage 722739, AllocCnt 455 HighWaterAmt 763443
     tid=343d, ptid=2e3c, det=0, zomb=0, join=0, result=0, sess=0
      Holding mutex txnP->mutex (0x111b450f8), acquired at tbtbl.c(4293)
      Stack trace:
        0x0900000000247820 semop
        0x09000000022c5e24 .sqlccrecv_fdprpro_clone_1
        0x09000000009f68e0 uprv_free
        0x0000000000010000 *UNKNOWN*
        0x09000000022c59f0 sqljcReceive__FP10sqljCmnMgr
        0x09000000022c70c8 sqljrDrdaArExecute__FP14db2UCinterfaceP9UCstpInfo
        0x0900000001e92d58 CLI_sqlExecute__FP17CLI_STATEMENTINFOP19CLI_ERRORHEADERINFO
        0x0900000001da5d78 SQLExecute2__FP17CLI_STATEMENTINFOP19CLI_ERRORHEADERINFO
        0x0900000001dd653c SQLExecute
        0x000000010006aebc FreePrepareExecute
        0x0000000100075570 ProcessTempTable
        0x0000000100074ed8 tbOpenX
        0x00000001000787e0 tbCreateTemp
        0x0000000100718f14 tocCreateNdmpToc
        0x00000001008a5ec8 DoBackup
        0x00000001008a3258 AdmBackupNode
        0x0000000100175520 AdmCommandLocal
        0x0000000100173b8c admCommand
        0x000000010078c694 SmAdminCommandThread
        0x0000000100009bb4 StartThread
    
    Thread 2919, Parent 60: CsRunCmdThread, Storage 865296, AllocCnt 812 HighWaterAmt 1622120
     tid=3f67, ptid=293c, det=1, zomb=0, join=0, result=0, sess=0
      Holding mutex txnP->mutex (0x1123b7c78), acquired at tbtbl.c(4293)
      Stack trace:
        0x0900000000247820 semop
        0x09000000022c5e24 .sqlccrecv_fdprpro_clone_1
        0x09000000009f68e0 uprv_free
        0x0000000000010000 *UNKNOWN*
        0x09000000022c59f0 sqljcReceive__FP10sqljCmnMgr
        0x09000000022c70c8 sqljrDrdaArExecute__FP14db2UCinterfaceP9UCstpInfo
        0x0900000001e92d58 CLI_sqlExecute__FP17CLI_STATEMENTINFOP19CLI_ERRORHEADERINFO
        0x0900000001da5d78 SQLExecute2__FP17CLI_STATEMENTINFOP19CLI_ERRORHEADERINFO
        0x0900000001dd653c SQLExecute
        0x000000010006aebc FreePrepareExecute
        0x0000000100075570 ProcessTempTable
        0x0000000100074ed8 tbOpenX
        0x00000001000787e0 tbCreateTemp
        0x0000000100718f14 tocCreateNdmpToc
        0x00000001008a5ec8 DoBackup
        0x00000001008a3258 AdmBackupNode
        0x0000000100175520 AdmCommandLocal
        0x0000000100173b8c admCommand
        0x00000001005f42b8 SmExecScheduledCommand
        0x00000001005f3f78 smScheduledConsoleSession
        0x00000001005f35b0 CsRunCmdThread
        0x0000000100009bb4 StartThread
          ....
    
    Initial Impact:  High
    
    Additional Keywords:
    zz61 zz62  hung semop tsm
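
The diagnostic check described above (threads sitting in "semop", some creating an NDMP TOC and at least one running RUNSTATS) can be sketched as a small script. This is an illustrative sketch only, not an IBM tool: the function names parse_threads and classify_semop_waiters are hypothetical, and it assumes the "Show thread" output has been saved as text in the layout shown in this APAR, including frames where the symbol wraps onto the next line.

```python
import re

# Header of a thread entry, e.g. "Thread 64, Parent 1: RdbMonitorStatsThread, ..."
THREAD_HEADER = re.compile(r"^\s*Thread (\d+), Parent (\d+): (\w+)")
# A stack frame: a hex address, optionally followed by a symbol on the same line.
FRAME = re.compile(r"^\s*0x[0-9a-fA-F]+(?:\s+(\S+))?\s*$")


def parse_threads(text):
    """Split saved thread output into dicts holding id, name and stack symbols."""
    threads = []
    current = None
    pending_symbol = False  # True after an address line whose symbol wrapped
    for line in text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = {"id": int(header.group(1)),
                       "name": header.group(3),
                       "stack": []}
            threads.append(current)
            pending_symbol = False
            continue
        if current is None:
            continue
        frame = FRAME.match(line)
        if frame:
            symbol = frame.group(1)
            if symbol:
                current["stack"].append(symbol)
                pending_symbol = False
            else:
                pending_symbol = True
        elif pending_symbol and line.strip():
            # Wrapped symbol belonging to the previous address-only line.
            current["stack"].append(line.strip())
            pending_symbol = False
    return threads


def classify_semop_waiters(threads):
    """Group threads whose top frame is semop by the work they are doing."""
    result = {"runstats": [], "ndmp_toc": [], "other": []}
    for thread in threads:
        stack = thread["stack"]
        if not stack or stack[0] != "semop":
            continue
        if "db2Runstats" in stack:
            result["runstats"].append(thread["id"])
        elif "tocCreateNdmpToc" in stack:
            result["ndmp_toc"].append(thread["id"])
        else:
            result["other"].append(thread["id"])
    return result
```

If both the "runstats" and "ndmp_toc" lists are non-empty, the output matches the pattern described in this APAR. A thread waiting in semop is not by itself proof of this problem, so treat the result as a hint rather than a diagnosis.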
    

Local fix

  • . Limit the number of concurrent NAS backup processes.
    or
    . Run backups with TOC=NO until the fix is applied.
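
The two workarounds above can be sketched in a driver script that caps how many NAS backups run at once and issues them with TOC=NO. This is a minimal sketch under stated assumptions, not a supported procedure: build_backup_command, run_backups and run_command are hypothetical names, the command string assumes the usual "backup node <node> <filespace> toc=<value>" form and should be checked against the server documentation, and run_command is a caller-supplied function that submits the command to the server and blocks until the backup process ends.

```python
from concurrent.futures import ThreadPoolExecutor


def build_backup_command(node, filespace, toc="no"):
    """Build a BACKUP NODE command string with an explicit TOC setting."""
    toc = toc.lower()
    if toc not in ("no", "yes", "preferred"):
        raise ValueError(f"unsupported TOC value: {toc}")
    return f"backup node {node} {filespace} toc={toc}"


def run_backups(targets, run_command, max_concurrent=2, toc="no"):
    """Run one backup per (node, filespace) pair, at most max_concurrent at once.

    run_command is a hypothetical caller-supplied callable. It must block
    until the backup has finished, otherwise the cap has no effect.
    """
    commands = [build_backup_command(node, fs, toc) for node, fs in targets]
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # map() keeps results in input order; the pool size caps parallelism.
        return list(pool.map(run_command, commands))
```

Either measure alone matches the local fix described in this APAR. The sketch applies both by default, and passing toc="yes" or toc="preferred" keeps the table of contents while still limiting concurrency.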
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED: All Tivoli Storage Manager server users.     *
    ****************************************************************
    * PROBLEM DESCRIPTION: See error description.                  *
    ****************************************************************
    * RECOMMENDATION: Apply fixing level when available. This      *
    *                 problem is currently projected to be fixed   *
    *                 in levels 6.1.4.0 and 6.2.2. Note that this  *
    *                 is subject to change at the discretion of    *
    *                 IBM.                                         *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Sun Solaris, Linux, and
    Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC67566

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    61A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2010-04-06

  • Closed date

    2010-04-08

  • Last modified date

    2010-04-08

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R61A PSY

       UP

  • R61H PSY

       UP

  • R61L PSY

       UP

  • R61S PSY

       UP

  • R61W PSY

       UP

  • R62A PSY

       UP

  • R62H PSY

       UP

  • R62L PSY

       UP

  • R62S PSY

       UP

  • R62W PSY

       UP



Document information


More support for:

Tivoli Storage Manager

Software version:

61A

Reference #:

IC67566

Modified date:

2010-04-08
