IBM Support

IC99889: SERVER HANGS DURING DB2 RECOVERY

APAR status

  • Closed as program error.

Error description

  • In some situations the Tivoli Storage Manager server restarts DB2
    in an attempt to recover from an error. The server may hang if
    the restart hits a specific timing window. The window occurs
    when one session thread encounters a problem and restarts DB2.
    A second session then concludes that it also has a problem with
    DB2 and tries to restart it, which in turn may cause a third
    session to do the same. The result is a cascade of DB2 restarts
    that leaves the server in a hanging/looping condition,
    accompanied by an increase in the memory usage of the server
    process.
    The chance of hitting this timing window increases with the
    number of concurrent sessions on the server.
    
    NOTE: Restarting DB2 is not in itself a problem. It is a normal
    process that the server uses to recover from some errors. The
    problem occurs only when multiple sessions try to restart DB2
    and trigger the cascading effect.
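
The cascade described above can be illustrated with a small sketch. It is not the IBM fix and does not reflect the server's internal code; it only shows, under stated assumptions, one common way to let a single thread perform a restart while the others skip it. The class `RestartCoordinator`, the function `simulate`, and the generation counter are hypothetical names introduced for this illustration.

```python
import threading


class RestartCoordinator:
    """Hypothetical guard: at most one restart per observed failure."""

    def __init__(self, restart_fn):
        self._lock = threading.Lock()
        self._generation = 0          # incremented after each completed restart
        self._restart_fn = restart_fn
        self.restart_count = 0

    def current_generation(self):
        with self._lock:
            return self._generation

    def request_restart(self, seen_generation):
        """Restart only if no restart has completed since the caller
        observed the error. Returns True if this call did the restart."""
        with self._lock:
            if seen_generation != self._generation:
                # Another session already restarted; do not cascade.
                return False
            self._restart_fn()
            self._generation += 1
            self.restart_count += 1
            return True


def simulate(sessions=8):
    """Let several session threads hit the same error at the same time."""
    coordinator = RestartCoordinator(restart_fn=lambda: None)
    barrier = threading.Barrier(sessions)
    results = []
    results_lock = threading.Lock()

    def session():
        seen = coordinator.current_generation()
        barrier.wait()  # every session has seen the error before any restart
        performed = coordinator.request_restart(seen)
        with results_lock:
            results.append(performed)

    threads = [threading.Thread(target=session) for _ in range(sessions)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return coordinator.restart_count, results
```

In this sketch every session records the generation it saw when the error occurred. Only the first session to acquire the lock restarts; the others find a newer generation and return without restarting, so the number of restarts does not grow with the number of concurrent sessions.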
    
    
    Tivoli Storage Manager Versions Affected:
    Tivoli Storage Manager server version 6.3
    
    Customer/L2 Diagnostics (If Applicable)
    If the server hangs and the memory usage of the server process
    is high, force a core dump of the server process and review the
    dump file.
    The following Windows call stack shows the server performing
    the restart of DB2. At this point it is discarding a connection
    pool.
     1899  Id: 92c0.c85c Suspend: 0 Teb: 000007ff`fe15c000 Unfrozen
    Child-SP          RetAddr           Call Site
    00000002`622dcba8 00000000`7767e4e8
    ntdll!NtWaitForSingleObject+0xa
    00000002`622dcbb0 00000000`7767e3db
    ntdll!RtlpWaitOnCriticalSection+0xe8
    00000002`622dcc60 000007fe`f2b4f8fc
    ntdll!RtlEnterCriticalSection+0xd1
    00000002`622dcc90 000007fe`f30e0393 adsmdll!pkAcquireMutex+0x1c
    [c:\build\fix_builds\6340-bohm\nt\pkmonnt.c @ 1031]
    00000002`622dccc0 000007fe`f30e0d80
    adsmdll!RdbCloseConnection+0x213
    [c:\build\fix_builds\6340-bohm\rdb\dbiconn.c @ 2028]
    00000002`622dcd50 000007fe`f3160b00 adsmdll!DbiReleaseAll+0x300
    [c:\build\fix_builds\6340-bohm\rdb\dbiconn.c @ 904]
    00000002`622dcdc0 000007fe`f316cc58
    adsmdll!RdbDeactivateDatabase+0x150
    [c:\build\fix_builds\6340-bohm\rdbluw\rdbdb.c @ 1676]
    00000002`622dcee0 000007fe`f314226a adsmdll!RdbRestart+0x6b8
    [c:\build\fix_builds\6340-bohm\rdbluw\rdbinst.c @ 817]
    00000002`622dcf40 000007fe`f30dfbd7
    adsmdll!DbiEvalSQLOutcomeX+0x79a
    [c:\build\fix_builds\6340-bohm\rdb\dbieval.c @ 758]
    00000002`622dd5a0 000007fe`f30e1350
    adsmdll!RdbCreateConnection+0x457
    [c:\build\fix_builds\6340-bohm\rdb\dbiconn.c @ 1721]
    00000002`622dd660 000007fe`f30e2021
    adsmdll!DbiGetConnectionTracked+0x160
    [c:\build\fix_builds\6340-bohm\rdb\dbiconn.c @ 515]
    00000002`622dd6c0 000007fe`f30e2e84 adsmdll!AllocTxnDesc+0x91
    [c:\build\fix_builds\6340-bohm\rdb\dbitxn.c @ 1522]
    00000002`622dd750 000007fe`f30dd114 adsmdll!DbiParticipate+0xb4
    [c:\build\fix_builds\6340-bohm\rdb\dbitxn.c @ 1387]
    00000002`622dd7e0 000007fe`f2ba1877 adsmdll!tbOpenX+0x114
    [c:\build\fix_builds\6340-bohm\rdb\tbtbl.c @ 5010]
    00000002`622dd870 000007fe`f325210b
    adsmdll!admElBuildClientVectors+0x1e7
    [c:\build\fix_builds\6340-bohm\adm\admevent.c @ 1092]
    00000002`622dec30 000007fe`f34a1049
    adsmdll!smExecuteSession+0xe6b
    [c:\build\fix_builds\6340-bohm\sm\smexec.c @ 2675]
    00000002`622df900 000007fe`f2b46384 adsmdll!SessionThread+0x419
    [c:\build\fix_builds\6340-bohm\nt\tcpcomm.c @ 3511]
    00000002`622df990 00000000`72a41d9f adsmdll!startThread+0x124
    [c:\build\fix_builds\6340-bohm\nt\pkthread.c @ 3017]
    00000002`622df9d0 00000000`72a41e3b msvcr100!endthreadex+0x43
    00000002`622dfa00 00000000`7752652d msvcr100!endthreadex+0xdf
    00000002`622dfa30 00000000`7765c521
    kernel32!BaseThreadInitThunk+0xd
    00000002`622dfa60 00000000`00000000
    ntdll!RtlUserThreadStart+0x1d
    
    
    The DB2 restart causes the following error, which can be seen
    only in the dump or when the server is started in the
    foreground:
    ANR9999D_3702047661 FindPoolToUse(dbiconn.c:1299) Thread<419>:
    DBV->connPool for 0 entry is NULL.
    
    There will be many of these messages from different sessions.
    A call stack accompanies each ANR9999D message. It shows the
    server trying to determine which connection pool it can
    allocate a connection from while that pool is being discarded
    by the thread in the previous call stack.
     1807  Id: 92c0.a5c8 Suspend: 0 Teb: 000007ff`fe346000 Unfrozen
    Child-SP          RetAddr           Call Site
    00000002`09f0adb0 000007fe`f30dee13 adsmdll!outDiagfExt+0x11a
    [c:\build\fix_builds\6340-bohm\util\outvarg.c @ 223]
    00000002`09f0ae40 000007fe`f30e1317 adsmdll!FindPoolToUse+0xd3
    [c:\build\fix_builds\6340-bohm\rdb\dbiconn.c @ 1300]
    00000002`09f0aec0 000007fe`f30e2021
    adsmdll!DbiGetConnectionTracked+0x127
    [c:\build\fix_builds\6340-bohm\rdb\dbiconn.c @ 496]
    00000002`09f0af20 000007fe`f30e2e84 adsmdll!AllocTxnDesc+0x91
    [c:\build\fix_builds\6340-bohm\rdb\dbitxn.c @ 1522]
    00000002`09f0afb0 000007fe`f30dd114 adsmdll!DbiParticipate+0xb4
    [c:\build\fix_builds\6340-bohm\rdb\dbitxn.c @ 1387]
    00000002`09f0b040 000007fe`f2e6ffec adsmdll!tbOpenX+0x114
    [c:\build\fix_builds\6340-bohm\rdb\tbtbl.c @ 5010]
    00000002`09f0b0d0 000007fe`f31bbe98 adsmdll!imOpenFsQuery+0x7c
    [c:\build\fix_builds\6340-bohm\im\imfs.c @ 3189]
    00000002`09f0b150 000007fe`f31da1f2 adsmdll!SmDoFSQry+0x198
    [c:\build\fix_builds\6340-bohm\sm\smnode.c @ 23578]
    00000002`09f0bf80 000007fe`f324a7fc adsmdll!SmNodeSession+0x3952
    [c:\build\fix_builds\6340-bohm\sm\smnode.c @ 7386]
    00000002`09f0e160 000007fe`f32530a2
    adsmdll!HandleNodeSession+0x192c
    [c:\build\fix_builds\6340-bohm\sm\smexec.c @ 5454]
    00000002`09f0edb0 000007fe`f34a1049
    adsmdll!smExecuteSession+0x1e02
    [c:\build\fix_builds\6340-bohm\sm\smexec.c @ 3397]
    00000002`09f0fa80 000007fe`f2b46384 adsmdll!SessionThread+0x419
    [c:\build\fix_builds\6340-bohm\nt\tcpcomm.c @ 3511]
    00000002`09f0fb10 00000000`72a41d9f adsmdll!startThread+0x124
    [c:\build\fix_builds\6340-bohm\nt\pkthread.c @ 3017]
    00000002`09f0fb50 00000000`72a41e3b msvcr100!endthreadex+0x43
    00000002`09f0fb80 00000000`7752652d msvcr100!endthreadex+0xdf
    00000002`09f0fbb0 00000000`7765c521
    kernel32!BaseThreadInitThunk+0xd
    00000002`09f0fbe0 00000000`00000000
    ntdll!RtlUserThreadStart+0x1d
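
When many ANR9999D messages from different sessions are present, it can help to tally them by reporting location and thread. The following is a minimal sketch, assuming the messages have been captured to text from the dump or from a foreground server console. The regular expression is derived only from the single message shown above, and the function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# Pattern based on the one ANR9999D message quoted in this APAR;
# other messages may be laid out differently.
ANR9999D = re.compile(
    r"ANR9999D_\d+\s+(?P<func>\w+)"
    r"\((?P<file>[^:()]+):(?P<line>\d+)\)\s+"
    r"Thread<(?P<thread>\d+)>"
)


def summarize(text):
    """Count ANR9999D messages per (function, file, line) and collect
    the set of thread numbers that reported them."""
    per_site = Counter()
    threads = set()
    for m in ANR9999D.finditer(text):
        per_site[(m["func"], m["file"], int(m["line"]))] += 1
        threads.add(int(m["thread"]))
    return per_site, threads
```

A large count for `FindPoolToUse` spread over many distinct thread numbers is consistent with the cascading condition described in this APAR, although it does not prove it.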
    
    
    Initial Impact:
    High
    
    Additional Keywords:
    TSM abend core DBCONN
    

Local fix

  • If this problem occurs, restart the Tivoli Storage Manager
    server. The restart itself may run into trouble, because many
    sessions trying to connect at once can trigger the cascading
    effect again. Disable client sessions to allow the server to
    start. You can use server command DISABLE SESSION in the server
    option file. Re-enable the client sessions after the server
    starts successfully.
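
Re-enabling client sessions after a successful start can be scripted through the administrative command-line client. The sketch below only builds the argument list for `dsmadmc` with the `ENABLE SESSIONS CLIENT` server command; the helper name `build_dsmadmc_command` and the placeholder credentials are assumptions for illustration, and the option-file approach mentioned above is not covered here.

```python
def build_dsmadmc_command(admin_id, password, server_command):
    """Return the argument list for one dsmadmc invocation.

    admin_id and password are placeholders supplied by the caller;
    server_command is passed to the server as a single argument.
    """
    return [
        "dsmadmc",
        f"-id={admin_id}",
        f"-password={password}",
        server_command,
    ]


# Command that re-enables client sessions once the server is up.
ENABLE_CLIENT_SESSIONS = "enable sessions client"
```

The resulting list can be handed to `subprocess.run(...)` on a system where `dsmadmc` is installed and configured. Building the list separately keeps the credentials out of a shell command string.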
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users.                     *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 6.3.5 and 7.1.1. Note that   *
    * this is subject to change at the discretion of IBM.          *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IC99889

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    63W

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-03-06

  • Closed date

    2014-04-08

  • Last modified date

    2014-04-11

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63A PSY UP

  • R63H PSY UP

  • R63L PSY UP

  • R63S PSY UP

  • R63W PSY UP

  • R71A PSY UP

  • R71H PSY UP

  • R71L PSY UP

  • R71S PSY UP

  • R71W PSY UP



Document information

More support for: Tivoli Storage Manager

Software version: 63W

Reference #: IC99889

Modified date: 11 April 2014