IBM Support

IT21049: UNRESPONSIVE SERVER AFTER RUNNING PROTECT STGPOOL WITH TYPE=LOCAL DUE TO ORPHANED DATABASE TRANSACTIONS AND CONNECTIONS

APAR status

  • Closed as program error.

Error description

  • Running PROTECT STGPOOL with TYPE=LOCAL results in orphaned
    database transactions, and the associated database connections
    are also orphaned.  This can cause the server to appear hung
    because new sessions cannot be started.
    Customer/L2 Diagnostics:
    The servermon.pl script can be used to collect data for this
    issue.
    In the SHOW TXNT output, the following will be seen:
    Tsn=0:8813831, Resurrected=False, InFlight=True,
    Distributed=False, Persistent=True, Addr 00000072F4AA1220
      Start ThreadId=296, Timestamp=05/23/2017 20:00:49,
    Creator=sdrepl.c(12101)
      Last known in use by ThreadId=296
      Participants=1, summaryVote=ReadOnly
      EndInFlight False, endThreadId 0, tmidx 0, processBatchCount
    0, mustAbort False.
        Participant DB: voteReceived=False, ackReceived=False
          DB: Txn 0000007355076A80, ReadOnly(YES),
    connP=000000732C9B1D10, applHandle=5479, openTbls=2:
          DB: --> OpenP=000000720533E160 for table=GUIL2.MaintHist.
          DB: --> OpenP=0000007356170F60 for table=Activity.Summary.
          DB: --> RegSqlId=0x0F0000A9 SELECT for
    table=SD.Chunk.Copies, executed(Yes).
    The key things in this output are that the RegSqlId for the
    SD.Chunk.Copies table is 0x0F0000A9 and that the Start ThreadId
    in the output does not exist in the SHOW THREADS output.
    The actlog should also show a local PROTECT STGPOOL issued at
    about the same time:
    05/23/2017 20:00:48 ANR2017I Administrator ADMIN1 issued
    command: PROTECT STGPOOL STGPOOL1 Type=Local RECLaim=YESLIMited
    (SESSION: 122102)
    05/23/2017 20:00:48 ANR0984I Process 1840 for PROTECT STGPOOL
    (SUMMARY) started in the BACKGROUND at 20:00:48. (SESSION:
    122102, PROCESS: 1840)
    05/23/2017 20:00:49 ANR0984I Process 1841 for PROTECT STGPOOL
    (WORKER) started in the BACKGROUND at 20:00:49. (SESSION:
    122102, PROCESS: 1841)
    05/23/2017 20:00:49 ANR0985I Process 1841 for PROTECT STGPOOL
    (WORKER) running in the BACKGROUND completed with completion
    state SUCCESS at 20:00:49. (SESSION: 122102, PROCESS: 1841)
    05/23/2017 20:00:49 ANR4000I The protect storage pool process
    for STGPOOL1 on server TSMSERVER1 to STGPOOL2 on server
    TSMSERVER1 is complete.
     Extents protected: 0 of 0. Extents failed to protect: 0.
     Extents deleted: 0 of 0. Extents failed to delete: 0. Extents
    moved: 0 of 0. Extents failed to move: 0. Amount moved: 0 bytes
    of 0 bytes. Amount failed to move: 0 bytes. Elapsed time: 0
    Days, 0 Hours, 1 Minutes. (SESSION: 122102, PROCESS: 1840)
    05/23/2017 20:00:49 ANR0985I Process 1840 for PROTECT STGPOOL
    (SUMMARY) running in the BACKGROUND completed with completion
    state SUCCESS at 20:00:49. (SESSION: 122102, PROCESS: 1840)
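
The two checks described above (the 0x0F0000A9 SELECT on SD.Chunk.Copies, and a Start ThreadId that is absent from SHOW THREADS) can be sketched as a small script. This is only an illustration, not an IBM-supplied tool: the function name find_orphan_candidates is hypothetical, the regular expressions are based solely on the SHOW TXNT excerpt in this APAR, and the set of live thread IDs is assumed to be extracted from SHOW THREADS by the caller, because the layout of that output is not shown here.

```python
import re

# Patterns derived only from the SHOW TXNT excerpt in this APAR (assumption:
# other server levels may format these fields differently).
TSN_RE = re.compile(r"Tsn=(\d+:\d+)")
START_THREAD_RE = re.compile(r"Start ThreadId=(\d+)")
CHUNK_COPIES_RE = re.compile(
    r"RegSqlId=0x0F0000A9\s+SELECT\s+for\s+table=SD\.Chunk\.Copies"
)


def find_orphan_candidates(show_txnt_text, live_thread_ids):
    """Return (tsn, start_thread_id) pairs that match the APAR signature.

    live_thread_ids is a set of integer thread IDs that the caller has
    taken from SHOW THREADS output.
    """
    # Collapse whitespace so that fields wrapped across lines are rejoined.
    flat = " ".join(show_txnt_text.split())
    # Start a new block at every "Tsn=<n>:<n>" marker.
    blocks = re.split(r"(?=Tsn=\d+:\d+)", flat)
    candidates = []
    for block in blocks:
        tsn = TSN_RE.search(block)
        thread = START_THREAD_RE.search(block)
        if not tsn or not thread:
            continue
        thread_id = int(thread.group(1))
        # Signature: the SD.Chunk.Copies SELECT is present and the thread
        # that started the transaction no longer exists.
        if CHUNK_COPIES_RE.search(block) and thread_id not in live_thread_ids:
            candidates.append((tsn.group(1), thread_id))
    return candidates
```

Passing the SHOW TXNT text collected by servermon.pl together with the thread IDs seen in SHOW THREADS would list the transactions worth a closer look. A match is only a candidate and should still be checked against the actlog entries shown above.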
    
    
    The orphaned transactions and connections can cause the 9600
    connection limit for database connections to be reached. This
    makes the server unresponsive to new sessions, and the server
    can be perceived as being hung.
    
    Instances of the following message may be seen, indicating
    that the 9600 connection limit for database connections has been
    reached.
    
    The 'db2 get snapshot for database on tsmdb1' command shows the
    high water mark for database connections. This value can be
    monitored to see whether it continually increases and how close
    to the limit it is:
    High water mark for connections            = 1011
    The 'db2 list applications show detail' command shows the
    current connections, from which the number of connections can be
    determined.  A high number of connections in 'UOW Waiting' state
    can be indicative of the orphaned connections.
    Initial Impact: High
    Tivoli Storage Manager Versions Affected: 7.1.7 and higher and
    all 8.1.x servers
    Additional Keywords:
    hang
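
The connection monitoring described above can be sketched as follows, assuming the output of the two db2 commands has already been saved as text. The helper names are hypothetical, the 9600 limit is the figure quoted in this APAR, and the 'UOW Waiting' count simply matches that status string on each line rather than parsing the columns of the listing.

```python
import re

# Connection limit quoted in this APAR.
CONNECTION_LIMIT = 9600

HWM_RE = re.compile(r"High water mark for connections\s*=\s*(\d+)")


def connection_high_water_mark(snapshot_text):
    """Return the high water mark from database snapshot output, or None."""
    match = HWM_RE.search(snapshot_text)
    return int(match.group(1)) if match else None


def count_uow_waiting(list_applications_text):
    """Count output lines whose status text contains 'UOW Waiting'."""
    return sum(
        1
        for line in list_applications_text.splitlines()
        if "UOW Waiting" in line
    )


def headroom(high_water_mark, limit=CONNECTION_LIMIT):
    """Return how many connections remain before the limit is reached."""
    return limit - high_water_mark
```

Running these helpers at intervals and comparing the results would show whether the high water mark keeps climbing after each PROTECT STGPOOL TYPE=LOCAL run. A steadily growing 'UOW Waiting' count is a hint of orphaned connections, not proof.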
    

Local fix

  • Avoid running PROTECT STGPOOL TYPE=LOCAL frequently.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server and Spectrum Protect       *
    * server users.                                                *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in level 7.1.9 and 8.1.5. Note that    *
    * this is subject to change at the discretion of IBM.          *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms: AIX, Solaris, Linux, and Windows
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT21049

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    81W

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-06-15

  • Closed date

    2017-12-08

  • Last modified date

    2017-12-08

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R81A PSY

       UP

  • R81L PSY

       UP

  • R81W PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 81W

Reference #: IT21049

Modified date: 08 December 2017