IBM Support

IT14533: 7.1.5 SERVER PERFORMANCE DEGRADATION, SYSTEM HANG, INSTANCE CRASH DUE TO DB2 APAR IT14357.

APAR status

  • Closed as program error.

Error description

  • Upgrading an IBM Spectrum Protect/Tivoli Storage Manager server
    to 7.1.5 upgrades DB2 10.5 to FP7, which exposes the server to
    DB2 APAR IT14357 and its effect on server performance.
    Symptoms of the degradation include client connections that
    take a long time to complete, administrative command-line
    queries that take a long time to complete (if they complete at
    all), and client sessions that show a state of "started" (if
    the session output returns at all).
    
    Top output from the reporting Linux system showed load averages
    in the 2000+ range on a 24-CPU machine. It also showed the
    db2sysc process for the server instance at 2000%+ CPU usage.
    Performance continued to degrade over time, to the point of
    hanging the whole Linux system in a few cases and crashing the
    instance in others.
    Pstack output for the dsmserv pid shows half or more of the
    server thread stack traces waiting in DB2 code after passing
    through sqljrDrdaArDisconnect:
    Thread 2095 (Thread 0x7ffd77b3d700 (LWP 18026)):
    #0 0x00007ffff7731b5c in recv () from /lib64/libpthread.so.0
    #1 0x00007ffff43f1858 in tcprecv(SQLCC_COMHANDLE_T*, int,
       char*, int, unsigned short, unsigned short,
       SQLCC_TCPCONNHANDLE_T*, SQLCC_COND_T*, unsigned int, int*) ()
      from /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #2 0x00007ffff43f08e3 in sqlcctcprecv(SQLCC_COMHANDLE_T*,
       SQLCC_COND_T*) () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #3 0x00007ffff43e9de0 in sqlccrecv () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #4 0x00007ffff448e189 in sqljcReceive(sqljCmnMgr*) () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #5 0x00007ffff4db8fc3 in sqljrDrdaArDisconnect(db2UCinterface*)
       () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #6 0x00007ffff4434528 in sqleUCdisconnect () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #7 0x00007ffff443b8d5 in sqleUCappConnectReset () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #8 0x00007ffff429c653 in CLI_sqlDisconnect(CLI_CONNECTINFO*,
       sqlca*, CLI_ERRORHEADERINFO*) () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #9 0x00007ffff427e833 in SQLDisconnect2(CLI_CONNECTINFO*) ()
      from /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #10 0x00007ffff427deb1 in SQLDisconnect () from
       /opt/hd/db/db2/instance/tsmprs30/sqllib/lib64/libdb2.so.1
    #11 0x0000000000b96f75 in RdbCloseConnection ()
    #12 0x0000000000b97f59 in DbiReleaseConnectionEx ()
    #13 0x0000000000b9b7f9 in FreeTxnDesc ()
    #14 0x0000000000b9c97e in dbiEndTxn ()
    #15 0x0000000000f22fc1 in tmDiscardTxn ()
    #16 0x0000000001035a84 in FinishThread ()
    #17 0x000000000103695c in StartThread ()
    #18 0x00007ffff772a9d1 in start_thread () from
        /lib64/libpthread.so.0
    #19 0x00007ffff39188fd in clone () from /lib64/libc.so.6
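
The "half or more of the server thread stack traces" observation can be checked quickly from saved pstack output. The following is a minimal sketch, not an IBM-supplied tool: the function name count_threads_in is hypothetical, and it assumes that each stack begins with a "Thread N (...)" header line as in the trace above.

```shell
#!/bin/sh
# Count threads in pstack output whose stack passes through a given function.
# Reads pstack text on stdin and prints "<matching threads> <total threads>".
# count_threads_in is a hypothetical helper name, not part of any IBM tooling.
count_threads_in() {
    # $1: function name to look for (defaults to sqljrDrdaArDisconnect)
    awk -v fn="${1:-sqljrDrdaArDisconnect}" '
        /^[ \t]*Thread [0-9]+ / {
            # A new thread header closes the previous stack, if any.
            if (inthread && hit) matched++
            total++; inthread = 1; hit = 0
            next
        }
        # Plain substring match on any frame line of the current stack.
        inthread && index($0, fn) { hit = 1 }
        END {
            if (inthread && hit) matched++
            print matched + 0, total + 0
        }
    '
}
```

For example, after saving the output of pstack for the dsmserv pid to a file, `count_threads_in < pstack.out` prints the two counts. A first number that is half or more of the second matches the pattern described here.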
    
    Tivoli Storage Manager Versions Affected: 7.1.5 all platforms
    
    Customer/L2 Diagnostics (If Applicable)
    Initial Impact: High
    
    Additional Keywords: TSM slow hang hung unresponsive
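
To put the load figures in the error description in perspective, the load average can be normalized by the CPU count. The following is a small sketch. The helper name load_per_cpu is hypothetical, and the usage line assumes a Linux system with /proc/loadavg and the coreutils nproc command.

```shell
#!/bin/sh
# Normalize a load average by the CPU count and print it with one decimal.
# load_per_cpu is a hypothetical helper name used for illustration only.
load_per_cpu() {
    # $1: load average, $2: number of CPUs
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (cpus <= 0) exit 1          # refuse to divide by zero
        printf "%.1f\n", load / cpus
    }'
}

# Live usage on Linux (1-minute load average, online CPU count):
#   load_per_cpu "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)"
```

With the values reported above, `load_per_cpu 2000 24` prints 83.3, that is, roughly 83 runnable or waiting tasks per CPU. A healthy system usually stays near or below 1.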
    

Local fix

  • As documented in DB2 APAR IT14357, execute the following db2 and
    db2pd commands as the DB2 instance owner after connecting to the
    database:
    
    db2 connect to tsmdb1
    db2pd -db tsmdb1 -dmpevrec comp=SQLRA name=sqlraMED mask=0
    db2pd -db tsmdb1 -dmpevrec comp=SQLRA name=sqlraLOW mask=0
    
    This disables logging of the rollback event in the flight
    recorder trace and removes the delay. However, the commands
    must be rerun after every recycle of the database
    (deactivate/activate, or a recycle of the instance).
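
Because the workaround does not survive a database or instance recycle, it may help to keep the three commands in a small script that is run as the DB2 instance owner after each restart. The following is a sketch under stated assumptions: the DB_NAME and DRY_RUN variables and the run and apply_workaround function names are illustrative and not part of the APAR text, and whether db2pd returns a nonzero exit status on failure has not been verified here.

```shell
#!/bin/sh
# Sketch: reapply the DB2 APAR IT14357 workaround after a database recycle.
# Run as the DB2 instance owner. DB_NAME and DRY_RUN are illustrative
# variables introduced for this example only.
DB_NAME="${DB_NAME:-tsmdb1}"

run() {
    # In dry-run mode print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

apply_workaround() {
    # Same commands as listed above, in the same order.
    run db2 connect to "$DB_NAME" || return 1
    for name in sqlraMED sqlraLOW; do
        run db2pd -db "$DB_NAME" -dmpevrec comp=SQLRA name="$name" mask=0 || return 1
    done
}
```

Setting DRY_RUN=1 before calling apply_workaround prints the commands without running them, which is a safe way to confirm the database name first.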
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users.                     *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 7.1.5.200 and 7.1.6. Note    *
    * that this is subject to change at the discretion of IBM.     *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT14533

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    71L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-03-30

  • Closed date

    2016-04-18

  • Last modified date

    2016-05-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R71A PSY

       UP

  • R71H PSY

       UP

  • R71L PSY

       UP

  • R71S PSY

       UP

  • R71W PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 7.1.3

Reference #: IT14533

Modified date: 02 May 2016