
DB2 readlog API Slow Performance

Troubleshooting


Problem

When the DB2 readlog API performs slowly for Q-rep, CDC, or other applications during peak loads, where many log files are being filled and archived quickly, the cause may be that an internal memory buffer used to hold records read from the transaction log is too small.

Symptom

Perceived hang or slow execution when calling the DB2 readlog API

Diagnosing The Problem

The DB2 readlog API is the DB2 component used by replication utilities such as Q-rep and CDC. The readlog API reads the DB2 transaction log records and returns them to the caller. Replication software is usually a continual consumer of log records that is never done reading, so that it can stay up to date. In rare cases, when DB2 becomes very busy with insert/update/delete transactions, the number of log records increases and log files fill at a faster rate. If log records are generated quickly enough, the DB2 readlog API can fall behind and have difficulty keeping up, which in turn can delay the replication software from getting the log records it has requested.

Proper tuning of the log file size, log buffer size, and log archive methods should be completed first. Tuning these well-documented configuration parameters can alleviate most DB2 transaction log performance problems. Please consult with DB2 Support before implementing the registry variable change below.
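As an illustration only (the database name SAMPLE and the values shown are placeholders, not recommendations), the log-related database configuration parameters can be reviewed and adjusted from the command line on Linux and UNIX systems:

$ db2 get db cfg for SAMPLE | grep -i log
$ db2 update db cfg for SAMPLE using LOGFILSIZ 16384 LOGBUFSZ 2048

LOGFILSIZ and LOGBUFSZ are specified in 4 KB pages; appropriate values depend on the workload and should be chosen following the standard tuning documentation.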

The thread inside the DB2 engine process (db2sysc) that reads the log records is the db2shred thread. It reads the log records into a memory buffer and makes them available to the readlog API caller. By default this buffer is 8 MB of memory. When the buffer is full, the db2shred thread waits until the API caller has read the log records and freed memory before filling it with more log records. This can lead to burst-like log read behavior when log records that need to be read from disk are generated rapidly.
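The stacks referenced below are engine EDU call stacks. As a sketch of one way to gather them (the exact db2pd options can vary by DB2 version, so verify against your level), the EDU list and stack dumps can be collected several times, a minute or so apart:

$ db2pd -edus
$ db2pd -stack all

The first command lists the EDU names and IDs (for example, db2shred and the db2agent coordinator); the second writes stack trace files for all EDUs into the diagnostic directory.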

To identify this scenario, you will see the db2shred thread in this stack on many successive collections:

0x00007FFFEC7B74A5 _Z25ossDumpStackTraceInternalmR11OSSTrapFileiP7siginfoPvmm + 0x0385
0x00007FFFEC7B70AC ossDumpStackTraceV98 + 0x002c
0x00007FFFEC7B317D _ZN11OSSTrapFile6dumpExEmiP7siginfoPvm + 0x00fd
0x00007FFFF2CE3FF4 sqlo_trce + 0x0404
0x00007FFFF2D6E0FE sqloDumpDiagInfoHandler + 0x010e
0x0000003AAD60F7E0 __PRETTY_FUNCTION__.6093 + 0x0070
0x0000003AAD2EB16A semtimedop + 0x000a
0x00007FFFF0CCB4CE sqloWaitEDUWaitPost + 0x02ae
0x00007FFFF2E92358 _Z20sqlpshrLogrecBuffGetP17SQLPSHR_WORKER_CBlPP16SQLPR_LOGREC_DISb + 0x01a8
0x00007FFFF2E9DDBE _Z10sqlpshrEduP8sqeAgentP5sqlcaPv + 0x453e
0x00007FFFF211E9BC _Z26sqleSubCoordProcessRequestP8sqeAgent + 0x010c
0x00007FFFEFDA68B7 _ZN8sqeAgent6RunEDUEv + 0x0c27
0x00007FFFF15905B7 _ZN9sqzEDUObj9EDUDriverEv + 0x00f7
0x00007FFFF0D5BE73 sqloEDUEntry + 0x0303
0x0000003AAD607AA1 pthread_getschedparam + 0x0061
0x0000003AAD2E8AAD clone + 0x006d

You will also see the db2agent coordinator thread for the application that is calling the readlog API in a stack similar to this:

0x00007FFFEC7B74A5 _Z25ossDumpStackTraceInternalmR11OSSTrapFileiP7siginfoPvmm + 0x0385
0x00007FFFEC7B70AC ossDumpStackTraceV98 + 0x002c
0x00007FFFEC7B317D _ZN11OSSTrapFile6dumpExEmiP7siginfoPvm + 0x00fd
0x00007FFFF2CE3FF4 sqlo_trce + 0x0404
0x00007FFFF2D6E0FE sqloDumpDiagInfoHandler + 0x010e
0x0000003AAD60F7E0 __PRETTY_FUNCTION__.6093 + 0x0070
0x0000003AAD60F00D __nanosleep + 0x002d
0x00007FFFEC79D0B1 ossSleep + 0x0051
0x00007FFFF0CE8268 sqlorest + 0x00f8
0x00007FFFF2EAB9DD _Z24sqlpshrLogMergeIterationP8sqeAgentP17SQLPSHR_MASTER_CBmPmS3_S3_ + 0x050d
0x00007FFFF2EABBA3 _Z15sqlpshrLogMergeP8sqeAgentP17SQLPSHR_MASTER_CBPm + 0x0113
0x00007FFFF2EAC12E _Z15sqlpshrScanNextPvP8sqeAgentmPP16SQLPR_LOGREC_DISPm + 0x032e
0x00007FFFF2DEB93D _Z22sqlpgReadLogReadActionP8sqeAgentP14sqlpMasterDbcbjP15SQLP_READLOG_CBP17SQLP_READLOG_INFOmPc + 0x037d
0x00007FFFF2DEE7DD _Z15sqlpgReadLogAPIP8sqeAgentjP6db2LRIS2_S2_mmPcP17SQLP_READLOG_INFOP5sqlca + 0x0acd
0x00007FFFF2EE0E4D _Z22sqlpReadLogInternalAPIP8sqeAgentjP6db2LRIS2_S2_PcjP17SQLP_READLOG_INFOjP5sqlca + 0x00dd
0x00007FFFF2EE1495 _Z15sqlpReadLogDRDAP5sqldaS0_P8sqeAgentP5sqlca + 0x0465
0x00007FFFEFFC420D _Z19sqlerKnownProcedureiPcPiP5sqldaS2_P13sqlerFmpTableP8sqeAgentP5sqlca + 0x076d
0x00007FFFEFFC5A9D _Z11sqlerCallDLP14db2UCinterfaceP9UCstpInfo + 0x06fd
0x00007FFFF3C7E29E _Z19sqljs_ddm_excsqlsttP14db2UCinterfaceP13sqljDDMObject + 0x09be
0x00007FFFF3C7D41E _Z21sqljsParseRdbAccessedP13sqljsDrdaAsCbP13sqljDDMObjectP14db2UCinterface + 0x007e
0x00007FFFF00ACECB _Z10sqljsParseP13sqljsDrdaAsCbP14db2UCinterfaceP8sqeAgentb + 0x036b
0x00007FFFF00A7400 _Z18sqlnx_exp_DPLSTPRTP23sqlnx_exp_lolepop_parmsiP16sqlno_dpart_elimP11sqlno_tablePjPPcPi + 0x0430
0x00007FFFF00A592C _Z26sqlnx_exp_bexps_xpath_leafP9sqlnq_pidj17sqlnq_storageTypeP13sqlnq_memPoolPjPc + 0x003c
0x00007FFFF00A2979 _Z31sqlnx_exp_process_qun_int_propsP23sqlnx_exp_lolepop_parmsP17sqlno_access_pathP9sqlnq_qunb + 0x02a9
0x00007FFFF00A256B _Z17sqljsDrdaAsDriverP18SQLCC_INITSTRUCT_T + 0x00eb
0x00007FFFEFDA6948 _ZN8sqeAgent6RunEDUEv + 0x0cb8

When the db2agent and db2shred stacks shown above are observed on several successive stack collections and the readlog API is falling behind, the cause could be the size of the memory buffer used by the log shredder, which is 8 MB by default.

Resolving The Problem

This memory buffer can be increased with an undocumented registry variable. The DB2_DPS_RLOG_SHR_MEMBYTES registry variable takes a size in bytes to use for the log shredder memory buffer. Increasing this size allows the db2shred thread to read more log records from the log file into the memory buffer. With a large enough buffer, the db2shred thread rarely runs out of buffer space, and the readlog API caller can read and free enough buffers to keep the shredder moving without delays. Thus, increasing this memory buffer size decreases the likelihood that the log shredder falls far behind when the DB2 database is under a heavy insert/update/delete load.

To set this variable to 32 MB, for example:

$ db2set DB2_DPS_RLOG_SHR_MEMBYTES=33554432
$ db2set -all
[i] DB2_DPS_RLOG_SHR_MEMBYTES=33554432
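Note that, as with many DB2 registry variables, the new value may not take effect until the instance is recycled. If the verification message described below does not appear after the change, restarting the instance during a maintenance window (confirming the procedure with DB2 Support first) is a reasonable next step:

$ db2stop
$ db2start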

Then, when the DB2 readlog API is called, a message similar to the one below is written to the db2diag.log, verifying that the size set in the registry variable is in effect.

2017-04-06-23.50.21.109727-240 E867195E546 LEVEL: Info
PID : 2692 TID : 140233538529024 PROC : db2sysc
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-3221 APPID: *LOCAL.db2inst1.170407041828
AUTHID : DB2INST1 HOSTNAME: myhost
EDUID : 6686 EDUNAME: db2agent (SAMPLE)
FUNCTION: DB2 UDB, data protection services, sqlpgReadLogCalculateShredderScanMemory, probe:19180
DATA #1 : <preformatted>
DB2_DPS_RLOG_SHR_MEMBYTES=33554432 in effect
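One way to confirm that the message was logged is to search the db2diag.log output with the db2diag tool, for example:

$ db2diag | grep DB2_DPS_RLOG_SHR_MEMBYTES
DB2_DPS_RLOG_SHR_MEMBYTES=33554432 in effect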

This memory is allocated from the database heap (DBHEAP). If DBHEAP is configured very low, it should be increased to accommodate the additional memory given to the log shredder.
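As a sketch (the database name SAMPLE is a placeholder), the current DBHEAP setting can be checked and, if necessary, increased or set to AUTOMATIC:

$ db2 get db cfg for SAMPLE | grep -i dbheap
$ db2 update db cfg for SAMPLE using DBHEAP AUTOMATIC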

[{"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Component":"--","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF016","label":"Linux"},{"code":"PF033","label":"Windows"}],"Version":"9.7;10.1;10.5;11.1","Edition":"Advanced Enterprise Server;Advanced Workgroup Server;Enterprise Server;Express;Express-C;Workgroup Server","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg22003150