Troubleshooting
Problem
Extent movement (EM), triggered by ALTER TABLESPACE tsname REDUCE MAX or LOWER HIGH WATER MARK, runs slower than expected.
Cause
- Table spaces containing many large binary objects (LOBs) or long fields (LFs) slow down the EM process
- Table spaces containing many objects slow down the EM process
- EM is heavily I/O bound. Running multiple EM operations (such as ALTER TABLESPACE tsname REDUCE MAX / LOWER HIGH WATER MARK) concurrently on multiple table spaces slows things down.
- Larger table spaces take longer to process
- EM is an online operation; concurrent workloads (for example, online backup, LOAD) running on the same table space negatively impact performance
- HADR systems may also be affected by IT27775: SLOW DATABASE ROLLFORWARD OR HADR REPLAY OF "REDUCE MAX" OPERATION
Diagnosing The Problem
Diagnostics to collect when experiencing slow EM:
For partitioned database systems with multiple partitions/nodes, collect diagnostics only on one of the nodes where EM is slow.
1) Get a list of EDU IDs performing the extent movement from the db2diag.log for the node.
Once you have the list of EDUs performing the extent movement, trace a few of them.
Example
2019-05-29-02.15.52.032218-240 E569263A504 LEVEL: Warning
PID : 19202166 TID : 14394 PROC : db2sysc 0
...
EDUID : 1201 EDUNAME: db2agent (FOO)
FUNCTION: DB2 UDB, buffer pool services, sqlbExtentMovementEntryPoint, probe:6870
DATA #1 : <preformatted>
Extent Movement started on table space 2
2019-05-29-02.15.52.032218-240 E569263A504 LEVEL: Warning
PID : 19202166 TID : 14000 PROC : db2sysc 0
...
EDUID : 1210 EDUNAME: db2agent (FOO)
FUNCTION: DB2 UDB, buffer pool services, sqlbExtentMovementEntryPoint, probe:6870
DATA #1 : <preformatted>
Extent Movement started on table space 1
2019-05-29-02.15.52.032218-240 E569263A504 LEVEL: Warning
PID : 19202166 TID : 15000 PROC : db2sysc 0
...
EDUID : 1532 EDUNAME: db2agent (FOO)
FUNCTION: DB2 UDB, buffer pool services, sqlbExtentMovementEntryPoint, probe:6870
DATA #1 : <preformatted>
Extent Movement started on table space 3
For large db2diag.log files, the db2diag tool can be used to filter for these entries:
db2diag -g 'funcname:=sqlbExtentMovementEntryPoint' db2diag.log
In this example, the db2sysc PID is 19202166.
The TIDs running the extent movements are 14394, 14000, and 15000.
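The PID and TIDs can also be pulled out of the db2diag.log programmatically. The helper below is a sketch, assuming the entry layout shown in the example above (PID/TID on one line, the FUNCTION line a few lines later); the function name and log file path are illustrative assumptions.

```shell
# Sketch: build the "PID.TID1.TID2..." mask for "db2trc -p" from db2diag.log
# entries mentioning sqlbExtentMovementEntryPoint. The entry layout is
# assumed from the example above.
em_trace_mask() {
  awk '
    /^PID/ { pid = $3; tid = $6 }          # "PID : 19202166 TID : 14394 ..."
    /sqlbExtentMovementEntryPoint/ && !(tid in seen) {
      seen[tid] = 1
      mask = mask "." tid
      last_pid = pid
    }
    END { print last_pid mask }
  ' "$1"
}

# Usage sketch:
# db2trc on -t -f trace.dmp -p "$(em_trace_mask db2diag.log)"
```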
1.1) db2trc on -t -f trace.dmp -p 19202166.14394.14000.15000
1.2) Wait 2 minutes, if possible, before stopping the trace. In past experience, 30 seconds of tracing grew trace.dmp to 5 GB.
1.3) db2trc off
1.4) Format the files:
db2trc fmt trace.dmp trace.fmt
db2trc flw -t -wc trace.dmp trace.flw
db2trc perfrep -g trace.dmp trace.perfrep
Note: older fix packs of v10.5, and previous versions, do not have the following options:
- "db2trc ... -wc": exclude the "-wc" parameter on older versions.
- "db2trc perfrep -g ...": skip this step on older versions.
1.5) Collect the files:
tar -cvf trace_em.tar trace.dmp trace.f* trace.perfrep
gzip trace_em.tar
2) Get some stacks for the same TIDs from step (1)
Here we use the EDU IDs from the db2diag.log for the agents doing the extent movement.
Example
db2pd -stack 1201 dumpdir=/tmp -repeat 5 3
db2pd -stack 1210 dumpdir=/tmp -repeat 5 3
db2pd -stack 1532 dumpdir=/tmp -repeat 5 3
This writes output to /tmp. The "-repeat 5 3" option dumps 3 sets of diagnostics into the same file, one set every 5 seconds.
3) Collect db2pd and iostat output
Collect db2pd output for 15 seconds:
db2pd -db dbName -edus -extentmovement -tcbstats -tablespaces -repeat 5 3 >> db2pd.out
Collect iostat output for 50 seconds:
iostat 5 10 >> iostat.out
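While the db2pd output is being collected, successive samples can give a rough progress estimate. The helper below is a sketch (not from the original article): it takes the "extents left" counter from two "db2pd -extentmovement" samples and computes throughput and a remaining-time estimate. The counter name and sampling method are assumptions; adjust them to your db2pd output.

```shell
# Sketch: estimate extent-movement throughput and remaining time from two
# snapshots of an "extents left" counter taken some seconds apart.
em_eta() {
  left1=$1; left2=$2; interval=$3   # extents left at t0, at t1, seconds between
  awk -v a="$left1" -v b="$left2" -v s="$interval" '
    BEGIN {
      rate = (a - b) / s
      if (rate <= 0) { print "no progress"; exit }
      printf "%.1f extents/s, ~%.0f s remaining\n", rate, b / rate
    }'
}

# Usage sketch: 10000 extents left, then 9000 extents left 50 seconds later
# em_eta 10000 9000 50
```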
4) Send Diagnostics to DB2 Support
trace_em.tar.gz
*.stack.txt from step (2)
db2pd.out
iostat.out
Resolving The Problem
The slowdown in sqloseekwrite64 propagates up to the higher-level calling functions. sqloseekwrite64 is a wrapper that calls the operating system's I/O APIs under the covers. An average of 116 ms is far too slow; these calls normally take a few milliseconds. The disk(s), RAID array, etc. that the table space containers map to should therefore be investigated. Output from "db2pd -db dbName -tcbstats -tablespaces" lists the containers used by the table space. The first perfrep table below shows the slow case; the second shows the same operation completing at expected speeds, for comparison.
Lvl FuncName                        TTime(ms)       HTime(ms)       LTime(ms)       AvgTime(ms)
--- ------------------------------- --------------- --------------- --------------- ---------------
0   sqlbMoveHighestExtent           307901.997      410.732         13.279          141.033
1   sqlbDMSMapAndWrite              253713.585      398.045         3.327           116.399
2   sqlbWriteBlocks                 253643.168      398.019         0.245           116.368
3   sqloWriteBlocks                 253637.901      398.017         0.243           116.366
4   sqloseekwrite64                 253628.203      398.013         0.241           116.362
1   sqlbDMSMapAndRead               45564.088       96.448          0.176           10.624
2   sqlbReadBlocks                  45450.010       96.423          0.158           10.599
3   sqloReadBlocks                  45432.948       96.419          0.155           10.595
1   sqlbDMSFreeOneExtentImmediatel  2393.442        98.778          0.401           1.537
1   sqldotblUpdate                  1822.617        129.509         0.972           65.208
2   sqldUpdateRow                   1822.205        129.498         0.966           65.186
3   sqldReplaceRow                  1822.129        129.496         0.963           65.185
4   sqlbWritePage                   1787.332        129.260         0.746           64.869
5   sqlbWritePageToDisk             1785.301        129.186         0.698           64.811
6   sqlbWriteBlocks                 1785.108        129.181         0.696           64.807
Lvl FuncName                        TTime(ms)       HTime(ms)       LTime(ms)       AvgTime(ms)
--- ------------------------------- --------------- --------------- --------------- ---------------
0   sqlbMoveHighestExtent           20074.239       100.872         6.625           14.760
1   sqlbDMSFreeOneExtentImmediatel  5846.448        34.783          1.517           4.299
1   sqlbDMSMapAndRead               4505.617        45.341          0.585           1.655
2   sqlbReadBlocks                  4397.987        45.314          0.567           1.616
3   sqloReadBlocks                  4380.444        42.959          0.566           1.609
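A formatted perfrep report can be scanned automatically for the I/O wrapper calls discussed above. The helper below is a sketch: the 10 ms threshold is an arbitrary assumption, and the column positions match the perfrep layout shown in this article.

```shell
# Sketch: flag I/O wrapper functions in a formatted "db2trc perfrep" report
# whose average call time exceeds a threshold (10 ms, an arbitrary choice).
# In the layout above, column 2 is FuncName and column 6 is AvgTime(ms).
slow_io_funcs() {
  awk -v limit=10 '
    $2 ~ /^(sqloseekwrite64|sqloWriteBlocks|sqloReadBlocks)$/ && $6 + 0 > limit {
      printf "%s avg %s ms\n", $2, $6
    }
  ' "$1"
}

# Usage sketch:
# slow_io_funcs trace.perfrep
```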
Document Location
Worldwide
Document Information
Modified date:
18 December 2020
UID
ibm10967141