II14743: DB2 RECOMMENDED APARS AND TUNING FOR DB2 INSERT PERFORMANCE

Subscribe to this APAR

By subscribing, you receive periodic emails alerting you to the status of the APAR, along with a link to the fix after it becomes available. You can track this item individually or track all items by product.

Notify me when this APAR changes.

Notify me when an APAR for this component changes.

APAR status

  • INTRAN

Error description

  • DB2 recommended APARs and tuning for DB2 INSERT performance
    This information apar is opened for recommended APARs and tuning
    to improve DB2 INSERT performance based on recent observations
    from analysis and solution to customers using SAP application.
    Those recommendation should also apply to general customers.
    ----------------------------------------------------------------
    Recommended APARs:
    PM91835
    condition: UTS and single table classic segmented non Member
    Cluster tablespace with either versioned alters or dictionary
    build by insert
    symptom: loop, excessive getpages during insert
    problem: When searching the available data page within a space
    map page, if it reaches the allowable locking failure threshold
    or page false lead threshold, the current space map page is
    skipped to enhance the space search performance.   When
    determining the next space map page to search the last physical
    page is used to find the next anchor of the segment. However,
    if the last page of the current space map page is a system page
    then it is incorrectly used to determine  the anchor linkage of
    the next segment.  As the result, looping through visiting the
    same segment within the space map page may occur. This
    enhancements only applies to the classic segmented table space
    with a single table or a Universal table space.
    .
    PM93688
    condition: UTS with Member Cluster
    symptom: excessive getpages
    problem: During the space search process of INSERT operation,
    a relative segment was used instead of the absolute segment
    number when locating the space map page covered by the target
    segment to search for space. As the result, the same space map
    page was searched repeatedly and caused additional get page
    counts.
    .
    PM97867
    condition: UTS with insert and delete activity
    symptom: excessive getpages
    problem:  When deleted space distributes through the table
    space, the subsequent insert process could result in high get
    page  count. This is due to the false lead threshold that was
    increased to improve space reuse within the candidate segment
    or during the exhaustive search prior to physical extend.
    However, the extra page search does not fit for the situation
    where the data page has unusable free space. As a result, the
    get page count would increase by visiting more pages for each
    .
    PM99792
    condition: UTS with insert & delete activity, Page Level
    Locking (PLL) and long running uncommitted UR  in >= V10 NFM
    symptom: excessive getpages
    problem: Under V10 or V11 NFM with heavy concurrency of
    insert and delete operations on a UTS table space, insert
    performance can be severely degraded if there exists a long
    running UR without  performed a single commit.  This
    long-running UR can affect the INSERT performance even if
    accesses other tables only. This long running UR could delay the
    advance of commit lsn and therefore make all of deleted and
    committed space become unavailable for reuse.
    .
    PI04942
    condition: UTS non-Member Cluster 256G DSSIZE with > 214G
    data,  and Member Cluster with > 5.1G data
    symptom: severe insert performance degradation. possible
    DSNISGSC:5001
      problem: When a table contains larger quantity of data rows on
    UTS, DB2  mistakenly calculated a wrong segment number due to
    the use of  smaller integer fields.  This wrong segment
    calculation would  degrade the overall insert performance.
    .
    PI06926 PI12398
    condition: UTS
    symptom: excessive getpages
    problem: Slow performance in consecutive INSERT including
    Multiple Row INSERT in the same commit scope because the DB2
    tries to reuse space efficiently by visiting same set of
    possible available deleted space. However, the uncommitted
    delete space prevent space reuse and cause long space search and
    further degrade performance.
    .
    PI07513
    condition: GBP Dependent objects
    symptom: Possible high other write I/O for insert or other sql
    update
    problem:  Over-detection of the Vertical Deferred Write Queue
    Threshold(VDWQT) caused by the presence of clean pages on the
    Virtical Deferred Write Queue(VDWQ). VDWQT writes are triggered
    based on the count of pages on the VDWQ, dirty of not.  It's
    always been this way, and there's always been a possibility of
    clean pages on the VDWQ, but it became much more likely in V10
    with Member Cluster. Possible high other write I/O for insert or
    other SQL update the page when the object is GBP-dependent. In
    some case, OTHER WRITE I/O is possibly reported as OTHER READ
    I/O incorrectly.
    .
    PI11661
    condition: particularly for LOBs but possibly for other tables
    as well
    symptom: excessive getpages and high cpu utilization
    problem: The ability to reuse deleted space for LOBs, or for
    other tables using static scrollable cursors (or sharing a
    buffer pool with LOBs or tables using static scrollable
    cursors), is limited by  the value of the global read-LSN.  For
    the situation where read claims are continuously coming and
    going, but there is never a point in time where no read claims
    are held at all, DB2 may
    fail to advance the global read-LSN value for long periods of
    time even though the local table-level and buffer pool values
    are moving forward.
    .
    ----------------------------------------------------------------
    Recommended tuning:
    1. Separate tablespaces without LOB columns from bufferpool of
    UTS tablespaces with LOB columns.
    condition: any tablespace with considerable  insert/delete
    workload sharing a bufferpool in Data Sharing environment with
    UTS that has LOB columns if there are long running read
    transactions against any object in the bufferpool.
    symptom: excessive locks and getpages per insert if PLL
    locking, excessive getpages per insert  if Row Level
    Locking(RLL), especially evident with MAXROW(1)
    .
    Separate tablespaces without LOB columns from bufferpool where
    UTS tablespaces with LOB columns reside to avoid being impacted
    by read claim tracking for UTS tablespaces with LOB columns
    because V10 NFM added tracking by read Log Sequence Number(LSN)
    for UTS tablespaces that have LOB or XML column.
    Prior to V10 NFM, DB2 tracked oldest read LSN for LOB and XML
    tablespace for space reuse purposes. In datasharing the read
    claim tracking affects all objects within a bufferpool, causing
    DB2 to preserve deleted rows in case Sensitive Static
    Scrollable(SSS) cusrors area active.
     V10 NFM added tracking read LSN for UTS tablespace with LOB
    column for Sensitive Static Scrollable(SSS) cursors.  If read
    LSN is pegged by a long running reader, data rows deleted since
    the reader started will not be reused, even if the delete is
    committed.
    A message DSNB260 reporting a long running read claim will show
    up (see Zparm LRDRTHLD).
    .
    2. Frequent commit for any long running uncommitted UR and clean
    up any indoubt threads
    condition: under long running uncommitted UR or indoubt thread
    existing
    symptom: excessive getpages
    .
     In V10 UTS with PLL, a page lock proves committedness or
    ownership, but V10 has to determine the committedness of the
    page by checking a Commit  Log Sequence Number (CLSN), because
    V10 insertions into UTS objects may additionally not reuse space
    freed by their own uncommitted work for rollbackability.  If
    this checking fails due to any long running uncommitted UR in
    order to reuse any deleted space on the page, V10 treats the
    page it is holding as not committed, it will skip this page
    without reusing any of the deleted space on the page.  PM99792
    improved the reuse of space when V10 has a pseudo delete rows.
     A message DSNR035I reporting long running uncommitted UR will
    show up (see Zparm URCHKTH).
     In V10 UTS with RLL, the presence of long running URs hampers
    DB2's ability to detect if space is committed or not. This is
    true in all releases, but V10 UTS handles better than before V10
    NFM.
     In before V10 UTS with PLL,  possession of the page lock proved
    committedness or ownership and allowed all space to be used. So
    changing RLL to PLL is recommended, because RLL need CLSN for
    committedness.
    .
    ----------------------------------------------------------------
    Recommended documents for insert performance degradation to
    figure out the root cause:
    1. DDL including database, table space, table and index
    definition.
    2. Console dump with
    jobname=(ssnmDBM1,ssnmMSTR,ssnmIRLM,ssnmDIST) on the right
    member while an insert is going on.
    Sample dump operation:
      DUMP COMM=(any comment)
      reply to prompts;
      JOBNAME=(ssnmDBM1,ssnmMSTR,ssnmIRLM,ssnmDIST),CONT
      SDATA=(RGN,CSA,LSQA,SQA,PSA,SWA,LPA,TRT,GRSQ,SUM,ALLNUC),END
         *Where ssnm is the DB2 subsystem name
         *MAXSPACE 8000M minimum for DB2 V9,
          V10 and above would require 16000M+ MAXSPACE
          DISPLAY: D D,OPTIONS  to verify the option
    .
    3. DB2 performance trace for 30 seconds to a few minutes by
      -STA TRACE(P) C(3,6,7) IFCID(198) TDATA(CPU,COR,TRA,DIST)
          DEST(GTF) [PLAN() AUTHID() PKGPROG()]
          * "PLAN() AUTHID() PKGPROG()" are optional to make the
             output smaller
     To avoid losing trace records, use multiple GTF datasets with
    BLOK=1M specified in GTF procedure. Here is a sample procedure
    on how to do that:
    .
    //GTFDB2 PROC MEMBER=GTFDB2
    //IEFPROC EXEC PGM=AHLGTF,
    //        PARM='MODE=EXT,DEBUG=NO,TIME=YES,BLOK=1M,SIZE=8M',
    //        REGION=0M
    //GTFOUTxx DD DSNAME=??????.TRACExx,UNIT=SYSDA,VOL=SER=xxxxxx,
    //   DISP=(NEW,KEEP),SPACE=(CYL,(1000)),DCB=NCP=255
    //*
    //* repeated 15 times for all the other trace data sets !!
    //*
    //SYSLIB  DD   DSNAME=SYS1.PARMLIB(&MEMBER),DISP=SHR
    .
    If you use multiple datasets, you want to look for a AHL904I
    message when GTF is stopped identifying the trace data sets
    used:
    AHL006I GTF ACKNOWLEDGES STOP COMMAND
    AHL904I THE FOLLOWING TRACE DATASETS CONTAIN TRACE DATA :
              ????????.TRACE01
              ????????.TRACE02
    How to start and stop trace:
    1) start GTF
       S GTF.G
       R xx,TRACE=USRP
       R xx,USR=(FB9),END
       R xx,U
    2) start DB2 trace
       -STA TRACE(P) C(1,2,3,6,7,21) IFCID(198)
       TDATA(CPU,COR,TRA,DIST) DEST(GTF) [PLAN() AUTHID() PKGPROG()]
    3) wait for 30 to 60 seconds
    4) stop DB2 trace
       -DIS TRACE
        check TNO's you started
       -STO TRACE(P) TNO(pp)    pp: TNO of Performance trace
    5) stop GTF
       P G
    .
    4. DSN1PRNT with NODATPGS collected right after the performance
    trace above and before REORG to check exact space map bits.
    5. DSN1COPY or IMAGE COPY of the partition after the performance
    trace above if you can determine "which partition" a row is
    inserted into by the degraded insert.
    6. Insert key pattern (increasing or decreasing or random key
    for an insert) so that we can study how inserts are done by
    using cluster index.
    7. Archive log data sets covering insert time from all of DB2
    data sharing members.
    

Local fix

Problem summary

Problem conclusion

Temporary fix

Comments

APAR Information

  • APAR number

    II14743

  • Reported component name

    PB LIB INFO ITE

  • Reported component ID

    INFOPBLIB

  • Reported release

    001

  • Status

    INTRAN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-04-01

  • Closed date

  • Last modified date

    2014-08-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels



Rate this page:

(0 users)Average rating

Document information


More support for:

z/OS family

Software version:

001

Operating system(s):

MVS

Reference #:

II14743

Modified date:

2014-08-19

Translate my page

Machine Translation

Content navigation