IBM Support

IT15816: SERVER CAN CRASH IN "SDADDCHUNKENCRYPTIONINFO" DURING REPLICATION IF TARGET CONTAINER STGPOOL'S MAXSIZE IS EXCEEDED.

APAR status

  • Closed as program error.

Error description

  • The target server can crash during "replicate node" processing
    under either of the following sets of conditions:
    
    1. All of the following are true:
       - The source server is replicating data from a
         directory-container storage pool.
       - The target server stores data in a directory-container
         storage pool.
       - The file being replicated exceeds the maxsize setting of the
         target directory-container storage pool.
       - The next storage pool on the target server, which receives
         the file because the maxsize limit was exceeded, is not a
         directory-container storage pool.
    
    or
    
    2. The replicated node attempts to write to a non-dedup pool
       because it has multiple management classes, some of which
       point to a container pool and some of which point to a
       non-dedup pool.
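
    The two sets of conditions above can be sketched as simple
    predicates. This is an illustrative model only: the Pool class, its
    fields, and both function names are hypothetical and do not
    correspond to any server API or internal check.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Pool:
    """Hypothetical, simplified description of a storage pool."""
    name: str
    is_container: bool               # directory-container storage pool?
    is_dedup: bool                   # does the pool deduplicate data?
    maxsize: Optional[int] = None    # bytes; None models "no limit"
    next_pool: Optional["Pool"] = None


def matches_condition_1(source: Pool, target: Pool, file_size: int) -> bool:
    """Oversized file spills from a container pool into a non-container pool."""
    return (
        source.is_container
        and target.is_container
        and target.maxsize is not None
        and file_size > target.maxsize
        and target.next_pool is not None
        and not target.next_pool.is_container
    )


def matches_condition_2(mgmt_class_destinations: Sequence[Pool]) -> bool:
    """Management classes mix container pools and non-dedup pools."""
    has_container = any(p.is_container for p in mgmt_class_destinations)
    has_non_dedup = any(not p.is_dedup for p in mgmt_class_destinations)
    return has_container and has_non_dedup


if __name__ == "__main__":
    # Pool names and sizes here are made up for illustration.
    src = Pool("SRC_CONTAINER", is_container=True, is_dedup=True)
    nxt = Pool("TGT_NEXT", is_container=False, is_dedup=False)
    tgt = Pool("TGT_CONTAINER", is_container=True, is_dedup=True,
               maxsize=10 * 1024**3, next_pool=nxt)
    print(matches_condition_1(src, tgt, file_size=20 * 1024**3))
    print(matches_condition_2([tgt, nxt]))
```

    A configuration for which either predicate is true matches the
    description in this APAR; the model says nothing about other
    causes of a crash.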
    
    
    On Linux, when the target server crashes during the replicate
    node process under these conditions, the Getcoreinfo.txt file on
    the target server contains a stack trace like the following:
    
    #0  SdAddChunkEncryptionInfo (txnP=0x7ffd98cb6a68,
        entryP=0x7ffdb0667010, metaData=True) at sdcrypt.c:172
    #1  0x0000000000c51f65 in AddPendingMetaData (txnP=0x7ffd98cb6a68,
        metaDataSize=350, chunkPP=0x7ffe434e4610) at sddedup.c:2195
    #2  0x0000000000c53575 in sdProcessNonDedupChunk
        (sessHandle=<value optimized out>,
        nonDedupChunkId=2234952175961185024, digest=..., chunkLength=350,
        isMetadata=True, metaDataIsLink=<value optimized out>,
        metaSize=<value optimized out>) at sddedup.c:1712
    #3  0x0000000000d9e65b in DoProcessNonDedupChunk
        (contextP=0x7ffd98008bc8, bufPtr=0x7ffd98d80050 "\001I\305\245\003",
        bufSize=262064, bytesReadP=0x7ffe434e4d5c) at smtrans.c:10776
    #4  SmRecvNextData (contextP=0x7ffd98008bc8,
        bufPtr=0x7ffd98d80050 "\001I\305\245\003", bufSize=262064,
        bytesReadP=0x7ffe434e4d5c) at smtrans.c:4959
    #5  0x0000000000edef4f in ssStore (sessHandle=0x7ffd9819eb78,
        txnId=<value optimized out>, srvId=0, segGroupId=377885911,
        poolId=<value optimized out>, priority=8,
        mountWaitMode=ssWaitMount, estSize=10485760000,
        sourceFunc=0xd9cf00 <SmRecvNextData>, contextP=0x7ffd98008bc8,
        action=ssStoreOpenAggr, segListPP=0x7ffd98cb7a98,
        numSegsP=0x7ffd98cb7a90, segOneP=0x7ffd98cb7aa0,
        actSizeP=0x7ffe434e5000, totalSizeP=0x7ffd98cb79e0,
        totalPhysicalSizeP=0x7ffe434e4e28, formatP=0x7ffd98cb79f0,
        isReconstruct=False) at sstrans.c:1860
    #6  0x00000000007aa791 in AfCreate (txnP=0x7ffd98cb56f8, poolId=6,
        srvId=0, bfId=377885911, ck1=<value optimized out>,
        ck2=<value optimized out>, estSize=10485760000,
        mountWaitMode=bfWaitMount, sourceFunc=0xd9cf00 <SmRecvNextData>,
        contextP=0x7ffd98008bc8, action=ssStoreOpenAggr,
        actSizeP=0x7ffe434e5000) at afcreate.c:1030
    #7  0x00000000006dc1c2 in CreateBitfile (sessP=0x7ffd98179dc8,
        txnId=0x7ffd98013da8, bfId=377885911, estSize=10485760000,
        estBitfileSize=29803677001, ck1=19, ck2=14,
        poolName=0x7ffe434e55f0 "ST_DD_FILE0", mountWaitMode=bfWaitMount,
        sourceFunc=0xd9cf00 <SmRecvNextData>, contextP=0x7ffd98008bc8,
        aggregateState=admAggregate, forceAggregation=False,
        rptSizeP=0x7ffe434e51d8) at bfcreate.c:8381
    #8  0x00000000006dd87a in bfCreate (sessHandle=0x7ffd98179dc8,
        txnId=0x7ffd98014d18, bfId=377885911, estSize=29803677001,
        estBitfileSize=29803677001, ck1=19, ck2=14,
        poolName=0x7ffe434e55f0 "ST_DD_FILE0", mountWaitMode=bfWaitMount,
        sourceFunc=0xd9cf00 <SmRecvNextData>, contextP=0x7ffd98008bc8,
        aggregateState=admAggregate, forceAggregation=False,
        isFragmentedP=0x7ffe434e5644, rptSizeP=0x7ffe434e5610)
        at bfcreate.c:2161
    #9  0x0000000000d17465 in CreateBitfile (sessP=0x7ffd98008bc8,
        bfHandle=0x7ffd98179dc8, txnId=0x7ffd98014d18, bfId=377885911,
        ck1=19, ck2=14, objInfoP=0x7ffd981b056d "\v\t\026",
        objInfoLen=136, poolName=0x7ffe434e55f0 "CONTAINER_POOL",
        estSize=29803677001, estBitfileSize=29803677001, mountWaitMode=2,
        objType=1 '\001', isFragmentedP=0x7ffe434e5644,
        rptSizeP=0x7ffe434e5610) at smnode.c:28653
    #10 0x0000000000d191fb in SmDoBackInsNormEnhanced
        (sessP=0x7ffd98008bc8, bfHandle=0x7ffd98179dc8,
        txnId=0x7ffd98014d18, verbP=0x7ffd981b1000, domainId=1,
        txnDate=..., updateList=0x7ffd9817ac78,
        objectOpen=0x7ffe434e5b40, setDeltaStoredP=0x7ffe434e5b44,
        objIdP=0x7ffe434e5af0) at smnode.c:17879
    #11 0x0000000000e406b7 in SmReplServerSession (sessP=0x7ffd98008bc8)
        at smrepl.c:2110
    #12 0x0000000000ceeb94 in DoReplServer (sessP=0x7ffd98008bc8,
        nodeInfoP=<value optimized out>) at smexec.c:9113
    #13 0x0000000000cf0b98 in smExecuteSession
        (infoP=<value optimized out>, beginFunc=<value optimized out>,
        sendFunc=<value optimized out>, recvFunc=<value optimized out>,
        flushFunc=<value optimized out>, abortFunc=0x104d810 <tcpAbort>,
        qmethodFunc=0x104d6e0 <tcpQryMethod>,
        qaddressFunc=0x1050960 <tcpQueryAddress>, authFunc=0,
        isNetworkMethod=True,
        qIsNodeThreadFunc=0x104d6c0 <tcpIsNodeThread>) at smexec.c:3787
    #14 0x000000000104e37c in psSessionThread
        (argP=<value optimized out>) at tcpcomm.c:2345
    #15 0x00000000010384e2 in StartThread (startInfoP=0x0)
        at pkthread.c:3779
    #16 0x0000003a61c079d1 in start_thread ()
        from /lib64/libpthread.so.0
    #17 0x0000003a614e88fd in clone () from /lib64/libc.so.6
    
    IBM Spectrum Protect Versions Affected:
    IBM Spectrum Protect Server: 7.1.3.x and higher on all platforms
    
    Initial Impact: Medium
    
    Additional Keywords: TSM IBM Spectrum Protect container pools
    crash core
    

Local fix

  • Set maxsize on the target server's directory-container storage
    pool to unlimited, or to a value at least as large as the maxsize
    setting of the directory-container storage pool on the source
    server.
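
    As a rough sketch of this workaround, the helper below builds the
    administrative command text that would change the target pool's
    maxsize. The function name and the pool names are hypothetical,
    and the UPDATE STGPOOL ... MAXSIZE=... form (including the NOLIMIT
    value) is an assumption that should be checked against the command
    reference for the installed server level before use.

```python
def build_maxsize_workaround(target_pool: str,
                             source_maxsize: str = "NOLIMIT") -> str:
    """Return command text that aligns the target pool's MAXSIZE.

    source_maxsize is the source pool's setting copied as text
    (for example "500G"); the default removes the limit entirely.
    """
    value = source_maxsize.strip().upper() or "NOLIMIT"
    # Assumed command form; verify against the server documentation.
    return f"UPDATE STGPOOL {target_pool} MAXSIZE={value}"


if __name__ == "__main__":
    # Pool name and size are illustrative, not taken from this APAR.
    print(build_maxsize_workaround("TGT_CONTAINER"))
    print(build_maxsize_workaround("TGT_CONTAINER", "500g"))
```

    Copying the source setting verbatim avoids unit conversion, which
    keeps the target limit from ending up smaller than the source
    limit through rounding.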
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users of node replication. *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level on the source server when available. This *
    * problem is currently projected to be fixed in level 7.1.7.   *
    * Note that this is subject to change at the discretion of     *
    * IBM.                                                         *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms: AIX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT15816

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    71L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-06-22

  • Closed date

    2016-07-18

  • Last modified date

    2016-07-25

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R71A PSY UP

  • R71L PSY UP

  • R71S PSY UP

  • R71W PSY UP



Document information

More support for: Tivoli Storage Manager

Software version: 7.1.3

Reference #: IT15816

Modified date: 25 July 2016