IBM Support

IT00475: REPLICATE NODE ENDS IN FAILURE WITH NO ASSOCIATED MESSAGE(S) IF A DAMAGED BITFILE IS FOUND

APAR status

  • Closed as program error.

Error description

  • The "Replicate node" command fails when it encounters a damaged
    file, but no message is written to the activity log to show
    that a damaged file was found.
    
    For example:
    ANR0327I Replication of node NODE_NAME completed. Files
    current: 200. Files replicated: 0 of 1. Files updated: 0
    of 0. Files deleted: 0 of 0. Amount replicated: 0 KB of
    81,596 MB. Amount transferred: 0 KB. Elapsed time: 0
    Day(s), 0 Hour(s), 1 Minute(s).
    
    ANR0987I Process 450 for Replicate Node running in the
    BACKGROUND processed 200 items with a completion state of
    FAILURE at 10:48:03 AM.
    
    ANR1893E Process 450 for Replicate Node completed with a
    completion state of FAILURE.
    
    Diagnostics:
    A server trace collected with the 'AF BF REPL DEDUP DEDUP1
    DEDUP2' trace classes shows:
    
    16:15:38.159 [216360][smrepl.c][2834][smReplRtrv]:Retrieving
    object 146924415
    16:15:38.159 [216360][bfrtrv.c][789][bfRtrvExt]:Entering for
    bitfile 146924415, offset 0, length 0 mountWaitMode 0, rtrvType
    0, noQueryRestore False, thisPool 0 thisPool strategy 0.
    16:15:38.159 [216360][bfrtrv.c][869][bfRtrvExt]:Object 146924415
    is not chunked, size is 0
    16:15:38.159 [216360][bfrtrv.c][1260][bfRtrvExt]:Bitfile
    146924415 is being retrieve with offset 0 and length 0.
    16:15:38.160 [216360][bfrtrv.c][3374][RtrvOne]:Retrieving
    bitfile 146924415, offset 0, length 0, useRtrvInfo: False,
    sessP->lastRtrvInfo: False
    16:15:38.160 [216360][bfrtrv.c][3530][RtrvOne]:Attempting
    retrieval of bitfile 146924415 from DISK.
    16:15:38.160 [216360][bfrtrv.c][3540][RtrvOne]:Bitfile not found
    on DISK.
    16:15:38.160 [216360][bfrtrv.c][3570][RtrvOne]:Attempting
    retrieval of bitfile 146924415 from other sequential media.
    16:15:38.160
    [216360][afcputil.c][1950][AfGetBestVolumes]:Selecting volume
    for 146924415 having lowKey 2 and highKey 2.
    16:15:38.161 [216360][afcputil.c][2012][AfGetBestVolumes]:Have
    pool 7 with numSegs 1 and damaged 1.
    16:15:38.161
    [216360][afcputil.c][2043][AfGetBestVolumes]:AfGetBestVolumes:
    Pool 7 disqualified on damaged.
    16:15:38.161 [216360][afcputil.c][2490][AfGetBestVolumes]:No
    best pool found, tmpRc=0.
    16:15:38.161
    [216360][afcputil.c][1950][AfGetBestVolumes]:Selecting volume
    for 146924415 having lowKey 2 and highKey 2.
    16:15:38.161 [216360][afcputil.c][2012][AfGetBestVolumes]:Have
    pool 7 with numSegs 1 and damaged 1.
    16:15:38.161
    [216360][afcputil.c][2043][AfGetBestVolumes]:AfGetBestVolumes:
    Pool 7 disqualified on damaged.
    16:15:38.162 [216360][afcputil.c][2490][AfGetBestVolumes]:No
    best pool found, tmpRc=0.
    16:15:38.162 [216360][bfrtrv.c][3589][RtrvOne]:Bitfile not found
    in other sequential media.
    16:15:38.162 [216360][bfrtrv.c][3625][RtrvOne]:RC 1101 from
    rtrv.
    16:15:38.162 [216360][bfrtrv.c][2361][bfRtrvExt]:Skipping
    end-to-end digest check for bitfile 146924415
    16:15:38.162 [216360][bfrtrv.c][2431][bfRtrvExt]:Exiting,
    bitfile 146924415, rc=1101.
    16:15:38.162 [216360][smrepl.c][2882][smReplRtrv]:Object
    146924415 processed with rc = 1101 (bfSize = 85560537088)
    16:15:38.162 [216360][smrepl.c][2900][smReplRtrv]:Exiting for
    objId 146924415, rc 1101
    16:15:38.162
    [216360][nrmain.c][5786][NrReplicateBatch]:sourceRc=1101,
    bytesReduced=0
    16:15:38.162 [216360][nrmain.c][11810][CheckRcForBatch]:Object
    146924415 needs retry for rc 1101
    16:15:38.163 [216360][nrmain.c][5836][NrReplicateBatch]:Ending
    final remote transaction
    
    The trace shows that bitfile 146924415 is identified as damaged:
    pool 7 is disqualified on damaged and the retrieve ends with
    rc=1101.
    
    To check whether the file is damaged, run:
    ISDamaged stgpool_name bitfile_id
    
    (To find stgpool_name, take the 'Storage Pool ID: X' value from
    the 'show invo bitfile' output and look it up in the
    'show sspool' output, which lists pools as 'stgpool_name(X)'.)
    
    
    
    
    Tivoli Storage Manager Versions Affected:
    Tivoli Storage Manager Server 6.3.x and 7.1.x on supported
    platforms
    
    Initial Impact: Medium
    
    Additional Keywords: zz63 zz71 replicate node replication
    failure message
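
The trace in the error description can be triaged with a short script. The following is a minimal Python sketch, not an IBM tool. It re-joins wrapped trace lines, then reports each bitfile whose bfRtrvExt exit record carries a non-zero rc, together with any pools logged as "disqualified on damaged" before that exit. The function names (split_records, find_failed_bitfiles) are hypothetical, the record layout is assumed from the excerpt in this APAR, and attributing pool records to the next exit record assumes the trace covers a single thread.

```python
import re

# Start of a trace record, e.g. "16:15:38.159 [216360]" (layout assumed
# from the excerpt in this APAR; the timestamp may wrap onto its own line).
RECORD_START = re.compile(r"(?=\d{2}:\d{2}:\d{2}\.\d{3}\s*\[\d+\])")
DAMAGED_POOL = re.compile(r"Pool (\d+) disqualified on damaged")
EXIT_RECORD = re.compile(r"Exiting, bitfile (\d+), rc=(\d+)")


def split_records(trace_text):
    """Re-join wrapped lines and return one string per trace record."""
    flat = " ".join(trace_text.split())
    return [r.strip() for r in RECORD_START.split(flat) if r.strip()]


def find_failed_bitfiles(trace_text):
    """Map bitfile id -> rc and the pools disqualified before its exit."""
    results = {}
    damaged_pools = set()
    for record in split_records(trace_text):
        pool = DAMAGED_POOL.search(record)
        if pool:
            damaged_pools.add(int(pool.group(1)))
            continue
        done = EXIT_RECORD.search(record)
        if done:
            rc = int(done.group(2))
            if rc != 0:
                results[int(done.group(1))] = {
                    "rc": rc,
                    "damaged_pools": damaged_pools,
                }
            # Assumption: pool records belong to the next exit record.
            damaged_pools = set()
    return results


# Excerpt copied from the trace in this APAR, wrapped as shown there.
SAMPLE = """\
16:15:38.161
[216360][afcputil.c][2043][AfGetBestVolumes]:AfGetBestVolumes:
Pool 7 disqualified on damaged.
16:15:38.162 [216360][bfrtrv.c][3625][RtrvOne]:RC 1101 from
rtrv.
16:15:38.162 [216360][bfrtrv.c][2431][bfRtrvExt]:Exiting,
bitfile 146924415, rc=1101.
"""

if __name__ == "__main__":
    print(find_failed_bitfiles(SAMPLE))
```

The bitfile ids returned this way are candidates only; the ISDamaged check described above remains the way to confirm the damaged state.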
    

Local fix

  • If the damaged bitfile has a copy, run the 'restore stgpool'
    command to restore it, then rerun replicate node. If it does
    not have a copy, contact IBM support.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All Tivoli Storage Manager server users of node replication. *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See ERROR DESCRIPTION.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing levels when available.                          *
    * This problem is currently projected                          *
    * to be fixed in levels 6.3.5 and 7.1.1.                       *
    * Note that this is subject                                    *
    * to change at the discretion of IBM.                          *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Solaris, Linux, and Windows.
    
    This new message is introduced with this APAR:
    ANR3651W The replication process is skipping a damaged file on
    volume <volume name>: Node <node name>, Type <file type>, File
    space <filespace name>, File name <file name>.
    Explanation:
    During the replication process, a file is encountered that was
    previously found to be damaged. If this file is part of an
    aggregate, the entire aggregate was previously marked damaged,
    possibly because an integrity error was detected for some other
    file within the aggregate.
    System action:
    The damaged file is not replicated.
    User response:
    Perform the following actions:
    - Audit the volume with FIX=NO to verify that the file is
      damaged. If the file is undamaged, the audit process resets
      the file status. If the file is part of an aggregate and the
      entire aggregate is found to be undamaged, the audit process
      resets the aggregate status.
    - If the volume is in a primary storage pool and has a copy in
      the copy storage pool, attempt to restore the damaged file by
      issuing the RESTORE STGPOOL command.
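
Once the fix is applied, the new ANR3651W message can be picked out of activity log text to list the files that need an audit or restore. The following is a minimal Python sketch under stated assumptions: the regular expression is built from the message template quoted above, the real log layout may differ, the function name parse_anr3651w is hypothetical, and the sample volume, file space, and file name values are invented for illustration.

```python
import re

# Pattern derived from the ANR3651W template in this APAR. The actual
# activity log text may be laid out differently; this is an assumption.
ANR3651W = re.compile(
    r"ANR3651W The replication process is skipping a damaged file on "
    r"volume (?P<volume>.+?): Node (?P<node>.+?), Type (?P<type>.+?), "
    r"File space (?P<filespace>.+?), File name (?P<name>.+?)\.?$"
)


def parse_anr3651w(message):
    """Return the fields of one ANR3651W message, or None if no match."""
    # Messages wrap across lines in the log, so collapse whitespace first.
    flat = " ".join(message.split())
    match = ANR3651W.search(flat)
    return match.groupdict() if match else None


# Hypothetical message: every value below is made up for illustration.
SAMPLE = """\
ANR3651W The replication process is skipping a damaged file on
volume /tsm/vol01.bfs: Node NODE_NAME, Type Backup, File
space /home, File name /home/user/data.bin.
"""

if __name__ == "__main__":
    print(parse_anr3651w(SAMPLE))
```

The volume field is the value to pass to the volume audit, and the node and file name identify what was left unreplicated.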
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT00475

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    63L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2014-03-21

  • Closed date

    2014-04-24

  • Last modified date

    2014-04-24

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R63A PSY

       UP

  • R63H PSY

       UP

  • R63L PSY

       UP

  • R63S PSY

       UP

  • R63W PSY

       UP

  • R71A PSY

       UP

  • R71H PSY

       UP

  • R71L PSY

       UP

  • R71S PSY

       UP

  • R71W PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 63L

Reference #: IT00475

Modified date: 24 April 2014