Tivoli Storage Manager Servers Might Encounter Data Integrity Issues with Client-Side Deduplication or the Replication Process

Flash (Alert)


Abstract

Data that is deduplicated by client-side deduplication, or replication, might be incorrectly stored on the Tivoli Storage Manager server. Under rare conditions, this data could become inaccessible on the Tivoli Storage Manager server.

Content

PROBLEM SUMMARY:

As documented in APAR IC85825, data on Tivoli Storage Manager servers might be stored incorrectly and possibly become inaccessible under the following conditions:

· The target server that is used in the deduplication environment is at Tivoli Storage Manager server
Version 6.2.1 or later.
· Either client-side data deduplication or replication with data deduplication is used.

Affected data types: Any data that is deduplicated with the client-side method or by the replication process.

WHO IS AFFECTED:

· All users of Tivoli Storage Manager server Version 6.2.1 or later who have enabled client-side data
deduplication in their Tivoli Storage Manager environment.
· All users of Tivoli Storage Manager server Version 6.3.0 or later who replicate deduplicated data to
a deduplication-enabled storage pool on the Tivoli Storage Manager target server.

RECOMMENDATION:

Before attempting to deduplicate data with client-side deduplication, or with the replication process, apply Tivoli Storage Manager server fix level 6.2.4.200, 6.2.5.000, 6.3.2.200, or 6.3.3.000, or later. Each of these levels contains the fix for APAR IC85825. (The release of 6.2.5.000 and 6.3.3.000 is subject to change at the discretion of IBM.)

If data is already stored on a Tivoli Storage Manager server by one of the methods that are described in this document, the data might be stored incorrectly. The following message might be displayed if this issue occurs:

ANR4895E Deduplicated bitfile <bitfile_id> on volume <volname> has invalid links.

Important: Message ANR4895E can be issued for several invalid link issues and not only for this APAR.

To determine whether any data has been stored incorrectly in a deduplicated storage pool as documented in this APAR, complete the following steps.

- Open a DB2 command window, exit the DB2 prompt, and connect to the appropriate Tivoli Storage
Manager server instance by issuing the following commands:

    db2 connect to <database name>
    db2 set schema <database name>

    Example:
    db2 connect to tsmdb1
    db2 set schema tsmdb1

- Obtain the storage pool ID of the deduplicated storage pool or pools:
    db2 " select poolid from ss_pools where poolname='<DEDUPLICATED STORAGE POOL NAME IN CAPS>' "

    Example:

    db2 " select poolid from ss_pools where poolname='FILEDEDUPPOOL' "

    POOLID
    -----------
    24

- Issue the following SQL statement for each deduplicated storage pool, substituting the storage pool ID for <poolid>.

    Important: Run this statement when no data movement or copy processes (such as migration, reclamation, or storage pool backups) are active in the deduplicated storage pool that the statement processes. In addition, expiration should not be running while the SQL statement is being executed.

    db2 "select count(*) from af_bitfiles afbf left join bf_aggregate_attributes bfaa on (bfaa.srvid=afbf.srvid and bfaa.superbfid=afbf.bfid and bfaa.numchunks >1) left join bf_bitfile_extents bfbe on (bfbe.srvid=afbf.srvid and bfbe.superbfid=afbf.bfid and bfbe.poolid=afbf.poolid) where afbf.srvid=0 and (afbf.flags=2 or afbf.flags=3) and afbf.poolid=<poolid> and bfbe.srvid is null and bfaa.srvid is not null"

If the count is 0, this APAR does not apply. If the count is greater than 0, the APAR applies and the following cleanup steps should be followed. Follow these steps for each deduplicated storage pool for which the preceding SQL statement returned a count greater than 0.
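
The check above can be scripted so that the count is captured in a variable instead of being read from the screen. The following is a minimal sketch and not part of the APAR text. The function names build_check_sql and apar_applies are hypothetical. The SQL is the statement shown above with the pool ID substituted. The sketch assumes that the db2 -x option, which this document also uses when it builds the cleanup macro, leaves only the count in the output.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not IBM-supplied.

# Print the check statement for one pool ID. Non-numeric input is
# rejected so that nothing unexpected is spliced into the SQL.
build_check_sql() {
    case "$1" in
        ''|*[!0-9]*) echo "pool ID must be numeric: '$1'" >&2; return 1 ;;
    esac
    printf '%s\n' "select count(*) from af_bitfiles afbf left join bf_aggregate_attributes bfaa on (bfaa.srvid=afbf.srvid and bfaa.superbfid=afbf.bfid and bfaa.numchunks >1) left join bf_bitfile_extents bfbe on (bfbe.srvid=afbf.srvid and bfbe.superbfid=afbf.bfid and bfbe.poolid=afbf.poolid) where afbf.srvid=0 and (afbf.flags=2 or afbf.flags=3) and afbf.poolid=$1 and bfbe.srvid is null and bfaa.srvid is not null"
}

# Return 0 when the count means the APAR applies (count > 0),
# 1 when it does not (count = 0), and 2 when the input is not a number.
apar_applies() {
    count=$(printf '%s' "$1" | tr -d '[:space:]')
    case "$count" in
        ''|*[!0-9]*) echo "unexpected count: '$1'" >&2; return 2 ;;
    esac
    [ "$count" -gt 0 ]
}

# Example, assuming "db2 connect" and "db2 set schema" were already issued
# and that 24 is the pool ID from the example above:
#   count=$(db2 -x "$(build_check_sql 24)")
#   if apar_applies "$count"; then echo "cleanup steps required"; fi
```

Keeping the statement builder separate from the db2 call means the generated SQL can be reviewed before anything is run against the server database.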

1. If not done already, open a DB2 command window, exit the DB2 prompt, and connect to the
appropriate Tivoli Storage Manager server instance by issuing the following commands:
    db2 connect to <database name>
    db2 set schema <database name>

    Example:
    db2 connect to tsmdb1
    db2 set schema tsmdb1

2. If not done already, obtain the storage pool ID of the deduplicated storage pool or pools:
    db2 " select poolid from ss_pools where poolname='<DEDUPLICATED STORAGE POOL NAME IN CAPS>' "

    Example:

    db2 " select poolid from ss_pools where poolname='FILEDEDUPPOOL' "

    POOLID
    -----------
    24

3. Verify that all major server activity in the given storage pool is quiesced so that
the results of the following steps are static. This includes data movement and copy operations such as
migration, reclamation, BACKUP STGPOOL, MOVE DATA, and MOVE NODEDATA.
    Important: The one exception is Step 7 (if it is required), which can be run while server operations continue as normal.

4. Issue the following command, substituting the storage pool ID that was obtained in Step 2 for every
<poolid> and <pool id> placeholder. This operation takes at least one hour for most servers.
    db2 "update bf_bitfile_extents bfbe set bfbe.superbfid=( select bfbf.superbfid from bf_aggregated_bitfiles bfbf left join af_bitfiles afbf on (afbf.srvid=bfbf.srvid and afbf.bfid=bfbf.superbfid and afbf.poolid=<poolid>) where bfbf.srvid=bfbe.srvid and bfbf.bfid=bfbe.bfid and bfbf.superbfid!=bfbe.superbfid and bfbf.flags=0 and afbf.bfid is not null ) where bfbe.srvid=0 and bfbe.poolid=<pool id> and bfbe.bfid in (select bfbf2.bfid from bf_aggregated_bitfiles bfbf2 left join af_bitfiles afbf2 on (afbf2.srvid=bfbf2.srvid and afbf2.bfid=bfbf2.superbfid and afbf2.poolid=<poolid>) where bfbf2.srvid=bfbe.srvid and bfbf2.bfid=bfbe.bfid and bfbf2.superbfid!=bfbe.superbfid and bfbf2.flags=0 and afbf2.srvid is not null)"

5. Issue the following SQL statement to see if any damage could not be corrected with the statement from
step #4:

  db2 "select count(*) from af_bitfiles afbf left join
   bf_aggregate_attributes bfaa on (bfaa.srvid=afbf.srvid and
   bfaa.superbfid=afbf.bfid and bfaa.numchunks >1) left join  
   bf_bitfile_extents bfbe on (bfbe.srvid=afbf.srvid and
   bfbe.superbfid=afbf.bfid and bfbe.poolid=afbf.poolid) where afbf.srvid=0
   and (afbf.flags=2 or afbf.flags=3) and afbf.poolid=<poolid> and
   bfbe.srvid is null and bfaa.srvid is not null"

6. If the count from the previous SQL statement is 0, the bitfiles are no longer cataloged
incorrectly and the problem is resolved. If the count is greater than 0, continue with the remaining
steps. Issue the following statement to create a cleanup macro that removes the damage:

db2 -x "select 'delete object ' || bfbf.owner || ' force=yes' from
   bf_aggregated_bitfiles bfbf left join bf_bitfile_extents bfbe on
  (bfbf.srvid=bfbe.srvid and bfbf.bfid=bfbe.bfid and
   bfbf.superbfid=bfbe.superbfid and bfbe.poolid=<pool id>) left join
   af_bitfiles afbf on (afbf.srvid=bfbf.srvid and afbf.bfid=bfbf.superbfid
   and afbf.poolid=<pool id> and afbf.flags!=3) where bfbf.srvid=0 and
   bfbf.length=0 and bfbf.flags=0 and bfbe.srvid is null and afbf.srvid is
   not null group by bfbf.owner" > deleteinvalidchunks.mac

7. After the deleteinvalidchunks.mac macro is created, it must be invoked through the dsmadmc client as
follows:
    dsmadmc -id=<admin id> -password=<admin pwd> -itemcommit macro deleteinvalidchunks.mac

8. When the macro is completed, run the following SQL statement again to ensure that the result is 0 for
the given deduplicated storage pool:

  db2 "select count(*) from af_bitfiles afbf left join
   bf_aggregate_attributes bfaa on (bfaa.srvid=afbf.srvid and
   bfaa.superbfid=afbf.bfid and bfaa.numchunks >1) left join
   bf_bitfile_extents bfbe on (bfbe.srvid=afbf.srvid and
   bfbe.superbfid=afbf.bfid and bfbe.poolid=afbf.poolid) where afbf.srvid=0
   and (afbf.flags=2 or afbf.flags=3) and afbf.poolid=<poolid> and
   bfbe.srvid is null and bfaa.srvid is not null"

9. If the count is 0, the invalid data was removed from the storage pool. Issue the following command to
reset the "damaged" flags for the given deduplicated storage pool:
db2 "delete from af_damaged where poolid=<pool id>"
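
The statements in the steps above mark the pool ID with two placeholder spellings, <poolid> and <pool id>. The following is a minimal sketch of substituting both spellings in one pass. The helper name fill_poolid is hypothetical, and passing a statement through a pipe before running it is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a statement template on standard input and
# replace both placeholder spellings used in this document with a
# numeric pool ID. Non-numeric IDs are rejected.
fill_poolid() {
    case "$1" in
        ''|*[!0-9]*) echo "pool ID must be numeric: '$1'" >&2; return 1 ;;
    esac
    sed -e "s/<poolid>/$1/g" -e "s/<pool id>/$1/g"
}

# Example with the Step 9 statement and the pool ID from the example (24):
#   echo 'delete from af_damaged where poolid=<pool id>' | fill_poolid 24
#   prints: delete from af_damaged where poolid=24
```

Substituting every placeholder mechanically reduces the chance of leaving one occurrence unfilled in the longer statements, such as the update in Step 4.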


If Step 8 results in a count greater than 0, the damage might not be completely resolved. Repeat Steps 1 - 8 and ensure that all major server activity is quiesced where requested. If the macro from Step 6 is created empty, the problem documented in this flash has been resolved.
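
Whether the macro from Step 6 is empty can be checked before dsmadmc is invoked. The following is a sketch under the assumption that each generated line begins with "delete object" (possibly after leading blanks), as the select list in Step 6 suggests. The helper name macro_command_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the "delete object" commands in a macro file.
# Prints 0 for an empty file or for one that holds only blank lines.
macro_command_count() {
    if [ ! -f "$1" ]; then
        echo "macro file not found: $1" >&2
        return 1
    fi
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so the status is deliberately ignored here.
    grep -c '^[[:space:]]*delete object ' "$1" || true
}

# Example:
#   n=$(macro_command_count deleteinvalidchunks.mac)
#   if [ "$n" -eq 0 ]; then echo "macro is empty; nothing to delete"; fi
```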

PROBLEM RESOLUTION:

As documented in APAR IC85825, after you apply versions 6.2.4.200, 6.2.5.000, 6.3.2.200, 6.3.3.000, or later, the server correctly stores all data with deduplication enabled on the target server.

CIRCUMVENTION:
Apply the server fix before attempting to deduplicate data with the client-side feature or with the replication process.

Document information

More support for: Tivoli Storage Manager, Server
Software version: 6.2, 6.3
Operating system(s): Platform Independent
Software edition: All Editions
Reference #: 1610534
Modified date: 2013-02-15
