IBM Support

IT17718: ANR0205W SEEN AFTER PREVIOUS REPAIR STG PROCESSES HAVE RUN WHICH CAUSES EXTENT INCONSISTENCY

APAR status

  • Closed as program error.

Error description

  • When doing an operation such as MOVE CONTAINER, the command
    fails because a data extent ID appears to be marked as
    damaged (ANR0205W):
    
    move container /tsm/tsmdir18/stg1/0d/0000000000000db1.dcf
    stgpooldirectory=/tsm/tsmnewdir01/stg1
    
      ----------------------------------------------------------
      ANR0984I Process 52 for Move Container started in the
       BACKGROUND at 09:51:57 AM.
      ANR0205W MOVE CONTAINER skipped data extent ID
       -1280297497403949499 because it is marked damaged.
      ANR0985I Process 52 for Move Container running in the
       BACKGROUND completed with completion state FAILURE at
       09:52:00 AM.
    
    An AUDIT CONTAINER finds no objects at all, either valid or
    damaged:
    
    audit container /tsm/tsmdir18/stg1/0d/0000000000000db1.dcf
    action=markdamaged
    
      ----------------------------------------------------------
      ANR0984I Process 54 for AUDIT CONTAINER (MARK DAMAGED)
       started in the BACKGROUND at 09:52:41 AM.
      ANR4894I Audit Container (Mark Damaged) process started
       for container /tsm/tsmdir18/stg1/0d/0000000000000db1.dcf
       (process ID 54).
      ANR4891I AUDIT CONTAINER process 54 ended for the
       /tsm/tsmdir18/stg1/0d/0000000000000db1.dcf
       container: 0 data extents inspected, 0 data extents
       marked as damaged, 0 data extents previously marked as
       damaged reset to undamaged, and 0 data extents marked as
       orphaned.
      ANR0985I Process 54 for AUDIT CONTAINER (MARK DAMAGED)
       running in the BACKGROUND completed with completion
       state SUCCESS at 09:52:42 AM.
    
    This can also occur for other operations that access the data
    extent, such as a retrieval process.
    
    This happens when multiple container pools are defined and, at
    some point in the past, REPAIR STG processes were run concurrently
    on these container pools, or a repair on one pool completed and
    another started immediately afterwards on the other pool, so that
    multiple background cleanup threads were running for different
    container pools at the same time. The issue can also occur if a
    backup operation referenced the same data extent that a
    background cleanup thread was updating at the time.
    
    These scenarios introduce inconsistencies in the database: the
    extent record is updated incorrectly and is then reported as
    damaged. The extent itself is not actually damaged; only its
    database record is wrong.
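
    The unsynchronized update described above is the classic
    lost-update pattern. A minimal, deterministic sketch follows; the
    refcount field and the interleaving are invented for illustration
    and are not IBM server code:

```python
# Deterministic sketch of a lost-update race between a background
# cleanup thread and a backup session that reference the same extent.
# The "extent" record and its refcount are invented for illustration.

extent = {"refcount": 1}

# Both threads read the extent's reference count before either
# writes its update back:
cleanup_copy = extent["refcount"]   # cleanup will drop one reference
backup_copy = extent["refcount"]    # backup will add one reference

extent["refcount"] = backup_copy + 1   # backup writes 2
extent["refcount"] = cleanup_copy - 1  # cleanup writes 0, losing
                                       # backup's update entirely

# Serialized correctly (-1 then +1) the result would be 1; instead the
# record disagrees with reality, so later operations misread its state.
print(extent["refcount"])
```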
    
    
    IBM Spectrum Protect Versions Affected:  7.1.3 and higher on all
    platforms.
    
    Customer/L2 Diagnostics:
    
    To verify that you are seeing this problem, issue the following
    select statement:
    
    db2 connect to tsmdb1
    db2 set schema tsmdb1
    db2 "select sdcl.poolid as sdcl_poolid, sdcn.poolid as
    sdcn_poolid, sdcl.cntrid as cntrid, sdcl.refcount as refcount,
    sdcl.flags as flags, count(*) as count from sd_chunk_locations
    sdcl inner join sd_containers sdcn on (sdcn.cntrid=sdcl.cntrid)
    where sdcl.poolid!=sdcn.poolid group by sdcl.poolid,
    sdcn.poolid, sdcn.cntrid, sdcl.cntrid, sdcl.refcount, sdcl.flags
    for read only with ur"
    
    If the select returns any rows, you are affected by this APAR.
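
    The condition the select detects can be illustrated with a small
    sketch: a chunk-location row whose pool ID disagrees with the pool
    ID of the container it points to is the signature of this APAR.
    The table and column names mirror the select above; the sample
    data is invented for illustration:

```python
# Illustrative model of the consistency check performed by the select
# statement above. In a real server these rows live in the TSMDB1
# tables SD_CONTAINERS and SD_CHUNK_LOCATIONS; the values here are
# invented sample data.

# cntrid -> poolid, as in SD_CONTAINERS
sd_containers = {
    0x0DB1: 1,  # container 0000000000000db1.dcf belongs to pool 1
    0x0DB2: 2,
}

# chunk-location rows, as in SD_CHUNK_LOCATIONS: (chunkid, cntrid, poolid)
sd_chunk_locations = [
    (101, 0x0DB1, 1),  # consistent: location pool matches container pool
    (102, 0x0DB2, 2),  # consistent
    (103, 0x0DB1, 2),  # inconsistent: extent recorded under the wrong pool
]

def cross_pool_rows(locations, containers):
    """Return chunk-location rows whose pool ID differs from the pool
    ID of the container they reference (sdcl.poolid != sdcn.poolid)."""
    return [
        (chunkid, cntrid, loc_pool, containers[cntrid])
        for chunkid, cntrid, loc_pool in locations
        if containers[cntrid] != loc_pool
    ]

bad = cross_pool_rows(sd_chunk_locations, sd_containers)
print(bad)  # a non-empty list is the inconsistency this APAR describes
```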
    
    
    Initial Impact:  Medium
    
    Additional Keywords:   TSM container pool dedup
    

Local fix

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All IBM Spectrum Protect server users.                       *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * See ERROR DESCRIPTION.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 7.1.7.100, 7.1.8, and 8.1.1. *
    * Note that this is subject to change at the discretion of     *
    * IBM.                                                         *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms: AIX, Solaris, Linux, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT17718

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    71L

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-10-28

  • Closed date

    2016-12-16

  • Last modified date

    2017-06-30

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R71A PSY

       UP

  • R71L PSY

       UP

  • R71S PSY

       UP

  • R71W PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 7.1.3

Reference #: IT17718

Modified date: 30 June 2017