IC81115: TIVOLI STORAGE MANAGER SERVER PERFORMING REORG WITH DEDUPLICATION ENABLED CAN CAUSE NODE TO BECOME "LOCKED"

APAR status

Closed as program error.

Error description

The Tivoli Storage Manager server might experience resource
timeouts that result in sessions/processes failing consistently.


This can only occur if the server has been configured for
deduplication on at least one storage pool. That storage pool
has to be actively managing data that has been deduplicated.
Depending on what resource is being pinned, the symptom could be
that at least one session/process cannot complete its operation
successfully. This will most likely be repeatable until the
reorganization is stopped/killed. The following messages will be
seen in the activity log repeatedly:

01/22/12   06:02:34      ANR0538I A resource waiter has been
aborted.

The operation that has experienced the waiter abort will have
its own failure message shortly thereafter.
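To judge how often the message repeats, the activity log can be tallied by hour. The following is a minimal Python sketch, not part of the product. It assumes the activity log has been exported as text lines in the layout shown above (date, time, message ID). The function name and all sample lines after the first are hypothetical.

```python
import re
from collections import Counter

# Date, time, and message ID at the start of a log line, in the layout
# shown above (MM/DD/YY HH:MM:SS ANRnnnnX ...). This layout is an assumption.
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2})\s+(\d{2}):\d{2}:\d{2}\s+(ANR\d{4}[A-Z])\b"
)

def count_waiter_aborts_per_hour(lines):
    """Count ANR0538I messages per (date, hour) bucket."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group(3) == "ANR0538I":
            counts[(m.group(1), m.group(2))] += 1
    return counts

# Hypothetical sample; only the first line is taken from the text above.
sample = [
    "01/22/12   06:02:34      ANR0538I A resource waiter has been aborted.",
    "01/22/12   06:17:40      ANR0538I A resource waiter has been aborted.",
    "01/22/12   07:05:12      ANR0538I A resource waiter has been aborted.",
]
for (day, hour), n in sorted(count_waiter_aborts_per_hour(sample).items()):
    print(day, hour, n)
```

A steady count hour after hour, rather than an isolated message, is consistent with the repeatable failures described above.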

The key to diagnosing this issue is to determine whether one of
the following tables was being reorganized at the time of the
excessive lock failures and perceived hangs:
BF_QUEUED_CHUNKS
BF_DEREFERENCED_CHUNKS

This can be found by issuing the following DB2 diagnostic
command:
db2pd -db tsmdb1 -reorg

This might occur while the Tivoli Storage Manager server is
reorganizing the BF_QUEUED_CHUNKS or the BF_DEREFERENCED_CHUNKS
table. Because of their volatile nature, these tables should not
be reorganized.

The following commands can be used to determine if the reported
condition has occurred:

1) db2 connect to tsmdb1
2) db2 set schema tsmdb1
3) db2pd -d tsmdb1 -reorg

If the output shows that the Status column of the "Table Reorg
Stats:" stanza for the BF_QUEUED_CHUNKS or BF_DEREFERENCED_CHUNKS
table is not "Done", "Stopped", or "Paused", and either the
CurCount value does not increase across subsequent
"db2pd -d tsmdb1 -reorg" commands or the CurCount column has no
value, you are probably experiencing this condition.

Example output:
Table Reorg Stats:
Address            TableName        Start               End PhaseStart MaxPhase Phase CurCount MaxCount Status  Completion
0x000007F726E94E28 BF_QUEUED_CHUNKS 01/20/2012 16:40:35 n/a n/a        n/a      n/a   0        114527   Started 0
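
The check described above (Status not finished, CurCount not advancing) can be sketched in Python. This is a rough illustration under stated assumptions, not an IBM tool: it assumes two captures of the "db2pd -d tsmdb1 -reorg" output taken some minutes apart, with each table row on a single line in the column order of the example. The function names are hypothetical.

```python
WATCHED_TABLES = {"BF_QUEUED_CHUNKS", "BF_DEREFERENCED_CHUNKS"}
FINISHED_STATES = {"Done", "Stopped", "Paused"}

def parse_reorg_rows(text):
    """Return {table: (cur_count, status)} for rows that start with an address.

    Fields are read from the right-hand side of each row, so the two-token
    Start timestamp does not shift the result. A non-numeric CurCount
    (for example "n/a") is reported as None.
    """
    rows = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 6 or not tokens[0].startswith("0x"):
            continue
        table, status, cur = tokens[1], tokens[-2], tokens[-4]
        rows[table] = (int(cur) if cur.isdigit() else None, status)
    return rows

def stalled_tables(first_snapshot, second_snapshot):
    """List watched tables whose reorg is active but whose CurCount did not advance."""
    before = parse_reorg_rows(first_snapshot)
    after = parse_reorg_rows(second_snapshot)
    stalled = []
    for table, (cur, status) in after.items():
        if table not in WATCHED_TABLES or status in FINISHED_STATES:
            continue
        prev = before.get(table, (None, None))[0]
        if cur is None or cur == prev:
            stalled.append(table)
    return sorted(stalled)

# The example row from above, used for both captures (CurCount stays at 0).
row = ("0x000007F726E94E28 BF_QUEUED_CHUNKS 01/20/2012 16:40:35 "
       "n/a n/a n/a n/a 0 114527 Started 0")
print(stalled_tables(row, row))
```

A table listed by this sketch is only a candidate. Confirm against the actual db2pd output before stopping a reorganization.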

Additional Keywords:

Hang, Abort, Dedupe, ACO5436E, ANS1301E;

Local fix

Stop the online REORG and then perform the backup. To stop the
REORG:


1) db2 connect to tsmdb1
2) db2 set schema tsmdb1
3) db2 "reorg table <tablename> inplace stop"
4) Wait 5 minutes
5) db2pd -d tsmdb1 -reorg
The Status column of the "Table Reorg Stats:" stanza should be
"Stopped".

If this does not stop the reorg, or if you get the DB2 error
message:

SQL2219N The specified INPLACE table reorganization action on
table "TSMDB1.<tablename>" is not allowed on one or more nodes.
Reason code: "10".

you are probably experiencing DB2 APAR IC79773. To stop the
reorg, do the following:


1. Determine the application ID of the reorganization process,
by issuing the following commands in a DB2 Command Line
Processor window:
A. db2 connect to tsmdb1
B. db2 get snapshot for all applications >application.out
2. Examine the application.out file and find the "Most recent
operation" entry like this:
Most recent operation = Reorganize

If that line isn't there, look for an entry like this:
Application name = db2reorg
3. Scroll backwards until you find the "Application handle"
entry. It will look something like this:
Application handle = NNNNN (where NNNNN is the actual
application handle)
Ensure that the correct application handle is found.
4. Issue the following command in the DB2 Command Line Processor
window, substituting the actual application handle for NNNNN:
db2 "force application (NNNNN)"
5. Because of the nature of the command being canceled, and
because the DB2 FORCE APPLICATION command is asynchronous, it
might take up to 30 minutes for the process to be canceled.
6. To verify that it has been canceled, repeat steps 1B and 2.
If no "Most recent operation = Reorganize" entry is displayed,
the reorganization has been canceled.
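
Steps 2 and 3 above (find the Reorganize entry, then scroll back to its application handle) can be sketched in Python. This is an illustrative sketch only. It assumes that application.out lists "Application handle" before the other fields of each application, as step 3 implies. The function name and the sample snapshot text are hypothetical.

```python
import re

HANDLE_RE = re.compile(r"^\s*Application handle\s*=\s*(\d+)")
REORG_RE = re.compile(
    r"^\s*(Most recent operation\s*=\s*Reorganize"
    r"|Application name\s*=\s*db2reorg)\b"
)

def find_reorg_handles(snapshot_text):
    """Return application handles whose entry mentions a reorganization."""
    handles = []
    current = None  # most recent "Application handle" seen so far
    for line in snapshot_text.splitlines():
        m = HANDLE_RE.match(line)
        if m:
            current = int(m.group(1))
        elif (REORG_RE.match(line) and current is not None
              and current not in handles):
            handles.append(current)
    return handles

# Hypothetical excerpt; real snapshot output contains many more fields.
sample = """\
Application handle                         = 101
Application name                           = dsmserv

Application handle                         = 202
Application name                           = db2reorg
Most recent operation                      = Reorganize
"""
for handle in find_reorg_handles(sample):
    # Print the command from step 4 for review rather than running it.
    print(f'db2 "force application ({handle})"')
```

The sketch prints the command instead of issuing it, so that the handle can be checked by hand first, as step 3 advises.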



Please see the following technote for additional information on
cancelling the reorganization.

      http://www-01.ibm.com/support/docview.wss?uid=swg21452146

Problem summary

****************************************************************
* USERS AFFECTED: All Tivoli Storage Manager server users.     *
*                                                              *
*                                                              *
****************************************************************
* PROBLEM DESCRIPTION: See error description.                  *
*                                                              *
*                                                              *
*                                                              *
****************************************************************
* RECOMMENDATION: Apply fixing level when available. This      *
*                 problem is currently projected to be fixed   *
*                 in levels 6.2.4 and 6.3.1. Note that this    *
*                 is subject to change at the discretion of    *
*                 IBM.                                         *
****************************************************************

Problem conclusion

This problem was fixed.

See Flash(Alert) 1580639 for information.
http://www.ibm.com/support/docview.wss?uid=swg21580639

Affected platforms:  AIX, HP-UX, Solaris, Linux and Windows.

Temporary fix

Comments

APAR Information

APAR number
IC81115
Reported component name
TSM SERVER
Reported component ID
5698ISMSV
Reported release
61L
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2012-02-01
Closed date
2012-03-30
Last modified date
2013-08-23

APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
TSM SERVER
Fixed component ID
5698ISMSV

Applicable component levels

R61A PSY
UP
R61H PSY
UP
R61L PSY
UP
R61S PSY
UP
R61W PSY
UP
R62A PSY
UP
R62H PSY
UP
R62L PSY
UP
R62S PSY
UP
R62W PSY
UP
R63A PSY
UP
R63H PSY
UP
R63L PSY
UP
R63S PSY
UP
R63W PSY
UP

Document Information

Modified date:
23 August 2013
