IBM Support

IT15468: "ANR3247W" CAUSED BY ABORT OF A BACKUP PERFORMING SIMULTANEOUS WRITE OF A FILE LARGER THAN "MAXFRAGMENTSIZE".


APAR status

  • Closed as program error.

Error description

  • Warning message:
    ANR3247W Process 1234 skipped 1 files on volume
    /filesystem/filevol1 because of pending fragments
    
    can appear in the activity log during data movement under the
    following scenario:
    
    The client is performing a backup using simultaneous write of a
    file that is larger than the MAXFRAGMENTSIZE setting on the
    server.
    Something then causes this transaction to abort on the server.
    There do not have to be any messages indicating the abort,
    because the client can silently retry after an abort.
    
    The transaction can be aborted for many reasons; some of them
    produce messages and some do not.
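As an illustration of the fragmentation step, here is a minimal sketch of splitting an object larger than a MAXFRAGMENTSIZE-style limit into fragments. This is not the server's actual algorithm; the function name and values are hypothetical.

```python
def plan_fragments(object_size, max_fragment_size):
    """Return (sequence_number, fragment_size) pairs covering object_size."""
    fragments = []
    offset = 0
    seq = 0
    while offset < object_size:
        # Each fragment is at most max_fragment_size; the last one is smaller.
        size = min(max_fragment_size, object_size - offset)
        fragments.append((seq, size))
        offset += size
        seq += 1
    return fragments

# A 25 GB object with a 10 GB fragment limit needs three fragments.
print(plan_fragments(25 * 1024**3, 10 * 1024**3))
```

A file at or below the limit yields a single fragment; only oversized objects enter the multi-fragment path that this APAR concerns.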
    
    One example is:
    ANE4037E: File 'file-namefile-namefile-name' changed during
    processing. File skipped.
    Here, because of the serialization options, the client aborted
    the transaction at its end because the file changed while the
    client was backing it up.
    
    This leads to a situation where the server queues the deletion
    of the entire fragment and also queues the deletion of just the
    copy of the fragment that is in the copy pool.
    When these deletions happen to run simultaneously on separate
    threads, the thread deleting the whole fragment hits a "not
    found" error, because the deletion of the partial fragment has
    removed part of the whole fragment out from under it.
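The race can be sketched with two threads deleting overlapping catalog entries. The IDs mirror the trace below, but the catalog, keys, and code are purely illustrative, not the server's implementation.

```python
import threading

# Hypothetical in-memory catalog of fragment rows, keyed by (fragId, poolId).
catalog = {
    (305918375, 5): "primary-pool fragment",
    (305918375, -2): "copy-pool fragment",
}
lock = threading.Lock()
errors = []

def delete_rows(keys, label):
    for key in keys:
        with lock:
            if key in catalog:
                del catalog[key]
            else:
                # The row was already removed by the other thread.
                errors.append(f"{label}: {key} not found")

# Thread A: deletes the whole super aggregate (both rows).
# Thread B: cleans up only the pending copy-pool fragment.
a = threading.Thread(target=delete_rows,
                     args=([(305918375, 5), (305918375, -2)], "aggregate deletion"))
b = threading.Thread(target=delete_rows,
                     args=([(305918375, -2)], "pending cleanup"))
a.start(); b.start(); a.join(); b.join()

# Exactly one thread finds the copy-pool row already gone.
print(errors)
```

Whichever thread reaches the shared copy-pool row second records a "not found" error, which is the condition the trace below captures.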
    
    This is not the same scenario as APAR IT13236.
    
    Diagnostics:
    
    Server 'BFDESTROY BFSAGGR AFMOVE'
    18:20:34.033 [17420][bfcreate.c][1894][bfCreate]:Bitfile
    305918375 will be stored as fragments
    18:20:34.033 [17420][bfcreate.c][2024][bfCreate]:Main txnP is
    123197bc8
    18:20:34.033 [17420][bfcreate.c][2035][bfCreate]:sub txnP is
    1234769e8
    18:20:34.033 [17420][afcreate.c][2258][AfAllocSpace]:Increasing
    requested amount from 10485760000 to 11568489790 for super
    aggregate
    18:20:34.041 [17420][afutil.c][950][AfUpdatePool]:Pool
    COPYPOOL(-2) updated with highMigOcc 26861699541645 and
    lowMigOcc 20892432976835.
    18:20:34.042 [17420][afcreate.c][2258][AfAllocSpace]:Increasing
    requested amount from 10485760000 to 11568489790 for super
    aggregate
    18:25:02.269 [17420][bfcreate.c][2118][bfCreate]:4319574762
    bytes stored from CreateBitfile, 4319574762 from smSessP, 0 from
    SD
    18:25:02.270 [17420][bfcreate.c][2150][bfCreate]:This is the
    last of 1 fragment
    18:25:02.270
    [17420][bfsaggr.c][1125][BfCatalogNewFragment]:Adding fragId
    305918375 with sequence number 0 to object 0
    18:25:02.270
    [17420][bfsaggr.c][2083][BfCatalogFragment]:Entering for objId
    305918375, fragId 305918375, bfId 305918375, poolId 5, pending
    ID 0
    18:25:02.270 [17420][bfsaggr.c][2119][BfCatalogFragment]:Setting
    BFSA_CK1 to 218 and BFSA_CK2 to 17
    18:25:02.270 [17420][bfsaggr.c][2128][BfCatalogFragment]:Exit,
    rc 0
    18:25:02.270 [17420][bfsaggr.c][1180][BfCatalogNewFragment]:SW
    Copy in pool -2 assigned pendingId 1498689438192
    18:25:02.271
    [17420][bfsaggr.c][2083][BfCatalogFragment]:Entering for objId
    305918375, fragId 305918375, bfId 305918375, poolId -2, pending
    ID 1498689438192
    18:25:02.271 [17420][bfsaggr.c][2119][BfCatalogFragment]:Setting
    BFSA_CK1 to 218 and BFSA_CK2 to 17
    18:25:02.272 [17420][bfsaggr.c][2128][BfCatalogFragment]:Exit,
    rc 0
    18:25:02.272
    [17420][bfsaggr.c][1220][BfCatalogNewFragment]:Exiting, rc 0
    18:25:02.282 [17420][aftxn.c][461][AfEndTxn]:Transaction
    incrementing migOcc 3342307619821 by 4320102250 (incrNumFiles
    1).
    18:25:02.282 [17420][aftxn.c][461][AfEndTxn]:Transaction
    incrementing migOcc 3327207578442 by 4320102250 (incrNumFiles
    1).
    18:25:02.283
    [17420][bfsaggr.c][1665][BfQueueSuperAggregateDeletion]:Entering
    for super aggregate 305918375
    18:25:02.283
    [17420][bfsaggr.c][1690][BfQueueSuperAggregateDeletion]:Exit, rc
    0
    18:25:02.283
    [17420][bfsaggr.c][1560][BfCleanupFragmentsByPendingId]:Cleaning
    up fragments for pendingId 1498689438192
    18:25:02.283
    [17420][bfsaggr.c][1729][BfQueueFragmentDeletion]:Queuing
    request for fragId 305918375, bfId 305918375, poolId -2
    18:25:02.285
    [17420][bfsaggr.c][1754][BfQueueFragmentDeletion]:Exit, rc 0
    18:25:02.285
    [17420][bfsaggr.c][1635][BfCleanupFragmentsByPendingId]:Exiting,
    rc 0
    
    Note both of these messages:
    18:25:02.283
    [17420][bfsaggr.c][1665][BfQueueSuperAggregateDeletion]:Entering
    for super aggregate 305918375
    and
    18:25:02.283
    [17420][bfsaggr.c][1729][BfQueueFragmentDeletion]:Queuing
    request for fragId 305918375, bfId 305918375, poolId -2
    
    and note that both happen after these entries are seen:
    18:25:02.282 [17420][aftxn.c][461][AfEndTxn]:Transaction
    incrementing migOcc 3342307619821 by 4320102250 (incrNumFiles
    1).
    18:25:02.282 [17420][aftxn.c][461][AfEndTxn]:Transaction
    incrementing migOcc 3327207578442 by 4320102250 (incrNumFiles
    1).
    
    The following db2 query can be used to show all current pending
    fragments:
    db2 "select * from tsmdb1.bf_super_aggregates where pendingid IS
    NOT NULL"
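In spirit, the query filters the super-aggregate catalog for rows whose pendingid is set. A minimal Python sketch with made-up rows (column names follow the query; the row values are invented):

```python
# Stand-ins for rows of tsmdb1.bf_super_aggregates; values are invented.
rows = [
    {"BFID": 305918375, "POOLID": -2, "PENDINGID": 1498689438192},
    {"BFID": 305900001, "POOLID": 5, "PENDINGID": None},
]

# Equivalent of: ... where pendingid IS NOT NULL
pending = [r for r in rows if r["PENDINGID"] is not None]
for r in pending:
    print(f"pending fragment: bfId {r['BFID']}, poolId {r['POOLID']}, "
          f"pendingId {r['PENDINGID']}")
```

A healthy server should show no such rows once in-flight transactions have completed; rows that persist indicate stuck pending fragments.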
    
    
    Tivoli Storage Manager Versions Affected:
    Tivoli Storage Manager Server: 7.1.x and higher on all platforms
    
    
    
    Initial Impact: Medium
    
    
    Additional Keywords: TSM IBM Spectrum Protect pending fragments
    

Local fix

  • Restart the server process/service to clean out the pending
    fragments.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * All IBM Tivoli Storage Manager and IBM Spectrum Protect      *
    * server users.                                                *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * see error description.                                       *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fixing level when available. This problem is currently *
    * projected to be fixed in levels 7.1.7.100, 7.1.8.0, and      *
    * 8.1.1.0. Note that this is subject to change at the          *
    * discretion of IBM.                                           *
    ****************************************************************
    

Problem conclusion

  • This problem was fixed.
    Affected platforms:  AIX, HP-UX, Linux, Solaris, and Windows.
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT15468

  • Reported component name

    TSM SERVER

  • Reported component ID

    5698ISMSV

  • Reported release

    71A

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2016-05-26

  • Closed date

    2016-12-15

  • Last modified date

    2016-12-15

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TSM SERVER

  • Fixed component ID

    5698ISMSV

Applicable component levels

  • R71A PSY

       UP

  • R71H PSY

       UP

  • R71L PSY

       UP

  • R71S PSY

       UP

  • R71W PSY

       UP

  • R81A PSY

       UP

  • R81L PSY

       UP

  • R81W PSY

       UP



Document information

More support for: Tivoli Storage Manager

Software version: 7.1.3

Reference #: IT15468

Modified date: 15 December 2016