A fix is available
APAR status
Closed as program error.
Error description
A timing issue at the end of an NDMP filer-to-server backup or restore can cause the session to hang if the NDMP operation failed. If the NDMP operation was launched by an administrative schedule, the schedule never completes, and subsequent occurrences of the schedule are missed. The issue does not appear to occur when the operation completes successfully.

To confirm the problem after a schedule is missed, check the output of SHOW PENDING: the schedule is listed and its deadline is in the past.

L2 diagnostic:
In a trace with CC, ADM, and SCHED enabled, the following appears after the backup job failed but before the schedule duration elapsed:

17:31:03.039 [136][cscmdsch.c][774][CsCmdSchedulerThread]: Command Scheduler: Schedule NDMP_SCHED started.
17:31:03.040 [579962][csmgr.c][1842][CsUpdateEvent]: Schedule : NDMP_SCHED
17:31:03.045 [579962][csmgr.c][1954][CsUpdateEvent]: Schedule : NDMP_SCHED
17:31:03.046 [579962][cscmdsch.c][964][CsRunCmdThread]: Schedule NDMP_SCHED has already been run - cannot start again
17:31:03.048 [136][cscmdsch.c][474][CsCmdSchedulerThread]: Command Scheduler: Skipping schedule NDMP_SCHED.

After the schedule duration is over, the following message repeats for the remainder of the trace, including the next scheduled start time:

18:30:19.244 [136][cscmdsch.c][502][CsCmdSchedulerThread]: Command Scheduler: Schedule NDMP_SCHED is expired, but still running.
18:30:19.261 [136][cscmdsch.c][502][CsCmdSchedulerThread]: Command Scheduler: Schedule NDMP_SCHED is expired, but still running.

For a backup job that was run manually, SHOW THREADS should still list a hung thread for the backup, similar to the following:

===========================================================
Thread 55, ID 2672 (0x0a70): AfStoreNativeThread
Parent=0, result=0, joining=0, detached=1, zombie=0, session=0
Waiting for Cond 50941240 (&sessP->ndmp_ssDoneCond) using mutex 0 (na).
Waiting from
Stack trace:
00000000772A186A NtWaitForMultipleObjects()+a
000007FEFD501430 GetCurrentProcess()+40
0000000077141220 WaitForMultipleObjects()+b0
000007FED76A1C5A pkWaitCondition()+3ea pkmonnt.c:1507
000007FED8458861 ssEndSession()+511 sssess.c:935
000007FED7995FC2 bfEndSession()+5f2 bfutil.c:1865
000007FED7B48749 DoEndSess()+99 afremote.c:2913
000007FED7B49224 AfStoreNativeThread()+334 afremote.c:3504
000007FED769A73F startThread()+35f pkthread.c:3249
000007FEE6077175 beginthreadex()+205
000007FEE6077377 endthreadex()+1d7
000000007714652D BaseThreadInitThunk()+d
000000007727C541 RtlUserThreadStart()+21
===========================================================
Thread 56, ID 10632 (0x2988): ShowThreadController
Parent=40, result=0, joining=0, detached=0, zombie=0, session=0

Versions Affected: All versions on all platforms
Initial Impact: medium
Additional Keywords: ANR1893E tsm
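The two signatures quoted in this description (the repeating "is expired, but still running" scheduler message and a thread waiting on ndmp_ssDoneCond) can be searched for mechanically in saved trace and SHOW THREADS output. The following Python sketch is an illustration only and is not an IBM tool. The function names stuck_schedules and has_ndmp_done_wait are hypothetical, and the patterns assume the message layout exactly as quoted in this APAR; source line numbers such as 502 are matched loosely because they may differ between server levels.

```python
import re
from collections import Counter

# Scheduler trace message as quoted in this APAR, for example:
# 18:30:19.244 [136][cscmdsch.c][502][CsCmdSchedulerThread]: Command Scheduler:
#   Schedule NDMP_SCHED is expired, but still running.
# Assumption: thread ID and source line number may vary, so they match \d+.
STUCK_RE = re.compile(
    r"\d{2}:\d{2}:\d{2}\.\d{3}\s+\[\d+\]\[cscmdsch\.c\]\[\d+\]"
    r"\[CsCmdSchedulerThread\]:\s+Command Scheduler: Schedule (?P<name>\S+) "
    r"is expired, but still running\."
)

# Condition wait as shown in the SHOW THREADS excerpt of this APAR.
WAIT_RE = re.compile(r"Waiting for Cond \w+ \(&sessP->ndmp_ssDoneCond\)")


def stuck_schedules(trace_text):
    """Count 'expired, but still running' messages per schedule name."""
    return Counter(m.group("name") for m in STUCK_RE.finditer(trace_text))


def has_ndmp_done_wait(show_threads_text):
    """Return True if any thread is waiting on ndmp_ssDoneCond."""
    return WAIT_RE.search(show_threads_text) is not None


if __name__ == "__main__":
    sample_trace = (
        "18:30:19.244 [136][cscmdsch.c][502][CsCmdSchedulerThread]: "
        "Command Scheduler: Schedule NDMP_SCHED is expired, but still running.\n"
        "18:30:19.261 [136][cscmdsch.c][502][CsCmdSchedulerThread]: "
        "Command Scheduler: Schedule NDMP_SCHED is expired, but still running.\n"
    )
    sample_threads = (
        "Thread 55, ID 2672 (0x0a70): AfStoreNativeThread\n"
        "Waiting for Cond 50941240 (&sessP->ndmp_ssDoneCond) "
        "using mutex 0 (na).\n"
    )
    for name, count in stuck_schedules(sample_trace).items():
        print(f"{name}: {count} 'expired, but still running' messages")
    print("ndmp_ssDoneCond wait found:", has_ndmp_done_wait(sample_threads))
```

A schedule name that is reported repeatedly by stuck_schedules, together with a positive has_ndmp_done_wait result, is consistent with the symptom described here; it does not by itself prove that this APAR is the cause.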
Local fix
To release the schedule, perform one of the following:
- Restart the TSM server.
- Delete the schedule and redefine it.

To avoid the issue, address the problem that is causing the operation to fail.
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* All Tivoli Storage Manager server users that perform NDMP    *
* operations                                                   *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See error description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Apply fixing level when available. This problem is currently *
* projected to be fixed in levels 6.3.5 and 7.1.1. Note that   *
* this is subject to change at the discretion of IBM.          *
****************************************************************
Problem conclusion
This problem was fixed. Affected platforms: AIX, HP-UX, Solaris, Linux, and Windows.
Temporary fix
Comments
APAR Information
APAR number
IC99949
Reported component name
TSM SERVER
Reported component ID
5698ISMSV
Reported release
63A
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2014-03-10
Closed date
2014-04-29
Last modified date
2014-04-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
TSM SERVER
Fixed component ID
5698ISMSV
Applicable component levels
R63A PSY UP
R63W PSY UP
R63S PSY UP
R63H PSY UP
R63L PSY UP
R71A PSY UP
R71S PSY UP
R71H PSY UP
R71W PSY UP
R71L PSY UP
Document Information
Modified date:
29 April 2014