Fixes are available
APAR status
Closed as program error.
Error description
ClearCase VOB shared memory becomes wedged and does not recover without manual intervention.

ClearCase 7.1.2.6
RedHat 5 Update 7
*VOB databases are stored on Network Attached Storage

Description of the Problem:
When storing VOBs on Network Attached Storage, individual VOBs may become unresponsive with the following messages within the db_server_log:

db_server(*****): Error: CRRDM0656E *** db_VISTA database error -926 - problem in shared memory lock manager: Process ***** timed out waiting for lock.
db_server(*****): Error: Timeout getting lock in VOB '/net/nas/test.vbs/db'.
db_server(*****): Warning: 'admin,pid==*****,euid==UNIX:UID-0, cleartool 'describe' 'vob:/vobs/test'' waited 299 seconds for a 'r' lock in the usual area in /net/nas/test.vbs/db!

A truss of the db_server process resembles the following:

17331 13:36:41 futex(0xb7cb90b8, FUTEX_WAIT, 68, {59, 360378300}) = -1 ETIMEDOUT (Connection timed out) <59.369790>
17331 13:37:41 kill(17331, SIG_0) = 0 <0.000008>
17331 13:37:41 time(NULL) = 1342039061 <0.000006>
17331 13:37:41 time(NULL) = 1342039061 <0.000006>
17331 13:37:41 futex(0xb7cb900c, FUTEX_WAKE, 1) = 0 <0.000008>
17331 13:37:41 clock_gettime(CLOCK_REALTIME, {1342039061, 9759788}) = 0 <0.000007>
17331 13:37:41 futex(0xb7cb90b8, FUTEX_WAIT, 70, {59, 990240212}) = -1 ETIMEDOUT (Connection timed out) <60.000617>
17331 13:38:41 kill(17331, SIG_0) = 0 <0.000007>
17331 13:38:41 time(NULL) = 1342039121 <0.000006>
17331 13:38:41 time(NULL) = 1342039121 <0.000006>
17331 13:38:41 futex(0xb7cb900c, FUTEX_WAKE, 1) = 0 <0.000007>
17331 13:38:41 clock_gettime(CLOCK_REALTIME, {1342039121, 10693032}) = 0 <0.000007>
17331 13:38:41 futex(0xb7cb90b8, FUTEX_WAIT, 72, {59, 989306968}) = -1 ETIMEDOUT (Connection timed out) <60.000586>
17331 13:39:41 kill(17331, SIG_0) = 0 <0.000007>
17331 13:39:41 time(NULL) = 1342039181 <0.000006>
17331 13:39:41 time(NULL) = 1342039181 <0.000006>
17331 13:39:41 futex(0xb7cb900c, FUTEX_WAKE, 1) = 0 <0.000008>
17331 13:39:41 clock_gettime(CLOCK_REALTIME, {1342039181, 11622577}) = 0 <0.000010>
17331 13:39:41 futex(0xb7cb90b8, FUTEX_WAIT, 74, {59, 988377423} <unfinished ...>
.
.
17331 13:40:41 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <59.998220>
17331 13:40:41 kill(17331, SIG_0) = 0 <0.000007>
17331 13:40:41 time(NULL) = 1342039241 <0.000006>
17331 13:40:41 time(NULL) = 1342039241 <0.000006>
17331 13:40:41 futex(0xb7cb900c, FUTEX_WAKE, 1) = 0 <0.000007>
17331 13:40:41 clock_gettime(CLOCK_REALTIME, {1342039241, 10196451}) = 0 <0.000006>
17331 13:40:41 futex(0xb7cb90b8, FUTEX_WAIT, 76, {59, 989803549} <unfinished ...>
.
.
17331 13:41:41 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) <60.000063>
17331 13:41:41 kill(17331, SIG_0) = 0 <0.000008>
17331 13:41:41 time(NULL) = 1342039301 <0.000006>
17331 13:41:41 write(2, 'CRRDM0656E *** db_VISTA database'..., 83) = 83 <0.000012>
17331 13:41:41 write(2, ': ', 2) = 2 <0.000008>
17331 13:41:41 write(2, 'Process 17331 timed out waiting '..., 41) = 41 <0.000008>
17331 13:41:41 write(2, '\n', 1) = 1 <0.000007>

Workaround:
Use direct attached storage (iSCSI) to split the database from the rest of the VOB pools and store the database locally.
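The db_server_log messages quoted in the error description can be searched for mechanically when checking whether a VOB server is showing this symptom. The following is a minimal sketch, not an IBM-supplied tool: the function name `find_lock_timeouts` and the regular expressions are assumptions derived only from the sample messages in this APAR, and real log lines may carry additional prefixes such as timestamps or host names.

```python
import re

# Patterns derived from the sample messages quoted in this APAR.
# Assumption: real log lines may have extra prefixes, so we search
# anywhere in the line instead of matching from its start.
LOCK_ERROR = re.compile(r"db_VISTA database error -926")
LOCK_TIMEOUT = re.compile(r"Timeout getting lock in VOB '(?P<vob>[^']+)'")
LOCK_WAIT = re.compile(
    r"waited (?P<secs>\d+) seconds for a '(?P<mode>\w)' lock .* in (?P<db>\S+?)!"
)


def find_lock_timeouts(lines):
    """Summarise lock-timeout symptoms found in an iterable of log lines."""
    summary = {"error_926": 0, "timeout_vobs": [], "waits": []}
    for line in lines:
        if LOCK_ERROR.search(line):
            summary["error_926"] += 1
        m = LOCK_TIMEOUT.search(line)
        if m:
            summary["timeout_vobs"].append(m.group("vob"))
        m = LOCK_WAIT.search(line)
        if m:
            # (database path, lock mode, seconds waited)
            summary["waits"].append(
                (m.group("db"), m.group("mode"), int(m.group("secs")))
            )
    return summary
```

Passing an open log file (for example `find_lock_timeouts(open(path))`, where `path` is wherever the db_server_log lives on the VOB server) returns a count of -926 errors, the VOB database paths that reported a lock timeout, and the recorded wait times.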
Local fix
Problem summary
Under certain conditions, errors on NAS VOB storage could trigger a deadlock in access to the VOB database files, causing db_server processes to hang.
Problem conclusion
A fix is available in ClearCase 7.1.2.9 and 8.0.0.5.
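As an illustration of how the two fix levels above might be applied when auditing installations, the sketch below compares a four-part ClearCase version string against them. The names `FIX_LEVELS` and `has_fix` are hypothetical, and the code assumes that later fix packs within the same release stream also contain the fix; this APAR does not state that, nor does it say anything about other release streams, so those return `None`.

```python
# Fix levels named in this APAR, keyed by release stream (the first three
# components of the version). Other streams are deliberately not listed.
FIX_LEVELS = {(7, 1, 2): 9, (8, 0, 0): 5}


def has_fix(version):
    """Return True or False for the two known streams, None otherwise.

    Assumption: fix packs are cumulative within a stream, so any level at
    or above the one named in the APAR is treated as containing the fix.
    """
    parts = tuple(int(p) for p in version.split("."))
    if len(parts) != 4:
        raise ValueError(f"expected a four-part version, got {version!r}")
    stream, level = parts[:3], parts[3]
    if stream not in FIX_LEVELS:
        return None  # this APAR text does not cover the stream
    return level >= FIX_LEVELS[stream]
```

For the environment in the error description, `has_fix("7.1.2.6")` evaluates to `False`, which is consistent with the problem being reported on that level.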
Temporary fix
Comments
APAR Information
APAR number
PM69279
Reported component name
CLEARCASE UNIX
Reported component ID
5724G2901
Reported release
711
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2012-07-19
Closed date
2012-12-15
Last modified date
2012-12-15
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
CLEARCASE UNIX
Fixed component ID
5724G2901
Applicable component levels
R711 PSN UP
Copyright and trademark information
IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.