
DB2 backup to NFS filesystem may hang

Troubleshooting


Problem

Starting with version 10, DB2 acquires a file lock on the backup target before starting the backup. If the target is on NFS and NFS is not properly configured, the backup may hang waiting for the lock. The V9.x family is not affected, because those versions do not acquire the file lock before proceeding with the backup.

Symptom

A db2trc trace shows the db2agent looping in the following sequence:

278455 156.478576500 | | | | | sqluReadMessageFromQueue entry [eduid 7712 eduname db2agent]
278457 156.478577312 | | | | | | sqlorque2 entry [eduid 7712 eduname db2agent]
....
279782 161.478715246 | | | | | | sqluCheckIfAgentInterrupted exit
279783 161.478715671 | | | | | sqluReadMessageFromQueue exit [rc = 0x870F00B9 = -2029059911 = SQLO_SEM_TIMEOUT]

The EDU trying to acquire the lock is db2med. Use procstack <db2_sysc_pid> (an AIX command) to dump the stack of db2med:

---------- tid# 89325961 (pthread ID: 9254) ----------
0x09000000000366e0 __fcntl(??, ??, ??) + 0x1e0
0x09000000000369a4 fcntl(0x3200000032, 0xc0000000c, 0xa000000073fd2d0, 0x1001, 0x9000000618f2038, 0x7c, 0xcafe, 0x0) + 0x44
0x0900000061a7daa4 sqloflock(0x110d61214, 0x100000000000001, 0x300000000000003) + 0x23c
0x0900000061a7c528 sqloopenp(0xa000000073fdc80, 0x900000009, 0x18000000180, 0x2c3000002c3, 0x0) + 0x7dc
0x090000006276a99c sqluInitFileDevice(SQLUMC_IBLK_T*,unsigned int,int,unsigned int)(??, ??, ??, ??) + 0x2d8
0x090000006277b3d4 sqluMCInitBackupMC(SQLUMC_IBLK_T*)(??) + 0x408

Cause

The NFS server configuration prevents the fcntl file lock request from completing.

Environment

Any UNIX system that mounts an NFS file system

Diagnosing The Problem

The following small Perl program tests whether a file lock can be acquired on the NFS file system. It creates a file and requests an exclusive lock on it with Perl's flock function. If the lock is acquired, the program prints a message saying that the file is locked. If the program hangs, this indicates an NFS configuration issue. Run the program as <progname> <nfsfile>.

<program>
#!/usr/bin/perl

use strict;
use warnings;
use Fcntl ':flock'; # Import LOCK_* constants

my ( $file ) = @ARGV ;
die "Usage: $0 <nfsfile>\n" unless defined $file ;

# Create (or truncate) the test file on the NFS file system
open(my $fh, '>' , $file ) or die "Could not open '$file' - $!";

# Get exclusive lock (blocks until the lock is granted)
flock($fh, LOCK_EX) or die "Could not lock '$file' - $!";
print "File $file locked, hit <ENTER> to unlock\n" ;
my $ans = <STDIN> ;

# Closing the file handle releases the lock
close($fh) or die "Could not close '$file' - $!";
print "File $file unlocked\n" ;

unlink($file) or die "Could not remove '$file' - $!";
print "File removed\n" ;
</program>
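
Because the program above blocks indefinitely when the lock cannot be granted, a variant with a timeout can be easier to use in scripts. The following is a minimal sketch, not part of the original technote. The helper name try_lock and the 10-second default timeout are assumptions chosen for illustration. It uses the same flock call as the program above and wraps it in the standard Perl alarm/eval idiom. On a hard-mounted NFS file system the process may sleep uninterruptibly, in which case the alarm may not fire and the sketch can still hang.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl ':flock';    # Import LOCK_* constants

# try_lock is a hypothetical helper (not part of the original program).
# It tries to take an exclusive lock on $file and gives up after
# $timeout seconds. Returns 'locked', 'timeout', or 'error: <reason>'.
sub try_lock {
    my ($file, $timeout) = @_;

    open(my $fh, '>', $file)
        or return "error: could not open '$file' - $!";

    my ($ok, $errno);
    my $finished = eval {
        # SIGALRM interrupts the blocking flock call; die leaves the eval
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        $ok    = flock($fh, LOCK_EX);
        $errno = "$!";
        alarm 0;
        1;
    };
    my $eval_err = $@;
    alarm 0;           # make sure no alarm is left pending

    if (!$finished) {
        close($fh);
        return $eval_err eq "timeout\n" ? 'timeout' : "error: $eval_err";
    }
    if (!$ok) {
        close($fh);
        return "error: could not lock '$file' - $errno";
    }

    close($fh);        # closing the file handle releases the lock
    unlink($file);     # remove the test file after a successful lock
    return 'locked';
}

my ($file, $timeout) = @ARGV;
if (defined $file) {
    # Assumed default of 10 seconds when no timeout is given
    my $result = try_lock($file, defined $timeout ? $timeout : 10);
    print "Lock test on $file: $result\n";
} else {
    warn "Usage: $0 <nfsfile> [timeout_seconds]\n";
}
```

Run it as <progname> <nfsfile> [timeout_seconds]. A result of "timeout" points to the same NFS locking issue that a hang of the original program does.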

Resolving The Problem

Back up to a local disk, mount the NFS file system with the "nolock" option, or back up to another file system where fcntl succeeds in acquiring a file lock.

Product: Db2 for Linux, UNIX and Windows
Component: DB2 Tools - Troubleshooting
Platform: AIX, HP-UX, Linux, Solaris
Version: 10.1

Document Information

Modified date:
16 June 2018

UID

swg21641535