Readme and release notes for IBM General Parallel File System (GPFS) 3.1.0.30: GPFS-3.1.0.30-power-AIX Readme

Readme file for: GPFS-3.1.0.30-power-AIX
Product/Component Release: 3.1.0.30
Update Name: GPFS-3.1.0.30-power-AIX
Fix ID: GPFS-3.1.0.30-power-AIX
Publication Date: 29 July 2010
Last modified date: 29 July 2010

Installation information

Download location

Below is a list of components, platforms, and file names that apply to this Readme file.

Fix Download for AIX

Product/Component Name:                   Platform:   Fix:
(GPFS) IBM General Parallel File System   AIX 5.3     GPFS-3.1.0.30-power-AIX
                                          AIX 6.1

Prerequisites and co-requisites

None

Known issues

  • Problem discovered in earlier GPFS releases

    During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

    The issue was discovered during internal testing, where an MPI-IO application was used to generate a synthetic workload. IBM is not aware of any occurrences of this issue in customer environments or under any other circumstances. Because the issue is specific to accessing cached data, it does not affect applications that use DirectIO (the IO mechanism that bypasses the file system cache, used primarily by databases such as DB2® or Oracle).

    This issue is limited to the following conditions:

    1. The workload consists of a mixture of writes and reads, to file offsets that do not fall on the GPFS file system block boundaries;
    2. The IO pattern is a mixture of sequential and random accesses to the same set of blocks, with the random accesses occurring on offsets not aligned on the file system block boundaries; and
    3. The active set of data blocks is small enough to fit entirely in the GPFS pagepool.

    The issue is caused by a race between an application IO thread doing a read from a partially filled block (such a block may be created by an earlier write to an odd offset within the block), and a GPFS prefetch thread trying to convert the same block into a fully filled one, by reading in the missing data, in anticipation of a future full-block read. Due to insufficient synchronization between the two threads, the application reader thread may read data that had been partially overwritten with the content found at a different offset within the same block. The issue is transient in nature: the next read from the same location will return correct data. The issue is limited to a single node; other nodes reading from the same file would be unaffected.
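    Conditions 1 and 2 above both depend on whether a file offset falls on a file system block boundary. The following is a minimal sketch of that test in shell arithmetic. The function name is_block_aligned is hypothetical, and the 262144-byte block size in the usage lines is only an illustrative value; the actual block size is whatever was chosen when the file system was created.

```shell
# Hypothetical helper (not part of GPFS): succeeds when OFFSET is a
# multiple of BLOCKSIZE, that is, when it falls on a block boundary.
is_block_aligned() {
    offset=$1
    blocksize=$2
    [ $((offset % blocksize)) -eq 0 ]
}

# Illustrative values only; 262144 is an assumed block size.
if is_block_aligned 524288 262144; then echo "524288: aligned"; fi
if ! is_block_aligned 1000 262144; then echo "1000: not aligned"; fi
```

    The check only illustrates what "aligned" means in the conditions listed above; it does not detect the problem itself.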

Installation information

After you have downloaded a GPFS for AIX update package into any directory on your system, use the following section to install the fix package.

  • Installing a GPFS update for AIX

    Complete these steps to install the fix package:

    1. Unzip and extract the BFF image(s) from the *.tar.gz file:

      gzip -d -c <filename>.tar.gz | tar -xvf -


    2. Verify the update's BFF image(s) in the directory.

      Normally, the BFF images in the directory would be similar to the following:
      Unnnnnn.gpfs.base.bff
      Unnnnnn.gpfs.msg.en_US.bff
      Unnnnnn.gpfs.docs.data.bff


      where nnnnnn represents the six (6) digits of the PTF number for the BFF image.

      For specific filenames, check the Readme for the GPFS update by clicking the "View" link for the update on the Download tab.


    3. Follow the installation and migration instructions in your GPFS Concepts, Planning and Installation Guide.
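    The file-name check in step 2 above can be sketched as follows. This is only an illustration: the function name is_gpfs_bff_name is hypothetical, and the pattern encodes nothing more than the "U, six digits, .gpfs.<fileset>.bff" shape described in that step.

```shell
# Hypothetical helper: succeeds when NAME looks like
# U<six digits>.gpfs.<fileset>.bff
is_gpfs_bff_name() {
    case "$1" in
        U[0-9][0-9][0-9][0-9][0-9][0-9].gpfs.*.bff) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on every *.bff file in the current directory.
for f in *.bff; do
    [ -e "$f" ] || continue   # the glob matched nothing
    if is_gpfs_bff_name "$f"; then
        echo "ok:         $f"
    else
        echo "unexpected: $f"
    fi
done
```

    Run the loop in the directory where the tar file was extracted. A name reported as "unexpected" is not necessarily wrong; compare it with the file list in the Readme for the update.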

  • Upgrading GPFS nodes

    Note: The node-by-node upgrade described below cannot be used to migrate from GPFS 2.3 to later releases. For example, upgrading from 2.3.x to 3.1.y requires a complete cluster shutdown, an upgrade installation on all nodes, and then a cluster startup.

    GPFS can be upgraded either one node at a time or on all nodes in the cluster at once. When upgrading GPFS one node at a time, perform the steps below on each node in the cluster in sequence. When upgrading the entire cluster at once, GPFS must be shut down on all nodes in the cluster before upgrading.

    When upgrading nodes one at a time, you may need to plan the order in which the nodes are upgraded. Verify that stopping a particular machine does not cause quorum to be lost, and that the machine is not the last available NSD server for some disks. Upgrade the quorum and manager nodes first. When upgrading the quorum nodes, upgrade the cluster manager last to avoid unnecessary cluster failover and election of new cluster managers.

    1. Prior to upgrading GPFS on a node, all applications that depend on GPFS (e.g. Oracle) must be stopped. Any GPFS file systems that are NFS exported must be unexported prior to unmounting GPFS file systems. If tracing was turned on, then tracing must be turned off before shutting down GPFS as well.
    2. Stop GPFS on the node. Verify that the GPFS daemon has terminated and that the kernel extensions have been unloaded (mmfsenv -u). If mmfsenv -u reports that it cannot unload the kernel extensions because they are "busy", the install can proceed, but the node must be rebooted after the install. "Busy" means that some process has its current directory in a GPFS file system or has an open file descriptor there. The freeware program lsof can identify the process, which can then be killed. Then retry mmfsenv -u; if it succeeds, a reboot of the node can be avoided.
    3. Upgrade GPFS using the installp command or via SMIT on the node.
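    The decision in step 2 above can be sketched as a small wrapper. This is an illustration under a stated assumption, not a documented interface: the readme only says that mmfsenv -u "reports" a busy condition, so treating a non-zero exit status as "busy" is an assumption to confirm on your system. The names kernel_ext_status, reboot_advice, and MMFSENV are hypothetical.

```shell
# ASSUMPTION: "mmfsenv -u" exits with a non-zero status when the kernel
# extensions are "busy". The readme does not state this; verify it first.
# MMFSENV can be overridden, which also allows a dry run without GPFS.
MMFSENV=${MMFSENV:-mmfsenv}

# Prints "unloaded" if the unload command succeeds, "busy" otherwise
# (a missing command is also reported as "busy").
kernel_ext_status() {
    if "$MMFSENV" -u >/dev/null 2>&1; then
        echo "unloaded"
    else
        echo "busy"
    fi
}

# Turns the status into the action described in step 2.
reboot_advice() {
    case "$1" in
        unloaded) echo "no reboot needed after the install" ;;
        *)        echo "free the file system (see lsof) and retry, or reboot after the install" ;;
    esac
}
```

    For example, reboot_advice "$(kernel_ext_status)" prints the follow-up action for the current node once GPFS has been stopped.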

Additional information

  • Notices

    [June 9, 2010]

    A build error caused an issue with the DMAPI function in the GPFS 3.2.1-20 package that was released on May 27, 2010. The corresponding packages have now been replaced on the service download site.

    If you installed the May 27 GPFS 3.2.1-20 package and mounted a DMAPI-enabled file system while running GPFS 3.2.1-20 (for example, -z yes by means of the HSM features of TSM), please contact IBM Support. The replacement 3.2.1-20 package works as designed, but does not fix a file system that was mounted with the problematic 3.2.1-20 package.

    Verify that you have the correct package installed by running the rpm -qi gpfs.base command. Make sure that the Build Date is Mon 07 Jun 2010.

    [June 2, 2010]

    A build error caused an issue with the DMAPI function in the GPFS 3.3.0-6 package that was released on May 22, 2010. The corresponding packages have now been replaced on the service download site.

    If you installed the May 22 GPFS 3.3.0-6 package and mounted a DMAPI-enabled file system while running GPFS 3.3.0-6 (for example, -z yes by means of the HSM features of TSM), please contact IBM Support. The replacement 3.3.0-6 package works as designed, but does not fix a file system that was mounted with the problematic 3.3.0-6 package.

    Verify that you have the correct package installed by running the rpm -qi gpfs.base command. Make sure that the Build Date is Thu 27 May 2010.

    [April 1, 2010]

    During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

    See "Known issues" above for details.

    [December 17, 2009]

    Support for GPFS 3.1 has been extended only for AIX and Linux on POWER systems. Service updates will be made available for other Linux platforms, but support is not being extended.

    [November 9, 2009]

    GPFS 3.3.0-1 does not operate correctly with file systems created with GPFS V2.2 (or older). Such file systems can be identified by running "mmlsfs all -u": if "no" is shown for a file system, that file system uses the old format and cannot be used with GPFS 3.3.0-1. GPFS 3.3.0-2 corrects this issue.

    [November 7, 2008]

    GPFS 3.2.1.7 contained a change that affects the TSM HSM recall process for files with a stub size greater than 0, causing hangs during recalls. To avoid this problem, set the configuration parameter dmapiDataEventRetry to 'no' with the command 'mmchconfig dmapiDataEventRetry=no -i'.

    [September 11, 2008]

    The 3.2.1-5 maintenance level had a data integrity problem using the mmap feature to write or update files on Linux and AIX. The 3.2.1-6 maintenance level is the recommended upgrade path from versions 3.2.0-0 through 3.2.1-4.

  • Package information

    The update images listed below and contained in the tar image with this README are maintenance packages for GPFS. The update images can be directly applied to your system.

    The update images require a prior level of GPFS. Thus, the usefulness of this update is limited to installations that already have the GPFS product. Contact your IBM representative if you desire to purchase a fully installable product that does not require a prior level of GPFS.

    After all BFFs are installed, you have successfully updated your GPFS product.

    Update to Version:

    3.1.0.30

    Update from Version:

    3.1.0.0 through 3.1.0.29

    Update (tar file) contents:

    README
    changelog
    U811843.gpfs.docs.data.bff
    U824369.gpfs.msg.en_US.bff
    U829152.gpfs.base.bff
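    The "Update from Version" range above can be checked against an installed level with a sketch like the following. The function name can_update_from is hypothetical, and how the installed level is obtained (for example from the output of lslpp -L gpfs.base on AIX) is left to the reader.

```shell
# Hypothetical helper: succeeds when LEVEL (for example 3.1.0.12) lies in
# the "Update from Version" range 3.1.0.0 through 3.1.0.29.
can_update_from() {
    case "$1" in
        3.1.0.*) fix=${1#3.1.0.} ;;   # keep only the last component
        *) return 1 ;;
    esac
    case "$fix" in
        ''|*[!0-9]*) return 1 ;;      # empty or not a plain number
    esac
    [ "$fix" -ge 0 ] && [ "$fix" -le 29 ]
}

if can_update_from 3.1.0.12; then echo "3.1.0.12: update applies"; fi
if ! can_update_from 3.2.1.5; then echo "3.2.1.5: update does not apply"; fi
```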

  • Changelog for GPFS 3.1.x

    Notes

    Unless specifically noted otherwise, this history of problems fixed for GPFS 3.1.x applies for all supported platforms.

    Notices

    GPFS V3.1 maintenance levels 10 (GPFS-3.1.0.10) through 12 (GPFS-3.1.0.12) cannot coexist with other maintenance levels.

    All nodes in the cluster must conform to one of these maintenance level compatibility restrictions:

    • All nodes must be at maintenance levels 1-9 or 13 and later (GPFS-3.1.0.1 through GPFS-3.1.0.9 or GPFS-3.1.0.13 and later)
    • All nodes must be at maintenance levels 10-12 (GPFS-3.1.0.10 - GPFS-3.1.0.12)
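    The compatibility restriction above can be expressed as a small check over the maintenance levels of all nodes. This is a sketch only: the function name levels_can_coexist is hypothetical, and level 0 (GPFS-3.1.0.0), which the notice does not mention, is grouped with levels 1-9 as an assumption.

```shell
# Hypothetical helper: takes the maintenance level of every node (the N in
# GPFS-3.1.0.N) and succeeds only if all of them may coexist.
# ASSUMPTION: level 0 is treated like levels 1-9.
levels_can_coexist() {
    in_10_12=0
    outside=0
    for level in "$@"; do
        if [ "$level" -ge 10 ] && [ "$level" -le 12 ]; then
            in_10_12=1
        else
            outside=1
        fi
    done
    # Mixing the 10-12 group with any other level is not allowed.
    [ "$in_10_12" -eq 0 ] || [ "$outside" -eq 0 ]
}

if levels_can_coexist 9 13 30; then echo "9 13 30: compatible"; fi
if ! levels_can_coexist 9 10; then echo "9 10: not compatible"; fi
```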

    Problems fixed in GPFS 3.1.0.30 [September 21, 2009]

    • Fix for a race condition between inode expansion and file system manager migration.
    • Fixed GetSomeDataBlockDiskAddrs to synchronize with metanode takeover.
    • setxattr(ACL) was setting aceLength incorrectly causing mmgetacl to fail.
    • Fixed a false no space error returned from file creation when disks still have space left.
    • Invoke the syncfsconfig user exit after mmchconfig disk related changes.
    • Correct code that caused fsck problems to be mangled or lost when machines of different endianness are involved in shipping the problem between the worker node and the Stripe Group manager.
    • This update addresses the following APARs: IZ53516 IZ59830.

    Problems fixed in GPFS 3.1.0.29 [July 14, 2009]

    • Allow a cluster with a single node quorum to have a tiebreaker disk.
    • Fixed node crash when shutting down gpfs due to mmap counter.
    • Avoid hang after forced unmount interrupts addition of new inodes.
    • Prevent rare data corruption after node failure just before enabling log replication.
    • Correct return value type of getMaxAcqSeqNo to prevent 32-bit overflow.
    • Fix a deadlock during token manager appointment during heavy workload.
    • Fix for a race condition between quota file update and quota manager cleanup.
    • This update addresses the following APARs: IZ53325 IZ53327 IZ53328.

    Problems fixed in GPFS 3.1.0.28 [May 18, 2009]

    • Fix dereference of stale OpenFile pointer in gpfsMmap.
    • Do proper endian conversion on imported FskError inode problems when imported from a different endian machine.
    • Fixed a race between sync and create snapshot processes.
    • Fixed code which can cause an assert during stripe group cleanup.
    • Fix incorrect block calculation when using gpfs_preallocate.
    • Fix rare assert due to mmchmgr command running during node recovery.
    • Fixed incomplete error checking when acquiring a byte range token for fcntl locking.
    • Fix mmdumpkthreads/kdump for BlueGene/P IO nodes.
    • Correct parsing of mmcrfs -i parameter.
    • Fix a problem in restripe code where a spurious error is generated when the most recent snapshot is in a state where it can't be opened.
    • Fixed loop in error path during snapshot file creation.
    • This update addresses the following APARs: IZ48534 IZ50232 IZ50236.

    Problems fixed in GPFS 3.1.0.27 [March 30, 2009]

    • AIX errno improperly translated by Linux node resulting in misleading message.
    • Fix for a spurious abnormal shutdown following stripe group manager resignation in complex failure scenarios.
    • Fix assert caused when attempt to become metanode fails.
    • Fixed fsck to update the stripegroup descriptor if it fixed a fatally flawed quota inode.
    • Calculate the available pagepool memory in 64 bit to prevent integer overflow.
    • Fix rare deadlock that occurs if a node reboots multiple times during the join protocol.
    • Fix rare segmentation violation that occurs if filesystem panics under heavy stress.
    • Fix rare single node hang due to stripegroup panic and quorum loss during restripe.
    • Fix for a log recovery problem after an interrupted mmdelsnapshot command.
    • Fix fsck to print the right message when user declines to fix a compare mismatch error.
    • Fix log recovery in a rare situation arising when a node fails just after disk failure in a stripe group with replicated metadata.
    • Fix a memory leak in mmlinkfileset command.
    • Change the data type of st_size in stat64_user32 and stat64_user64 to be Int64.
    • Explicitly set the mode of the intermediate SSL key files.
    • Keep quotas current if replication changed by restripe.
    • Correct non-root user handling of tsfsattr return code if kernel extension already set errno.
    • Fix deadlock encountered during mmap reads.
    • Fix problem where reused inode has leftover SX byte-range lock in Token Manager state.
    • Fix a problem in restripe of snapshot copies of fileset metadata files.
    • Fixed mmap code that fixes close errors when compiling in GPFS.
    • Fix quota accounting with snapshot block changes.
    • Fix for an inode hold count leak in fcntl code on Linux.
    • This update addresses the following APARs: IZ44246 IZ45301 IZ45533 IZ45656.

    Problems fixed in GPFS 3.1.0.26 [February 9, 2009]

    • Fixed problem referencing snapshots by the "latest name" established by mmsnaplatest.
    • Remove leftover respawn log files.
    • Correctly gather trace reports.
    • Fix mmchattr to use the stat64 structure to avoid errors on really large files.
    • Prevent process group pid (0) termination within signal handler.
    • Verify that intermediate versions of /etc/filesystems and /etc/fstab are not empty when root filesystem is full.
    • Add validity check for mmlsattr array index.
    • Strengthen input specification checking for the mmfileid command.
    • Fix cases where mmpmon may not recognize certain host names in certain configurations.
    • This update addresses the following APARs: IZ42207 IZ42208 IZ42209 IZ42212 IZ42310 IZ42312 IZ42314 IZ42350 IZ42355 IZ42356.

    Problems fixed in GPFS 3.1.0.25 [December 18, 2008]

    • Performance improvement for multi-threaded read-only workloads.
    • Properly cleans up filesystem having inodes with fatal errors.
    • Fixed a problem where reading extended attributes from a newly created file could sometimes return old attributes of a deleted file.
    • Reduced SMP locking contention in certain read/write code paths.
    • Increased size of largest lun that can be added to the file system to 8x the largest lun used to create it.
    • Fixed cxiStartIO NULL ptr when a disk fails.
    • Fixed problem with loss of cluster membership on failing node due to incorrect daemon version number.
    • Set xattr must set the xattr file length on the local cached xattr file even if it didn't allocate the inode.
    • Fixed race condition in directio write path.
    • Make char device immediately available after mknod completes.
    • Improved SMP scaling for DIO workloads on AIX.
    • This update addresses the following APARs: IZ37753 IZ37913 IZ38597 IZ38598 IZ38599 IZ38600 IZ38601 IZ38602 IZ38603 IZ38604 IZ39037 IZ39038.

    Problems fixed in GPFS 3.1.0.24 [October 30, 2008]

    • Fixed cxiStartIO NULL ptr when a disk fails.
    • Fixed system hang problem when disks are almost full.
    • Fixed case where trace report file exceeds 2G on 32-bit Linux.
    • Fixed loader problems due to shmat when trying to load from gpfs filesystem.
    • Deadlock during recovery handling synchronous RecLockRetry messages.
    • Keep only the 10 most recent system map files.
    • Fixed code which caused mmchmgr to hang.
    • Always use /bin/ksh to start mm commands on remote nodes.
    • locks_remove_flock BUG call on SLES SP2, or 2.6.18 kernels.
    • Allow pagepools greater than 2G.
    • When checking space utilization, use df -P to ensure proper parsing of output.
    • Corrected the handling of node names that contain the dash character.
    • Reduced unnecessary mmfsck error messages or assertion failures when inodes are corrupted.
    • Null pointer dereference in cxiRefOSNode when nfsWatchKproc closes a file after unmount.
    • Fix for a rare race condition on Linux during GPFS shutdown that can cause an oops with do_lookup on the stack.
    • Limit mmlsattr and mmchattr to work only on regular files or directories.
    • On Linux, tolerate a non-default setting of kernel.pid_max.
    • Explicitly initializes the exception table for the x86_64 platform.
    • Allowed mmchfs to set the estimated number of nodes for a filesystem. This will only affect new pools created after the change.
    • If the inode is fatally flawed with bad indirection level or bad indirection block, query the user for deletion of the inode.
    • Fixed race condition in mmcheckquota command.
    • This update addresses the following APARs: IZ33868 IZ33869 IZ33870 IZ33872 IZ34226 IZ34229 IZ34233 IZ34236 IZ34237 IZ34238 IZ34242 IZ34245 IZ34246 IZ34248 IZ34250 IZ34251 IZ34343 IZ34716 IZ35034 IZ35157 IZ35158 IZ35172.

    Problems fixed in GPFS 3.1.0.23 [September 18, 2008]

    • Fixed msync EIO failure.
    • Fixed rare race condition between early msgs and group protocol that occur during random node failures.
    • Improved performance of the file close path when many instances of the file are open.
    • Eliminated a rare assert when mounting a filesystem after deleting a disk that was added since the last filesystem mount.
    • Cleared bigger NFS filehandle size area on AIX 6.1 NFS servers. Can affect NFS client /bin/pwd command output.
    • Fixed assert in dmapi msg handlers when sending an rpc to sgmgr due to bad sgInfo.mgr field.
    • Fixed endless loop trying to remount a filesystem from a remote cluster when the home cluster loses quorum in the middle of a join protocol.
    • Fixed Linux kmalloc memory leak found during POSIX fcntl lock test.
    • Fixed rereadSGDesc to skip reading invalid disk structures in the descriptor.
    • VM_IO flag has been masked off in GPFS mmap code to allow parent process to ptrace its child process.
    • Improved performance of multi-threaded workloads on multiprocessor machines.
    • Fixed pwd problem after automigration.
    • Fixed error(EINVAL) while writing to a new file in dataship mode.
    • Fixed problem in "mmrestripefs -b" so it would properly rebalance the file system.
    • Fixed deadlock after running mmapplypolicy when having problem deleting a storage pool.
    • Added -N option to mmfileid to limit the nodes used as workers.
    • This update addresses the following APARs: IZ28021 IZ29602 IZ29750 IZ29751 IZ29753 IZ29754 IZ29755 IZ29756 IZ29758 IZ29761 IZ29762 IZ30082 IZ30811 IZ30813 IZ30819 IZ30820 IZ30821 IZ30822 IZ30823 IZ30824.

    Problems fixed in GPFS 3.1.0.22 [July 31, 2008]

    • Improved performance of restoring sparse files when using mmrestorefs of a previous snapshot.
    • Fixed a condition that caused clean buffer threads to wait for a file system descriptor update after the file system had panicked.
    • Fix for a rare deadlock during token recovery when multiple file systems are going through recovery simultaneously.
    • Fixed a window where a combination of node failures and transient file system errors could cause file system corruption.
    • Fixed deadlock in mmap page write trying to make indirect block valid.
    • Fixed remote node recovery being impacted due to local recovery problems.
    • Fixed potential deadlock for dmapi respond event, when communication between session node and event node is broken.
    • STM_Migrate returns during an iterator loop without calling LastObj to detach.
    • assert (advObjP == ofP->advLkObjP) when state not cleared on last close.
    • writebehindActive assert after I/O errors caused SG panic.
    • Allow max shared segment size to be up to 2G on AIX 64-bit.
    • Exception table feature of Linux kernel 2.6 is being used so that kernel does not raise Oops and continues execution in normal way if it encounters a soft page fault while accessing a particular memory region in GPFS.
    • Fixed problem locating free space on newly added disks.
    • Fixed inode scan algorithms to handle corrupted inodes and still get the valid ones that are in the same block of inodes.
    • Fixed assertion, nInodesTotal > 0 in pit.C, when the number of inodes created in a filesystem * the number of snapshots is greater than 2**31-1.
    • This update addresses the following APARs: IZ23912 IZ25633 IZ25640 IZ25642 IZ25643 IZ25645 IZ25757 IZ25759 IZ25760 IZ25761 IZ25772 IZ25773 IZ25774 IZ25775 IZ25776 IZ25777 IZ25778.

    Problems fixed in GPFS 3.1.0.21 [June 19, 2008]

    • Correct space error conditions from mmrpldisk command.
    • Fixed deadlock in snapshot commands if they are invoked in a narrow time window during which mmfsck is finishing.
    • Fixed race condition dealing with socket reconnects.
    • Fixed a problem where deleting or renaming a file system could cause deadlock if a new file system is created using the old name.
    • Zero the lckdat structures before use (posix_lock_test).
    • If -N is explicitly specified on the gpfs.snap command, do not assume -a as well.
    • Prevent data corruption in rare conditions following a forced unmount due to disk failure.
    • The mmgetacl command should display inherited FileInherit ACEs as InheritOnly for directories.
    • Fixed possible lost update of quota shares if quota manager is changed before the update is saved to quota file.
    • The mmfsck command with -c option reports spurious CmpMismatch errors for directories on large block filesystems.
    • Fixed adding quorum nodes when there are tiebreaker disks.
    • Add additional connection related messages to mmfs.log.latest.
    • Fixed mmap data corruption.
    • Improved checking on block allocation to avoid potential data corruption.
    • This update addresses the following APARs: IZ21698 IZ22446.

    Problems fixed in GPFS 3.1.0.20 [May 8, 2008]

    • Fixed unmount code to clear stalled mount flags.
    • Fixed rare deadlock between sg panic due to disk failures and recovery.
    • Fixed assert during clmgr election when using tiebreaker disks.
    • Fixed a Linux direct IO data miscompare when user buffer is not page aligned.
    • Fixed problem in executing parallel utilities on a 2 node cluster, where non-manager node fails on its last work item after the manager node has completed all other items.
    • Fixed assert during token cleanup or token migration because it could not release some tokens due to tRefCount.
    • Fixed assert due to node failure while transferring tokens.
    • Fixed problem in dm_get_dmattr prefetch code.
    • Fixed assert unmounting a dmapi enabled fs from a remote node.
    • Fixed the mmputacl and mmeditacl commands, which would not accept user and group names containing a space character when used on NFSV4 ACLs.
    • Fixed a rare deadlock between quota client and file system sync when the file system manager fails.
    • Fixed assertion failure when migrating logs off a disk being deleted without having suspended the disk previously.
    • Fixed problem, which under certain stress workload, could cause a deadlock with a long waiter "change_lock_shark waiting to set acquirePending flag" on one of the nodes.
    • Fixed a problem where a combination of node failures and I/O errors could lead to deadlock.
    • Fixed problem where mmcrvsd fails when vsd is running on ml0 and gpfs management is running over reliable hostname network.
    • Fixed problem where a combination of node failures and I/O errors could lead to deadlock.
    • Fixed gpfs.snap to handle df output when the filesystem name is long.
    • Prevent a deadlock during cleanup in some cases after multiple failures cause the stripe group manager to resign.
    • Fixed disk rediscovery code path driven after fixing disk failure.
    • Fixed umask in gpfs.snap.
    • Fixed a rare failed assertion condition following asynchronous recovery on Linux after an interrupted deletion of a file open on multiple nodes.
    • Reduce impact on applications while mmrestripefs, mmdeldisk, or mmchdisk start are running.
    • Show system information in mmfsadm dump version.
    • Fixed possible assertion failure in the (very unlikely) case that an FIOASYNCQX ioctl or kxEpollAdd call fails for a new connection.
    • Fixed LOGSHUTDOWN errors from epoll failures to print a real error value rather than always reporting ENOMEM.
    • Fixed file system corruption problem when using snapshots on a dmapi enabled file system.
    • Prevent rare assert when using mmrestripefs on a busy filesystem.
    • Avoid rare assert after forced unmount during heavy load.
    • Fixed race condition, where after some transient errors that caused a file system to become temporarily unavailable, subsequently issued gpfs commands could hang ("waiting for SG cleanup").
    • Fixed a small window, where transient errors during log recovery could cause some recently forced log records to be lost, potentially leading to file system corruption.
    • Fixed error when deallocating low-level files after running out of space.
    • Fixed listing of .snapshots dir, when readdir is called with small buffer.
    • On AIX, look for the openssl command first in /usr/bin and if it is not found there, try /opt/freeware/bin/openssl.
    • Export user provided environment variables prior to starting the main GPFS daemon.
    • This update addresses the following APARs: IZ17622 IZ18686 IZ18772 IZ19143 IZ19196 IZ19200 IZ19201 IZ19202 IZ19203 IZ19204 IZ19207 IZ19208 IZ19210 IZ19212 IZ19218 IZ19223 IZ19241 IZ19242 IZ19243 IZ19244 IZ19245 IZ19252 IZ19253 IZ19255 IZ19256 IZ19257 IZ19800 IZ19804 IZ19805 IZ19806 IZ19807 IZ19808 IZ19810 IZ19811 IZ19812 IZ20148 IZ20149 IZ20151 IZ21177 IZ21185.

    Problems fixed in GPFS 3.1.0.19 [March 27, 2008]

    • Enhanced mmrestripefs -r to dissolve unused and ill-replicated log groups in metadata replicated file systems.
    • Improved command usage of mmdefedquota.
    • Quiesce ops hang indefinitely after a filesystem panic.
    • Fixed a rare race condition that could cause livelock in a multicluster environment during a metanode token request.
    • Fixed the problem in dm_get_dirattrs; reading past end of directory.
    • Fixed problem with writing to a file using mmap when snapshots are used. The mmap code was incorrectly updating the file data in the snapshot, which sometimes caused assertion failures, but was incorrect in any case. Change the page fault handler to perform a copy-on-write check the first time a block is accessed.
    • Fixed a problem where SOFS read of an offline file failed with EACCES.
    • Prevent DeadManSwitch timeout on cluster with single quorum node and a tiebreaker disk.
    • Fixed a deadlock where fcntl revoke is still waiting in crtWaitSeqNo after seqNum has changed.
    • Initialize the internal GPFS variables prior to the exportfs step.
    • Fixed isAlloc assert problem after doing mmcrsnapshot with newly truncated large files.
    • Fixed an assert caused by a race condition between snapshot deletion and recovery.
    • Fixed a deadlock in SGPanic waiting for recoverAllocManager to signal sgMgrOpEnd.
    • Fixed assertion triggered when attempting to obtain mutex when SNMP support is enabled.
    • Fixed a race between cluster manager election and node quorum designation when using minority quorum aka tiebreaker disks.
    • Fixed an assert after doing mmchfs -V that upgrades filesystem from 2.3 level to 3.1 level which creates a new fileset quota file.
    • Fixed a problem in parallel traversal of inode file function when restarting a work item of a failed node.
    • Fixed a race condition where, after deleting a disk, creating or writing a new file could fail if snapshots are present.
    • Fixed an assert caused by a race condition in spooling log records and processing log wrap.
    • Fixed a spurious error during a restarted snapshot deletion command.
    • Fixed auxData.synchedFSize >= 0 assertion.
    • Complete server-side fcntl revoke, and reply to acquire message.
    • Fixed a performance problem with root user allocations when quotas are in use.
    • Added new configuration parameter dmapiMountEvent to control how dmapi MOUNT and PREUNMOUNT events are generated. Available values are all, SessionNode, and LocalNode.
    • Fixed a problem where there is a hang in rare cases after disk failure, or other serious error, causes a stripe group panic.
    • Fixed mmdeldisk to migrate all directories off the disk being deleted, even if the directory has invalid blocks.
    • This update addresses the following APARs: IZ14182 IZ14653 IZ14735 IZ14933 IZ14972 IZ15561 IZ15785 IZ16124 IZ16128 IZ16135 IZ16138 IZ16140 IZ16151 IZ16161 IZ16420 IZ16666 IZ16667 IZ16860 IZ16917 IZ16918 IZ16919 IZ16920 IZ16921 IZ16945 IZ16947 IZ16950 IZ16951 IZ16960 IZ16965.

    Problems fixed in GPFS 3.1.0.18 [February 21, 2008]

    • On Linux, fixed mmap deadlock scenario that occurs with high IO stress on files that are write mapped.
    • Enabled the user to view the maximum disk size that can be added to the filesystem using the mmdf command.
    • Fixed Null pointer dereference in posix_locks_deadlock.
    • Fixed a problem where file system runs out of metadata space.
    • The command mmlsmount now exits gracefully with the proper error message when executed on a node that is part of a multicluster environment with a remote filesystem mounted across clusters and one or more clusters are in the arbitration state.
    • Fixed problem in mmfsck after it detects a connection problem so that the next mmfsck attempt can continue.
    • Fixed problems in handling file deletions in conjunction with remounts.
    • Fixed a window where a token server failure under high load could cause a kernel panic (assert: "get_obj_status() == LkObj::valid" in rdwr.C ...).
    • Fixed a problem where an SOFS read of an offline file failed with EACCES.
    • Fixed a deadlock when a regular file read is into a mapped page of the same file.
    • Fixed Deadlock: SGMgr takeover, fcntl operation, fcntl revoke processing.
    • On Linux, fixed kernel oops that occurs when reading a file via sendfile when the file is opened for direct I/O.
    • Fixed a problem in the dm_get_events() call so that it always returns the number of events no more than maxmsgs specified.
    • Fixed a kernel oops on Linux due to a race during file descriptor list expansion.
    • Allow listing and getting extended attributes on the .snapshots directory.
    • Correct a problem that was preventing mmchnsd from removing the NSD servers.
    • Fixed the gpfs_getpoolname() API so that it runs properly in a 32-bit application with 64-bit GPFS on the x86_64 platform.
    • Correct the handling of the ignoreStartupMount node override files.
    • Allow keepalive to be set on GPFS socket connections to prevent firewall timeouts on inactive sessions. Use mmchconfig socketKeepAlive=yes to enable.
    • Do not use the result from hostname for gpfs.snap; use the defined admin interface instead.
    • Fixed a problem with gpfs_prealloc() when not all data disks are available.
    • Fixed a kernel panic following mmchfs -V on a mounted pre-3.1 filesystem.
    • Fixed a problem in NFS when the number of active files exceeds the GPFS cache limit.
    • No CNFS action needs to be taken for file systems that are not NFS-exported when they are unmounted using mmumount.
    • Fixed several problems (forced unmounts and GPFS crashes) which occurred after snapshot creation.
    • Fixed mmrepquota to not display quota information for non-existing filesets.
    • Fixed problem with file system running out of metadata space.
    • Added test for SLES10 SP2 for the new statfs interface.
    • Fixed a tscomm deadlock caused by serverSideRevoke code.
    • Fixed race conditions occurring after mmchmgr that involve the quota and snapshot code.
    • This update addresses the following APARs: IZ11602 IZ12367 IZ12391 IZ12422 IZ12423 IZ12426 IZ12428 IZ12430 IZ12438 IZ12439 IZ12440 IZ12441 IZ12443 IZ12446 IZ13095 IZ13395 IZ13417 IZ13420 IZ13438 IZ13439 IZ13440 IZ13441 IZ13681 IZ13740 IZ13742 IZ13744 IZ14009 IZ14542 IZ14940 IZ15433 IZ15434 IZ15435 IZ15436 IZ15438.
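
The socketKeepAlive item above relies on TCP keepalive probes to stop a firewall from dropping an idle connection. GPFS enables this itself through mmchconfig; the sketch below only illustrates the general operating system mechanism (the SO_KEEPALIVE socket option) on an ordinary socket. It is not GPFS code, and the function name enable_keepalive is a hypothetical helper chosen for this example.

```python
import socket


def enable_keepalive(sock: socket.socket) -> bool:
    """Turn on TCP keepalive probes for sock and report whether the option is set."""
    # SO_KEEPALIVE asks the kernel to probe an idle connection periodically.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("keepalive enabled:", enable_keepalive(s))
```

The probe interval is governed by operating system settings, so whether keepalive actually beats a given firewall timeout depends on how the host is tuned.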

    Problems fixed in GPFS 3.1.0.17 [December 20, 2007]

    • Fixed mmlsmount all to show intermediate error messages.
    • Fixed kernel panic, on Linux, from a trace/debug call that occurs during GPFS shutdown when the system is under extreme memory pressure.
    • Avoided fcntl retry deadlock caused by token left in COPYSET.
    • Fixed locking problem in deallocating data blocks.
    • Fixed an assert during fileset creation racing with recovery.
    • Prevented mmrestripefs and mmdefragfs hanging when they are running at the same time on multiple nodes and more than one node fails.
    • Avoid deadlock between blocked vfs operations, file system quiesce, and file system panic.
    • Fixed a rare case where the file system use-count is not decremented properly causing cleanup to hang.
    • Fixed a kernel assert in setQueue().
    • Prevented file create failure in unusual conditions involving snapshots.
    • Fixed an assert in multi-token server list.
    • Fixed an assert that may occur when changing file system name and modifying file system disk configuration.
    • Fixed a kernel assert on Linux following a remount.
    • Fixed an assert in quota manager cleanup when the file system is panicked.
    • Changed the check for server-side UID remapping script.
    • Fixed a client hang while waiting for fcntl locks when using different NFS servers.
    • Avoided a kernel panic when the GPFS daemon dies while Samba is using Share operations.
    • Performance improvement for the dmapi library on Linux. The library now caches the open file descriptor used to make kernel calls.
    • Avoid daemon crash during mount while restripefs is running after adding new disks.
    • Fix for "No space left on device" returned from getfacl for large ACLs.
    • Improved performance of active nodes needing quota services when other nodes are shutting down.
    • Avoided deadlock when many Samba threads are using GPFS Share operations.
    • Avoided modification of freed sleepElement storage.
    • Improved NFS write performance when exact mtime filesystem option is set (-E yes).
    • When looking up a mount point, return only the first matching entry.
    • Correct SIGSEGV for file systems created with a large blocksize (-B >= 2M) and a large number of nodes (-n > 2000).
    • Fixed an assert during token recovery after a series of file system panics.
    • Fixed assert that happens when setting STF_DESIGNATED_MNODE.
    • Fixed communications hang that happens in rare cases after a node restart.
    • Defined a GPFS generated error number in gpfs.h for gpfs_quotactl.
    • Fixed a problem where removing the last managed region on a dmapi managed file caused other dmapi attributes for that file to be lost.
    • Fixed setid/sticky bits not being set when chmod targets a file with an nfs4 ACL.
    • Fixed gpfs_get_share to close fd on error case.
    • Fixed NFS access to .snapshots (previously returning E_STALE)
    • Corrected problem mounting existing SANergy exportable filesystems; on 3.2 nodes should not show that it is not enabled.
    • Corrected a SIGSEGV problem with mmlsfileset -Y.
    • Fixed a spurious lookup failure (ENOENT error) on Linux during parallel mkdir/rmdir operations
    • On AIX with RVSD, do not start GPFS if RVSD is down and wait4RVSD is set to no.
    • Performance improvements to the inode scan.
    • Fixed problem so that GPFS can restore files which were backed up using older PTFs.
    • This update addresses the following APARs: IZ07295 IZ07514 IZ07542 IZ07543 IZ07544 IZ07545 IZ07549 IZ07751 IZ07943 IZ07944 IZ07945 IZ07946 IZ07948 IZ07949 IZ08629 IZ08842 IZ08843 IZ08844 IZ08845 IZ08846 IZ08847 IZ08848 IZ08849 IZ08851 IZ08856 IZ08859 IZ09508 IZ09596 IZ09620 IZ09693 IZ09694 IZ09695 IZ09696 IZ09916 IZ09925 IZ09974 IZ09977 IZ09978 IZ09991 IZ10229 IZ10815.

    Problems fixed in GPFS 3.1.0.16 [November 8, 2007]

    • Fixed problem adding disks between 1 TB and 2 TB to GPFS file systems on AIX.
    • Fixed a hang in snapshot ditto resolution.
    • In a mixed cluster with both big-endian and little-endian nodes (such as power and i386), the dmapi disposition setting was corrupted when the configuration manager moved to a node with a different endianness. This fix allows dmapi to work correctly in such clusters.
    • Improved file deletion performance under very heavy thread contention.
    • Fixed parsing of remote cluster contact names.
    • Prevented a rare assert caused by restriping actively modified files.
    • Fixed the handling of names with embedded blanks.
    • Fixed a problem where mmlsfileset could display a fileset's junction pathname as "--" when a directory in the path was recently modified.
    • Ensure mmimportfs records all disk attributes in the config file.
    • Remove temp backup file after a successful restore.
    • getNFS now acquires nfs_lock before clearing the vinfo pointer.
    • Fixed inability to write to GPFS via ppc64 Samba (libgpfs_gpl.so not found).
    • Fixed compiler warning in inode.c when compiling the GPL layer.
    • Prevented mount from hanging when run following snapshot commands.
    • Fixed EOPNOTSUPP error from gpfs_putacl into a "-k nfs4" filesystem on Linux.
    • Allowed mmchdisk and other commands to work on file systems in which all disks are suspended.
    • Fixed a rare problem where the GPFS daemon gets signal 11 when cleaning up after a node crash or unmount.
    • Fixed a problem where pool information was not always deleted when the last disk was deleted from the pool.
    • Improved performance of file tree traversal when most files have extended attributes like HSM migrated files.
    • Discovered the correct disk size for AIX hdisks over 2TB.
    • Fix utime error processing.
    • Improve metadata allocation on near-full filesystems.
    • Fixed a synchronization problem with file block fetch handlers.
    • Fixed a small timing window in which conflicting accesses to the same file from different remote clusters could cause the GPFS daemon on a manager node in the home cluster to fail.
    • Fixed problem where a manager node failure under a workload with high lock contention could sometimes cause the gpfs daemon on another node to fail and restart.
    • The command mmfileid should look at the actual current mount point, not the default one.
    • Make mmdumpkthreads execution asynchronous during an internal dump.
    • Fixed bad tracebacks on x86_64 nodes.
    • Avoid log recovery failures after a crash interrupts file deletion.
    • Fixed a problem in log record generation during file truncate.
    • Fixed race condition (and subsequent assert) that may occur during online mmfsck run with concurrent heavy workload.
    • Fixed assert during clmgr election.
    • Fixed deadlock in create or delete snapshot while mmapplypolicy is running.
    • Fixed assert during mnode token revoke.
    • Fixed segfault that may occur during restripe activity.
    • Prevented long clmgr takeover when using Node quorum with tiebreaker disks.
    • Fix a deadlock between node failure recovery and adjust token server.
    • Fixed a rare hang following file system internal unmount.
    • Fixed a bad pointer dereference in sg_unmount during file system manager takeover.
    • Fixed assert that can occur in certain situations when snapshots are in use and the file being snapshot is deleted.
    • Fixed Linux oops storing an ACL into a "-k nfs4" filesystem.
    • This update addresses the following APARs: IZ04445 IZ05334 IZ05335 IZ05337 IZ05338 IZ05695 IZ05867 IZ05869 IZ05882 IZ05893 IZ05980 IZ05982 IZ05984 IZ05986 IZ05988 IZ06115 IZ06304 IZ06366 IZ06624 IZ06625 IZ06626 IZ06627 IZ06628 IZ06629 IZ06631 IZ06632 IZ06633 IZ06635 IZ06636 IZ06637 IZ06638 IZ06640 IZ06642 IZ06643 IZ06644 IZ06645 IZ06698 IZ06722 IZ06724 IZ06725 IZ06726 IZ06729 IZ06771 IZ06774 IZ06775 IZ06776 IZ06777 IZ06778 IZ06803 IZ06805.

    Problems fixed in GPFS 3.1.0.15 [October 1, 2007]

    • Change mmedquota to require file system for changing fileset quotas.
    • Fix occasional assert related to mutex operations on some pLinux 2.6 kernels.
    • On Linux RHEL4, fix an unlikely timing condition that can cause a kernel panic with a spinlock operation.
    • On Linux, allow system stat command to reflect attributes change right after dm_set_fileattr() api call.
    • Fix code that caused an assert "exp(err == E_OK && ibdP != NULL) -llio.C" when running mmadddisk.
    • On ppc64 Linux (Samba), fix libgpfs_gpl.so not found problem.
    • Fix assert: tryCount <1000, file /llio.C, line 2439.
    • Fix daemon assert that occurs under certain conditions when many threads are simultaneously updating and truncating the same file, all using direct IO.
    • Fix a potential problem of ignoring a node list that was entered as part of an administrative command.
    • Fix memory leak in mmpmon utility.
    • Fix incorrect arithmetic calculations in policy SQL language.
    • Fix error where two nodes allocated disk space for the same block, resulting in assertion failure.
    • Remove long running mmlssnapshot -d from gpfs.snap.
    • Fix a deadlock with Paxos Challenge Thread and waitWithTimeout Thread.
    • Fix restore acl attributes in a mixed cluster environment.
    • Fix kernel panic in mmap pageout during shutdown in rare error cases if data replication is enabled.
    • Fix for a kernel assert on Linux following a remount.
    • Correct deadlock in exception handler.
    • Fix segfault (Linux) or assert (AIX) when an IO error is encountered for a data block when using directIO and the file has replication factor 1.
    • Fixed assert in file dSynch-plat.C ThMutex::getListMutex on x86 platforms.
    • Return valid reply to gpfs_fgetattrs call for the .snapshots directory.
    • gpfs_statfspool and gpfs_getpoolname now validate the storage pool ID and fail the call with E_INVAL if the pool ID is not valid.
    • Fix for segv in allocation.
    • Allow pool change when -R setting is greater than -M setting.
    • Improve mmedquota -j error messages.
    • Correct file name displayed in error message.
    • Fix problem where more than 32K blocks of a file are cached in the pagepool.
    • Fix memory leak in allocation manager shipping mode.
    • After mmmount fails on a dmapi-enabled file system, the file system manager node is now made aware of the failure so that it can remove the node from the mount list. Commands such as mmchfs can then proceed.
    • Fix problem where a node failure during a short window after restarting a manager node could in some cases lead to file system corruption.
    • Correct assert caused by an invalid openfile object.
    • Even if the daemon is down allow for posix locks cleanup to proceed.
    • Prevent "context" assertion failure when running fileset commands.
    • Stop respawning GPFS daemon when errors occur during initialization due to system configuration problems.
    • Fixed a rare case where an administrative operation on a different file system could cause a stripe group panic in an active file system.
    • Correct problem with token prefetch table if server and client are at different GPFS levels.
    • Fix for various failures in a scenario where snapshots are present and the filesystem is close to being full.
    • Fix a problem with default quota limits setting.
    • Fixed a problem where mmlsfileset could display a fileset's junction pathname as "--" when a directory in the path was recently modified.
    • Fix for on-line mmfsck running on a file system containing snapshots.
    • On AIX, fix problem where the 32nd supplemental gid (and the 31st on a 64-bit kernel) is not recognized.
    • This update addresses the following APARs: IZ02439 IZ02732 IZ02734 IZ02735 IZ02885 IZ02894 IZ03022 IZ03116 IZ03192 IZ03248 IZ03417 IZ03518 IZ03521 IZ03523 IZ03526 IZ03537 IZ03650 IZ03652 IZ03667 IZ03697 IZ03698 IZ03903 IZ03904 IZ03905 IZ03907 IZ03908 IZ03909 IZ03910 IZ03940 IZ03941 IZ03944 IZ03945 IZ03946 IZ03947 IZ03948 IZ03949 IZ03950 IZ03952 IZ03953 IZ04029 IZ04030 IZ04031 IZ04033 IZ04034 IZ04187 IZ04188 IZ04189 IZ04190 IZ04191 IZ04192 IZ04194 IZ04196 IZ04197 IZ04199 IZ04893.

    Problems fixed in GPFS 3.1.0.14 [August 20, 2007]

    • Fixed co-ordination between GPFS and Linux dcache when deleting inodes.
    • Fixed problem cleaning out deleted files from stat cache on Linux.
    • Fixed panic: cxiStartIO (sr4 lost during synchronous iodone processing).
    • Fixed a problem where in some rare cases a node recovery and simultaneous disk failure can result in a deadlock.
    • Fixed an error in the mmunlinkfileset command for very large directories.
    • Improved robustness of the mmfsadm command that is used by service for GPFS diagnostics.
    • Fixed mmcrsnapshot failure when the root directory contains a very large number of files.
    • Make gpfsperf buildable on PPC64 machines.
    • Fixed problem with using mmap to write out a page, where the file ends in the middle of the page. This only happens when data is replicated and blocksize is 16 or 64 KB.
    • Fix to snapshot copy on write for a block of inodes.
    • Fix problem where rolling upgrade from PTF 10 or earlier to PTF 11 or later could cause spurious quorum loss until all quorum nodes have been upgraded.
    • Fixed pathconf(name, _PC_LINK_MAX) to show 64K-1 links.
    • Correct assert for invalid inode number with Linux Snapshots.
    • Fix assert that occurs due to the size indicator of the last block in a file not being properly updated in certain situations where there is intense transaction load against that file.
    • Fix inode file expansion to fix up previously failed inode expansions.
    • Fixed assert exp(stack_addr != NULL) after thread create failure.
    • On Linux, fixed a problem where mtime was updated erroneously for special files.
    • Fixed SIGBUS error when accessing an mmap'ed file after remount on Linux.
    • Fixed small time window where restarting GPFS on one node while another node is joining the cluster could cause the first node to fail to re-join.
    • Forced LC_ALL=C when executing the ifconfig command.
    • Added configuration variable treatOSyncLikeODSync (default=yes) so that Linux O_DIRECT + O_SYNC performs better.
    • Ensured that only valid disk addresses are written to the log.
    • Fixed an error in the quota manager initialization code.
    • Corrected online quota check when a quota client node fails during the command processing.
    • Fixed mmfsck handling of the lost+found directory when filesets are used.
    • On Linux, fixed a kernel panic that can occur with UID remapping enabled.
    • Fixed problem where inodes were held indefinitely after large NFS stress load on Linux.
    • Fixed loop on open of "." in a deleted directory.
    • Avoided SGTableMutex deadlock during StripeGroup endUse.
    • This update addresses the following APARs: IZ01157 IZ01208 IZ01209 IZ01211 IZ01212 IZ01214 IZ01217 IZ01223 IZ01232 IZ01233 IZ01385 IZ01498 IZ01661 IZ01703 IZ01704 IZ01706 IZ01708 IZ01730 IZ01731 IZ01733 IZ01734 IZ01736 IZ01737 IZ01738 IZ01739 IZ01740 IZ01808 IZ01948 IZ01966 IZ01967 IZ01968 IZ01969 IZ02247 IZ02248 IZ02249 IZ02250 IZ02251 IZ02253 IZ02254 IZ02255 IZ02256 IZ02336.
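
The pathconf item in the list above can be checked from user space. The following is a minimal sketch, assuming a Unix-like system where Python's os.pathconf is available; the helper name link_limit and the constant GPFS_LINK_MAX are hypothetical names chosen for this example, and the path to query is whatever directory you want to test.

```python
import os
import tempfile

# 64K-1, the link count named in the fix note above.
GPFS_LINK_MAX = 64 * 1024 - 1


def link_limit(path):
    """Return the maximum link count reported for path, or None if unavailable."""
    try:
        value = os.pathconf(path, "PC_LINK_MAX")
    except (OSError, ValueError):
        # The limit is not supported here, or the name is unknown on this platform.
        return None
    # A negative result means the system reports no definite limit.
    return value if value >= 0 else None


if __name__ == "__main__":
    target = tempfile.gettempdir()  # replace with a path on the file system to check
    print(target, link_limit(target), "expected on a fixed GPFS level:", GPFS_LINK_MAX)
```

On file systems other than GPFS the reported value will generally differ; the comparison against 64K-1 only makes sense for a path inside a GPFS mount.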

    Problems fixed in GPFS 3.1.0.13 [June 28, 2007]

    • Fixed an incompatibility problem between PTF releases since 3.1.0.10 that causes Token Manager memory to fill up.
    • Adapt to new AIX 5.3.0.60 rules for setting PTHREAD_SCOPE_SYSTEM.
    • The mmchnsd command now rejects an nsd server that cannot access the nsd for which it was specified.
    • Fixed a rare race condition which could occur during cluster manager election.
    • Fixed assert during dmapi recovery when the configuration manager node is a 32 bit AIX node and the recovering node is 64-bit AIX.
    • Fixed a problem where the quota enablement information in the file system descriptor could become out of sync with the mount option.
    • Fix code to gracefully handle incompatible file system format.
    • Fixed erroneous mmsdrcli output from the mmchcluster command when the primary or backup server is being changed and the applicable server node is a Linux node.
    • Fixed a problem where, when using metadata replication, deleting a disk from a file system could sometimes cause the GPFS daemon to fail.
    • Fixed daemon crash with signal 11 which occurs after deleting a disk from a stripe group and then mounting the stripe group.
    • Fixed filesystem manager assignment looping in cases where some nodes fail to mount a filesystem with a format error.
    • Eliminated potential error in getting or setting the extended attributes on a file.
    • Fixed a problem in error handling with the gpfs_quotactl API.
    • Fixed problem which prevented an open, unlinked file from being memory mapped on AIX.
    • Fixed problem which prevented the use of a GPFS file system for the temporary work file required by mmfsck.
    • Fixed forced unmount deadlocks during heavy, multi-node fcntl locking.
    • Fixed problem where locking via 2 NFS servers was hanging on blocking locks.
    • Fixed potential problem in managing space allocated to a snapshot file.
    • Fixed potential deadlock in generating a dmapi close event.
    • Fixed daemon crash on some API calls with NULL arguments. The problem seems to show up with newer operating system ports to EMT64 and AMD64 hardware.
    • Suppress explicit node listing if mmchconfig -N all is specified.
    • Fixed assert in mmfsd when mmchconfig is used to change the pagepool to an overly large size (like 10000G).
    • Do not change mtime/ctime when opening existing file.
    • Show finer time resolution in mmfs.log timestamps.
    • Allow multiple IP address definitions for primary NSD server name to work in some cases.
    • Fixed deadlock which might occur during certain recovery situations.
    • Fixed slow exit when many processes running an executable file from GPFS exit at nearly the same time.
    • Added -Y option to the mmlsfileset command which enables output to be displayed in colon delimited format.
    • Fixed assert when mounting a filesystem due to an incorrect operation in mmfsck. This problem occurs only when using storage pools and when a specific type of on-disk damage occurs.
    • Fixed assert on create or assume of session after dmapi recovery.
    • Fixed the display of the usage message for the mmumount command.
    • Fixed the gpfs_quotactl command to be consistent with the mmedquota command.
    • Improved handling of dmapi unmount events.
    • AIX xmalloc debug: avoid reference to potentially freed kernel storage in a trace.
    • Fixed logassert in RecLockReset to allow ESTALE from kxCommonReclock.
    • Fixed a problem which could happen with very low probability when deleting or renaming a filesystem while node recovery is taking place.
    • Fixed EFAULT errors from gpfs_getacl during processing the aclget command or aclx_get operation.
    • A change was made to the gpfs.snap -w option so that it no longer fails with a usage message.
    • Fixed invalid argument returned on Linux when cp -p command is issued and the GPFS filesystem option -k is set to nfs4.
    • Fixed possible kernel panic on Linux during multi-node fcntl locking.
    • Fixed deadlock during mmdelsnapshot with heavy multi-node fcntl locking.
    • Fixed spurious EACCES from non-blocking fcntl lock during heavy lock contention between nodes.
    • Fixed TM Monitor Storage level thread hang.
    • A more useful error message is issued when mmcrvsd is passed a disk descriptor with a primary server not known to VSD.
    • Fixed hang with too many kthreads making GPFS requests.
    • Fixed a problem where running mmfsck on a file system created with GPFS 1.1 for Linux could leave file system metadata in an inconsistent state.
    • Fixed potential deadlock in directory lookups on Linux.
    • The mmcrvsd command was changed to allow the use of a VSD server interface other than the interface name GPFS is using for the node.
    • Fixed E_OPNOTSUPP error being returned from mmgetacl (or gpfs_getacl) for "-k nfs4" filesystems.
    • Fixed problem where the command mmdelnsd can result in daemon sig11 if trace is on.
    • Fixed problem where Linux nodes SEGV during the mmadddisk command that fails during devOpen.
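
One item in the list above states that opening an existing file should not change its mtime or ctime. A rough way to observe that behavior from user space is sketched below. This is not a GPFS test tool; the helper name open_preserves_times is hypothetical, and the check only covers a plain read-only open.

```python
import os
import tempfile


def open_preserves_times(path):
    """Return True if a read-only open leaves the file's mtime and ctime unchanged."""
    before = os.stat(path)
    with open(path, "rb"):
        pass  # open and close without reading or writing
    after = os.stat(path)
    return (before.st_mtime_ns, before.st_ctime_ns) == (after.st_mtime_ns, after.st_ctime_ns)


if __name__ == "__main__":
    # Create a scratch file; point the directory at the file system you want to check.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"sample")
        scratch = f.name
    try:
        print("timestamps preserved:", open_preserves_times(scratch))
    finally:
        os.remove(scratch)
```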

    Problems fixed in GPFS 3.1.0.12 [May 17, 2007]

    • Fixed token cleanup code on the token server during a node failure recovery.
    • Corrected recommendations when disks are too small for mmcrfs.
    • Fixed block allocation that could have caused an exception when a file system is nearly full.
    • Improved access to data in filesets during a restricted mount.
    • Fixed deadlock involving multi-node advisory locking and an exclusive inode lock.
    • Fixed assert on structure error during repair or restripe.
    • The mount command provides additional warning messages about down disks under very specific circumstances.
    • Fixed assert ("oldDiskAddrP == NULL || oldDiskAddrFound.compAddr(*oldDiskAddrP)") sometimes encountered when multiple nodes update holes in a file.
    • Fixed token server code to handle a timing problem in the token recovery path.
    • Fixed filesystem hang under heavy network load when using encrypted connections with asynchronous socket notification disabled (usually this is the case only on AIX systems).
    • Fixed the token migration code which caused a node to assert during node recovery.
    • Added an additional check to verify the correctness of the tslsdisk command output.
    • Fixed SIGSEGV error when dereferencing a NULL mountpointP pointer in getEFOptions().
    • Fixed trace statements in dmapireaddir error cases.
    • Fixed group membership loss on cluster manager node when a quorum node with a lower IP address reboots and starts GPFS within 70 seconds of the boot time.
    • Fixed assert which occurred during NSD server shutdown.
    • Eliminated a crash when mmlinkfileset creates a junction in a very large directory.
    • Added the mmchfileset command man page to the set of AIX man pages.
    • Fixed a problem where changes to a file's data replication or storage pool assignment were not correctly recorded if the file contained no data.
    • Improved performance of setting Extended Attributes in parallel applications.
    • Fixed abnormal shutdown after disk failure.
    • A small random write workload could cause GPFS to fail after a snapshot is created. This is fixed.
    • Fixed the chmod command on Linux where it sets mode but returns permission denied.
    • Fixed rare race condition between recovery and end use.
    • The mmgetacl command can now handle file names with embedded blanks.
    • The default values for the mmcrfs -M and -R parameters are now 2.
    • Fuzzy sequential access patterns are now recognized in non-NFS file access.
    • Removed extraneous 'unavailable disks' warning messages which were being reported by some commands.
    • Fixed problem with bad TickTimeClassMutex traces.
    • Fixed mmfsck assert due to corrupted indirect block addresses.
    • Fixed deadlock between configuration manager and dm message handlers.
    • Fixed hang in the kernel when running multiple tsreaddir commands.
    • Fixed the problem where the chgrp command was wrongly denied permission on a file with an NFSv4 ACL.
    • If a cluster does not own a file system, assume the file system cannot be mounted.
    • Fixed quota accounting when using snapshots.

    Problems fixed in GPFS 3.1.0.11 [April 12, 2007]

    • The mmmount functionality is corrected to remount remote file systems.
    • Fixed a race condition between quorum loss and rejoin.
    • Fixed byte range token cleanup and recovery code during failure recovery of a failed node.
    • Fixed the problem where the device number is not set when using mknod to create a GPFS object on a 64bit AIX kernel.
    • Avoided collecting logs and waiters from nodes that are not specified with the -N option of gpfs.snap.
    • Fixed shared segment calculations to help avoid overcommitting the space.
    • Add man pages for AIX: mmapplypolicy, mmlspolicy, mmlsfileset, mmlinkfileset, mmunlinkfileset, mmdelfileset, mmcrfileset, mmchpolicy, mmmount, mmumount.
    • In the function dm_get_dmattr, do not return E_BADF if the file is a quota file. Treat quota files as regular files so that the return code is consistent between the two types of files.
    • Fixed DBGASSERT in SGAllocMap::acquireRegionOwnership following SGPanic.
    • Distinguish between delete and create snapshot when reporting running programs.
    • Fix AIX stat to correctly display device type.
    • When a restore operation is incomplete, refuse to run fileset commands.
    • Added range checking for the value of maxblocksize.
    • In the function SFSGetEvent, check if sgP is NULL and if so then use the global clusterConfiguration.
    • Fixed a problem on Linux kernels >= 2.6.00 where mounting a GPFS file system that was already mounted resulted in the inadvertent removal of IDR 0, which may be allocated to that GPFS file system or to another mounted file system.
    • Fixed kdump for SLES 10.
    • Skip the fileset quota limit check if the user is root.
    • Improved performance on HSM.
    • The mmlsquota command now works for users with more than 64 groups on AIX 5.3 systems.
    • Fixed segfault in mmfsd during NSD local access rediscovery.
    • Fixed cfgmgr failover while another node join is pending.
    • Improve performance of HSM tree scans when lots of files have been migrated.
    • Fixed race in allocation ownership protocol that could cause problems during file system mounts.
    • Fixed deadlock that can occur when updating a file that used gpfs_prealloc.
    • Fixed online checkquota to not assert if the user command is killed early.
    • The -Y option on the mmdf, mmlsmount, and mmlsconfig commands produces colon-separated output that another program or shell script can parse and analyze.
    • Provide a means to retrieve selected configuration information.
    • Fixed DSI from aclx_geti() with NULL buffer pointer on AIX 5.3.
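
The -Y item in the list above describes colon-separated output meant for scripts. The following is a minimal parsing sketch. The field layout is an assumption: this readme does not describe the actual -Y record format, so the SAMPLE text and its field names are invented for illustration, and the function name parse_colon_output is hypothetical. Check real command output before relying on any field position.

```python
def parse_colon_output(text):
    """Parse colon-delimited text whose first non-empty line names the fields.

    Returns one dict per data line. Fields that themselves contain a colon
    are not handled by this sketch.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split(":")
    return [dict(zip(header, ln.split(":"))) for ln in lines[1:]]


# Hypothetical sample; NOT the real layout of mmdf -Y output.
SAMPLE = "name:sizeKB:freeKB\ndisk1:1000:250\ndisk2:2000:1500\n"

if __name__ == "__main__":
    for row in parse_colon_output(SAMPLE):
        print(row["name"], int(row["freeKB"]) / int(row["sizeKB"]))
```

Keying rows by the header names rather than by position keeps a script working if fields are appended later, which is the usual reason to prefer a delimited format over the human-readable one.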

Document information

More support for: General Parallel File System
Reference #: 00000147
Modified date: 2010-07-29
