Readme and Release notes for release 3.3.0.27 General Parallel File System 3.3.0.27 GPFS-3.3.0.27-power-AIX Readme

Fix Readme

Content

Readme file for: GPFS Readme header
Product/Component Release: 3.3.0.27
Update Name: GPFS-3.3.0.27-power-AIX
Fix ID: GPFS-3.3.0.27-power-AIX
Publication Date: 12 November 2012
Last modified date: 12 November 2012

Download location
Prerequisites and co-requisites
Known issues

Installation information

Additional information

Download location

Below is a list of components, platforms, and file names that apply to this Readme file.

Fix Download for AIX

Product/Component Name:	Platform:	Fix:
General Parallel File System	AIX 5.3 AIX 6.1	GPFS-3.3.0.27-power-AIX

Prerequisites and co-requisites

None

Known issues

Problem discovered in earlier GPFS releases
During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

The issue has been discovered during internal testing, where an MPI-IO application was employed to generate a synthetic workload. IBM is not aware of any occurrences of this issue in customer environments or under any other circumstances. Since the issue is specific to accessing cached data, it does not affect applications using DirectIO (the IO mechanism that bypasses file system cache, used primarily by databases, such as DB2® or Oracle).

This issue is limited to the following conditions:
1. The workload consists of a mixture of writes and reads, to file offsets that do not fall on the GPFS file system block boundaries;
2. The IO pattern is a mixture of sequential and random accesses to the same set of blocks, with the random accesses occurring on offsets not aligned on the file system block boundaries; and
3. The active set of data blocks is small enough to fit entirely in the GPFS pagepool.
The issue is caused by a race between an application IO thread doing a read from a partially filled block (such a block may be created by an earlier write to an odd offset within the block), and a GPFS prefetch thread trying to convert the same block into a fully filled one, by reading in the missing data, in anticipation of a future full-block read. Due to insufficient synchronization between the two threads, the application reader thread may read data that had been partially overwritten with the content found at a different offset within the same block. The issue is transient in nature: the next read from the same location will return correct data. The issue is limited to a single node; other nodes reading from the same file would be unaffected.
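
As an illustration of conditions 1 and 2 above, the following shell sketch classifies a byte offset as aligned or unaligned relative to a file system block size. The function name is_block_aligned and the example block size of 262144 bytes are assumptions made for this illustration only; they are not part of GPFS, and the real block size must be taken from the file system configuration.

```shell
# Hypothetical helper, not a GPFS command.
# is_block_aligned OFFSET BLOCK_SIZE
# Prints "aligned" if OFFSET (bytes) is a multiple of BLOCK_SIZE (bytes),
# otherwise prints "unaligned".
is_block_aligned() {
    offset=$1
    block_size=$2
    if [ $(( offset % block_size )) -eq 0 ]; then
        echo aligned
    else
        echo unaligned
    fi
}

# Example calls, assuming a 256 KiB (262144-byte) block size:
#   is_block_aligned 524288 262144
#   is_block_aligned 1000 262144
```

Only accesses of the "unaligned" kind match the offsets described in conditions 1 and 2; the third condition (working set fits in the pagepool) is not modelled here.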

Installation information

After you have downloaded a GPFS for AIX update package into any directory on your system, use the following section to install the fix package.

Installing a GPFS update for AIX
Complete these steps to install the fix package:
1. Unzip and extract the BFF image(s) from the *.tar.gz file:
  
  gzip -d -c <filename>.tar.gz | tar -xvf -
2. Verify the update's BFF image(s) in the directory.
  
  Normally, the BFF images in the directory would be similar to the following:
  Unnnnnn.gpfs.base.bff
  Unnnnnn.gpfs.msg.en_US.bff
  Unnnnnn.gpfs.docs.data.bff
  
  where nnnnnn represents the six (6) digits of the PTF number for the BFF image.
  
  For specific filenames, check the Readme for the GPFS update by clicking the "View" link for the update on the Download tab.
3. Follow the installation and migration instructions in your GPFS Concepts, Planning and Installation Guide.
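
Steps 1 and 2 above can be sketched as two small shell functions. This is a minimal sketch, not an IBM-supplied script: the function names extract_update and list_bff_images are hypothetical, and the file name pattern simply encodes the Unnnnnn.gpfs.<component>.bff naming described in step 2.

```shell
# Hypothetical helpers for steps 1 and 2; the names are not part of GPFS.

# extract_update ARCHIVE DEST
# Unpacks a gzip-compressed tar archive into DEST using the same
# gzip | tar pipeline shown in step 1.
extract_update() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # The input redirection is opened before the subshell changes
    # directory, so a relative ARCHIVE path still resolves correctly.
    ( cd "$dest" && gzip -d -c | tar -xf - ) < "$archive"
}

# list_bff_images DIR
# Prints the file names in DIR that look like Unnnnnn.gpfs.<component>.bff,
# where nnnnnn is the six-digit PTF number.
list_bff_images() {
    ls "$1" | grep -E '^U[0-9]{6}\.gpfs\..+\.bff$'
}
```

A possible use, with an assumed working directory, is `extract_update GPFS-3.3.0.27-power-AIX.tar.gz /tmp/gpfs_update` followed by `list_bff_images /tmp/gpfs_update`; the exact archive name should be taken from the download page.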

Upgrading GPFS nodes
The node-by-node upgrade described below cannot be used to migrate from GPFS 2.3 to later releases. For example, upgrading from 2.3.x to 3.1.y requires a complete cluster shutdown, an upgrade install on all nodes, and then a cluster startup.

Upgrading GPFS may be accomplished either by upgrading one node in the cluster at a time or by upgrading all nodes in the cluster at once. When upgrading GPFS one node at a time, the steps below are performed on each node in the cluster in sequence. When upgrading the entire cluster at once, GPFS must be shut down on all nodes in the cluster prior to upgrading.

When upgrading nodes one at a time, you may need to plan the order in which nodes are upgraded. Verify that stopping a given node will not cause quorum to be lost and that the node is not the last available NSD server for some disks. Upgrade the quorum and manager nodes first. When upgrading the quorum nodes, upgrade the cluster manager last to avoid unnecessary cluster failover and election of new cluster managers.
1. Prior to upgrading GPFS on a node, all applications that depend on GPFS (e.g. Oracle) must be stopped. Any GPFS file systems that are NFS exported must be unexported prior to unmounting GPFS file systems. If tracing was turned on, then tracing must be turned off before shutting down GPFS as well.
2. Stop GPFS on the node. Verify that the GPFS daemon has terminated and that the kernel extensions have been unloaded (mmfsenv -u). If mmfsenv -u reports that it cannot unload the kernel extensions because they are "busy", the install can proceed, but the node must be rebooted after the install. "Busy" means that some process has its current directory in a GPFS file system or holds an open file descriptor there. The freeware program lsof can identify the process, which can then be killed. Then retry mmfsenv -u; if it succeeds, a reboot of the node can be avoided.
3. Upgrade GPFS using the installp command or via SMIT on the node.
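
When planning the upgrade order, the quorum check mentioned above can be approximated with simple arithmetic. The sketch below assumes plain node quorum, taken here to mean that a strict majority of the defined quorum nodes must stay up, and it ignores tiebreaker disks entirely. The function name quorum_survives is hypothetical; confirm the quorum rules for your configuration in the GPFS Concepts, Planning and Installation Guide.

```shell
# Hypothetical planning helper, not a GPFS command.
# Assumption: quorum holds while a strict majority of quorum nodes is up
# (no tiebreaker disks).
# quorum_survives TOTAL_QUORUM_NODES NODES_DOWN
# Prints "yes" if quorum would still hold, otherwise "no".
quorum_survives() {
    total=$1
    down=$2
    remaining=$(( total - down ))
    needed=$(( total / 2 + 1 ))
    if [ "$remaining" -ge "$needed" ]; then
        echo yes
    else
        echo no
    fi
}

# Example: with 3 quorum nodes, stopping one node for the upgrade
#   quorum_survives 3 1
```

Under this assumption, a three-node quorum tolerates one node being down for the upgrade, but not two at the same time.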

Additional information

Notices

[January 24, 2011]

A fix introduced in GPFS 3.3.0-11 and in GPFS 3.4.0-3 changed the returned buffer size for file attributes to include additional available information, affecting the TSM incremental backup process due to the selection criteria used by TSM. As a result of this buffer size change, TSM incremental backup will treat all previously backed up files as modified, causing the dsmc incremental backup process to initiate new backups of all previously backed up files. If the file system being backed up is HSM managed, this new backup can result in recall of all files which have been previously backed up. This effect is limited to files backed up using TSM incremental backup; there are no known effects on files backed up using either GPFS mmbackup or the TSM selective backup process.

This issue is resolved in GPFS 3.3.0-12 (APAR IZ92779) and GPFS 3.4.0-4 (APAR IZ90535). Customers using the TSM Backup/Archive client to do incremental backup (via the dsmc incremental command) should not apply GPFS 3.3.0-11 or GPFS 3.4.0-3, but should wait to apply GPFS 3.3.0-12 or GPFS 3.4.0-4. Any customer using TSM incremental backup and needing fixes in GPFS 3.3.0-11 or 3.4.0-3 should apply an ifix containing the corresponding APAR before executing dsmc incremental backup using these PTF levels, to avoid the additional file backup overhead and (in the case of HSM-managed file systems) the potential for large scale recalls caused by the backup. Please contact IBM service to obtain the ifix, or to discuss your individual situation.

[June 9, 2010]

A build error caused an issue with the DMAPI function in the GPFS 3.2.1-20 package that was released on May 27, 2010. The corresponding packages have now been replaced on the service download site.

If you installed the May 27 GPFS 3.2.1-20 package and mounted a DMAPI-enabled file system while running GPFS 3.2.1-20 (for example, -z yes by means of the HSM features of TSM), please contact IBM Support. The replacement 3.2.1-20 package works as designed, but does not fix a file system that was mounted with the problematic 3.2.1-20 package.

Verify that you have the correct package installed by running the rpm -qi gpfs.base command. Make sure that the Build Date is Mon 07 Jun 2010.

[June 2, 2010]

A build error caused an issue with the DMAPI function in the GPFS 3.3.0-6 package that was released on May 22, 2010. The corresponding packages have now been replaced on the service download site.

If you installed the May 22 GPFS 3.3.0-6 package and mounted a DMAPI-enabled file system while running GPFS 3.3.0-6 (for example, -z yes by means of the HSM features of TSM), please contact IBM Support. The replacement 3.3.0-6 package works as designed, but does not fix a file system that was mounted with the problematic 3.3.0-6 package.

Verify that you have the correct package installed by running the rpm -qi gpfs.base command. Make sure that the Build Date is Thu 27 May 2010.
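
The Build Date check in the June 2010 notices can be scripted along the following lines. This is a sketch under stated assumptions: the function name build_date_matches is hypothetical, and it assumes that the rpm -qi output contains a field labelled "Build Date" followed by a colon, with the date at the start of the value.

```shell
# Hypothetical helper, not part of GPFS or rpm.
# build_date_matches EXPECTED
# Reads "rpm -qi" output on standard input and prints "match" if the
# Build Date value starts with EXPECTED, otherwise "mismatch".
build_date_matches() {
    expected=$1
    # Keep only the text after the "Build Date" label (first occurrence).
    actual=$(sed -n 's/^.*Build Date *: *//p' | head -n 1)
    case $actual in
        "$expected"*) echo match ;;
        *) echo mismatch ;;
    esac
}

# Example for the replacement 3.2.1-20 package:
#   rpm -qi gpfs.base | build_date_matches "Mon 07 Jun 2010"
```

For the replacement 3.3.0-6 package, the expected value would be "Thu 27 May 2010", as stated in the notice above.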

[April 1, 2010]

During internal testing, a rare but potentially serious problem has been discovered in GPFS. Under certain conditions, a read from a cached block in the GPFS pagepool may return incorrect data which is not detected by GPFS. The issue is corrected in GPFS 3.3.0.5 (APAR IZ70396) and GPFS 3.2.1.19 (APAR IZ72671). All prior versions of GPFS are affected.

See the "Known issues" section above for details.

[December 17, 2009]

Support for GPFS 3.1 has only been extended for AIX and Linux on POWER systems. Service updates will be made available for other Linux platforms, but support is not being extended.

[November 9, 2009]

GPFS 3.3.0-1 does not correctly operate with file systems created with GPFS V2.2 (or older). Such file systems can be identified by running "mmlsfs all -u": if "no" is shown for any file system, this file system uses the old format, and the use of GPFS 3.3.0-1 is not possible. GPFS 3.3.0-2 corrects this issue.

[November 7, 2008]

GPFS 3.2.1.7 contained a change that affects the TSM HSM recall process for files with a stub size greater than 0, causing hangs during recalls. To avoid this problem, set the configuration parameter dmapiDataEventRetry to 'no' with the command 'mmchconfig dmapiDataEventRetry=no -i'.

[September 11, 2008]

The 3.2.1-5 maintenance level had a data integrity problem using the mmap feature to write or update files on Linux and AIX. The 3.2.1-6 maintenance level is the recommended upgrade path from versions 3.2.0-0 through 3.2.1-4.

Package information

The update images listed below and contained in the tar image with this README are maintenance packages for GPFS. The update images can be directly applied to your system.

The update images require a prior level of GPFS. Thus, the usefulness of this update is limited to installations that already have the GPFS product. Contact your IBM representative if you desire to purchase a fully installable product that does not require a prior level of GPFS.

After all BFFs are installed, you have successfully updated your GPFS product.

Update to Version:

3.3.0.27

Update from Version:

3.3.0.0 through 3.3.0.26
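
The "update from" range can be checked before installing with a small shell function. This is a minimal sketch: the function name can_update_from is hypothetical, and the installed level is assumed to be supplied as a four-part string such as 3.3.0.26 (on AIX, a fileset listing tool such as lslpp can report it).

```shell
# Hypothetical helper, not a GPFS command.
# can_update_from INSTALLED_LEVEL
# Prints "yes" if INSTALLED_LEVEL lies in 3.3.0.0 through 3.3.0.26,
# otherwise prints "no".
can_update_from() {
    installed=$1
    case $installed in
        3.3.0.*) level=${installed#3.3.0.} ;;
        *) echo no; return 0 ;;
    esac
    # The last component must be a non-empty string of digits.
    case $level in
        ''|*[!0-9]*) echo no; return 0 ;;
    esac
    if [ "$level" -le 26 ]; then
        echo yes
    else
        echo no
    fi
}

# Example:
#   can_update_from 3.3.0.26
```

A level outside this range, such as a 3.2.x installation, does not qualify for this update package.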

Update (tar file) contents:

README
changelog
U829687.gpfs.docs.data.bff
U856058.gpfs.base.bff
U849244.gpfs.msg.en_US.bff
U842345.gpfs.gui.bff

Changelog for GPFS 3.3.x
Unless specifically noted otherwise, this history of problems fixed for GPFS 3.3.x applies for all supported platforms.

Problems fixed in GPFS 3.3.0.27 [November 9, 2012]

Add synchronization between filesystem manager resign and some ACL related operations. This is needed to prevent a possible GPFS daemon assert while running mmchmgr command.

Fix range revoke handler to better handle error conditions such as IO error. Instead of causing GPFS daemon assert, just panic the filesystem.

Fixed code that can cause a GPFS daemon assert when multiple threads try to write to the same file after it has been truncated to size 0.

Force log writes for synchronous NFS unlink operations.

Fixed a problem in readpage/splice_read where EFAULT was returned instead of ETIMEDOUT when accessing an HSM-migrated file from an NFS client.

Fix duplicate message sometimes being received following an automatic TCP socket reconnect.

Fixed a bug when setting filesize with truncate file operation.

This update addresses the following APARs: IV30007 IV30469 IV30611 IV30739.

Problems fixed in GPFS 3.3.0.26 [October 9, 2012]

mmbackup will check whether the session between the remote TSM client node and the TSM server is healthy, and will remove that combination from the transaction if an unhealthy session is detected.

Added logic to reduce the chance of failure for "mmfsadm dump cfgmgr".

mmbackup will correctly filter file names that contain a newline.

Fix the code that can cause a GPFS daemon assert when multiple threads working on the same file hit a race condition.

Fixed race condition between FakeSync and RemoveOpenFile.

FSErrBadAclRef reported when lockGetattr called RetrieveAcl with a zero aclRef.

Deadlock resulting from out-of-order aclFile/buffer locking.

Fix hung AIX IO when the disk transfer size is smaller than the GPFS blocksize.

Fix an issue in a mixed version cluster, where a node running GPFS 3.4 or older that failed in a small window during mount could cause spurious log recovery errors.

Fix CNFS to recognize GPFS filesystem in RHEL6.3.

Fix segfault in dm_getall_disp() functions.

This fix applies to any customer still running a 3.3 release, but is particularly important for customers in Asia and India who are more likely to use 3-byte or 4-byte UTF-8 characters within their clusters.

This update addresses the following APARs: IV25485 IV27423 IV27425 IV27426 IV27427 IV27564 IV27969.

Problems fixed in GPFS 3.3.0.25 [August 27, 2012]

Fix snapshot creation code to prevent a possible GPFS daemon assert when the filesystem is very low on disk space.

Added fix for 838067: restripe did not properly handle errors returned by copyReplicas, which could cause data corruption.

Fixed assertion when generating read or destroy events.

Fix a kernel panic caused by a race between two NFS reads.

Fix restripe code that could cause filesystem corruption. The problem only affects filesystems that were created without FASTEA enabled but were later upgraded to enable FASTEA via mmmigratefs with the --fastea option.

This fix only applies to customers running GPFS on Linux/PowerPC, using WEIGHT clauses in their policy rules.

Fix mmdeldisk to ignore special files that do not have data in a pool

Fixed a problem where 'mmchmgr -c' fails on a cluster configured with a tiebreaker disk, resulting in quorum loss

gpfs_i_unlink failed to release d_lock causing d_prune_aliases crash

This update addresses the following APARs: IV24241 IV24384 IV25187 IV25441.

Problems fixed in GPFS 3.3.0.24 [July 18, 2012]

When a tiebreaker disk is being used, avoid quorum loss under heavy load when the tiebreaker disk is down but all quorum nodes are still up.

Fix the file close code to prevent a daemon assert which can occur on AIX with a DMAPI-enabled filesystem.

Fix an infinite wait during delsnapshot.

Fix an FSErrSnapInodeModified error caused by copying quota files to snapshots.

When a tiebreaker disk is used, prevent situations where more than one cluster configuration manager is present simultaneously in the same cluster.

Fix a problem that would cause mmadddisk failure.

Fix an assertion caused by leftover "isBeingRestriped" bit after a failed restripe operation.

Update mmrpldisk to issue a warning instead of an error when it cannot invalidate disk contents because the disk is in the down state.

Avoid deadlock creating files under extreme stress conditions.

Fix code to ensure the E_ISDIR error is returned when the FWRITE flag is used to open a directory.

Fix problems with using mmapped files after a filesystem has been force unmounted by a panic or cluster membership loss.

Fix issue in multi-cluster environment, where nodes in different remote clusters updating the same set of files could cause deadlock under high load.

Fix a bug that causes slowness during mmautoload/mmstartup on systems with automount file system. The performance hit is noticeable on large clusters.

This update addresses the following APARs: IV21133 IV21749 IV21755 IV22016 IV23288 IV23289.

Problems fixed in GPFS 3.3.0.23 [June 5, 2012]

mmbackup will filter the "ANS1361E Session Rejected: The specified node name is currently locked" error and will exit with an error.

mmbackup will filter file names that contain characters unsupported by TSM.

gpfsOpen asserts when getInodeStatus==INODE_DELETED but nLink!=0.

When a backup partially fails, mmbackup continues to compensate the shadow file even if multiple failures are reported for the same file in the auditlog file.

Fixed a bug in log recovery which could result in a CmpMismatch file system corruption problem.

Fix for the iP->i_count == 0 kernel assert in super.c. This problem only affects Linux 2.6.36 and later.

Fix a rare deadlock where a kernel process gets blocked waiting for a free mailbox to send to the GPFS daemon.

Fix a memory allocation problem when online mmfsck runs on a node with a heavy mmap workload.

Fix mmapplypolicy to avoid listing skipped files count twice.

mmbackup will not stop processing even though there's no auditlog file if only expiration processing is done.

Fixes problem where the 'expelnode' callback indicates that the chosen node had joined the cluster first.

Fix a problem with nBytesNonStealable accounting.

Fixed message handler for filesystem quiesce which caused a GPFS assert when filesystem manager failed while filesystem is being quiesced.

Fix printing of long fileset names in mmrepquota and mmlsquota commands.

Fix mmap operations to go through the NSD server when direct access to disks is no longer possible.

Fix mmsetquota to handle numerical fileset names.

mmbackup can back up files and directories with path names as long as GPFS and TSM support.

Fix an error message in mmchattr command with -M/R/m/r option.

Fix a problem where restripe fell into an infinite loop when the sg panicked on the busy node.

mmbackup will display a backup/expiration progress message at the interval specified by the MMBACKUP_PROGRESS_INTERVAL environment variable, if it is set. Otherwise, mmbackup will display the progress message every 30 minutes.

Fixed rare assert when deleting files in a fileset.

Fixed rare hang problem during sg or token recovery.

Fix deadlock when doing inode scan (mmapplypolicy/mmbackup) in small pagepool.

getxattr for ACLs may overwrite the kernel buffer if small buffer sizes (less than 8 bytes) are specified.

When the mmbackup shadow file is rebuilt with the --rebuild or -q option, mmbackup will get CTIME information from the TSM server, so files modified after the previous backup but before the shadow file is rebuilt will be backed up by the subsequent incremental backup.

Prevent disks from being marked as 'down' when a node with the configuration option unmountOnDiskFail=yes receives an I/O error or loses connectivity to a disk.

When mmbackup can't back up files, the message is more informative.

When a backup fails (partially or entirely) due to an error from the TSM client, mmbackup will display the error message from the TSM client for easy problem detection. mmbackup will display the message only once for a given error, even if the error occurs multiple times.

When compensating the shadow file takes a long time because a backup partially failed, mmbackup will show a progress message.

mmbackup Device -t incremental and --rebuild is valid syntax and will work properly.

Fix the problem that deldisk returned success even though it failed.

handleRevokeM loops when a lease callback is not responded to.

Fix assertion failure when multiple threads use direct I/O to write to the same block of a file that has data replication enabled.

This update addresses the following APARs: IV19039 IV20611 IV20614 IV20616 IV20628 IV20632 IV20895.

Problems fixed in GPFS 3.3.0.22 [April 23, 2012]

Fixed mmap deadlock due to MML_FLUSH.

Fix multi-cluster assert when using a common directory.

When current backup has 3.2 format, mmbackup -t full will expire 3.2 format files from TSM server.

When existing backup is 3.2 format, incremental mmbackup will keep the format.

When part of a backup fails during the mmbackup process, mmbackup will show a more explanatory error message.

Fix cluster partition problem when a single-node cluster configured with a tiebreaker disk is expanded to multiple quorum nodes, and a network outage causes the cluster manager to lose contact with the other nodes.

When mmbackup --rebuild exits with a syntax/usage error, the error message will be more accurate.

Fix the routine that determines whether a file system has fastea enabled.

Take whether a node is an NSD server into account as a new criterion when deciding which node to expel when two nodes cannot communicate. Also add a new callback script to allow changing the original decision on which node to expel.

Fix a deadlock in alloc protocol.

Fixed mmfsck handling of duplicate address corruptions.

Fixed inode inconsistency struct error problem due to lastBlockSubblocks not being set correctly after truncate.

Fix to daemon failure ("sgiP->sgMgr == NONE_APPOINTED") after node loses and then regains quorum.

All files backed up will be in shadow file correctly even though there are unlinked filesets.

Fix a race window that causes msync to return the undocumented error EAGAIN on AIX.

After NSD server nodes change, fix ReadReplicaPolicy choice of a localNSD server.

Fix an issue that could cause log recovery to fail if a node fails shortly after time-of-day on the node had been set backwards by more than a few seconds.

Fix a problem that can lead to SAMBA create/rename permission denied.

Fixed a case where 3.2 filesystem is not resetting hasXAttr flag at file destroy time.

Fix a deadlock between the last local restripe thread and remote helpers.

mmbackup -t incremental will keep 3.2 backup format if existing backup was done with GPFS 3.2 or earlier version of mmbackup.

No longer provides an improper warning message after an mmchdisk that doesn't actually change anything.

Files created from a Linux node via an AIX/NFSv4 server have bad mtimes.

Exception in the AIX vattr_to_fattr3 routine.

mmbackup will not replace shadow file by .mmbackupShadow.*.old inadvertently.

"Incorrect ACL entry" from mmputacl -i due to overlapping memcpy.

mmbackup will check detail error code and preserve .mmbackupCfg data for debug.

Fix code that can leave filesystem in quiesced state on some nodes after filesystem manager node fails in the middle of a snapshot command.

When existing backup is 3.2 format, mmbackup with multiple TSM server will exit with error.

Fix calculation of the number of file systems managed by each node, which is used in deciding which node to expel.

Fix a rare quota manager deadlock that occurred when it needed to expand the last fragment of a quota file.

mmbackup without -S will not treat a fileset that is deleted but remains in unlinked status in a snapshot as an unlinked fileset.

When mmbackup fails before actual backup processing starts, it will exit with 2.

mmbackup will back up regular files or directories that happen to contain "mmbackup" in their path.

CNFS: get the correct vlan interface name in SLES.

mmbackup -S will skip files that match the exclude criteria.

mmbackup -t full on 3.2 backup format will successfully expire 3.2 backup format files from TSM server.

mmbackup will consider linked child fileset whose parent fileset is unlinked as unlinked fileset.

CNFS: Fix recovery with multiple VLANs in CNFS IP list.

Prevent the cluster manager from being expelled as a consequence of some communication outage with another node.

This update addresses the following APARs: IV15099 IV15132 IV16137 IV16986 IV16989 IV16992 IV17340 IV17410 IV17478 IV17489 IV18093 IV18096 IV18329.

Problems fixed in GPFS 3.3.0.21 [March 09, 2012]

AIX versions of libgpfs API returning errno in rc instead of separately.

Correct filesystem hang due to running out of mailboxes.

Fix assert when many threads append at the beginning of the same file.

Fix quota manager locking order that was causing long waiters.

Unblock quota commands earlier in file system recovery.

Fix AIX fatal page fault at 0xFFFFF00010000000.

Fix a problem where writing to a previously read but not yet modified page in a memory-mapped file will sometimes fail after creating a snapshot.

Fix an E_WOULDBLOCK assert caused by mixed mmap/regular read and write on multiple nodes.

Speed up recovery when the GPFS daemon terminates at the cluster manager node.

Mmbackup fails with DEBUG=1. Finding the snapshot root dir with "mmsnapdir" when DEBUG is set resulted in mmbackup failing to parse the debug output while looking for the snaproot. Recode the function to look for the specific output needed, and change to use tssnapdir instead of mmsnapdir. Fix a few spelling typos in comments and fix comparison of errorFileCount.

Fix to a problem where, with a file opened with the O_DSYNC flag, the data may not be written to disk before the write() system call returns.

Fix the problem that AIO without O_DIRECT on GPFS Linux yields 0 bytes written on kernel >= 2.6.19.

Skip renamed/deleted name entries during a Windows query directory call, which could cause the entire query directory call to fail.

Unauthorized chown command was not rejected and setid bits were cleared.

The mmbackup shadow file rebuild process can result in some files being expired from the wrong server. This happens when there are multiple TSM servers and queried inventory from a server remains in the QueryShadow file with path names lexicographically after all the current file path names in the live file system. At that point the mmcmi rebuildshadow command has finished processing the input files (QueryShadow and List), and the remaining data in the QueryShadow must be appended to the new shadow file being rebuilt. The remedy duplicates the code which replaces the QueryShadow values for iAggregate, iRule, etc. with the data saved from the first line of the current list file, which contains valid iAggregate and iRule values pertaining to this TSM server.

Fix mmfileid to handle big disk addresses in 32 bit kernels.

In mmbackup, check if unlinked fileset at the beginning and display correct error if -f is not specified.

Fix assert that can be caused by truncating a file to less than one block in size and then expanding the file back to one block in size.

Fix mmcheckquota command cleanup on failure that was impeding further invocation of the command.

Mmbackup leaves audit log after success. Mmbackup needs to remove audit logs for each tsm server before running and after running each backup job.

Fix the buffer length calculation so that there is room for the complete error message.

Fixed segfault in SFSdmGetDirAttrs.

When scanning through the errors in the audit log file, if mmbackup comes to the end of the list file then it has now made a "reduced copy" of the original list, so it closes the original list, closes the reduced copy, and re-opens the reduced copy, starting at the top and treating this reduced copy as the "list file" to continue the search for failures to remove. It continues searching the audit failures out from the ever-reducing list file until done. If it succeeds in finding all audit failures in the list, then it has successfully reduced the list to a proper "new shadow" file state and can exit with some measure of success (1). But if it fails to find all the audit failures in there, then it cannot update the shadow and record success. Repair this routine, which reopens the reduced list, to no longer discard the first 5 lines of the top of the list file. Originally the intermediate shadow file would have had a 5-line header, but the usual 5 lines are not present in this list file anymore as it is in a full shadow database. Simple removal of the code that was discarding the 5 top lines fixes the problem.

Fixed hang due to dmapi lock not being released.

Fix an issue where writing to a file with O_DSYNC did not correctly commit data if disk space for the file was pre-allocated via gpfs_prealloc().

Support two more DWARF4 operation codes in GPFS traceback generation code.

Avoid a deadlock when quorum loss happens while a restripe is in progress.

Fix a problem that, in certain rare instances, can cause a segfault when a filesystem manager fails or is migrating.

Provide the nanosecond granularity support in gpfs_stat().

Avoid double mutex release when SGMount fails to get the up-to-date StripeGroupDesc information.

This update addresses the following APARs: IV11268 IV13275 IV14918 IV14922 IV14923 IV15096 IV15187 IV15297 IV15315 IV15316.

Problems fixed in GPFS 3.3.0.20 [January 27, 2012]

Fixed code to prevent daemon assert which can occur after file system panic due to disk failure.

Fix a panic when gpfs shuts down.

Run cbooP->occupancy callback when THRESHOLD(0,) or no THRESHOLD clause.

Fixed the logging code which can cause a GPFS daemon assert under some race conditions.

Fix a completion race in the parallel inode traverse.

Fix duplicated dmapi session problem caused in cluster manager node takeover process.

Fix potential kernel panic when a dmapi mount event is waiting for a response while the daemon shuts down at the same time.

Fix a deadlock between the quota manager and inode prefetch in accessing quota files.

Fix a race condition where mixed (mmap and regular) read-write of the same file on multiple nodes could cause some mmap writes to be lost.

This update addresses the following APARs: IV12633 IV12882 IV12931.

Problems fixed in GPFS 3.3.0.19 [December 09, 2011]

Fix a race between truncate and restripe that causes an E_HOLE error.

Fixed a problem with the tie-breaker disk logic, where a cluster manager would resign from its role because of a disk challenge coming from a node which was no longer a quorum node.

Rare filesystem hang due to SGPanic before setting filesystem uid.

Fixed assert caused by having extended attributes in the inode of a sparse file.

Fix a small window where truncating a file could cause restripe to fail.

Fix assertion occurring during clone file deletion in dmapi enabled file systems.

Fixed command parser for mmlsquota.

Allow changing fs data replica to max data replica even if higher than max metadata replica.

Added conditional compilation statement so that no error messages will appear when compiling GPL layer on SuSE 10 SP1.

Fix a condition that was causing mmsnmpagentd to terminate abnormally.

Ensure the sg is not panicked before we trigger a logassert in updateDataBlockDiskAddr.

Fixed gpfs rpm dependence problem so that no error message will appear if "rpm -ivh gpfs*" is used.

Add mmapBufstructs configuration setting to allow more page requests which may avoid vmmiowait deadlock.

Fix a file system mount problem when quota files can not be created according to metadata replication factor.

Avoid assert when deleting a very large directory from multiple nodes.

Fix performance of directIO that has to go to NSD server.

Do not allow NumNFSDataObjects increment if NFSKProc is terminated.

Recognize new TSM error when file list aborted.

File struct cleared during reclock revoke handler UNLCK.

Fix a segfault triggered by mmaddcallback and daemon shutdown at the same time.

Fix a deadlock problem during restripe fs.

Fix the directory code which can cause a rare deadlock during mkdir and link fileset. The deadlock could occur when GPFS can not write to all replicas due to down disk.

Fix a rare assertion during a file system mount when quorum is lost while opening file system disks.

Ensure that fileset related checks are not done for older filesystem formats that do not support filesets. This ensures fsck does not generate false positives of corruption.

Fix a resource leak problem when running mmfileid.

Fixed problem in mmbackup that caused files larger than 100 GB to be backed up needlessly after rebuilding the shadow file.

This update addresses the following APARs: IV01035 IV04701 IV09147 IV09150 IV10336 IV10534 IV10609 IV11660.

Problems fixed in GPFS 3.3.0.18 [October 20, 2011]

Fix daemon assert when a node is changed from a quorum to a non-quorum node.

Fix a rare problem which could cause mount to hang in "waiting for SG cleanup" after new FS manager takeover fails due to panic.

Speed up mmexpelnode execution when multiple nodes fail.

Separate audit logs into separate files for each TSM server. Make each loop that iterates over all TSM servers recoverable so that after an error that server's state is marked to indicate the error and the loop continues to the next server. Track shadow file updates, file backup errors, and general TSM command failures separately and compute the return code given to mmbackup.sh based on the degree of success attained over all TSM servers. Rewrite logic at beginning to obtain or recreate a shadow file for each TSM server depending on the command line arguments and state of shadow file. Track serverExits, serverFails, shadowUpdates all in hashes based on $server. Isolate all code for performing query from TSM server and execute this step when needed for missing shadow file from each server. Save query results into a file and refer to it later after file system scan if rebuild needed. Discard TSM messages that are warnings or information during inventory query. Recognize when TSM has nothing backed up yet for the specified file system and that this is not an error. Create new function mergeQuery() for use after file system scan to merge query data with file system scan data using mmcmi rebuildShadow. Eliminate use of lstat() during query rebuild of shadow file saving time. Create new function restoreShadow() to restore a single shadow file from TSM if it is in inventory. Permit rebuilding shadow files even without TSM present by use of --notsm switch. Support the --rebuild options from mmbackup.sh. Disallow use of -q and -t full at the same time. Corrected check of $? after close only of command pipes and not files.
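The "recoverable loop" over TSM servers described above can be sketched as follows. This is an illustration in Python, not the mmbackup code: the backup_one_server callable is hypothetical, and the mapping from outcomes to a return code (0, 4, 12) is an assumption, since the notes only say the code reflects the degree of success over all servers.

```python
def run_all_servers(servers, backup_one_server):
    """Run one backup step per server. An error on one server is recorded
    against that server and the loop continues with the next one."""
    server_fails = {}  # server name -> error text (analogous to a hash keyed by $server)
    for server in servers:
        try:
            backup_one_server(server)  # hypothetical per-server step
        except Exception as err:
            server_fails[server] = str(err)
            continue
    return server_fails


def overall_rc(servers, server_fails):
    """Assumed mapping: 0 = every server succeeded, 4 = some failed, 12 = all failed."""
    if not server_fails:
        return 0
    if len(server_fails) < len(servers):
        return 4
    return 12
```

Keeping per-server state in a dictionary means one failing server does not hide the results of the others when the final return code is computed.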

Avoid problems accessing files in snapshots while running the mmunlinkfileset and mmdelsnapshot commands.

Fix timer tick calculations to avoid AIX times() bug when running on LPARS in shared processor mode (not dedicated processors).

Fix mmapplypolicy during its directory scan phase.

Leave vattr untouched when there is a failure collecting the attributes.

Fix read code which caused errno to be incorrectly set to EAGAIN even when O_NONBLOCK is used. This problem only occurs when reading past end of file.
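For reference, POSIX specifies that a read() starting at or beyond end of file on a regular file returns 0 bytes rather than failing, whether or not O_NONBLOCK is set. A minimal check of that expected behavior, using a temporary file on whatever file system holds the temp directory (not necessarily GPFS), might look like this:

```python
import os
import tempfile


def read_past_eof(offset=4096):
    """Read from a small regular file at an offset beyond its end, with O_NONBLOCK set."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"data")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
        try:
            os.lseek(fd, offset, os.SEEK_SET)
            # Expected: an empty result (end of file), not an EAGAIN error.
            return os.read(fd, 16)
        finally:
            os.close(fd)
```

Running the same check against a file in a GPFS mount would be one way to confirm the fix on an updated node.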

Fix problem where the quorum reached event could be triggered twice for a single occurrence during node shutdown.

Ensure the presence of excluded disks is always reflected in the mount options string.

When a tie-breaker disk is not present, fix delay in completing mmexpelnode when multiple nodes fail and are the targets of the command.

Fix mmcheckquota, when used to replace quota files, to avoid an assertion in DMAPI-enabled file systems.

Hung dmapi dm_release_rights api call in DeclareResourceUsage.

Fix quota file checksum errors caused by using wrong byte-ordering.

This update addresses the following APARs: IV08089 IV08208 IV08607 IV08664 IV08729 IV08730 IV08732 IV08736.

Problems fixed in GPFS 3.3.0.17 [September 16, 2011]

Fix a race condition between creating a fileset snapshot on one node and mounting the file system on another node where the file system was internally mounted.

Fix an assertion caused by expanding a quota file fragment.

Attempt to restore a missing shadow file before resorting to rebuilding it from query data in all cases. The former behavior was inconsistent with -t full or -t incremental.

In an environment where a tie-breaker disk is used, fix problem where command mmexpelnode (primarily used in a DB2 cluster) may take more than 2 minutes to run if target nodes are rebooted but come back quickly after the reboot.

Fix code which can cause deadlock when mmdeldisk is running with a truncate operation on the helper node.

Fix a double free problem when doing restripe.

Prevent quorum loss at the cluster manager when quorum nodes are being added or deleted in an environment where tie-breaker disks and persistent reserve are used.

OpLock token revoke can cause deadlock on cacheObjMutex.

Take note of any "severe" class error message from dsmc commands and count them. If a severe error occurs, or if ANS1999E message occurs indicating processing of a file list was aborted, then keep old shadow file.
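The decision rule above can be sketched in a few lines. The sketch assumes that TSM client message IDs have the form "ANS" plus four digits plus a severity letter, with a trailing "S" marking a severe message; that pattern is an assumption for illustration, and only the ANS1999E identifier comes from the note itself.

```python
import re

# Assumption: a severe message ID looks like ANSnnnnS.
SEVERE_RE = re.compile(r"\bANS\d{4}S\b")
LIST_ABORTED = "ANS1999E"  # named in the note: processing of a file list was aborted


def keep_old_shadow(dsmc_lines):
    """Return True when the old shadow file should be kept."""
    severe_count = sum(1 for line in dsmc_lines if SEVERE_RE.search(line))
    aborted = any(LIST_ABORTED in line for line in dsmc_lines)
    return severe_count > 0 or aborted
```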

Add debug statement in exec script to detail expire errors. Restore lost progress indication every 30 minutes to output "Backing up files". Limit error messages about policy nonzero return status to occur only for policy errors. In full backup plus query mode, preserve the restored shadow file in .mmbackupShadow*.old. Repaired message formatting for "TSM failed with RC..." message. Repaired message formatting indicating "Audit log missing" info. Elide decorating double quotes from audit log lines while comparing file names to shadow file. This fixes the most common problem with quotation marks in file names when using the TSM 6.3.0 client (enhanced to now at least print the correct path). Most critically, change exit status from tsbackup33 when all errors have been corrected to be "4" instead of $worstTSMrc.

Fixed rare deadlock during token recovery.

Prevent a node from being added to the cluster a second time under a different name.

Fix mmstartpolicy to accept all valid low disk space event names.

Free kernel buffer upon exit from kxWinOps.

Handle CNFS interface on vlan tagging devices.

This update addresses the following APARs: IV03590 IV03704 IV06330 IV06334 IV06439 IV06466.

Problems fixed in GPFS 3.3.0.16 [August 09, 2011]

Reference any user handle passed to the kernel using the correct access mode and security check.

Fix defrag code that can cause mmdefragfs to never finish. This could occur only when all disks are almost full.

Fix UTF8 encoding check routine to return correct path name length.

If socketMaxListenConnections is set to the correct value for the number of clients that can connect simultaneously, clients attempting to connect at the same time will no longer observe connection failures.

Fix to mmexpelnode to avoid indefinite blockage when a network partition is present and the target node is the cluster manager.

Avoid propagating the MM_SORT_OPTS environment variable across nodes when using mmapplypolicy with the --sort-buffer-size option.

Permit mmbackup to analyze audit log even if "Failed" messages are not detected in STDOUT from dsmc selective command. Should help in rare cases where TSM runs out of space and exits rc=12 without emitting any of the usual failure messages.

Fix exception in GetReturnAddr following a cNFS grace period.

Fix large inDoubt left unclaimed problem after running chgrp command.

Add checking code to make sure sgID and inode number passed by dmapi handle are valid.

Change allocation code to prevent a deadlock that could occur when rebalancing disks. This problem could only occur if there are a large number of disks (over 32) and most of them are 100% full.

Fix a problem where, if a file is opened with O_DIRECT while other processes have the same file open without this flag, writes to the file are not immediately written to disk.

Fix AIX crash or EINVAL error when calling gpfs_get_winattrs_path.

HoldDaemonSeg before referencing configuration flags in the segment.

This update addresses the following APARs: IV02054 IV02088 IV02252 IV02270 IV02677 IV02739 IV02744 IV03219.

Problems fixed in GPFS 3.3.0.15 [June 24, 2011]

Call putECred on exit from kxWinOps and kxSetTimes.

Remove extra IOs when closing a sequentially written file that is larger than the writebehindThreshold.

Fix deadlock involving fcntl locking operations that can occur on Linux systems with 2.6.18 based kernels under memory pressure.

Prevent a rare FSSTRUCT error accessing large directory in a snapshot from multi-threaded application on AIX.

Fix rare kernel memory race condition doing AIX IO when the disk max_transfer setting is smaller than the GPFS blocksize.

Simplify expel command execution when multiple nodes including clmgr are specified in the command.

Fixed mmexpelnode to avoid 'Failed to locate a working cluster manager' errors.

Serialize creation of grace period thread.

Fix dummy super block registration (used by kernel modules to detect GPFS daemon death) so that it works with newer Linux kernels.

Fix code which can lead to assertion in a very rare case.

Fix a quota manager assertion where it could be caching invalid quota file inode after restripe.

Fix assert !ofP->destroyOnLastClose or apparent hang under certain workloads with concurrent directory updates from multiple nodes.

Correctly set the list of registered RPC programs after restarting portmap.

Fix the allocation code which caused signal 11 under certain error condition.

Fix PR on clusters where the nsd are NOT directly-attached.

Fixed remount code that caused mmmount to still show success after remount failed.

Fix CNFS failover problem with SLES11 or later.

Fix for a rare race condition when GPFS startup and shutdown race each other, resulting in a spurious assert.

Fix a GPL build break on RHEL 4.X when LINUX_KERNEL_VERSION=2060900.

Fix a problem where policy rules files with lines longer than 4K are not regurgitated correctly by the mmlspolicy command and are silently dropped.

Clean up mmnfsmonitor monitor process after removing a CNFS node.

Correct disk usage problem in recent Linux kernels with certain LANG setting.

When exec script returns nonzero and mmapplypolicy returns EBADF(9) just print message and allow cleanup code to discern whether failure is fatal or not.

Change variable name to badPathCnt as it represents path names mangled by TSM, not files that were skipped by TSM because they were busy. Eliminate use of lstat() in determining this and simply see if backupDir is a common root of the failed path. If badPathCnt is nonzero, will have to fail the backup.
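A "common root" test of this kind needs only string operations on the path, which is why no lstat() call is required. The sketch below is an assumption about how such a test could be written, not the mmbackup code; in particular, treating any failed path that is not under the backup directory as "mangled" is this example's interpretation.

```python
import posixpath


def is_under_backup_dir(backup_dir, failed_path):
    """True if backup_dir is a common root of failed_path (pure string check)."""
    backup_dir = posixpath.normpath(backup_dir)
    failed_path = posixpath.normpath(failed_path)
    try:
        return posixpath.commonpath([backup_dir, failed_path]) == backup_dir
    except ValueError:
        # Raised when absolute and relative paths are mixed: not under the root.
        return False


def count_bad_paths(backup_dir, failed_paths):
    """Count failed paths that do not lie under the backup directory."""
    return sum(1 for p in failed_paths if not is_under_backup_dir(backup_dir, p))
```

Comparing whole path components avoids the false match a plain prefix test would give for sibling directories such as /gpfs/fs1 and /gpfs/fs10.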

Fix long waiters caused by error handling in mmpmonNodeListRequest message handler.

Correct mmchconfig buffer allocation problem for very large clusters.

Get rid of spurious approaching limit for the maximum number of inodes when a new FS manager takes over.

This update addresses the following APARs: IV00033 IV00416 IV01058 IV01081 IV01084 IV01086 IV01137 IV01139.

Problems fixed in GPFS 3.3.0.14 [May 12, 2011]

In the mmlsmount -L output, show the type of mount if it is different from an explicit mount in read-write mode.

Fix mmchpolicy causing daemon assert after fail with E_NOENT error.

During audit log analysis, if any path names must be skipped fail the mmbackup with return code 12 and leave the old shadow file in place.

Improve performance of snapshot copy-on-write, in particular immediately after a snapshot is created, when a large number of files are updated from multiple nodes in the cluster.

mmtracectl --off will now reset all trace config. variables.

Replace mhAclStore assert when stale ACL data is discovered.

Only have one receiver thread check for broken connection timeouts in each five second period rather than letting all of them do it.
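One common way to let a single thread per period do a periodic check is a small gate protected by a lock: the first thread to arrive in a period claims the check and the others skip it. The class below is a generic sketch of that technique with hypothetical names; it does not reflect the GPFS daemon's internals.

```python
import threading


class PeriodicCheckGate:
    """Allow only one caller per period to perform a check."""

    def __init__(self, period=5.0):
        self._period = period
        self._lock = threading.Lock()
        self._next_allowed = float("-inf")

    def try_claim(self, now):
        """Return True for the first caller in each period, False for the rest."""
        with self._lock:
            if now < self._next_allowed:
                return False
            self._next_allowed = now + self._period
            return True
```

A receiver thread would call try_claim(time.monotonic()) and scan for broken-connection timeouts only when it returns True.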

Fsck::dump() now dumps regionsPerPass for each storage pool.

Add functionality to suspend write operations on a filesystem.

tsdbfs: avoid displaying Inode's wide address fields on narrow disk address filesystems.

Disallow immutable flag setting on narrow da fs.

Fix a problem with Persistent Reserve enabled when more than 32 paths are registered at the disk. This is most likely to be seen in database environments with direct-attached network storage devices.

Clean up mmbackup exit code reporting throughout.

Fix long mmstartup delay on AIX after mmshutdown --force.

Correct a problem causing mmimportfs to ignore the -S option.

Fix possible filesystem panic during sync.

The maximum shared segment size was increased from 256 MB to 1 GB and the maximum TM memory limit was increased from 1 GB to 128 GB.

Replace portmap with rpcbind on SLES11 for CNFS.

Remove spurious "approaching limit for the maximum number of inodes" when a new FS manager takes over.

Fix CNFS failover problem with SLES11 and later.

Fix a GPL build break on RHEL 4.X when LINUX_KERNEL_VERSION=2060900.

Improve the fairness of the outbound RPC message queue, so that certain reply threads will not be stuck indefinitely under heavy communications loads.

Fix rare race condition causing spurious ENOSPC error when creating files.

Fix NFS returning ENOENT instead of ESTALE for deleted files.

Stop mmapplypolicy hanging or deadlocking itself during exit processing.

Fix writing replicated data with direct-IO where GPFS recovery, after a node failure, could interfere with concurrent updates to the same file from other nodes.

Fix a bogus assert check in the directory check routine.

Prevent GPFS from starting when registering the pagepool to infiniband fails.

Fix mmapplypolicy (tsapolicy) memory fault on AIX.

Fix a memory leak when getting extended attributes of a file.

This update addresses the following APARs: IZ96834 IZ97186 IZ97358 IZ98238 IZ98689 IZ98698 IZ98701.

Problems fixed in GPFS 3.3.0.13 [March 24, 2011]

Update mmtracectl to have --format and --noformat flags which allow one greater control over whether to format traces.

cxiIsNFSLock erroneously returning FALSE for NFSv4 lockctl calls.

Fix problem where forced unlink of a fileset (mmunlinkfileset -f) on Linux would cause temporary loss of file system access if there were deleted files still open in the fileset at the time it was unlinked.

Fix allocation code which caused delete disk to fail when deleting last disk of a failure group.

Fix for a bug in logging code when data replication is enabled and metadata replication is not.

Change "mmwindisk initialize" to create the GPFS data partition with a 16MB alignment.

Make corrections so that gpfs_fgetattr() returns attributes buffer always in correct big endian format.

Always refresh session list that registered for mount to avoid mount failure in the situation that the session could be deleted or added while another node is processing the mount event.

Write new filesystem device name to all disks to avoid unwanted warning messages.

Change the allocation code to prevent looping when migrating blocks after a disk's failure group assignment, data type or storage pool has changed.

Create gpfs init lock file on system startup to ensure GPFS shutdown is being called during system shutdown on RHEL distros.

Allow user space attributes to be set by setfattr or gpfs_fputattrs() interface.

Fix an inodeScan interface clean up error that could cause long waiters during unmount.

Fix panic in cxiStartIO when disk device drivers are configured for 1024 scatter gather lists.

Check for a snapshot named "NONE", which is the mmapplypolicy representation of no snapshot, and avoid eliding that name from backup pathnames.

gpfsFcntl referencing a freed sleep element while handling NFS requests.

Fix problem where under certain rare conditions, mmdeldisk could allow deleting a disk without moving existing data off the disk first. This would only occur on file systems with metadata replication enabled (-m 2), strict allocation enforced (-K always; the default is whenpossible), when running mmdeldisk shortly after creating a new snapshot, and if the only disks remaining are in a single failure group.

Disallow the colon character in a filename during create/open from Windows. The Unix nodes can still create filenames that contain a colon. Such files can be accessed on Windows using their 8.3 names.

Fix for a bug where small synchronous writes to a block pre-allocated using gpfs_prealloc() may be lost.

This update addresses the following APARs: IZ94720 IZ94723 IZ95803 IZ95817 IZ95820 IZ95853.

Problems fixed in GPFS 3.3.0.12 [February 10, 2011]

Fix rare race between flushBuffer and mergeInode updating lastDataBlock.

Fix RDMA connecting between two clusters where the IB networks are not connected.

Fix asserts in directory code when using uncommon 384K block size.

Add a make parameter "LINUX_DISTRIBUTION" for non-standard Linux distributions. e.g. make LINUX_DISTRIBUTION=REDHAT_AS_LINUX Autoconfig.

Fix the allocation code which caused an assert on filesystem manager node after encountering an I/O error.

Fix a bug in mmwindisk utility (called from mmdevdiscover, for instance) that could cause the program to fail when the Windows node has certain uncommon storage devices attached.

Enable MaxLoopCheck_dumpBufQueue and set its default value to 2, because a certain line in the dump pgalloc section was being dumped repeatedly; with this limit the dump will not greatly impact performance.

Fix Windows code to handle, or enable access on, all Unix file system objects such as block devices, FIFOs, sockets, and symlinks. All such objects will be listed as 0 byte files on Windows. No other operation such as read/write/delete/modify will be allowed from Windows.

Speed up snapshot creation and unmount on systems with a large amount of dirty data in the cache.

Subblock to disk sector conversion routines now handle invalid disk addresses by returning E_INVAL to the caller during fsck scan.

Fix a problem where a surplus indirect block might not be processed during restripe.

Speed up the reclaim of unused GPFS inodes on Linux.

Do not allow GPFS internal extended attributes to be set using the gpfs_fputattrs API. Require root authority to set DMAPI external attributes or external namespace attributes other than "user."

Improve performance of file system metadata scan phases of mmrestripefs.

Add gpfs_set_times() and gpfs_set_times_path() API.

Fix problem where killing an mmrestripefs or mmdeldisk command left the background activity running for an extended period of time.

Fix daemon assert and segmentation fault during filesystem takeover.

Include mount event disposition in dm_get_disp() call.

Fix invalid assert in fcntl lock token relinquish path.

Prevent GPFS from starting while certain admin commands are running.

GPFS for Windows now disables SMB2 on the node during installation.

Fix unnecessary work and processing during mmrestripefs and mmdeldisk commands.

Fix quorum formation when the /var/mmfs/gen/BallotFile file is too small.

Fix a longwaiter problem caused by an infinite loop in mmdefragfs.

Fix a problem where a workload that continuously invokes operations requiring exclusive inode locks, such as chmod or chown, could prevent mmrestripefs or mmdeldisk commands from progressing.

Fix mm commands connecting to server in cluster where the admin interface is different than the daemon interface.

Do not return windows attributes blindly for gpfs_fgetattrs() API.

Fix allocation code thereby preventing an assert that could occur while trying to delete/replace disk.

Fix code in initializing a dummy super block "shutdownSuperP", which may cause kernel crash in some of the Linux kernel versions.

Ensure the scope of config parameters is not changed as a result of delete operation.

Fix IcQueryDirectory implementation for FILE_ID_BOTH_DIR_INFORMATION and FILE_ID_FULL_DIR_INFORMATION to correctly assign the file ID field.

Add GPFS mount options nfsHashName and nonfsHashName. If nfsHashName is in effect, the NFS FH will include the hash value of the file name. This option will improve performance but is off by default. Turning it on might cause some ENOENT errors with NFS.

Fix a bug that lets the ACL garbage collector delete auto-generated Windows SID mappings.

Fix race condition starting too many mmkprocs when using many mmapped files.

Fix gpfsInodeCache slab (and cpu) usage high due to NFS anon dentry allocations.

This update addresses the following APARs: IZ92304 IZ92308 IZ92314 IZ92317 IZ92319 IZ92322 IZ92327 IZ92426.

Problems fixed in GPFS 3.3.0.11 [December 16, 2010]

Fix assert in setCachedRecAddr when the cached disk address is NULL while the disk address read from disk is a real disk address. The assert is modified to allow this kind of change, and the cached address is updated locally.

Fix repeated RDMA connection attempts on a down port due to IBV_EVENT_PORT_ERR.

Fix AIX crash caused by kxFreeAllSharedMemory due to another process having the shared segment mapped for command execution.

Address the following: Add use of POSIX library for stat function, create new subroutines to generate shadow file header, convert 3.3 to 3.4 sorted shadow file and to format ctimes into standard date/time format in GMT0, utilize version 4 mmcmi commands such as diffshadow, complete rewrite of query code to rebuild shadow file from TSM database, removal of "$expiredLine", use diffshadow without changing of sort-order, and maintaining inode-order henceforth, do not re-sort the shadow files before diff.

Rework migration code for 3.2 format shadow file conversion up to 3.4. Resort even 3.4 style shadow files to inode-order to improve backup accuracy. Omit .mmbackupCfg and .snapshot dirs in backup and remove excessive debug output. Repair gencount of -1 from query data. Preserve old shadow file if $debugPreserve.

An additional characteristic of pathnames with special characters present is they can cause TSM to exit with rc=4. Sometimes this was being mis-handled in mmbackup because the highest error code from all the runs of TSM was not recorded. Change to record highest TSM error from BACKUP operations, ignore return from EXPIRE operations, and always try to calculate the net backup success status.
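The rule described above (keep the highest return code from BACKUP operations and ignore the return from EXPIRE operations) can be expressed compactly. The function name and the (operation, rc) tuple layout below are hypothetical and chosen only for illustration.

```python
def worst_backup_rc(results):
    """results: iterable of (operation, rc) pairs, where operation is
    'BACKUP' or 'EXPIRE'. Returns the highest rc seen for BACKUP runs;
    EXPIRE return codes are ignored. Returns 0 if there were no BACKUP runs."""
    return max((rc for op, rc in results if op == "BACKUP"), default=0)
```

For example, a run list of BACKUP rc=0, EXPIRE rc=12, BACKUP rc=4 yields 4, so an rc=4 caused by special characters in a path name is no longer lost behind a later successful run.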

Permit TSM install to be in "bin64" for AIX and find needed config file (dsm.opt) there. Enhanced debugging in tsbackup33 using DEBUGmmbackup bits.

Add new functions to carefully split file list lines and notice if the split char is showing up in the file name as well. When this happens, carefully put the path name back together along with the split char. Then proceed as before.
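The splitting approach can be illustrated as follows. The line layout used here (a fixed number of leading fields followed by the path name, separated by a single character) is a hypothetical format for the example; the actual file list format is not described in these notes.

```python
def split_list_line(line, sep=" ", n_fixed=2):
    """Split a file list line into its fixed leading fields and the path name.

    If the separator also occurs inside the path name, the extra pieces are
    joined back together with the separator so the path is preserved."""
    parts = line.split(sep)
    if len(parts) < n_fixed + 1:
        raise ValueError("line has too few fields: %r" % line)
    fixed = parts[:n_fixed]
    path = sep.join(parts[n_fixed:])  # reassemble the path, separators included
    return fixed, path
```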

Fix the allocation code which can cause a filesystem to panic with "Too many disks are unavailable" when running out of disk space.

Fix a race condition where a file system manager failure during a disk status change could cause temporary loss of file system access.

Fix kernel assert when dmapi event generator is accessing null sgP pointer.

Provide support for recognizing DSM_CONFIG env variable in mmbackup.

Add noSpaceEventInterval config parameter to control interval between two nospace events. The default is 120 seconds.

Fix race between two remove threads removing the same file. Check for valid inode after acquiring inode lock.

Fix race between deferred deletions and policy file creation. Changed runTSChangePolicy code to acquire file lock before calling finish allocation to synchronize with deferred deletions.

Fix rare problem in mmrestorefs error code path. Added new macro CHECK_ADVANCE_CONTINUE_ON_ERROR. This advances buffer pointer before continuing the loop.

Address: relocate mmbackup related temporary files from root of gpfs to /.mmbackupCfg/, interpret env DEBUGmmbackup bits: 0x02 = preserve temp files after backup. 0x01 = debug tracing, pervasive use of cleanupAndDie() routine rather than exit, fix sort->$sort, permit recovery after non-fatal TSM error codes.
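The DEBUGmmbackup value is a bit mask, so both behaviors can be requested at once (for example, 3 enables tracing and preserves temporary files). The sketch below shows how such a mask is typically decoded; the bit values come from the note above, while the function name, the returned dictionary, and the acceptance of hexadecimal input are assumptions of this example.

```python
import os

DEBUG_TRACE = 0x01    # debug tracing
PRESERVE_TEMP = 0x02  # preserve temp files after backup


def debug_flags(environ=None):
    """Decode the DEBUGmmbackup bit mask from the environment."""
    environ = os.environ if environ is None else environ
    raw = environ.get("DEBUGmmbackup", "0")
    try:
        bits = int(raw, 0)  # accepts "3" or "0x03"; this parsing rule is an assumption
    except ValueError:
        bits = 0
    return {
        "trace": bool(bits & DEBUG_TRACE),
        "preserve_temp": bool(bits & PRESERVE_TEMP),
    }
```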

Fix rare assert in fsync code path. Fix SFSSyncFile to check inode status before updating mtime and mark inode dirty.

Improve mmdeldisk progress time.

Customers using IBM Tivoli Storage Manager Backup Archive Client and GPFS storage pools with policy RESTORE rules for data placement should apply this fix before restoring data to the GPFS file system.

Improve performance of small writes (<32k) over NFS to a file opened with O_DIRECT on the NFS client.

Keep inode number in sleeper struct so there is no need to reference structs that might no longer be valid. This addresses a problem where, after a node failure when using cNFS, GPFS sometimes crashes with a reference to a bad pointer.

When capturing mmapplypolicy output in a file, use isatty to determine if progress messages are going to a device that supports "\r" as a "carriage return, no-newline" command. Depending on the result (or an available override switch) - use appropriate progress message updates.
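The isatty test works as follows: on a terminal, a progress line can be redrawn in place with a leading "\r"; in a captured file, each update should instead be a normal line. The sketch below illustrates the idea in Python with hypothetical names; the force_tty parameter stands in for the override switch mentioned above, whose real name is not given here.

```python
def format_progress(message, stream, force_tty=None):
    """Return the text to write for one progress update.

    stream    -- the output stream (anything with an isatty() method)
    force_tty -- optional override: True/False to bypass the isatty() test
    """
    use_cr = stream.isatty() if force_tty is None else force_tty
    if use_cr:
        return "\r" + message  # overwrite the current line on a terminal
    return message + "\n"      # one line per update when output is captured
```

A caller would write the result with stream.write(format_progress("50% done", stream)) followed by stream.flush().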

Reduce message traffic when writing a file with NFS.

Fix synchronization problem of dmapi destroy event thread and dmapi event response thread.

Fix assert in dmapi event timeout handlers.

Use TRCBUFSIZE environment variable for trace buffer size and ensure it is not overwritten by config parameter.

Fix some 64 bit counters in GPFS SNMP.

Fix problem to properly restore windows attributes.

Improve performance of mixed random read/write workloads on large files over NFS.

Avoid asserts and deadlock by having mmlsfileset and mmlssnapshot commands wait while mmcrsnapshot command runs.

Fix logAssertFailed: rmr1 != rmr2 when using GPFS RDMA.

Fix Assert exp((mappingP->kvaddr >= SharedSegmentKernelBase) ... on 32-bit Linux.

Fix Linux mmdelacl returning E_OPNOTSUPP for files in a "-k nfs4" fs.

Add useDIOXW configuration variable to avoid Direct IO token thrashing when using some IO requests that match the GPFS blocksize.

Correct linux capabilities allowing access even when root squashing is enabled.

Rework all the failure analysis code in the non-subdir case, eliminate awk and run multiple passes if needed through the new shadow file eliding records matching pathnames that failed to be backed up to TSM, document changes in Design comments, adjust LOCAL_FILES setting according to debug parameter and include the audit log and audit fail list in clean up.

Update unlinked fileset handling code to properly cull paths from a new (3.4)-style shadow file and sort by inode number into the updated list file. Exempts the unlinked fileset contents from being expired from TSM.

Fix duplicated session id returned by dm_create_session due to clock out of sync problem.

Added code to suppress implicit file time updates after an explicit set time operation was performed on the same handle. These semantics only apply to Windows systems.

Enable all dmapi clients to acquire access rights to a file that is being destroyed.

Add two new API calls which can be used to improve performance on Linux.

Tweak TSM Query code to recover files more accurately when doing shadow file reconstruction, including file names with break chars in the name.

Fix an assert when multiple instances of restripe run in parallel.

Fix problem that could cause some files to become unreadable when running mmrestripefs on a system with small page pool or a workload that causes high demand for page pool buffers.

This update addresses the following APARs: IZ84086 IZ84914 IZ86043 IZ86044 IZ87149 IZ88699 IZ88703 IZ88751 IZ89147 IZ89185.

Problems fixed in GPFS 3.3.0.10 [October 28, 2010]

Fix a potential metadata allocation problem where wrong disk may be selected.

Fix GPFS automount so that it reads config value in /etc/sysconfig/autofs.

Fix "mmquotaon/mmquotaoff/mmdefquotaon/mmdefquotaoff", which provided misleading error messages when another mounted node is disconnecting.

Alarm only if 3 continuous loops all detect the lease expires thread when system time is changed.

Fix a potential problem in create snapshot routine so that when create snapshot fails the new snap id won't get set into any of the snapshot files.

Fix an erroneous assert check in fsck cleanup path. Ensures that the assert is only checked under normal conditions and not during the cleanup code path, as the relevant data structures would have already been cleaned up or be in the process of being cleaned up.

mmapplypolicy internal error finding one of its internal sort files.

If node cannot do cNFS recovery for a failed node, then kill process so another node can do the takeover for both nodes.

Prevent logAssertFailed assert which can happen under a rare race condition. This occurred when the SG manager node was trying to resign while acl garbage collector thread was being started.

Update to displayed error message and the return code obtained when mmrestoreconfig fails while enabling/disabling default quotas.

Fix assert related to RCTX.REPLIED and TSCOMM.C that occurs on the FS manager node if the FS manager is running GPFS release 3.2, and a release 3.3 client tries to mount the filesystem.

Linux IO: check mm_struct before pinning pages.

Fix retest_path error checking.

Improve performance of stat operations on Linux under certain multi-node access patterns.

Fixed FSErrValidate error in ACL garbage collection. ACL garbage collection was running at the same time as an inode expansion and was attempting to process new (uninitialized) blocks at the end of the inode0 file.

Prevent a rare deadlock between mmcheckquota and FS manager recovery.

Corrected assert in mmgetacl/tsgetacl for default ACL on a directory in a remote fs.

Forcefully evict unused inodes that have been invalidated to keep the number of unused inodes from growing too large.

Fixed mmapplypolicy with SNAPID on AIX getting an SQL error.

Improve performance of applications using directIO or, if alignment test fails, go to the regular request path and skip trying to do direct IO.

Ensure mmstartup commands are properly serialized thereby avoiding interspersed messages in the mmfs.log file.

Fix asserts in fsck while trying to fix corrupt directories.

Fix hang between node join thread and events exporter request handler thread.

tsapolicy: reduce pathnames (e.g. xxx/. will now be xxx).
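The kind of reduction described (xxx/. becomes xxx) matches what standard path normalization does. As an analogy only, and not a description of how tsapolicy implements it, Python's posixpath.normpath gives the same result:

```python
import posixpath


def reduce_path(path):
    """Collapse redundant '.' components and duplicate slashes.

    Note: normpath also resolves '..' components textually, which may go
    beyond what the note above describes."""
    return posixpath.normpath(path)
```

For example, reduce_path("xxx/.") returns "xxx".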

Fix deadlock when preMount callback invokes mm commands.

Fix quote error in mmapplypolicy macro processing.

Improve GPFS mmstartup time & other GPFS commands in adminMode=allToAll cluster.

Fix buffer length calculation for dmapi user event returned by dm_get_events call.

Fix dm_handle_to_path so that it can look up the directory name by its own handle.

Fix problem where a remote cluster does not always pick a local NSD server when readReplicaPolicy=local is set.

This update addresses the following APARs: IZ84015 IZ84040 IZ84160 IZ85218 IZ85446 IZ86146 IZ86153 IZ86164.

Problems fixed in GPFS 3.3.0.9 [August 16, 2010]

Add T (for terabytes) and P (for petabytes) as suffix to mmedquota/mmdefedquota.
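Suffix handling of this kind usually amounts to a lookup table of multipliers. The sketch below is illustrative only: it assumes binary multipliers (powers of 1024) and treats an unsuffixed number as a plain count, neither of which is stated in this Readme; consult the mmedquota documentation for the actual units.

```python
# Assumed multipliers (powers of 1024); K, M and G are included for completeness.
_MULTIPLIERS = {
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,  # terabytes
    "P": 1024 ** 5,  # petabytes
}


def parse_quota(value):
    """Convert a string such as '2T' or '1P' to a number of bytes."""
    value = value.strip().upper()
    if value and value[-1] in _MULTIPLIERS:
        return int(value[:-1]) * _MULTIPLIERS[value[-1]]
    return int(value)  # no suffix: returned unchanged (assumption)
```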

Fix an ENOMEM error in the Token Manager memory when multiple remote clusters are working with the same files.

Fixed race between tschpolicy thread and deferred deletion thread. This was causing an inconsistent inode state of that policy file being created both in disk inode bitmap and in-memory bitmap.

Fixed race between endUse thread and sgmMsgSGTakeoverQuery msg handler.

Removed erroneous assert encountered during delete snapshot.

Fix a deadlock caused by buffer steal during quota update.

Fixes an assert in the communication layer caused by an improper return code being sent back by the block map check message handler while fsck is in progress.

Fixes lock_vfs_f/releaseSlow asserts because it doesn't hold the mutex after DAEMON_DEATH.

Fix assert in dm_set_disp() path, return error if ccmgr changed in the middle of processing new disposition.

Fixes gpfsWrite vinfoUnlock being called when lock was not held and causing exception.

Fixes segfaults in various ThreadThing methods.

Fixed race between mount and garbage collector thread.

Fix problem in kSFSGetAttr call to handle compact file case.

Robustness improvements to better detect errors from TSM. Better tracking of return codes from mmapplypolicy. Periodic time-stamped output of "Backing up files..." during long-running TSM jobs. Better error diagnostics when remote TSM jobs fail. Corrected file tally for total files backed when remote TSM jobs executed. Cleaner verbose output when debug is not enabled.

Fix hang when an executable is run out of GPFS and mmapRangeLock=no.

Handle recovery when devices return E_NODEV. This fix applies to GPFS 3.2 and higher.

Fix problem in mmimgrestore when restoring immutable files.

Fixed code that allowed a regular file read to be performed on a directory, which led to an EIO error later. This happens on Linux only.

Fix code to always return EISDIR when regular file read is called on a directory.
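The expected behavior can be observed from user space: on Linux, opening a directory read-only succeeds, but a read() on the resulting descriptor fails with EISDIR. The check below is a minimal sketch that uses a temporary directory on the local file system; pointing it at a directory in a GPFS mount instead would be a way to verify the fix.

```python
import errno
import os
import tempfile


def read_errno_on_directory():
    """Attempt a regular read() on a directory and return the resulting errno
    (or None if the read unexpectedly succeeds)."""
    with tempfile.TemporaryDirectory() as dirname:
        fd = os.open(dirname, os.O_RDONLY)
        try:
            os.read(fd, 16)
        except OSError as err:
            return err.errno
        finally:
            os.close(fd)
    return None
```

On Linux the returned value is expected to equal errno.EISDIR.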

When a migrated file is being deleted, generate dmapi READ event only when copy to snapshot is needed.

Fix problem where mmunlinkfileset would sometimes succeed even if there is a process that still has a file in the fileset open on one of the nodes.

Fix a deadlock during file system panic processing while ACL garbage collector is running.

Fix problem of reading snapshot file after file is migrated and recalled back.

Fix get_next_inode to retrieve specified inode information.

Improved error message for XATTR policy statement.

Fixes gpfsInodeCache slab (and cpu) usage high due to NFS anon dentry allocations.

Reduced time required while creating snapshot.

Add support for RDMA connections to nsdperf sample program.

Adding a trace message and return code for mmnfsrecovernode.

Add new path for trace commands access.

Fixes tsapolicy command on AIX. Does not produce "error: [X] Error".

Fix a problem when adding first disk to a new storage pool while file system is in sync process.

Fix a deadlock occurring in a mix of very heavy DIO workload and mmap on Linux.

Removed bad DBGASSERT(hasVinfoLock) and add additional maintenance for the local hasVinfoLock flag. Specifically, after kSFSWriteFast had released the lock when returning E_CDITTO_LOCK.

Correct a startup problem when migrating from GPFS 2.3.

FSCK checking log file inodes even if they have log group number set to -1.

Fixed a bug in Windows support that affected systems accessing GPFS through Windows file sharing (CIFS). In some scenarios, directory access could become very slow and could possibly return incomplete data.

Modified the Windows implementation to periodically flush unused executable files from memory.

Assert working with elements on the kxRecLockAcquires queue (needs to hold mutex).

Fix rare occurrence of file fragment expansion happening during file sync that can cause the assert failure.

Fixes asserts in fsck while trying to fix corrupt directories.

Fix assertion caused when deleting snapshots with very large files.

mmtracectl, when running on Windows, has been enhanced so that it utilizes the traceFileSize and traceBufferSizeForAIX configuration parameters.

Prevent a very narrow race condition during directory lookup.

This update addresses the following APARs: IZ80053 IZ80737 IZ80741 IZ80744 IZ80973 IZ81229 IZ81232 IZ82941 IZ83044 IZ83711 IZ83797 IZ83795 IZ84007 IZ84063.

Problems fixed in GPFS 3.3.0.8 [July 27, 2010]

Image restore GXR execution inside policy now allows recognition of remote job failures by incrementing f_errs on failure. This should cause tsapolicy to exit non-zero if a failure occurs during image restore.

Add error checking for ImagePath parameter and invoke usage message output if incorrect value provided.

Capture remote job data that was previously ignored. Detect failures and tally success counts from remote jobs. Display overall job summary at end of run.

Choose a bitmap size based on a gpfs_statfs64() call to see how many inodes are actually in use.

Fix a rare assertion during quota file append. During AppendBlockOfRecords a wa lock is needed to append the quota file.

Correct a problem with mmlsfileset when Windows is the file system manager: the path name for the junction of the root fileset is missing in the tslsfileset output, and the path names for the rest of the junction names are missing the part that represents the mount point.

Fix the AIX mmapplypolicy error "Missing or improper nodelist file", which was actually caused by a fork()ed process not terminating correctly. The result was a bogus message "improper nodelist file..."

Catch exit codes from critical commands such as sort. Look at returned codes from close calls from pipelined commands. Keep final two lines of output from pipelined commands and display if close returns non-zero.
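The handling described above (check the exit status of a pipelined command, keep its last two output lines, show them only on failure) can be illustrated with a minimal sketch. This is not the mmapplypolicy code; the function names run_and_keep_tail and report, and the use of Python's subprocess module, are assumptions made purely for illustration.

```python
import subprocess
import sys
from collections import deque


def run_and_keep_tail(argv, keep=2):
    """Run a command; return (exit code, last `keep` lines of its output)."""
    tail = deque(maxlen=keep)  # older lines fall off automatically
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    for line in proc.stdout:
        tail.append(line.rstrip("\n"))
    proc.stdout.close()
    code = proc.wait()  # plays the role of the return code from closing the pipe
    return code, list(tail)


def report(argv):
    """Display the saved tail only when the command exited non-zero."""
    code, tail = run_and_keep_tail(argv)
    if code != 0:
        print(f"{argv[0]} exited with code {code}; last output:", file=sys.stderr)
        for line in tail:
            print("  " + line, file=sys.stderr)
    return code
```

Keeping only a bounded tail means a long-running command such as sort cannot exhaust memory with its output, while the lines most likely to explain a failure are still available.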

Fixed a rare problem in reading a file from a snapshot that resulted in the data for the last portion of the file being replaced with zeros. Problem occurred only when a node reads the file through a snapshot, then another node appends a small amount of data to the file in the active file system and creates a new snapshot, followed by the original node immediately reading the same file in the new snapshot.

Fix the file overwrite codepath on Windows and disallow any operations on symlink objects.

Fix mmsdrbackup user exit on Windows.

Catch and report errors during Pagepool size reduction on Windows.

Ensure that fsck handles orphans from deleted fileset appropriately and deletes them rather than letting them stay unfixed in the filesystem forever.

Always stop mmnfsmonitor after GPFS shutdown regardless of CNFS status.

When trace buffer size given to lxtrace daemon exceeds the lower or upper limit, it should use the minimum or maximum buffer size quietly instead of printing usage message. It should also print a message about what buffer size will be used.
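The clamping behavior described above can be sketched as follows. This is an illustration only, not the lxtrace code; the function name effective_buffer_size and the limit values used in the example call are hypothetical.

```python
def effective_buffer_size(requested, minimum, maximum):
    """Clamp `requested` into [minimum, maximum] and describe the choice."""
    chosen = max(minimum, min(requested, maximum))
    if chosen != requested:
        message = f"requested buffer size {requested} is out of range; using {chosen}"
    else:
        message = f"using buffer size {chosen}"
    return chosen, message


# Hypothetical limits, chosen only to exercise the function.
size, note = effective_buffer_size(requested=10, minimum=100, maximum=1000)
print(note)
```

The point of the design is that an out-of-range request is not treated as a usage error: the caller still gets a working buffer size and is told which one was picked.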

Add missing initialization for gpfs32Version variable.

Fix bug in mmrestoreconfig on filesystem with filesets.

Pass a flag to tell underlying function (flushFile) if flushflag is already held or not.

Ensure tsapolicy command has correct exit code. Only affects use of mm image backup.

Do not delete *~ files when doing "make clean" in gpl-linux directory since the build process does not create them.

Fix assert caused due to accessing deleted inodemap.

Avoid rare daemon crash during heavy create load with low memory.

Set thread context to global operation context for every snapshot command.

putacl/getacl deadlocked on aclFile buffer lock.

When doing a trace cycle on Linux nodes with RHEL5 and SLES10 or above, generate internaldump after trace cycle.

This update addresses the following APARs: IZ79664 IZ79674 IZ79675.

Problems fixed in GPFS 3.3.0.7 [June 24, 2010]

Created new error message and exit function. Utilized in all error exit paths that previously had no failure message.

On failover/failback, (gratuitous) ARP requests from the node registering a new CNFS IP address are rejected by some switches (that have STP enabled and portfast disabled) for a short period. Subsequently the IP address may not be reachable from outside the subnet. Make sure the port is enabled (and outbound requests are accepted) by first ARPing the gateway (if one is configured) with a deadline of 30 seconds.

Serialize xattr registry initialization process.

Fix mmbackup to better handle file names with spaces and certain other metachars. Conditionally replace use of awk with the new mmcmi parsebackuprecord function when available. mmcmi will decorate file paths with double quotes and write to expired and changed files for TSM. Update parsing of PDRs to strictly use the file path length value. Allow tsbackup33 to notice if the policy run failed and indicate an error.

Fixed deadlock due to sg takeover failure.

Avoid corrupted snapshot files in unusual case of open unlinked files.

Avoid deadlock on Linux with small maxFilesToCache due to very frequent file creates and deletes.

Always shut down GPFS when nfsmonitor detects unrecoverable problems such as statd being inactive.

Generate DMAPI read event when file is deleted and when copy to snapshot is needed.

Handle recovery for devices that return E_NODEV on connectivity loss instead of E_IO on AIX. VIO is an example of this.

Prevention of buffers being stolen from inodes that are low-level locked. The current fix ensures only dirty buffers are not stolen.

Search /usr/sbin for sm-notify under SLES11 after IP failover.

Prevent repeated "No space left on device" filesystem manager failure when snapshot copy-on-write gets triggered while the filesystem is running out of disk space.

Replace use of diff in mmbackup with new mmcmi mergeshadow function. If mmcmi version too old, fall back to diff command.

Ensure online fsck does a proper job of cleaning up stale, or failed, allocation message queues. Fixes resulting online fsck assert after finding stale AllocMsgQ.

Add (undocumented) --notsm/--tsm switch to permit skipping archiving of file contents for debugging/testing only. In tsbackup33 the switch is passed in as -k.

If the file system contains unbalanced big files, there is a small chance of file corruption after mmdeldisk is run. Fixed by adjusting PIT code.

Fixed problem in dm_set_dmattr() function so that attributes sharing the same first several bytes are set correctly.

The synched disk address in a hyper-allocated file now matches the indirect block allocation address when allocation fails.

Fsck now prints verbose information about the range of regions and the storage pool it scans for each pass.

Online fsck handles cached bad disk addresses without causing any SIGFPE.

Hook page fault handler when accessing user data. Fixes fatal page fault at kxWaitCondvar+0xf8.

Corrected mmapplypolicy failing with "too many files open".

Reinitializes ea limit before needing to adjust it after remount of the file system.

Quota check operation prints appropriate error messages when conflicting programs are running.

Cleanup allocation message queues properly during a failed fsck operation as a result of stripe group panic. Subsequent online fsck will not assert checking for NULL allocation message queues during initialization.

Adds nfsd CAP_DAC_READ_SEARCH and CAP_DAC_OVERRIDE capabilities instead of setting fsuid/fsgid, thereby stopping permission-denied errors when nfsd rebuilds dentry trees on kernels 2.6.27 (or later).

Use mmapplypolicy ... -g -N ... to preempt disk space issues.

`kill -SIGINT ...` has been supported in all previous code releases. This update brings tsapolicy into compliance with the de facto standard for handling SIGTERM.

Fixes assert when a Windows node has to create lost+found during mmfsck.

Improve takeover time when using tiebreaker disks in certain cases.

Fix Signal 11 in QuotaMgr::Phase2OnlineQuotacheck caused by nDests not being initialized.

Disallow immutable flag to be changed on snapshot files.

Turn on the CXIUP_NOWAIT flag when we know that it is safe to use igrab().

Correct PIT RPC communication.

Fix of internal dmapi attr name comparison routine so that it can compare the string with its true length.

Create directory call now always passes in a valid name.

fsck code now does not assert trying to look into invalid disk addresses as a result of race with flush buffer operation.

Retry deadlocks on rlMutex when called from RecLockReset to cleanup advisory locks.

mmapplypolicy is scripted and examines the final command exit code ($?), distinguishing skips from errors.

Quick response to interruption of command mmlssnapshot.

mergeshadow function to emulate diff better thereby correctly specifying modified files needing to be backed up.

Extract interface name for networking configuration file on SLES11.

Fix assert(ofP->inodeLk.get_lock_state()).

Fix a problem in removing empty quota entries by online quota check.

CreateReservedFiles checks the number of blocks to be written to inode file before starting threads.

Change the way the running command lock is obtained.

Fix GPL build problem on BG for IIA. BG has the same kernel version as SLES10SP1, but it does not ship the new definition file of relayfs (relay.h) in the kernel source. Loosen the version check to use the old definition file (relayfs_fs.h) for this kernel version (2.6.16.46). This is also OK for SLES10SP1 since it ships both old and new definition files.

Solved rare race condition which may lead to a kernel crash for 64bit AIX boxes when uninstalling GPFS directly when it is still running.

Fix variable initialization that could cause "mmcheckquota -a" to terminate.

Check socket connection between command client node and fsmgr node.

Fix a long running/hang mmrestoreconfig command when running in a CWD which contains many files/subdirs.

Use new memory to pass parameter to new thread for memory reuse.

Disallow replication factor change on snapshots.

Avoid a crash in mmdeldisk for certain filesystem blocksizes and snapshots present.

Rename GPFS device names and remove any external reference to the string 'StripeGroups'.

Ensure that forked processes are reaped.

Policy aputil filename handling.

Fixed the code to handle E_DAEMON_DEATH situation in gpfsWrite.

Detect missing definition for mmlsfileset utility and define it if needed. Earlier version of globfuncs do not have mmlsfileset utility defined.

Check for out of memory condition in token revoke handler.

Define I_LOCK if it is not already defined in Linux.

The diff-replacement code in mmcmi did not handle the difference between 3.2 style and 3.3 style shadow file lines smoothly; it called out the file in the snapshot as needing expiration. Use the original diff code in the case of a 3.2 style file system backup on 3.3 code.

Cleanup allocation message queues properly during a failed offline fsck operation as a result of stripe group panic. Subsequent offline fsck will not assert checking for NULL allocation message queues during initialization.

Fix dmapi attribute name comparison routine to take the short length into account.

fsck does not assert trying to look into invalid disk addresses as a result of race with flush buffer operation.

Increase lower limit of tracedevbuffersize from 4k to 1m.

On Windows, some temporary files starting with the name /var/mmfs/tmp/popen.* may not be removed if GPFS shuts down abnormally. These temporary files are now cleaned up the next time GPFS starts. nsdperf source and README files added to the Windows installation package.

On Windows, the sample program nsdperf is shipped as an executable (nsdperf.exe). The README files and source code for this sample are now also included on Windows installations.

Update kernel code licensing info to reflect Dual BSD/GPL license.

Fix failure due to expel command being run during disk election in tiebreakerdisk cluster.

This update addresses the following APARs: IZ70721 IZ75258 IZ76614 IZ76615 IZ76798 IZ76810 IZ76834 IZ76837 IZ76939 IZ75549.

Problems fixed in GPFS 3.3.0.6 [June 02, 2010]

Use the disk availability information from the daemon for the mmlsdisk -m/-M options.

Reject the mmmount request if the drive letter is in use.

Fixed a sig#11 problem during sg disk table update.

If there is mount failure to GPFS file system and you can only find "No child processes" message in mmfslog, apply this fix and you will see the real reason for the mount failure. This problem only affects Linux.

Add a method to the indirect block iterator to ignore the last change count so as to step to the next block when the current one is deleted. Use the new method instead of bumping down the global list change count.

Fix for a directory with FGDL enabled: when the mnode token is being revoked but not the inode token, and a thread holds the openfile, the CTF_FINE_GRAIN_DIR_MNODE flag did not get reset, which may trigger an assertion the next time the node tries to become metanode.

Improve performance of inode allocation when running low/out of free inode.

Fix for a rare race condition on Windows which may result in conflicting auto-generated SID mappings.

Set Shared.processP.pid to -3 when mmfsd is killed by SIGKILL (kill -9) on AIX 64-bit. Otherwise, client commands like tsctl or mmfsadm still think the daemon is alive and wait for the 5-minute timeout before exiting.

Fix rare race condition between lease thread and healthcheck thread, resulting in false lease thread stuck condition.

Fix race condition that could occur when an active NSD server is also running a workload that uses the NSDs served by that server, and GPFS is being shut down on that node.

Corrected a problem where concurrent threads running mmunlinkfileset and performing asynchronous recovery's SFSDoDeferredDeletions caused a file's open instance count to go negative.

Initialize dirLockNeed variable to avoid unknown id error during trace formatting.

Ensures that false compare mismatch errors are not reported and a relevant assert is not triggered when a compare operation is done on an inode with a bad file size.

Fix so that a failure to read an inode 0 file for any of the snapshots aborts fsck operation.

Change return code from E_PERM to E_INVAL for mmchfs.

When a panic causes an EAGAIN to be turned into ESTALE, a call to cxiFcntlUnblock must be made to clean-up the fl_block list. locks_free_lock BUG(fl_block) call on ESTALE return from a fcntl lock.

Fix mmrestoreconfig to correctly handle filesystem containing no fileset.

Fix mmfsck so that it better handles a badly damaged inode whose replica count has gone bad.

Fix a fileset restore for linked filesets whose config was backed up with mmbackupconfig.

Fix SLES 11 automount.

Allow mmchnode --cnfs-enable to accept trailing spaces in network config file.

Fix mmbackup processing of -s and -g switch arguments.

Fixed dmapi event timeout handler to correctly broadcast message to waiting threads.

Fix mmbackup to limit memory consumed in sort program by using --buffer-size=5% on Linux and by using -T switch on all platforms.

Fix incorrect error message after last CNFS node is deleted.

Fix the code which caused GPFS daemon to assert after filesystem panic on FS manager node.

Inherit ACL entries based on filemode (should be the default ACL mode).

Get a stronger lock when prefetching inodes (rf vs ro). Fix assert DE_IS_FREE(fP).

The mmauth command, which supports multi-cluster configurations, stopped working on Windows platforms beginning with GPFS 3.3.0.3. The problem was due to incompatibilities between the OpenSSL library and the WinSock library (which was new in this release). This issue is now resolved. GPFS for Windows uses a custom built OpenSSL library compatible with the WinSock library.

Correct a problem when running mmmount all_remote for a Windows node.

Enforce stricter device naming on Windows cluster.

Allow change directio flag for immutable files.

Fix a GPFS deadlock that occurs on Linux, under high load conditions, with memory pressure and memory mapped files.

A race condition in the Windows POSIX subsystem (SUA) makes it possible for process fork operations to hang. This problem can hang GPFS and require a restart. To avoid this problem, all fork/exec operations in the GPFS daemon, which are used to start a child process, have been replaced with native Windows APIs.

There are now long form option names for each command option, at least within the tsapolicy C program.

Fix code to remove stale object when deleting snapshot.

Fix mmdeldisk syntax error message.

Fix code that could cause assert after node fail while running fsck.

Change the type of parameter 'ino' to InodeNumber in cxiFillDir_t. If INODE64_PREP is defined, InodeNumber is Int64; otherwise it is Int32.

Fix code which caused an assert during filesystem manager takeover after manager node failed.

Fix problem with tsmigrated migrating to the new clmgr node when the clmgr node changes.

Fix module build errors with Linux kernel version 2.6.33.

Fix mmfileid command to scan user data for disk address check.

Fix a typo in the routine that retrieves node information causing mmsnmpagentd to terminate occasionally.

Fix for a rare race condition during disk address lookup of a newly allocated address under heavy load.

Install a page fault handler when user data is copied by kxReleaseMutex. Longwaiters and Oops:kxReleaseMutex on ppc64.

Fix an unexpected remote copy error from mmrestoreconfig command.

Ensure the mmsdrserv process is not killed if it uses its own separate TCP port.

Fix an extremely rare case where higher-level indirect blocks were not being flushed when they were supposed to be.

Add input validation for xattr value size.

Change mmfileid to find "invalid" disk addresses when using the :BROKEN keyword.

Fix excessive prefetching IO for random NFS read workload on large files after installing 3.3.0.5.

Ensures all locks acquired during the lock file operation are released when the operation fails. This also removes the need for an explicit file lock release, failing which the code would assert.

Correct a problem when resetting a config parameter to its default value for a subset of the nodes.

Fix code to correctly respond to returning error from PIT parent node.

NFS client gets "permission denied" when "subdir/.." is looked-up internally.

Fix mmexectsmcmd to tolerate error return codes such as 4 and 8 and interpret them as non-fatal.
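The idea of tolerating selected return codes can be sketched as a small classification step. This is an illustration, not the mmexectsmcmd logic: the name classify_return_code is hypothetical, and the set below contains only the two codes the note mentions (the note says "such as", so the real list may be longer).

```python
TOLERATED_CODES = {4, 8}  # taken from the note above; possibly incomplete


def classify_return_code(code):
    """Map a command return code to 'ok', 'warning' (non-fatal) or 'fatal'."""
    if code == 0:
        return "ok"
    if code in TOLERATED_CODES:
        return "warning"
    return "fatal"


for rc in (0, 4, 8, 12):
    print(rc, classify_return_code(rc))
```

A caller would continue the backup run on "ok" and "warning" and stop only on "fatal".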

Avoid GPL compiling warning. Use void* instead of struct inode* in cxi file and cast it in OS specific file.

Fix for a rare deadlock during recovery on Windows.

Avoid chance of deadlock when updating shared directory under high load.

Rework handling automount on RHEL 5 or SLES 11.

On AIX disable the filter process, and use the trace control file to format the trace instead of the merged ones.

Allow mmgetstate -s to continue even if there is an inaccessible node in the cluster.

Fix for a rare assert during multi-node create/delete races.

Improve handling of PIT worker nodes starting RPC.

Fix problem in fully replicated filesystem which will not mount if all the disks in one FG are stopped and suspended.

Fix mmwinserv to prevent a possible hang on Windows nodes during mmstartup.

Verify interface is up (IFF_UP) before processing it. Interfaces brought down using "ifconfig down" (unlike "ifdown ") are returned with SIOCGIFCONF.

Check whether mmfs.log.previous file exists before renaming it.

Fix a rare assert caused by RelinquishAttrByteRange thread.

Fix EA limit calculation.

Fix code to improve cache handling.

Close a timing window to eliminate an assert which can happen under heavy load when disks are being quiesced.

Fix a deadlock involving read-write mmap under heavy stress.

Fixed a Windows node failure that occurred when clusters were configured to provide SNMP events.

Fix a race condition between mmnfsdown and mmnfsup so that mmnfsdown can kill all nfsmonitor processes.

Fix code to correct the order of initializing state of PIT nodes.

Fix problem where restripe/chdisk/rpldisk commands return error 'Invalid Argument' while processing user file.

Avoid a rare failure accessing a directory long after concurrent updates.

Fix array out of bound problem in eaRegistry dump function.

Prevent gpfs_iwritedir api from asking to open inode that are fs metadata inodes.

Fix assert caused by rare CPU cache inconsistency situation on X86_64 hardware.

Fix GPL build problem on BG.

This update addresses the following APARs: IZ73346 IZ74517 IZ74539 IZ74542 IZ74544 IZ74547 IZ74549 IZ74550 IZ75250 IZ75252 IZ75259.

Problems fixed in GPFS 3.3.0.5 [April 1, 2010]

Fix problem where metadata-update intensive workloads (e.g., file creates and deletes) running on systems with large inode cache (large maxFilesToCache) would periodically pause for several seconds.

Fix spurious EIO errors accessing hidden .snapshots directories enabled via "mmsnapdir -a".

Allow case insensitive node identifier in specfile.

Fix fsck code so that it doesn't report the corrupt addresses problem that it claimed to have fixed during the previous fsck run.

Improve Linux trace performance.

Fix mmtrace to return non-zero value and report error when lxtrace binary for current kernel is not installed.

Fix performance problems when reading large files from NFS clients.

Correct problems with mmaddcallback -N clustermanager & mmlscallback.

Fix missing out-of-memory check in get inode routine.

Improve performance of large file create when DIO is used.

Fix buffer calculation in dm_get_events when buffer size is greater than 64K.

Fix potential loss of events in dm_get_event() call when buffer size is greater than 64K.

Notify dm_get_events that the session already failed after quorum lost in the cluster.

Fixed potential assert when writing small files via NFS under heavy load.

Fixed hardlink assertion problem when upgrading file system from 2.3 to 3.3 or later.

Remove stat() calls in the mmshutdown path.

Fixed an allocation loop which could occur during mount and rebalance of a filesystem.

Fix mmbackup to divide up the list of files that need backup based on filesize when numberOfProcessesPerClient=2 (or more).

A progress indicator is added in the case of mmchfs -F if it leads to the expansion of preallocated inodes.

Fix a race condition between deldisk and deallocation of surplus indirect blocks that could result in dangling block pointers.

Fix disklease inconsistencies between cluster manager resetting lastLeaseProcessed and the client resetting lastLeaseReplyReceived.

When token_revoke results in a downgrade for a device file, call invalidate so that device-specific cleanup occurs.

Fix allocation code which can cause "No space left on device" error on initial mount after filesystem creation.

Fixed a file sharing check that was causing an incorrect "access denied" error.

Restart mmsdrserv after installing new code on Windows.

Remove redundant preMount and Mount user callback events.

Stop GPFS trace automatically when upgrading to GPFS 3.3 on Linux. Added more detailed error messages when GPFS kernel extensions cannot be unloaded.

Fix a quota problem that fails to translate invalid fileset ids.

Fix initial run of mmbackup recall in UTC+ timezone to avoid recall of unchanged files that are already on the TSM server.

Fix assert s_magic == GPFS_SUPER_MAGIC on kernel 2.6.16.60-0.59.1 and above.

Fixed allocation code which caused an assert during daemon shutdown.

Load policy file on sgmgr when file system is mounted so that low space threshold always set when file system is mounted.

Return EMEDIUMTYPE rather than ELNRNG for incompatible format errors on Linux.

Cleanup flag beingRestriped if inode is deleted while restriping.

Fix an assert in kxCommonReclock on AIX node.

Error conditions returned due to failed metadata flush operation are handled appropriately preventing the restripe operation from asserting due to failed checks.

Fix a problem in quota file creation when file system pool has metadataOnly disks.

Fix to avoid holding mutex twice while revoking token encounters SGPanic.

Fixed a rare race condition during Windows GPFS initialization that could cause a system fault.

Fix code to avoid unreasonable checking for socket.

Fix exception using a spin_lock in fasync_helper during fcntl revoke.

Fix server side token issue in failure cases.

Ensure that online fsck is not held forever trying to steal buffers from an inode range that is currently locked for online fsck.

Fixed the parallel inode traversal code which can cause signal 11 during restripe and replace disk.

Fix code to remove unnecessary assertion when a token is revoked while a large file is being restriped.

Fix problem where in a file system with large snapshots a failure of the file system manager during the first phase of an mmrestripefs or mmdeldisk command could under certain timing conditions cause corruption.

Fix problem where filesystems created by GPFS release 2.3 or older were not mountable by GPFS release 3.2 or 3.3.

Fix for a very rare race condition where a non-DIO read from a cached buffer may transiently return partially incorrect data.

Fixes issues in minorityQuorum clusters that have leaseDuration set and have migrated from 2.3 to 3.2.

Fix to tolerate an inconsistent state of Windows security settings on an inode following a failed TSM restore.

This update addresses the following APARs: IZ68715 IZ68725 IZ69476 IZ70073 IZ70074 IZ70396 IZ70409 IZ70599.

Problems fixed in GPFS 3.3.0.4 [January 28, 2010]

Fix problem where mmrestripe command might not correctly detect I/O errors during the first phase of restripe.

Correct the resetting of config parameters to default on a subset of the nodes.

Replace usage of the lsvg command with getlvodm.

Fix function checkIntRange error message when checking negative numbers.

Clear the tiebreaker disk parameter after mmexportfs all.

Fix ioctl opcode conflict with FIGETBSZ on Linux kernel 2.6.31 and later.

Fix fsck to avoid incorrectly reporting and fixing of filesystem corruptions in a heterogeneous cluster.

If the file system is internally forced to unmount (file system panic), invoke the preunmount user exit if one is installed.

Avoid confusion when using a local fcntl lock versus an NLM one.

Give customers using mmbackup more flexibility by allowing alternate install location for TSM.

Fix determining filename length when filename contains invalid UTF8 characters.

Fix data corruption when using mmap.

Fix assert due to invalid fcntl acquire sleep element found on the kernel queue.

Keep FS descriptors off of excluded disks even if they come online.

Fixed a race condition by serializing the xattr object in inode properly.

Fix hang between node failure thread and events exporter request handler thread.

Fix mmapplypolicy to estimate correctly the number of GPFS storage pool bytes freed by migrating to an external/HSM pool. Introduce MM_POLICY_MIGRATION_STUBSIZE environment variable to allow users to directly control size for migration.

Fix mmbackup to avoid giving file name length and file size to TSM for inclusion in backup list.

Fix async recovery to let mounts succeed while also processing deferred deletions.

Fix assert failure on FS manager node when unmountOnDiskFailure=yes and a disk fails after 3.2.1.14-16 installed.

Prevent HSM and NFS from asking to open inodes that are system metadata nodes.

Do not let socket get stuck in reconn_cleanup state following repeated breaks that occur just after connection handshake completes.

Reduce the pagepool usage by inode allocation segments during FS manager initialization or recovery.

Fix a problem with cutting traces in a CNFS setup.

Fix filesystem panic when a failed disk holds a FS descriptor and returns unexpected error codes.

Fix problem with mmlsfileset when expanding inodes is running concurrently.

Ignore un-supported permission flags passed to gpfs_i_permission on SLES11.

Fix for a SIGSEGV on Windows caused by a race in accessing the ACL file.

Fix a condition where mm commands can exit with errors if CWD is unavailable.

Fix for a rare failed assert in the main process thread on Linux.

Fix a race condition where node may be deleted right after it started up.

Fix code to correct backward compatibility of non-blocking token request between gpfs 3.2 and gpfs 3.3.

Make trace recycle timeout message more descriptive and avoid recycle file being overwritten when trace recycles next time (Linux nodes only).

A subsequent tscrfs command would unexpectedly unset some flags even if it could not get permission to run, causing a daemon assert. Flags are now cleared only if the command itself set them.

When open of the directory fails and not all fields are set, do not call back into GPFS to do close (release). This may cause an invalid assert due to attempting to reference uninitialized fields.

Fix signal 11 due to bad RDMA index and cookie received from the TcpConn in verbs::verbsClient_i.

Fix remote startup on Windows.

Fix a race condition between an mmexpelnode and mmchmgr.

Correctly cleanup tmp files on remote nodes.

Fix a problem in mmdf where number of free inodes may become negative.

Fix race condition that occurs due to disk failure during clmgr election while using tiebreaker disks.

Fixed inode expansion code which can cause restripe to fail with an assert. This problem only happens when restripe and inode expansion run concurrently.

Several sample script and configuration files are now included with the GPFS for Windows installation. These can be found in %SystemRoot%\SUA\usr\lpp\mmfs\samples. Only the files appropriate for use on Windows are included; additional samples are available with UNIX installations.

Fix assert "offset mappedLen" when reading dirs.

Fix allocation manager problem that caused pool to not be deleted when it should have been.

Initialize allocSize variable during the initialization phase of file repair to prevent assert.

Fix a rare bug that occurs during nsd config change along with earlier disk issues to another deleted nsd.

Fixed a GPFS on Windows failure that can occur on systems with a large number of cores (e.g. 8 or more) running a workload with thousands of threads. When this error occurs, /var/adm/ras/mmfs.log.* shows "logAssertFailed: tid >= 0 && tid <= MAX_GPFS_KERNEL_TID". The fix for this problem removes any assumption on the maximum thread ID.

Fix a problem that can lead to loss of an intermediate SSL key file.

Fix mmbackup to accurately reflect the error encountered on the TSM server.

Fix a problem with interpreting the syncnfs mount option.

Fix fsck so that it reports duplicate fragments and their count correctly, and prevent a possible fsck crash due to count overflow.

Added %myNode as a callback parameter.

Fix an assertion during mount that could happen when quota management is enabled and snapshot is being used.

Fix fsck so that it detects problems and fixes them without encountering struct assert errors even if the 'assertOnStructureError' config option is turned on.

This update addresses the following APARs: IZ67659 IZ67660 IZ67661 IZ67662 IZ67663 IZ67664 IZ67665 IZ67666 IZ67667 IZ67723 IZ67746 IZ68028.

Problems fixed in GPFS 3.3.0.3 [December 10, 2009]

On x86_64 Linux, when a special encoding flag is set in a function's debugging frame section, an extra offset must be added during decoding. Otherwise, the thread traceback cannot be decoded correctly.

Fix Stripe group configuration change so data block loss cannot occur if data is being ingested along with configuration changes.

On Windows, consolidated the separate -msi and -sh installation logs into a single file (gpfs-install.2009.10.20.11.23.58.gpfs-n70-win.log). Also eliminated the Command window popups that appeared during GPFS installation.

Fix problem where the new inode scan function gpfs_next_inode64() would return incorrect values for some gpfs_iattr64_t fields on file systems originally created prior to GPFS 2.3.

Change to the GPFS inode scan API for gpfs_ireaddir64. The directory entry structure returned by the call now contains a flag field and allows directory entry names to be 1020 bytes (i.e., 255 characters encoded in UTF-16 or other Unicode encodings).
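The 1020-byte figure is consistent with 255 characters at up to 4 bytes each. The following Python snippet is only an arithmetic illustration, not part of the GPFS API; the sample names and constants are made up for the example.

```python
# Illustration only: worst-case byte length of a 255-character name.
MAX_NAME_CHARS = 255
MAX_NAME_BYTES = 1020  # limit quoted in the fix description

# A character outside the Basic Multilingual Plane takes 4 bytes in
# both UTF-8 and UTF-16 (where it is stored as a surrogate pair).
wide_name = "\U0001F600" * MAX_NAME_CHARS
ascii_name = "a" * MAX_NAME_CHARS

wide_utf16_len = len(wide_name.encode("utf-16-le"))
wide_utf8_len = len(wide_name.encode("utf-8"))
ascii_utf16_len = len(ascii_name.encode("utf-16-le"))

# The widest 255-character name exactly fills the 1020-byte limit.
fits = wide_utf16_len <= MAX_NAME_BYTES and wide_utf8_len <= MAX_NAME_BYTES
```

A name made only of ASCII characters needs far less space, so the larger limit matters mainly for names containing characters that encode to 3 or 4 bytes.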

Fix policy handling of rules of the form "EXCLUDE FROM POOL" to prevent LOW_SPACE events from incorrectly being logged.

Extended Attributes (EAs) on Windows have been changed so that internally they are stored with a "user." prefix. This change supports compatibility with Linux and improves file system security.

Fix problem on systems configured with large maxFilesToCache that could cause file systems to be unmounted on some client nodes when running recovery after a manager node failure.

Fixed GPFS hang when recalling files from HSM.

When running lsvg do not wait for the volume group lock.

Fix mmbackup to give users early warning and exit when unlinked filesets are present during backup. It also prevents further processing of files that would otherwise give the user misleading error information.

Limit the number of attempts made to destroy nfsd threads in mmnfsquorumloss in case an nfsd thread is stuck waiting for IO to complete in GPFS.
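The idea of bounding the number of attempts can be sketched as follows. This is a generic Python illustration under stated assumptions, not the logic of the mmnfsquorumloss script; the function name, the default of 5 attempts, and the callback are all hypothetical.

```python
import time

def stop_with_bounded_retries(try_stop, max_attempts=5, delay_seconds=0.0):
    """Call try_stop() until it returns True or max_attempts is reached.

    Returns (attempts_made, succeeded). try_stop is a hypothetical
    callback standing in for "try to terminate the stuck thread".
    """
    for attempt in range(1, max_attempts + 1):
        if try_stop():
            return attempt, True
        # Pause between attempts so a slow operation has time to finish.
        time.sleep(delay_seconds)
    # Give up instead of looping forever on a thread that never exits.
    return max_attempts, False

# A target that can never be stopped no longer blocks the caller.
never_stops = lambda: False
result_stuck = stop_with_bounded_retries(never_stops, max_attempts=3)

# A target that stops on the second attempt ends the loop early.
outcomes = iter([False, True])
result_ok = stop_with_bounded_retries(lambda: next(outcomes), max_attempts=3)
```

The design point is that the caller always regains control after a fixed number of attempts, even if the thread remains stuck waiting for I/O.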

Don't stop NFS or unexport fs on quorum loss, and kill NFSDs that are stuck during setNfsdProcs.

Fix mmdelnode syntax error checking.

Fixed the allocation code which caused a loop during metadata allocation. This problem only affects file systems with metadata replication enabled.

Fix mmbackup incremental to handle conversion from short filename records to new longer records after upgrade to 3.2.1.14 or later.

Fix Linux "mmnfsinit start" command and return correct return code.

Fix problem where a multi-threaded workload reading extended attributes from a large number of files could cause accumulation of a large number of byte range tokens leading to slowdown and spurious ENOMEM errors.

Upon mmfsd daemon failure, change the way of debug data collection to asynchronous script execution.

Fix a quota initialization problem that could allow quota files to be placed in other storage pools when they belong in the system pool.

Fix for a rare race condition that may cause an assert in the invalid fileset object disposal path.

Added new command line option "--onerror" to the mmaddcallback command.

Enable mmapplypolicy on fs with 8MB blocksize.

Correct processing to prevent quota requests from being performed while the quota manager operations are being quiesced.

Converted GPFS to use the Windows sockets library (WinSock) rather than the SUA library. This change fixed an issue with large data transfers that appeared when a file system's block size was larger than 256KB. The WinSock library also significantly reduces the CPU consumption required to perform network data transfers.

Fixed error handling when registering pagepool memory with InfiniBand.

Correct quota hard limit processing to check grace time.

Correct a rare problem (due to an error encountered writing quota files) that can prevent a newly created filesystem from being mounted.

Improve stability when encountering hostname resolve issues.

Prevent the callback command path from residing in a GPFS file system.

Fixed a problem which prevents file system remount after a forced unmount due to an error (e.g., file system panic or quorum loss).

Fix panic handler code to ensure that the right fsck cleanup code path is chosen by looking at the workerNode flag in the fsck data structure.

Correct processing during restoring filesets to allow more than the KSH array limit of 1024.

Fix quota manager cleanup when file system manager migrates to another node.

Fix a file structure error caused by SetAllocationSize.

Enable kdump to retrieve kernel thread's backtrace on IA64.

Handle IB port event of LID change.

Correct a problem when verifying that the daemon is down from a Windows node.

Resolved an issue with mmwinservctl where the command would fail to set the account name and password. This error would occur if Windows is not installed in the same location on all nodes in the cluster (e.g., some nodes have Windows installed on the C: drive and others have it installed on the D: drive).

Fix to avoid assertion when calculating the next valid data block number of low level file.

Removed some commands and programs that were included in the Windows installation, but not supported on Windows.

Parse the 'ro' mount option and pass it explicitly to gpfsMount to prevent Windows write on readonly filesystem.

Fix mmbackupconfig in Windows to give the customers the correct mmbackupconfig behavior and exit gracefully.

Fix mmapplypolicy to run on a large number of files with the -g and -N flags.

Make filesystem restore messages for mmrestoreconfig more descriptive.

Fix mmbackup to backup snapshots that are older than the latest filesystem backup.

Fix mmbackup to ensure that the only desired TSM servers will be processed.

Package mmbackup32 for use with 3.2 clients.

Add support for -N nodeList option in mmbackup version 3.1 and 3.2.

Fix possible deadlock restriping a file system with data replication enabled under application load and with small pagepool.

Warning messages about conflicting operations are now sent to stderr to avoid cluttering stdout.

Resolved an issue that in rare cases could cause GPFS to terminate when tracing is enabled.

Fixed some mmwinservctl operations which were causing GPFS to start inadvertently when GPFS was configured with autoload=yes.

Fix mmchconfig trace command to load kernel extensions if not already loaded.

Increase the default maximum size of the shared segment on 64-bit AIX to 1G (32-bit AIX is architecturally limited to 256M).

This update addresses the following APARs: IZ63333 IZ65179 IZ65379 IZ65416.

Problems fixed in GPFS 3.3.0.2 [November 12, 2009]

On Windows, mmtracectl no longer requires ActiveState Python in order to collect and format traces.

During a stripe group mount, ignore the 'exclDisks' mount option for a filesystem in a remote cluster.

Fix direct I/O path to set the Windows archive bit only for writes, not reads.

Add more trace information and ifdef for problems with Linux NFS fast lookup.

Ensure mmexportfs does not remove tiebreaker disks unless device name is "all".

Fix kernel exception in fifo_open due to an invalid i_pipe pointer.

Increase the maximum allowed sum of prefetchThreads+worker1Threads+nsdMaxWorkerThreads to 1500 on 64-bit AIX systems.

Added two new calls to the inode scan api to allow dmapi attributes to be backed up and restored for disaster recovery. The dmapi attributes are also saved in a snapshot of the original file on which they were set.

Collect the output of systeminfo.exe command instead of the command itself.

Fix trace record missing problem. The _STrace function should pass non-blocking flag as 0 instead of 1 to rl_trc_write.

Improve SMP scalability in the DIO code path.

Fix small window where message send will hang if destination list includes the local node and all other nodes reply before the local send can start.

Prevent force unmounts when disks in different failure groups (FGs) fail, but are in different pools. Prevent marking disks down in multiple FGs when disks die simultaneously.

Allow command line case insensitive hostname.

Add a new file system option --filesetdf.

Fixed assert when deleting files from a dmapi enabled filesystem.

DMAPI locks are now ignored by lock file operations during offline fsck, because offline fsck does not have any DMAPI context. This prevents offline fsck from crashing the daemon while fixing orphans in a DMAPI-enabled file system.

Fix mmapplypolicy to accept a policy that includes a 3-value THRESHOLD(hi,lo,pre) clause in a migrate to external pool rule.

Ensure mmfsctl syncFSconfig does not affect free disks unless device name is "all".

Ensure mmaddnode fails if the IP address already appears in the cluster.

Fix node crash from destroy event when accessing stalled stripe group.

Correct mmcrnsd disk sector size (>2T) error on 32bit Linux.

Fix mmpolicyExec-hsm.sample to handle characters \ " and ' in filenames properly so that they work in HSM file list.
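File names containing a backslash, a double quote, or a single quote are easy to mangle when they pass through a shell or a line-oriented list. The sketch below shows one general way to quote such names safely, using POSIX shell quoting from the Python standard library. It is an illustration of the problem class only; it is not the code in mmpolicyExec-hsm.sample, and the HSM file list format may use different escaping rules. The sample file name is made up.

```python
import shlex

# Hypothetical file name containing all three troublesome characters.
tricky_name = 'report\\2009 "final" o\'brien.txt'

# shlex.quote wraps the name so a POSIX shell treats it as one word.
quoted = shlex.quote(tricky_name)

# shlex.split parses the quoted form back, as a POSIX shell would.
round_trip = shlex.split(quoted)
```

If the quoting is correct, parsing the quoted form yields exactly one word that is identical to the original name.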

Fix deadlock during FS manager takeover if previous FS manager and its disks (site failure) fail at the same time.

Fix DMAPI enabled filesystems when they are mounted on top of another GPFS filesystem.

Fixed performance problem while migrating files.

To avoid 32-bit integer overflow in case of huge sparse file, argument dataBlockNum in repairDataBlock is changed to Int64 instead of int.
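The overflow can be illustrated with a short Python sketch that mimics fixed-width integers through ctypes. This is not GPFS source code: the 256 KiB block size and the helper names are assumptions chosen only for the example.

```python
import ctypes

BLOCK_SIZE = 256 * 1024  # assumed block size (256 KiB), illustration only

def block_num_as_int32(offset):
    """Block index as it would appear when stored in a signed 32-bit int."""
    return ctypes.c_int32(offset // BLOCK_SIZE).value

def block_num_as_int64(offset):
    """Block index as it would appear when stored in a signed 64-bit int."""
    return ctypes.c_int64(offset // BLOCK_SIZE).value

# A sparse file can have a very large logical size while using little
# disk space. This is the first offset whose block index does not fit
# in 31 bits.
huge_offset = (2 ** 31) * BLOCK_SIZE

wrapped = block_num_as_int32(huge_offset)  # wraps to a negative value
correct = block_num_as_int64(huge_offset)  # holds the true block index
```

With a 32-bit signed variable the block index wraps around to a negative number, which is the kind of value a 64-bit type avoids.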

Modified GPFS installation on Windows to prevent GPFS from being started after a package update, but before the mmwinserv service is configured. This problem can result in permission errors when running GPFS administrative operations.

Fixed a problem in GPL layer makefile so that no warning messages will appear when upgrading from 3.3 GA code on Linux platform when GPL layer has never been compiled before.

Acquire the stripe group descriptor mutex before changing the quota files inode information in the stripe group descriptor.

On Windows, error messages related to the mmwinserv service were improved to provide clearer indication of the problem and how it might be corrected.

Fixed the allocation code which caused an infinite loop when running out of full metadata block.

Fix for a rare race condition that may result in a "Busy inodes after unmount" syslog message on Linux.

Fsck generated false positives for bad replication status in user metadata files after a failed PIT operation. The fix ensures that fsck no longer generates these false positives.

When fixing orphans, fsck now prints the fileset name from which the orphan is generated.

Fix mmbackupconfig to present a clearer error message when users attempt to run the command while the file system is mounted.

Fix mmapplypolicy command to indicate it is making progress.

Hold the stripe group descriptor mutex only while actually accessing/updating the stripe group descriptor when updating the quota file information in the descriptor.

Avoid crash in rare cases of concurrent multi-node file creates.

Prevent customer files that have been backed up to the TSM server from expiring in the same session.

Enabled global cluster wide events from remote cluster in user exit callbacks.

Fix "mmtrace noformat" to work on linux nodes.

Correctly propagate authorization key files to new node in admin central cluster.

Fixed repair code that could cause snapshot file corruption. The corruption could occur when the file system contained filesets and snapshots, and deleting a disk left too few failure groups for proper replication of metadata.

Clarify mmwinserv error messages.

Fix rare deadlock that occurs during recovery of large filesystem.

Fixed GPFS trace control on Windows which in some scenarios was not restarting trace collection correctly.

Fix backward compatibility problem that caused file creates to fail on file systems that were originally created with GPFS version 2.2 or earlier.

This update addresses the following APARs: IZ63058 IZ63080 IZ63171 IZ63307 IZ63308 IZ63320.

Document Information

Modified date:
12 November 2012

UID

isg400001374
