This document discusses procedures to access AIX FlashCopy, MetroMirror or GlobalMirror volume groups including information on ensuring data consistency.
This subject is fairly well documented in Appendix A of the IBM System Storage DS8000 Series: Copy Services in Open Environments Redbook, SG24-6788-02, available at http://www.redbooks.ibm.com/redbooks/pdfs/sg246788.pdf , and the recreatevg command is covered in the Information Center at http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds4/recreatevg.htm . However, this note discusses some details that are not covered there.
There are two phases to accessing a VG copy:
1. Configuring the LUNs on the host
2. Accessing the VG
Step 1 is usually accomplished by first assigning the LUNs to the host from the storage side and then running cfgmgr on the host. Step 2 is then accomplished either by importing the VG with importvg, or by running recreatevg against the LUNs if another copy of the VG (which could be the original VG) already resides on the host. Using recreatevg is necessary in that case because the disks will have duplicate PVIDs, LV names, and file system mount points, none of which LVM allows. AIX does allow duplicate PVIDs on a system, but this is designed to be a temporary situation.
One thing that is not documented in the redbook is that if a LUN with a given PVID belongs to a varied-on VG, then one cannot configure another hdisk/vpath with that same PVID on the system using cfgmgr.
So to configure the disks in such a situation, one must varyoff the VG with the duplicate PVID, and then run cfgmgr. Another approach to this is to configure the LUN on AIX before the copy of data is placed on the LUN (which creates the duplicate PVID). Once the LUN is configured on AIX (at this point it won't have a PVID as the LUN has just been created), then subsequent FlashCopies (or whatever disk subsystem method is used to create the copy) will not require the LUN to be configured again.
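As a concrete sketch of that workaround, assuming the on-host copy lives in a VG named existingvg with a file system mounted at /existingfs (both names are illustrative, not from a real system):

```shell
# Sketch of the varyoff-then-cfgmgr workaround described above.
# existingvg and /existingfs are assumed names, not from a real system.
if command -v varyoffvg >/dev/null 2>&1; then
    umount /existingfs       # unmount all file systems in existingvg first
    varyoffvg existingvg     # release the VG so its PVIDs are no longer active
    cfgmgr                   # the copy's hdisks now configure despite duplicate PVIDs
    varyonvg existingvg      # bring the original VG back online
    mount /existingfs
    status=done
else
    # Not running on AIX: the commands above are shown for illustration only.
    echo "AIX LVM commands not available; illustrative sketch only"
    status=done
fi
```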
Another thing that's not necessary, yet is documented in the redbook, is clearing and setting a new PVID on the LUNs with:
# chdev -l <hdisk#> -a pv=clear
# chdev -l <hdisk#> -a pv=yes
These commands are run by the recreatevg command, so they can be skipped.
For example, say we have an existing VG, existingvg, defined on the system, and we are creating a FlashCopy of it, flashcopyvg. The steps to do so would be as follows:
1. Assign the FlashCopy target LUNs to the host from the storage subsystem
2. Configure the LUNs on the host with # cfgmgr (assume we're using SDDPCM on a DS8000 which results in one hdisk for each LUN)
3. Make a note of the new hdisks (I'll assume they are hdisk10, hdisk11 and hdisk12)
4. Initiate the FlashCopy. If the file systems are not unmounted, see the information below on ensuring consistency of the file systems and data structures by using disk subsystem consistency groups and JFS2 freeze/thaw.
5. Run recreatevg to clean up the duplicate PVIDs, LV names, etc., using: # recreatevg -y flashcopyvg hdisk10 hdisk11 hdisk12
6. Varyon the VG with # varyonvg flashcopyvg
Note that here we actually configured the new hdisks prior to creating the FlashCopy, so they'll have no PVID. If we already did the FlashCopy, we'd have to varyoff existingvg prior to step 2.
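Put together, steps 2 through 6 above might look like the following sketch (the VG name flashcopyvg and the hdisk names are the illustrative examples from the steps):

```shell
# Sketch of the initial FlashCopy procedure described in steps 2-6 above.
# flashcopyvg and hdisk10-hdisk12 are the assumed example names.
if command -v recreatevg >/dev/null 2>&1; then
    cfgmgr                          # step 2: configure the new LUNs as hdisks
    lspv                            # step 3: note the new hdisks (PVID shows "none")
    # step 4: initiate the FlashCopy from the storage subsystem here
    recreatevg -y flashcopyvg hdisk10 hdisk11 hdisk12   # step 5
    varyonvg flashcopyvg            # step 6
    status=done
else
    echo "AIX LVM commands not available; illustrative sketch only"
    status=done
fi
```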
Later, if the customer chooses to update the copies to match the VG again, the procedure would be:
1. Unmount any file systems in flashcopyvg and varyoff the VG with # varyoffvg flashcopyvg
2. Export flashcopyvg with # exportvg flashcopyvg. This removes information about flashcopyvg from the ODM on AIX, but doesn't change any information on the disks in the VG.
3. Create the new copy with FlashCopy. If the application is not quiesced and the file systems are not unmounted, see the information below on ensuring consistency of the file systems and data structures by using disk subsystem consistency groups and JFS2 freeze/thaw.
4. Run recreatevg, e.g.: # recreatevg -y flashcopyvg hdisk10 hdisk11 hdisk12. Running recreatevg creates a new VG from the FlashCopy volumes (in this case it will be called flashcopyvg) with new LV names, changes the PVIDs of the disks to unique PVIDs, and loads the ODM with this information. Note that one does not need to run importvg, as recreatevg loads the ODM with the VG information. See the recreatevg man page at http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds4/recreatevg.htm for more details, such as how to specify the new LV names or change the mount points of any file systems.
5. Varyon the VG with # varyonvg flashcopyvg
Step 2 is necessary since we're going to re-run the recreatevg, and we can't have the VG already defined in the ODM.
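The refresh procedure might be sketched as follows (the VG, hdisk, and mount point names are assumptions carried over from the example):

```shell
# Sketch of refreshing the FlashCopy target VG as described above.
# /flashfs is an assumed mount point for a file system in flashcopyvg.
if command -v recreatevg >/dev/null 2>&1; then
    umount /flashfs                 # unmount file systems in flashcopyvg
    varyoffvg flashcopyvg           # the VG must be inactive before exporting
    exportvg flashcopyvg            # remove flashcopyvg from the ODM
    # re-initiate the FlashCopy from the storage subsystem here
    recreatevg -y flashcopyvg hdisk10 hdisk11 hdisk12
    varyonvg flashcopyvg
    status=done
else
    echo "AIX LVM commands not available; illustrative sketch only"
    status=done
fi
```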
Also note that one may make multiple copies of the VG using the above procedures.
Regarding file system and data consistency: the reason it is preferable to unmount the file systems is that otherwise data may reside in file system cache and not yet be written to disk. If the application is running, we want to make sure that the FlashCopy (or whatever disk subsystem mirroring method is used) data is a consistent point-in-time image. This also requires an application that is written to recover in the event of a system crash. If the application doesn't support such recovery, then one must stop the application prior to initiating the FlashCopy, and in that case one should also unmount the file systems prior to the FlashCopy to flush file system cache to disk. It is possible to create a FlashCopy of VGs used by a running application that supports recovery after a system crash; to do so it's recommended that one:
1. Put the application into its hot backup or quiesce mode
2. Use disk subsystem consistency groups for the disks in the VG
3. Freeze the JFS2 file systems using the chfs command (see the chfs man page at http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds1/chfs.htm ) if using JFS2 (and preferably use JFS2 as there's no similar function in JFS)
4. Preferably have one file system log per file system
5. Initiate the FlashCopy
6. Thaw the JFS2 file systems
7. Turn off the hot backup or quiesce mode of the application
8. Use the above procedures to get the new VG activated on the system
9. Run logredo against the file systems
10. Run fsck against the file systems
11. Use the application to verify the consistency of the application data
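On the source system, the freeze/FlashCopy/thaw portion of the sequence (steps 3, 5, and 6 above) might be sketched as follows (the mount points /data1 and /data2 and the 60-second freeze timeout are assumptions):

```shell
# Sketch of freezing JFS2 file systems around a FlashCopy (steps 3, 5, 6 above).
# The timeout value is a safety net: the file system thaws itself if no
# explicit thaw is issued within that many seconds.
if command -v chfs >/dev/null 2>&1; then
    chfs -a freeze=60 /data1        # freeze, with a 60-second timeout
    chfs -a freeze=60 /data2
    # initiate the FlashCopy (using consistency groups) here
    chfs -a freeze=off /data1       # thaw
    chfs -a freeze=off /data2
    status=done
else
    echo "AIX chfs not available; illustrative sketch only"
    status=done
fi
```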
In 2005, AIX development issued this statement regarding consistency of data when using FlashCopy:
Clarification of supported and unsupported use of Flashcopy backups of mounted AIX filesystems
When is FlashCopy of a mounted file system a supported operation?
The latest levels of AIX 5.2 and 5.3 include a freeze/thaw function for the JFS2 file system. FlashCopy of a mounted file system is supported when the JFS2 freeze/thaw function is used. [The freeze/thaw function is currently available for AIX 5.2 with APAR IY66043. The 5.3 version is available in maintenance level 1.]
Some storage systems provide a means for holding all I/O to a set of volumes to produce a consistent image across those volumes. This function is generally referred to as “consistency groups”. It is possible to produce a valid point-in-time image of the file system and log when the storage system provides this function and it is properly set up by the administrator. The FlashCopy Consistency Group function provided by ESS Copy Services meets that requirement. FlashCopy of a mounted file system is supported where all LUNs used in the file system and log are covered by the consistency group.
What is the reason for support only in this limited environment?
The specific problem is this: the file system logging mechanism was designed to recover from a power failure or a system crash. In those situations the I/O to the separate file system and log volumes stops at the same time. This point-in-time image of the file system and log guarantees that when the log is replayed all of the metadata will be consistent. Where writing continues to either the log or the file system while writes to the other have been stopped that guarantee cannot be made.
The result of replaying a log in the situation where all I/O does not stop at the same point-in-time is a file system that may contain metadata corruption. There will be no indication of the corruption at the time the copy is mounted – mount will replay the log successfully and mount the file system. However, at some later time when the inconsistent metadata is accessed there will be problems. If the corruption is recognized as such then the system may crash. If it is not recognized when it is used then loss or corruption of user data may result.
What steps are recommended for a consistent copy?
The recommended environment is this:
The procedures recommended by the storage subsystem vendor must be followed. For FlashCopy there is a Redbook, Implementing ESS Copy Services in Open Environments, that describes the procedures.
The usual scenario would include these operations: quiesce the application, freeze the JFS2 file systems, initiate the FlashCopy using consistency groups, thaw the file systems, and run logredo and fsck against the copied file systems before using them.
What about user data?
The discussion so far has been about the effect of FlashCopy on the integrity of file system metadata. It should be pointed out that application data within the file system must also be treated carefully to ensure its integrity. The primary concern here is buffering of data. Most user data is cached either at the application level or in the kernel before it is written to disk. The sync command or fsync system call will force data cached in the kernel to be written from the cache to the disk, and will force modified metadata to be logged. Applications should ideally provide an interface that allows the application to be quiesced. At the time of quiesce the application should issue fsync calls for all open files. The administrator should also issue sync commands to force out data that is not associated with well-behaved applications. The sync command may not be able to write all data for files that are actively in use at the time of the sync. A short delay followed by another sync command might help, but if the applications continue to write new data some will not be copied. Please refer to the appropriate documentation for the sync command and fsync system call for more in-depth information.
Many applications and middleware include controls for doing online backup. These facilities need to be used in addition to the sync command and fsync system call.
The JFS2 freeze/thaw will guarantee a point-in-time view of user data as well as the metadata. It is the most effective way (short of unmount) to ensure that all data that has been written is committed to disk.
If DB2 or Oracle databases are in use then refer to the following information:
DB2: There is a technote titled “Requirements for implementing an on-line copy of a DB2 UDB database utilizing IBM ESS TotalStorage FlashCopy on the AIX platform”. This is available at http://www.ibm.com/software/data/support/ . Click on “Search support knowledge base”, select AIX in the operating system box, and then enter the search term “FlashCopy”.
Oracle: There is a document available on Oracle’s MetaLink website with the subject “Using IBM ESS TotalStorage FlashCopy or Similar for Oracle 10G on AIX”.
What can be done if a supported environment is not available?
** IBM strongly recommends that supported methods be used to backup data. In the event supported methods cannot be used, the following information is provided to assist customers in attempting to produce valid, consistent copies. Following these procedures does not guarantee that valid, consistent copies will result, and it does not imply that IBM will provide support for errors that may occur. **
If neither freeze/thaw nor consistency groups are available, the only supported procedure is to unmount the file system. If it is not possible to do that then the following actions should be taken:
1. Use JFS2 with inline log so that all data is in a single volume. If not using JFS2 or inline logs, don’t log multiple file systems to a single log.
2. Take action to quiesce applications in order to prevent I/O from continuing during the copy process. (Use application function to quiesce, use sync/fsync to schedule/force writes to disk and delay some time prior to initiating the FlashCopy operation.)
3. Do not attempt to mount the copied file system until the following steps are done:
- Run “logredo /dev/<logname>”. If the “log wrap” error is displayed then the log for the source volume needs to be extended and the copy recreated.
- Run a full fsck on the copied file system after it is created. Any error reported by fsck indicates that there was a problem with the copy and the copied file system should not be used.
The purpose of these steps is to minimize the potential for corruption. If these procedures are followed but fsck on the copied file system reports errors then it is the result of the environment. It is not a system error.
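On the host accessing the copy, the checks described above might be sketched as follows (the log device /dev/loglv00, file system device /dev/fslv00, and mount point /copiedfs are assumed names):

```shell
# Sketch of verifying a copied file system before mounting it.
# All device and mount point names here are assumptions.
if command -v logredo >/dev/null 2>&1; then
    logredo /dev/loglv00    # if this reports log wrap, extend the source
                            # log and recreate the copy
    fsck -y /dev/fslv00     # full check; any error means the copy is bad
    mount /copiedfs         # mount only after a clean logredo and fsck
    status=done
else
    echo "AIX logredo not available; illustrative sketch only"
    status=done
fi
```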
What if logredo fails?
Replaying the log should succeed even in unsupported configurations. One instance where logredo will fail is in the case of log wrap. If the log has wrapped it cannot bring the file system to a consistent state. When log wrap occurs the log needs to be extended. If logredo produces other error messages then the service team should examine the log to look for evidence of system errors.
Category: Backup and Recovery
Organization: Advanced Technical Sales
Techdocs Technical Support Information