z/OS DFSMS Managing Catalogs


Full Catalog Recovery Procedure

SC23-6853-00

The principal steps required for a full catalog recovery follow. Some steps refer to other, more specialized procedures.

  1. List the catalog's aliases from each of the master catalogs to which it is connected. These aliases may have to be re-established later depending on the recovery technique used.
  2. Determine whether the catalog is open on each system. Run the Catalog Display program.
  3. Attempt to list the catalog from each system to which the catalog is connected using IDCAMS LISTCAT.
  4. If LISTCAT is successful from systems that previously did not have the catalog open, it is likely that the control blocks associated with the catalog are in error on the system from which LISTCAT fails. In this case, cause the catalog to be closed and reopened on those systems that cannot access the catalog correctly.
  5. If LISTCAT fails on all systems, the catalog will have to be recovered.
  6. Deny access to the catalog from all systems except the system to be used for recovery.
    Note: In planning for recovery, remember that the recovery system must have access to a catalog (other than the damaged catalog) containing data sets needed for the recovery.

    You might be tempted to use IDCAMS DELETE UCAT with the RECOVERY option at this point. If you do, you will not be able to save, for future reference or diagnostics, a copy of the damaged catalog including the most current changes. Additionally, without the following actions, users could begin using the recovered catalog before the post-recovery diagnostics have been run and any corrective actions taken.

    Denying access may be accomplished by either of the following methods.

    1. Vary the catalog unit offline to prevent I/O to the catalog for the duration of the recovery. This is the preferred method since aliases associated with the catalog will not need to be re-established later as required by the next method. However, this frequently will not be possible because of other allocated data sets on the catalog volume.
    2. Disconnect the catalog from the master catalog. Use IDCAMS EXPORT DISCONNECT. This will also delete the aliases to the catalog. Note that the DISCONNECT command will function even if the catalog is open and its successful completion does not mean that catalog usage will cease.

      If necessary, make sure the catalog is closed after it is disconnected.

  7. Deny access to the catalog from non-recovery jobs executing on the system to be used for recovery.

    It may be necessary to "hold" and "dry up" initiators on the recovery system.

  8. Cause the catalog to be closed on the recovery system.

    It will be necessary to terminate all address spaces (including TSO) that have referenced the catalog and are causing it to be held open.

  9. Record the date and time when it has been confirmed that the catalog is closed on all systems. This is the stop date and time needed as input to the Integrated Catalog Forward Recovery Utility. See CRURRSV Parameters.
  10. Switch and dump the SMF data sets on all systems that have had access to the catalog. The SMF records for the catalog will be needed for forward recovery of the catalog. These dump data sets should be the last (by system) in the concatenation of data sets for SMFIN DD to the Record Selection and Validation program.
  11. Save a copy of the damaged catalog for future use (for diagnostics, for example). Use DFSMSdss DUMP by data set, tracks or volume. If the catalog's VVR is not usable or is inaccessible because of problems with the VVDS, then the DFSMSdss DUMP by data set will not work.
  12. Save the contents of the catalog in a readable format. Using DFSMSdss PRINT by tracks, save both the data and the index components.
  13. Save the VVDS containing the catalog.

    Use DFSMSdss PRINT by tracks. If the VVDS is still usable, you can use IDCAMS PRINT.

  14. Identify the EXPORT backup copy of the catalog to be used as the basis for recovery (see footnote 1). This will be the EXPIN data set for CRURRAP.

    Normally this will be the most current backup. If you use a versioning technique such as the one in Figure 2, you can (from batch jobs only) refer to the "zero generation": for example, Baplicat.CATALOG.BACKUP(0).

  15. Establish a starting date and time for forward recovery. These values are needed as execution parameters for the programs in the ICFRU. See CRURRSV Parameters.

    With either IDCAMS EXPORT or DFSMSdss DUMP the first record in the output data set contains the date and time the copy was made. However, there is no utility to extract this information. (It is true that IDCAMS IMPORT will print out the date and time, but we need the information before actually running the IMPORT.)

    You can obtain the date and time in either case using IDCAMS PRINT COUNT(1). You will need to supply DCB information to read the DFSMSdss dump data set; the EXPORT format is VBS.

    Rather than interpreting the dumped data, you may prefer to save the messages from the job creating the copy. This will include the IDC0594I message giving the date and time of the EXPORT copy of the catalog. One technique is to use generation data groups for both the data set copies and the listings, thereby documenting each copy operation in a data set with related name and corresponding generation number.

  16. In a multi-system installation, determine the maximum difference in the TOD clock values among the systems. This value is needed as an execution parameter for the programs in the ICFRU. See CRURRSV Parameters.
  17. Identify the SMF data needed for forward recovery of the old version of the catalog. The concatenation of all these data sets is the SMF input (SMFIN DD) for CRURRSV (Record Selection and Validation).

    You will need data from all systems having access to the catalog since the recovery start time identified above. Remember to include the recent data preceding the switch of SMF data sets included in the catalog recovery procedure.

  18. Determine the interval to be used as the "significant gap time" for CRURRSV. This value is needed as an execution parameter for the programs in the ICFRU. The value should be the minimum number of minutes normally needed to fill an SMF recording data set. See CRURRSV Parameters.
  19. Execute the CRURRSV program using the parameters and input data sets determined above.

    For details, see Catalog Forward Recovery Steps with ICFRU.

  20. Review the condition code, reports and log messages from CRURRSV and determine whether an SMF lost-data condition exists or whether any necessary SMF data sets have been omitted. For guidance in interpreting the results, see Reports from the Record Selection and Validation Program.

    If one or more lost-data conditions exist, save the dumped copies of the lost-data records for final error analysis. This record type tells you the period during which SMF records were not written because no SMF recording data set was available. It also tells you the number of SMF records lost. Also save any other messages that may indicate lost data.

    If one or more SMF dump data sets were omitted from the previous CRURRSV execution, run the program again, supplying the previously omitted data sets. Be certain to include all of them: the second run will almost certainly issue another set of error messages because the data sets already processed are now absent, so any data sets still missing from this second run will not be detected. The output may be added to the data set previously used (DISP=MOD) or written to a new data set to be concatenated with the previous one in the next sort step.

    Note: There is no way to determine conclusively whether all SMF records have been included. However, the program will detect suspiciously long intervals when no SMF records (not just catalog records) were written by a particular system. If SMF data has been truly lost, you may choose to continue with SMF recovery and perform additional forward recovery using other techniques.
  21. Using DFSORT or similar facility, sort the output from CRURRSV by data set name (ascending) and date/time sequence (descending).

    For details, see Sort Control Parameters.

  22. Execute the CRURRAP (Record Analysis and Processing) program using the parameters previously determined, the output from the previous sort as SMF input (SMFIN DD), and the EXPORT copy identified above as EXPORT input (EXPIN DD).
  23. Review the results of CRURRAP for:
    1. Evidence of lost SMF data
    2. A list of errors and anomalies to be investigated later
    For guidance in interpreting the results, see Reports from the Record Analysis and Processing Program.

    If there is evidence of lost SMF data, you should investigate again the results of CRURRSV to confirm whether data has, in fact, been lost. If so, attempt to identify the interval surrounding the lost data condition.

    When you are satisfied with the results of CRURRAP, proceed as follows.

  24. Use IDCAMS to DELETE the catalog for RECOVERY.

    This will prevent IMPORT failures due to the inability to perform integrity checking of a damaged catalog before internally deleting and redefining it as a part of IMPORT processing.

    This deletion will result in the removal of all associated aliases and the need to rebuild them later. You may wish to make a list of the aliases before deleting the related user catalog and then use the list to restore the aliases after IMPORTing the EXPORT copy.

  25. Optionally, re-DEFINE the catalog using IDCAMS.

    Do this if you wish to preserve a current control area size that is less than one cylinder or if you choose to change the structure of the catalog in any way. Note that, with this procedure, the recovery must be to the same volume serial number and the same device type. Otherwise, you may use a different volume. Do not decrease the current maximum logical record size.

  26. IMPORT the EXPORT copy produced by CRURRAP.

    Use the INTOEMPTY parameter if you have already redefined the catalog. Use the VOLUME sub-parameter if you are changing volume serial numbers.

    Note that the date and time of the EXPORTed copy given by message IDC0604I will be the date and time you specified as the stop time for the recovery.

    If the IMPORT is without the INTOEMPTY parameter, an existing copy of the catalog will be internally deleted (for recovery) and redefined. The deletion will result in the removal of all associated aliases and the need to rebuild them later.

  27. Now that forward recovery is complete, again check the status of the catalog using IDCAMS DIAGNOSE, LISTCAT and EXAMINE.

    These diagnostics should not be steps within the job that recovered the catalog; the recovered catalog should be newly opened.

  28. For any entries noted with error or anomaly messages from CRURRAP, make sure that the catalog entry now present represents reality.

    For VSAM entries, IDCAMS DIAGNOSE with the COMPARE option will perform the necessary checking, so you should review the results from the previous step.

    For questionable non-VSAM entries, you will have to resort to other methods. If the device type is DASD, check the VTOC for the indicated data set. If it is not present, remove the catalog entry. If the device type is for tape, check the tape management inventory (if you have one). With a small number of tape data sets to be investigated, it may also be possible to examine the tape volumes. If the data set is not on the indicated volume, remove the catalog entry.

  29. If needed SMF data was lost, you may want to attempt to resolve discrepancies arising from the loss at this point (as opposed to waiting for users or production jobs to encounter a problem).
  30. Rebuild the aliases for the catalog if the catalog was imported to a new volume or if the aliases have been deleted during the recovery.
  31. If VSAM passwords are in use and if any were reset during the recovery (indicated by message CRU116I), security for these data sets should be redefined at this point. This may be done by security administration or by the data set owners.
  32. Back up the catalog to start a new recovery cycle.

    Use IDCAMS EXPORT or use DFSMSdss DUMP by data set or other standard DFSMSdss dump procedure. Even if no corrections to the recovered catalog were necessary, you may want to perform the backup to make sure that the new backup is properly cataloged and tracked.

  33. Restore access to the catalog from all other systems. Either vary the catalog unit back online and mount the catalog volume, or reconnect the catalog using IDCAMS IMPORT CONNECT and redefine the aliases using DEFINE ALIAS. Then release any initiators held for the recovery process.
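
The "significant gap time" of step 18 and the note under step 20 describe a check for suspiciously long intervals in which a system wrote no SMF records at all. The following Python sketch illustrates that idea only. It is not the CRURRSV implementation: the function name find_gaps, the (system_id, timestamp) record layout, and the strict greater-than comparison are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

def find_gaps(records, gap_minutes):
    """Return (system_id, earlier, later) for each long silence.

    records     -- iterable of (system_id, timestamp) pairs, one per SMF
                   record of any type (hypothetical layout)
    gap_minutes -- threshold in minutes, playing the role of the
                   "significant gap time"
    """
    threshold = timedelta(minutes=gap_minutes)

    # Group the timestamps by the system that wrote them.
    by_system = {}
    for system_id, stamp in records:
        by_system.setdefault(system_id, []).append(stamp)

    gaps = []
    for system_id in sorted(by_system):
        stamps = sorted(by_system[system_id])
        # Compare each timestamp with the next one from the same system.
        for earlier, later in zip(stamps, stamps[1:]):
            if later - earlier > threshold:  # assumption: strictly greater
                gaps.append((system_id, earlier, later))
    return gaps
```

A threshold close to the normal time needed to fill an SMF recording data set, as step 18 recommends, keeps ordinary quiet periods from being reported while still flagging intervals long enough to suggest a missing dump data set.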

This completes the full catalog recovery procedure.
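
The ordering requested in step 21 (data set name ascending, date/time descending) can be illustrated with a short Python sketch. This is only a model of the ordering, not a substitute for the DFSORT control statements described in Sort Control Parameters. The function name sort_for_analysis and the (data_set_name, timestamp) tuple layout are assumptions, and Python string comparison does not follow the EBCDIC collating sequence used on z/OS.

```python
from datetime import datetime

def sort_for_analysis(records):
    """Order records by name ascending, then timestamp descending.

    records -- iterable of (data_set_name, timestamp) tuples
               (hypothetical layout, for illustration only)
    """
    # Python's sort is stable, so sort on the secondary key first
    # (newest first), then on the primary key (name ascending).
    newest_first = sorted(records, key=lambda rec: rec[1], reverse=True)
    return sorted(newest_first, key=lambda rec: rec[0])
```

With this ordering, all records for one data set name are adjacent and the most recent record for each name comes first.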

Footnote 1: With DFSMSdss or DFSMShsm backups, you must first RESTORE or RECOVER the catalog. You can then create the data set needed for recovery using IDCAMS EXPORT.


Copyright IBM Corporation 1990, 2014