The principal steps required for a full catalog recovery follow
below. Some steps will make reference to other, more specialized procedures.
- List the catalog's aliases from each of the master catalogs to
which it is connected. These aliases may have to be re-established
later depending on the recovery technique used.
- Determine whether the catalog is open on each system. Run the
Catalog Display program.
- Attempt to list the catalog from each system to which the catalog
is connected using IDCAMS LISTCAT.
- If LISTCAT is successful from systems that previously did not
have the catalog open, it is likely that the control blocks associated
with the catalog are in error on the system from which LISTCAT fails.
In this case, cause the catalog to be closed and reopened on those
systems that cannot access the catalog correctly.
- If LISTCAT fails on all systems, the catalog will have to be recovered.
- Deny access to the catalog from all systems except the system
to be used for recovery.
Note: In planning for recovery, remember
that the recovery system must have access to a catalog (other than
the damaged catalog) containing data sets needed for the recovery.
You might be tempted to use IDCAMS DELETE UCAT with the RECOVERY option
at this point. If you do, you will not be able to save, for future
reference or diagnostics, a copy of the damaged catalog including the
most current changes. Additionally, without the following actions,
users could begin using the recovered catalog before the
post-recovery diagnostics have been run and any necessary
corrective actions taken.
Denying access may be accomplished
by either of the following methods.
- Vary the catalog unit offline to prevent I/O to the catalog for
the duration of the recovery. This is the preferred method since aliases
associated with the catalog will not need to be re-established later
as required by the next method. However, this frequently will not
be possible because of other allocated data sets on the catalog volume.
- Disconnect the catalog from the master catalog. Use IDCAMS EXPORT
DISCONNECT. This will also delete the aliases to the catalog. Note
that the DISCONNECT command will function even if the catalog is open
and its successful completion does not mean that catalog usage will
cease.
If necessary, make sure the catalog is closed after it is
disconnected.
- Deny access to the catalog from non-recovery jobs executing on
the system to be used for recovery.
It may be necessary to "hold" and "dry
up" initiators on the recovery system.
- Cause the catalog to be closed on the recovery system.
It
will be necessary to terminate all address spaces (including TSO)
that have referenced the catalog and are causing it to be held open.
- Record the date and time when it has been confirmed that the catalog
is closed on all systems. This is the stop date and time needed as
input to the Integrated Catalog Forward Recovery Utility. See CRURRSV Parameters.
- Switch and dump the SMF data sets on all systems that have had
access to the catalog. The SMF records for the catalog will be needed
for forward recovery of the catalog. These dump data sets should be
the last (by system) in the concatenation of data sets for SMFIN DD
to the Record Selection and Validation program.
- Save a copy of the damaged catalog for future use (for diagnostics,
for example). Use DFSMSdss DUMP by data set, tracks or volume. If
the catalog's VVR is not usable or is inaccessible because of problems
with the VVDS, then the DFSMSdss DUMP by data set will not work.
- Save the contents of the catalog in a readable format. Using DFSMSdss
PRINT by tracks, save both the data and the index components.
- Save the VVDS containing the catalog.
Use DFSMSdss PRINT by
tracks. If the VVDS is still usable, you can use IDCAMS PRINT.
- Identify the EXPORT backup copy of
the catalog to be used as the basis for recovery. This will be the
EXPIN data set for CRURRAP.
Normally this will be the most current
backup. If you use a versioning technique such as the one in Figure 2, you can (from batch jobs only)
refer to the "zero generation": for example, Baplicat.CATALOG.BACKUP(0).
- Establish a starting date and time for forward recovery. These
values are needed as execution parameters for the programs in the
ICFRU. See CRURRSV Parameters.
With either IDCAMS EXPORT or DFSMSdss DUMP, the first record in the
output data set contains the date and time the copy was made. However,
there is no utility dedicated to extracting this information. (IDCAMS
IMPORT does print the date and time, but the information is needed
before actually running the IMPORT.)
You can obtain the date and time
in either case using IDCAMS PRINT COUNT(1). You will need to supply
DCB information to read the DFSMSdss dump data set; the EXPORT format
is VBS.
Rather than interpreting the dumped data, you may prefer
to save the messages from the job creating the copy. This will include
the IDC0594I message giving the date and time of the EXPORT copy of
the catalog. One technique is to use generation data groups for both
the data set copies and the listings, thereby documenting each copy
operation in a data set with related name and corresponding generation
number.
- In a multi-system installation, determine the maximum difference
in the TOD clock values among the systems. This value is needed
as an execution parameter for the programs in the ICFRU. See CRURRSV Parameters.
- Identify the SMF data needed for forward recovery of the old version
of the catalog. The concatenation of all these data sets is the
SMF input (SMFIN DD) for CRURRSV (Record Selection and Validation).
You
will need data from all systems having access to the catalog since
the recovery start time identified above. Remember to include the
recent data preceding the switch of SMF data sets included in the
catalog recovery procedure.
- Determine the interval to be used as the "significant gap time" for
CRURRSV. This value is needed as an execution parameter for the
programs in the ICFRU. The value should be the minimum number
of minutes normally needed to fill an SMF recording data set. See CRURRSV Parameters.
- Execute the CRURRSV program using the parameters and input data
sets determined above.
For details, see Catalog Forward Recovery Steps with ICFRU.
- Review the condition code, reports and log messages from CRURRSV
and determine whether an SMF lost-data condition exists or whether
any necessary SMF data sets have been omitted. For guidance in interpreting
the results, see Reports from the Record Selection and Validation Program.
If one or more lost-data conditions exist, save the dumped copies of
the lost-data records for final error analysis. This record type tells
you the period during which SMF records were not written because no
SMF recording data set was available. It also tells you the number
of SMF records lost. Also save any other messages that may indicate
lost data.
If one or more SMF dump data sets were omitted from
the previous CRURRSV execution, run that program again, supplying the
previously omitted data sets. Be certain to include all such
data sets: the second run will almost certainly produce another set
of error messages, because the data sets already processed are now
absent, so any data sets still missing in this second run
will not be detected. The output may be added to the data set previously
used (DISP=MOD) or written to a new data set to be concatenated with
the previous one in the next sort step.
Note: There is no way
to determine conclusively whether all SMF records have been included.
However, the program will detect suspiciously long intervals when
no SMF records (not just catalog records) were written by a particular
system. If SMF data has been truly lost, you may choose to continue
with SMF recovery and perform additional forward recovery using other
techniques.
- Using DFSORT or similar facility, sort the output from CRURRSV
by data set name (ascending) and date/time sequence (descending).
For details, see Sort Control Parameters.
- Execute the CRURRAP (Record Analysis and Processing) program using
the parameters previously determined, the output from the previous
sort as SMF input (SMFIN DD), and the EXPORT copy identified above
as EXPORT input (EXPIN DD).
- Review the results of CRURRAP for:
- Evidence of lost SMF data
- A list of errors and anomalies to be investigated later
For guidance in interpreting the results, see Reports from the Record
Analysis and Processing Program. If there is evidence of lost
SMF data, re-examine the results of CRURRSV to confirm
whether data has, in fact, been lost. If so, attempt to identify the
interval surrounding the lost-data condition.
When you are
satisfied with the results of CRURRAP, proceed as follows.
- Use IDCAMS to DELETE the catalog for RECOVERY.
This will prevent
IMPORT failures due to the inability to perform integrity checking
of a damaged catalog before internally deleting and redefining it
as a part of IMPORT processing.
This deletion
will result in the removal of all associated aliases and the need
to rebuild them later. You may wish to make a list of the aliases
before deleting the related user catalog and then use the list to
restore the aliases after IMPORTing the EXPORT copy.
- Optionally, re-DEFINE the catalog using IDCAMS.
Do
this if you wish to preserve a current control area size that is less
than one cylinder or if you choose to change the structure of the
catalog in any way. Note that, with this procedure, the recovery must
be to the same volume serial number and the same device type. Otherwise,
you may use a different volume. Do not decrease the current maximum
logical record size.
- IMPORT the EXPORT copy produced by CRURRAP.
Use the INTOEMPTY
parameter if you have already redefined the catalog. Use the VOLUME
sub-parameter if you are changing volume serial numbers.
Note
that the date and time of the EXPORTed copy given by message IDC0604I
will be the date and time you specified as the stop time for the recovery.
If you IMPORT without the INTOEMPTY parameter, any
existing copy of the catalog will be internally deleted (for recovery)
and redefined. The deletion will result in the removal of all associated
aliases and the need to rebuild them later.
- Now that forward recovery is complete, again check
the status of the catalog using IDCAMS DIAGNOSE, LISTCAT and EXAMINE.
These diagnostics should not be steps within the job that recovered
the catalog; the recovered catalog should be newly opened.
- For any entries noted with error or anomaly messages from CRURRAP,
make sure that the catalog entry now present represents reality.
For
VSAM entries, IDCAMS DIAGNOSE with the COMPARE option will perform
the necessary checking, so you should review the results from the
previous step.
For questionable non-VSAM entries, you will
have to resort to other methods. If the device type is DASD, check
the VTOC for the indicated data set. If it is not present, remove
the catalog entry. If the device type is for tape, check the tape
management inventory (if you have one). With a small number of tape
data sets to be investigated, it may also be possible to examine the
tape volumes. If the data set is not on the indicated volume, remove
the catalog entry.
- If needed SMF data was lost, you may want to attempt to resolve
discrepancies arising from the loss at this point (as opposed to waiting
for users or production jobs to encounter a problem).
- Rebuild the aliases for the catalog if the catalog was imported
to a new volume or if the aliases have been deleted during the recovery.
- If VSAM passwords are in use and if any were reset during the
recovery (indicated by message CRU116I), security for these data sets
should be redefined at this point. This may be done be security administration
or by the data set owners.
- Back up the catalog to start a new recovery cycle.
Use IDCAMS
EXPORT or use DFSMSdss DUMP by data set or other standard DFSMSdss
dump procedure. Even if no corrections to the recovered catalog were
necessary, you may want to perform the backup to make sure that the
new backup is properly cataloged and tracked.
- Restore access to the catalog from all other systems. Either
vary the catalog unit back online and mount the catalog volume,
or reconnect the catalog and redefine the aliases using IDCAMS
IMPORT CONNECT and DEFINE ALIAS. Also release any initiators held
for the recovery process.
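The ordering required by the sort step in this procedure (data set name ascending, then date/time descending within each name) can be illustrated with a minimal Python sketch. The SmfRecord type and its fields are hypothetical stand-ins for the records selected by CRURRSV. The real sort is done by DFSORT or a similar facility using the control statements described in Sort Control Parameters, not by this code.

```python
from datetime import datetime
from typing import List, NamedTuple


class SmfRecord(NamedTuple):
    """Hypothetical, simplified stand-in for one record selected by CRURRSV."""
    dsname: str
    written: datetime
    system: str


def sort_for_crurrap(records: List[SmfRecord]) -> List[SmfRecord]:
    """Order records by data set name (ascending), newest first within a name."""
    # Python's sort is stable, so sort on the minor key first
    # (date/time, descending) and then on the major key (name, ascending).
    newest_first = sorted(records, key=lambda r: r.written, reverse=True)
    return sorted(newest_first, key=lambda r: r.dsname)


if __name__ == "__main__":
    # Illustrative data only; the names and times are made up.
    sample = [
        SmfRecord("PAY.MASTER", datetime(2024, 1, 5, 9, 0), "SYSA"),
        SmfRecord("APP.DATA", datetime(2024, 1, 5, 8, 0), "SYSB"),
        SmfRecord("PAY.MASTER", datetime(2024, 1, 5, 11, 30), "SYSB"),
    ]
    for rec in sort_for_crurrap(sample):
        print(rec.dsname, rec.written.isoformat(), rec.system)
```

Sorting in two stable passes keeps the two keys independent. A single pass with a composite key would need a negated timestamp to reverse only the date/time part.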
This completes the full catalog recovery procedure.
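As an illustration of the multi-system clock step in this procedure, the maximum difference among the systems' clocks is the spread between the fastest and the slowest clock. The sketch below assumes that one clock reading per system has been captured at the same instant. The system names and readings are hypothetical, and the units in which the value must be supplied are defined in CRURRSV Parameters, not here.

```python
from datetime import datetime, timedelta
from typing import Mapping


def max_clock_difference(readings: Mapping[str, datetime]) -> timedelta:
    """Return the largest pairwise difference among simultaneous clock readings."""
    if not readings:
        raise ValueError("at least one system reading is required")
    values = list(readings.values())
    # The largest pairwise difference equals latest reading minus earliest.
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical readings, assumed to be taken at the same instant.
    readings = {
        "SYSA": datetime(2024, 1, 5, 12, 0, 0),
        "SYSB": datetime(2024, 1, 5, 12, 0, 3),
        "SYSC": datetime(2024, 1, 5, 11, 59, 58),
    }
    print(max_clock_difference(readings).total_seconds())
```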