BSDS problems

Use this topic to investigate and resolve problems with the BSDS.

For background information about the bootstrap data set (BSDS), see Planning your IBM® MQ environment on z/OS®.

Normally, there are two copies of the BSDS, but if one is damaged, IBM MQ immediately changes to single BSDS mode. However, the damaged copy of the BSDS must be recovered before restart. If you are in single mode and damage the only copy of the BSDS, or if you are in dual mode and damage both copies, use the procedure described in Recovering the BSDS.

This section covers some of the BSDS problems that can occur at startup. Problems not covered here include:
  • RECOVER BSDS command errors (messages CSQJ301E - CSQJ307I)
  • Change log inventory utility errors (message CSQJ123E)
  • Errors in the BSDS backup being dumped by offload processing (message CSQJ125E)

Error occurs while opening the BSDS

Symptoms

IBM MQ issues the following message:


CSQJ100E +CSQ1 ERROR OPENING BSDSn DSNAME=..., ERROR STATUS=eeii

where eeii is the VSAM return code. For information about VSAM codes, see the DFSMS/MVS™ Macro Instructions for Data Sets documentation.
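When scanning a job log for this message, the fields can be pulled out with a small parser. The following Python sketch is illustrative only. The function name `parse_csqj100e`, the sample data set name, and the sample status value are assumptions. The message is assumed to arrive on one line in the layout shown above, with a four-digit hexadecimal status.

```python
import re

# Pattern follows the CSQJ100E layout shown above. The command prefix
# (+CSQ1 in the example) differs from one queue manager to another.
CSQJ100E = re.compile(
    r"CSQJ100E\s+(?P<cpf>\S+)\s+ERROR OPENING BSDS(?P<copy>\d)\s+"
    r"DSNAME=(?P<dsname>[^,\s]+),\s*ERROR STATUS=(?P<status>[0-9A-Fa-f]{4})"
)


def parse_csqj100e(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = CSQJ100E.search(line)
    if match is None:
        return None
    return {
        "cpf": match.group("cpf"),
        "bsds_copy": int(match.group("copy")),
        "dsname": match.group("dsname"),
        "status": match.group("status").upper(),
    }


# Hypothetical sample line: the data set name and status value are made up.
sample = "CSQJ100E +CSQ1 ERROR OPENING BSDS2 DSNAME=MQ.CSQ1.BSDS02, ERROR STATUS=08A8"
print(parse_csqj100e(sample))
```

The sketch only extracts the status value. Look up its meaning in the VSAM documentation.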

System action
During system initialization, the startup is terminated.

During a RECOVER BSDS command, the system continues in single BSDS mode.

System programmer action
None.
Operator action
Carry out these steps:
  1. Run the print log map utility on both copies of the BSDS, and compare the lists to determine which copy is accurate or current.
  2. Rename the data set that had the problem, and define a replacement for it.
  3. Copy the accurate data set to the replacement data set, using Access Method Services.
  4. Restart the queue manager.
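The rename and copy in steps 2 and 3 can be expressed as Access Method Services (IDCAMS) control statements. The Python sketch below only builds the statement text. It assumes the replacement is defined with the same name as the damaged data set, and all data set names are hypothetical. The DEFINE CLUSTER statement is left as a placeholder because its attributes are installation-specific. For a VSAM cluster, the data and index components typically need their own ALTER statements, which this sketch omits.

```python
def idcams_replace_statements(damaged, renamed, good):
    """Build IDCAMS control statements for the rename and copy steps.

    damaged: name of the BSDS copy that had the problem
    renamed: new name to give the damaged data set
    good:    name of the accurate BSDS copy
    """
    return "\n".join([
        # Step 2: move the damaged data set out of the way.
        f"  ALTER {damaged} NEWNAME({renamed})",
        # Placeholder only: supply your own DEFINE CLUSTER here.
        "  /* DEFINE CLUSTER for the replacement goes here */",
        # Step 3: copy the accurate BSDS into the replacement.
        f"  REPRO INDATASET({good}) OUTDATASET({damaged})",
    ])


# Hypothetical data set names, for illustration only.
print(idcams_replace_statements(
    "MQ.CSQ1.BSDS02", "MQ.CSQ1.BSDS02.BAD", "MQ.CSQ1.BSDS01"))
```

Review the generated statements against your installation's standards before running them.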

Log content does not agree with the BSDS information

Symptoms
IBM MQ issues the following message:

CSQJ102E +CSQ1 LOG RBA CONTENT OF LOG DATA SET DSNAME=...,
           STARTRBA=..., ENDRBA=...,
           DOES NOT AGREE WITH BSDS INFORMATION

This message indicates that the change log inventory utility was used incorrectly or that a down-level data set is being used.

System action
Queue manager startup processing is terminated.
System programmer action
None.
Operator action
Run the print log map utility and the change log inventory utility to print and correct the contents of the BSDS.

Both copies of the BSDS are damaged

Symptoms
IBM MQ issues the following messages:

CSQJ107E +CSQ1 READ ERROR ON BSDS
           DSNAME=... ERROR STATUS=0874
CSQJ117E +CSQ1 REG8 INITIALIZATION ERROR READING BSDS
           DSNAME=... ERROR STATUS=0874
CSQJ119E +CSQ1 BOOTSTRAP ACCESS INITIALIZATION PROCESSING FAILED
System action
Queue manager startup processing is terminated.
System programmer action
Carry out these steps:
  1. Rename the data set, and define a replacement for it.
  2. Locate the BSDS associated with the most recent archive log data set, and copy it to the replacement data set.
  3. Use the print log map utility to print the contents of the replacement BSDS.
  4. Use the print log records utility to print a summary report of the active log data sets missing from the replacement BSDS, and to establish the RBA range.
  5. Use the change log inventory utility to update the missing active log data set inventory in the replacement BSDS.
  6. If dual BSDS data sets had been in use, copy the updated BSDS to the second copy of the BSDS.
  7. Restart the queue manager.
Operator action
None.

Unequal time stamps

Symptoms
IBM MQ issues the following message:

CSQJ120E +CSQ1 DUAL BSDS DATA SETS HAVE UNEQUAL TIME STAMPS,
           SYSTEM BSDS1=...,BSDS2=...,
           UTILITY BSDS1=...,BSDS2=...
The possible causes are:
  • One copy of the BSDS has been restored. All information about the restored BSDS is down-level. The down-level BSDS has the lower time stamp.
  • One of the volumes containing the BSDS has been restored. All information about the restored volume is down-level. If the volume contains any active log data sets or IBM MQ data, they are also down-level. The down-level volume has the lower time stamp.
  • Dual logging has degraded to single logging, and you are trying to start without recovering the damaged log.
  • The queue manager terminated abnormally after updating one copy of the BSDS but before updating the second copy.
System action
IBM MQ attempts to resynchronize the BSDS data sets using the more recent copy. If this fails, queue manager startup is terminated.
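The rule that the down-level copy has the lower time stamp can be sketched as a simple comparison. This Python example is illustrative and is not how IBM MQ implements resynchronization. It assumes the two time stamps have already been taken from the CSQJ120E message and converted to comparable `datetime` values. The function name and the sample values are hypothetical.

```python
from datetime import datetime


def down_level_copy(bsds1_stamp, bsds2_stamp):
    """Return 1 or 2 for the copy with the lower time stamp, or None if equal."""
    if bsds1_stamp == bsds2_stamp:
        return None
    return 1 if bsds1_stamp < bsds2_stamp else 2


# Hypothetical time stamps: BSDS2 was restored from an older backup.
bsds1 = datetime(2024, 5, 14, 9, 30, 0)
bsds2 = datetime(2024, 5, 12, 22, 15, 0)
print(down_level_copy(bsds1, bsds2))
```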
System programmer action
None.
Operator action
If automatic resynchronization fails, carry out these steps:
  1. Run the print log map utility on both copies of the BSDS, and compare the lists to determine which copy is accurate or current.
  2. Rename the down-level data set and define a replacement for it.
  3. Copy the good data set to the replacement data set, using Access Method Services.
  4. If applicable, determine whether the volume containing the down-level BSDS has been restored. If it has been restored, all data on that volume, such as the active log data, is also down-level.

    If the restored volume contains active log data and you were using dual active logs on separate volumes, you need to copy the current version of the active log to the down-level log data set. See Recovering logs for details of how to do this.

Out of synchronization

Symptoms
IBM MQ issues the following message during queue manager initialization:

CSQJ122E +CSQ1 DUAL BSDS DATA SETS ARE OUT OF SYNCHRONIZATION

The system time stamps of the two data sets are identical. Differences can exist if operator errors occurred while the change log inventory utility was being used, for example, if the utility was run on only one copy. The change log inventory utility sets a private time stamp in the BSDS control record when it starts, and a close flag when it ends. IBM MQ checks the change log inventory utility time stamps. If they are different, or if they are the same but one close flag is not set, IBM MQ compares the two copies of the BSDS. If the copies are different, CSQJ122E is issued.
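The check described above can be sketched as follows. This Python model is illustrative and does not reflect the real BSDS record layout. The `BsdsCopy` class and its field names are hypothetical, and "one close flag is not set" is interpreted here as "at least one".

```python
from dataclasses import dataclass


@dataclass
class BsdsCopy:
    utility_timestamp: str  # private time stamp set when the utility starts
    close_flag: bool        # set when the utility ends
    records: bytes          # stand-in for the BSDS contents


def needs_content_compare(a, b):
    """True if the utility time stamps differ or a close flag is missing."""
    if a.utility_timestamp != b.utility_timestamp:
        return True
    return not (a.close_flag and b.close_flag)


def out_of_sync(a, b):
    """True if the situation described above would lead to CSQJ122E."""
    return needs_content_compare(a, b) and a.records != b.records


# Hypothetical case: the utility was run on copy 1 only.
copy1 = BsdsCopy("T2", True, b"inventory-v2")
copy2 = BsdsCopy("T1", True, b"inventory-v1")
print(out_of_sync(copy1, copy2))
```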

This message is also issued by the BSDS conversion utility if two input BSDS data sets are specified and a record is found that differs between the two copies. This situation can arise if the queue manager terminated abnormally before the BSDS conversion utility was run.

System action
Queue manager startup or the utility is terminated.
System programmer action
None.
Operator action
If the error occurred during queue manager initialization, carry out these steps:
  1. Run the print log map utility on both copies of the BSDS, and compare the lists to determine which copy is accurate or current.
  2. Rename the data set that had the problem, and define a replacement for it.
  3. Copy the accurate data set to the replacement data set, using Access Method Services.
  4. Restart the queue manager.
If the error occurred when running the BSDS conversion utility, carry out these steps:
  1. Attempt to restart the queue manager and shut it down cleanly before attempting to run the BSDS conversion utility again.
  2. If this does not solve the problem, run the print log map utility on both copies of the BSDS, and compare the lists to determine which copy is accurate or current.
  3. Change the JCL used to invoke the BSDS conversion utility to specify the current BSDS in the SYSUT1 DD statement, and remove the SYSUT2 DD statement, before submitting the job again.

I/O error

Symptoms
IBM MQ changes to single BSDS mode and issues the user message:

CSQJ126E +CSQ1 BSDS ERROR FORCED SINGLE BSDS MODE

This is followed by one of the following messages:


CSQJ107E +CSQ1 READ ERROR ON BSDS
           DSNAME=... ERROR STATUS=...
 
CSQJ108E +CSQ1 WRITE ERROR ON BSDS
           DSNAME=... ERROR STATUS=...
System action
The BSDS mode changes from dual to single.
System programmer action
None.
Operator action
Carry out these steps:
  1. Use Access Method Services to rename or delete the damaged BSDS and to define a new BSDS with the same name as the BSDS that had the error. Example control statements can be found in job CSQ4BREC in thlqual.SCSQPROC.
  2. Issue the IBM MQ command RECOVER BSDS to make a copy of the good BSDS in the newly allocated data set and reinstate dual BSDS mode. See also Recovering the BSDS.

Log range problems

Symptoms

IBM MQ has issued message CSQJ113E when reading its own log, or message CSQJ133E or CSQJ134E when reading the log of a queue manager in the queue-sharing group. This can happen when you do not have the archive logs needed to restart the queue manager or recover a CF structure.

System action

Depending upon what log record is being read and why, the requestor might end abnormally with a reason code of X'00D1032A'.

System programmer action

Run the print log map utility (CSQJU004) to determine the cause of the error. When message CSQJ133E or CSQJ134E has been issued, run the utility against the BSDS of the queue manager indicated in the message.

If you have:
  • Deleted the entry with the log range (containing the log RBA or LRSN indicated in the message) from the BSDS, and
  • Not deleted or reused the data set
you can add the entry back into the BSDS using the following procedure:
  1. Identify the data set containing the required RBA or LRSN, by looking at an old copy of the contents of BSDS, or by running CSQJU004 against a backup of the BSDS.
  2. Add the data set back into the BSDS using the change log inventory utility (CSQJU003).
  3. Restart the queue manager.
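Step 1 amounts to finding the log range that contains the required RBA. The Python sketch below assumes the ranges have been copied by hand from a print log map (CSQJU004) report into a list of tuples and that RBAs are handled as integers. The data set names and RBA values are hypothetical.

```python
def find_log_range(entries, rba):
    """Return the data set name whose range contains rba, or None.

    entries: iterable of (dsname, start_rba, end_rba), both ends inclusive.
    """
    for dsname, start_rba, end_rba in entries:
        if start_rba <= rba <= end_rba:
            return dsname
    return None


# Hypothetical ranges taken from an old print log map report.
old_report = [
    ("MQ.CSQ1.ARCLOG1.A0000001", 0x000000000000, 0x000000000FFF),
    ("MQ.CSQ1.ARCLOG1.A0000002", 0x000000001000, 0x000000001FFF),
]

# RBA from the error message, written in hexadecimal.
wanted = int("0000000017FF", 16)
print(find_log_range(old_report, wanted))
```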

If an archive log data set has been deleted, you will not be able to recover the page set or CF structure that needs the archive logs. Identify the reason that the queue manager needs to read the log record, then take one of the following actions depending on the page set or CF structure affected.

Page sets

Message CSQJ113E during the recovery phase of queue manager restart indicates that the log is needed to perform media recovery to bring a page set up to date.

Identify the page sets that need the deleted log data set for media recovery by looking at the media recovery RBA in the CSQI049I message issued for each page set during queue manager restart, then perform the following actions.

  • Page set zero
    You can recover the objects on page set zero, by using the following procedure.
    Attention: All data in all other page sets will be lost when you carry out the procedure.
    1. Use function SDEFS of the CSQUTIL utility to produce a file of IBM MQ DEFINE commands.
    2. Format page set zero using CSQUTIL, then redefine the other page sets as described in the next section.
    3. Restart the queue manager.
    4. Use CSQUTIL to redefine the objects using the DEFINE commands produced by the utility in step 1.
  • Page sets 1-99
    Use the following procedure to redefine the page sets.
    Attention: Any data on the page set is lost when you carry out this operation.
    1. If you can access the page set without any I/O errors, reformat the page set using the CSQUTIL utility with the command FORMAT TYPE(NEW).
    2. If I/O errors occurred when accessing the page set, delete the page set and recreate it.

      If you want the page set to be the same size as before, use the command LISTCAT ENT(dsname) ALLOC to obtain the existing space allocations, and use these in the z/OS DEFINE CLUSTER command.

      Format the new page set using the CSQUTIL utility with the command FORMAT TYPE(NEW).

    3. Restart the queue manager. You might have to take certain actions, such as resetting channels or resolving indoubt channels.
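To work out which page sets are affected before following the procedures above, compare each page set's media recovery RBA with the end of the deleted log range. The Python sketch below is a rough model. It assumes media recovery reads the log forward from the media recovery RBA, so a page set needs the deleted data set if its media recovery RBA is at or below the end RBA of that data set. The page set numbers and RBA values are hypothetical.

```python
def page_sets_needing_log(media_rbas, deleted_end_rba):
    """Return the sorted page set IDs whose recovery would read the deleted log.

    media_rbas:      dict mapping page set ID to media recovery RBA (int)
    deleted_end_rba: end RBA of the deleted log data set (int)
    """
    return sorted(
        psid for psid, rba in media_rbas.items() if rba <= deleted_end_rba
    )


# Hypothetical media recovery RBAs noted during queue manager restart.
media_rbas = {0: 0x5000, 1: 0x1200, 2: 0x4800, 3: 0x0F00}
print(page_sets_needing_log(media_rbas, deleted_end_rba=0x1FFF))
```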

CF structures

Messages CSQJ113E, CSQJ133E, or CSQJ134E, during the recovery of a CF structure, indicate that the logs needed to recover the structure are not available on at least one member of the queue-sharing group.

Take one of the following actions depending on the structure affected:
Application CF structure
Issue the command RECOVER CFSTRUCT(structure-name) TYPE(PURGE).

This process empties the structure, so any messages on the structure are lost.

CSQSYSAPPL structure
Contact your IBM support center.
Administration structure
This structure is rebuilt using log data since the last checkpoint on each queue manager, which should be in active logs.
If you get this error during administration structure recovery, contact your IBM support center as this indicates that the active log is not available.

Once you have recovered the page set or CF structure, perform a backup of the logs, BSDS, page sets, and CF structures.

To prevent this problem from occurring again:
  • Increase the archive log retention (ARCRETN) value, and
  • Increase the frequency of the CF structure backups.