Commitment control recovery during initial program load after abnormal end

When you perform an initial program load (IPL) after your system ends abnormally, the system attempts to recover all the commitment definitions that were active when the system ended.

Likewise, when you vary on an independent disk pool, the system attempts to recover all the commitment definitions related to that independent disk pool that were active when it was varied off or ended abnormally.

The recovery is performed by database server jobs that are started by the system during IPL. Database server jobs are started by the system to handle work that cannot or must not be performed by other jobs.

The database server jobs are named QDBSRVnn, where nn is a two-digit number. The number of database server jobs depends on the size of your system. Likewise, the name of the database server job for an independent disk pool or independent disk pool group is QDBSxxxVnn, where xxx is the independent disk pool number and nn is a two-digit number. For example, QDBS035V02 can be the name of a database server job for independent disk pool 35.
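The naming convention above can be sketched as a small parser. This is an illustrative Python helper, not part of the system: the function name parse_db_server_job and its return shape are assumptions made for this example, and only the two name patterns described above are recognized.

```python
import re

# Patterns taken from the naming convention described above:
#   QDBSRVnn    system database server job (nn = two digits)
#   QDBSxxxVnn  independent disk pool job (xxx = pool number, nn = two digits)
_SYSTEM_JOB = re.compile(r"QDBSRV(\d{2})")
_POOL_JOB = re.compile(r"QDBS(\d{3})V(\d{2})")


def parse_db_server_job(name):
    """Return (pool_number, job_number) for a database server job name.

    pool_number is None for a system job (QDBSRVnn).
    Returns None if the name matches neither pattern.
    Hypothetical helper for illustration only.
    """
    m = _SYSTEM_JOB.fullmatch(name)
    if m:
        return (None, int(m.group(1)))
    m = _POOL_JOB.fullmatch(name)
    if m:
        return (int(m.group(1)), int(m.group(2)))
    return None
```

For the example name in the text, parse_db_server_job("QDBS035V02") yields pool number 35 and job number 2.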

The topic "States of the transaction for two-phase commitment control" shows the actions that the system takes, depending on the state of the transaction when the failure occurred. For two states, PRP and LAP, the system action is in doubt.

Notes:

The following applies only to commitment definitions with job-scoped locks.
The transaction manager recovers commitment definitions associated with XA transactions (whether their locks are job-scoped or transaction-scoped) using XA APIs, not the resynchronization process described in this topic.

The system cannot determine what to do until it performs resynchronization with the other locations that participated in the transaction. This resynchronization is performed after the IPL or vary on operation completes.

The system uses the database server jobs to perform this resynchronization. The commitment definitions that need to be recovered are associated with the database server jobs. During the IPL, the system acquires all record locks and other object locks that were held by the commitment definition before the system ended. These locks are necessary to protect the local commitment resources until resynchronization is complete and the resources can be committed or rolled back.

Messages are sent to the job logs of the database server jobs to indicate the status of resynchronization with the remote locations. If the transaction is in doubt, resynchronization must be completed with the location that owns the decision for the transaction before local resources can be committed or rolled back.

When the decision for a transaction is made, the following messages might be sent to the job log for the database server job.

CPI8351: &1 pending changes being rolled back
CPC8355: Post-IPL recovery of commitment definition &8 for job &19/&18/&17 completed.
CPD835F: IPL recovery of commitment definition &8 for job &19/&18/&17 failed.
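If you save the job log of a database server job as plain text, you can scan it for the three message IDs listed above. The following Python sketch assumes that each message ID appears literally in the text. The function name, the labels, and the input format are assumptions made for this example and do not reflect an actual job log layout.

```python
import re

# Message IDs quoted in this topic, each mapped to a short illustrative label.
RECOVERY_MESSAGES = {
    "CPI8351": "pending changes being rolled back",
    "CPC8355": "recovery completed",
    "CPD835F": "recovery failed",
}

_MSG_ID = re.compile(r"\b(CPI8351|CPC8355|CPD835F)\b")


def summarize_recovery_messages(job_log_text):
    """Return (message_id, label) pairs in order of appearance.

    job_log_text is assumed to be plain text that contains the
    message IDs verbatim; hypothetical helper for illustration only.
    """
    return [
        (m.group(1), RECOVERY_MESSAGES[m.group(1)])
        for m in _MSG_ID.finditer(job_log_text)
    ]
```

A CPD835F entry in the result indicates a failed recovery of a commitment definition, which is worth following up in the QSYSOPR message queue.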

Other messages related to the recovery can also be sent. These messages are sent to the history (QHST) log. If errors occur, messages are also sent to the QSYSOPR message queue.

You can determine the progress of the recovery by using System i® Navigator, by displaying the job log for the database server job, or by using the Work with Commitment Definitions (WRKCMTDFN) command. Although System i Navigator and the Work with Commitment Definitions display let you force the system to commit or roll back, do this only as a last resort. If you expect that all of the locations that participated in the transaction will eventually return to operation, allow the systems to resynchronize themselves. This ensures the integrity of your databases.