- Collect and analyze logrec error records.
Check all logrec error records
related to the abended task. Determine if any records show an earlier
system problem; if so, continue diagnosis with that problem. Because
of recovery and percolation, a SYSABEND or SYSUDUMP dump can be the
end result of an earlier system problem.
- Collect and analyze messages about the problem. Use time
stamps to select messages related to the problem:
- The job log
- The system log (SYSLOG) or operations log (OPERLOG)
Check the messages for earlier dumps written while the abended
task was running. Determine if these earlier dumps indicate an earlier
system problem; if so, continue diagnosis with that problem.
- Analyze the dump, as described in the following steps.
Note: Between the problem and the dump, recovery tries to reconstruct
erroneous control block chains before ending the task. If the problem proves
to be in a system component, a SYSABEND or SYSUDUMP dump cannot be
used to isolate it because of these recovery actions; these dumps are
useful only for problems in the private area.
- Obtain the abend code, reason code, job name, step name, and
program status word (PSW) from the dump title at the beginning
of the dump.
If the completion code is USER=dddd, an application
program issued an ABEND macro to request the dump and to specify the
completion code.
If the completion code is SYSTEM=hhh, a system
component ended the application program and a recovery routine in
the program requested the dump. The application program probably caused
the abend.
Reference: See z/OS MVS System Codes for
an explanation of the abend code.
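The distinction between the two completion code forms can be sketched in a few lines of Python. This is only an illustration: the function name `classify_completion_code` and the returned dictionary are hypothetical, and the sketch assumes the completion code has already been copied out of the dump title as a string of the form USER=dddd (decimal) or SYSTEM=hhh (hexadecimal). The `label` value uses the common shorthand of a U or S prefix.

```python
import re


def classify_completion_code(field):
    """Classify a completion code written as USER=dddd or SYSTEM=hhh.

    Hypothetical helper: the input is assumed to be text copied by hand
    from the dump title, not a field parsed from a real dump format.
    """
    match = re.fullmatch(r"\s*(USER|SYSTEM)\s*=\s*([0-9A-Fa-f]+)\s*", field)
    if match is None:
        raise ValueError(f"unrecognized completion code: {field!r}")
    kind, digits = match.groups()
    if kind == "USER":
        # User codes are decimal; the application issued an ABEND macro.
        if not digits.isdigit():
            raise ValueError(f"user completion code must be decimal: {field!r}")
        return {"kind": "user", "code": int(digits), "label": f"U{int(digits):04d}"}
    # System codes are hexadecimal; a system component ended the program.
    return {
        "kind": "system",
        "code": int(digits, 16),
        "label": f"S{digits.upper().zfill(3)}",
    }


print(classify_completion_code("SYSTEM=0C4")["label"])
print(classify_completion_code("USER=0016")["label"])
```

The label for a system code is the value to look up in z/OS MVS System Codes; a user code is defined by the application that issued it.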
- Analyze the RTM2WA, as follows:
- In the TCB summary, find the task control block (TCB) for the
failing task. This TCB has the abend code as its completion code
in the CMP field. In the TCB summary, obtain the address of the recovery
termination manager 2 (RTM2) work area (RTM2WA) for the TCB.
- In the RTM2WA summary, obtain the registers at the time of the
error and the name and address of the abending program.
- If the RTM2WA summary does not give the abending program name
and address, an SVC instruction probably ended abnormally.
- If the RTM2WA summary gives a previous RTM2WA for recursion, the
abend for this dump occurred while an ESTAE or other recovery routine
was processing another, original abend. In recursive abends, more
than one RTM2WA may be created. Use the previous RTM2WA to diagnose
the original problem.
For information about the RTM2WA, SDWA, and TCB data areas,
see z/OS® MVS™ Data Areas in the z/OS Internet library.
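Following a chain of recursive RTM2WAs back to the original abend can be pictured as a simple pointer walk. The sketch below is illustrative only: it assumes the RTM2WA summaries have been transcribed by hand into a dictionary keyed by work area address, and the `previous` key and the function name `original_rtm2wa` are hypothetical names, not fields of the RTM2WA data area.

```python
def original_rtm2wa(summaries, start):
    """Follow previous-RTM2WA pointers from `start` back to the original abend.

    `summaries` maps an RTM2WA address to a dict whose hypothetical
    "previous" key holds the address of the previous RTM2WA, or None.
    """
    seen = set()
    current = start
    while summaries[current].get("previous"):
        if current in seen:
            # Guard against a transcription error that creates a loop.
            raise ValueError("RTM2WA chain loops back on itself")
        seen.add(current)
        current = summaries[current]["previous"]
    return current


# Hand-transcribed example: the abend 0C4 occurred during recovery for 0C1.
summaries = {
    0x7F6A10: {"abend": "0C4", "previous": 0x7F5B00},
    0x7F5B00: {"abend": "0C1", "previous": None},
}
first = original_rtm2wa(summaries, 0x7F6A10)
print(hex(first), summaries[first]["abend"])
```

The work area returned by the walk is the one to use for the registers and program name of the original problem.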
- Analyze the dump for the program name. Obtain the program
name from the RTM2WA summary. If the name field is zero, do the following:
- Locate the failing program module in the hexadecimal dump.
- Find the instruction that caused the abend.
The PSW
in the dump header is from the time of the error. Obtain the address
in the right half of the PSW. The leftmost digit denotes addressing
mode and is not part of the address.
For most problems, subtract
the instruction length in the ILC field of the dump header from the
PSW address to obtain the address of the failing instruction. Do
not subtract the instruction length in the following cases; the failing
instruction is at the PSW address.
- Page translation exception.
- Segment translation exception.
- Vector operation interruption.
- Other interruptions for which the processing of the instruction
identified by the old PSW is nullified. See z/Architecture® Principles of Operation for
the interruption action.
- An error that occurred while access registers were in use,
so that the access list entry token (ALET) may be incorrect.
Subtract the failing module address from the failing
instruction address. Use this offset to find the matching instruction
in the abending program's assembler listing.
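The address arithmetic in this step can be checked with a short Python sketch. It makes several assumptions that are not stated in the text: the right half of the PSW is treated as a 32-bit word whose high-order bit is the addressing mode indicator, the ILC value is given in bytes, and the `nullified` flag stands for any of the cases listed above in which the instruction length is not subtracted. The function name `locate_failing_instruction` is hypothetical.

```python
def locate_failing_instruction(psw_address_word, ilc, module_address, nullified=False):
    """Return (failing instruction address, offset into the module).

    Assumptions: `psw_address_word` is the right half of the PSW as a
    32-bit value, and `ilc` is the instruction length in bytes.
    """
    # Assumption: the high-order bit is the addressing mode, not address.
    psw_address = psw_address_word & 0x7FFFFFFF
    # For nullified interruptions the PSW already points at the instruction.
    failing = psw_address if nullified else psw_address - ilc
    offset = failing - module_address
    if offset < 0:
        raise ValueError("failing instruction is below the module address")
    return failing, offset


failing, offset = locate_failing_instruction(0x80012346, 2, 0x00012000)
print(hex(failing), hex(offset))  # prints 0x12344 0x344
```

The offset is the value to look up in the location counter column of the assembler listing.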
- For an abend from an SVC or system I/O routine, find the last
program instruction.
If the abend occurred in a system component
running on behalf of the dumped program, find the last instruction
that ran in the program, as follows:
- For an abend from an SVC routine, look in the last PRB in the
control blocks for the task being dumped. The right half of the PSW
in the RTPSW1 field contains the address of the instruction following
the SVC instruction.
- For an abend from a system I/O routine, look in the save area
trace. This trace gives the address of the I/O routine branched to.
The return address in that save area is the last instruction that
ran in the failing program.
- For an abend from an SVC or system I/O routine, determine the
cause of the abend, using the following:
- For an abend from an SVC, look in the system trace table for SVC
entries matching the SVRBs in the control blocks for the task being
dumped.
- For an abend from an I/O routine, look in the system trace table
for I/O entries issued from addresses in the failing program. The
addresses are in the PSW ADDRESS column.
If SVC entries match the dumped blocks or the I/O entries
were issued from the failing program, the system trace table was not
overlaid between the problem and the dump.
In this case, start
with the most recent entries at the end of the trace. Back up to
the last SVC entry with the TCB address of the abending task. From
that entry, read forward toward the end of the trace, looking for
indications of the problem.
See System trace for more information.
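The backward-then-forward scan of the trace can be sketched as follows. The record layout is hypothetical: each trace entry is assumed to have been transcribed into a dictionary with `type` and `tcb` keys, which does not reflect the actual format of the system trace table.

```python
def entries_from_last_svc(trace, tcb):
    """Return the trace entries from the last SVC entry for `tcb` to the end.

    `trace` is a list of hand-transcribed entries in chronological order;
    the "type" and "tcb" keys are hypothetical names for this sketch.
    """
    # Back up from the most recent entry to the last SVC entry for the task.
    for index in range(len(trace) - 1, -1, -1):
        entry = trace[index]
        if entry["type"] == "SVC" and entry["tcb"] == tcb:
            # Read forward from that entry to the end of the trace.
            return trace[index:]
    return []


trace = [
    {"type": "SVC", "tcb": 0x8FD0A8},
    {"type": "I/O", "tcb": 0x8FE240},
    {"type": "SVC", "tcb": 0x8FD0A8},
    {"type": "PGM", "tcb": 0x8FD0A8},
]
for entry in entries_from_last_svc(trace, 0x8FD0A8):
    print(entry["type"], hex(entry["tcb"]))
```

An empty result means that no SVC entry for the task remains in the trace, which suggests the table was overlaid between the problem and the dump.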
- For a program interrupt, determine the cause of the abend,
using the registers at the time of the error in the RTM2WA and in
the SVRB following the PRB for the abending program.
Also, look
at the formatted save area trace for input to the failing module.
- For an abend in a cross memory environment, do the following
to analyze the dump.
Many services are requested by use of the
Program Call (PC) instruction, rather than by SVCs or SRBs. When
an abend is issued by the PC routine, the OPSW field in the RB contains
the instruction address of the PC routine that issued the abend.
The SVRB contains the registers of the PC routine.
Do the following
to look for the registers and PSW at the time the PC instruction was
issued:
- For a stacking PC, find the registers in the linkage stack. Any
entries on the linkage stack are before the RBs in the dump.
- For a basic PC, find the registers in the PCLINK stack. Any entries
on the PCLINK stack are after the RBs in the dump.
For a stacking PC, find the linkage stack entry that corresponds
to the RB/XSB for the program. The LSED field of the linkage stack
entry and the XSBLSCP field in the corresponding XSB have the same
value. From the linkage stack entry, obtain the registers and the
PSW at the time the stacking PC was issued. The address in the PSW
points to the instruction following the PC instruction in the abending
program.
For a basic PC, determine the caller from the PCLINK
stack. To locate the PCLINK stack element (STKE):
- The STKEs appear in the dump following all of the RBs. If the
dump contains more than one STKE, the pointer to the STKE for the
PC involved in the problem is in the XSBSTKE field of the XSB associated
with the RB for the abending program.
- The RBXSB field in the RB points to the XSB.
- The XSBSEL field in the XSB points to the current STKE.
In the STKE, the STKERET field contains the return address
of the caller of the PCLINK service.
For information about the
STKE and XSB data areas, see z/OS MVS Data Areas in the z/OS Internet library.