Analysis Procedure

To analyze a SYSABEND or SYSUDUMP, take the following steps:
  1. Collect and analyze logrec error records.

    Check all logrec error records related to the abended task. Determine if any records show an earlier system problem; if so, continue diagnosis with that problem. Because of recovery and percolation, a SYSABEND or SYSUDUMP dump can be the end result of an earlier system problem.

  2. Collect and analyze messages about the problem. Use time stamps to select the related messages from the following sources:
    • The job log
    • The system log (SYSLOG) or operations log (OPERLOG)

    Check the messages for earlier dumps written while the abended task was running. Determine if these earlier dumps indicate an earlier system problem; if so, continue diagnosis with that problem.
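
    The time-stamp selection above can be sketched as a simple window filter. This is a minimal illustration, assuming the job log and SYSLOG/OPERLOG messages have already been parsed into (timestamp, text) pairs; the function name, the window sizes, and the sample messages are hypothetical, not part of any z/OS tool.

```python
from datetime import datetime, timedelta

def messages_near(messages, abend_time,
                  before=timedelta(minutes=10), after=timedelta(minutes=1)):
    """Keep only the messages whose time stamp falls in a window around the abend.

    messages   -- iterable of (datetime, text) pairs (assumed already parsed)
    abend_time -- time of the abend, taken from the dump or the job log
    """
    start, end = abend_time - before, abend_time + after
    return [(ts, text) for ts, text in messages if start <= ts <= end]

# Illustrative data only.
abend = datetime(2024, 1, 1, 12, 0, 0)
log = [
    (datetime(2024, 1, 1, 11, 30, 0), "unrelated earlier message"),
    (datetime(2024, 1, 1, 11, 58, 0), "earlier dump written for the task"),
    (datetime(2024, 1, 1, 12, 0, 0), "abend message"),
]
related = messages_near(log, abend)
```

    Widening the `before` window is the usual adjustment when looking for earlier dumps written while the task was still running.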

  3. Analyze the dump, as described in the following steps.
    Note: After the problem and before the dump, recovery tried to reconstruct erroneous control block chains before ending the task. If the problem proves to be in a system component, a SYSABEND or SYSUDUMP dump cannot be used to isolate it because of the recovery actions; these dumps are useful only for problems in the private area.
  4. Obtain the abend code, reason code, job name, step name, and program status word (PSW) from the dump title at the beginning of the dump.

    If the completion code is USER=dddd, an application program issued an ABEND macro to request the dump and to specify the completion code.

    If the completion code is SYSTEM=hhh, a system component ended the application program and a recovery routine in the program requested the dump. The application program probably caused the abend.

    Reference: See z/OS MVS System Codes for an explanation of the abend code.
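
    The distinction between the two completion code forms can be sketched as follows. This is a minimal illustration that only looks for the USER=dddd and SYSTEM=hhh patterns described above; the helper name and the sample title strings are hypothetical, and the real dump title contains more fields than are shown here.

```python
import re

# USER=dddd is a decimal code; SYSTEM=hhh is a three-digit hexadecimal code.
_USER = re.compile(r"\bUSER=(\d{1,4})\b")
_SYSTEM = re.compile(r"\bSYSTEM=([0-9A-Fa-f]{3})\b")

def classify_completion_code(title):
    """Return ('user', n), ('system', n), or None if neither form is present."""
    m = _USER.search(title)
    if m:
        # The application issued an ABEND macro and chose this code itself.
        return ("user", int(m.group(1), 10))
    m = _SYSTEM.search(title)
    if m:
        # A system component ended the program; look the code up in System Codes.
        return ("system", int(m.group(1), 16))
    return None

# Illustrative title fragments only.
user_result = classify_completion_code("COMPLETION CODE USER=0100")
system_result = classify_completion_code("COMPLETION CODE SYSTEM=0C4")
```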

  5. Analyze the RTM2WA, as follows:
    • In the TCB summary, find the task control block (TCB) for the failing task. This TCB has the abend code as its completion code in the CMP field. In the TCB summary, obtain the address of the recovery termination manager 2 (RTM2) work area (RTM2WA) for the TCB.
    • In the RTM2WA summary, obtain the registers at the time of the error and the name and address of the abending program.
    • If the RTM2WA summary does not give the abending program name and address, an SVC routine probably ended abnormally.
    • If the RTM2WA summary gives a previous RTM2WA for recursion, the abend for this dump occurred while an ESTAE or other recovery routine was processing another, original abend. In recursive abends, more than one RTM2WA may be created. Use the previous RTM2WA to diagnose the original problem.

    For information about the RTM2WA, SDWA, and TCB data areas, see z/OS® MVS™ Data Areas in the z/OS Internet library.
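
    For recursive abends, following the chain of previous RTM2WAs back to the original one can be sketched as below. This is a minimal model, assuming each work area has been transcribed by hand into a dictionary keyed by its address; the key name "previous" and the sample addresses are hypothetical and do not correspond to actual RTM2WA field names.

```python
def original_rtm2wa(work_areas, start):
    """Follow 'previous' pointers from the RTM2WA at `start` to the first one.

    work_areas -- dict mapping RTM2WA address -> {"previous": address or 0}
    Returns the address of the RTM2WA that has no previous RTM2WA.
    """
    seen = set()
    current = start
    while True:
        if current in seen:
            # Guard against a damaged or mistyped chain that loops.
            raise ValueError("RTM2WA chain loops at %X" % current)
        seen.add(current)
        previous = work_areas[current].get("previous", 0)
        if not previous:
            return current
        current = previous

# Illustrative addresses only: the newest work area points back to two older ones.
areas = {
    0x7F6A00: {"previous": 0x7F5800},
    0x7F5800: {"previous": 0x7F4000},
    0x7F4000: {"previous": 0},
}
first = original_rtm2wa(areas, 0x7F6A00)
```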

  6. Analyze the dump for the program name. Obtain the program name from the RTM2WA summary. If the name field is zero, do the following:
    • Find the control blocks for the task being dumped.
    • The last request blocks are SVRBs. In the WLIC field in an SVRB, find the following SVC interruption codes:
      • X'33' for a SNAP SVC interruption
      • X'0C' for a SYNCH SVC interruption
    • The program request block (PRB) for the abending program immediately precedes these SVRBs.
    • When the dump contains more than one CDE, determine the first and last address for each CDE. The entry point address is the first address. Add the length to the entry point address to obtain the last address. Compare these addresses to the address in the right half of the PSW in the dump header; the PSW address falls between the first and last addresses of the correct CDE.

      Note that the leftmost digit in the PSW address denotes addressing mode and is not part of the address.

    • In that CDE, the NAME field gives the program name.
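
    The CDE comparison above can be sketched as a range check. This is a minimal illustration, assuming a 31-bit PSW whose high-order bit carries the addressing mode, and assuming the entry point and length of each CDE have been copied from the dump by hand; the Cde type, the function names, and the sample values are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence

ADDRESS_MASK = 0x7FFFFFFF  # assumption: 31-bit mode, high-order bit is the mode bit

class Cde(NamedTuple):
    name: str         # NAME field
    entry_point: int  # treated as the first address, as described above
    length: int       # module length in bytes

def find_cde(cdes: Sequence[Cde], psw_right_half: int) -> Optional[Cde]:
    """Return the CDE whose address range contains the PSW address, if any."""
    address = psw_right_half & ADDRESS_MASK
    for cde in cdes:
        first = cde.entry_point & ADDRESS_MASK
        last = first + cde.length
        # Inclusive upper bound: the PSW usually points past the failing instruction.
        if first <= address <= last:
            return cde
    return None

# Illustrative values only.
cdes = [Cde("PGMA", 0x00007000, 0x1000), Cde("PGMB", 0x00012000, 0x800)]
match = find_cde(cdes, 0x80012346)
```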
  7. Locate the failing program module in the hexadecimal dump.
  8. Find the instruction that caused the abend.

    The PSW in the dump header is from the time of the error. Obtain the address in the right half of the PSW. The leftmost digit denotes addressing mode and is not part of the address.

    For most problems, subtract the instruction length in the ILC field of the dump header from the PSW address to obtain the address of the failing instruction. Do not subtract the instruction length in the following cases; the failing instruction is at the PSW address.
    • Page translation exception.
    • Segment translation exception.
    • Vector operation interruption.
    • Other interruptions for which the processing of the instruction identified by the old PSW is nullified. See z/Architecture® Principles of Operation for the interruption action.
    • Interruptions that occurred while access registers were in use, in which case the access list entry token (ALET) may be incorrect.

    Subtract the failing module address from the failing instruction address. Use the resulting offset to find the matching instruction in the abending program's assembler listing.
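
    The arithmetic in this step can be sketched as follows. This is a minimal illustration, assuming a 31-bit PSW and assuming the ILC value in the dump header is the instruction length in bytes (2, 4, or 6); the function names and sample addresses are hypothetical.

```python
ADDRESS_MASK = 0x7FFFFFFF  # assumption: 31-bit mode, high-order bit is the mode bit

def failing_instruction_address(psw_right_half, ilc, nullified=False):
    """Return the address of the failing instruction.

    nullified -- True for the cases listed above (for example, page or segment
                 translation exceptions), where the PSW already points at the
                 failing instruction and nothing is subtracted.
    """
    address = psw_right_half & ADDRESS_MASK
    if nullified:
        return address
    if ilc not in (2, 4, 6):
        raise ValueError("unexpected instruction length: %r" % (ilc,))
    return address - ilc

def module_offset(instruction_address, module_address):
    """Offset of the instruction within the module, for use with the listing."""
    return instruction_address - (module_address & ADDRESS_MASK)

# Illustrative values only.
failing = failing_instruction_address(0x80012346, 4)
offset = module_offset(failing, 0x00012000)
```

    The offset is what is looked up in the location column of the assembler listing.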

  9. For an abend from an SVC or system I/O routine, find the last program instruction.
    If the abend occurred in a system component running on behalf of the dumped program, find the last instruction that ran in the program, as follows:
    • For an abend from an SVC routine, look in the last PRB in the control blocks for the task being dumped. The right half of the PSW in the RTPSW1 field contains the address of the instruction following the SVC instruction.
    • For an abend from a system I/O routine, look in the save area trace. This trace gives the address of the I/O routine branched to. The return address in that save area identifies the last instruction that ran in the failing program.
  10. For an abend from an SVC or system I/O routine, determine the cause of the abend, using the following:
    • For an abend from an SVC, look in the system trace table for SVC entries matching the SVRBs in the control blocks for the task being dumped.
    • For an abend from an I/O routine, look in the system trace table for I/O entries issued from addresses in the failing program. The addresses are in the PSW ADDRESS column.

    If SVC entries match the dumped blocks or the I/O entries were issued from the failing program, the system trace table was not overlaid between the problem and the dump.

    In this case, start with the most recent entries at the end of the trace. Back up to the last SVC entry with the TCB address of the abending task. Go toward the end of the trace, looking for indications of the problem. See System trace for more information.
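
    The backward-then-forward scan described above can be sketched as follows. This is a minimal illustration, assuming the system trace entries have been transcribed in order, oldest first, into simple records; the TraceEntry type, the kind labels, and the sample values are hypothetical and do not reproduce the real trace format.

```python
from typing import List, NamedTuple, Sequence

class TraceEntry(NamedTuple):
    kind: str         # hypothetical label such as "SVC", "I/O", or "PGM"
    tcb_address: int  # TCB address associated with the entry
    psw_address: int  # value from the PSW ADDRESS column

def entries_from_last_svc(trace: Sequence[TraceEntry],
                          tcb_address: int) -> List[TraceEntry]:
    """Back up to the last SVC entry for the task, then return it and all later entries."""
    for index in range(len(trace) - 1, -1, -1):
        entry = trace[index]
        if entry.kind == "SVC" and entry.tcb_address == tcb_address:
            return list(trace[index:])
    return []  # no SVC entry for this task; the trace may have been overlaid

# Illustrative entries only.
trace = [
    TraceEntry("SVC", 0x8FE0A8, 0x12100),
    TraceEntry("I/O", 0x8FD000, 0x45000),
    TraceEntry("SVC", 0x8FE0A8, 0x12340),
    TraceEntry("PGM", 0x8FE0A8, 0x12346),
]
to_review = entries_from_last_svc(trace, 0x8FE0A8)
```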

  11. For a program interrupt, determine the cause of the abend, using the registers at the time of the error in the RTM2WA and in the SVRB following the PRB for the abending program.

    Also, look at the formatted save area trace for input to the failing module.

  12. For an abend in a cross memory environment, do the following to analyze the dump.

    Many services are requested by use of the Program Call (PC) instruction, rather than by SVCs or SRBs. When an abend is issued by the PC routine, the OPSW field in the RB contains the instruction address of the PC routine that issued the abend. The SVRB contains the registers of the PC routine.

    Do the following to look for the registers and PSW at the time the PC instruction was issued:
    • For a stacking PC, find the registers in the linkage stack. Any entries on the linkage stack are before the RBs in the dump.
    • For a basic PC, find the registers in the PCLINK stack. Any entries on the PCLINK stack are after the RBs in the dump.

    For a stacking PC, find the linkage stack entry that corresponds to the RB/XSB for the program. The LSED field of the linkage stack entry and the XSBLSCP field in the corresponding XSB have the same value. From the linkage stack entry, obtain the registers and the PSW at the time the stacking PC was issued. The address in the PSW points to the instruction following the PC instruction in the abending program.
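
    Matching a linkage stack entry to its XSB can be sketched as a lookup on the shared value. This is a minimal model, assuming the LSED and XSBLSCP values have been copied from the dump into dictionaries; the function name, the "psw" key, and the sample values are hypothetical.

```python
from typing import Optional, Sequence

def matching_linkage_stack_entry(entries: Sequence[dict], xsb: dict) -> Optional[dict]:
    """Return the linkage stack entry whose LSED equals the XSBLSCP value in the XSB."""
    for entry in entries:
        if entry["LSED"] == xsb["XSBLSCP"]:
            return entry
    return None

# Illustrative values only.
stack = [
    {"LSED": 0x7F5E0130, "psw": 0x80012100},
    {"LSED": 0x7F5E01D0, "psw": 0x80012346},
]
entry = matching_linkage_stack_entry(stack, {"XSBLSCP": 0x7F5E01D0})
```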

    For a basic PC, determine the caller from the PCLINK stack. To locate the PCLINK stack element (STKE):
    • The STKEs appear in the dump following all of the RBs. If the dump contains more than one STKE, the pointer to the STKE for the PC involved in the problem is in the XSBSTKE field of the XSB associated with the RB for the abending program.
    • The RBXSB field in the RB points to the XSB.
    • The XSBSTKE field in the XSB points to the current STKE.

    In the STKE, the STKERET field contains the return address of the caller of the PCLINK service.
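
    The pointer chain from the RB to the caller's return address can be sketched as follows. This is a minimal model, assuming the RB, XSB, and STKE fields named above have been copied from the dump into dictionaries keyed by control block address; the function name and the sample addresses are hypothetical.

```python
def pclink_caller_return_address(rb, xsbs, stkes):
    """Follow RBXSB -> XSBSTKE -> STKERET and return the caller's return address.

    rb    -- dict holding the RBXSB field of the RB for the abending program
    xsbs  -- dict mapping XSB address -> dict with an XSBSTKE field
    stkes -- dict mapping STKE address -> dict with a STKERET field
    """
    xsb = xsbs[rb["RBXSB"]]           # the RB points to its XSB
    stke = stkes[xsb["XSBSTKE"]]      # the XSB points to the current STKE
    return stke["STKERET"]            # return address of the PCLINK caller

# Illustrative addresses only.
rb = {"RBXSB": 0x7FF100}
xsbs = {0x7FF100: {"XSBSTKE": 0x7FE200}}
stkes = {0x7FE200: {"STKERET": 0x80012346}, 0x7FE300: {"STKERET": 0x80099000}}
caller = pclink_caller_return_address(rb, xsbs, stkes)
```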

    For information about the STKE and XSB data areas, see z/OS MVS Data Areas in the z/OS Internet library.