- Trace the loop.
Loop problems might involve many
modules or a single module. If possible, trace the looping instructions.
Using the operator's reference for your host processor, instruction-step
through the looping addresses. Save these addresses for use in diagnosing
the problem.
Take a dump and determine which module is looping
by checking the PSW addresses in the CLKC entries for a repeating
pattern.
If the VTAM internal trace (VIT) was running when the loop started, look
for any exception conditions that might have led to the loop. If the
VIT was not running, you might have to re-create the problem
to get the trace at the time of the loop. Set the VIT to
MODE=EXT to record the trace entries in an external file.
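The check for a repeating pattern of PSW addresses can be sketched in Python. This is only an illustration, not a VTAM or IPCS tool: it assumes the PSW addresses have already been copied out of the CLKC entries into a list of integers, and the function names and the `min_repeats` threshold are hypothetical.

```python
from collections import Counter


def repeating_psw_addresses(addresses, min_repeats=3):
    """Return (address, count) pairs for addresses seen at least min_repeats times."""
    counts = Counter(addresses)
    return sorted((addr, n) for addr, n in counts.items() if n >= min_repeats)


def loop_extent(addresses, min_repeats=3):
    """Return (lowest, highest) repeating address, or None if nothing repeats."""
    hot = [addr for addr, _ in repeating_psw_addresses(addresses, min_repeats)]
    return (min(hot), max(hot)) if hot else None


if __name__ == "__main__":
    # Hypothetical addresses, as if read from successive CLKC entries.
    sample = [0x1000, 0x2A10, 0x2A1C, 0x2A24, 0x2A10, 0x2A1C,
              0x2A24, 0x2A10, 0x2A1C, 0x2A24, 0x3000]
    for addr, n in repeating_psw_addresses(sample):
        print(f"{addr:08X} seen {n} times")
    print("extent:", loop_extent(sample))
```

Addresses that recur are only candidates. Some VTAM processes repeat periodically, so confirm the result against the dump before you treat the range as the loop.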
- Get dump output.
To get a dump of VTAM, issue the DUMP command, or press the Program
Restart key.
If the loop is disabled, the system console is
not available for input, so take a stand-alone dump. (See Stand-alone dump.)
- Get the system console log and LOGREC output.
The system console log might contain information, such as error messages,
that can help you diagnose the problem. Also, print the LOGREC file.
Use the LOGDATA option to print the in-core LOGREC buffers. See Table 1 to determine which document has
information on LOGDATA.
- Is a message involved?
Determine whether there are
any messages associated with the loop, such as a particular message
always preceding the problem, or the same message being issued repeatedly.
If so, add the message numbers to your problem documentation and go
to the message procedure, step 4.
- Is it a device error?
For any device error, first check the NetView report (if you have the NetView program)
and then the LOGREC output.
Does the LOGREC output show repetitive
entries for the same error on a particular device? If so, VTAM is receiving several different
errors from that device.
- If the LOGREC error records are for a link or link
station attached to a communication controller, get VIT PIU records
and an I/O trace of the NCP. If you have the NetView program,
get session trace data or session awareness data for the NCP. If the
error records are for a link or device attached to a communication
adapter, get VIT PIU records or a dynamic trace of the communication
adapter.
If the trace shows continual arrival of RECMS PIUs, then
the repetitive entries in LOGREC are caused by a device error.
- For channel-attached devices, use one or more of the
following traces for the device to determine whether VTAM is receiving many errors:
- VTAM internal trace with
CIO option
- Session trace data (if using the NetView program)
- Session awareness data (if using the NetView program)
- CCWTRACE (if available)
- Many errors received?
If VTAM is receiving many errors, the problem is
probably in the device. Run a CIO VIT trace to trace execution of
the VTAM ERP routines. Then
continue with step 7.
- Is the loop traced?
If you were able to instruction-step
through the loop, go to step 15;
otherwise, continue with step 8.
- Find the failing module.
Use the PSW to find the
failing module.
- The PSW is found in LOGREC output, the SDWA, or the RTM2WA.
When
you use PSW RESTART to terminate a looping task, a LOGREC entry is
created with a completion code of X'071' for the task. An
RTM2WA is also created for the task. Use the LOGREC record and the
RTM work area to locate the failing module. See the diagnostic books
listed in "Bibliography" for your operating system for help in locating
the PSW in dump output.
Depending on the setting of PSW bit 32, the last
3 bytes (24-bit mode) or 4 bytes (31-bit mode) of the PSW contain
the address being executed at the time of the dump. Scan the dump
output to find the address given in the PSW. See Table 1 to determine
which document contains more information on PSWs.
Note: Addresses might not always be in
numeric order because the dump does not always generate output in
sequential order.
If you cannot find the address, the
dump might not contain the relevant portion of main storage. For example,
the address might be in LPA storage. Have this portion of storage
dumped, or use output from LPAMAP to identify the module, and proceed
as above.
Note: The VTAMMAP VTFNDMOD formatted dump tool can be used to
gather the module information described in steps 9, 10, and 11.
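The address extraction described in this step can be illustrated with a short Python sketch. It assumes an 8-byte PSW written as 16 hexadecimal digits, with bits numbered 0 to 63 from the left as in the text above. The function name is hypothetical, and the sketch does not handle the 16-byte z/Architecture PSW format.

```python
def psw_instruction_address(psw_hex):
    """Return (addressing_mode, address) from an 8-byte PSW given as hex text."""
    digits = psw_hex.replace(" ", "")
    if len(digits) != 16:
        raise ValueError("expected 16 hex digits (8-byte PSW)")
    psw = int(digits, 16)
    # Bit 32 (counting from bit 0 on the left) is the high-order bit of the
    # second word, which is bit position 31 counting from the right.
    if (psw >> 31) & 1:
        # 31-bit mode: address is in the last 4 bytes, minus the mode bit.
        return 31, psw & 0x7FFFFFFF
    # 24-bit mode: address is in the last 3 bytes.
    return 24, psw & 0x00FFFFFF


if __name__ == "__main__":
    # Hypothetical PSW values, for illustration only.
    for text in ("070C0000 8123ABCD", "070C0000 0023ABCD"):
        mode, addr = psw_instruction_address(text)
        print(f"{text}: {mode}-bit mode, address {addr:08X}")
```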
- Find the module name that contains the failing address.
VTAM identifies modules with an
EBCDIC module name and the Julian date (and, if appropriate, the latest
PTF applied) at or near the beginning of most modules. This module
identifier is usually in the form:
ISTxxxxx yy.ddd [nnnnnnn]
where xxxxx is
the last five characters of the module name, yy.ddd is the
Julian date the module was assembled, and nnnnnnn is the latest
PTF (if any) that has been applied to this module.
To find
the module ID, start at the failing address and scan upward (in descending
address order) along the right side of the dump listing. The module
ID is printed in EBCDIC. Add the module name to your documentation
list.
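The upward scan for the module identifier can be sketched in Python as follows. This is a minimal illustration, not the VTFNDMOD tool. It assumes the relevant storage is available as raw bytes, that the identifier follows the usual `ISTxxxxx yy.ddd [nnnnnnn]` form, and that the characters in the module name and PTF are uppercase letters or digits. The function name, the regular expression, and the sample module name `ISTDEMO1` are hypothetical.

```python
import re

# Assumed shape of the identifier: name, Julian date, optional 7-character PTF.
MODULE_ID = re.compile(
    r"(IST[A-Z0-9]{5}) ([0-9]{2}\.[0-9]{3})(?: ([A-Z0-9]{7}))?"
)


def find_module_id(storage, failing_offset):
    """Find the nearest module identifier at or before failing_offset.

    storage is a bytes object holding EBCDIC data; offsets are relative
    to the start of that buffer.
    """
    # cp037 is a one-byte-per-character EBCDIC codec, so text offsets
    # match byte offsets in the buffer.
    text = storage.decode("cp037", errors="replace")
    best = None
    for match in MODULE_ID.finditer(text):
        if match.start() > failing_offset:
            break
        best = match
    if best is None:
        return None
    return {
        "offset": best.start(),
        "name": best.group(1),
        "date": best.group(2),
        "ptf": best.group(3),
    }


if __name__ == "__main__":
    # Hypothetical storage: padding, an identifier, then filler bytes.
    ident = "ISTDEMO1 21.123 UA12345".encode("cp037")
    storage = b"\x00" * 16 + ident + b"\x47\xF0" * 20
    print(find_module_id(storage, failing_offset=60))
```

Because not every module carries the identifier in this exact form, treat a missing match as inconclusive rather than as proof that the address is outside VTAM.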
- Find the module pointed to by register 12.
General
register 12 (X'0C') is normally the base register for VTAM modules. In a VTAM loop, register 12 should point to the same
module found in step 9. If not,
add this module name to your documentation list.
- Find the module pointed to by register 14.
General
register 14 (X'0E') might point to a module that called the
routine that is looping. Add this module name to your documentation
list.
Add the module names from steps 9, 10, and 11 to your documentation list. You can
report the problem next, but you might need to continue with step 12.
- Get the system trace output.
The system trace might
show many external and I/O interrupts. The PSW addresses in system
trace entries will be part of the loop.
- Get the VIT output.
The VIT is useful in determining
the reason for a loop, such as a process being continually redispatched
for the same request. Get the VIT output. If you require VIT options
in addition to the default options (API, CIO, MSG, NRM, PIU, PSS,
SMS, and SSCP), start a VIT in addition to the default and specify
MODE=EXT. If VTAM does not
accept the command, it might be necessary to re-create the problem.
For more information about using the VIT, see z/OS Communications Server: SNA Diagnosis Vol 2,
FFST Dumps and the VIT.
- Examine the trace entries.
By examining all of the
trace entries, you might be able to determine whether there is a loop.
The most obvious loops would be a module or modules getting continual
control of the VTAM system,
or a control block chaining to itself. Check the output of the PSS
option to see which VTAM routines
are getting control. If you see a pattern of repetition in the trace
entries, it does not necessarily mean that VTAM is looping. Some VTAM processes are timer-driven and repeat periodically.
Note:
- Get the trace information and examine the clock comparative entries
for repeating PSW addresses. For short loops, the repeating PSWs show
the extent of the loop.
- The absence of any apparent loop does not necessarily mean that VTAM is not looping. The
loop might not contain a VTAM trace point.
If a module or modules are looping, get their addresses
from the trace entries. Step 15 explains
how to find the module name.
If you find a control block chained
to itself, or if a queue of control blocks is in a cycle, try to identify
the control block. Most control blocks have a 1-byte ID at offset X'00'.
See the control block ID codes in Storage and control block ID codes to
identify the control block name.
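The check for a control block chained to itself, or a queue of control blocks in a cycle, can be sketched in Python with a simple visited-set walk. The sketch assumes the forward pointers have already been read from the dump into a dictionary that maps each control block address to the address of the next block, with 0 or a missing entry marking the end of the chain. The function name and the sample addresses are hypothetical.

```python
def find_chain_cycle(next_ptr, head):
    """Follow a chain from head and return the addresses that form a cycle.

    Returns an empty list if the chain ends normally.
    """
    position = {}   # address -> index in the order it was first visited
    order = []
    addr = head
    while addr:
        if addr in position:
            # Revisited a block: everything from its first visit onward loops.
            return order[position[addr]:]
        position[addr] = len(order)
        order.append(addr)
        addr = next_ptr.get(addr, 0)
    return []


if __name__ == "__main__":
    # Hypothetical chains, for illustration only.
    queue_in_cycle = {0x100: 0x200, 0x200: 0x300, 0x300: 0x200}
    self_chained = {0x500: 0x500}
    print([hex(a) for a in find_chain_cycle(queue_in_cycle, 0x100)])
    print([hex(a) for a in find_chain_cycle(self_chained, 0x500)])
```

The returned addresses are the blocks to identify by their ID byte at offset X'00'.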
- Find the module names.
Note: You can also use the VTAMMAP
VTFNDMOD formatted dump tool to find the module ID. See
VTFNDMOD.
Use the addresses found
in step 14 to find the module names
involved in the loop.
To find the module ID, start at the failing
address and scan upward (in descending address order) along the right
side of the dump listing. The module ID is printed in EBCDIC. Add
this module ID to your documentation list. Continue with step 16.
- Report or go to the failing module procedure.
If
you determined the module names, go to Failing module.
Otherwise, you are ready to contact IBM®.
Go to Reporting the problem to IBM.