z/OS JES2 Initialization and Tuning Guide


Health monitor processing

SA32-0991-00

Health monitor processing includes the following steps:
  1. Sampler processing: JES2 begins by sampling. Twenty times per second, it examines all JES2 main task control blocks (TCBs) and request blocks (RBs), looking for main task waits, loops, and other processing delays. It also records all unexpected MVS™ waits. In addition, the monitor examines resource usage once per second and tracks low, high, and average usage. It resets this usage data at the beginning of each new hour and maintains a 72-hour usage history for the same set of resources that $HASP050 reports. See z/OS JES2 Messages for that resource list. This baseline and history feed probe processing, where JES2 can distinguish typical from abnormal processing trends.
  2. Probe processing: With up to 72 hours of processing data available, JES2 compares current data to that baseline to determine when problems arise. The longer a specific condition lasts, the more closely JES2 monitors it, and JES2 begins to inform the operator of problem situations. JES2 also groups and tracks main task events and checkpoint-lock-held events separately.
  3. Operator notification messaging: The monitor issues messages based on how long a particular event lasts. Generally, the longer a potential problem lasts, the higher the notification level JES2 assigns it and the more messages it sends to the operator. JES2 notifies the operator through $HASP9nnn messages as it detects events. Table 1 lists the message types and the information each type provides.
    Table 1. JES2 health monitor message types (as determined by time)
    Time range * | Message type | JES2 interpretation and handling
    0 - n seconds | Notice | JES2 considers the condition within "normal" parameters (or just begun) and ignores it. JES2 gathers such information and displays it in response to $JDJES and $JDSTATUS commands. Examples include events such as JES2 TERMINATING, CKPT RECONFIGURATION IN PROGRESS, and NOT ALL SPOOL VOLUMES ARE AVAILABLE. Although these messages are also issued elsewhere, outside the JES2 monitor message range ($HASP9nnn), they are collected in one place to help the operator get a more complete view of JES2 status.
    n - x seconds | Tracking | JES2 starts tracking the event. The $JD JES command displays both real and potential problem events.
    x+ seconds | Alert | JES2 issues an alert message. The situation might eventually require operator involvement to correct.
    +30 or +120 seconds | Alert DOM'd | JES2 reissues the alert message with updated information and continues to do so until the condition has cleared.
    n/a | "All Clear" | JES2 issues $HASP9301 JES2 MAIN TASK ALERTS CLEARED.
    * The n and x values vary because they are based on "normal" timing for the specific event. JES2 sets them dynamically for each process by comparing against the 72-hour historical data that the health monitor maintains.
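
The time-based escalation in Table 1 can be illustrated with a minimal Python sketch. It is not JES2 code: the function name classify, the Level enumeration, and the numeric thresholds are hypothetical, and the treatment of the exact boundaries (whether n and x themselves count as Tracking or Alert) is an assumption. JES2 derives n and x dynamically for each event and does not expose them as parameters.

```python
from enum import Enum


class Level(Enum):
    """Message types from Table 1, in escalation order."""
    NOTICE = 1
    TRACKING = 2
    ALERT = 3


def classify(duration_s: float, n: float, x: float) -> Level:
    """Map how long a condition has lasted to a Table 1 message type.

    n and x are hypothetical per-event thresholds in seconds. JES2 sets
    the real values dynamically; they are plain arguments here only to
    keep the illustration self-contained.
    """
    if duration_s < 0 or not 0 <= n <= x:
        raise ValueError("expected duration_s >= 0 and 0 <= n <= x")
    if duration_s < n:
        return Level.NOTICE    # 0 - n seconds: treated as normal
    if duration_s < x:
        return Level.TRACKING  # n - x seconds: the event is tracked
    return Level.ALERT         # x+ seconds: an alert is issued
```

With illustrative thresholds n=5 and x=30, a condition that has lasted 1 second is classified as NOTICE, one that has lasted 8 seconds as TRACKING, and one that has lasted 45 seconds as ALERT.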

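The resource-usage bookkeeping described under sampler processing (one sample per second, low/high/average values, a reset at each new hour, and a 72-hour history) can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the names HourStats and UsageHistory are hypothetical, the sketch tracks a single resource, and it assumes the 72 hours cover completed hours only. The text does not describe how JES2 stores this data internally.

```python
from collections import deque
from dataclasses import dataclass

HISTORY_HOURS = 72  # from the text: JES2 keeps a 72-hour usage history


@dataclass
class HourStats:
    """Low, high, and running total for one hour of samples."""
    low: float
    high: float
    total: float
    count: int

    @property
    def average(self) -> float:
        return self.total / self.count


class UsageHistory:
    """Hypothetical per-resource tracker; not a JES2 data structure."""

    def __init__(self) -> None:
        # deque(maxlen=...) silently drops the oldest hour when full.
        self.history = deque(maxlen=HISTORY_HOURS)
        self.current = None       # statistics for the hour in progress
        self.current_hour = None  # index of the hour in progress

    def record(self, timestamp_s: int, value: float) -> None:
        """Record one usage sample taken at timestamp_s (in seconds)."""
        hour = timestamp_s // 3600
        if self.current is not None and hour != self.current_hour:
            # A new hour has begun: archive the old hour and reset.
            self.history.append(self.current)
            self.current = None
        if self.current is None:
            self.current = HourStats(value, value, value, 1)
            self.current_hour = hour
        else:
            stats = self.current
            stats.low = min(stats.low, value)
            stats.high = max(stats.high, value)
            stats.total += value
            stats.count += 1
```

A bounded deque is used so that the oldest hour ages out automatically once 72 completed hours are held, which keeps the retention rule in one place.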




Copyright IBM Corporation 1990, 2014