The enqueue request rate check issues an exception for the following types of comparisons:
After the PFA_ENQUEUE_REQUEST_RATE check issues an exception, it does not perform the next comparison type. To avoid skewing the enqueue request rate, PFA ignores the first hour of enqueue data after IPL and the last hour of enqueue data prior to shutdown. In addition, PFA attempts to track the same persistent address spaces that it tracked prior to IPL or PFA restart if the same persistent address spaces are still active. Read the topic about persistent jobs in PFA_MESSAGE_ARRIVAL_RATE to understand how the PFA_ENQUEUE_REQUEST_RATE check determines the top twenty persistent jobs.
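The one-hour trimming described above can be pictured with a minimal Python sketch. This is an illustration only, not PFA's implementation: the function name `trim_samples`, the `(timestamp, rate)` sample layout, and the inclusive boundary handling are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def trim_samples(samples, ipl_time, shutdown_time, margin=timedelta(hours=1)):
    """Drop samples from the first hour after IPL and the last hour before shutdown.

    samples is a list of (timestamp, rate) tuples (a hypothetical layout).
    Samples exactly on either boundary are kept (an assumption).
    """
    start = ipl_time + margin
    end = shutdown_time - margin
    return [(ts, rate) for ts, rate in samples if start <= ts <= end]
```

With an IPL at 08:00 and a shutdown at 20:00, only samples taken between 09:00 and 19:00 would remain in the list.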
By default, an EXCLUDED_JOBS file containing the address spaces NETVIEW and *MASTER* on all systems is created during installation. Therefore, if you have not made any modifications to the EXCLUDED_JOBS file, these jobs are excluded. See Using and configuring supervised learning for more information.
Guidelines
Parameter name | Default value | Minimum Value | Maximum Value | Description |
---|---|---|---|---|
collectint | 1 Minute | 1 | 360 | This parameter determines how often (in minutes) to run the data collector that retrieves the current enqueue request rate. |
modelint | 720 Minutes | 60 | 1440 | This parameter determines how often (in minutes) you want the system to analyze the data and construct a new enqueue request rate model or prediction. By default, PFA analyzes the data and constructs a new model every 720 minutes (12 hours). The model interval must be at least four times larger than the collection interval. Note that, even when you set a value larger than 360, PFA performs the first model at 360 minutes (6 hours). |
stddev | 10 | 2 | 100 | This parameter specifies how much variance is allowed between the actual enqueue request rate per amount of CPU and the expected enqueue request rate. It determines whether the actual enqueue request rate has increased beyond the allowable upper limit and how much variance is allowed across the time range predictions. If you set the STDDEV parameter to a smaller value, an exception is issued when the actual enqueue request rate is closer to the expected enqueue request rate and the predictions across the time ranges are consistent. If you set the STDDEV parameter to a larger value, an exception is issued when the actual enqueue request rate is significantly greater than the expected enqueue request rate, even if the predictions across the different time ranges are inconsistent. |
collectinactive | 1 (on) | 0 (off) | 1 (on) | Defines whether data is collected and modeled even if the check is not eligible to run (that is, it is not ACTIVE(ENABLED)) in IBM® Health Checker for z/OS. |
trackedmin | 3 | 0 | 1000 | This parameter defines the minimum enqueue request rate required for a persistent job in order for it to be considered a top persistent job that should be tracked individually. |
exceptionmin | 1 | 0 | 1000 | This parameter is used when determining if an exception should be issued for an unexpectedly high enqueue request rate. For tracked jobs, this parameter defines the minimum enqueue request rate and the minimum predicted enqueue request rate required to cause a too high exception. For the total system comparison, this parameter defines the minimum enqueue request rate required to cause a too high exception. |
checklow | 1 | 0 | 1 | Defines whether Runtime Diagnostics is run to validate that a low enqueue request rate is caused by a problem. If this value is off, PFA does not issue exceptions for conditions in which the enqueue request rate is unexpectedly low. |
stddevlow | 4 | 2 | 100 | This parameter is used to specify how much variance is allowed between the actual enqueue request rate per amount of CPU, and the expected enqueue request rate, when determining if the actual rate is unexpectedly low. |
limitlow | 3 | 1 | 100 | This parameter defines the maximum enqueue request rate allowed when issuing an exception for an unexpectedly low number of enqueues. |
debug | 0 (off) | 0 (off) | 1 (on) | This parameter (an integer of 0 or 1) is used at the direction of IBM service to generate additional diagnostic information for the IBM Support Center. This debug parameter is used in place of the IBM Health Checker for z/OS policy. The default is off (0). |
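The ranges in the table, together with the rule that the model interval must be at least four times the collection interval, can be checked before a parameter change is submitted. The following Python sketch is a hypothetical helper, not part of PFA or IBM Health Checker for z/OS. The names `PARAMETER_RANGES` and `validate_parameters` are invented for illustration, and the rule is read here as `modelint >= 4 * collectint`, which is an assumption about the intended meaning.

```python
# Ranges copied from the parameter table: name -> (default, minimum, maximum).
PARAMETER_RANGES = {
    "collectint": (1, 1, 360),
    "modelint": (720, 60, 1440),
    "stddev": (10, 2, 100),
    "collectinactive": (1, 0, 1),
    "trackedmin": (3, 0, 1000),
    "exceptionmin": (1, 0, 1000),
    "checklow": (1, 0, 1),
    "stddevlow": (4, 2, 100),
    "limitlow": (3, 1, 100),
    "debug": (0, 0, 1),
}

def validate_parameters(overrides):
    """Return a list of problems found in a dict of parameter overrides."""
    # Start from the documented defaults, then apply the overrides.
    values = {name: default for name, (default, _, _) in PARAMETER_RANGES.items()}
    problems = []
    for name, value in overrides.items():
        key = name.lower()
        if key not in PARAMETER_RANGES:
            problems.append(f"{name}: unknown parameter")
            continue
        _, low, high = PARAMETER_RANGES[key]
        if not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low}..{high}")
        values[key] = value
    # Assumed reading of the rule: modelint >= 4 * collectint.
    if values["modelint"] < 4 * values["collectint"]:
        problems.append("modelint must be at least four times collectint")
    return problems
```

For example, `validate_parameters({"collectint": 360})` reports a problem because the default model interval of 720 is less than four times 360.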
AIR018I 02:22:54 PFA CHECK DETAIL
CHECK NAME: PFA_ENQUEUE_REQUEST_RATE
ACTIVE : YES
TOTAL COLLECTION COUNT : 5
SUCCESSFUL COLLECTION COUNT : 5
LAST COLLECTION TIME : 02/05/2009 10:18:22
LAST SUCCESSFUL COLLECTION TIME : 02/05/2009 10:18:22
NEXT COLLECTION TIME : 02/05/2009 10:19:22
TOTAL MODEL COUNT : 1
SUCCESSFUL MODEL COUNT : 1
LAST MODEL TIME : 02/05/2009 10:18:24
LAST SUCCESSFUL MODEL TIME : 02/05/2009 10:18:24
NEXT MODEL TIME : 02/05/2009 22:18:24
CHECK SPECIFIC PARAMETERS:
COLLECTINT : 1
MODELINT : 720
COLLECTINACTIVE : 1=ON
DEBUG : 0=OFF
STDDEV : 10
TRACKEDMIN : 3
EXCEPTIONMIN : 1
CHECKLOW : 1=ON
STDDEVLOW : 4
LIMITLOW : 3
UPDATE CHECK(IBMPFA,PFA_ENQUEUE_REQUEST_RATE)
ACTIVE
SEVERITY(MEDIUM)
INTERVAL(ONETIME)
PARMS=('COLLECTINT(1)','MODELINT(720)','STDDEV(10)','DEBUG(0)',
'COLLECTINACTIVE(1)','EXCEPTIONMIN(1)','TRACKEDMIN(3)',
'CHECKLOW(1)','STDDEVLOW(4)','LIMITLOW(3)')
DATE(20080330)
REASON('The enqueue request rate is higher than expected
which can indicate a damaged address space.')
The enqueue request rate check is designed to run automatically after every data collection. Do not change the INTERVAL parameter.
Enqueue Request Rate Prediction Report
Last successful model time : 01/27/2009 11:08:01
Next model time : 01/27/2009 23:08:01
Model interval : 720
Last successful collection time : 01/27/2009 17:41:38
Next collection time : 01/27/2009 17:56:38
Collection interval : 15
Persistent address spaces with high rates:
Predicted Enqueue
Enqueue Request Rate
Job Request
Name ASID Rate 1 Hour 24 Hour 7 Day
TRACKED1 001D 58.00 23.88 22.82 15.82
TRACKED2 0028 11.00 0.34 11.11 12.11
TRACKED3 0029 11.00 12.43 2.36 8.36
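If a saved copy of the report needs to be processed by a script, job rows such as the ones above can be split on whitespace. The following Python sketch assumes each row holds exactly six whitespace-separated fields (job name, ASID, current rate, and the three predictions). The `TrackedJob` type and `parse_job_row` function are hypothetical names used only for this example.

```python
from typing import NamedTuple

class TrackedJob(NamedTuple):
    name: str
    asid: str                 # hexadecimal ASID, kept as text
    rate: float               # enqueue request rate at last collection
    predicted_1_hour: float
    predicted_24_hour: float
    predicted_7_day: float

def parse_job_row(line):
    """Parse one job row of the prediction report into a TrackedJob."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, found {len(fields)}: {line!r}")
    name, asid, *numbers = fields
    return TrackedJob(name, asid, *(float(n) for n in numbers))
```

Keeping the ASID as a string preserves its leading zeros as they appear in the report.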
Enqueue Request Rate Prediction Report
Last successful model time : 10/10/2010 11:08:01
Next model time : 10/10/2010 23:08:01
Model interval : 720
Last successful collection time : 10/10/2010 17:41:38
Next collection time : 10/10/2010 17:56:38
Collection interval : 15
Persistent address spaces with low rates:
Predicted Enqueue
Enqueue Request Rate
Job Request
Name ASID Rate 1 Hour 24 Hour 7 Day
IBMUSER2 002F 1.17 23.88 22.82 15.82
IBMUSER1 002E 2.01 8.34 11.11 12.11
Runtime Diagnostics Output:
Runtime Diagnostics detected a problem in job: JOBS4
EVENT 06: HIGH - HIGHCPU - SYSTEM: SY1 2009/06/12 - 13:28:46
ASID CPU RATE: 96% ASID: 0027 JOBNAME: JOBS4
STEPNAME: DAVIDZ PROCSTEP: DAVIDZ JOBID: STC00042 USERID: ++++++++
JOBSTART: 2009/06/12 - 13:28:35
Error:
ADDRESS SPACE USING EXCESSIVE CPU TIME. IT MAY BE LOOPING.
Action:
USE YOUR SOFTWARE MONITORS TO INVESTIGATE THE ASID.
----------------------------------------------------------------------
EVENT 07: HIGH - LOOP - SYSTEM: SY1 2009/06/12 - 13:28:46
ASID: 0027 JOBNAME: JOBS4 TCB: 004E6850
STEPNAME: DAVIDZ PROCSTEP: DAVIDZ JOBID: STC00042 USERID: ++++++++
JOBSTART: 2009/06/12 - 13:28:35
Error:
ADDRESS SPACE APPEARS TO BE IN A LOOP.
Action:
USE YOUR SOFTWARE MONITORS TO INVESTIGATE THE ASID.
----------------------------------------------------------------------
Runtime Diagnostics detected a problem in job: JOBS5
EVENT 03: HIGH - HIGHCPU - SYSTEM: SY1 2009/06/12 - 13:28:46
ASID CPU RATE: 96% ASID: 0027 JOBNAME: JOBS5
STEPNAME: DAVIDZ PROCSTEP: DAVIDZ JOBID: STC00042 USERID: ++++++++
JOBSTART: 2009/06/12 - 13:28:35
Error:
ADDRESS SPACE USING EXCESSIVE CPU TIME. IT MAY BE LOOPING.
Action:
USE YOUR SOFTWARE MONITORS TO INVESTIGATE THE ASID.
----------------------------------------------------------------------
EVENT 04: HIGH - LOOP - SYSTEM: SY1 2009/06/12 - 13:28:46
ASID: 0027 JOBNAME: JOBS5 TCB: 004E6850
STEPNAME: DAVIDZ PROCSTEP: DAVIDZ JOBID: STC00042 USERID: ++++++++
JOBSTART: 2009/06/12 - 13:28:35
Error:
ADDRESS SPACE APPEARS TO BE IN A LOOP.
Action:
USE YOUR SOFTWARE MONITORS TO INVESTIGATE THE ASID.
----------------------------------------------------------------------
Enqueue request rate Prediction Report
Last successful model time : 01/27/2009 17:08:01
Next model time : 01/27/2009 23:08:01
Model interval : 360
Last successful collection time : 01/27/2009 17:41:38
Next collection time : 01/27/2009 17:56:38
Collection interval : 15
Enqueue request rate
at last collection interval : 83.52
Prediction based on 1 hour of data : 98.27
Prediction based on 24 hours of data: 85.98
Prediction based on 7 days of data : 100.22
Top persistent users:
Predicted Enqueue
Enqueue Request Rate
Job Request
Name ASID Rate 1 Hour 24 Hour 7 Day
TRACKED1 001D 58.00 23.88 22.82 15.82
TRACKED2 0028 11.00 0.34 11.11 12.11
TRACKED3 0029 11.00 12.43 2.36 8.36
Enqueue Request Rate Prediction Report
Last successful model time : 01/27/2009 11:08:01
Next model time : 01/27/2009 23:08:01
Model interval : 720
Last successful collection time : 01/27/2009 17:41:38
Next collection time : 01/27/2009 17:56:38
Collection interval : 15
Persistent address spaces with low rates:
Predicted ENQ
ENQ Request Rate
Job Request
Name ASID Rate 1 Hour 24 Hour 7 Day
JOBS4 001F 1.17 23.88 22.82 15.82
JOBS5 002D 2.01 8.34 11.11 12.11
Runtime Diagnostics Output:
----------------------------------------------------------------------
EVENT 01: HIGH - ENQ - SYSTEM: SY1 2010/10/04 - 10:19:53
ENQ WAITER - ASID:002F - JOBNAME:IBMUSER2 - SYSTEM:SY1
ENQ BLOCKER - ASID:002E - JOBNAME:IBMUSER1 - SYSTEM:SY1
QNAME: TESTENQ
RNAME: TESTOFAVERYVERYVERYVERYLOOOOOOOOOOOOOOOOOOOOOONGRNAME1234567...
ERROR: ADDRESS SPACES MIGHT BE IN ENQ CONTENTION.
ACTION: USE YOUR SOFTWARE MONITORS TO INVESTIGATE BLOCKING JOBS AND
ACTION: ASIDS.
----------------------------------------------------------------------
Name | Sym | Size |
---|---|---|
Kilo | K | 1,024 |
Mega | M | 1,048,576 |
Giga | G | 1,073,741,824 |
Tera | T | 1,099,511,627,776 |
Peta | P | 1,125,899,906,842,624 |
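The multipliers in the table are successive powers of 1,024. As a minimal sketch, and assuming a value is written as a whole number optionally followed by one of the symbols above, the following hypothetical Python helper (`expand_multiplier` is not a PFA function) converts such a value to a plain count.

```python
# Symbol -> multiplier, matching the table (powers of 1,024).
MULTIPLIERS = {
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,
    "P": 1024 ** 5,
}

def expand_multiplier(value):
    """Convert text such as '2K' or '500' to an integer count."""
    text = value.strip().upper()
    if text and text[-1] in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[text[-1]]
    return int(text)
```

Integer arithmetic is used throughout so that large values, such as those with the P symbol, are not rounded.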
The following fields apply to all reports:
Guideline: If the use of the z/OS image is radically different after an IPL (for instance, a change from a test system to a production system) or if you modify anything that affects enqueue details, delete the files in the PFA_ENQUEUE_REQUEST_RATE/data directory to ensure that the check can collect the most accurate modeling information.
Results files
Data store files:
Intermediate files:
This directory holds the following log files. Additional information is written to these log files when DEBUG(1) is set.