IOS_IORATE_MONITOR

Description:
Detects if any control units in the system are reporting inconsistent I/O rates for their attached channel paths.

Typically, I/Os are distributed equally across all paths for a control unit. When the system determines that there is a performance problem with a path, it will direct I/Os away from that path. This action, taken by the system to correct the performance problem, results in inconsistent I/O rates across the paths.

The check issues an exception if at least one control unit in the system meets both of the following conditions: its total I/O rate across all of its channel paths exceeds the THRESHOLD check parameter value, and at least one of its paths has an I/O rate significantly lower (as defined by the RATIO check parameter) than that of its channel path with the highest I/O rate.

Example:
Path 1,  I/O rate = 600 I/Os per second
Path 2,  I/O rate = 250 I/Os per second 
Path 3,  I/O rate = 300 I/Os per second 

If THRESHOLD is 800 and RATIO is 2, the check issues an exception because the total I/O rate of 1150 exceeds the threshold value, and the I/O rate of path 2 (the lowest, at 250) is less than half the I/O rate of path 1 (the highest, at 600).

Reason for check:
I/O rate measures the number of I/Os started down the channel path per second. A lower than average I/O rate can be a symptom of potential problems in the fabric. By monitoring this measurement alone and comparing it among the paths to a control unit, fabric problems like hardware errors, misconfiguration and congestion may be more easily detected.
z/OS® releases the check applies to:
z/OS V1R12 and later with APAR OA40548 on a zEC12 or later processor.
Parameters accepted:
Yes, the following parameters are accepted:
PARM('THRESHOLD(threshold),RATIO(ratio),XTYPE(devtype),XCU(cu1,cu2,...,cux)')
THRESHOLD(threshold)
THRESHOLD defines the value, in I/Os per second, that is used in conjunction with the RATIO parameter to determine whether an exception exists. If the total I/O rate of all of the paths to the control unit exceeds the THRESHOLD value, then the RATIO value is used to further determine if an exception exists.

Range: 10 to 1000

Default: 100

RATIO(ratio)
RATIO defines the value used to determine if the I/O rate of the path with the lowest I/O rate is significantly lower than the path with the highest I/O rate for this control unit, using a factor of 'ratio'. This is used to determine if an exception exists only after the THRESHOLD condition has been met.

If the THRESHOLD condition has been met and if the path with the lowest I/O rate is at least a factor of 'ratio' less than the path with the highest I/O rate, an exception will be declared for the control unit.

Range: 2 to 100

Default: 2

XTYPE(devtype)
devtype is the device type of control units that will be excluded from the check and not reported on.

Supported device type values: DASD,TAPE

Default: no value

XCU(cu1,cu2,...,cux)
XCU defines a list of specific control units that will be excluded from the check and will not be reported on. Each control unit in this list is a hexadecimal value representing the control unit number. This parameter takes up to 40 different control unit numbers.

Range: 0 to FFFE

Default: no value

Note: If any parameter is changed, the check results may not reflect the change for several minutes, because the check must gather a few minutes' worth of data before performing analysis using the new parameters.
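
As a rough aid, the documented ranges can be checked before a PARM string is written. The following Python sketch is not part of IBM Health Checker for z/OS. The function name validate_check_parms is hypothetical, and it assumes that XTYPE takes a single device type and that XCU values are passed as hexadecimal strings.

```python
def validate_check_parms(threshold=100, ratio=2, xtype=None, xcu=()):
    """Return a list of problems found in the proposed parameter values.

    Hypothetical helper; the ranges are taken from the parameter
    descriptions above. An empty list means no problems were found.
    """
    errors = []
    if not 10 <= threshold <= 1000:
        errors.append("THRESHOLD must be between 10 and 1000")
    if not 2 <= ratio <= 100:
        errors.append("RATIO must be between 2 and 100")
    # Assumption: XTYPE names a single device type, or is omitted.
    if xtype is not None and xtype not in ("DASD", "TAPE"):
        errors.append("XTYPE must be DASD or TAPE")
    if len(xcu) > 40:
        errors.append("XCU accepts at most 40 control unit numbers")
    for cu in xcu:
        try:
            value = int(cu, 16)  # control unit numbers are hexadecimal
        except ValueError:
            errors.append(f"XCU value {cu!r} is not hexadecimal")
            continue
        if not 0 <= value <= 0xFFFE:
            errors.append(f"XCU value {cu!r} is outside 0 to FFFE")
    return errors


print(validate_check_parms(threshold=200, ratio=3, xtype="DASD", xcu=["1A00", "FFFE"]))
print(validate_check_parms(threshold=5, xcu=["FFFF", "ZZ"]))
```

The first call returns an empty list. The second reports the out-of-range THRESHOLD and both invalid control unit numbers.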
User override of IBM values:
The following sample shows the defaults for customizable values for this check. Use this sample to make permanent check customizations in an HZSPRMxx parmlib member used at IBM Health Checker for z/OS startup. If you just want a one-time only update to the check defaults, omit the first line (ADDREPLACE POLICY) and use the UPDATE statement on a MODIFY hzsproc command. Note that using non-POLICY UPDATEs in HZSPRMxx can lead to unexpected results and is therefore not recommended.
ADDREPLACE POLICY[(policyname)] [STATEMENT(name)]
UPDATE
CHECK(IBMIOS,IOS_IORATE_MONITOR)
ACTIVE
VERBOSE(NO)
INTERVAL(00:05)
SEVERITY(MED)
DATE('date_of_the_change')
PARM('THRESHOLD(100),RATIO(2),XCU(),XTYPE()')
REASON('Your reason for making the update')
Debug support:
No.
Verbose support:
Yes. If VERBOSE(YES) is specified on the check, the control units that were excluded via the XTYPE and XCU parameters are displayed in the report if exceptions were found for them. This provides an easy way to temporarily obtain information on all control units with an exception without changing the XCU and XTYPE parameters.
Reference:
For more information on interpreting initial command response (CMR) time for the affected control units, see "IOQUEUE - I/O Queuing Activity Report" in z/OS RMF™ Report Analysis.
Messages:
This check issues the following exception messages:
  • IOSHC132E
See IOSHC messages in z/OS MVS System Messages, Vol 9 (IGF-IWM).
SECLABEL recommended for MLS users:
SYSLOW