Scan Extended (SCANX)

Bound program access

Built-in number for SCANX is 415.

SCANX (
       base_locator  : address of a space pointer(16) base locator
       scan_controls : address of scan controls
       scan_options  : literal(4) containing scan options
) : signed binary(4) value to indicate the manner in which the instruction completed

Description

The base string to be scanned is specified by the base locator and controls operands. The base locator addresses the first character of the base string. The controls operand specifies the length of the base string in the base length field.

The scan operation begins at the left end of the base string and continues character by character, left-to-right. The scan operation can be performed on a base string which contains all simple (1-byte) or all extended (2-byte) character codes or a mixture of the two. When the base string is being interpreted in simple character mode, the operation moves through the base string one byte at a time. When the base string is being interpreted in extended character mode, the operation moves through the base string 2 bytes at a time. The character string value of the base operand is scanned for occurrences of a character value satisfying the criteria specified in the control and options operands.

The scan is completed by updating the base locator and controls operands with scan status when a character value being scanned for is found, the end of the base string is encountered, or an escape code is encountered when the test for escape codes option is specified within the scan controls operand. A completion code indicating the manner in which the instruction completed is also returned. The base locator is set with addressability to the character (simple or extended) which caused the instruction to complete execution. The controls operand is set with information which identifies the mode (simple or extended) of the base string character addressed by the base locator and which provides for resumption of the scan operation with minimal overhead.

The controls and options operands specify the modes to be used in interpreting characters during the scan operation. Characters can be interpreted in one of two character modes: simple (1-byte) and extended (2-byte). Additionally, the base string can be scanned in one of two scan modes, mixed (base string may contain a mixture of both character modes) and nonmixed (base string contains one mode of characters).

When the mixed scan mode is specified in the options operand, the base string is interpreted as containing a mixture of simple and extended character codes. The mode, simple or extended, with which the string is to be interpreted, is controlled initially by the base mode indicator in the controls operand and thereafter by mode control characters imbedded in the base string. The mode control characters are as follows:

When the nonmixed scan mode is specified in the options operand, the base string is interpreted using only the character mode specified by the base mode indicator in the controls operand. Character mode shifting cannot occur because no mode control characters are recognized when scanning in nonmixed mode.

The base locator operand is a space pointer which is both input to and output from the instruction. On input, it locates the first character of the base string to be processed. On output, it locates the character of the base string which caused the instruction to complete.

The controls operand is the address of an aggregate which specifies additional information to be used to control the scan operation. The aggregate scan controls must be at least 8 bytes long and have the following format:

Offset
Dec  Hex  Field Name                                      Data Type and Length
0    0    Scan controls                                   Char(24)
0    0      Control indicators                            Char(1)
0    0        Base mode                                   Bit 0
                0 = Simple
                1 = Extended
0    0        Comparison character mode                   Bit 1
                0 = Simple
                1 = Extended
0    0        Reserved                                    Bits 2-5
0    0        Enhanced options                            Bit 6
                0 = Enhanced options fields are not used
                1 = Enhanced options fields are used
0    0        Scan state                                  Bit 7
                0 = Resume scan
                1 = Start scan
1    1      Ignored                                       Char(1)
2    2      Comparison character                          Char(2)
4    4      Reserved (binary 0)                           Char(1)
5    5      Base end                                      Char(3)
5    5        Instruction work area                       Char(1)
6    6        Base length                                 Char(2)
8    8      Enhanced length                               UBin(8)
16   10     Enhanced resume info                          UBin(8)
24   18   --- End ---

Only the first 8 or 24 bytes of scan controls are used, depending upon the value of enhanced options. Any excess bytes are ignored.
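The layout above can be pictured as a C structure. This is only a sketch: the type name, member names, and mask names are hypothetical and do not come from a system header, and the masks assume the usual convention that bit 0 is the most significant bit of the byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the scan controls aggregate (names are illustrative). */
typedef struct {
    unsigned char control_indicators;   /* offset 0                          */
    unsigned char ignored;              /* offset 1                          */
    unsigned char comparison_char[2];   /* offset 2                          */
    unsigned char reserved;             /* offset 4, binary 0                */
    unsigned char work_area;            /* offset 5, first byte of base end  */
    unsigned char base_length[2];       /* offset 6, rest of base end        */
    uint64_t      enhanced_length;      /* offset 8                          */
    uint64_t      enhanced_resume_info; /* offset 16                         */
} scan_controls_t;

/* Control indicator masks, assuming bit 0 is the most significant bit. */
enum {
    SC_BASE_MODE_EXTENDED = 0x80, /* bit 0 */
    SC_COMP_MODE_EXTENDED = 0x40, /* bit 1 */
    SC_ENHANCED_OPTIONS   = 0x02, /* bit 6 */
    SC_START_SCAN         = 0x01  /* bit 7 */
};

/* Number of leading bytes of the aggregate that are used: 8 or 24. */
static size_t scan_controls_used_bytes(const scan_controls_t *c)
{
    return (c->control_indicators & SC_ENHANCED_OPTIONS) ? 24u : 8u;
}
```

A caller that never sets enhanced options could therefore allocate only the first 8 bytes, while a caller that sets it needs all 24.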

The base mode is both input to and output from the instruction. In either case, it specifies the mode of the character in the base string currently addressed by the base locator.

The comparison character mode is not changed by the instruction. It specifies the mode of the comparison character contained in the controls operand.

The scan state is both input to and output from the instruction. As input, it indicates whether the scan operation for the base string is being started or resumed. If it is being started, the instruction assumes that the base length value in the base end field of the controls operand specifies the length of the base string, and the instruction work area value is ignored. If it is being resumed, the instruction assumes the base end field has been set by a prior start scan execution of the instruction with an internal machine value identifying the end of the base string.

For a start scan execution of the instruction, the scan state field is reset to indicate resume scan to provide for subsequent resumption of the scan operation. Additionally, for a start scan execution of the instruction, the base end field is set with an internally optimized value which identifies the end of the base string being scanned. This value then overlays the values which were in the instruction work area and base length fields on input to the instruction. Predictable operation of the instruction on a resume scan execution depends upon this base end field being left intact with the value set by the start scan execution.

For a resume scan execution of the instruction, the scan state and base end fields are unchanged.

The comparison character is input to the instruction. It specifies a character code to be used in the comparisons performed during the scanning of the base string. The comparison character mode in the control indicators specifies the mode (simple or extended) of the comparison character. If it is a simple character, the first byte of the comparison character field is ignored and the comparison character is assumed to be specified in the second byte. If it is an extended character, the comparison character is specified as a 2-byte value in the comparison character field.
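As an illustration of the placement rule for the comparison character, the following sketch writes into a raw byte view of the controls operand. The helper names are hypothetical, and the mask assumes bit 0 is the most significant bit of the control indicators byte.

```c
#include <assert.h>

/* Control indicators bit 1 (comparison character mode), assuming MSB = bit 0. */
enum { COMP_MODE_EXTENDED = 0x40 };

/* Simple (1-byte) comparison character: the first byte of the 2-byte field
 * at offset 2 is ignored, so only the second byte (offset 3) is written. */
static void set_simple_comparison(unsigned char *controls, unsigned char ch)
{
    controls[0] &= (unsigned char)~COMP_MODE_EXTENDED; /* mode = simple */
    controls[3] = ch;
}

/* Extended (2-byte) comparison character: both bytes of the field are used. */
static void set_extended_comparison(unsigned char *controls,
                                    unsigned char first, unsigned char second)
{
    controls[0] |= COMP_MODE_EXTENDED;                 /* mode = extended */
    controls[2] = first;
    controls[3] = second;
}
```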

When enhanced options has a value of 0, the base end value is used. Otherwise the enhanced length and enhanced resume info fields are used and base length is ignored. The value of enhanced options must not be changed between start scan and resume scan executions on the same string.

When base locator points to a space pointer which contains a teraspace address, an unsupported space use (hex 0607) exception is signaled if enhanced options has a value of 0 and resume scan is specified.

The base end field is both input to and output from the instruction. It contains data which identifies the end of the base string. Initially, for a start scan execution of the instruction, it contains the length of the base string in the base length field. Additionally, the base end field is used to retain information over multiple instruction executions which provides for minimizing the overhead required to resume the scan operation for a particular base string. This information is set on the initial start scan execution of the instruction and is used during subsequent resume scan executions of the instruction to determine the end of the base string to be scanned. If the end of the base string being scanned must be altered during iterative usage of this instruction, a start scan execution of the instruction must be performed to provide for correctly resetting the internally optimized value to be stored in the base end from the values specified in the base locator operand and base length field.

The enhanced length field is input to the instruction. It contains the length in bytes of the string to be scanned when enhanced options has a value of 1. Current machine implementations support a maximum length of 16777215; larger values cause a scalar value invalid (hex 3203) exception to be signaled.

The enhanced resume info field is both input to and output from the instruction but is only used when enhanced options has a value of 1. This field is set with internal information during a start scan execution of this instruction and used as input for subsequent resume scan executions of this instruction.

If the end of the base string being scanned must be altered during iterative usage of this instruction, a start scan execution of the instruction must be performed to provide for correctly resetting the internally optimized value to be stored in enhanced resume info from the values specified in the base locator operand and enhanced length field.
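One way a caller might prepare the controls for a start scan is sketched below. The function name is hypothetical. Treating base length as an unsigned 2-byte big-endian count is an assumption (the field is documented only as Char(2)), as are the bit masks, which take bit 0 to be the most significant bit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Maximum enhanced length stated for current machine implementations. */
#define MAX_ENHANCED_LENGTH 16777215u

/* Prepare a raw 24-byte controls area for a start scan of `length` bytes.
 * Returns 0 on success, or -1 if the length exceeds the stated maximum
 * (the machine would signal scalar value invalid instead). */
static int init_start_scan(unsigned char controls[24], uint64_t length)
{
    if (length > MAX_ENHANCED_LENGTH)
        return -1;
    memset(controls, 0, 24);
    controls[0] = 0x01;                           /* scan state = start scan */
    if (length <= 0xFFFFu) {
        /* Assumption: base length is an unsigned big-endian 2-byte count. */
        controls[6] = (unsigned char)(length >> 8);
        controls[7] = (unsigned char)(length & 0xFFu);
    } else {
        controls[0] |= 0x02;                      /* enhanced options = 1 */
        memcpy(&controls[8], &length, sizeof length); /* UBin(8), native order */
    }
    return 0;
}
```

The mode bits and comparison character are left at zero here and would be set by the caller afterwards. Because the choice between the two length fields is fixed at start scan, the same value of enhanced options has to be kept for every later resume scan on that string.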

For the special case of a start scan execution where a length value of zero (no characters to scan) is specified in either the base length field when enhanced options has a value of 0 or in the enhanced length field when enhanced options has a value of 1, the instruction results in a not found resultant condition. In this case, the base string is not verified and the scan state indicator, the base end field, and the base locator are not changed.

The options operand must be a literal which specifies the options to be used to control the scan operation. Scan options must be at least 4 bytes in length and have the following format:

Offset
Dec  Hex  Field Name                                      Data Type and Length
0    0    Scan options                                    Char(4)
0    0      Options indicators                            Char(1)
1    1      Reserved (binary 0)                           Char(3)
4    4    --- End ---

The option indicators field has the following format:

Offset
Dec  Hex  Field Name                                      Data Type and Length
0    0    Option indicators                               Char(1)
0    0      Reserved (binary 0)                           Bit 0
0    0      Scan mode                                     Bit 1
              0 = Mixed
              1 = Nonmixed
0    0      Reserved                                      Bits 2-3
0    0      Comparison relation                           Bits 4-6
0    0        Equal, (=) relation                         Bit 4
0    0        Less than, (<) relation                     Bit 5
0    0        Greater than, (>) relation                  Bit 6
                0 = No match on relation
                1 = Match on relation
0    0      Test for escape codes                         Bit 7
              0 = Do not test for escape codes during the scan
              1 = Test for escape codes during the scan
1    1    --- End ---

The scan mode specifies whether the base string contains a mixture of character modes, or contains all one mode of characters; that is, whether or not mode control characters should be recognized in the base string. Mixed specifies that there is a mixture of character modes and, therefore, mode control characters should be recognized. Nonmixed specifies that there is not a mixture of character modes and, therefore, mode control characters should not be recognized. Note that the base mode indicator in the controls operand specifies the character mode of the base string character addressed by the base locator.
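The option indicators can be combined into the 4-byte options value as in the sketch below. The names are hypothetical, the masks assume bit 0 is the most significant bit, and the result is shown as the numeric value whose high-order byte is the option indicators byte (for example hex 48000000).

```c
#include <assert.h>
#include <stdint.h>

/* Option indicator masks, assuming bit 0 is the most significant bit. */
enum {
    OPT_NONMIXED    = 0x40, /* bit 1: scan mode, 1 = nonmixed */
    OPT_EQUAL       = 0x08, /* bit 4: equal relation          */
    OPT_LESS        = 0x04, /* bit 5: less than relation      */
    OPT_GREATER     = 0x02, /* bit 6: greater than relation   */
    OPT_TEST_ESCAPE = 0x01  /* bit 7: test for escape codes   */
};

/* Build the 4-byte options value: indicators in the first (high-order)
 * byte, reserved bits and the three reserved bytes left as binary 0. */
static uint32_t make_scan_options(unsigned indicators)
{
    return (uint32_t)(indicators & 0x4Fu) << 24;
}
```

For instance, `make_scan_options(OPT_NONMIXED | OPT_EQUAL)` gives hex 48000000, a nonmixed scan for an equal character without escape code testing.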

The comparison relation specifies the relation or relations of the comparison character to characters of the base string which will satisfy the scan operation and cause completion of the instruction with one of the high, low, or equal resultant conditions. Multiple relations may be specified in conjunction with one another. Specifying all relations ensures a match against any character in the base string which is of the same mode as the comparison character. Specifying no relation ensures a not found resultant condition, regardless of the values of the characters in the base string which match the mode of the comparison character, unless the instruction is testing for escape codes and an escape code value is found.

An example of comparison scanning is a scan of simple mode characters for a value less than hex 40. This could be done by specifying a comparison character of hex 40 and a comparison relation of greater than. This could also be done by specifying a comparison character of hex 3F and comparison relations of equal and greater than.
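The equivalence of the two configurations in this example can be checked with a small model of the relation test. The function and mask names are hypothetical, and the relation is evaluated as described above, that is, comparison character relative to base string character.

```c
#include <assert.h>

/* Comparison relation masks, assuming bit 0 is the most significant bit. */
enum { REL_EQUAL = 0x08, REL_LESS = 0x04, REL_GREATER = 0x02 };

/* Nonzero when the comparison character stands in one of the requested
 * relations to the base string character. */
static int relation_satisfied(unsigned relations, unsigned comp, unsigned base)
{
    if ((relations & REL_EQUAL)   && comp == base) return 1;
    if ((relations & REL_LESS)    && comp <  base) return 1;
    if ((relations & REL_GREATER) && comp >  base) return 1;
    return 0;
}

/* Count the byte values for which "hex 40, greater than" and
 * "hex 3F, equal or greater than" disagree with each other or with
 * the plain test base < hex 40. */
static int count_disagreements(void)
{
    int n = 0;
    for (unsigned b = 0; b < 256; b++) {
        int first  = relation_satisfied(REL_GREATER, 0x40, b);
        int second = relation_satisfied(REL_EQUAL | REL_GREATER, 0x3F, b);
        if (first != second || first != (b < 0x40))
            n++;
    }
    return n;
}
```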

The test for escape codes field determines whether the base string is tested for values less than hex 40 while the scan is being performed. This testing, if requested, is always performed in conjunction with whatever comparison processing has been requested. That is, escape code testing is performed even if no comparison relation is specified. The following material discusses this function in more detail.

Operation

During the scan operation, the characters of the base string which are not of the same mode as the comparison character are skipped over until the mode of the characters being processed is the same as the mode of the comparison character. The operation then proceeds by comparing the comparison character with each of the characters of the base string.

If a base string character satisfying the criteria specified in the controls and options operands is found, the base locator is set to address the first byte of it, the base mode indicator is set to indicate the mode of the base string as of that character, and the instruction is completed with the appropriate completion code, based on the comparison relation (high, low, or equal) of the comparison character to the base string character.

If a matching base string character is not found prior to encountering a mode change, the characters of the base string are again skipped over until the mode of the characters being processed is the same as the mode of the comparison character before comparisons are resumed.

If a matching base string character is not found prior to encountering the end of the base string, the base locator is set to address the first byte of the character encountered at the end of the base string, the base mode indicator is set to indicate the mode of the base string as of that character, and the instruction is completed with the not found completion code. A mode control character results in the changing of the base string mode, but the base locator is left addressing the mode control character.

If test for escape codes has a value of 1, the test is performed on the characters of the base string prior to their being skipped or compared with the comparison character. Each byte of the base string is checked for a value less than hex 40. Additionally, for a mixed scan mode, when such a value is encountered, it is then determined if it is a valid mode control character.

If a byte value of less than hex 40 is not a valid mode control character, it is considered to be an escape code. The base locator is set to address the first byte of the base string character (simple or extended) which contains the escape code, the base mode indicator is set to indicate the mode of the base string as of that character, and the completion code is set to indicate that an escape code was found.
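The order of the tests can be illustrated with a simplified software model. It covers only a nonmixed scan of simple (1-byte) characters, so no mode control characters are recognized and every byte below hex 40 counts as an escape code. The completion code names are hypothetical, since the actual values are not listed here, and the model returns an offset where the instruction would update the base locator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion codes for the model only. */
typedef enum { CC_NOT_FOUND, CC_EQUAL, CC_LOW, CC_HIGH, CC_ESCAPE } model_cc_t;

/* Option masks, assuming bit 0 is the most significant bit. */
enum { M_EQUAL = 0x08, M_LESS = 0x04, M_GREATER = 0x02, M_ESCAPE = 0x01 };

/* Model of a nonmixed, simple-mode scan. *pos receives the offset of the
 * byte that ended the scan; it is left unchanged when len is zero. */
static model_cc_t model_scan_simple(const unsigned char *base, size_t len,
                                    unsigned char comp, unsigned options,
                                    size_t *pos)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char b = base[i];
        *pos = i;
        /* The escape code test comes before any comparison. */
        if ((options & M_ESCAPE) && b < 0x40)   return CC_ESCAPE;
        if ((options & M_EQUAL)   && comp == b) return CC_EQUAL;
        if ((options & M_LESS)    && comp <  b) return CC_LOW;
        if ((options & M_GREATER) && comp >  b) return CC_HIGH;
    }
    return CC_NOT_FOUND; /* *pos addresses the last character scanned */
}
```

With escape testing on, a byte such as hex 05 ends the scan even when a later byte would have matched the comparison character.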

If possible, specify scan controls on an 8-byte multiple (doubleword) boundary relative to the start of the space containing it. Appreciably less overhead is incurred in accessing and storing the value of the controls if this is done.

For the case where a base string is to be just scanned for byte values less than hex 40, two techniques can be used.

The following diagram defines the various conditions which can be encountered at the end of the base string and what the base locator addressability is in each case. The solid vertical line represents the end of the base string. The dashes represent the bytes before and after the base string end. The V is positioned over the byte addressed by the base locator in each case. These are the conditions which can be encountered when the base locator input to the instruction addresses a byte prior to the base string end. When the base length field specifies a value of zero for a start scan execution of the instruction, or the input base locator addresses a point beyond the end of the base string, no processing is performed and the instruction is immediately completed with the not found completion code value.


Scan diagram

An analysis of the diagram shows that normally, after appropriate processing for the particular found, not found, or escape condition, the scan can be restarted at the byte of data which would follow the base string end in the data stream being scanned. Any mode shift required by an ending mode control character will have been performed.

However, one ending condition may require subsequent resumption of the scan at the character encountered at the end of the base string. This is the case where the instruction completes with the not found completion code value and the base string ends with an extended character split across string end. That is, the base mode indicator specifies extended mode, the base locator addresses the last byte of the base string, and that byte value is not a shift out, hex 0E character. In this case, complete verification of the extended character and relation comparison could not be performed. If this extended character is to be processed, it must be done through another execution of this instruction where both bytes of the character can be input to the instruction within the confines of the base string.
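A caller could test for this ending condition along the lines of the sketch below. The function name and parameters are hypothetical; the locator is represented as an offset into the base string rather than as a space pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero when a scan ended with an extended character split across the
 * end of the base string: not found, base mode extended, locator on the
 * last byte, and that byte is not the shift out (hex 0E) character. */
static int ends_with_split_extended(int not_found, int base_mode_extended,
                                    const unsigned char *base, size_t len,
                                    size_t locator_offset)
{
    if (!not_found || !base_mode_extended || len == 0)
        return 0;
    if (locator_offset != len - 1)
        return 0;
    return base[len - 1] != 0x0E;
}
```

When the function reports a split, the next execution needs a base string that contains both bytes of that character, for example by starting a new scan at the returned locator once the following byte is available.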

Completion code values

Authorization Required

Lock Enforcement

Exceptions

06 Addressing

08 Argument/Parameter

22 Object Access

24 Pointer Specification

32 Scalar Specification

44 Protection Violation