|
A token is the unit of low-level syntax from which clauses
are built. Programs written in REXX are composed of tokens. They are
separated by blanks or comments or by the nature of the tokens themselves.
The classes of tokens are:
- Literal Strings:
- A literal string
is a sequence including any characters and delimited by the
single quotation mark (') or the double quotation
mark ("). Use two consecutive double quotation marks
("") to represent a " character
within a string delimited by double quotation marks. Similarly, use
two consecutive single quotation marks ('') to represent
a ' character within a string delimited by single
quotation marks. A literal string is a constant and its contents are
never modified when it is processed.
A literal string with no
characters (that is, a string of length 0) is called
a null string.
These are valid strings:
'Fred'
"Don't Panic!"
'You shouldn''t' /* Same as "You shouldn't" */
'' /* The null string */
Note
that a string followed immediately by a ( is considered
to be the name of a function. If followed immediately by the symbol X or x,
it is considered to be a hexadecimal string. If followed immediately
by the symbol B or b, it is considered
to be a binary string. Descriptions of these forms follow.
Implementation
maximum: A literal
string can contain up to 250 characters. (But note that the length
of computed results is limited only by the amount of storage available.)
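The quoting rules above can be tried with a short sketch. The variable names (s1, s2, empty) are illustrative only and do not come from the reference text.

```rexx
/* Sketch: quoting rules for literal strings (names are illustrative) */
s1 = 'You shouldn''t'          /* doubled single quotation mark inside single quotes */
s2 = "You shouldn't"           /* same contents, delimited by double quotation marks */
if s1 == s2 then say 'Same string:' s1

empty = ''                     /* the null string */
say 'Length of the null string:' length(empty)

say "She said ""hi"""          /* doubled double quotation mark inside double quotes */
```

Under these assumptions, the strict comparison s1 == s2 holds because both literals contain exactly the same characters, and LENGTH reports 0 for the null string.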
- Hexadecimal Strings:
- A
hexadecimal string is a literal string, expressed using a hexadecimal
notation of its encoding. It is any sequence of zero or more hexadecimal
digits (0–9, a–f, A–F),
grouped in pairs. A single leading 0 is assumed, if necessary, at
the front of the string to make an even number of hexadecimal digits.
The groups of digits are optionally separated by one or more blanks,
and the whole sequence is delimited by single or double quotation
marks, and immediately followed by the symbol X or x.
(Neither x nor X can be part of
a longer symbol.) The blanks, which can be present only at byte boundaries
(and not at the beginning or end of the string), are to aid readability.
The language processor ignores them. A hexadecimal string is a literal
string formed by packing the hexadecimal digits given. Packing the
hexadecimal digits removes blanks and converts each pair of hexadecimal
digits into its equivalent character, for example: 'C1'X to A.
Hexadecimal
strings let you include characters in a program even if you cannot
directly enter the characters themselves. These are valid hexadecimal
strings:
'ABCD'x
"1d ec f8"X
"1 d8"x
A hexadecimal string is not a representation
of a number. Rather, it is an escape mechanism that lets a user describe
a character in terms of its encoding (and, therefore, is machine-dependent).
In EBCDIC, '40'X is the encoding for a blank. In every case, a string
of the form '.....'x is simply an alternative to a straightforward
string. In EBCDIC 'C1'x and 'A' are identical, as are '40'x and a
blank, and must be treated identically.
Implementation maximum: The
packed length of a hexadecimal string (the string with blanks removed)
cannot exceed 250 bytes.
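As a minimal sketch of the packing rules, the following uses the C2X and X2C built-in functions to inspect hexadecimal strings without assuming a particular character encoding. The variable name h1 is illustrative.

```rexx
/* Sketch: hexadecimal strings are packed into characters */
h1 = '1 d8'x                   /* a leading 0 is assumed: same as '01D8'x */
say c2x(h1)                    /* displays the packed bytes as hex digits */
say length("1d ec f8"X)        /* blanks are ignored, so three bytes remain */

if 'ABCD'x == x2c('ABCD') then say 'The literal and X2C give the same string'
if ''x == '' then say 'An empty hexadecimal string is the null string'
```

The sketch deliberately compares encodings rather than characters, because the character that a given pair of hexadecimal digits produces depends on the code page (for example, 'C1'x is A only in EBCDIC).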
- Binary Strings:
- A
binary string is a literal string, expressed using a binary representation
of its encoding. It is any sequence of zero or more binary digits
(0 or 1) in
groups of 8 (bytes) or 4 (nibbles). The first
group can have fewer than four digits; in this case, up to three 0
digits are assumed to the left of the first digit, making a total
of four digits. The groups of digits are optionally separated by
one or more blanks, and the whole sequence is delimited by matching
single or double quotation marks and immediately followed by the symbol b or B.
(Neither b nor B can be part of
a longer symbol.) The blanks, which can be present only at byte or
nibble boundaries (and not at the beginning or end of the string),
are to aid readability. The language processor ignores them.
A
binary string is a literal string formed by packing the binary digits
given. If the number of binary digits is not a multiple of eight,
leading zeros are added on the left to make a multiple of eight before
packing. Binary strings allow you to specify characters explicitly,
bit by bit.
These are valid binary strings:
'11110000'b /* == 'f0'x */
"101 1101"b /* == '5d'x */
'1'b /* == '00000001'b and '01'x */
'10000 10101010'b /* == '0001 0000 1010 1010'b */
''b /* == '' */
A
note on binary string interpretation in TSO/E: Binary string
support was introduced with TSO/E Version 2 Release 4. With this
release, and all following ones, a string in the form of 'string'B causes string to
be interpreted as a binary string. Prior to TSO/E 2.4, the two parts
of the expression 'string'B, string and B,
were concatenated after the value for the variable B was
determined. For example, if B had the value variable_value, then
'string'B would
be interpreted as stringvariable_value. If this
error is detected in REXX execs written before TSO/E 2.4, use the
concatenation operator ( || ) to eliminate the problem. For example,
code 'string'B as:
'string'||B
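The following sketch checks the binary string examples with C2X and shows the explicit concatenation form described in the note. The value assigned to B ('suffix') is a made-up example.

```rexx
/* Sketch: binary strings and explicit concatenation with a variable B */
say c2x('11110000'b)           /* F0 */
say c2x("101 1101"b)           /* 5D */
say c2x('10000 10101010'b)     /* 10AA */
if '1'b == '01'x then say 'Leading zeros are added on the left'

b = 'suffix'                   /* hypothetical value for the variable B */
say 'string'||b                /* explicit concatenation: stringsuffix */
```

Writing the || operator between the literal and the symbol keeps the literal from being read as a binary string, which is the situation the note about execs written before TSO/E 2.4 describes.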
- Symbols:
- Symbols are groups of characters, selected from the:
- English alphabetic characters (A–Z and a–z; see footnote 1)
- Numeric characters (0–9)
- Characters @ # $ ¢ . ! ? and
underscore (for !, see footnote 2).
- Double-Byte Character Set (DBCS) characters (X'41'–X'FE')—ETMODE
must be in effect for these characters to be valid in symbols.
Any lowercase alphabetic character in a symbol is translated
to uppercase (that is, lowercase a–z to
uppercase A–Z) before use.
These
are valid symbols:
Fred
Albert.Hall
WHERE?
If a symbol does not begin
with a digit or a period, you can use it as a variable and can assign
it a value. If you have not assigned it a value, its value is the
characters of the symbol itself, translated to uppercase (that is,
lowercase a–z to uppercase A–Z).
Symbols that begin with a number or a period are constant symbols
and cannot be assigned a value.
One other form of symbol is
allowed to support the representation of numbers in exponential format.
The symbol starts with a digit (0–9)
or a period, and it can end with the sequence E or e,
followed immediately by an optional sign (- or +),
followed immediately by one or more digits (which cannot be followed
by any other symbol characters). The sign in this context is part
of the symbol and is not an operator.
These are valid numbers
in exponential notation:
17.3E-12
.03e+9
Implementation maximum: A symbol can consist of up to 250 characters.
(But note that its value, if it is a variable, is limited only by
the amount of storage available.)
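A small sketch of how symbols behave as variables and as constants, using the SYMBOL built-in function. The value 'Flintstone' is illustrative, and the sketch assumes that no SIGNAL ON NOVALUE condition trap is active.

```rexx
/* Sketch: variable symbols versus constant symbols */
drop fred                      /* make sure FRED has no assigned value */
say fred                       /* FRED: the symbol itself, in uppercase */
say symbol('fred')             /* LIT: valid symbol, no value assigned */

Fred = 'Flintstone'            /* illustrative value */
say FRED                       /* Flintstone: fred, Fred and FRED are one symbol */
say symbol('fred')             /* VAR: the symbol now names a variable */

say symbol('3rd')              /* LIT: begins with a digit, so it is a constant */
```

Because lowercase letters in a symbol are translated to uppercase before use, the three spellings in the sketch all refer to the same variable.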
- Numbers:
- These
are character strings consisting of one or more decimal digits, with an optional
prefix of a plus or minus sign, and optionally including a single
period (.) that represents a decimal point. A number
can also have a power of 10 suffixed in conventional exponential notation:
an E (uppercase or lowercase), followed optionally
by a plus or minus sign, then followed by one or more decimal digits
defining the power of 10. Whenever a character string is used as a
number, rounding may occur to a precision specified by the NUMERIC
DIGITS instruction (default nine digits). See topics Numbers and arithmetic-Errors for
a full definition of numbers.
Numbers can have leading blanks
(before and after the sign, if any) and can have trailing
blanks. Blanks cannot be embedded among the digits of a number
or in the exponential part. Note that a symbol (see preceding) or
a literal string can be a number. A number cannot be the name of a
variable.
These are valid numbers:
12
'-17.9'
127.0650
73e+128
' + 7.9E5 '
A whole number is a number that
has a zero (or no) decimal
part and that the language processor would not usually express in
exponential notation. That is, it has no more digits before the decimal
point than the current setting of NUMERIC DIGITS (the default is 9).
Implementation
maximum: The exponent of a number expressed
in exponential notation can have up to nine digits.
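The rules for numbers can be explored with the DATATYPE built-in function, as in this minimal sketch. The NUMERIC DIGITS 5 setting is chosen only to make the rounding visible.

```rexx
/* Sketch: what counts as a number, and rounding under NUMERIC DIGITS */
say datatype('-17.9')          /* NUM */
say datatype(' + 7.9E5 ')      /* NUM: blanks around the sign are allowed */
say datatype('1 2')            /* CHAR: a blank is embedded among the digits */
say datatype(12, 'W')          /* 1: a whole number */
say datatype(127.0650, 'W')    /* 0: has a nonzero decimal part */

numeric digits 5               /* illustrative precision, the default is 9 */
say 123456789 + 0              /* 1.2346E+8: rounded to five digits */
```

The last line shows that a character string is rounded only when it is used as a number, here by the addition.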
- Operator Characters:
- The characters:
+ - \ / % * | & = ¬ > <
and the sequences:
>= <= \> \< \= >< <> == \== // && || ** ¬> ¬< ¬= ¬== >> << >>= \<< ¬<< \>> ¬>> <<= /= /==
indicate operations (see topic Operators). A few of these
are also used in parsing templates, and the
equal sign is also used to indicate assignment. Blanks adjacent to
operator characters are removed. Therefore, the following are identical
in meaning:
345>=123
345 >=123
345 >= 123
345 > = 123
Some of these characters might not be
available in all character sets, and, if this is the case, appropriate
translations can be used. In particular, the vertical bar (|), which is the "or" character,
is often shown as a split vertical bar (¦).
Throughout the language,
the not character, ¬, is synonymous with the backslash
(\). You can use the two characters interchangeably
according to availability and personal preference.
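A brief sketch of the blank-removal rule for operator characters follows. It uses the backslash forms only, since the ¬ character is not available in every code page.

```rexx
/* Sketch: blanks adjacent to operator characters are removed */
if 345>=123    then say 'Compact form is true'
if 345 >= 123  then say 'Spaced form is true'
if 345 > = 123 then say 'Form with a blank inside the operator is true'

/* Three spellings of "not equal"; each comparison yields 1 */
say (1 \= 2) (1 <> 2) (1 >< 2)
```

The blanks between the parenthesized comparisons in the last line are kept as blank (concatenation) operators, so the three results are displayed separated by single blanks.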
- Special Characters:
- The
following characters, together with the individual characters from
the operators, have special significance when found outside of literal
strings:
, ; : ) (
These characters
constitute the set of special characters. They all act as token delimiters,
and blanks adjacent to any of these are removed. There is an exception:
a blank adjacent to the outside of a parenthesis is deleted only if
it is also adjacent to another special character (unless the character
is a parenthesis and the blank is outside it, too). For example, the
language processor does not remove the blank in A (Z).
This is a concatenation that is not equivalent to A(Z),
a function call. The language processor does remove the blanks in (A) + (Z) because
this is equivalent to (A)+(Z).
The following example shows how a clause is composed of tokens:
'REPEAT' A + 3;
This is composed of six tokens: a literal string ('REPEAT'),
a blank operator, a symbol (A, which can have a value),
an operator (+), a second symbol (3,
which is a number and a symbol), and the clause delimiter (;).
The blanks between the A and the + and
between the + and the 3 are removed.
However, one of the blanks between the 'REPEAT' and
the A remains as an operator. Thus, this clause is
treated as though written:
'REPEAT' A+3;
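The parenthesis exception and the clause example can be sketched as follows. The values assigned to Z and A are illustrative, the built-in LENGTH function stands in for the function name A used in the text, and the sketch assumes that LENGTH has not been assigned a value as a variable.

```rexx
/* Sketch: a blank before "(" turns a function call into a concatenation */
z = 5                          /* illustrative value */
say length(z)                  /* function call: 1 */
say length (z)                 /* concatenation with a blank: LENGTH 5 */
say (2) + (3)                  /* blanks around + are removed: 5 */

a = 4                          /* illustrative value */
say 'REPEAT' a + 3             /* REPEAT 7 */
say 'REPEAT'     a+3           /* REPEAT 7: several blanks act as one blank operator */
```

In the second SAY the symbol LENGTH has no value, so it evaluates to its own name in uppercase and is then joined to the value of Z by the blank operator.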
1 Note that some code pages do not include lowercase English characters a–z.
2 The encoding of the exclamation point character depends on the code page in use.
|