Decimal floating-point numbers are represented in three formats:
short, long, or extended.
- Short: 1 sign bit, 11 combination field bits, 20 significand continuation
field bits
- Long: 1 sign bit, 13 combination field bits, 50 significand continuation
field bits
- Extended: 1 sign bit, 17 combination field bits, 110 significand
continuation field bits
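The field widths above can be checked with a small sketch. The table and helper names (`DFP_FORMATS`, `total_bits`, `split_fields`) are illustrative assumptions, not part of any assembler or library; the only facts used are the bit counts listed above and the assumption that the fields are laid out left to right in the order given.

```python
# Field widths in bits, copied from the format list above.
# The dictionary and function names are hypothetical, for illustration only.
DFP_FORMATS = {
    "short":    {"sign": 1, "combination": 11, "continuation": 20},
    "long":     {"sign": 1, "combination": 13, "continuation": 50},
    "extended": {"sign": 1, "combination": 17, "continuation": 110},
}


def total_bits(fmt):
    """Return the total width of a format in bits."""
    return sum(DFP_FORMATS[fmt].values())


def split_fields(fmt, word):
    """Split an integer bit pattern into (sign, combination, continuation).

    Assumes the sign is the leftmost bit, followed by the combination
    field, followed by the significand continuation field.
    """
    widths = DFP_FORMATS[fmt]
    cont_bits = widths["continuation"]
    comb_bits = widths["combination"]
    continuation = word & ((1 << cont_bits) - 1)
    combination = (word >> cont_bits) & ((1 << comb_bits) - 1)
    sign = (word >> (cont_bits + comb_bits)) & 1
    return sign, combination, continuation


if __name__ == "__main__":
    for name in DFP_FORMATS:
        print(name, total_bits(name))
```

The three totals come out to 32, 64, and 128 bits, which matches the usual storage sizes for the short, long, and extended formats.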
There are four classes of decimal floating-point data, including
numeric and related nonnumeric entities. Each data item consists of
a sign, an exponent, and a significand. The exponent is biased such
that all exponents are nonnegative unsigned numbers, and the minimum
biased exponent is zero. The significand is a string of decimal digits with
no implied digit: its leftmost digit is encoded in the combination field, and
its remaining digits are encoded in the significand continuation field. The
sign bit is zero for plus and one for minus values.
The classes are:
- Zeros have a zero significand and a sign. The biased exponent can be any
representable value.
- Numbers are finite nonzero values: the significand is nonzero and the
first 4 bits of the combination field are not 1111 (binary). The largest numbers have approximate
values 10^97 (short format), 10^385 (long format), and 10^6145 (extended format). The smallest numbers have approximate
values 10^-101 (short format), 10^-398 (long format), and 10^-6176 (extended format).
- An infinity is represented if the
first 5 bits of the combination field are 11110 (binary).
- A NaN (Not-a-Number) is represented if
the first 5 bits of the combination field are 11111 (binary). If the
following bit is 1, the NaN is a Signaling NaN; otherwise it is a
Quiet NaN.
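The infinity and NaN rules above amount to inspecting the leading bits of the combination field. The following is a minimal sketch of that test, assuming the bit pattern is given as an unsigned integer with the sign bit leftmost; the function name `classify_special` and the two lookup tables are hypothetical.

```python
# Widths in bits, taken from the format descriptions; names are illustrative.
COMBINATION_BITS = {"short": 11, "long": 13, "extended": 17}
TOTAL_BITS = {"short": 32, "long": 64, "extended": 128}


def classify_special(fmt, word):
    """Classify a bit pattern as infinity, a NaN, or a finite value.

    Only the leading bits of the combination field are examined:
    11110 means infinity, 11111 means NaN, and for a NaN the following
    bit selects signaling (1) or quiet (0).
    """
    comb_bits = COMBINATION_BITS[fmt]
    # The combination field sits directly after the single sign bit.
    shift = TOTAL_BITS[fmt] - 1 - comb_bits
    combination = (word >> shift) & ((1 << comb_bits) - 1)

    first5 = combination >> (comb_bits - 5)
    if first5 == 0b11110:
        return "infinity"
    if first5 == 0b11111:
        sixth = (combination >> (comb_bits - 6)) & 1
        return "signaling NaN" if sixth else "quiet NaN"
    return "finite"


if __name__ == "__main__":
    for pattern in (0x78000000, 0x7C000000, 0x7E000000, 0x22500001):
        print(f"{pattern:08X}", classify_special("short", pattern))
```

The sign bit is ignored by the test, so both positive and negative infinities are reported as "infinity". Telling zeros apart from other finite numbers would require decoding the significand, which this sketch does not attempt.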
The rounding modes used for decimal floating point are:
- R8: Decimal floating-point equivalent of binary floating-point R4
(round-half-even)
- R9: Decimal floating-point equivalent of binary floating-point R5
(truncate; round toward zero)
- R10: Decimal floating-point equivalent of binary floating-point R6
(round toward +infinity; ceiling)
- R11: Decimal floating-point equivalent of binary floating-point R7
(round toward -infinity; floor)
- R12: Decimal floating-point equivalent of binary floating-point R1
(round-half-up)
- R13: Round-half-down (no binary floating-point equivalent)
- R14: Round-up (away from zero; no binary floating-point equivalent)
- R15: Round-for-reround, or "prepare for shorter precision" (decimal
floating point only)
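To see how these modes differ on the same input, the sketch below maps each one onto what appears to be the closest rounding constant in Python's standard `decimal` module. The mapping is an assumption made for illustration, not something defined by this document; in particular, pairing R15 with `ROUND_05UP` relies on both being described as rounding away from zero only when the truncated last digit is 0 or 5. The names `MODES` and `round_to_integer` are hypothetical.

```python
from decimal import (
    Decimal,
    ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR,
    ROUND_HALF_UP, ROUND_HALF_DOWN, ROUND_UP, ROUND_05UP,
)

# Assumed correspondence between the mode names above and Python's
# decimal rounding constants (illustrative only).
MODES = {
    "R8": ROUND_HALF_EVEN,   # round-half-even
    "R9": ROUND_DOWN,        # truncate; round toward zero
    "R10": ROUND_CEILING,    # round toward +infinity
    "R11": ROUND_FLOOR,      # round toward -infinity
    "R12": ROUND_HALF_UP,    # round-half-up (ties away from zero)
    "R13": ROUND_HALF_DOWN,  # round-half-down (ties toward zero)
    "R14": ROUND_UP,         # round away from zero
    "R15": ROUND_05UP,       # round-for-reround
}


def round_to_integer(value, mode):
    """Round a decimal string to an integer under the given mode."""
    return Decimal(value).quantize(Decimal("1"), rounding=MODES[mode])


if __name__ == "__main__":
    for mode in MODES:
        results = [round_to_integer(v, mode) for v in ("2.5", "-2.5", "5.1")]
        print(mode, *results)
```

The inputs 2.5 and -2.5 separate the tie-breaking and directed modes, while 5.1 shows the round-for-reround behavior: truncation would leave a final digit of 5, so the result is pushed away from zero to 6, which keeps a later rounding to fewer digits from being biased.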