Floating-point formats

XL C/C++ supports the following binary floating-point formats:
  • 32-bit single precision, with an approximate absolute normalized range of 0 and 10^-38 to 10^38 and precision of about 7 decimal digits
  • 64-bit double precision, with an approximate absolute normalized range of 0 and 10^-308 to 10^308 and precision of about 16 decimal digits
  • 128-bit extended precision, with slightly greater range than double-precision values, and with a precision of about 32 decimal digits

XL C/C++ extended precision is not in the binary128 format that is suggested by the IEEE standard. The IEEE standard suggests that extended formats use more bits in the exponent for greater range and the fraction for greater precision.

Special numbers, such as NaN, infinity, and negative zero, are not fully supported in 128-bit extended precision. Arithmetic operations on extended-precision values do not necessarily propagate these numbers.


