Product documentation
Abstract
In the interest of speed, certain MASS vector functions may produce slightly different results for a given input value, depending on its position in the vector, the vector length, and nearby elements of the input vector. In most applications, these inconsistencies will not have any significant impact.
Content
All the functions in the AIX POWER7 MASS vector library (libmassvp7.a) are consistent. Information relating to the consistency of AIX MASS vector functions for other processors is given below.
In the interest of speed, the MASS libraries make certain tradeoffs. One of these involves the consistency of certain MASS vector routines. The result computed for a particular input value may vary slightly (usually only in the least significant bit) depending on its position in the vector, the vector length, and nearby elements of the input vector. Also, the results produced by the different MASS libraries are not necessarily bitwise identical. Most users are unlikely to be affected, but the behavior is described here so that you can determine whether it matters for your application. (One case that can be affected is debugging a parallel program by varying the number of processors, the distribution of the data across the processors, or both, while expecting the results to be bitwise identical.)
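To judge whether this matters for a given application, it can help to measure how two result vectors actually differ. The following is a minimal sketch, not part of MASS: the helper names bits_of, count_bitwise_mismatches, and ulp_distance are hypothetical, and the code assumes that double is a 64-bit IEEE 754 value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the bit pattern of a double into a 64-bit integer.
   memcpy avoids the aliasing problems of a pointer cast.
   Assumes double is 64 bits wide. */
static uint64_t bits_of(double v)
{
    uint64_t u;
    memcpy(&u, &v, sizeof u);
    return u;
}

/* Count the positions at which a[i] and b[i] differ in any bit.
   Note that +0.0 and -0.0 count as different here. */
size_t count_bitwise_mismatches(const double *a, const double *b, size_t n)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++)
        if (bits_of(a[i]) != bits_of(b[i]))
            mismatches++;
    return mismatches;
}

/* Distance between x and y in units of the last place.
   Only meaningful when both values are finite and have the same sign. */
uint64_t ulp_distance(double x, double y)
{
    uint64_t ux = bits_of(x), uy = bits_of(y);
    return ux > uy ? ux - uy : uy - ux;
}
```

A distance of 1 from ulp_distance corresponds to a difference in the least significant bit only, which is the usual size of the variation described above.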
The vector MASS routines contain a main loop that computes K (K=4 or 8, depending on the target machine) elements of the output vector per iteration. If the number of input vector elements N is not a multiple of K, a "tail" loop is used to compute the remaining N mod K elements. In the interest of speed, the algorithm used for the tail is not necessarily the same as the one used by the main loop. A consequence of this is that, for certain input values, a slightly different result may be computed, depending on whether the value occurs in the last N mod K elements of the input vector (and hence is computed by the tail loop) or whether it occurs in previous elements (and hence is computed by the main loop).
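The split between main loop and tail loop can be modeled with simple index arithmetic. This sketch only illustrates the block structure described above. It is not MASS source code, and the function names tail_count and in_tail are hypothetical.

```c
#include <stddef.h>

/* Number of elements left for the tail loop: N mod K. */
size_t tail_count(size_t n, size_t k)
{
    return n % k;
}

/* Returns 1 if element i (0-based, i < n) is among the last N mod K
   elements and is therefore computed by the tail loop, 0 if it is
   computed by the main loop. */
int in_tail(size_t i, size_t n, size_t k)
{
    return i >= n - n % k;
}
```

For example, with N=7 and K=4 the main loop covers elements 0 to 3 and the tail covers elements 4 to 6. With N=7 and K=8 the main loop never runs, so every element is computed by the tail loop.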
Also in the interest of speed, certain special (such as extremely small or large) input argument values can cause a block of K results to be recomputed with a different algorithm. A possible consequence is that the same input value may produce slightly different results, depending on whether a nearby input value is special. This should rarely happen for most applications.
Inconsistency can be avoided (assuming no extreme arguments) by always calling the vector MASS routines with a vector length that is a multiple of 8, and padding any final unused positions with a dummy value if necessary. For long vectors, the overhead will be negligible compared to the time required to compute the entire vector. Note, however, that it is still possible to get inconsistent results if there are any extreme values in the input vector.
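A padding wrapper along these lines is one way to apply this advice. It is a sketch under stated assumptions: the mass_vfunc type reflects the calling convention of the double-precision MASS vector routines as declared in massv.h (output vector, input vector, pointer to length), which you should verify against your installed header. The names padded_length, call_padded, and demo_vsquare are hypothetical, and demo_vsquare is only a stand-in so that the example compiles without MASS.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed shape of a double-precision MASS vector routine such as
   vsqrt or vexp: y[i] = f(x[i]) for 0 <= i < *n. Check massv.h. */
typedef void (*mass_vfunc)(double *y, double *x, int *n);

/* Round n up to the next multiple of 8. */
int padded_length(int n)
{
    return (n + 7) / 8 * 8;
}

/* Call f on a copy of x that is padded to a multiple of 8 with the
   value pad, then copy the first n results into y.
   Returns 0 on success, -1 if memory could not be allocated. */
int call_padded(mass_vfunc f, double *y, const double *x, int n, double pad)
{
    if (n <= 0)
        return 0;

    int m = padded_length(n);
    double *xp = malloc((size_t)m * sizeof *xp);
    double *yp = malloc((size_t)m * sizeof *yp);
    if (xp == NULL || yp == NULL) {
        free(xp);
        free(yp);
        return -1;
    }

    memcpy(xp, x, (size_t)n * sizeof *xp);
    for (int i = n; i < m; i++)
        xp[i] = pad;            /* dummy value in the unused positions */

    f(yp, xp, &m);              /* length is now a multiple of 8 */

    memcpy(y, yp, (size_t)n * sizeof *y);
    free(xp);
    free(yp);
    return 0;
}

/* Stand-in for a MASS routine, used only to exercise the wrapper. */
void demo_vsquare(double *y, double *x, int *n)
{
    for (int i = 0; i < *n; i++)
        y[i] = x[i] * x[i];
}
```

The pad value should be an ordinary, non-extreme argument for the routine being called (for example 1.0 for vsqrt), so that the padding itself does not trigger the special-value recomputation. The copies in this wrapper add overhead of their own. Allocating the application's vectors with the padded length from the start avoids them.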
Beginning with MASS version 3.3, some routines in libmassvp4.a and later vector libraries are consistent. The consistent routines are as follows:
Version 3.3: vsqrt, vssqrt, vexp, vsexp, vlog, vrec, vdiv, vsin, vcos
Version 3.4 and higher: vsqrt, vssqrt, vexp, vsexp, vlog, vrec, vdiv, vsin, vcos, vacos, vasin, vatan2, vrsqrt, vscos, vsdiv, vsrec, vssin
For long vectors, most of the consistent routines run at the same speed as the routines they replace. (Exceptions are vsin, vcos, vssin, and vscos, for which the inconsistent versions can be faster on some vectors having arguments between approximately 0.78 and 1.) There are also some differences in speed for short vectors (see below). (vsin and vcos are not compared in the table since POWER4-tuned versions were not present prior to MASS version 3.3.)
Average relative elapsed time difference (percent) between MASS version 3.3 (consistent) and version 3.2.1 (nonconsistent) routines in libmassvp4.a. (Negative means v3.3 is faster.)
Routine   n=1007   n=7
vsqrt     0        +33
vssqrt    0        5
vexp      0        8
vlog      0        +18
vrec      0        +41
vdiv      0        +19
Segment  Product  Component  Platform  Version  Edition 

pSeries Unix Servers (incl pSeries Intellistation)  pSeries  Enterprise Servers  
RS/6000 (Servers and Workstations)  RS/6000 Enterprise Servers  
Operating Systems  AIX family  AIX  All Versions  
Software Development  XL C/C++  AIX, AIX5L, AIXL  All Versions  
Software Development  XL Fortran  AIX, AIXL  All versions  
Software Development  XL C Enterprise Edition for AIX  AIX  All Versions  
Software Development  VisualAge C++  AIX  All versions 