Consistency of MASS vector library functions

 Product documentation
 
Abstract
In the interest of speed, certain MASS vector functions may produce slightly different results for a given input value, depending on its position in the vector, the vector length, and nearby elements of the input vector. In most applications, these inconsistencies will not have any significant impact.
 
 
Content
In the interest of speed, the MASS libraries make certain trade-offs, one of which affects the consistency of certain MASS vector routines. The result computed for a particular input value can vary slightly (usually only in the least significant bit) depending on its position in the vector, the vector length, and nearby elements of the input vector. In addition, the results produced by the different MASS libraries are not necessarily bit-wise identical. Most users are unlikely to be affected, but the behavior is described here so that you can determine whether it affects your application. One case that can be affected is debugging a parallel program by varying the number of processors or the distribution of the data across the processors (or both) while expecting bit-wise identical results.
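To judge whether last-bit differences matter for an application, one option is to compare results by their distance in units in the last place (ULPs) rather than bit-wise. The following is a minimal sketch, not part of MASS; the names ordered_key and ulp_distance are hypothetical, and the code assumes IEEE 754 double precision and finite (non-NaN) inputs.

```c
#include <stdint.h>
#include <string.h>

/* Map a double to an integer key that increases monotonically with the
   value. Assumes IEEE 754 binary64 and a finite, non-NaN argument. */
int64_t ordered_key(double x)
{
    int64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret without aliasing issues */
    /* Negative doubles are stored as sign plus magnitude; mirror them so
       that -0.0 maps to 0 and more negative values map to smaller keys. */
    return bits < 0 ? INT64_MIN - bits : bits;
}

/* Number of representable doubles between a and b (0 if they are equal). */
uint64_t ulp_distance(double a, double b)
{
    int64_t ka = ordered_key(a);
    int64_t kb = ordered_key(b);
    /* Subtract as unsigned to avoid signed overflow for distant values. */
    return ka >= kb ? (uint64_t)ka - (uint64_t)kb
                    : (uint64_t)kb - (uint64_t)ka;
}
```

A test that currently requires bit-wise equality could instead accept results for which ulp_distance is at most a small tolerance such as 1, if that is acceptable for the application.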

The vector MASS routines contain a main loop that computes K (K=4 or 8, depending on the target machine) elements of the output vector per iteration. If the number of input vector elements N is not a multiple of K, a "tail" loop is used to compute the remaining N mod K elements. In the interest of speed, the algorithm used for the tail is not necessarily the same as the one used by the main loop. A consequence of this is that, for certain input values, a slightly different result may be computed, depending on whether the value occurs in the last N mod K elements of the input vector (and hence is computed by the tail loop) or whether it occurs in previous elements (and hence is computed by the main loop).
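The split between the main loop and the tail loop can be sketched as follows. This only illustrates the index arithmetic described above; it is not the MASS source, and the helper names tail_start and in_tail are hypothetical.

```c
#include <stddef.h>

/* Index of the first element handled by the tail loop for a vector of
   n elements processed k at a time. Equals n when n is a multiple of k,
   in which case there is no tail. */
size_t tail_start(size_t n, size_t k)
{
    return n - (n % k);
}

/* Returns 1 if element i (0-based) would be computed by the tail loop,
   0 if it would be computed by the main loop. */
int in_tail(size_t i, size_t n, size_t k)
{
    return i >= tail_start(n, k);
}
```

For example, with n = 1007 and k = 8, elements 0 through 999 fall in the main loop and elements 1000 through 1006 (the last 1007 mod 8 = 7 elements) fall in the tail, so the same input value could give a slightly different result at index 999 than at index 1000.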

Also in the interest of speed, certain special (such as extremely small or large) input argument values can cause a block of K results to be re-computed with a different algorithm. A possible consequence is that the same input value may produce slightly different results, depending on whether a nearby input value is special. This should rarely happen for most applications.
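As a rough illustration of what "nearby" means here, the sketch below tests whether two elements fall in the same block of K. It assumes that blocks are counted from the start of the vector, which this document does not state explicitly; the helper name same_block is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if elements i and j (0-based) lie in the same block of k
   consecutive elements, assuming blocks start at index 0. */
int same_block(size_t i, size_t j, size_t k)
{
    return i / k == j / k;
}
```

Under that assumption, with k = 8 a special value at index 8 could affect the result at index 15 but not the result at index 7.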

Inconsistency caused by the tail loop can be avoided by always calling the vector MASS routines with a vector length that is a multiple of 8, padding any unused final positions with a dummy value if necessary. Because a multiple of 8 is also a multiple of 4, the tail loop is then never used for either value of K. For long vectors, the overhead is negligible compared to the time required to compute the entire vector. Note, however, that results can still be inconsistent if the input vector contains any extreme values.
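The padding approach can be sketched as a small wrapper. This is an illustration under stated assumptions: the vec_fn type mirrors the calling convention of double-precision MASS vector routines such as vexp (output array, input array, pointer to length), and padded_length, call_padded, and standin_vdouble are hypothetical names. The stand-in routine only lets the sketch run without MASS installed. The dummy value should be an ordinary in-range argument for the routine being called, not an extreme one.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed calling convention of a unary double-precision vector routine. */
typedef void (*vec_fn)(double *y, double *x, int *n);

/* Round n up to the next multiple of 8. */
int padded_length(int n)
{
    return (n + 7) / 8 * 8;
}

/* Call f on a copy of x padded with `dummy` to a multiple of 8, then copy
   the first n results into y. Returns 0 on success, -1 if out of memory. */
int call_padded(vec_fn f, double *y, const double *x, int n, double dummy)
{
    if (n <= 0)
        return 0;                        /* nothing to compute */
    int m = padded_length(n);
    double *xin = malloc((size_t)m * sizeof *xin);
    double *yout = malloc((size_t)m * sizeof *yout);
    if (xin == NULL || yout == NULL) {
        free(xin);
        free(yout);
        return -1;
    }
    memcpy(xin, x, (size_t)n * sizeof *xin);
    for (int i = n; i < m; i++)
        xin[i] = dummy;                  /* fill the unused final positions */
    f(yout, xin, &m);
    memcpy(y, yout, (size_t)n * sizeof *y);  /* discard the padded results */
    free(xin);
    free(yout);
    return 0;
}

/* Stand-in with the same signature (NOT a MASS routine): y[i] = 2 * x[i]. */
void standin_vdouble(double *y, double *x, int *n)
{
    for (int i = 0; i < *n; i++)
        y[i] = 2.0 * x[i];
}
```

With MASS available, a call such as call_padded(vexp, y, x, n, 1.0) would presumably take the place of a direct call to vexp. Code that already controls its buffer sizes can avoid the extra copies by allocating the vectors with padded_length(n) elements up front.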

Beginning with MASS version 3.3, some routines in libmassvp4.a were modified to be consistent. The consistent routines are as follows:

Version 3.3: vsqrt, vssqrt, vexp, vsexp, vlog, vrec, vdiv, vsin, vcos

Version 3.4 and higher: vsqrt, vssqrt, vexp, vsexp, vlog, vrec, vdiv, vsin, vcos, vacos, vasin, vatan2, vrsqrt, vscos, vsdiv, vsrec, vssin

For long vectors, most of the consistent routines run at the same speed as the routines they replace. (Exceptions are vsin, vcos, vssin, and vscos, for which the inconsistent versions can be faster on some vectors having arguments between approximately 0.78 and 1.) There are also some differences in speed for short vectors (see below). (vsin and vcos are not compared in the table since POWER4-tuned versions were not present prior to MASS version 3.3.)

Average relative elapsed time difference (percent) between MASS version 3.3 (consistent) and version 3.2.1 (non-consistent) routines in libmassvp4.a. (Negative means v3.3 is faster.)


Routine   n=1007   n=7
vsqrt        0     +33
vssqrt       0      -5
vexp         0      -8
vlog         0     +18
vrec         0     +41
vdiv         0     +19
 
 
Cross Reference information
Segment                                             Product                          Component  Platform          Version       Edition
pSeries Unix Servers (incl pSeries Intellistation)  pSeries - Enterprise Servers
RS/6000 (Servers and Workstations)                  RS/6000 Enterprise Servers
Operating Systems                                   AIX family                                  AIX               All Versions
Software Development                                XL C/C++                                    AIX, AIX5L, AIXL  All Versions
Software Development                                XL Fortran                                  AIX, AIXL         All versions
Software Development                                XL C Enterprise Edition for AIX             AIX               All Versions
Software Development                                VisualAge C++                               AIX               All versions
 
 

Copyright and trademark information
IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
Document information
 Product categories:
 Software
 Software Development
 Traditional Programming Language & Compilers
 Mathematical Acceleration Subsystem
 Libraries
 Operating system(s):
  AIX, AIX5L, AIXL
 Software version:
  All
 Reference #:
  7005373
 IBM Group:
 Software Group
 Modified date:
 2005-04-27
