Using the vector libraries

If you want to explicitly call any of the MASS vector functions, you can do so by including massv.h in your source files and linking your application with the appropriate vector library. (Information about linking is provided in Compiling and linking a program with MASS.)
libmassv.a
The generic vector library that runs on any supported POWER® processor. Unless your application requires this portability, use the appropriate architecture-specific library below for maximum performance.
libmassvp4.a
Contains some functions that have been tuned for the POWER4 architecture. The remaining functions are identical to those in libmassv.a. If you are using a PPC970 machine, this library is the recommended choice.
libmassvp5.a
Contains some functions that have been tuned for the POWER5 architecture. The remaining functions are identical to those in libmassv.a.
libmassvp6.a
Contains some functions that have been tuned for the POWER6® architecture. The remaining functions are identical to those in libmassv.a.
libmassvp7.a
Contains functions that have been tuned for the POWER7® architecture.
libmassvp8.a
Contains functions that have been tuned for the POWER8™ architecture.

All libraries can be used in either 32-bit or 64-bit mode.

The single-precision and double-precision floating-point functions contained in the vector libraries are summarized in Table 1. The integer functions contained in the vector libraries are summarized in Table 2. Note that in C and C++ applications, only call by reference is supported, even for scalar arguments.

With the exception of a few functions (described in the following paragraph), all of the floating-point functions in the vector libraries accept three parameters and are of the form
function_name (y,x,n)
where y is the target vector, x is the source vector, and n is the vector length. The parameters y and x are assumed to be double-precision for functions with the prefix v, and single-precision for functions with the prefix vs. As an example, the following code:
#include <massv.h>

double x[500], y[500];
int n;
n = 500;
...
vexp (y, x, &n);
outputs a vector y of length 500 whose elements are exp(x[i]), where i=0,...,499.

The functions vdiv, vhypot, vpow, vatan2, and vsincos (and their single-precision versions, vsdiv, vshypot, vspow, vsatan2, and vssincos) take four arguments. The functions vdiv, vhypot, vpow, and vatan2 take the arguments (z,x,y,n). The function vdiv outputs a vector z whose elements are x[i]/y[i], where i=0,..,*n-1. The function vhypot outputs a vector z whose elements are the square root of the sum of the squares of x[i] and y[i], where i=0,..,*n-1. The function vpow outputs a vector z whose elements are x[i] raised to the power y[i], where i=0,..,*n-1. The function vatan2 outputs a vector z whose elements are atan(x[i]/y[i]), where i=0,..,*n-1. The function vsincos takes the arguments (y,z,x,n), and outputs two vectors, y and z, whose elements are sin(x[i]) and cos(x[i]), respectively.

In vcosisin(y,x,n) and vscosisin(y,x,n), x is a vector of n elements and the function outputs a vector y of n _Complex elements of the form (cos(x[i]),sin(x[i])). If -D__nocomplex is used (see note in Table 1), the output is declared as y[][2] and holds y[i][0] = cos(x[i]) and y[i][1] = sin(x[i]), where i=0,..,*n-1.

Table 1. MASS floating-point vector functions
Double-precision function Single-precision function Description Double-precision function prototype Single-precision function prototype
vacos vsacos Sets y[i] to the arc cosine of x[i], for i=0,..,*n-1 void vacos (double y[], double x[], int *n); void vsacos (float y[], float x[], int *n);
vacosh vsacosh Sets y[i] to the hyperbolic arc cosine of x[i], for i=0,..,*n-1 void vacosh (double y[], double x[], int *n); void vsacosh (float y[], float x[], int *n);
vasin vsasin Sets y[i] to the arc sine of x[i], for i=0,..,*n-1 void vasin (double y[], double x[], int *n); void vsasin (float y[], float x[], int *n);
vasinh vsasinh Sets y[i] to the hyperbolic arc sine of x[i], for i=0,..,*n-1 void vasinh (double y[], double x[], int *n); void vsasinh (float y[], float x[], int *n);
vatan2 vsatan2 Sets z[i] to the arc tangent of x[i]/y[i], for i=0,..,*n-1 void vatan2 (double z[], double x[], double y[], int *n); void vsatan2 (float z[], float x[], float y[], int *n);
vatanh vsatanh Sets y[i] to the hyperbolic arc tangent of x[i], for i=0,..,*n-1 void vatanh (double y[], double x[], int *n); void vsatanh (float y[], float x[], int *n);
vcbrt vscbrt Sets y[i] to the cube root of x[i], for i=0,..,*n-1 void vcbrt (double y[], double x[], int *n); void vscbrt (float y[], float x[], int *n);
vcos vscos Sets y[i] to the cosine of x[i], for i=0,..,*n-1 void vcos (double y[], double x[], int *n); void vscos (float y[], float x[], int *n);
vcosh vscosh Sets y[i] to the hyperbolic cosine of x[i], for i=0,..,*n-1 void vcosh (double y[], double x[], int *n); void vscosh (float y[], float x[], int *n);
vcosisin (see note 1) vscosisin (see note 1) Sets the real part of y[i] to the cosine of x[i] and the imaginary part of y[i] to the sine of x[i], for i=0,..,*n-1 void vcosisin (double _Complex y[], double x[], int *n); void vscosisin (float _Complex y[], float x[], int *n);
vdint   Sets y[i] to the integer truncation of x[i], for i=0,..,*n-1 void vdint (double y[], double x[], int *n);  
vdiv vsdiv Sets z[i] to x[i]/y[i], for i=0,..,*n-1 void vdiv (double z[], double x[], double y[], int *n); void vsdiv (float z[], float x[], float y[], int *n);
vdnint   Sets y[i] to the nearest integer to x[i], for i=0,..,*n-1 void vdnint (double y[], double x[], int *n);  
verf vserf Sets y[i] to the error function of x[i], for i=0,..,*n-1 void verf (double y[], double x[], int *n); void vserf (float y[], float x[], int *n);
verfc vserfc Sets y[i] to the complementary error function of x[i], for i=0,..,*n-1 void verfc (double y[], double x[], int *n); void vserfc (float y[], float x[], int *n);
vexp vsexp Sets y[i] to the exponential function of x[i], for i=0,..,*n-1 void vexp (double y[], double x[], int *n); void vsexp (float y[], float x[], int *n);
vexp2 vsexp2 Sets y[i] to 2 raised to the power of x[i], for i=0,..,*n-1 void vexp2 (double y[], double x[], int *n); void vsexp2 (float y[], float x[], int *n);
vexpm1 vsexpm1 Sets y[i] to (the exponential function of x[i])-1, for i=0,..,*n-1 void vexpm1 (double y[], double x[], int *n); void vsexpm1 (float y[], float x[], int *n);
vexp2m1 vsexp2m1 Sets y[i] to (2 raised to the power of x[i]) - 1, for i=0,..,*n-1 void vexp2m1 (double y[], double x[], int *n); void vsexp2m1 (float y[], float x[], int *n);
vhypot vshypot Sets z[i] to the square root of the sum of the squares of x[i] and y[i], for i=0,..,*n-1 void vhypot (double z[], double x[], double y[], int *n); void vshypot (float z[], float x[], float y[], int *n);
vlog vslog Sets y[i] to the natural logarithm of x[i], for i=0,..,*n-1 void vlog (double y[], double x[], int *n); void vslog (float y[], float x[], int *n);
vlog2 vslog2 Sets y[i] to the base-2 logarithm of x[i], for i=0,..,*n-1 void vlog2 (double y[], double x[], int *n); void vslog2 (float y[], float x[], int *n);
vlog10 vslog10 Sets y[i] to the base-10 logarithm of x[i], for i=0,..,*n-1 void vlog10 (double y[], double x[], int *n); void vslog10 (float y[], float x[], int *n);
vlog1p vslog1p Sets y[i] to the natural logarithm of (x[i]+1), for i=0,..,*n-1 void vlog1p (double y[], double x[], int *n); void vslog1p (float y[], float x[], int *n);
vlog21p vslog21p Sets y[i] to the base-2 logarithm of (x[i]+1), for i=0,..,*n-1 void vlog21p (double y[], double x[], int *n); void vslog21p (float y[], float x[], int *n);
vpow vspow Sets z[i] to x[i] raised to the power y[i], for i=0,..,*n-1 void vpow (double z[], double x[], double y[], int *n); void vspow (float z[], float x[], float y[], int *n);
vqdrt vsqdrt Sets y[i] to the fourth root of x[i], for i=0,..,*n-1 void vqdrt (double y[], double x[], int *n); void vsqdrt (float y[], float x[], int *n);
vrcbrt vsrcbrt Sets y[i] to the reciprocal of the cube root of x[i], for i=0,..,*n-1 void vrcbrt (double y[], double x[], int *n); void vsrcbrt (float y[], float x[], int *n);
vrec vsrec Sets y[i] to the reciprocal of x[i], for i=0,..,*n-1 void vrec (double y[], double x[], int *n); void vsrec (float y[], float x[], int *n);
vrqdrt vsrqdrt Sets y[i] to the reciprocal of the fourth root of x[i], for i=0,..,*n-1 void vrqdrt (double y[], double x[], int *n); void vsrqdrt (float y[], float x[], int *n);
vrsqrt vsrsqrt Sets y[i] to the reciprocal of the square root of x[i], for i=0,..,*n-1 void vrsqrt (double y[], double x[], int *n); void vsrsqrt (float y[], float x[], int *n);
vsin vssin Sets y[i] to the sine of x[i], for i=0,..,*n-1 void vsin (double y[], double x[], int *n); void vssin (float y[], float x[], int *n);
vsincos vssincos Sets y[i] to the sine of x[i] and z[i] to the cosine of x[i], for i=0,..,*n-1 void vsincos (double y[], double z[], double x[], int *n); void vssincos (float y[], float z[], float x[], int *n);
vsinh vssinh Sets y[i] to the hyperbolic sine of x[i], for i=0,..,*n-1 void vsinh (double y[], double x[], int *n); void vssinh (float y[], float x[], int *n);
vsqrt vssqrt Sets y[i] to the square root of x[i], for i=0,..,*n-1 void vsqrt (double y[], double x[], int *n); void vssqrt (float y[], float x[], int *n);
vtan vstan Sets y[i] to the tangent of x[i], for i=0,..,*n-1 void vtan (double y[], double x[], int *n); void vstan (float y[], float x[], int *n);
vtanh vstanh Sets y[i] to the hyperbolic tangent of x[i], for i=0,..,*n-1 void vtanh (double y[], double x[], int *n); void vstanh (float y[], float x[], int *n);
Note:
  1. By default, these functions use the _Complex data type, which is only available for AIX® 5.2 and later, and does not compile on older versions of the operating system. To get an alternate prototype for these functions, compile with -D__nocomplex. This defines the functions as: void vcosisin (double y[][2], double *x, int *n); and void vscosisin (float y[][2], float *x, int *n);

Integer functions are of the form function_name (x[], *n), where x[] is a vector of 4-byte (for vpopcnt4) or 8-byte (for vpopcnt8) numeric objects (integral or floating-point), and *n is the vector length.

Table 2. MASS integer vector library functions
Function Description Prototype
vpopcnt4 Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 32-bit objects. unsigned int vpopcnt4 (void *x, int *n);
vpopcnt8 Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 64-bit objects. unsigned int vpopcnt8 (void *x, int *n);
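
Because the integer functions take a void pointer, they count bits in the stored representation of any 4-byte or 8-byte object, including floating-point values. The following sketch illustrates that behavior for the 32-bit case. ref_vpopcnt4 is a hypothetical scalar stand-in written for this example and is not the MASS implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for vpopcnt4: total number of 1 bits in
   *n consecutive 32-bit objects starting at x. */
unsigned int ref_vpopcnt4(void *x, int *n)
{
    const unsigned char *p = x;
    unsigned int total = 0;

    for (int i = 0; i < *n; i++) {
        uint32_t w;
        /* Copy the raw bits so that any 4-byte type can be inspected. */
        memcpy(&w, p + 4 * (size_t)i, sizeof w);
        while (w != 0) {
            w &= w - 1;   /* clear the lowest set bit */
            total++;
        }
    }
    return total;
}
```

For example, a one-element float vector holding 1.0f has the IEEE 754 bit pattern 0x3F800000, so the stand-in returns 7 for it.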

Overlap of input and output vectors

In most applications, the MASS vector functions are called with disjoint input and output vectors; that is, the two vectors do not overlap in memory. Another common usage scenario is to call them with the same vector for both input and output parameters (for example, vsin (y, y, &n)). For other kinds of overlap, be sure to observe the following restrictions, to ensure correct operation of your application:
  • For calls to vector functions that take one input and one output vector (for example, vsin (y, x, &n)):

    The vectors x[0:n-1] and y[0:n-1] must be either disjoint or identical, or the address of x[0] must be greater than the address of y[0]. That is, if x and y are not the same vector, the address of y[0] must not fall within the range of addresses spanned by x[0:n-1], or unexpected results may be obtained.

  • For calls to vector functions that take two input vectors (for example, vatan2 (y, x1, x2, &n)):

    The previous restriction applies to both pairs of vectors y,x1 and y,x2. That is, if y is not the same vector as x1, the address of y[0] must not fall within the range of addresses spanned by x1[0:n-1]; if y is not the same vector as x2, the address of y[0] must not fall within the range of addresses spanned by x2[0:n-1].

  • For calls to vector functions that take two output vectors (for example, vsincos (y1, y2, x, &n)):

    The above restriction applies to both pairs of vectors y1,x and y2,x. That is, if y1 and x are not the same vector, the address of y1[0] must not fall within the range of addresses spanned by x[0:n-1]; if y2 and x are not the same vector, the address of y2[0] must not fall within the range of addresses spanned by x[0:n-1]. Also, the vectors y1[0:n-1] and y2[0:n-1] must be disjoint.
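
The restriction for one input and one output vector can be checked before a call. The following is a minimal sketch of such a check for double-precision vectors. The helper name overlap_ok is hypothetical and is not provided by MASS. For functions with two inputs or two outputs, the same check would be applied to each input/output pair, together with a separate disjointness test for the two output vectors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of MASS). Returns 1 when output vector y
   and input vector x, both of length n, satisfy the documented rule:
   either they are the same vector, or the address of y[0] lies outside
   the addresses spanned by x[0:n-1]. */
int overlap_ok(const double *y, const double *x, int n)
{
    uintptr_t ya = (uintptr_t)y;
    uintptr_t xa = (uintptr_t)x;
    uintptr_t x_end = xa + (uintptr_t)n * sizeof(double); /* one past x[n-1] */

    if (ya == xa)
        return 1;                 /* identical vectors are allowed */
    return ya < xa || ya >= x_end;
}
```

Under this rule, an output that starts before the input is accepted even if the two ranges overlap, whereas an output that starts inside the input range is rejected.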

Alignment of input and output vectors

To get the best performance from the POWER7 and POWER8 vector libraries, align the input and output vectors on 16-byte boundaries.
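
One way to obtain 16-byte-aligned vectors on the heap is sketched below. It assumes a C11 library that provides aligned_alloc. The helper names alloc_vector16 and is_aligned16 are hypothetical and are introduced only for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical helper: allocates n doubles on a 16-byte boundary.
   The size is rounded up to a multiple of 16 because C11 aligned_alloc
   expects the size to be a multiple of the alignment.
   Returns NULL on failure; release the memory with free(). */
double *alloc_vector16(size_t n)
{
    size_t bytes = n * sizeof(double);
    bytes = (bytes + 15u) & ~(size_t)15u;
    return aligned_alloc(16, bytes);
}
```

Vectors with static or automatic storage can request the same alignment with the C11 _Alignas specifier, if the compiler in use supports it.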

Consistency of MASS vector functions

The accuracy of the vector functions is comparable to that of the corresponding scalar functions in libmass.a, though results might not be bitwise-identical.

In the interest of speed, the MASS libraries make certain trade-offs. One of these involves the consistency of certain MASS vector functions. For certain functions, it is possible that the result computed for a particular input value varies slightly (usually only in the least significant bit) depending on its position in the vector, the vector length, and nearby elements of the input vector. Also, the results produced by the different MASS libraries are not necessarily bitwise identical.

All the functions in libmassvp7.a and libmassvp8.a are consistent.

The following functions are consistent in all versions of the library in which they appear.
double-precision functions
vacos, vacosh, vasin, vasinh, vatan2, vatanh, vcbrt, vcos, vcosh, vcosisin, vdint, vdnint, vexp2, vexpm1, vexp2m1, vlog, vlog2, vlog10, vlog1p, vlog21p, vpow, vqdrt, vrcbrt, vrqdrt, vsin, vsincos, vsinh, vtan, vtanh
single-precision functions
vsacos, vsacosh, vsasin, vsasinh, vsatan2, vsatanh, vscbrt, vscos, vscosh, vscosisin, vsexp, vsexp2, vsexpm1, vsexp2m1, vslog, vslog2, vslog10, vslog1p, vslog21p, vspow, vsqdrt, vsrcbrt, vsrqdrt, vssin, vssincos, vssinh, vssqrt, vstan, vstanh

The following functions are consistent in libmassvp3.a, libmassvp4.a, libmassvp5.a, and libmassvp6.a:

vsqrt and vrsqrt.

The following functions are consistent in libmassvp4.a, libmassvp5.a, and libmassvp6.a:

vrec, vsrec, vdiv, vsdiv, and vexp.

The following function is consistent in libmassv.a, libmassvp5.a, and libmassvp6.a:

vsrsqrt.

Older, inconsistent versions of some of these functions are available on the Mathematical Acceleration Subsystem for AIX website. If consistency is not required, there may be a performance advantage to using the older versions. For more information on consistency and avoiding inconsistency with the vector libraries, as well as performance and accuracy data, see the Mathematical Acceleration Subsystem website.