Using the vector libraries
- libmassv.a: The generic vector library that runs on any supported POWER® processor. Unless your application requires this portability, use the appropriate architecture-specific library below for maximum performance.
- libmassvp8.a: Contains functions that have been tuned for the POWER8® architecture.
The single-precision and double-precision floating-point functions contained in the vector libraries are summarized in Table 1. The integer functions contained in the vector libraries are summarized in Table 2. Note that in C and C++ applications, only call by reference is supported, even for scalar arguments.
With a few exceptions described below, the floating-point functions in the vector libraries accept three parameters:
- A double-precision (for double-precision functions) or single-precision (for single-precision functions) vector output parameter
- A double-precision (for double-precision functions) or single-precision (for single-precision functions) vector input parameter
- An integer vector-length parameter.
The functions are of the form function_name (y,x,n), where y is the target vector, x is the source vector, and n is the vector length. The parameters y and x are assumed to be double-precision for functions with the prefix v, and single-precision for functions with the prefix vs. As an example, the following code:

    #include <massv.h>
    double x[500], y[500];
    int n;
    n = 500;
    ...
    vexp (y, x, &n);

outputs a vector y of length 500 whose elements are exp(x[i]), where i=0,...,499.
The functions vdiv, vsincos, vpow, and vatan2 (and their single-precision versions, vsdiv, vssincos, vspow, and vsatan2) take four arguments. The functions vdiv, vpow, and vatan2 take the arguments (z,x,y,n). The function vdiv outputs a vector z whose elements are x[i]/y[i], where i=0,..,*n-1. The function vpow outputs a vector z whose elements are x[i] raised to the power y[i], where i=0,..,*n-1. The function vatan2 outputs a vector z whose elements are atan(x[i]/y[i]), where i=0,..,*n-1. The function vsincos takes the arguments (y,z,x,n), and outputs two vectors, y and z, whose elements are sin(x[i]) and cos(x[i]), respectively.
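The calling convention can be tried out without the MASS library itself. The following is a minimal portable sketch, not the MASS implementation: ref_vrec and ref_vdiv are hypothetical stand-ins that only mirror the (y,x,n) and (z,x,y,n) argument orders of vrec and vdiv, including the vector length passed by reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the same shape as vrec (y, x, n):
   output vector first, then input vector, then a POINTER to the length. */
void ref_vrec(double y[], double x[], int *n)
{
    assert(y != NULL && x != NULL && n != NULL);
    for (int i = 0; i < *n; i++)
        y[i] = 1.0 / x[i];
}

/* Hypothetical stand-in with the same shape as vdiv (z, x, y, n):
   z[i] = x[i] / y[i] for i = 0, ..., *n-1. */
void ref_vdiv(double z[], double x[], double y[], int *n)
{
    assert(z != NULL && x != NULL && y != NULL && n != NULL);
    for (int i = 0; i < *n; i++)
        z[i] = x[i] / y[i];
}
```

A caller declares the length as a variable, for example int n = 4;, and writes ref_vdiv (z, x, y, &n);. Passing n itself or a literal 4 does not match the prototype, which is the practical consequence of the call-by-reference rule.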
In vcosisin(y,x,n) and vscosisin(y,x,n), x is a vector of n elements and the function outputs a vector y of n _Complex elements of the form (cos(x[i]),sin(x[i])).
Double-precision function | Single-precision function | Description | Double-precision function prototype | Single-precision function prototype |
---|---|---|---|---|
vacos | vsacos | Sets y[i] to the arc cosine of x[i], for i=0,..,*n-1 | void vacos (double y[], double x[], int *n); | void vsacos (float y[], float x[], int *n); |
vacosh | vsacosh | Sets y[i] to the hyperbolic arc cosine of x[i], for i=0,..,*n-1 | void vacosh (double y[], double x[], int *n); | void vsacosh (float y[], float x[], int *n); |
vasin | vsasin | Sets y[i] to the arc sine of x[i], for i=0,..,*n-1 | void vasin (double y[], double x[], int *n); | void vsasin (float y[], float x[], int *n); |
vasinh | vsasinh | Sets y[i] to the hyperbolic arc sine of x[i], for i=0,..,*n-1 | void vasinh (double y[], double x[], int *n); | void vsasinh (float y[], float x[], int *n); |
vatan2 | vsatan2 | Sets z[i] to the arc tangent of x[i]/y[i], for i=0,..,*n-1 | void vatan2 (double z[], double x[], double y[], int *n); | void vsatan2 (float z[], float x[], float y[], int *n); |
vatanh | vsatanh | Sets y[i] to the hyperbolic arc tangent of x[i], for i=0,..,*n-1 | void vatanh (double y[], double x[], int *n); | void vsatanh (float y[], float x[], int *n); |
vcbrt | vscbrt | Sets y[i] to the cube root of x[i], for i=0,..,*n-1 | void vcbrt (double y[], double x[], int *n); | void vscbrt (float y[], float x[], int *n); |
vcos | vscos | Sets y[i] to the cosine of x[i], for i=0,..,*n-1 | void vcos (double y[], double x[], int *n); | void vscos (float y[], float x[], int *n); |
vcosh | vscosh | Sets y[i] to the hyperbolic cosine of x[i], for i=0,..,*n-1 | void vcosh (double y[], double x[], int *n); | void vscosh (float y[], float x[], int *n); |
vcosisin | vscosisin | Sets the real part of y[i] to the cosine of x[i] and the imaginary part of y[i] to the sine of x[i], for i=0,..,*n-1 | void vcosisin (double _Complex y[], double x[], int *n); | void vscosisin (float _Complex y[], float x[], int *n); |
vdint | | Sets y[i] to the integer truncation of x[i], for i=0,..,*n-1 | void vdint (double y[], double x[], int *n); | |
vdiv | vsdiv | Sets z[i] to x[i]/y[i], for i=0,..,*n-1 | void vdiv (double z[], double x[], double y[], int *n); | void vsdiv (float z[], float x[], float y[], int *n); |
vdnint | | Sets y[i] to the nearest integer to x[i], for i=0,..,*n-1 | void vdnint (double y[], double x[], int *n); | |
verf | vserf | Sets y[i] to the error function of x[i], for i=0,..,*n-1 | void verf (double y[], double x[], int *n); | void vserf (float y[], float x[], int *n); |
verfc | vserfc | Sets y[i] to the complementary error function of x[i], for i=0,..,*n-1 | void verfc (double y[], double x[], int *n); | void vserfc (float y[], float x[], int *n); |
vexp | vsexp | Sets y[i] to the exponential function of x[i], for i=0,..,*n-1 | void vexp (double y[], double x[], int *n); | void vsexp (float y[], float x[], int *n); |
vexp2 | vsexp2 | Sets y[i] to 2 raised to the power of x[i], for i=0,..,*n-1 | void vexp2 (double y[], double x[], int *n); | void vsexp2 (float y[], float x[], int *n); |
vexpm1 | vsexpm1 | Sets y[i] to (the exponential function of x[i])-1, for i=0,..,*n-1 | void vexpm1 (double y[], double x[], int *n); | void vsexpm1 (float y[], float x[], int *n); |
vexp2m1 | vsexp2m1 | Sets y[i] to (2 raised to the power of x[i])-1, for i=0,..,*n-1 | void vexp2m1 (double y[], double x[], int *n); | void vsexp2m1 (float y[], float x[], int *n); |
vhypot | vshypot | Sets z[i] to the square root of the sum of the squares of x[i] and y[i], for i=0,..,*n-1 | void vhypot (double z[], double x[], double y[], int *n); | void vshypot (float z[], float x[], float y[], int *n); |
vlog | vslog | Sets y[i] to the natural logarithm of x[i], for i=0,..,*n-1 | void vlog (double y[], double x[], int *n); | void vslog (float y[], float x[], int *n); |
vlog2 | vslog2 | Sets y[i] to the base-2 logarithm of x[i], for i=0,..,*n-1 | void vlog2 (double y[], double x[], int *n); | void vslog2 (float y[], float x[], int *n); |
vlog10 | vslog10 | Sets y[i] to the base-10 logarithm of x[i], for i=0,..,*n-1 | void vlog10 (double y[], double x[], int *n); | void vslog10 (float y[], float x[], int *n); |
vlog1p | vslog1p | Sets y[i] to the natural logarithm of (x[i]+1), for i=0,..,*n-1 | void vlog1p (double y[], double x[], int *n); | void vslog1p (float y[], float x[], int *n); |
vlog21p | vslog21p | Sets y[i] to the base-2 logarithm of (x[i]+1), for i=0,..,*n-1 | void vlog21p (double y[], double x[], int *n); | void vslog21p (float y[], float x[], int *n); |
vpow | vspow | Sets z[i] to x[i] raised to the power y[i], for i=0,..,*n-1 | void vpow (double z[], double x[], double y[], int *n); | void vspow (float z[], float x[], float y[], int *n); |
vqdrt | vsqdrt | Sets y[i] to the fourth root of x[i], for i=0,..,*n-1 | void vqdrt (double y[], double x[], int *n); | void vsqdrt (float y[], float x[], int *n); |
vrcbrt | vsrcbrt | Sets y[i] to the reciprocal of the cube root of x[i], for i=0,..,*n-1 | void vrcbrt (double y[], double x[], int *n); | void vsrcbrt (float y[], float x[], int *n); |
vrec | vsrec | Sets y[i] to the reciprocal of x[i], for i=0,..,*n-1 | void vrec (double y[], double x[], int *n); | void vsrec (float y[], float x[], int *n); |
vrqdrt | vsrqdrt | Sets y[i] to the reciprocal of the fourth root of x[i], for i=0,..,*n-1 | void vrqdrt (double y[], double x[], int *n); | void vsrqdrt (float y[], float x[], int *n); |
vrsqrt | vsrsqrt | Sets y[i] to the reciprocal of the square root of x[i], for i=0,..,*n-1 | void vrsqrt (double y[], double x[], int *n); | void vsrsqrt (float y[], float x[], int *n); |
vsin | vssin | Sets y[i] to the sine of x[i], for i=0,..,*n-1 | void vsin (double y[], double x[], int *n); | void vssin (float y[], float x[], int *n); |
vsincos | vssincos | Sets y[i] to the sine of x[i] and z[i] to the cosine of x[i], for i=0,..,*n-1 | void vsincos (double y[], double z[], double x[], int *n); | void vssincos (float y[], float z[], float x[], int *n); |
vsinh | vssinh | Sets y[i] to the hyperbolic sine of x[i], for i=0,..,*n-1 | void vsinh (double y[], double x[], int *n); | void vssinh (float y[], float x[], int *n); |
vsqrt | vssqrt | Sets y[i] to the square root of x[i], for i=0,..,*n-1 | void vsqrt (double y[], double x[], int *n); | void vssqrt (float y[], float x[], int *n); |
vtan | vstan | Sets y[i] to the tangent of x[i], for i=0,..,*n-1 | void vtan (double y[], double x[], int *n); | void vstan (float y[], float x[], int *n); |
vtanh | vstanh | Sets y[i] to the hyperbolic tangent of x[i], for i=0,..,*n-1 | void vtanh (double y[], double x[], int *n); | void vstanh (float y[], float x[], int *n); |
Integer functions are of the form function_name (x[], *n), where x[] is a vector of 4-byte (for vpopcnt4) or 8-byte (for vpopcnt8) numeric objects (integral or floating-point), and *n is the vector length.
Function | Description | Prototype |
---|---|---|
vpopcnt4 | Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 32-bit objects. | unsigned int vpopcnt4 (void *x, int *n); |
vpopcnt8 | Returns the total number of 1 bits in the concatenation of the binary representation of x[i], for i=0,..,*n-1, where x is a vector of 64-bit objects. | unsigned int vpopcnt8 (void *x, int *n); |
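
To make the description of vpopcnt4 concrete, here is a minimal portable sketch of the value it is described as returning. The name ref_vpopcnt4 and the helper popcount32 are hypothetical, and the bit-counting loop is a generic technique rather than the tuned MASS implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Count the 1 bits in one 32-bit word.
   Each pass of the loop clears the lowest set bit. */
static unsigned int popcount32(uint32_t w)
{
    unsigned int count = 0;
    while (w != 0) {
        w &= w - 1;
        count++;
    }
    return count;
}

/* Hypothetical reference for the documented result of vpopcnt4:
   the total number of 1 bits over *n objects of 4 bytes each.
   The objects may be integers or floats, so the bytes are copied
   into a uint32_t instead of being read through a cast pointer. */
unsigned int ref_vpopcnt4(void *x, int *n)
{
    assert(x != NULL && n != NULL);
    const unsigned char *bytes = x;
    unsigned int total = 0;
    for (int i = 0; i < *n; i++) {
        uint32_t w;
        memcpy(&w, bytes + (size_t)i * sizeof w, sizeof w);
        total += popcount32(w);
    }
    return total;
}
```

For the three words 0xF, 0x1, and 0x80000000 the function returns 4 + 1 + 1 = 6. As with the floating-point functions, the length is passed by reference.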
Overlap of input and output vectors
- For calls to vector functions that take one input and one output vector (for example, vsin (y, x, &n)):
  The vectors x[0:n-1] and y[0:n-1] must be either disjoint or identical, or unexpected results might be obtained.
- For calls to vector functions that take two input vectors (for example, vatan2 (y, x1, x2, &n)):
  The previous restriction applies to both pairs of vectors y,x1 and y,x2. That is, y[0:n-1] and x1[0:n-1] must be either disjoint or identical; and y[0:n-1] and x2[0:n-1] must be either disjoint or identical.
- For calls to vector functions that take two output vectors (for example, vsincos (y1, y2, x, &n)):
  The previous restriction applies to both pairs of vectors y1,x and y2,x. That is, y1[0:n-1] and x[0:n-1] must be either disjoint or identical; and y2[0:n-1] and x[0:n-1] must be either disjoint or identical. Also, the vectors y1[0:n-1] and y2[0:n-1] must be disjoint.
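The overlap rules can be checked before a call, for example in a debug build. The sketch below is one possible way to do that and is not part of MASS: the function names are hypothetical, and comparing addresses through uintptr_t assumes a flat address space, which holds on common platforms but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte ranges [a, a+bytes) and [b, b+bytes) do not overlap. */
int ranges_disjoint(const void *a, const void *b, size_t bytes)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return (pa + bytes <= pb) || (pb + bytes <= pa);
}

/* Returns 1 if the two ranges are identical or disjoint, 0 if they
   overlap only partially (the case the rules forbid). */
int overlap_ok(const void *a, const void *b, size_t bytes)
{
    return a == b || ranges_disjoint(a, b, bytes);
}

/* Check for a one-input, one-output call such as vsin (y, x, &n). */
int check_unary_call(const double *y, const double *x, int n)
{
    assert(n >= 0);
    return overlap_ok(y, x, (size_t)n * sizeof(double));
}

/* Check for a two-output call such as vsincos (y1, y2, x, &n):
   each output must be identical to or disjoint from the input,
   and the two outputs must be disjoint from each other. */
int check_sincos_call(const double *y1, const double *y2,
                      const double *x, int n)
{
    assert(n >= 0);
    size_t bytes = (size_t)n * sizeof(double);
    return overlap_ok(y1, x, bytes)
        && overlap_ok(y2, x, bytes)
        && ranges_disjoint(y1, y2, bytes);
}
```

With a 16-element buffer buf and n = 8, the pair (buf, buf) passes as identical, (buf + 8, buf) passes as disjoint, and (buf + 1, buf) fails because the ranges overlap partially.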
Alignment of input and output vectors
To get the best performance from the POWER8 vector libraries, align the input and output vectors on 8-byte (or better, 16-byte) boundaries.
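One way to meet this recommendation is to test or adjust addresses before calling the vector functions. The helpers below are a hypothetical sketch in portable C and are not provided by MASS; is_aligned and align_up are names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p lies on an `alignment`-byte boundary.
   `alignment` must be a power of two, such as 8 or 16. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Returns the first address at or after p that lies on an
   `alignment`-byte boundary. The caller must have reserved up to
   alignment-1 extra bytes so the result stays inside the buffer. */
void *align_up(void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    uintptr_t addr = (uintptr_t)p;
    uintptr_t aligned = (addr + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
    return (char *)p + (aligned - addr);
}
```

For example, a raw buffer of 500 * sizeof(double) + 15 bytes passed through align_up (raw, 16) yields room for a 16-byte-aligned vector of 500 doubles. In C11, declaring the array with _Alignas(16) or allocating it with aligned_alloc are standard alternatives.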
Consistency of MASS vector functions
All the functions in the MASS vector libraries are consistent, in the sense that a given input value will always produce the same result, regardless of its position in the vector, and regardless of the vector length.
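This property can be spot-checked for any function with the (y,x,n) signature. The following is a hypothetical test sketch, not part of MASS: is_consistent places one input value at every position of vectors of every length up to max_len and compares the results bit for bit, and sample_vrec is a portable stand-in used only so that the example compiles without the library.

```c
#include <assert.h>
#include <string.h>

#define MAX_LEN 64

/* Pointer to a function with the three-parameter vector signature. */
typedef void (*vec_fn)(double y[], double x[], int *n);

/* Hypothetical stand-in with the signature of vrec. */
void sample_vrec(double y[], double x[], int *n)
{
    for (int i = 0; i < *n; i++)
        y[i] = 1.0 / x[i];
}

/* Returns 1 if fn maps `value` to a bitwise-identical result at every
   position of every vector length from 1 to max_len, 0 otherwise. */
int is_consistent(vec_fn fn, double value, int max_len)
{
    double x[MAX_LEN], y[MAX_LEN];
    double expected;
    int one = 1;

    assert(fn != NULL && max_len >= 1 && max_len <= MAX_LEN);

    /* Reference result: the value alone in a vector of length 1. */
    x[0] = value;
    fn(y, x, &one);
    expected = y[0];

    for (int len = 1; len <= max_len; len++) {
        for (int pos = 0; pos < len; pos++) {
            for (int i = 0; i < len; i++)
                x[i] = 1.0 + i;          /* arbitrary filler values */
            x[pos] = value;
            fn(y, x, &len);
            if (memcmp(&y[pos], &expected, sizeof expected) != 0)
                return 0;
        }
    }
    return 1;
}
```

When linking against MASS, the same checker could be called as is_consistent (vrec, 3.0, 64). A passing run is evidence for the tested value only, not a proof for all inputs.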