| The following tables provide approximate performance data for the MASS scalar and vector libraries running on a CBE PPU. The data was obtained by timing many repetitions of a loop over 1000 random arguments and includes all overheads. Because vectors of length 1000 are short enough to fit in the on-chip cache, timing in this way brings the input and output vectors into the cache. Performance may deteriorate significantly when the input and output vectors are not in the cache. Performance may also deteriorate for arguments at or near the end-points of the valid argument ranges.

Function names are given using the libmassv.a naming conventions, so that, for example, vexp indicates both the vector function vexp and the scalar function exp. The column labelled massv lists the results obtained with the libmassv.a library; the data represents the time for the evaluation of one vector element. The columns labelled libm and mass give the results from using the functions in the MASS C source code library libmassv.c to call the functions in the system math library and the libmass.a scalar library, respectively. The C source code was compiled with the IBM ppuxlc compiler using the -O option.

The times shown in the mass, massv, and libm columns are proportional to the number of PPU cycles taken, but do not represent cycles directly. The constant of proportionality is the same for all three columns, however, so the speedup ratio columns provide valid comparisons of MASS to libm. The system library measurements were made with the versions of the library available on the test systems. They may differ from the versions timed for previous versions of MASS.
Users may experience performance that differs from that found in these tables. Results will vary with vector length. Entries in the table where the library function does not exist, or the experiment was not performed, are left blank. The range key at the end of this section defines the range letters.

CBE PPU libmass.a and libmassv.a performance (length 1000 loop, 32-bit mode)

                      ======== time ========  === speedup ratio ===
function    range     mass    massv     libm  libm/mass  libm/massv
-------------------------------------------------------------------
vacos       B       273.81   133.54   598.34       2.19        4.48
vacosh      G       374.18   470.02  1369.25       3.66        2.91
vasin       B       263.98   133.65   637.42       2.41        4.77
vasinh      D       394.71   128.31  1278.04       3.24        9.96
vatan       B       203.99            694.02       3.40
            D       323.05            793.79       2.46
vatan2      D       339.55   120.82  1278.37       3.76       10.58
vatanh      B       410.00   100.68  1236.99       3.02       12.29
vcbrt       D       265.73    64.62  1043.61       3.93       16.15
vcopysign   D       164.04            120.59       0.74
vcos        B       122.83    51.21   487.63       3.97        9.52
            D       247.54    51.20   769.61       3.11       15.03
vcosh       D       216.08    65.63   899.28       4.16       13.70
vcosisin    B       306.41   106.39   958.44       3.13        9.01
            D       397.29   113.95  1575.87       3.97       13.83
vdint       D                 16.67
vdiv        D                 41.78
vdnint      D       111.99    25.78   140.80       1.26        5.46
verf        C       132.61            238.09       1.80
verfc       C       319.99            652.07       2.04
vexp        D       259.99    49.05   735.47       2.83       14.99
vexpm1      D       262.17    49.71   579.60       2.21       11.66
vhypot      D       265.03            915.68       3.46
vlgamma     H       631.29           1414.05       2.24
vlog        C       271.96    41.93  1096.64       4.03       26.15
vlog10      C       254.00    65.64  1411.99       5.56       21.51
vlog1p      H       221.26    90.04   672.59       3.04        7.47
vpow        C       544.06   120.03  1961.58       3.61       16.34
vqdrt       C                 36.58
vrcbrt      D                 43.12
vrec        D                 42.58
vrqdrt      C                 38.60
vrsqrt      C       220.99    35.43   384.17       1.74       10.84
vsacos      B       194.59    81.18   637.44       3.28        7.85
vsacosh     G       376.63   430.33   944.28       2.51        2.19
vsasin      B       193.79    81.18   570.88       2.95        7.03
vsasinh     D       388.37   122.79   759.06       1.95        6.18
vsatan      B       157.02            325.42       2.07
            D       261.19            349.96       1.34
vsatan2     D       416.69    86.73   885.07       2.12       10.20
vsatanh     B       405.98    96.75  1070.85       2.64       11.07
vscbrt      D       229.91    26.54   870.29       3.79       32.79
vscopysign  D       164.04            118.54       0.72
vscos       B       121.97    44.04   402.02       3.30        9.13
            D       241.55    44.03   774.11       3.20       17.58
vscosh      D       251.92    70.53  1269.88       5.04       18.00
vscosisin   B                 56.61
            D                 57.55
vsdiv       D                 33.97
vserf       C       149.81            255.37       1.70
vserfc      C       185.47            656.70       3.54
vsexp       D       250.28    39.93  1088.23       4.35       27.25
vsexpm1     D       225.28    38.83   722.26       3.21       18.60
vshypot     D       244.05            711.01       2.91
vsin        B        92.43    51.19   429.73       4.65        8.39
            D       276.06    51.19   753.32       2.73       14.72
vsincos     B       124.63    72.84   980.22       7.87       13.46
            D       228.60    72.84  1598.37       6.99       21.94
vsinh       D       268.14    65.37   865.51       3.23       13.24
vslgamma    H       601.66            924.36       1.54
vslog       C       224.05    47.72   608.38       2.72       12.75
vslog10     C       224.05    50.16   833.62       3.72       16.62
vslog1p     H       302.06    57.53   597.67       1.98       10.39
vspow       C       355.87   111.28  2833.02       7.96       25.46
vsqdrt      C                 34.68
vsqrt       C       277.99    37.08   338.08       1.22        9.12
vsrcbrt     D                 31.61
vsrec       D                 38.74
vsrint      D       130.20            104.86       0.81
vsrqdrt     C                 30.09
vsrsqrt     C                 35.86
vssin       B       140.03    44.02   334.46       2.39        7.60
            D       244.21    44.03   770.79       3.16       17.51
vssincos    B                 77.55
            D                 77.55
vssinh      D       266.85    45.94  1157.20       4.34       25.19
vssqrt      C                 45.74
vstan       D       297.90    69.41   929.15       3.12       13.39
vstanh      F       212.65    87.77  1187.16       5.58       13.53
vtan        D       237.36    83.44  1123.47       4.73       13.46
vtanh       F       310.15    85.25  1051.80       3.39       12.34

Range Key
A    0,1
B    -1,1
C    0,100
D    -100,100
E    -10,10
F    -20,20
G    1,100
H    -1,101
I    0,10 |
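The measurement method described above (many repetitions of a loop over a vector of random arguments that stays in cache, reported as time per element, with speedups formed as ratios of times) can be sketched roughly as follows. This is a minimal illustration, not the harness used to produce the tables. The names vrec_plain, fill_random, time_per_element, and speedup are hypothetical; vrec_plain is a plain C stand-in that only assumes the MASS vector argument order (output vector, input vector, pointer to length) and is not the MASS implementation. The sketch uses the standard clock() function and reports CPU seconds, whereas the table entries are in units proportional to PPU cycles.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Assumed MASS-style vector signature: y[i] = f(x[i]) for 0 <= i < *n. */
typedef void (*vec_fn)(double y[], double x[], int *n);

/* Hypothetical stand-in for a vector reciprocal; NOT the MASS code. */
void vrec_plain(double y[], double x[], int *n)
{
    for (int i = 0; i < *n; i++)
        y[i] = 1.0 / x[i];
}

/* Fill x[0..n-1] with pseudo-random values in [lo, hi]. */
void fill_random(double x[], int n, double lo, double hi)
{
    assert(n >= 0 && hi >= lo);
    for (int i = 0; i < n; i++)
        x[i] = lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* Time reps calls of f on n-element vectors and return CPU seconds per
 * element, including loop and call overheads. Returns -1.0 on invalid
 * arguments or allocation failure. */
double time_per_element(vec_fn f, int n, int reps, double lo, double hi)
{
    if (n <= 0 || reps <= 0)
        return -1.0;

    double *x = malloc((size_t)n * sizeof *x);
    double *y = malloc((size_t)n * sizeof *y);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1.0;
    }

    fill_random(x, n, lo, hi);
    f(y, x, &n);                  /* warm-up call: brings x and y into cache */

    volatile double sink = 0.0;   /* keeps the results observably used */
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++) {
        f(y, x, &n);
        sink += y[r % n];
    }
    clock_t t1 = clock();
    (void)sink;

    free(x);
    free(y);
    return (double)(t1 - t0) / CLOCKS_PER_SEC / ((double)reps * (double)n);
}

/* Speedup ratio as in the libm/mass and libm/massv columns. */
double speedup(double t_libm, double t_mass)
{
    return t_libm / t_mass;
}
```

For example, time_per_element(vrec_plain, 1000, 100000, 1.0, 100.0) times the stand-in on vectors of length 1000, and speedup(598.34, 133.54) evaluates to about 4.48, the value listed for vacos in the libm/massv column.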