Missing percentiles produced by FREQUENCIES in Release 18 and above

Technote (troubleshooting)


Problem (Abstract)

I ran the following FREQUENCIES command in Releases 17 and 18. There were 9 cases in the data set.

FREQUENCIES
VARIABLES=ALL /FORMAT=NOTABLE
/NTILES=100
/ORDER= ANALYSIS .

In Release 17, this syntax populates the NTILE percentile table all the way from 1 to 99. When it is run on the same data set in Release 18, the percentile table is only populated up to the 85th percentile, with 90-99 completely blank. I would expect Release 18 to produce the same results as Release 17.

Resolving the problem

The difference reflects the reversal of a change that was made to the FREQUENCIES algorithm in Release 8.

The percentiles algorithm in FREQUENCIES employs the HAVERAGE method of percentile calculation (see Technote 1480663). The change in Release 8 was intended to address the issue of missing percentile estimates, which occur under the standard HAVERAGE algorithm when (W+1)*p>=W. Here W is the sum of the case weights, or N for unweighted data, and p is the percentile divided by 100. When W=9, (W+1)*p is at least 9 for all p of .90 and above, so the 90th and higher percentiles would be system missing under the standard HAVERAGE method. When W=5, (W+1)*p is at least 5 for all p greater than .83, so the 84th and higher percentiles would be system missing under the standard HAVERAGE method.
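The threshold arithmetic above can be checked with a short script. This is only a sketch in Python, not SPSS code: the function name missing_percentiles is hypothetical, and it applies only the (W+1)*p>=W condition stated above, not the full HAVERAGE interpolation.

```python
from fractions import Fraction


def missing_percentiles(w, ntiles=100):
    """Return the ntile points k (1..ntiles-1) for which (W+1)*p >= W.

    w      -- sum of the case weights (N for unweighted data)
    ntiles -- number of equal groups, as in /NTILES=100
    """
    w = Fraction(w)                 # exact arithmetic, no rounding at the boundary
    missing = []
    for k in range(1, ntiles):
        p = Fraction(k, ntiles)     # percentile divided by 100 when ntiles=100
        if (w + 1) * p >= w:        # condition quoted in the Technote
            missing.append(k)
    return missing


print(missing_percentiles(9)[0])    # → 90
print(missing_percentiles(5)[0])    # → 84
```

For the 9-case data set in the problem description, this lists percentiles 90 through 99 as the ones affected by the condition.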

The revised method in Release 8 made a second attempt to compute percentiles at or above the (W+1)*p>=W threshold by reversing the data, computing the (100 - P)th percentile, where P is the requested percentile, and taking the negative of this value as the Pth percentile. However, there were problems with both the conception and the implementation of this revision that could lead to nonmonotonic values for the percentiles. The changes from Release 8 were reversed for Statistics 18, which again sets percentiles to system missing when (W+1)*p>=W.

Note that in Releases 8 through 17, as long as W (the sum of the case weights) was greater than 1, the percentiles that were missing before Release 8, and are missing again in Release 18, were reported simply as the largest observed value of the variable.
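The difference between releases for the affected percentiles can be summarized in a small sketch. It is written in Python for illustration only and is not how SPSS is implemented: the function name, the "standard" marker, and the sample data are all hypothetical, it covers unweighted data only, and the case W<=1 in Releases 8 through 17 is not described here, so the sketch leaves it as missing.

```python
def reported_for_percentile(values, percentile, release):
    """Describe what is reported for one percentile of unweighted data.

    Returns "standard" when the percentile is below the threshold and the
    usual HAVERAGE estimate applies (not computed in this sketch), the
    largest observed value for the Release 8-17 behavior, or None for
    system missing.
    """
    w = len(values)                                  # unweighted data: W = N
    # (W+1)*p >= W with p = percentile/100, kept in integers
    at_or_above_threshold = (w + 1) * percentile >= 100 * w
    if not at_or_above_threshold:
        return "standard"
    if 8 <= release <= 17 and w > 1:
        return max(values)                           # largest observed value
    return None                                      # system missing


data = [2, 4, 4, 5, 7, 8, 9, 11, 15]                 # hypothetical 9 cases
print(reported_for_percentile(data, 95, 17))         # → 15
print(reported_for_percentile(data, 95, 18))         # → None
print(reported_for_percentile(data, 50, 18))         # → standard
```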

For background information on percentile calculation in Statistics procedures, see Technote 1480663. Its resolution section includes links to other informative web sites. You can find the algorithms for the various percentile methods via Help>Algorithms; scroll down to the EXAMINE Algorithms link. The Release 8 revision to HAVERAGE in FREQUENCIES was not applied to the EXAMINE command (Analyze>Descriptive Statistics>Explore).

Historical Number

86272

Document information


More support for:

SPSS Statistics

Software version:

18.0

Operating system(s):

Platform Independent

Reference #:

1484728

Modified date:

2013-09-22
