IBM Support

Obtaining a confidence interval for a median

Troubleshooting


Problem

Can IBM SPSS Statistics calculate the confidence interval for a median?

Resolving The Problem

There are several ways to compute confidence intervals for medians in IBM SPSS Statistics. To see the Help pages for these methods, choose Topics in the Help menu of SPSS Statistics and enter the search terms: median confidence interval.

I. Ratio Statistics

The Ratio Statistics procedure (Analyze->Descriptive Statistics->Ratio) prints confidence intervals for medians in a pivot table. Because the RATIO STATISTICS procedure provides this result only for ratios of two variables, a simple workaround is needed. The ratio of any number to 1 is the original number, so you create a new variable with a constant value of 1 for use as the denominator and use the variable of interest as the numerator when selecting variables in the Ratio Statistics dialog box.

Here's an example to illustrate this method. Open the data set Employee data.sav, which comes with SPSS, then follow the steps outlined below. (The sample data files are installed into the Samples directory where IBM SPSS Statistics was installed. For example, if you had installed the English language version of Statistics 19, then by default this sample file would be in the path:
'C:\Program Files\IBM\SPSS\Statistics\19\Samples\English\Employee data.sav'.) We will be calculating the 95% confidence interval for the median of the variable SALARY.

1. In the SPSS Data Editor menus, go to Transform->Compute Variable...

2. In the Compute Variable dialog box, type in any name that helps you remember that the new column will simply hold a constant value of one. In this example, we'll call the variable UNIT.

3. In the box labeled Numeric Expression, simply type the number 1 and then click OK.

4. In the menus, go to Analyze->Descriptive Statistics->Ratio...

5. In the Ratio Statistics dialog box, highlight the variable SALARY and click on the arrow to move it to the box labeled Numerator.

6. Similarly, highlight the variable UNIT and move it to the box labeled Denominator.

7. Click the Statistics button.

8. In the Statistics subdialog box, under Central Tendency, select Median and select Confidence Intervals. (Note that the default is for the 95% confidence interval, but you can modify the value in the subdialog box.)

9. Click Continue and click OK.

The results will show that the median is 28875, with a lower 95% confidence limit of 27750 and an upper 95% confidence limit of 30000. The actual coverage of this interval is slightly higher than the requested 95%: it is 95.2%, as reported in the Actual Coverage entry.

The command equivalents for the steps outlined above are as follows.

COMPUTE unit = 1 .
RATIO STATISTICS salary WITH unit
/PRINT = CIN(95) MEDIAN .
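
The Actual Coverage entry differs from the requested level because an interval whose limits are observed data values can attain only certain coverage levels. The Python sketch below illustrates that idea with a distribution-free interval built from order statistics and the binomial distribution. It is an illustration under that assumption, not a reproduction of the Ratio Statistics algorithm (see Help->Algorithms for the documented method). The function name median_ci_order_stats is hypothetical.

```python
from math import comb


def median_ci_order_stats(values, level=0.95):
    """Distribution-free CI for a median from order statistics (illustrative).

    Returns (lower, upper, actual_coverage), where lower and upper are the
    r-th smallest and r-th largest observations and r is the largest rank
    whose coverage is still at least `level`.
    """
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("at least one value is required")

    def coverage(r):
        # P(r <= B <= n - r) for B ~ Binomial(n, 1/2), where B counts the
        # observations that fall below the true median.
        lower_tail = sum(comb(n, k) for k in range(r))  # k = 0 .. r-1
        return 1 - 2 * lower_tail / 2 ** n

    r = 1
    # Narrow the interval while the coverage stays at or above the target.
    while r + 1 <= n // 2 and coverage(r + 1) >= level:
        r += 1
    return x[r - 1], x[n - r], coverage(r)


low, high, actual = median_ci_order_stats(range(1, 11))
print(low, high, actual)  # 2 9 0.978515625
```

With ten distinct values the closest attainable coverage at or above 95% is about 97.9%, which is why a procedure of this kind reports the actual coverage alongside the limits.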

Note that there are multiple methods for computing percentiles, including medians. The Explore procedure (Analyze->Descriptive Statistics->Explore; EXAMINE command) offers five options for computing percentiles. Frequencies uses one of these methods, HAVERAGE, while Custom Tables uses the AEMPIRICAL method. See Technical Note 1480663 ("How does SPSS Statistics calculate percentiles in FREQUENCIES?") for a discussion of the methods and where they might disagree, along with links to more detailed documentation.
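
To show how two percentile rules can disagree, here is a minimal Python sketch of HAVERAGE and AEMPIRICAL for unweighted cases. The formulas are written from our understanding of the usual definitions (a weighted average at position (n+1)p, and the empirical distribution function with averaging at position np). Treat them as assumptions and consult the Algorithms documentation for the authoritative definitions, including the handling of case weights. The function names are hypothetical.

```python
import math


def haverage(values, p):
    """Weighted average at position (n + 1) * p (unweighted cases only)."""
    x = sorted(values)
    n = len(x)
    pos = (n + 1) * p
    k = math.floor(pos)
    g = pos - k
    if k < 1:       # position falls before the first observation
        return x[0]
    if k >= n:      # position falls at or after the last observation
        return x[-1]
    return (1 - g) * x[k - 1] + g * x[k]


def aempirical(values, p):
    """Empirical distribution function with averaging at position n * p."""
    x = sorted(values)
    n = len(x)
    pos = n * p
    k = math.floor(pos)
    g = pos - k
    if k < 1:
        return x[0]
    if k >= n:
        return x[-1]
    if g == 0:      # exactly on an observation: average it with the next one
        return (x[k - 1] + x[k]) / 2
    return x[k]     # otherwise take the next observation


data = range(1, 11)
print(haverage(data, 0.25), aempirical(data, 0.25))  # 2.75 3
print(haverage(data, 0.50), aempirical(data, 0.50))  # 5.5 5.5
```

In this unweighted example the two rules give the same median but different 25th percentiles, so differences between procedures are more likely to appear for other percentiles or with weighted data.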

The algorithm for the median in Ratio Statistics, as described in the Ratio Statistics Algorithms under Help->Algorithms, is:
"The middle number of the sorted ratios if n is odd. The mean (average) of the two middle ratios if the n is even."
Note that Ratio Statistics rounds noninteger positive weights to the nearest integer, unlike the procedures described above. This may be another source of disagreement in the medians reported across the various procedures.
A further source of discrepancies with the means and medians reported by other procedures is that Ratio Statistics uses only cases with positive values for both numerator and denominator. If some cases have negative or 0 values on the variable(s) in question, Ratio Statistics will therefore use fewer cases than the other procedures.
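
The effect of the positive-values rule can be sketched in Python as follows. The function ratio_median is a hypothetical illustration of the quoted median rule combined with the exclusion of nonpositive cases. It ignores case weights and is not the SPSS implementation.

```python
from statistics import median


def ratio_median(numerators, denominators):
    """Median of numerator/denominator ratios, keeping only cases in which
    both values are positive (illustrative, unweighted)."""
    ratios = sorted(
        num / den
        for num, den in zip(numerators, denominators)
        if num > 0 and den > 0
    )
    n = len(ratios)
    if n == 0:
        raise ValueError("no cases with positive numerator and denominator")
    mid = n // 2
    if n % 2 == 1:
        return ratios[mid]                        # middle ratio when n is odd
    return (ratios[mid - 1] + ratios[mid]) / 2    # mean of the two middle ratios


values = [3, 1, 2, -5]
unit = [1, 1, 1, 1]
print(ratio_median(values, unit))  # 2.0, the case with -5 is dropped
print(median(values))              # 1.5, all four cases are used
```

The two results differ only because the case with a negative value is excluded from the ratio-based median.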

II. Bootstrapping

If you have the Bootstrapping option in SPSS Statistics, you can request bootstrapped confidence intervals for a median with the Explore procedure (Analyze->Descriptive Statistics->Explore). If the Bootstrap option is licensed, there will be a Bootstrap button in the main Explore dialog box. Click that button to open the Bootstrap dialog and specify the number of bootstrap samples, the type of confidence interval and the sampling method. When bootstrapping is requested from a procedure that supports it, the BOOTSTRAP command is run prior to the procedure, as in the following example:

BOOTSTRAP
/SAMPLING METHOD=SIMPLE
/VARIABLES TARGET=salary
/CRITERIA CILEVEL=95 CITYPE=PERCENTILE NSAMPLES=1000
/MISSING USERMISSING=EXCLUDE.
EXAMINE VARIABLES=salary
/PLOT BOXPLOT STEMLEAF
/COMPARE GROUPS
/PERCENTILES(5,10,25,50,75,90,95) HAVERAGE
/STATISTICS DESCRIPTIVES
/CINTERVAL 95
/MISSING LISTWISE
/NOTOTAL.
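
For readers who want to see what a percentile bootstrap interval involves, here is a minimal Python sketch using simple resampling with replacement. It mirrors the CITYPE=PERCENTILE and NSAMPLES=1000 settings above only in spirit: the function name, the seed argument and the rule for picking the percentile positions are assumptions for illustration, and the results will not match SPSS output exactly.

```python
import random
from statistics import median


def bootstrap_median_ci(values, level=0.95, n_samples=1000, seed=0):
    """Percentile bootstrap CI for a median (illustrative sketch)."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    data = list(values)
    n = len(data)
    # Resample n cases with replacement and record the median each time.
    medians = sorted(
        median(rng.choices(data, k=n)) for _ in range(n_samples)
    )
    alpha = 1 - level
    # Positions of the lower and upper percentiles among the sorted medians.
    lo_idx = round((alpha / 2) * (n_samples - 1))
    hi_idx = round((1 - alpha / 2) * (n_samples - 1))
    return medians[lo_idx], medians[hi_idx]


low, high = bootstrap_median_ci(range(1, 102))
print(low, high)
```

Because the limits depend on the random resamples, a different seed or a different number of samples gives slightly different limits.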

III. Error Bars

You can display the median's confidence interval graphically as error bars in a bar graph or line graph, using the Chart Builder under the Graphs menu.

The Chart Builder dialog steps to request a bar graph of the median salary with error bars are as follows. Salary must have a measurement level of Scale in the Variable View of the Data Editor.

1. The procedure requires an x-axis variable that designates the different groups and therefore the bars. If you do not have a group variable and want a single bar for the median that includes all cases, you can compute a variable with a single value to act as the X-Axis variable. To compute a variable called Unit that has the value 1 for all cases, open the Transform->Compute Variable menu choice, type "unit" (without quotes) in the Target Variable box, enter 1 in the Numeric Expression box and click OK to run the computation or Paste to paste the syntax into a syntax window. The syntax would appear as:
COMPUTE unit = 1.
EXECUTE.
If you have a group variable and want the median of salary for each group, skip step 1.

2. Choose Bar in the Gallery tab of the Chart Builder dialog and drag the icon for Simple Bar into the Chart Preview area.

3. Drag Salary into the Y-Axis box of the bar graph diagram. The axis will be labelled Mean Salary at first. Drag Unit or your group variable into the X-Axis box.

4. If the Element Properties panel is not displayed to the right of the dialog, click the Element Properties button to display it. In the Element Properties panel, highlight Bar1 under "Edit Properties of:".

5. Still in the Element Properties panel, use the scroll bar under Statistic in the Statistics area to choose Median.

6. Click the check box labelled "display error bars" in the center of the panel. The default confidence interval is 95% but you can alter that in the Level (%) box.

7. Click the Apply button at the bottom of the Element Properties panel.

8. Click OK to run the graph or Paste to paste the syntax to a Syntax window.

Here is the syntax for a bar graph of the median of salary with 95% CI error bars.
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=unit MEDIANCI(salary, 95)[name="MEDIAN_salary"
LOW="MEDIAN_salary_LOW" HIGH="MEDIAN_salary_HIGH"] MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: unit=col(source(s), name("unit"), unit.category())
DATA: MEDIAN_salary=col(source(s), name("MEDIAN_salary"))
DATA: LOW=col(source(s), name("MEDIAN_salary_LOW"))
DATA: HIGH=col(source(s), name("MEDIAN_salary_HIGH"))
GUIDE: axis(dim(1), label("unit"))
GUIDE: axis(dim(2), label("Median salary"))
GUIDE: text.footnote(label("Error Bars: 95% CI"))
SCALE: linear(dim(2), include(0))
ELEMENT: interval(position(unit*MEDIAN_salary), shape.interior(shape.square))
ELEMENT: interval(position(region.spread.range(unit*(LOW+HIGH))), shape.interior(shape.ibeam))
END GPL.


Historical Number

21267

Document Information

Modified date:
16 April 2020

UID

swg21476966