## Technote (FAQ)

## Question

I would like to draw a scatterplot of two variables in my SPSS Statistics data set, which I can do from either the Chart Builder or Legacy Dialogs choices under the Graphs menu. However, I would also like to print a local regression (loess) fit line on the graph and I do not see a fit line option in the dialogs for either procedure. How can I print a local regression fit line on a scatterplot in SPSS Statistics?

## Answer

The fit line can be added in the Chart Editor after you build the scatterplot from either the Graphs->Chart Builder menu or the Graphs->Legacy Dialogs menu. After the scatterplot is drawn, right-click the graph in the Output Viewer and choose "Edit Content->In Separate Window" from the pop-up menu. In the Chart Editor window, click the Elements menu and choose "Fit Line at Total". A Properties dialog opens with a "Fit Line" tab. On that tab, you can choose the type of fit line and, if the type is Loess, the kernel. Click Apply and then Close to close the Properties dialog. Click the x in the upper right corner of the Chart Editor to return to the Output Viewer, where you should now see the fit line in your scatterplot.

If you have many such graphs to produce, you may wish to avoid opening the Chart Editor for each one. You can instead use SPSS command syntax to produce the scatterplot with the fit line included. The GGRAPH command and its GPL block, which underlie the Chart Builder dialogs, can draw the fit line with GPL statements that have no counterpart in the Chart Builder dialogs. The IGRAPH command, which is associated with the Interactive Graph procedure, can also produce the scatterplot with the fit line. From version 18 onward, the Interactive Graph procedure is available only through command syntax.

You may want to consult Technote 1485480, which outlines the overall steps for producing fit lines on scatterplots through command syntax. It discusses multiple graphs and Split File variables, but the commands there do not require Split File to work. The examples there produce linear fit lines, whereas the examples below fit local regression lines. The fit lines can be produced by either the GGRAPH/GPL commands or the IGRAPH command.

You can find the GPL Reference Guide, along with other SPSS documentation, at

http://www.ibm.com/support/entry/portal/Documentation/Software/SPSS/SPSS_Statistics

In the Documentation window that opens, go to the Product Documentation->SPSS Statistics area and click the "Documentation in PDF" link for your version of SPSS Statistics. In the page that opens, scroll down to the "Client version manuals" section and click the link for "GPL Reference Guide for IBM SPSS Statistics". The smoothing functions are described in the "GPL Functions" section, where functions are listed in alphabetical order. Scroll to the smooth.loess function. The kernel options, which determine how nearby cases are weighted when the line is fit, are listed there as well and are reproduced below.

Kernel Functions

uniform: All data receive equal weights.

epanechnikov: Data near the current point receive higher weights than extreme data receive. This function weights extreme points more than the triweight, biweight, and tricube kernels but less than the Gaussian and Cauchy kernels.

biweight: Data far from the current point receive more weight than the triweight kernel allows but less weight than the Epanechnikov kernel permits.

tricube: Data close to the current point receive higher weights than both the Epanechnikov and biweight kernels allow.

triweight: Data close to the current point receive higher weights than any other kernel allows. Extreme cases get very little weight.

gaussian: Weights follow a normal distribution, resulting in higher weighting of extreme cases than the Epanechnikov, biweight, tricube, and triweight kernels.

cauchy: Extreme values receive more weight than any other kernel allows, with the exception of the uniform kernel.

Here is an example of a loess fit line with a Gaussian kernel on a scatterplot of edlevel (x axis) and salnow (y axis). The key specification is the smooth.loess.gaussian function in the second ELEMENT statement.

GGRAPH

/GRAPHDATASET NAME="graphdataset" VARIABLES=edlevel salnow MISSING=LISTWISE REPORTMISSING=NO

/GRAPHSPEC SOURCE=INLINE.

BEGIN GPL

SOURCE: s=userSource(id("graphdataset"))

DATA: edlevel=col(source(s), name("edlevel"))

DATA: salnow=col(source(s), name("salnow"))

GUIDE: axis(dim(1), label("Educational level"))

GUIDE: axis(dim(2), label("Current salary"))

ELEMENT: point(position(edlevel*salnow))

ELEMENT: line(position(smooth.loess.gaussian(edlevel*salnow)))

END GPL.
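Because the kernel is chosen through the suffix of the smooth.loess function, one way to see what a kernel does is to overlay two loess lines that differ only in that suffix. The following is a minimal sketch along those lines, reusing the variables from the example above. The choice of the uniform and triweight kernels and the red and blue line colors are illustrative assumptions, not part of the original example.

```spss
* Sketch: overlay two loess fits that differ only in the kernel.
GGRAPH
  /GRAPHDATASET NAME="graphdataset" VARIABLES=edlevel salnow MISSING=LISTWISE REPORTMISSING=NO
  /GRAPHSPEC SOURCE=INLINE.
BEGIN GPL
  SOURCE: s=userSource(id("graphdataset"))
  DATA: edlevel=col(source(s), name("edlevel"))
  DATA: salnow=col(source(s), name("salnow"))
  GUIDE: axis(dim(1), label("Educational level"))
  GUIDE: axis(dim(2), label("Current salary"))
  COMMENT: Draw the points first, then one loess line per kernel.
  ELEMENT: point(position(edlevel*salnow))
  COMMENT: Uniform kernel (equal weights), drawn in red as an illustrative choice.
  ELEMENT: line(position(smooth.loess.uniform(edlevel*salnow)), color(color.red))
  COMMENT: Triweight kernel (weight concentrated near the current point), drawn in blue.
  ELEMENT: line(position(smooth.loess.triweight(edlevel*salnow)), color(color.blue))
END GPL.
```

If the two curves nearly coincide for your data, the kernel choice matters little there. Where they diverge, the kernel descriptions listed earlier indicate which line is responding more strongly to distant cases.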

As noted above and in Technote 1485480, the Interactive Graph procedure is available only through command syntax after Statistics 17. The IGRAPH command is described in detail in the "Command Syntax Reference", which is available from the Help menu in SPSS Statistics. Chapters for individual procedures are arranged alphabetically by command name. You can request a smoothed fit line in an IGRAPH command with the METHOD=LLR keywords on the /FITLINE subcommand, followed by the kernel specification.

Here is an example of an IGRAPH (Interactive Graph) command with an LLR fit line with an EPANECHNIKOV kernel, using the same variables as the GGRAPH/GPL command above.

IGRAPH

/VIEWNAME='Scatterplot'

/X1=VAR(edlevel) TYPE=SCALE

/Y=VAR(salnow) TYPE=SCALE

/COORDINATE=VERTICAL

/FITLINE METHOD=LLR EPANECHNIKOV BANDWIDTH=CONSTRAINED X1MULTIPLIER=2.00 LINE=total

/YLENGTH=5.2

/X1LENGTH=6.5

/CHARTLOOK='NONE'

/SCATTER COINCIDENT=NONE.
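To try a different kernel in IGRAPH, the keyword after METHOD=LLR can be swapped. The sketch below is a shortened variant of the command above that requests a normal (Gaussian) kernel. The NORMAL and BANDWIDTH=FAST keywords and the multiplier of 1.00 are assumptions based on the /FITLINE description in the "Command Syntax Reference", so check them against the reference for your version before relying on them. The layout subcommands are omitted on the assumption that their defaults are acceptable.

```spss
* Sketch: same scatterplot as above, with a normal kernel for the LLR fit line.
* NORMAL, BANDWIDTH=FAST, and X1MULTIPLIER=1.00 are illustrative choices.
IGRAPH
  /VIEWNAME='Scatterplot'
  /X1=VAR(edlevel) TYPE=SCALE
  /Y=VAR(salnow) TYPE=SCALE
  /COORDINATE=VERTICAL
  /FITLINE METHOD=LLR NORMAL BANDWIDTH=FAST X1MULTIPLIER=1.00 LINE=TOTAL
  /SCATTER COINCIDENT=NONE.
```

Running this command and the EPANECHNIKOV example on the same data gives a direct comparison of the two kernels.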