Kaiser-Meyer-Olkin measure for identity correlation matrix
I have run a factor analysis in IBM SPSS Statistics with the FACTOR command (Analyze > Dimension Reduction > Factor). I requested measures of sampling adequacy by checking the boxes for "KMO and Bartlett's test of sphericity" and "Anti-image" in the Descriptives dialog of the Factor procedure. (They are also available by adding the keywords KMO and AIC, respectively, to the /PRINT subcommand of the FACTOR command.)
Given the formula for the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy in the Factor chapter in the SPSS Statistical Algorithms manual, it seems that KMO should be undefined when the correlation matrix is an identity matrix. All of the off-diagonal correlations and partial correlations should be 0 in this situation, so the KMO should be 0/(0+0) and therefore undefined. However, KMO is printed as .5 when the correlation matrix is an identity matrix. Is .5 inserted arbitrarily when KMO is undefined?
Resolving the problem
The KMO statistic is a measure of sampling adequacy, computed both overall and for each variable (Kaiser, 1970; Cerny & Kaiser, 1977; Dziuban & Shirkey, 1974). The overall KMO is printed in the "KMO and Bartlett's Test" table of the Factor output. The Measures of Sampling Adequacy (MSA) for individual variables are printed as the diagonal elements of the Anti-image Correlation matrix in the "Anti-image Matrices" table of the Factor output.
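For orientation, the definition usually given for these statistics in the cited literature can be written as below. The notation is chosen here for illustration and is not copied from the Statistical Algorithms manual: r_ij is the zero-order correlation between variables i and j, and a_ij is the corresponding anti-image correlation (the partial correlation with its sign reversed).

$$
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}},
\qquad
\mathrm{MSA}_{j} = \frac{\sum_{i \neq j} r_{ij}^{2}}{\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}} \quad \text{(sums over } i \text{ only, for fixed } j\text{)}
$$

Because only squared terms appear, both ratios lie between 0 and 1 whenever the denominator is nonzero, and both take the form 0/(0+0) when every off-diagonal correlation is 0.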
The KMO statistic is a summary of how small the partial correlations are, relative to the original (zero-order) correlations. The partial correlation for each pair of variables in the factor analysis is the correlation between those variables after partialling out the influence of all of the other variables in the factor analysis. (The off-diagonal elements of the Anti-image Correlation matrix are the partial correlations multiplied by -1.0.) If the variables share common factor(s), then the partial correlations should be small and the KMO should be close to 1.0. The KMO measure equals 0.5 when the correlation matrix equals the partial correlation matrix, because the sum of squared correlations then makes up exactly half of the denominator. The identity correlation matrix is a special case of this situation: as the correlation matrix approaches an identity matrix, the KMO value, as calculated by the Statistical Algorithms formula, approaches 0.5. The SPSS program code therefore sets KMO to 0.5 when the correlation matrix is an identity matrix, which avoids the division-by-0 problem. So the value is not arbitrary; it is the limiting value of the formula.
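The limiting behavior can be checked numerically outside SPSS. The following is a minimal Python sketch, not SPSS's own code: it assumes the usual definition of KMO as the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations, and it obtains the partial correlations from the inverse of the correlation matrix. The function names `kmo` and `equicorrelated` are hypothetical helpers introduced for this illustration, and the 0.5 fallback for the per-variable MSA is an assumption made here by analogy with the overall statistic.

```python
import numpy as np


def kmo(corr):
    """Return (overall KMO, per-variable MSA) for a nonsingular correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    # Partial correlation of i and j given all other variables.
    partial = -inv / np.outer(scale, scale)
    off_diag = ~np.eye(n, dtype=bool)
    r2 = np.where(off_diag, corr ** 2, 0.0)      # squared zero-order correlations
    p2 = np.where(off_diag, partial ** 2, 0.0)   # squared partial correlations
    total = r2.sum() + p2.sum()
    # Identity matrix gives 0/(0+0); return 0.5 as the text describes for SPSS.
    overall = r2.sum() / total if total > 0 else 0.5
    col_total = r2.sum(axis=0) + p2.sum(axis=0)
    msa = np.full(n, 0.5)  # assumption: same 0.5 fallback for each variable
    np.divide(r2.sum(axis=0), col_total, out=msa, where=col_total > 0)
    return overall, msa


def equicorrelated(n, rho):
    """n x n correlation matrix with every off-diagonal element equal to rho."""
    return np.full((n, n), rho) + (1.0 - rho) * np.eye(n)


# Shrink the common correlation toward 0 and watch the overall KMO.
for rho in (0.5, 0.1, 0.001):
    overall, _ = kmo(equicorrelated(3, rho))
    print(rho, round(overall, 4))

# Exact identity matrix: the fallback value is returned.
print(kmo(np.eye(3))[0])
```

For three equally correlated variables the ratio reduces to (1 + rho)^2 / ((1 + rho)^2 + 1), which tends to 0.5 as rho goes to 0, in line with the limit described above. The sketch does not handle singular correlation matrices, for which the inverse does not exist.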
KMO values greater than 0.8 can be considered good, that is, an indication that component or factor analysis will be useful for these variables. This usually occurs when most of the zero-order correlations are positive. KMO values less than 0.5 occur when most of the zero-order correlations are negative. Values below 0.5 call for remedial action, either by deleting the offending variables or by including other variables related to the offenders. One possible cause is a questionnaire in which some items were written so that high scores reflect the trait in question while other items were written so that low scores reflect the trait. Reverse-coding the negatively worded items may remedy the low KMO value in this situation.
Cerny, C.A., & Kaiser, H.F. (1977). A study of a measure of sampling adequacy for factor-analytic correlation matrices. Multivariate Behavioral Research, 12(1), 43-47.
Dziuban, C.D., & Shirkey, E.C. (1974). When is a correlation matrix appropriate for factor analysis? Psychological Bulletin, 81(6), 358-361.
Kaiser, H.F. (1970). A second generation Little Jiffy. Psychometrika, 35(4), 401-415.