Abstract
Three methods for selecting a few predictors from the many available are described and compared with respect to shrinkage in cross-validation. From 2 to 6 predictors were selected from the 15 available in 100 samples ranging in size from 25 to 200. An iterative method was found to select predictors with slightly, but consistently, higher cross-validities than the popularly used stepwise method. A gradient method was found to equal the performance of the stepwise method only in the larger samples and for the largest predictor subsets.
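As a rough illustration of what "shrinkage in cross-validation" means here, the sketch below selects predictors in one sample and then applies the same weights to a second sample. It is not the procedure used in the article: the data are synthetic, the selection rule is a plain forward pass (with none of the removal steps a full stepwise routine may include), and the names `fit_weights`, `multiple_r`, and `forward_select` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def fit_weights(X, y, cols):
    """Least-squares weights (intercept first) for the predictors in cols."""
    A = np.column_stack([np.ones(len(y)), X[:, cols]])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def multiple_r(X, y, cols, w):
    """Correlation between the criterion y and its weighted prediction."""
    y_hat = w[0] + X[:, cols] @ w[1:]
    return float(np.corrcoef(y_hat, y)[0, 1])

def forward_select(X, y, k):
    """Greedy forward pass: at each step add the predictor that most raises R."""
    chosen = []
    for _ in range(k):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]

        def r_with(j):
            trial = chosen + [j]
            return multiple_r(X, y, trial, fit_weights(X, y, trial))

        chosen.append(max(remaining, key=r_with))
    return chosen

# Synthetic data (an assumption for illustration): 15 predictors, of which
# only the first two carry signal; sample A for selection, sample B held out.
rng = np.random.default_rng(0)
n, p = 100, 15
X = rng.normal(size=(2 * n, p))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2 * n)
X_a, y_a, X_b, y_b = X[:n], y[:n], X[n:], y[n:]

cols = forward_select(X_a, y_a, 4)          # choose 4 of the 15 predictors
w = fit_weights(X_a, y_a, cols)             # weights estimated in sample A
r_sample = multiple_r(X_a, y_a, cols, w)    # multiple R in the selection sample
r_cross = multiple_r(X_b, y_b, cols, w)     # cross-validity in sample B
shrinkage = r_sample - r_cross
print(cols, round(r_sample, 3), round(r_cross, 3), round(shrinkage, 3))
```

The weights are deliberately not re-estimated in sample B. Holding them fixed is what makes `r_cross` a cross-validity rather than a second in-sample fit, and the difference `r_sample - r_cross` is the shrinkage that the comparison is about.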
Cite this article
McCornack, R.L. A comparison of three predictor selection techniques in multiple regression. Psychometrika 35, 257–271 (1970). https://doi.org/10.1007/BF02291267