A comparison of three predictor selection techniques in multiple regression

Psychometrika

Abstract

Three methods for selecting a few predictors from the many available are described and compared with respect to shrinkage in cross-validation. From 2 to 6 predictors were selected from the 15 available in 100 samples ranging in size from 25 to 200. An iterative method was found to select predictors with slightly, but consistently, higher cross-validities than the popularly used stepwise method. A gradient method was found to equal the performance of the stepwise method only in the larger samples and for the largest predictor subsets.
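The kind of comparison described above can be illustrated with a minimal sketch. It is not the article's procedure: the abstract does not specify the stepwise, iterative, or gradient algorithms, so the code uses a plain forward selection (no removal step) on synthetic data, and all function names, the data-generating weights, and the sample size of 50 are assumptions made for illustration. It picks 4 of 15 predictors in one sample, then applies the same weights to a fresh sample and reports the drop in correlation, which is the shrinkage in cross-validation.

```python
import random

def pearson(a, b):
    """Product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit(X, y, cols):
    """Least-squares weights (intercept first) via the normal equations."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(X, cols, w):
    """Apply intercept w[0] and weights w[1:] to the chosen columns."""
    return [w[0] + sum(wi * row[c] for wi, c in zip(w[1:], cols)) for row in X]

def multiple_r(X, y, cols):
    """In-sample multiple correlation for the predictor subset cols."""
    return pearson(predict(X, cols, fit(X, y, cols)), y)

def forward_select(X, y, k):
    """Greedy forward selection: add the predictor that most raises R."""
    chosen = []
    while len(chosen) < k:
        candidates = [c for c in range(len(X[0])) if c not in chosen]
        best = max(candidates, key=lambda c: multiple_r(X, y, chosen + [c]))
        chosen.append(best)
    return chosen

def make_sample(n, rng, n_pred=15):
    """Synthetic data (an assumption): only the first 3 predictors matter."""
    X = [[rng.gauss(0, 1) for _ in range(n_pred)] for _ in range(n)]
    y = [0.6 * r[0] + 0.4 * r[1] + 0.3 * r[2] + rng.gauss(0, 1) for r in X]
    return X, y

rng = random.Random(0)
X_fit, y_fit = make_sample(50, rng)   # sample used to select and weight
X_new, y_new = make_sample(50, rng)   # independent cross-validation sample

cols = forward_select(X_fit, y_fit, 4)
w = fit(X_fit, y_fit, cols)
r_fit = pearson(predict(X_fit, cols, w), y_fit)     # multiple R in the fitting sample
r_cross = pearson(predict(X_new, cols, w), y_new)   # cross-validity
shrinkage = r_fit - r_cross
print(cols, round(r_fit, 3), round(r_cross, 3), round(shrinkage, 3))
```

Repeating this over many samples and several subset sizes, and swapping `forward_select` for another selection rule, would give the sort of cross-validity comparison the abstract summarizes.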

McCornack, R.L. A comparison of three predictor selection techniques in multiple regression. Psychometrika 35, 257–271 (1970). https://doi.org/10.1007/BF02291267

  • DOI: https://doi.org/10.1007/BF02291267
