Abstract
This paper discusses model selection for finite-dimensional normal regression models. We compare model selection criteria according to two kinds of prediction error: prediction with refitting and prediction without refitting. We provide a new lower bound for prediction without refitting; a lower bound for prediction with refitting was given by Rissanen. Moreover, we specify a set of sufficient conditions for a model selection criterion to achieve these bounds. We then address the achievability of the two bounds by the following selection rules: Rissanen's accumulated prediction error criterion (APE), his stochastic complexity criterion, AIC, BIC and the FPE criteria. In particular, we provide upper bounds on the overfitting and underfitting probabilities needed for achievability. Finally, we offer a brief discussion of finite-dimensional vs. infinite-dimensional model assumptions.
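As a rough illustration of how three of the selection rules named above differ in their penalties, the following minimal sketch computes AIC, BIC and FPE from residual sums of squares of nested regression models. The formulas are the common textbook forms for Gaussian errors (additive constants dropped) and are an assumption here, not necessarily the exact definitions used in the paper; the function names, the sample size and the RSS values are hypothetical.

```python
import math


def aic(rss, n, k):
    """AIC for a Gaussian linear model with k regressors: n*log(RSS/n) + 2k."""
    return n * math.log(rss / n) + 2 * k


def bic(rss, n, k):
    """BIC: same fit term as AIC, but the penalty per parameter is log(n)."""
    return n * math.log(rss / n) + k * math.log(n)


def fpe(rss, n, k):
    """Akaike's final prediction error: (RSS/n) * (n + k) / (n - k)."""
    return (rss / n) * (n + k) / (n - k)


def select(criterion, rss_by_k, n):
    """Return the model dimension k that minimizes the given criterion."""
    return min(rss_by_k, key=lambda k: criterion(rss_by_k[k], n, k))


# Hypothetical data: residual sums of squares for nested models of
# dimension 1, 2 and 3, fitted to N_OBS observations.
N_OBS = 100
RSS_BY_K = {1: 250.0, 2: 120.0, 3: 115.0}

if __name__ == "__main__":
    for name, crit in (("AIC", aic), ("BIC", bic), ("FPE", fpe)):
        print(name, "selects k =", select(crit, RSS_BY_K, N_OBS))
```

With these illustrative numbers, AIC and FPE select the three-parameter model while BIC, with its heavier log(n) penalty, selects the two-parameter model. This is the kind of overfitting versus underfitting trade-off that the probability bounds mentioned in the abstract are concerned with.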
References
Akaike, H. (1970). Statistical predictor identification, Ann. Inst. Statist. Math., 22, 202–217.
Akaike, H. (1974). A new look at the statistical model identification, IEEE Trans. Automat. Control, 19, 716–723.
Atkinson, A. C. (1980). A note on the generalized information criterion for choice of a model, Biometrika, 67, 413–418.
Bhansali, R. J. and Downham, D. Y. (1977). Some properties of the order of an autoregressive model selected by a generalization of Akaike's FPE criterion, Biometrika, 64, 547–551.
Box, G. E. P. and Draper, N. R. (1959). A basis for the selection of a regression surface design, J. Amer. Statist. Assoc., 54, 622–654.
Box, G. E. P. and Draper, N. R. (1963). The choice of a second order rotatable design, Biometrika, 50, 335–352.
Breiman, L. A. and Freedman, D. F. (1983). How many variables should be entered in a regression equation?, J. Amer. Statist. Assoc., 78, 131–136.
Clayton, M. K., Geisser, S. and Jennings, D. (1986). A comparison of several model selection procedures, Bayesian Inference and Decision (eds. P. Goel and A. Zellner), 425–439, Elsevier, New York.
Dawid, A. P. (1984). Present position and potential developments: some personal views, Statistical theory: The prequential approach (with discussion), J. Roy. Statist. Soc. Ser. A, 147, 278–292.
Dawid, A. P. (1992). Prequential data analysis, Current Issues in Statistical Inference: Essays in Honor of D. Basu, Institute of Mathematical Statistics, Monograph, 17 (eds. M. Ghosh and P. K. Pathak).
Geweke, J. and Meese, R. (1981). Estimating regression models of finite but unknown order, Internat. Econom. Rev., 22, 55–70.
Hannan, E. J. and Quinn, B. G. (1979). The determination of the order of an autoregression, J. Roy. Statist. Soc. Ser. B, 41, 190–195.
Hannan, E. J., McDougall, A. J. and Poskitt, D. S. (1989). Recursive estimation of autoregressions, J. Roy. Statist. Soc. Ser. B, 51, 217–233.
Haughton, D. (1989). Size of the error in the choice of a model to fit data from an exponential family, Sankhyā Ser. A, 51, 45–58.
Hemerly, E. M. and Davis, M. H. A. (1989). Strong consistency of the predictive least squares criterion for order determination of autoregressive processes, Ann. Statist., 17, 941–946.
Hjorth, U. (1982). Model selection and forward validation, Scand. J. Statist., 9, 95–105.
Kohn, R. (1983). Consistent estimation of minimal dimension, Econometrica, 51, 367–376.
Lai, T., Robbins, H. and Wei, C. Z. (1979). Strong consistency of least squares estimates in multiple regression II, J. Multivariate Anal., 9, 343–361.
Merhav, N., Gutman, M. and Ziv, J. (1989). On the estimation of the order of a Markov chain and universal data compression, IEEE Trans. Inform. Theory, 35, 1014–1019.
Nishi, R. (1984). Asymptotic properties of criteria for selection of variables in multiple regression, Ann. Statist., 12, 758–765.
Rao, C. R. (1973). Linear Statistical Inference, 2nd ed., Wiley, New York.
Rissanen, J. (1984). Universal coding, information prediction, and estimation, IEEE Trans. Inform. Theory, 30, 629–636.
Rissanen, J. (1986a). Stochastic complexity and modeling, Ann. Statist., 14, 1080–1100.
Rissanen, J. (1986b). A predictive least squares principle, IMA J. Math. Control Inform., 3, 211–222.
Rissanen, J. (1989). Stochastic Complexity in Statistical Inquiry, World Scientific, Singapore.
Sawa, T. (1978). Information criteria for discriminating among alternative regression models, Econometrica, 46, 1273–1291.
Schwarz, G. (1978). Estimating the dimension of a model, Ann. Statist., 6, 461–464.
Shibata, R. (1976). Selection of the order of an autoregressive model by Akaike's information criterion, Biometrika, 63, 117–126.
Shibata, R. (1981). An optimal selection of regression variables, Biometrika, 68, 45–54.
Shibata, R. (1983a). Asymptotic mean efficiency of a selection of regression variables, Ann. Inst. Statist. Math., 35, 415–423.
Shibata, R. (1983b). A theoretical view of the use of AIC, Time Series Analysis: Theory and Practice 4 (ed. O. D. Anderson), 237–244, Elsevier, Amsterdam.
Shibata, R. (1984). Approximate efficiency of a selection procedure for the number of regression variables, Biometrika, 71, 43–49.
Shibata, R. (1986a). Selection of the number of regression variables; a minimax choice of generalized FPE, Ann. Inst. Statist. Math., 38, 459–474.
Shibata, R. (1986b). Consistency of model selection and parameter estimation, Essays in Time Series and Allied Processes: Papers in Honour of E. J. Hannan, J. Appl. Probab., 23A, 127–141.
Stone, M. (1977). An asymptotic equivalence of choice of model by cross-validation and Akaike's criterion, J. Roy. Statist. Soc. Ser. B, 39, 44–47.
Wax, M. (1988). Order selection for AR models by predictive least squares, IEEE Trans. Acoust. Speech Signal Process., 36, 581–588.
Wei, C. Z. (1992). On the predictive least squares principle, Ann. Statist., 20, 1–42.
Woodroofe, M. (1982). On model selection and the arc sine laws, Ann. Statist., 10, 1182–1194.
Additional information
Support from the National Science Foundation, grant DMS 8802378 and support from ARO, grant DAAL03-91-G-007 to B. Yu during the revision are gratefully acknowledged.
About this article
Cite this article
Speed, T.P., Yu, B. Model selection and prediction: Normal regression. Ann Inst Stat Math 45, 35–54 (1993). https://doi.org/10.1007/BF00773667
DOI: https://doi.org/10.1007/BF00773667