Frequentist Model Averaging

Fletcher, David

doi:10.1007/978-3-662-58541-2_3


Part of the book series: SpringerBriefs in Statistics (BRIEFSSTATIST)


Abstract

We provide an overview of frequentist model averaging. For point estimation, we consider different methods for selecting the model weights, including those based on AIC, bagging, weighted AIC, stacking and focussed methods. For interval estimation, we consider Wald, MATA and percentile-bootstrap intervals. Use of the methods is illustrated by examples involving real data.


Notes

1.
This can come as a surprise; see [159] for a useful discussion of the assumptions underlying AIC.
2.
As discussed in Sect. 2.2, when counting the number of parameters in a model we include any scale parameters, such as the error variance in a normal linear model.
3.
See [24] for a discussion of the connection between the model-selection probabilities \(p \left( S = m \right) \) \((m=1,\dots ,M)\) and AIC weights.
4.
Throughout the rest of the chapter, it will be implicit that constrained optimisation is used whenever we determine the weights by minimising an objective function.
5.
For normal linear models, AIC(w) is equivalent to Mallows model averaging (MMA) [79, 121, 143, 190, 220, 223]. Although MMA was developed without the assumption of normal errors, for simplicity we use the more general name AIC(w) when referring to MMA.
6.
As with AIC(w), the choice of estimate of any scale parameter will not affect the weights.
7.
This name is potentially confusing as the original jackknife is somewhat different, involving the use of pseudo-values to reduce the bias of an estimate obtained from a single model [48, 154, 175].
8.
Other modifications to AIC(w) in this setting have been proposed [129, 220, 226].
9.
This assumption has also been used in interval estimation (Sect. 3.4.1).
10.
An alternative derivation avoids the notion of selecting a random sample from a population of models [24]. However, this involves regarding \(\theta \) as a weighted mean of least-false values of \(\theta \).
11.
It has been wrongly claimed that \(\widehat{\theta }\) is often assumed to be unbiased [48]. The only theory that involves this assumption (asymptotically) is the local misspecification framework (Sect. 3.2.3).
12.
Even if \(\widehat{b}_m\) is unbiased, \(\widehat{b}_m^{\,2}\) will be biased as an estimate of \(b_m^{2}\), but analytical bias-adjustment would involve estimation of the correlation between \(\widehat{\theta }_{m_1}\) and \(\widehat{\theta }_{m_2}\) \((m_1 \ne m_2)\), and any decrease in bias might be offset by an increase in variance.
13.
There is also a logical problem associated with use of (3.18) [24].
14.
Both [21] and [24] wanted to avoid assuming that the true model is in the model set.
15.
This estimate is not simply the standard deviation of the \(\widehat{\theta }_{\left( b \right) }\) in (3.8) [21].
16.
It has been wrongly claimed that use of this interval involves assuming that the largest model is not in the model set [48].
17.
Unfortunately, the work of [106] has led to the impression that the MATA interval will not perform well in general [48].
18.
A similar issue arises when using a technique such as RJMCMC in the Bayesian setting (Sect. 2.2.1), where a large number of iterations may be required in order to visit each model often enough to obtain reliable estimates of both the posterior model probabilities and the posteriors for those parameters in models with low posterior model probabilities.
19.
This example also provides evidence that the Wald interval can perform well, despite the issues raised in Sect. 3.4.1.
20.
Unless we use DIC weights, which can depend on the parametrisation (Sect. 2.5).
21.
Conversely, we could adjust the nominal confidence level for each interval until they all have the same width, and then choose the one with the highest true coverage rate [150].
22.
This procedure is similar to use of all possible singleton models in the context of focussed model averaging (Sect. 3.6.2) [32]. In order for \(\widehat{\theta }\) to be consistent, however, [32] require the weights to sum to one, as each \(\widehat{\theta }_m\) is consistent [88].
23.
This constraint can also be useful for generalisation of the conclusions [15].
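
Notes 2 and 3 refer to parameter counts and AIC weights. As a minimal sketch, assuming the usual definitions AIC = -2 log L + 2p and weights proportional to exp(-AIC/2), the following Python code computes AIC weights and the corresponding model-averaged point estimate. The function names are hypothetical, and the log-likelihoods, parameter counts and estimates are made-up illustrative values, not data from the chapter.

```python
import math


def aic(log_lik, n_params):
    """AIC for one model; n_params should count any scale parameter
    (e.g. the error variance in a normal linear model), as in Note 2."""
    return -2.0 * log_lik + 2.0 * n_params


def aic_weights(aic_values):
    """Weights proportional to exp(-AIC/2), normalised to sum to one.

    Subtracting the smallest AIC first leaves the weights unchanged
    but avoids numerical underflow in the exponential.
    """
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


def model_average(estimates, weights):
    """Weighted mean of the single-model estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Illustrative (made-up) inputs for M = 3 candidate models.
log_liks = [-52.1, -50.3, -50.0]   # maximised log-likelihoods
n_params = [2, 3, 4]               # parameter counts, scale included
estimates = [1.80, 2.10, 2.15]     # single-model estimates of theta

aics = [aic(l, p) for l, p in zip(log_liks, n_params)]
weights = aic_weights(aics)
theta_hat = model_average(estimates, weights)

print("AIC:", aics)
print("weights:", weights)
print("model-averaged estimate:", theta_hat)
```

Because the weights are non-negative and sum to one, the model-averaged estimate always lies between the smallest and largest single-model estimates. Weights chosen by minimising an objective function (Note 4) would replace `aic_weights` with a constrained optimisation step.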

References

Aiolfi, M., Capistran, C., Timmermann, A.: Forecast combinations. In: Clements, M.P., Hendry, D.F. (eds.) Oxford Handbook of Economic Forecasting. Oxford University Press (2010)
Akaike, H.: Information theory as an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Second International Symposium on Information Theory, pp. 267–281. Akademiai Kiado, Budapest (1973)
Akaike, H.: A new look at the statistical model identification. IEEE Trans. Autom. Control. 19, 716–723 (1974)
Akaike, H.: A Bayesian analysis of the minimum AIC procedure. Ann. I. Stat. Math. 30, 9–14 (1978)
Akaike, H.: A Bayesian extension of the minimum AIC procedure of autoregressive model fitting. Biometrika 66, 237–242 (1979)
Aksu, C., Gunter, S.I.: An empirical analysis of the accuracy of SA, OLS, ERLS and NRLS combination forecasts. Int. J. Forecast. 8, 27–43 (1992)
Amemiya, T.: Selection of regressors. Int. Econ. Rev. 21, 331–354 (1980)
Amini, S.M., Parmeter, C.F.: Comparisons of model averaging techniques: assessing growth determinants. J. Appl. Econ. 27, 870–876 (2012)
Anderson, D.R., Burnham, K.P., White, G.C.: AIC model selection in overdispersed capture-recapture data. Ecology 75, 1780–1793 (1994)
Ando, T., Li, K.-C.: A model-averaging approach for high-dimensional regression. J. Am. Stat. Assoc. 109, 254–265 (2014)
Ando, T., Li, K.-C.: A weight-relaxed model averaging approach for high-dimensional generalized linear models. Ann. Stat. 45, 2654–2679 (2017)
Arlot, S., Celisse, A.: A survey of cross-validation procedures for model selection. Stat. Surv. 4, 40–79 (2010)
Augustin, N., Sauerbrei, W., Schumacher, M.: The practical utility of incorporating model selection uncertainty into prognostic models for survival data. Stat. Model. 5, 95–118 (2005)
Bozdogan, H.: Akaike’s information criterion and recent developments in information complexity. J. Math. Psychol. 44, 62–91 (2000)
Breiman, L.: Stacked regressions. Mach. Learn. 24, 49–64 (1996)
Breiman, L.: Bagging predictors. Mach. Learn. 24, 123–140 (1996)
Breiman, L.: Heuristics of instability and stabilization in model selection. Ann. Stat. 24, 2350–2383 (1996)
Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)
Brewer, M.J., Butler, A., Cooksley, S.L.: The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity. Methods Ecol. Evol. 7, 679–692 (2016)
Buchholz, A., Holländer, N., Sauerbrei, W.: On properties of predictors derived with a two-step bootstrap model averaging approach: a simulation study in the linear regression model. Comput. Stat. Data Anal. 52, 2778–2793 (2008)
Buckland, S.T., Burnham, K.P., Augustin, N.H.: Model selection: an integral part of inference. Biometrics 53, 603–618 (1997)
Buckland, S.T., Burnham, K.P., Augustin, N.H.: Rejoinder to the Letter to the Editors from Wagenmakers, E.-J., Farrell, S., Ratcliff, R. Biometrics 60, 283 (2004)
Burnham, K.P., Anderson, D.R., White, G.C.: Evaluation of the Kullback-Leibler discrepancy for model selection in open population capture-recapture models. Biometrical. J. 36, 299–315 (1994)
Burnham, K.P., Anderson, D.R.: Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd edn. Springer (2002)
Burnham, K.P., Anderson, D.R.: Multimodel inference: understanding AIC and BIC in model selection. Sociol. Method. Res. 33, 261–304 (2004)
Cade, B.S.: Model averaging and muddled multimodel inferences. Ecology 96, 2370–2382 (2015)
Candolo, C., Davison, A.C., Demétrio, C.G.B.: A note on model uncertainty in linear regression. J. R. Stat. Soc. D-Stat. 52, 165–177 (2003)
Carney, M., Cunningham, P.: Calibrating probability density forecasts with multi-objective search. Technical Report TCD-CS-2006-07, Trinity College, Dublin (2006)
Cavanaugh, J.E.: Unifying the derivations for the Akaike and corrected Akaike information criteria. Stat. Probabil. Lett. 33, 201–208 (1997)
Cavanaugh, J.E., Shumway, R.H.: A bootstrap variant of AIC for state-space model selection. Stat. Sin. 7, 473–496 (1997)
Cavanaugh, J.E.: A large-sample model selection criterion based on Kullback’s symmetric divergence. Stat. Probab. Lett. 42, 333–343 (1999)
Charkhi, A., Claeskens, G., Hansen, B.E.: Minimum mean squared error model averaging in likelihood models. Stat. Sin. 26, 809–840 (2016)
Chen, X., Zou, G., Zhang, X.: Frequentist model averaging for linear mixed-effects models. Front. Math. China 8, 497–515 (2013)
Chung, H.-Y., Lee, K.-W., Koo, J.-Y.: A note on bootstrap model selection criterion. Stat. Probab. Lett. 26, 35–41 (1996)
Claeskens, G., Hjort, N.L.: The focused information criterion. J. Am. Stat. Assoc. 98, 900–916 (2003)
Claeskens, G., Croux, C., Kerckhoven, J.V.: Variable selection for logistic regression using a prediction-focused information criterion. Biometrics 62, 972–979 (2006)
Claeskens, G., Carroll, R.J.: An asymptotic theory for model selection inference in general semiparametric problems. Biometrika 94, 249–265 (2007)
Claeskens, G., Hjort, N.L.: Model Selection and Model Averaging. Cambridge University Press, Cambridge (2008)
Claeskens, G., Magnus, J.R., Vasnev, A.L., Wang, W.: The forecast combination puzzle: a simple theoretical explanation. J. Forecast. 32, 754–762 (2016)
Clyde, M.: Model uncertainty and health effect studies for particulate matter. Environmetrics 11, 745–763 (2000)
Craven, P., Wahba, G.: Smoothing noisy data with spline functions: estimating the correct degree of smoothing by the method of generalized cross-validation. Numer. Math. 31, 377–403 (1979)
Dardanoni, V., Modica, S., Peracchi, F.: Regression with imputed covariates: a generalized missing-indicator approach. J. Econ. 162, 362–368 (2011)
Dardanoni, V., de Luca, G., Modica, S., Peracchi, F.: Bayesian model averaging for generalized linear models with missing covariates. No. 1311. Einaudi Institute for Economics and Finance (2013)
Davison, A.C., Hinkley, D.V.: Bootstrap Methods and their Applications. Cambridge University Press, Cambridge (1997)
Debray, T.P.A., Koffijberg, H., Nieboer, D., Vergouwe, Y., Steyerberg, E.W., Moons, K.G.M.: Meta-analysis and aggregation of multiple published prediction models. Stat. Med. 33, 2341–2362 (2014)
De Luca, G., Magnus, J.R., Peracchi, F.: Weighted-average least squares estimation of generalized linear models. J. Econ. (2018). https://doi.org/10.1016/j.jeconom.2017.12.007
Donohue, M.C., Overholser, R., Xu, R., Vaida, F.: Conditional Akaike information under generalized linear and proportional hazards mixed models. Biometrika 98, 685–700 (2011)
Dormann, C.F., Calabrese, J.M., Guillera-Arroita, G., Matechou, E., Bahn, V., Bartoń, K., Beale, C.M., Ciuti, S., Elith, J., Gerstner, K., Guelat, J., Keil, P., Lahoz-Monfort, J.J., Pollock, L.J., Reineking, B., Roberts, D.R., Schröder, B., Thuiller, W., Warton, D.I., Wintle, B.A., Wood, S.N., Wüest, R.O., Hartig, F.: Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference. Ecol. Monogr. (2018). https://doi.org/10.1002/ecm.1309
Draper, D.: Model uncertainty yes, discrete model averaging maybe. Stat. Sci. 14, 405–409 (1999)
Drucker, H., Cortes, C., Jackel, L.D., LeCun, Y., Vapnik, V.: Boosting and other ensemble methods. Neural Comput. 6, 1289–1301 (1994)
Efron, B.: Estimating the error rate of a prediction rule: improvement on cross-validation. J. Am. Stat. Assoc. 78, 316–331 (1983)
Efron, B.: How biased is the apparent error rate of a prediction rule? J. Am. Stat. Assoc. 81, 461–470 (1986)
Efron, B., Tibshirani, R.: Improvements on cross-validation: the 632+ bootstrap method. J. Am. Stat. Assoc. 92, 548–560 (1997)
Efron, B.: The estimation of prediction error: covariance penalties and cross-validation. J. Am. Stat. Assoc. 99, 619–632 (2004)
Efron, B.: Estimation and accuracy after model selection. J. Am. Stat. Assoc. 109, 991–1007 (2014)
Efron, B., Hastie, T.: Computer Age Statistical Inference, vol. 5. Cambridge University Press (2016)
Ewald, K., Schneider, U.: Uniformly valid confidence sets based on the Lasso. Electron. J. Stat. 12, 1358–1387 (2018)
Fang, Y.: Asymptotic equivalence between cross-validations and Akaike information criteria in mixed-effects models. J. Data Sci. 9, 15–21 (2011)
Fletcher, D., Dillingham, P.W.: Model-averaged confidence intervals for factorial experiments. Comput. Stat. Data Anal. 55, 3041–3048 (2011)
Fletcher, D., Turek, D.: Model-averaged profile likelihood intervals. J. Agr. Biol. Environ. Stat. 17, 38–51 (2011)
Fletcher, D.: Estimating overdispersion when fitting a generalized linear model to sparse data. Biometrika 99, 230–237 (2011)
Foster, D.P., George, E.I.: The risk inflation criterion for multiple regression. Ann. Stat. 22, 1947–1975 (1994)
Freund, Y.: Boosting a weak learning algorithm by majority. Inform. Comput. 121, 256–285 (1995)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of online learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139 (1997)
Fu, P., Pan, J.: A review on high-dimensional frequentist model averaging. Open J. Stat. 8, 513–518 (2018)
Gal, Y., Ghahramani, Z.: Dropout as a Bayesian approximation: representing model uncertainty in deep learning. In: International Conference on Machine Learning, pp. 1050–1059 (2016)
Galipaud, M., Gillingham, M.A.F., David, M., Dechaume-Moncharmont, F.-X.: Ecologists overestimate the importance of predictor variables in model averaging: a plea for cautious interpretations. Methods Ecol. Evol. 5, 983–991 (2014)
Galipaud, M., Gillingham, M.A.F., Dechaume-Moncharmont, F.-X.: A farewell to the sum of Akaike weights: the benefits of alternative metrics for variable importance estimations in model selection. Methods Ecol. Evol. 8, 1668–1678 (2017)
Gao, Y., Zhang, X., Wang, S., Zou, G.: Model averaging based on leave-subject-out cross-validation. J. Econ. 192, 139–151 (2016)
Gao, Y., Zhang, X., Wang, S., Chong, T.T., Zou, G.: Frequentist model averaging for threshold models. Ann. I. Stat. Math. (2018). https://doi.org/10.1007/s10463-017-0642-9
Genre, V., Kenny, G., Meyler, A., Timmermann, A.: Combining expert forecasts: can anything beat the simple average? Int. J. Forecast. 29, 108–121 (2013)
George, E., Foster, D.P.: Calibration and empirical Bayes variable selection. Biometrika 87, 731–747 (2000)
Geweke, J., Amisano, G.: Optimal prediction pools. J. Econ. 164, 130–141 (2011)
Giam, X., Olden, J.D.: Quantifying variable importance in a multimodel inference framework. Methods Ecol. Evol. 7, 388–397 (2016)
Graefe, A., Küchenhoff, H., Stierle, V., Riedl, B.: Limitations of ensemble Bayesian model averaging for forecasting social science problems. Int. J. Forecast. 31, 943–951 (2015)
Greven, S., Kneib, T.: On the behaviour of marginal and conditional AIC in linear mixed models. Biometrika 97, 773–789 (2010)
Hall, S.G., Mitchell, J.: Combining density forecasts. Int. J. Forecast. 23, 1–13 (2007)
Hansen, M.H., Yu, B.: Model selection and the principle of minimum description length. J. Am. Stat. Assoc. 96, 746–774 (2001)
Hansen, B.E.: Least squares model averaging. Econometrica 75, 1175–1189 (2007)
Hansen, B.E.: Least-squares forecast averaging. J. Econ. 146, 342–350 (2008)
Hansen, B.E.: Averaging estimators for regressions with a possible structural break. Economet. Theor. 25, 1498–1514 (2009)
Hansen, P.R., Lunde, A., Nason, J.M.: The model confidence set. Econometrica 79, 453–497 (2011)
Hansen, B.E., Racine, J.S.: Jackknife model averaging. J. Econ. 167, 38–46 (2012)
Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning, vol. 1. Springer, New York (2001)
Hauenstein, S., Wood, S.N., Dormann, C.F.: Computing AIC for black-box models using generalized degrees of freedom: a comparison with cross-validation. Commun. Stat.-Simul. C. 47, 1382–1396 (2018)
Henderson, D.J., Parmeter, C.F.: Model averaging over nonparametric estimators. In: Essays in Honor of Aman Ullah. Advances in Econometrics, vol. 36, pp. 539–560. Emerald Group Publishing Limited, UK (2016)
Hinde, J., Demétrio, C.G.B.: Overdispersion: models and estimation. Comput. Stat. Data Anal. 27, 151–170 (1998)
Hjort, N.L., Claeskens, G.: Frequentist model average estimators. J. Am. Stat. Assoc. 98, 879–945 (2003)
Hjort, N.L., Claeskens, G.: Rejoinder to the Discussion of Hjort, N.L., Claeskens, G.: Frequentist model average estimators. J. Am. Stat. Assoc. 98, 938–945 (2003)
Hjort, N.L., Claeskens, G.: Focused information criteria and model averaging for the Cox hazard regression model. J. Am. Stat. Assoc. 101, 1449–1464 (2006)
Hobbs, N.T., Hilborn, R.: Alternatives to statistical hypothesis testing in ecology: a guide to self teaching. Ecol. Appl. 16, 5–19 (2006)
Holbrook, A., Gillen, D.: Estimating prediction error for complex samples (2017). arXiv preprint: arXiv:1711.04877
Hong, C.Y.: Focussed model averaging in generalised linear models. Thesis, Doctor of Philosophy, University of Otago (2018)
Hoogerheide, L., Kleijn, R., Ravazzolo, F., Van Dijk, H.K., Verbeek, M.: Forecast accuracy and economic gains from Bayesian model averaging using time-varying weights. J. Forecast. 29, 251–269 (2010)
Hurvich, C.M., Tsai, C.-L.: Regression and time series model selection in small samples. Biometrika 76, 297–307 (1989)
Hurvich, C.M., Tsai, C.-L.: Model selection for extended quasi-likelihood models in small samples. Biometrics 51, 1077–1084 (1995)
Ishiguro, M., Sakamoto, Y.: WIC: An Estimation-free Information Criterion. Research Memorandum, Institute of Statistical Mathematics, Tokyo (1991)
Ishiguro, M., Sakamoto, Y., Kitagawa, G.: Bootstrapping log likelihood and EIC, an extension of AIC. Ann. Inst. Stat. Math. 49, 411–434 (1997)
Jacobs, R.A.: Methods for combining experts’ probability assessments. Neural Comput. 7, 867–888 (1995)
Jensen, S.M., Ritz, C.: Simultaneous inference for model averaging of derived parameters. Risk Anal. 35, 68–76 (2015)
Jiang, J., Rao, J.S., Gu, Z., Nguyen, T.: Fence methods for mixed model selection. Ann. Stat. 36, 1669–1692 (2008)
Jin, S., Ankargren, S.: Frequentist model averaging in structural equation modelling. Psychometrika (2018). https://doi.org/10.1007/s11336-018-9624-y
Johnson, W.O.: Discussion of Hjort, N.L., Claeskens, G.: Frequentist model average estimators. J. Am. Stat. Assoc. 98, 919–921 (2003)
Jullum, M., Hjort, N.L.: Parametric or nonparametric: the FIC approach. Stat. Sin. 27, 951–981 (2017)
Kabaila, P., Leeb, H.: On the large-sample minimal coverage probability of confidence intervals after model selection. J. Am. Stat. Assoc. 101, 619–629 (2006)
Kabaila, P., Welsh, A.H., Abeysekera, W.: Model-averaged confidence intervals. Scand. J. Stat. 43, 35–48 (2016)
Kabaila, P., Welsh, A.H., Mainzer, R.: The performance of model averaged tail area confidence intervals. Commun. Stat.-Theor. M. 46, 10718–10732 (2016)
Kabaila, P., Wijethunga, C.: Confidence intervals centered on bootstrap smoothed estimators (2016). arXiv preprint: arXiv:1610.09802
Kabaila, P.: On the minimum coverage probability of model averaged tail area confidence intervals. Can. J. Stat. 46, 279–297 (2018)
Kapetanios, G., Mitchell, J., Price, S., Fawcett, N.: Generalised density forecast combinations. J. Econ. 188, 150–165 (2015)
LeBlanc, M., Tibshirani, R.: Combining estimates in regression and classification. J. Am. Stat. Assoc. 91, 1641–1650 (1996)
Lee, H., Jogesh Babu, G., Rao, C.R.: A jackknife type approach to statistical model selection. J. Stat. Plan. Inference 142, 301–311 (2012)
Leeb, H., Pötscher, B.M.: Model selection and inference: facts and fiction. Econ. Theory 21, 21–59 (2005)
Leeb, H., Pötscher, B.M.: Can one estimate the conditional distribution of post-model-selection estimators? Ann. Stat. 34, 2554–2591 (2006)
Lemke, C., Budka, M., Gabrys, B.: Metalearning: a survey of trends and technologies. Artif. Intell. Rev. 44, 117–130 (2015)
Lenkoski, A., Eicher, T.S., Raftery, A.E.: Two-stage Bayesian model averaging in endogenous variable models. Econ. Rev. 33, 122–151 (2014)
Leung, G., Barron, A.R.: Information theory and mixing least-squares regressions. IEEE Trans. Inf. Theory 52, 3396–3410 (2006)
Li, C., Li, Q., Racine, J.S., Zhang, D.: Optimal model averaging of varying-coefficient models. Stat. Sin. (2018). https://doi.org/10.5705/ss.202017.0034
Li, J., Xia, X., Wong, W.K., Nott, D.: Varying-coefficient semiparametric model averaging prediction. Biometrics (2018). https://doi.org/10.1111/biom.12904
Liang, H., Wu, H., Zou, G.: A note on conditional AIC for linear mixed-effects models. Biometrika 95, 773–778 (2008)
Liang, H., Zou, G., Wan, A.T.K., Zhang, X.: Optimal weight choice for frequentist model average estimators. J. Am. Stat. Assoc. 106, 1053–1066 (2011)
Lieb, L., Smeekes, S.: Inference for impulse responses under model uncertainty (2017). arXiv preprint: arXiv:1709.09583
Lin, B., Wang, Q., Zhang, J., Pang, Z.: Stable prediction in high-dimensional linear models. Stat. Comput. 27, 1401–1412 (2017)
Link, W., Barker, R.: Model weights and the foundations of multimodel inference. Ecology 87, 2626–2635 (2006)
Liu, Q., Okui, R.: Heteroscedasticity-robust \(C_p\) model averaging. Econ. J. 16, 463–472 (2013)
Liu, S., Yang, Y.: Combining models in longitudinal data analysis. Ann. Inst. Stat. Math. 64, 233–254 (2012)
Liu, S., Yang, Y.: Mixing partially linear regression models. Sankhyā 75, 74–95 (2013)
Liu, C.A.: Distribution theory of the least squares averaging estimator. J. Econ. 186, 142–159 (2015)
Liu, Q., Okui, R., Yoshimura, A.: Generalized least squares model averaging. Econ. Rev. 35, 1692–1752 (2016)
Longford, N.T.: An alternative to model selection in ordinary regression. Stat. Comput. 13, 67–80 (2003)
Longford, N.T.: An alternative analysis of variance. SORT Stat. Oper. Res. T. 32, 77–92 (2008)
Lu, X., Su, L.: Jackknife model averaging for quantile regressions. J. Econ. 188, 40–58 (2015)
Lumley, T., Scott, A.: AIC and BIC for modeling with complex survey data. J. Surv. Stat. Methodol. 3, 1–18 (2015)
Lv, J., Liu, J.S.: Model selection principles in misspecified models. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 76, 141–167 (2014)
Magnus, J.R., Wan, A.T.K., Zhang, X.: Weighted average least squares estimation with nonspherical disturbances and an application to the Hong Kong housing market. Comput. Stat. Data Anal. 55, 1331–1341 (2011)
Magnus, J.R., De Luca, G.: Weighted-average least squares (WALS): a survey. J. Econ. Surv. 30, 117–148 (2016)
Mallows, C.L.: Some comments on Cp. Technometrics 42, 87–94 (2000)
Martins, L.F., Gabriel, V.J.: Linear instrumental variables model averaging estimation. Comput. Stat. Data Anal. 71, 709–724 (2014)
McQuarrie, A., Shumway, R., Tsai, C.-L.: The model selection criterion AICu. Stat. Probabil. Lett. 34, 285–292 (1997)
McQuarrie, A.D.R., Tsai, C.-L.: Regression and Time Series Model Selection. World Scientific, Singapore (1998)
Mead, R.: The Design of Experiments: Statistical Principles for Practical Applications. Cambridge University Press, Cambridge (1988)
Mitra, P., Lian, H., Mitra, R., Liang, H., Xie, M.: A general framework for frequentist model averaging (2018). arXiv preprint: arXiv:1802.03511
Moody, J.E.: The effective number of parameters: an analysis of generalization and regularization in nonlinear learning systems. In: Moody, J.E., Hanson, S.J., Lippmann, R.P. (eds.) Advances in Neural Information Processing Systems, vol. 4, pp. 847–854. Morgan Kaufmann, San Mateo, California (1992)
Müller, S., Scealy, J.L., Welsh, A.H.: Model selection in linear mixed models. Stat. Sci. 28, 135–167 (2013)
Murata, N., Yoshizawa, S., Amari, S.: Network information criterion-determining the number of hidden units for artificial neural network models. IEEE Trans. Neural Netw. 5, 865–872 (1994)
Murray, K., Conner, M.M.: Methods to quantify variable importance: implications for the analysis of noisy ecological data. Ecology 90, 348–355 (2009)
Naftaly, U., Intrator, N., Horn, D.: Optimal ensemble averaging of neural networks. Network-Comp. Neural 8, 283–296 (1997)
Nakagawa, S., Freckleton, R.P.: Model averaging, missing data and multiple imputation: a case study for behavioural ecology. Behav. Ecol. Sociobiol. 65, 103–116 (2011)
Neath, A.A., Cavanaugh, J.E., Weyhaupt, A.G.: Model evaluation, discrepancy function estimation, and social choice theory. Comput. Stat. 29, 1–19 (2014)
Owen, A.B.: Small sample central confidence intervals for the mean. Technical Report 302, Department of Statistics, Stanford University (1988)
Polley, E.C., van der Laan, M.J.: Super learner in prediction. UC Berkeley Division of Biostatistics Working Paper Series. Working Paper 266 (2010). http://biostats.bepress.com/ucbbiostat/paper266
Poeter, E.P., Hill, M.C.: MMA, a computer code for multi-model analysis. U.S. Geological Survey Techniques and Methods TM6-E3. Reston, Virginia (2007)
Pötscher, B.M.: The distribution of model averaging estimators and an impossibility result regarding its estimation. Inst. Math. S. 52, 113–129 (2006)
Quenouille, M.H.: Notes on bias in estimation. Biometrika 43, 353–360 (1956)
R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2017). https://www.R-project.org/
Raftery, A.E., Zheng, Y.: Discussion of Hjort, N.L., Claeskens, G.: Frequentist model average estimators. J. Am. Stat. Assoc. 98, 931–938 (2003)
Rao, J.S., Tibshirani, R.: The out-of-bootstrap method for model averaging and selection. University of Toronto (1997)
Richards, S.A.: Testing ecological theory using the information-theoretic approach: examples and cautionary results. Ecology 86, 2805–2814 (2005)
Ripley, B.D.: Selecting amongst large classes of models. In: Adams, N., Crowder, M., Hand, D.J., Stephens, D. (eds.) Methods and Models in Statistics: in Honor of Professor John Nelder, FRS, pp. 155–170. Imperial College Press, London (2004)
Rubin, D.B.: Inference and missing data. Biometrika 63, 581–592 (1976)
Saefken, B., Kneib, T., van Waveren, C.-S., Greven, S.: A unifying approach to the estimation of the conditional Akaike information in generalized linear mixed models. Electron. J. Stat. 8, 201–225 (2014)
Sapp, S., van der Laan, M.J., Canny, J.: Subsemble: an ensemble method for combining subset-specific algorithm fits. J. Appl. Stat. 41, 1247–1259 (2014)
Schapire, R.E.: The strength of weak learnability. Mach. Learn. 5, 197–227 (1990)
Schomaker, M.: Shrinkage averaging estimation. Stat. Pap. 53, 1015–1034 (2012)
Schomaker, M., Wan, A.T.K., Heumann, C.: Frequentist model averaging with missing observations. Comput. Stat. Data Anal. 54, 3336–3347 (2010)
Schomaker, M., Heumannm, C.: Model selection and model averaging after multiple imputation. Comput. Stat. Data. Anal. 71, 758–770 (2014)
Article MathSciNet MATH Google Scholar
Schomaker, M., Heumann, C.: When and when not to use optimal model averaging (2018). arXiv preprint: arXiv:1802.04589
Shan, K., Yang, Y.: Combining regression quantile estimators. Stat. Sin. 19, 1171–1191 (2009)
Shang, J., Cavanaugh, J.E.: Bootstrap variants of the Akaike information criterion for mixed model selection. Comput. Stat. Data. Anal. 52, 2004–2021 (2008)
Shen, X., Huang, H.-C., Ye, J.: Adaptive model selection and assessment for exponential family distributions. Technometrics 46, 306–317 (2004)
Shen, X., Huang, H.-C.: Optimal model assessment, selection, and combination. J. Am. Stat. Assoc. 101, 554–568 (2006)
Shibata, R.: Bootstrap estimate of Kullback-Leibler information for model selection. Stat. Sin. 7, 375–394 (1997)
Smith, A.C., Koper, N., Francis, C.M., Fahrig, L.: Confronting collinearity: comparing methods for disentangling the effects of habitat loss and fragmentation. Landscape Ecol. 24, 1271–1285 (2009)
Smyth, P., Wolpert, D.: Linearly combining density estimators via stacking. Mach. Learn. 36, 59–83 (1999)
Stone, M.: Cross-validatory choice and assessment of statistical predictions. J. R. Stat. Soc. Ser. B. (Methodol.) 36, 111–147 (1974)
Stone, M.: An asymptotic equivalence of choice of model by cross-validation and Akaike’s criterion. J. R. Stat. Soc. Ser. B. (Methodol.) 39, 44–47 (1977)
Sugiura, N.: Further analysis of the data by Akaike’s information criterion and the finite corrections. Commun. Stat. Theory 7, 13–26 (1978)
Takeuchi, K.: Distribution of informational statistics and a criterion of model fitting. Suri-Kagaku (Math. Sci.) 153, 12–18 (1976)
Timmermann, A.: Forecast combinations. In: Elliott, G., Granger, C.W.J., Timmermann, A. (eds.) Handbook of Economic Forecasting, pp. 135–196. Elsevier, Amsterdam (2006)
Ting, K.M., Witten, I.H.: Issues in stacked generalization. J. Artif. Intell. Res. 10, 271–289 (1999)
Turek, D., Fletcher, D.: Model-averaged Wald confidence intervals. Comput. Stat. Data. Anal. 56, 2809–2815 (2012)
Turek, D.: Comparison of the frequentist MATA confidence interval with Bayesian model-averaged confidence intervals. J. Probab. Stat. (2015). https://doi.org/10.1155/2015/420483
Ullah, A., Wang, H.: Parametric and nonparametric frequentist model selection and model averaging. Econ. J. 1, 157–179 (2013)
Vaida, F., Blanchard, S.: Conditional Akaike information for mixed-effects models. Biometrika 92, 351–370 (2005)
van der Laan, M.J., Dudoit, S., Keles, S.: Asymptotic optimality of likelihood-based cross-validation. Stat. Appl. Genet. Mol. Biol. 3, Article 4 (2004)
van der Laan, M.J., Polley, E.C., Hubbard, A.E.: Super learner. Stat. Appl. Genet. Mol. Biol. 6, 1–23 (2007)
Wagenmakers, E.-J., Farrell, S., Ratcliff, R.: Letter to the editors. Biometrics 60, 281–283 (2004)
Wager, S., Hastie, T., Efron, B.: Confidence intervals for random forests: the jackknife and the infinitesimal jackknife. J. Mach. Learn. Res. 15, 1625–1651 (2014)
Wallis, K.F.: Combining density and interval forecasts: a modest proposal. Oxford B. Econ. Stat. 67, 983–994 (2005)
Wan, A.T.K., Zhang, X., Zou, G.: Least squares model averaging by Mallows criterion. J. Econ. 156, 277–283 (2010)
Wan, A.T.K., Zhang, X., Wang, S.: Frequentist model averaging for multinomial and ordered logit models. Int. J. Forecast. 30, 118–128 (2014)
Wang, H., Zou, G., Wan, A.T.K.: Model averaging for varying-coefficient partially linear measurement error models. Electron. J. Stat. 6, 1017–1039 (2012)
Wang, H., Zhou, S.Z.F.: Interval estimation by frequentist model averaging. Commun. Stat. Theory 42, 4342–4356 (2013)
Wang, H.Y., Chen, X., Flournoy, N.: The focused information criterion for varying-coefficient partially linear measurement error models. Stat. Pap. 1–15. Springer, Heidelberg (2014)
Wang, H., Li, Y., Sun, J.: Focused and model average estimation for regression analysis of panel count data. Scand. J. Stat. 42, 732–745 (2015)
Wedderburn, R.W.M.: Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika 61, 439–447 (1974)
White, H.: Maximum likelihood estimation of misspecified models. Econometrica 50, 1–25 (1982)
Wolpert, D.H.: Stacked generalization. Neural Netw. 5, 241–259 (1992)
Wood, S.N.: Core Statistics. Cambridge University Press, Cambridge (2015)
Xie, T.: Prediction model averaging estimator. Econ. Lett. 131, 5–8 (2015)
Xu, R., Gamst, A., Donohue, M., Vaida, F., Harrington, D.P.: Using profile likelihood for semiparametric model selection with application to proportional hazards mixed models. Harvard University Biostatistics Working Paper Series, Paper 43 (2006). http://biostats.bepress.com/harvardbiostat/paper43/
Xu, G., Wang, S., Huang, J.Z.: Focused information criterion and model averaging based on weighted composite quantile regression. Scand. J. Stat. 41, 365–381 (2014)
Xu, R., Mehrotra, D.V., Shaw, P.A.: Incorporating baseline measurements into the analysis of crossover trials with time-to-event endpoints. Stat. Med. (2018). https://doi.org/10.1002/sim.7834
Yang, Y.: Adaptive regression by mixing. J. Am. Stat. Assoc. 96, 574–588 (2001)
Yang, Y.: Regression with multiple candidate models: selecting or mixing? Stat. Sin. 13, 783–809 (2003)
Yang, Y.: Can the strengths of AIC and BIC be shared? A conflict between model identification and regression estimation. Biometrika 92, 937–950 (2005)
Ye, J.: On measuring and correcting the effects of data mining and model selection. J. Am. Stat. Assoc. 93, 120–131 (1998)
Yu, D., Yau, K.K.W.: Conditional Akaike information criterion for generalized linear mixed models. Comput. Stat. Data. Anal. 56, 629–644 (2012)
Yu, Y., Thurston, S.W., Hauser, R., Liang, H.: Model averaging procedure for partially linear single-index models. J. Stat. Plan. Infer. 143, 2160–2170 (2013)
Yu, W., Xu, W., Zhu, L.: Transformation-based model averaged tail area inference. Comput. Stat. 29, 1713–1726 (2014)
Yu, D., Zhang, X., Yau, K.K.W.: Asymptotic properties and information criteria for misspecified generalized linear mixed models. J. R. Stat. Soc. Ser. B (Methodol.) (2018). https://doi.org/10.1111/rssb.12270
Yuan, Z., Yang, Y.: Combining linear regression models. J. Am. Stat. Assoc. 100, 1202–1214 (2005)
Yuan, Z., Ghosh, D.: Combining multiple biomarker models in logistic regression. Biometrics 64, 431–439 (2008)
Zeng, J.: Model-Averaged Confidence Intervals. (Thesis, Doctor of Philosophy). University of Otago (2013)
Zeng, J., Cheng, W., Hu, G., Rong, Y.: Model averaging procedure for varying-coefficient partially linear models with missing responses. J. Korean Stat. Soc. 47, 379–394 (2018)
Zhang, X., Liang, H.: Focused information criterion and model averaging for generalized additive partial linear models. Ann. Stat. 39, 174–200 (2011)
Zhang, C., Ma, Y. (eds.): Ensemble Machine Learning: Methods and Applications. Springer, New York (2012)
Zhang, X., Wan, A.T.K., Zhou, S.Z.: Focused information criteria, model selection, and model averaging in a Tobit model with a nonzero threshold. J. Bus. Econ. Stat. 30, 132–142 (2012)
Zhang, X., Wan, A.T.K., Zou, G.: Model averaging by jackknife criterion in models with dependent data. J. Econ. 174, 82–94 (2013)
Zhang, X., Zou, G., Carroll, R.J.: Model averaging based on Kullback-Leibler distance. Stat. Sin. 25, 1583–1598 (2015)
Zhang, X.: Consistency of model averaging estimators. Econ. Lett. 130, 120–123 (2015)
Zhang, Y., Yang, Y.: Cross-validation for selecting a model selection procedure. J. Econ. 187, 95–112 (2015)
Zhang, X., Yu, D., Zou, G., Liang, H.: Optimal model averaging estimation for generalized linear models and generalized linear mixed-effects models. J. Am. Stat. Assoc. 111, 1775–1790 (2016)
Zhang, Q., Duan, X., Ma, S.: Focused information criterion and model averaging with generalized rank regression. Stat. Probabil. Lett. 122, 11–19 (2017)
Zhao, N., Zhao, Z., Liao, S.: Probabilistic model combination for support vector machine using positive-definite kernel-based regularization path. In: Wang, Y., Li, T. (eds.) Foundations of Intelligent Systems. Advances in Intelligent and Soft Computing, vol. 122, pp. 201–206. Springer, Heidelberg (2011)
Zhao, S., Zhang, X., Gao, Y.: Model averaging with averaging covariance matrix. Econ. Lett. 145, 214–217 (2016)
Zhao, S., Ullah, A., Zhang, X.: A class of model averaging estimators. Econ. Lett. 162, 101–106 (2018)
Zou, G., Wan, A.T.K., Wu, X., Chen, T.: Estimation of regression coefficients of interest when other regression coefficients are of no interest: the case of non-normal errors. Stat. Probabil. Lett. 77, 803–810 (2007)

Author information

Authors and Affiliations

Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
David Fletcher

Corresponding author

Correspondence to David Fletcher.

About this chapter

Cite this chapter

Fletcher, D. (2018). Frequentist Model Averaging. In: Model Averaging. SpringerBriefs in Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-58541-2_3

DOI: https://doi.org/10.1007/978-3-662-58541-2_3
Published: 18 January 2019
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-58540-5
Online ISBN: 978-3-662-58541-2
eBook Packages: Mathematics and Statistics (R0)