Approximate Bayesian forecasting

https://doi.org/10.1016/j.ijforecast.2018.08.003

Abstract

Approximate Bayesian Computation (ABC) has become increasingly prominent as a method for conducting parameter inference in a range of challenging statistical problems, most notably those characterized by an intractable likelihood function. In this paper, we focus on the use of ABC not as a tool for parametric inference, but as a means of generating probabilistic forecasts; or for conducting what we refer to as ‘approximate Bayesian forecasting’. The four key issues explored are: (i) the link between the theoretical behavior of the ABC posterior and that of the ABC-based predictive; (ii) the use of proper scoring rules to measure the (potential) loss of forecast accuracy when using an approximate rather than an exact predictive; (iii) the performance of approximate Bayesian forecasting in state space models; and (iv) the use of forecasting criteria to inform the selection of ABC summaries in empirical settings. The primary finding of the paper is that ABC can provide a computationally efficient means of generating probabilistic forecasts that are nearly identical to those produced by the exact predictive, and in a fraction of the time required to produce predictions via an exact method.

Introduction

Approximate Bayesian Computation (ABC) has become an increasingly prominent inferential tool in challenging problems, most notably those characterized by an intractable likelihood function. ABC requires only that one can simulate pseudo-data from the assumed model, for given draws of the parameters from the prior. Parameter draws that produce a ‘match’ between the pseudo and observed data - according to a given set of summary statistics, a chosen metric and a pre-specified tolerance - are retained and used to estimate the posterior distribution, with the resultant estimate of the exact (but inaccessible) posterior being conditioned on the summaries used in the matching. Various guiding principles have been established to select summary statistics in ABC (see, for instance, Drovandi, Pettitt, & Lee, 2015; Fearnhead & Prangle, 2012; and Joyce & Marjoram, 2008) and we refer the reader to reviews by Blum, Nunes, Prangle, and Sisson (2013) and Prangle (2015) for discussions of these different approaches.
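The accept/reject scheme just described can be sketched in a few lines. The following is a minimal illustration only, not the code used in the paper: the function name `abc_rejection`, its arguments, the Euclidean metric and the choice of setting the tolerance as a quantile of the simulated distances are all assumptions made for the example.

```python
import numpy as np

def abc_rejection(y_obs, prior_sampler, simulator, summary,
                  n_draws=20000, keep_frac=0.01, rng=None):
    """Basic accept/reject ABC (illustrative sketch).

    Keeps the prior draws whose simulated summaries lie closest to the
    observed summaries. All argument names are hypothetical.
    """
    rng = np.random.default_rng(rng)
    s_obs = np.atleast_1d(summary(y_obs))
    thetas = np.array([prior_sampler(rng) for _ in range(n_draws)])
    dists = np.empty(n_draws)
    for i, theta in enumerate(thetas):
        z = simulator(theta, len(y_obs), rng)            # pseudo-data from the model
        s_sim = np.atleast_1d(summary(z))
        dists[i] = np.linalg.norm(s_sim - s_obs)         # Euclidean metric (assumed)
    eps = np.quantile(dists, keep_frac)                  # tolerance as a distance quantile
    return thetas[dists <= eps], eps

# Toy usage: unknown mean of a N(theta, 1) model, N(0, 5^2) prior,
# with the sample mean as the single summary statistic.
rng = np.random.default_rng(0)
y_obs = rng.normal(2.0, 1.0, 100)
draws, eps = abc_rejection(
    y_obs,
    prior_sampler=lambda r: r.normal(0.0, 5.0),
    simulator=lambda theta, n, r: r.normal(theta, 1.0, n),
    summary=np.mean,
    rng=1,
)
```

The retained `draws` approximate the posterior conditioned on the summary rather than on the full data, which is the distinction drawn in the paragraph above.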

Along with the growth in applications of ABC (see Marin, Pudlo, Robert, & Ryder, 2012; Robert, 2016; and Sisson & Fan, 2011, for recent surveys), attention has recently been paid to the theoretical properties of the method, including the asymptotic behavior of: ABC posterior distributions, point estimates derived from those distributions, and Bayes factors that condition on summaries. Notable contributions here are Creel, Gao, Hong, and Kristensen (2015), Frazier, Martin, Robert, and Rousseau (2018), Jasra (2015), Li and Fearnhead (2018a, 2018b), Marin, Pillai, Robert, and Rousseau (2014) and Martin, McCabe, Frazier, Maneesoonthorn, and Robert (2018), with Frazier et al. (2018) providing the full suite of asymptotic results pertaining to the ABC posterior - namely, Bayesian (or posterior) consistency, limiting posterior shape, and the asymptotic distribution of the posterior mean.

This paper stands in contrast to the vast majority of ABC studies, with their focus on parametric inference and/or model choice. Our goal herein is to exploit ABC as a means of generating probabilistic forecasts; or for conducting what we refer to hereafter as ‘approximate Bayesian forecasting’ (ABF). Whilst ABF has particular relevance in scenarios in which the likelihood function and, hence, the exact predictive distribution, are inaccessible, we also give attention to cases where the exact predictive can be estimated (via a Markov chain Monte Carlo algorithm), but at a greater computational cost than that associated with ABF. That is, in part, we explore ABF as a computationally convenient means of constructing predictive distributions.

We prove that, under certain regularity conditions, ABF produces forecasts that are asymptotically equivalent to those obtained from exact Bayesian methods, and illustrate numerically the close match that can occur between approximate and exact predictives, even when the corresponding approximate and exact posteriors for the parameters are very distinct. We also explore the application of ABF to state space models, in which the production of an approximate Bayesian predictive requires integration over both a small number of static parameters and a set of states with dimension equal to the sample size.
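For orientation, the two objects being compared can be written side by side. The display below is a sketch of the standard construction for a one-step-ahead forecast, using the symbols that appear later in the paper: the exact posterior $p(\theta|y)$, the ABC posterior $p_\varepsilon(\theta|\eta(y))$ conditioned on summaries $\eta(y)$ with tolerance $\varepsilon$, the exact predictive $p(y_{T+1}|y)$ and the approximate predictive $g(y_{T+1}|y)$. It assumes a model without latent states.

```latex
\begin{align}
p(y_{T+1}|y) &= \int_{\Theta} p(y_{T+1}|\theta, y)\, p(\theta|y)\, d\theta, \\
g(y_{T+1}|y) &= \int_{\Theta} p(y_{T+1}|\theta, y)\, p_\varepsilon(\theta|\eta(y))\, d\theta .
\end{align}
```

The two integrals differ only in the posterior used to average the conditional predictive $p(y_{T+1}|\theta, y)$, which is why the large-sample behavior of the two posteriors governs how close $g$ is to $p$.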

In summary, the four primary questions addressed in the paper are the following: (i) What role does the asymptotic behavior of the ABC posterior - in particular Bayesian consistency - play in determining the accuracy of the approximate predictive as an estimate of the exact predictive? (ii) Can we characterize the loss incurred by using the approximate rather than the exact predictive, using proper scoring rules? (iii) How does ABF perform in state space models, and what role does (particle) filtering play therein? (iv) How can forecast accuracy be used to guide the choice of summary statistics in an empirical setting?
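To make question (ii) concrete, the sketch below averages two common proper scoring rules, the log score and the continuous ranked probability score (CRPS), over a hold-out sample. It is a toy illustration under stated assumptions: the data are i.i.d. Gaussian, the two competing predictives are fixed Gaussians rather than the paper's exact and ABC-based predictives, the CRPS is written in positive orientation (larger is better), and the function names are hypothetical.

```python
import numpy as np

def log_score_normal(y, mu, sigma):
    """Log score log p(y) for a Gaussian predictive N(mu, sigma^2)."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2

def crps_from_draws(draws, y):
    """Positively oriented CRPS, 0.5*E|X - X'| - E|X - y|, estimated from
    predictive draws (pairs are averaged over all i, j, including i = j)."""
    draws = np.asarray(draws, dtype=float)
    y = np.atleast_1d(np.asarray(y, dtype=float))
    spread = np.abs(draws[:, None] - draws[None, :]).mean()
    error = np.abs(draws[None, :] - y[:, None]).mean(axis=1)
    return 0.5 * spread - error

rng = np.random.default_rng(0)
y_holdout = rng.normal(0.0, 1.0, 500)     # toy hold-out sample, true DGP N(0, 1)
draws_a = rng.normal(0.0, 1.0, 1000)      # predictive A coincides with the DGP
draws_b = rng.normal(0.5, 1.0, 1000)      # predictive B is shifted by 0.5

avg_log_score = {
    "A": log_score_normal(y_holdout, 0.0, 1.0).mean(),
    "B": log_score_normal(y_holdout, 0.5, 1.0).mean(),
}
avg_crps = {
    "A": crps_from_draws(draws_a, y_holdout).mean(),
    "B": crps_from_draws(draws_b, y_holdout).mean(),
}
```

Because both rules are proper, the predictive that coincides with the data generating process is expected to attain the higher average score; the size of the gap between the two averages is the kind of quantity used to express the loss from an approximate predictive.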

We note that, independent of this research, Canale and Ruggiero (2016) propose the use of ABC as a means of generating nonparametric forecasts of certain functional time series models with intractable likelihoods. In particular, Canale and Ruggiero use ABC sampling as a means of generating h-step-ahead point and interval forecasts for some underlying unknown curve of interest. The authors apply this methodology to the prediction of price dynamics in the Italian natural gas market. Whilst not pursuing the same lines of enquiry as in the current research, the Canale and Ruggiero paper highlights the usefulness of ABC as a forecasting tool in scenarios when exact Bayesian inference - and, hence, exact Bayesian prediction - is infeasible, and thereby provides further evidence of the practical importance of the results we provide herein.

The remainder of the paper proceeds as follows. In Section 2 we first provide a brief overview of the method of ABC for producing estimates of an exact, but potentially inaccessible, posterior for the unknown parameters. The use of an ABC posterior to yield an approximate forecast distribution is then proposed. After a brief outline of existing asymptotic results pertaining to ABC in Section 3.1, the role played by Bayesian consistency in determining the accuracy of ABF is formally established in Section 3.2, with this building on earlier insights by Blackwell and Dubins (1962) and Diaconis and Freedman (1986) regarding the merging of predictive distributions. In Section 3.3, the concept of a proper scoring rule is adopted in order to formalize the loss incurred when adopting the approximate rather than the exact Bayesian predictive. The relative performance of ABF is then quantified in Section 3.4 using two simple examples: one in which an integer autoregressive model for count time series data is adopted as the data generating process (DGP), with a single set of summaries used to implement ABC; and a second in which a moving average (MA) model is the assumed DGP, and predictives based on alternative sets of summaries are investigated. In both examples there is little visual distinction between the approximate and exact predictives, despite enormous visual differences between the corresponding posteriors. Furthermore, the visual similarity between the exact and approximate predictives extends to forecast accuracy: using averages of various proper scores over a hold-out sample, we demonstrate that the predictive superiority of the exact predictive, over the approximate, is minimal in both examples. Moreover, we highlight the fact that all approximate predictives can be produced in a fraction of the time taken to produce the corresponding exact predictive.
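A compact sketch of how an approximate predictive of the first kind might be produced is given below. It is illustrative only and rests on several assumptions not stated in this excerpt: an INAR(1) model with binomial thinning and Poisson innovations, uniform priors on the two parameters, three summary statistics (sample mean and the lag-0 and lag-1 autocovariances), and a tolerance set as a quantile of the simulated distances. All function names are hypothetical.

```python
import numpy as np

def simulate_inar1(rho, lam, T, rng):
    """Simulate INAR(1) paths, one row per (rho, lam) pair:
    y_t = Binomial(y_{t-1}, rho) + Poisson(lam)   (assumed specification)."""
    rho, lam = np.atleast_1d(rho), np.atleast_1d(lam)
    y = np.empty((len(rho), T), dtype=np.int64)
    y[:, 0] = rng.poisson(lam / (1.0 - rho))      # stationary marginal is Poisson
    for t in range(1, T):
        y[:, t] = rng.binomial(y[:, t - 1], rho) + rng.poisson(lam)
    return y

def summaries(y):
    """Hypothetical summaries: mean, variance and lag-1 autocovariance per row."""
    y = np.atleast_2d(y)
    m = y.mean(axis=1, keepdims=True)
    yc = y - m
    return np.column_stack([m[:, 0],
                            (yc * yc).mean(axis=1),
                            (yc[:, 1:] * yc[:, :-1]).mean(axis=1)])

def abf_inar1(y_obs, n_draws=5000, keep_frac=0.02, seed=0):
    """ABC on (rho, lam), then one draw of y_{T+1} per accepted parameter pair."""
    rng = np.random.default_rng(seed)
    rho = rng.uniform(0.0, 1.0, n_draws)          # assumed priors
    lam = rng.uniform(0.0, 10.0, n_draws)
    z = simulate_inar1(rho, lam, len(y_obs), rng)
    dist = np.linalg.norm(summaries(z) - summaries(y_obs), axis=1)
    keep = dist <= np.quantile(dist, keep_frac)
    # Conditional on y_T and each accepted draw, simulate the next count.
    y_next = rng.binomial(y_obs[-1], rho[keep]) + rng.poisson(lam[keep])
    return rho[keep], lam[keep], y_next

y_obs = simulate_inar1(0.4, 2.0, 200, np.random.default_rng(1))[0]
rho_acc, lam_acc, y_next = abf_inar1(y_obs)
pmf = np.bincount(y_next) / len(y_next)           # estimated predictive mass function
```

The relative frequencies in `pmf` estimate the approximate predictive for the next count; no likelihood evaluation is needed at any point, only simulation from the model.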

In Section 4, we explore ABF in the context of a model in which latent variables feature. Using a simple stochastic volatility model for which the exact predictive is accessible via Markov chain Monte Carlo (MCMC), the critical importance (in terms of matching the exact predictive) of augmenting ABC inference on the static parameters with ‘exact’ inference on the states, via a particle filtering step, is made clear. An extensive empirical illustration is then undertaken in Section 5. Approximate predictives for both a financial return and its volatility, in a dynamic jump-diffusion model with α-stable volatility transitions, are produced, using different sets of summary statistics, including those extracted from simple auxiliary models with closed-form likelihood functions. Particular focus is given to using out-of-sample predictive performance to choose the ‘best’ set of summaries for driving ABC, in the case where prediction is the primary goal of the investigation. A discussion section concludes the paper in Section 6, and proofs are included in the Appendix. All Matlab code used in the production of the numerical results will be made available at http://users.monash.edu.au/~gmartin/.
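The idea of pairing ABC draws of the static parameters with a particle filtering step for the states can be sketched as follows. This is a hedged illustration, not the paper's implementation: it assumes a log-volatility AR(1) stochastic volatility model with Gaussian errors, a bootstrap filter with multinomial resampling, and a placeholder array `theta_draws` standing in for draws from an ABC posterior. The function names are hypothetical.

```python
import numpy as np

def bootstrap_filter(y, phi, sigma_v, n_particles, rng):
    """Bootstrap particle filter for an assumed SV model:
        x_t = phi * x_{t-1} + sigma_v * v_t,   y_t = exp(x_t / 2) * e_t,
    with v_t, e_t ~ N(0, 1). Returns equally weighted particles that
    approximate p(x_T | y_{1:T}, theta)."""
    x = rng.normal(0.0, sigma_v / np.sqrt(1.0 - phi**2), n_particles)  # stationary start
    for t, y_t in enumerate(y):
        if t > 0:
            x = phi * x + sigma_v * rng.normal(size=n_particles)       # propagate states
        logw = -0.5 * (x + y_t**2 * np.exp(-x))    # log N(y_t; 0, exp(x_t)) up to a constant
        w = np.exp(logw - logw.max())
        w /= w.sum()
        x = x[rng.choice(n_particles, size=n_particles, p=w)]          # multinomial resampling
    return x

def abf_sv_predictive(y, theta_draws, n_particles=500, seed=0):
    """One draw of y_{T+1} per static-parameter draw (phi, sigma_v)."""
    rng = np.random.default_rng(seed)
    y_next = np.empty(len(theta_draws))
    for i, (phi, sigma_v) in enumerate(theta_draws):
        x_T = bootstrap_filter(y, phi, sigma_v, n_particles, rng)
        x_last = rng.choice(x_T)                                 # draw from filtered state
        x_next = phi * x_last + sigma_v * rng.normal()           # state transition
        y_next[i] = np.exp(x_next / 2.0) * rng.normal()          # measurement draw
    return y_next

# Toy usage with simulated data and placeholder parameter draws.
rng = np.random.default_rng(0)
T, phi0, sig0 = 100, 0.9, 0.3
x = np.empty(T)
x[0] = rng.normal(0.0, sig0 / np.sqrt(1.0 - phi0**2))
for t in range(1, T):
    x[t] = phi0 * x[t - 1] + sig0 * rng.normal()
y = np.exp(x / 2.0) * rng.normal(size=T)
theta_draws = np.column_stack([rng.uniform(0.85, 0.95, 20),     # stand-in for ABC draws
                               rng.uniform(0.2, 0.4, 20)])
y_next = abf_sv_predictive(y, theta_draws, n_particles=300, seed=1)
```

Conditioning the state draw on the full observed series through the filter, rather than on the summaries alone, is the step the paragraph above identifies as critical for matching the exact predictive.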

Section snippets

Approximate Bayesian computation (ABC): Inference and forecasting

We observe a $T$-dimensional vector of data $y=(y_1,y_2,\ldots,y_T)$, assumed to be generated from some model with likelihood $p(y|\theta)$, with $\theta\in\Theta\subseteq\mathbb{R}^{k_\theta}$ a $k_\theta$-dimensional vector of unknown parameters, and where we possess prior beliefs on $\theta$ specified by $p(\theta)$. In this section, we propose a means of producing probabilistic forecasts for the random variables $Y_{T+k}$, $k=1,2,\ldots,h$, in situations where $p(y|\theta)$ is computationally intractable or numerically difficult to calculate. Before presenting this approach, we first give

Accuracy of ABF

It is well-known in the ABC literature that the posterior $p_\varepsilon(\theta|\eta(y))$ is sometimes a poor approximation to $p(\theta|y)$ (Marin et al., 2012). What is unknown, however, is whether or not this same degree of inaccuracy will transfer to the ABC-based predictive. To this end, we begin by characterizing the difference between $g(y_{T+1}|y)$ and $p(y_{T+1}|y)$ using the large sample behavior of $p_\varepsilon(\theta|\eta(y))$ and $p(\theta|y)$. In so doing, in Section 3.2 we demonstrate that if both $p_\varepsilon(\theta|\eta(y))$ and $p(\theta|y)$ are Bayesian consistent

ABF in state space models

So far the focus has been on the case in which the vector of unknowns, $\theta$, is a $k_\theta$-dimensional set of parameters for which informative summary statistics are sought for the purpose of generating probabilistic predictions. By implication, and certainly in the case of both the INAR(1) and MA(2) examples, the elements of $\theta$ are static in nature, with $k_\theta$ small enough for a set of summaries of manageable dimension to be defined with relative ease.

State space models, in which the set of unknowns is

Background, model and computational details

The effective management of financial risk entails the ability to plan for unexpected, and potentially large, movements in asset prices. Central to this is the ability to accurately quantify the probability distribution of the future return on the asset, including its degree of variation, or volatility. The stylized features of time-varying and autocorrelated volatility, allied with non-Gaussian return distributions, are now extensively documented in the literature (Bollerslev, Chou, & Kroner,

Discussion

This paper explores the use of approximate Bayesian computation (ABC) in generating probabilistic forecasts and proposes the concept of approximate Bayesian forecasting (ABF). Theoretical and numerical evidence has been presented which indicates that if the assumed data generating process (DGP) is correctly specified, very little is lost - in terms of forecast accuracy - by conducting approximate inference (only) on the unknowns that characterize the DGP. A caveat here applies to latent

References (60)

  • Beaumont, M. A., et al. (2002). Approximate Bayesian computation in population genetics. Genetics.
  • Blackwell, D., et al. (1962). Merging of opinions with increasing information. The Annals of Mathematical Statistics.
  • Blomstedt, P., et al. (2015). Posterior predictive comparisons for the two-sample problem. Communications in Statistics. Theory and Methods.
  • Blum, M. G. B. (2010). Approximate Bayesian computation: a nonparametric perspective. Journal of the American Statistical Association.
  • Blum, M. G. B., et al. (2013). A comparative review of dimension reduction methods in approximate Bayesian computation. Statistical Science.
  • Broadie, M., et al. (2007). Model specification and risk premia: Evidence from futures options. The Journal of Finance.
  • Canale, A., et al. (2016). Bayesian nonparametric forecasting of monotonic functional time series. Electronic Journal of Statistics.
  • Chambers, J. M., et al. (1976). A method for simulating stable random variables. Journal of the American Statistical Association.
  • Chan, J. C.-C., et al. (2009). MCMC estimation of restricted covariance matrices. Journal of Computational and Graphical Statistics.
  • Creel, M., Gao, J., Hong, H., & Kristensen, D. (2015). Bayesian indirect inference and the ABC of GMM. arXiv preprint,...
  • Diaconis, P., et al. (1986). On the consistency of Bayes estimates. The Annals of Statistics.
  • Drost, F. C., et al. (2009). Efficient estimation of auto-regression parameters and innovation distributions for semiparametric integer-valued AR(p) models. Journal of the Royal Statistical Society. Series B.
  • Drovandi, C. C., et al. (2015). Bayesian indirect inference using a parametric auxiliary model. Statistical Science.
  • Fearnhead, P., et al. (2012). Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation. Journal of the Royal Statistical Society. Series B.
  • Frazier, D. T., et al. (2018). Asymptotic properties of approximate Bayesian computation. Biometrika.
  • Frazier, D. T., Robert, C. P., & Rousseau, J. (2017). Model misspecification in ABC: consequences and diagnostics....
  • Fulop, A., et al. (2014). Self-exciting jumps, learning, and asset pricing implications. The Review of Financial Studies.
  • Gallant, R., et al. (1996). Which moments to match? Econometric Theory.
  • Ghosal, S., et al. (1995). On convergence of posterior distributions. The Annals of Statistics.
  • Ghosh, J. K., et al. (2003). Bayesian Nonparametrics.
    David T. Frazier is Senior Lecturer, Department of Econometrics and Business Statistics, Monash University, Melbourne, Australia. His research interests include statistical and econometric theory, simulation-based inference, Bayesian inference and financial econometrics.

    Worapree (Ole) Maneesoonthorn is Lecturer, Melbourne Business School, University of Melbourne, Australia. Her research interests include Bayesian inference, simulation methods and financial econometrics.

    Gael M. Martin is Professor, Department of Econometrics and Business Statistics, Monash University, Melbourne, Australia. Her research interests include Bayesian econometrics, simulation methods, financial econometrics and forecasting methodology.

    Brendan P.M. McCabe is Professor of Econometrics, School of Management, University of Liverpool, U.K. His research interests include statistical and econometric theory, Bayesian econometrics, count time series analysis and forecasting methodology.

    We thank the Editor and two anonymous referees for very constructive and detailed comments on earlier drafts of the paper. This research has been supported by Australian Research Council Discovery Grants No. DP150101728 and DP170100729.
