Subsampling sequential Monte Carlo for static Bayesian models

Statistics and Computing

Abstract

We show how to speed up sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
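
The abstract describes the generic annealed SMC recursion (reweight, resample, move) that the paper builds on, with the full-data likelihood replaced by a subsampling-based estimator. The sketch below is a minimal conceptual illustration of that recursion in Python/NumPy, not the authors' implementation: the callables `sample_prior`, `subsampled_loglik` (an estimator of the log-likelihood based on a data subsample) and `move_kernel` (e.g. the Hamiltonian Monte Carlo within Gibbs update described above) are hypothetical placeholders, and details such as adaptive tempering, resampling thresholds and kernel tuning are omitted.

```python
import numpy as np

def smc_subsampling(sample_prior, subsampled_loglik, move_kernel,
                    temperatures, n_particles, rng=None):
    """Propagate a particle cloud from the prior (temperature 0) to the
    estimated posterior (temperature 1) along the schedule `temperatures`."""
    if rng is None:
        rng = np.random.default_rng(0)

    theta = [sample_prior(rng) for _ in range(n_particles)]          # particles drawn from the prior
    llhat = np.array([subsampled_loglik(t, rng) for t in theta])     # subsampled log-likelihood estimates
    log_w = np.zeros(n_particles)                                    # log-weights

    for a_prev, a in zip(temperatures[:-1], temperatures[1:]):
        # 1. Reweight: the incremental weight raises the estimated
        #    likelihood from power a_prev to power a (annealing).
        log_w += (a - a_prev) * llhat
        w = np.exp(log_w - log_w.max())
        w /= w.sum()

        # 2. Resample (plain multinomial; systematic or residual schemes are common).
        idx = rng.choice(n_particles, size=n_particles, p=w)
        theta = [theta[i] for i in idx]
        llhat = llhat[idx]
        log_w[:] = 0.0

        # 3. Move: apply a Markov kernel that leaves the current tempered target
        #    invariant, e.g. HMC for the model parameters and a block update of the
        #    subsampling auxiliary variables; the kernel returns the moved particle
        #    and its refreshed log-likelihood estimate.
        moved = [move_kernel(t, l, a, rng) for t, l in zip(theta, llhat)]
        theta = [m[0] for m in moved]
        llhat = np.array([m[1] for m in moved])

    return theta
```

In practice the temperature schedule and the tuning parameters of the move kernel are usually chosen adaptively from the particle cloud, and resampling is triggered only when the effective sample size falls below a threshold; the sketch omits these refinements for brevity.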

Notes

  1. https://nci.org.au/our-systems/hpc-systems.

  2. https://research.unsw.edu.au/katana.


Acknowledgements

We thank the Associate Editor and two reviewers for helping to improve both the content and the presentation of the article. Khue-Dung Dang, David Gunawan, Matias Quiroz and Robert Kohn were partially supported by Australian Research Council Centre of Excellence grant CE140100049.

Author information

Corresponding author

Correspondence to David Gunawan.

About this article

Cite this article

Gunawan, D., Dang, KD., Quiroz, M. et al. Subsampling sequential Monte Carlo for static Bayesian models. Stat Comput 30, 1741–1758 (2020). https://doi.org/10.1007/s11222-020-09969-z
