DOI: 10.1145/3009837.3009852 (POPL conference proceedings)
research-article
Open Access

Exact Bayesian inference by symbolic disintegration

Published: 01 January 2017

ABSTRACT

Bayesian inference, of posterior knowledge from prior knowledge and observed evidence, is typically defined by Bayes's rule, which says the posterior multiplied by the probability of an observation equals a joint probability. But the observation of a continuous quantity usually has probability zero, in which case Bayes's rule says only that the unknown times zero is zero. To infer a posterior distribution from a zero-probability observation, the statistical notion of disintegration tells us to specify the observation as an expression rather than a predicate, but does not tell us how to compute the posterior. We present the first method of computing a disintegration from a probabilistic program and an expression of a quantity to be observed, even when the observation has probability zero. Because the method produces an exact posterior term and preserves a semantics in which monadic terms denote measures, it composes with other inference methods in a modular way, without sacrificing accuracy or performance.
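
As a rough illustration of the problem the abstract describes, the following worked equations contrast Bayes's rule with a disintegration. The notation ($\xi$, $\kappa_t$, $\lambda$, $\varphi$) and the Gaussian example are assumptions chosen for this sketch; they are not taken from the paper, which may formalize disintegration differently.

```latex
% Bayes's rule for events H (hypothesis) and E (evidence):
\[
  \Pr(H \mid E)\,\Pr(E) = \Pr(H \cap E).
\]
% If E is "the continuous quantity t equals t_0", then Pr(E) = 0 and the
% rule reduces to  Pr(H | E) * 0 = 0,  which does not determine Pr(H | E).

% Disintegration (sketch): given a joint measure \xi on pairs (t, x) and
% the Lebesgue measure \lambda on the observed quantity t, look for a
% family of measures \kappa_t on x such that, for every measurable f >= 0,
\[
  \int f(t, x)\, d\xi(t, x)
    = \int \Bigl( \int f(t, x)\, d\kappa_t(x) \Bigr)\, d\lambda(t).
\]

% Illustrative example (assumed): x ~ Normal(0, 1) and t | x ~ Normal(x, 1),
% with \varphi the standard normal density. One disintegration is
\[
  d\kappa_t(x) = \varphi(x)\,\varphi(t - x)\, dx ,
\]
% and normalizing \kappa_{t_0} gives the posterior for the observation
% t = t_0, even though that observation has probability zero:
\[
  x \mid (t = t_0) \;\sim\; \mathrm{Normal}\!\left(\tfrac{t_0}{2},\, \tfrac{1}{2}\right).
\]
```

Here the observation is specified by the expression $t$ rather than by the predicate $t = t_0$, which is what lets the posterior be well defined for each value $t_0$.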

