Exact Bayesian inference by symbolic disintegration

Authors:
Chung-chieh Shan (Indiana University, USA), Norman Ramsey (Tufts University, USA)
POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, January 2017, Pages 130–144, https://doi.org/10.1145/3009837.3009852

Published: 01 January 2017

ABSTRACT

Bayesian inference, of posterior knowledge from prior knowledge and observed evidence, is typically defined by Bayes's rule, which says the posterior multiplied by the probability of an observation equals a joint probability. But the observation of a continuous quantity usually has probability zero, in which case Bayes's rule says only that the unknown times zero is zero. To infer a posterior distribution from a zero-probability observation, the statistical notion of disintegration tells us to specify the observation as an expression rather than a predicate, but does not tell us how to compute the posterior. We present the first method of computing a disintegration from a probabilistic program and an expression of a quantity to be observed, even when the observation has probability zero. Because the method produces an exact posterior term and preserves a semantics in which monadic terms denote measures, it composes with other inference methods in a modular way, without sacrificing accuracy or performance.
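
To illustrate why the observation has to be given as an expression rather than a predicate, here is a minimal sketch. It is not the method of the paper, and the model is an assumption chosen for illustration: x and y are drawn independently and uniformly from [0, 1], and the zero-probability event y = 2x is observed in two ways, either as the expression y - 2x taking the value 0 or as the expression y / x taking the value 2. The conditional densities are worked out by hand with a change of variables and then integrated exactly; the helper names `integrate_monomial` and `posterior_mean` are hypothetical.

```python
from fractions import Fraction


def integrate_monomial(power, lo, hi):
    """Exact integral of x**power over the interval [lo, hi]."""
    n = power + 1
    return (Fraction(hi) ** n - Fraction(lo) ** n) / n


def posterior_mean(weight_power, lo, hi):
    """Mean of x under the unnormalized density x**weight_power on [lo, hi]."""
    mass = integrate_monomial(weight_power, lo, hi)
    first_moment = integrate_monomial(weight_power + 1, lo, hi)
    return first_moment / mass


# On the line y = 2x, the constraint 0 <= y <= 1 restricts x to [0, 1/2].
half = Fraction(1, 2)

# Observe t = y - 2x at t = 0: y = t + 2x, so |dy/dt| = 1 and the
# conditional density of x is constant on [0, 1/2].
mean_given_difference = posterior_mean(0, 0, half)

# Observe r = y / x at r = 2: y = r * x, so |dy/dr| = x and the
# conditional density of x is proportional to x on [0, 1/2].
mean_given_ratio = posterior_mean(1, 0, half)

print(mean_given_difference, mean_given_ratio)
```

Both expressions describe the same set of points, yet the two posterior means of x differ (1/4 versus 1/3), so the predicate y = 2x alone does not determine a posterior.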


Index Terms

1. Mathematics of computing
  1. Probability and statistics
2. Theory of computation
  1. Semantics and reasoning
    1. Program semantics


Published in
POPL '17: Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages
January 2017
901 pages
ISBN: 9781450346603
DOI: 10.1145/3009837
General Chair: Giuseppe Castagna (Paris Diderot University, France / CNRS, France)
Program Chair: Andrew D. Gordon (Microsoft Research, UK / University of Edinburgh, UK)
ACM SIGPLAN Notices, Volume 52, Issue 1
POPL '17
January 2017
901 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3093333
Editor: Matthew Fluet
Copyright © 2017 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
conditional measures
continuous distributions
probabilistic programs
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate: 824 of 4,130 submissions, 20%