Abstract
As inductive inference and machine-learning methods in computer science see continued success, researchers are aiming to describe ever more complex probabilistic models and inference algorithms. It is natural to ask whether there is a universal computational procedure for probabilistic inference. We investigate the computability of conditional probability, a fundamental notion in probability theory, and a cornerstone of Bayesian statistics. We show that there are computable joint distributions with noncomputable conditional distributions, ruling out the prospect of general inference algorithms, even inefficient ones. Specifically, we construct a pair of computable random variables in the unit interval such that the conditional distribution of the first variable given the second encodes the halting problem. Nevertheless, probabilistic inference is possible in many common modeling settings, and we prove several results giving broadly applicable conditions under which conditional distributions are computable. In particular, conditional distributions become computable when measurements are corrupted by independent computable noise with a sufficiently smooth bounded density.
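The positive result for noisy measurements can be illustrated with a small numerical sketch. This is not the authors' construction. It assumes a specific toy model chosen here for illustration: X uniform on the unit interval, an observation Y = X + noise, and Gaussian noise with standard deviation `sigma` as one example of a smooth bounded density. Under those assumptions Bayes' rule gives P(X in A | Y = y) = E[1_A(X) p(y - X)] / E[p(y - X)], where p is the noise density, and both expectations can be approximated to any desired accuracy. The function name `posterior_prob_below` and its parameters are hypothetical.

```python
import math


def posterior_prob_below(threshold, y, sigma=0.1, n=1000):
    """Approximate P(X < threshold | Y = y) for the assumed toy model:
    X ~ Uniform[0, 1] and Y = X + Gaussian noise with std. dev. sigma."""
    # Midpoint grid on [0, 1]; equal spacing stands in for the uniform prior.
    xs = [(i + 0.5) / n for i in range(n)]
    # Unnormalized noise density at the residual y - x. The normalizing
    # constant cancels in the ratio below, so it is omitted.
    weights = [math.exp(-((y - x) ** 2) / (2 * sigma ** 2)) for x in xs]
    total = sum(weights)  # approximates E[p(y - X)] up to a constant
    below = sum(w for x, w in zip(xs, weights) if x < threshold)
    return below / total


if __name__ == "__main__":
    print(posterior_prob_below(0.5, 0.5))
    print(posterior_prob_below(0.5, 0.9))
```

By symmetry the first call returns a value very close to 0.5, and the second returns a much smaller value because the observation y = 0.9 pulls the posterior mass toward the upper end of the interval. The ratio is well defined here because the bounded noise density makes the denominator strictly positive for observations in or near the unit interval. The paper's negative result concerns the noiseless case, in which no such density is available.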