Abstract
We develop new graphical representations for the problem of sequential decision making in partially observable multiagent environments, as formalized by interactive partially observable Markov decision processes (I-POMDPs). The graphical models called interactive influence diagrams (I-IDs) and their dynamic counterparts, interactive dynamic influence diagrams (I-DIDs), seek to explicitly model the structure that is often present in real-world problems by decomposing the situation into chance and decision variables, and the dependencies between the variables. I-DIDs generalize DIDs, which may be viewed as graphical representations of POMDPs, to multiagent settings in the same way that I-POMDPs generalize POMDPs. I-DIDs may be used to compute the policy of an agent given its belief as the agent acts and observes in a setting that is populated by other interacting agents. Using several examples, we show how I-IDs and I-DIDs may be applied and demonstrate their usefulness. We also show how the models may be solved using the standard algorithms that are applicable to DIDs. Solving I-DIDs exactly involves knowing the solutions of possible models of the other agents. The space of models grows exponentially with the number of time steps. We present a method of solving I-DIDs approximately by limiting the number of other agents’ candidate models at each time step to a constant. We do this by clustering models that are likely to be behaviorally equivalent and selecting a representative set from the clusters. We discuss the error bound of the approximation technique and demonstrate its empirical performance.
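The approximation described above bounds the model space by clustering candidate models of the other agent and keeping one representative per cluster. As a rough illustration of that idea (not the paper's algorithm), the sketch below clusters candidate models by their belief vectors with a simple k-means and returns the actual candidate nearest each centroid as the cluster's representative; the function name, the use of beliefs as the clustering feature, and the deterministic initialization are all illustrative assumptions.

```python
def cluster_models(beliefs, k, iters=20):
    """Illustrative model-space compression for an I-DID solver:
    cluster candidate models by belief vector (a stand-in proxy for
    likely behavioral equivalence) and keep k representatives."""
    # Deterministic initialization: evenly spaced candidates as seeds.
    centroids = [list(beliefs[i * len(beliefs) // k]) for i in range(k)]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(iters):
        # Assign each candidate belief to its nearest centroid.
        groups = [[] for _ in range(k)]
        for b in beliefs:
            j = min(range(k), key=lambda c: dist(b, centroids[c]))
            groups[j].append(b)
        # Recompute centroids; keep the old one if a cluster empties.
        for j, g in enumerate(groups):
            if g:
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]

    # Representative = the actual candidate closest to each centroid,
    # so the reduced set contains only genuine models.
    return [min(beliefs, key=lambda b: dist(b, c)) for c in centroids]

# Example: six candidate models' beliefs over a 2-state space, reduced to k=2.
beliefs = [(0.1, 0.9), (0.15, 0.85), (0.2, 0.8),
           (0.8, 0.2), (0.85, 0.15), (0.9, 0.1)]
reps = cluster_models(beliefs, k=2)  # one representative per belief cluster
```

In the actual technique, clustering at each time step keeps the candidate-model set at a constant size, which is what prevents the exponential growth in the number of models over the horizon.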
Cite this article
Doshi, P., Zeng, Y. & Chen, Q. Graphical models for interactive POMDPs: representations and solutions. Auton Agent Multi-Agent Syst 18, 376–416 (2009). https://doi.org/10.1007/s10458-008-9064-7