
An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games

Published in Autonomous Agents and Multi-Agent Systems.

Abstract

In this paper, we investigate reinforcement learning (RL) in multi-agent systems (MAS) from an evolutionary dynamical perspective. Typical of a MAS is that the environment is not stationary and the Markov property does not hold, which requires the agents to be adaptive. RL is a natural approach to model the learning of individual agents. These learning algorithms are, however, known to be sensitive to the correct choice of parameter settings even in single-agent systems. This issue is more prevalent in the MAS case due to the changing interactions amongst the agents. It is largely an open question for a developer of a MAS how to design the individual agents such that, through learning, the agents as a collective arrive at good solutions. We will show that modeling RL in MAS from an evolutionary game theoretic point of view is a new and potentially successful way to guide learning agents to the most suitable solution for the task at hand. We show how evolutionary dynamics (ED) from evolutionary game theory can help the developer of a MAS make good choices of parameter settings for the RL algorithms used. The ED essentially predict the equilibrium outcomes of the MAS in which the agents use individual RL algorithms. More specifically, we show how the ED predict the learning trajectories of Q-learners in iterated games. Moreover, we apply our results to (an extension of) the COllective INtelligence framework (COIN), a proven engineering approach for learning cooperative tasks in MASs, in which the utilities of the agents are re-engineered to contribute to the global utility. We show how the improved results for MAS RL in COIN, and a developed extension, are predicted by the ED.
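The kind of setting the abstract describes can be illustrated with a minimal sketch (not the paper's exact model or parameter values): two independent stateless Q-learners with Boltzmann exploration repeatedly play the Prisoner's Dilemma. The evolutionary-dynamics view predicts that the population mass, and hence each learner's action probabilities, flows toward the dominant strategy (defect). All payoff values, the learning rate, and the temperature below are illustrative assumptions.

```python
import numpy as np

# Row player's payoff matrix for actions (cooperate=0, defect=1);
# defect strictly dominates cooperate in this illustrative game.
A = np.array([[3.0, 0.0],
              [5.0, 1.0]])

def boltzmann(q, tau):
    """Softmax action probabilities at temperature tau."""
    e = np.exp(q / tau)
    return e / e.sum()

rng = np.random.default_rng(0)
alpha, tau = 0.1, 0.2          # learning rate and temperature (assumed values)
q1, q2 = np.zeros(2), np.zeros(2)

for _ in range(5000):
    p1, p2 = boltzmann(q1, tau), boltzmann(q2, tau)
    a1 = rng.choice(2, p=p1)
    a2 = rng.choice(2, p=p2)
    # Stateless Q-update: move the chosen action's value toward its payoff.
    # The game is symmetric, so the column player's payoff is A[a2, a1].
    q1[a1] += alpha * (A[a1, a2] - q1[a1])
    q2[a2] += alpha * (A[a2, a1] - q2[a2])

# Probability mass should concentrate on action 1 (defect) for both agents,
# matching the replicator-dynamics prediction for a dominant strategy.
print(boltzmann(q1, tau), boltzmann(q2, tau))
```

The point of the evolutionary analysis is that this outcome, and its sensitivity to the temperature and learning rate, can be read off the dynamics without running the simulation.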



Author information


Corresponding author

Correspondence to Bram Vanschoenwinkel.

Additional information

The author was funded by a doctoral grant of the Institute for the Advancement of Scientific and Technological Research in Flanders (IWT).


About this article

Cite this article

Tuyls, K., ’t Hoen, P.J. & Vanschoenwinkel, B. An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games. Auton Agent Multi-Agent Syst 12, 115–153 (2006). https://doi.org/10.1007/s10458-005-3783-9
