
Summarizing agent strategies

Published in Autonomous Agents and Multi-Agent Systems 33, 628–644 (2019).

Abstract

Intelligent agents and AI-based systems are becoming increasingly prevalent. They support people in different ways, such as providing users with advice, working with them to achieve goals, or acting on users’ behalf. One key capability missing in such systems is the ability to present their users with an effective summary of their strategy and expected behaviors under different conditions and scenarios. This capability, which we see as complementary to those currently under development in the context of “interpretable machine learning” and “explainable AI”, is critical in various settings. In particular, it is likely to play a key role when a user needs to collaborate with an agent, when having to choose between different available agents to act on her behalf, or when requested to determine the level of autonomy to be granted to an agent or approve its strategy. In this paper, we pose the challenge of developing capabilities for strategy summarization, which is not addressed by current theories and methods in the field. We propose a conceptual framework for strategy summarization, which we envision as a collaborative process involving both agents and people. Finally, we suggest possible testbeds that could be used to evaluate progress in research on strategy summarization.
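To make the notion of a strategy summary concrete, here is a minimal sketch in the spirit of the HIGHLIGHTS algorithm [3]: states are ranked by how much the choice of action matters there (the gap between the best and worst Q-values), and the agent’s chosen action in the top-ranked states serves as a crude summary. This is an illustrative sketch, not the implementation from [3]; the function names and the toy Q-table are invented for the example.

```python
import numpy as np

def state_importance(q_values):
    # Gap between the best and worst action values: a large gap means
    # the action choice matters in this state, so showing the agent's
    # behavior there is informative.
    return np.max(q_values) - np.min(q_values)

def summarize_policy(q_table, k=5):
    # Return the k most important states together with the agent's
    # greedy action in each, as a crude strategy summary.
    # q_table: dict mapping state -> array of Q-values, one per action.
    ranked = sorted(q_table.items(),
                    key=lambda item: state_importance(item[1]),
                    reverse=True)
    return [(state, int(np.argmax(q))) for state, q in ranked[:k]]

# Toy example: three states, two actions each.
q_table = {
    "s0": np.array([1.0, 0.9]),   # low importance: actions nearly equivalent
    "s1": np.array([5.0, -2.0]),  # high importance: action choice matters
    "s2": np.array([0.0, 3.0]),
}
print(summarize_policy(q_table, k=2))  # [('s1', 0), ('s2', 1)]
```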


Notes

  1. In this paper, the term “agent” refers to any system whose strategy can be formally captured or simulated; autonomous agents are a special case in this sense.

  2. For example, for the autonomous operation of UAVs in search-and-rescue (SAR) missions, various recent solutions can be considered [1, 6, 54, 61, 62].

  3. The method is inspired by the “strategy method” paradigm from behavioral economics [56] in that it elicits people’s strategies. However, whereas in the strategy method people state their action for every possible situation that may arise in the interaction (i.e., a state-machine-like description), with peer-designed agents (PDAs) people are required to program their strategy, which is not necessarily state-based, into an agent.
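To illustrate the distinction drawn in this note, the toy sketch below contrasts the two elicitation styles; the states, actions, and decision rule are invented for the example and do not come from the paper.

```python
# Strategy method: the participant states an action for every situation
# that may arise, so the strategy is a lookup table over states
# (a state-machine-like description).
STRATEGY_TABLE = {
    "first_round": "cooperate",
    "opponent_cooperated": "cooperate",
    "opponent_defected": "defect",
}

def strategy_method_agent(state):
    return STRATEGY_TABLE[state]

# Peer-designed agent (PDA): the participant programs an arbitrary
# strategy, which may keep memory or compute over the whole interaction
# history rather than reacting to the current state alone.
def peer_designed_agent(history):
    # Defect only after two consecutive defections by the opponent.
    if len(history) >= 2 and all(move == "defect" for move in history[-2:]):
        return "defect"
    return "cooperate"

print(strategy_method_agent("opponent_defected"))              # defect
print(peer_designed_agent(["cooperate", "defect", "defect"]))  # defect
```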

References

  1. Abrahamsen, H. B. (2015). A remotely piloted aircraft system in major incident management: Concept and pilot, feasibility study. BMC Emergency Medicine, 15(1), 12. https://doi.org/10.1186/s12873-015-0036-3.

  2. Amgoud, L., & Prade, H. (2009). Using arguments for making and explaining decisions. Artificial Intelligence, 173(3–4), 413–436.

  3. Amir, D., & Amir, O. (2018). Highlights: Summarizing agent behavior to people. In Proceedings of the 17th international conference on autonomous agents and multi-agent systems (AAMAS).

  4. Amir, O., Kamar, E., Kolobov, A., & Grosz, B. J. (2016). Interactive teaching strategies for agent training. In Proceedings of the international joint conference on artificial intelligence (IJCAI).

  5. Baarslag, T., Hindriks, K., Jonker, C. M., Kraus, S., & Lin, R. (2012). The first automated negotiating agents competition (ANAC 2010). In T. Ito, M. Zhang, V. Robu, S. Fatima, & T. Matsuo (Eds.), New trends in agent-based complex automated negotiations (Vol. 383, pp. 113–135). Berlin, Heidelberg: Springer.

  6. Bejiga, M. B., Zeggada, A., Nouffidj, A., & Melgani, F. (2017). A convolutional neural network approach for assisting avalanche search and rescue operations with UAV imagery. Remote Sensing, 9(2). https://doi.org/10.3390/rs9020100. http://www.mdpi.com/2072-4292/9/2/100.

  7. Brooks, D. J., Shultz, A., Desai, M., Kovac, P., & Yanco, H. A. (2010). Towards state summarization for autonomous robots. In AAAI fall symposium: Dialog with robots (Vol. 61, p. 62).

  8. Caminada, M. W., Kutlak, R., Oren, N., & Vasconcelos, W. W. (2014). Scrutable plan enactment via argumentation and natural language generation. In Proceedings of the 2014 international conference on autonomous agents and multi-agent systems (AAMAS) (pp. 1625–1626).

  9. Chalamish, M., Sarne, D., & Lin, R. (2012). The effectiveness of peer-designed agents in agent-based simulations. Multiagent and Grid Systems, 8(4), 349–372.

  10. Chalamish, M., Sarne, D., & Lin, R. (2013). Enhancing parking simulations using peer-designed agents. IEEE Transactions on Intelligent Transportation Systems, 14(1), 492–498.

  11. Clouse, J. A. (1996). On integrating apprentice learning and reinforcement learning. PhD thesis, University of Massachusetts.

  12. Devin, S., & Alami, R. (2016). An implemented theory of mind to improve human–robot shared plans execution. In 2016 11th ACM/IEEE international conference on human–robot interaction (HRI) (pp. 319–326). IEEE.

  13. Dodson, T., Mattei, N., & Goldsmith, J. (2011). A natural language argumentation interface for explanation generation in Markov decision processes. In Algorithmic decision theory (pp. 42–55).

  14. Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

  15. Elizalde, F. (2008). Policy explanation in factored Markov decision processes. In Proceedings of the 4th European workshop on probabilistic graphical models (PGM 2008) (pp. 97–104).

  16. Elizalde, F., Sucar, L. E., Reyes, A., & de Buen, P. (2007). An MDP approach for explanation generation. In ExaCt (pp. 28–33).

  17. Elmalech, A., & Sarne, D. (2014). Evaluating the applicability of peer-designed agents for mechanism evaluation. Web Intelligence and Agent Systems, 12(2), 171–191.

  18. Elmalech, A., Sarne, D., & Agmon, N. (2016). Agent development as a strategy shaper. Autonomous Agents and Multi-Agent Systems, 30(3), 506–525.

  19. Ernst, D., Stan, G. B., Goncalves, J., & Wehenkel, L. (2006). Clinical data based optimal STI strategies for HIV: A reinforcement learning approach. In 2006 45th IEEE conference on decision and control (pp. 667–672). IEEE.

  20. Garg, A. X., Adhikari, N. K., McDonald, H., Rosas-Arellano, M. P., Devereaux, P., Beyene, J., et al. (2005). Effects of computerized clinical decision support systems on practitioner performance and patient outcomes: A systematic review. JAMA, 293(10), 1223–1238.

  21. Glass, A., McGuinness, D. L., & Wolverton, M. (2008). Toward establishing trust in adaptive agents. In Proceedings of the 13th international conference on intelligent user interfaces (pp. 227–236). ACM.

  22. Greenwald, A., & Stone, P. (2001). Autonomous bidding agents in the trading agent competition. IEEE Internet Computing, 5(2), 52–60. https://doi.org/10.1109/4236.914648.

  23. Greydanus, S., Koul, A., Dodge, J., & Fern, A. (2017). Visualizing and understanding Atari agents. arXiv preprint arXiv:1711.00138.

  24. Hadfi, R., & Ito, T. (2016a). Holonic multiagent simulation of complex adaptive systems. In Workshop on MAS for complex networks and social computation (CNSC).

  25. Hadfi, R., & Ito, T. (2016b). Multilayered multiagent system for traffic simulation. In International conference on autonomous agents and multi-agent systems (AAMAS), Singapore, May 9–13, 2016.

  26. Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (task load index): Results of empirical and theoretical research. Advances in Psychology, 52, 139–183.

  27. Hayes, B., & Shah, J. A. (2017). Improving robot controller transparency through autonomous policy explanation. In Proceedings of the 2017 ACM/IEEE international conference on human–robot interaction (pp. 303–312). ACM.

  28. Hoffman, G. (2013). Evaluating fluency in human–robot collaboration. In International conference on human–robot interaction (HRI), workshop on human robot collaboration (Vol. 381, pp. 1–8).

  29. Horvitz, E. (1999). Principles of mixed-initiative user interfaces. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 159–166). ACM.

  30. Huang, S. H., Held, D., Abbeel, P., & Dragan, A. D. (2019). Enabling robots to communicate their objectives. Autonomous Robots, 43(2), 309–326.

  31. Khan, O., Poupart, P., Black, J., Sucar, L., Morales, E., & Hoey, J. (2011). Automatically generated explanations for Markov decision processes. In Decision theory models for applications in AI: Concepts and solutions (pp. 144–163).

  32. Khan, O. Z., Poupart, P., & Black, J. P. (2009). Minimal sufficient explanations for factored Markov decision processes. In ICAPS.

  33. Kim, B., Khanna, R., & Koyejo, O. O. (2016). Examples are not enough, learn to criticize! Criticism for interpretability. In Advances in neural information processing systems (pp. 2280–2288).

  34. Kim, B., Rudin, C., & Shah, J. A. (2014). The Bayesian case model: A generative approach for case-based reasoning and prototype classification. In Advances in neural information processing systems (pp. 1952–1960).

  35. Kosti, S., Sarne, D., & Kaminka, G. A. (2014). A novel user-guided interface for robot search. In Proceedings of the international conference on intelligent robots and systems (IROS) (pp. 3305–3310).

  36. Lage, I., Lifschitz, D., Doshi-Velez, F., & Amir, O. (2019a). Exploring computational user models for agent policy summarization. In Proceedings of the 28th international joint conference on artificial intelligence (IJCAI).

  37. Lage, I., Lifschitz, D., Doshi-Velez, F., & Amir, O. (2019b). Toward robust policy summarization. In Proceedings of the 18th international conference on autonomous agents and multi-agent systems (AAMAS).

  38. Langley, P., Meadows, B., Sridharan, M., & Choi, D. (2017). Explainable agency for intelligent autonomous systems. In AAAI (pp. 4762–4764).

  39. Lin, R., Kraus, S., Agmon, N., Barrett, S., & Stone, P. (2011). Comparing agents’ success against people in security domains. In Proceedings of the twenty-fifth AAAI conference on artificial intelligence.

  40. Lin, R., Kraus, S., Oshrat, Y., & Gal, Y. K. (2010). Facilitating the evaluation of automated negotiators using peer designed agents. In Proceedings of the twenty-fourth AAAI conference on artificial intelligence.

  41. Lipton, Z. C. (2016). The mythos of model interpretability. arXiv preprint arXiv:1606.03490.

  42. Lomas, M., Chevalier, R., Cross II, E. V., Garrett, R. C., Hoare, J., & Kopack, M. (2012). Explaining robot actions. In Proceedings of the seventh annual ACM/IEEE international conference on human–robot interaction (pp. 187–188). ACM.

  43. Manisterski, E., Lin, R., & Kraus, S. (2008). Understanding how people design trading agents over time. In Proceedings of 7th international joint conference on autonomous agents and multiagent systems (AAMAS) (pp. 1593–1596).

  44. Mash, M., Lin, R., & Sarne, D. (2014). Peer-design agents for reliably evaluating distribution of outcomes in environments involving people. In Proceedings of the international conference on autonomous agents and multi-agent systems (AAMAS) (pp. 949–956).

  45. McGuinness, D. L., Glass, A., Wolverton, M., & Da Silva, P. P. (2007a). A categorization of explanation questions for task processing systems. In ExaCt (pp. 42–48).

  46. McGuinness, D. L., Glass, A., Wolverton, M., & Da Silva, P. P. (2007b). Explaining task processing in cognitive assistants that learn. In AAAI spring symposium: Interaction challenges for intelligent assistants (pp. 80–87).

  47. Myers, K. L. (2006). Metatheoretic plan summarization and comparison. In ICAPS (pp. 182–192).

  48. Nikolaidis, S., & Shah, J. (2013). Human–robot cross-training: Computational formulation, modeling and evaluation of a human team training strategy. In Proceedings of the 8th ACM/IEEE international conference on human–robot interaction (pp. 33–40). IEEE Press.

  49. Norman, D. A. (1983). Some observations on mental models. Mental Models, 7(112), 7–14.

  50. Olsen, D. R., & Goodrich, M. A. (2003). Metrics for evaluating human–robot interactions. In Proceedings of PERMIS (Vol. 2003, p. 4).

  51. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016a). Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386.

  52. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016b). Why should I trust you?: Explaining the predictions of any classifier. In Proceedings of ACM international conference on knowledge discovery and data mining (pp. 1135–1144). ACM.

  53. Sampedro, C., Rodriguez-Ramos, A., Bavle, H., Carrio, A., de la Puente, P., & Campoy, P. (2018). A fully-autonomous aerial robot for search and rescue applications in indoor environments using learning-based techniques. Journal of Intelligent & Robotic Systems. https://doi.org/10.1007/s10846-018-0898-1.

  54. Scherer, J., Yahyanejad, S., Hayat, S., Yanmaz, E., Andre, T., Khan, A., Vukadinovic, V., Bettstetter, C., Hellwagner, H., & Rinner, B. (2015). An autonomous multi-UAV system for search and rescue. In Proceedings of the first workshop on micro aerial vehicle networks, systems, and applications for civilian use, DroNet ’15 (pp. 33–38). ACM, New York, NY, USA. https://doi.org/10.1145/2750675.2750683.

  55. Seegebarth, B., Müller, F., Schattenberg, B., & Biundo, S. (2012). Making hybrid plans more clear to human users: A formal approach for generating sound explanations. In Twenty-second international conference on automated planning and scheduling.

  56. Selten, R., Mitzkewitz, M., & Uhlich, G. (1997). Duopoly strategies programmed by experienced players. Econometrica, 65(3), 517–555.

  57. Sohrabi, S., Baier, J. A., & McIlraith, S. A. (2011). Preferred explanations: Theory and generation via planning. In AAAI.

  58. Sreedharan, S., Srivastava, S., & Kambhampati, S. (2018). Hierarchical expertise level modeling for user specific contrastive explanations. In IJCAI (pp. 4829–4836).

  59. Stone, P., Brooks, R., Brynjolfsson, E., Calo, R., Etzioni, O., Hager, G., Hirschberg, J., Kalyanakrishnan, S., Kamar, E., Kraus, S., Leyton-Brown, K., Parkes, D., Press, W., Saxenian, A., Shah, J., Tambe, M., & Teller, A. (2016). Artificial intelligence and life in 2030. One hundred year study on artificial intelligence: Report of the 2015–2016 study panel.

  60. Stubbs, K., Hinds, P. J., & Wettergreen, D. (2007). Autonomy and common ground in human–robot interaction: A field study. IEEE Intelligent Systems, 22(2), 42–50.

  61. Sun, J., Li, B., Jiang, Y., & Wen, C. (2016). A camera-based target detection and positioning UAV system for search and rescue (SAR) purposes. Sensors, 16(11). https://doi.org/10.3390/s16111778. http://www.mdpi.com/1424-8220/16/11/1778.

  62. Tomic, T., Schmid, K., Lutz, P., Domel, A., Kassecker, M., Mair, E., et al. (2012). Toward a fully autonomous UAV: Research platform for indoor and outdoor urban search and rescue. IEEE Robotics Automation Magazine, 19(3), 46–56. https://doi.org/10.1109/MRA.2012.2206473.

  63. Torrey, L., & Taylor, M. (2013). Teaching on a budget: Agents advising agents in reinforcement learning. In Proceedings of the 2013 international conference on autonomous agents and multi-agent systems (pp. 1053–1060).

  64. Urieli, D., & Stone, P. (2014). TacTex’13: A champion adaptive power trading agent. In Proceedings of the twenty-eighth conference on artificial intelligence (AAAI’14) (pp. 465–471).

  65. Velagapudi, P., Wang, J., Wang, H., Scerri, P., Lewis, M., & Sycara, K. (2008). Synchronous vs. asynchronous video in multi-robot search. In ACHI’08 (pp. 224–229).

  66. Vellido, A., Martín-Guerrero, J. D., & Lisboa, P. J. (2012). Making machine learning models interpretable. ESANN, 12, 163–172.

  67. Wang, H., Kolling, A., Brooks, N., Owens, S., Abedin, S., Scerri, P., Lee, P., Chien, S. Y., Lewis, M., & Sycara, K. (2011). Scalable target detection for large robot teams. In HRI’11 (pp. 363–370). https://doi.org/10.1145/1957656.1957792.

  68. Wang, N., Pynadath, D. V., & Hill, S. G. (2016). The impact of POMDP-generated explanations on trust and performance in human–robot teams. In Proceedings of the 2016 international conference on autonomous agents & multiagent systems (pp. 997–1005).

  69. Wang, H., Velagapudi, P., Scerri, P., Sycara, K., & Lewis, M. (2009). Using humans as sensors in robotic search. In FUSION’09 (pp. 1249–1256).

  70. Wellman, M., Greenwald, A., & Stone, P. (2007). Autonomous bidding agents—Strategies and lessons from the trading agent competition. Cambridge: MIT Press.

  71. Yanco, H. A., & Drury, J. L. (2006). Rescuing interfaces: A multi-year study of human–robot interaction at the AAAI robot rescue competition. Autonomous Robots, 22(4), 333–352. https://doi.org/10.1007/s10514-006-9016-5.

  72. Yang, Z., Bai, S., Zhang, L., & Torr, P. H. (2018). Learn to interpret atari agents. arXiv preprint arXiv:1812.11276.

  73. Zhang, Y., Sreedharan, S., Kulkarni, A., Chakraborti, T., Zhuo, H. H., & Kambhampati, S. (2017). Plan explicability and predictability for robot task planning. In 2017 IEEE international conference on robotics and automation (ICRA) (pp. 1313–1320). IEEE.

  74. Zhu, X. (2015). Machine teaching: An inverse problem to machine learning and an approach toward optimal education. In AAAI (pp. 4083–4087).

  75. Zuckerman, I., Cheng, K. L., & Nau, D. S. (2018). Modeling agent’s preferences by its designer’s social value orientation. Journal of Experimental & Theoretical Artificial Intelligence, 30(2), 257–277. https://doi.org/10.1080/0952813X.2018.1430856.

Acknowledgements

The research was partially supported by a J.P. Morgan faculty research award and by the Israel Science Foundation (Grant No. 1162/17).

Author information

Corresponding author: Ofra Amir.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Amir, O., Doshi-Velez, F. & Sarne, D. Summarizing agent strategies. Auton Agent Multi-Agent Syst 33, 628–644 (2019). https://doi.org/10.1007/s10458-019-09418-w
