ABSTRACT
Since the earliest days of AI, games have attracted interest as a research platform. As the field developed, human-level competence in complex games became a target researchers worked to reach. Only relatively recently has this target been met for traditional tabletop games such as Backgammon, Chess and Go. This prompted a shift in research focus towards electronic games, which provide unique new challenges. As is often the case with AI research, these results are liable to be exaggerated or misrepresented by either authors or third parties. The extent to which these game benchmarks constitute "fair" competition between human and AI is also a matter of debate. In this paper, we review statements made by researchers and third parties in the general media and academic publications about these game benchmark results. We analyze what a fair competition would look like and suggest a taxonomy of dimensions to frame the debate about fairness in game contests between humans and machines. Ultimately, we argue that there is no completely fair way to compare human and AI performance on a game.
Index Terms
- Leveling the playing field: fairness in AI versus human game benchmarks