Abstract
Understanding causality in text is crucial for intelligent agents. In this article, inspired by human causality learning, we propose an experience-based causality learning framework. Compared with traditional approaches, which handle the causality problem by relying on textual clues and linguistic resources, we are the first to use experience information for causality learning. Specifically, we first construct various scenarios for intelligent agents so that the agents can gain experience by interacting in these scenarios. Then, human participants build a number of causality-learning training instances for the agents based on these scenarios. Each instance contains two sentences and a label. Each sentence describes an event that an agent experienced in a scenario, and the label indicates whether the sentence (event) pair has a causal relation. Accordingly, we propose a model that infers causality in text using experience by accessing the event information that corresponds to the input sentence pair. Experimental results show that our method achieves impressive performance on the grounded causality corpus and significantly outperforms conventional approaches. Our work suggests that experience is very important for intelligent agents to understand causality.
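The instance format described in the abstract (two sentences plus a label) can be pictured with a minimal sketch. The class name, field names, helper function, and example sentences below are all hypothetical illustrations, not taken from the article or its corpus.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CausalityInstance:
    """One training instance: two event sentences and a causal label.

    All names here are assumptions made for illustration only.
    """
    sentence_a: str   # describes one event the agent experienced
    sentence_b: str   # describes another event from the same scenario
    is_causal: bool   # True if the sentence (event) pair has a causal relation


def label_counts(instances: List[CausalityInstance]) -> Dict[str, int]:
    """Count how many instances are labeled causal vs. non-causal."""
    causal = sum(1 for inst in instances if inst.is_causal)
    return {"causal": causal, "non_causal": len(instances) - causal}


# Invented example sentences; the real corpus content is not shown in the abstract.
examples = [
    CausalityInstance("The agent hit the glass block.", "The glass block broke.", True),
    CausalityInstance("The agent picked up a key.", "The sun went down.", False),
]

print(label_counts(examples))
```

A model of the kind the abstract describes would additionally look up the experienced event behind each sentence; that lookup is not sketched here because the abstract does not specify it.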