Abstract
Agents (hardware or software) that act autonomously in an environment must integrate three basic behaviors: planning, execution, and learning. This integration is mandatory when the agent does not know how its actions affect the environment, how the environment reacts to those actions, or when it does not receive the goals it must achieve as explicit input. Without an a priori theory, an autonomous agent should be able to self-propose goals, set up plans for achieving those goals according to previously learned models of itself and the environment, and learn those models from past experiences of successful and failed plan executions. Planning involves selecting a goal and computing a set of actions that will allow the agent to achieve it. Execution deals with the interaction with the environment: applying the planned actions, observing the resulting perceptions, and verifying that the goals have been achieved. Learning is needed to predict the environment's reactions to the agent's actions, thus guiding the agent to achieve its goals more efficiently.
In this context, most learning systems applied to problem solving have been used to learn control knowledge for guiding the search for a plan; few systems have focused on acquiring planning operator descriptions. For example, one of the most widely used techniques for integrating (a form of) planning, execution, and learning is reinforcement learning. However, reinforcement learning methods usually do not maintain explicit representations of action descriptions, so they cannot reason in terms of goals and ways of achieving those goals.
In this paper, we present an integrated architecture, LOPE, that learns operator definitions, plans using those operators, and executes the plans to refine the acquired operators. The resulting system is domain-independent, and we have performed experiments in a robotic framework. The results clearly show that the integrated planning, learning, and executing system outperforms the basic planner in that domain.
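The plan–execute–learn cycle described in the abstract can be sketched as follows. This is a minimal illustrative toy, not the actual LOPE implementation: the `OperatorModel` class, the one-dimensional "line world" environment, and the breadth-first planner are all assumptions made for this example, and LOPE's real operators additionally carry conditions, effects, and success estimates.

```python
from collections import deque

class OperatorModel:
    """Toy learned action model: maps (state, action) to observed effects.
    Purely illustrative of the idea of learning operators from experience."""

    def __init__(self, actions):
        self.actions = actions
        self.transitions = {}  # (state, action) -> {next_state: count}

    def observe(self, s, a, s2):
        # Record one execution outcome, reinforcing the matching operator.
        counts = self.transitions.setdefault((s, a), {})
        counts[s2] = counts.get(s2, 0) + 1

    def predict(self, s, a):
        counts = self.transitions.get((s, a))
        if not counts:
            return None  # this action was never tried in this state
        return max(counts, key=counts.get)  # most frequently observed effect

    def plan(self, start, goal):
        """Breadth-first search over the learned model; returns an action
        sequence, or None when the model cannot connect start to goal."""
        frontier = deque([(start, [])])
        seen = {start}
        while frontier:
            s, path = frontier.popleft()
            if s == goal:
                return path
            for a in self.actions:
                s2 = self.predict(s, a)
                if s2 is not None and s2 not in seen:
                    seen.add(s2)
                    frontier.append((s2, path + [a]))
        return None

def line_world(s, a):
    """Deterministic toy environment: positions 0..4, actions -1 / +1."""
    return min(max(s + a, 0), 4)

# 1. Learning phase: experiment with each action in each state, recording
#    the observed effects as operator instances.
model = OperatorModel(actions=[-1, +1])
for s in range(5):
    for a in model.actions:
        model.observe(s, a, line_world(s, a))

# 2. Planning phase: compute a plan from the learned operators.
plan = model.plan(0, 4)

# 3. Execution phase: apply the plan, checking each prediction and feeding
#    the observed effect back into the model.
s = 0
for a in plan:
    predicted = model.predict(s, a)
    s2 = line_world(s, a)
    model.observe(s, a, s2)   # reinforce (or revise) the operator
    assert predicted == s2    # deterministic world: predictions hold
    s = s2

print(plan, s)  # -> [1, 1, 1, 1] 4
```

In a nondeterministic or partially known world, the execution phase is where a prediction failure would trigger operator revision rather than an assertion, which is the role plan execution plays in refining the acquired operators.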
Cite this article
García-Martínez, R., Borrajo, D. An Integrated Approach of Learning, Planning, and Execution. Journal of Intelligent and Robotic Systems 29, 47–78 (2000). https://doi.org/10.1023/A:1008134010576