An Integrated Approach of Learning, Planning, and Execution

Journal of Intelligent and Robotic Systems

Abstract

Agents (hardware or software) that act autonomously in an environment must be able to integrate three basic behaviors: planning, execution, and learning. This integration is mandatory when the agent has no knowledge of how its actions affect the environment or how the environment reacts to them, or when the agent does not receive the goals it must achieve as an explicit input. Without an a priori theory, autonomous agents should be able to self-propose goals, set up plans for achieving those goals according to previously learned models of the agent and the environment, and learn those models from past experience of successful and failed plan executions. Planning involves selecting a goal to reach and computing a set of actions that will allow the autonomous agent to achieve it. Execution deals with the interaction with the environment: applying the planned actions, observing the resulting perceptions, and verifying that the goals have been achieved. Learning is needed to predict the reactions of the environment to the agent's actions, thus guiding the agent toward its goals more efficiently.
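To make this cycle concrete, the following is a minimal, self-contained sketch of a plan-execute-learn loop; the toy environment and all names (ToyEnv, Agent, propose_goal, and so on) are illustrative assumptions, not the architecture presented in the paper.

```python
import random

class ToyEnv:
    """A deterministic 5-state ring world, used only for illustration."""
    states = list(range(5))
    actions = [+1, -1]

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def execute(self, action):
        self.state = (self.state + action) % len(self.states)
        return self.state

class Agent:
    def __init__(self, env):
        self.env = env
        self.model = {}  # learned transition model: (state, action) -> next state

    def propose_goal(self):
        # Self-proposed goal: here, simply a randomly chosen state.
        return random.choice(self.env.states)

    def plan(self, state, goal):
        # Breadth-first search over the *learned* model; returns a list of actions.
        frontier = [(state, [])]
        visited = {state}
        while frontier:
            s, actions = frontier.pop(0)
            if s == goal:
                return actions
            for a in self.env.actions:
                nxt = self.model.get((s, a))
                if nxt is not None and nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, actions + [a]))
        return []  # the current model supports no plan to this goal

    def run(self, cycles):
        state = self.env.reset()
        for _ in range(cycles):
            goal = self.propose_goal()
            plan = self.plan(state, goal)
            if not plan:
                plan = [random.choice(self.env.actions)]  # explore when planning fails
            for action in plan:
                observed = self.env.execute(action)
                expected = self.model.get((state, action))
                self.model[(state, action)] = observed  # learn from the observed outcome
                if expected is not None and expected != observed:
                    break  # the model was wrong: abandon the plan and replan
                state = observed

agent = Agent(ToyEnv())
agent.run(cycles=50)
print(agent.model)  # the acquired (state, action) -> next-state predictions
```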

In this context, most learning systems applied to problem solving have been used to learn control knowledge for guiding the search for a plan; few have focused on acquiring descriptions of the planning operators themselves. For example, one of the techniques currently most used to integrate (a form of) planning, execution, and learning is reinforcement learning. However, reinforcement learning methods usually do not maintain an explicit representation of action descriptions, so they cannot reason in terms of goals and of ways of achieving those goals.
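For contrast, the snippet below shows a standard tabular Q-learning update, a common core of such reinforcement learning integrations. It is a generic sketch, not code from any system discussed here; note that the learned table holds only numeric state-action values, with no declarative description of what an action does to plan over.

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# The table contains only numbers: there is no structure describing action
# preconditions or effects, which is the limitation pointed out above.
Q = defaultdict(float)
q_update(Q, s=0, a=+1, r=1.0, s_next=1, actions=[+1, -1])
```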

In this paper, we present an integrated architecture, LOPE, that learns operator definitions, plans using those operators, and executes the plans in order to modify the acquired operators. The resulting system is domain-independent, and we have performed experiments in a robotic framework. The results clearly show that the integrated planning, learning, and execution system outperforms the basic planner in that domain.
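One plausible way to represent such a learned operator is a STRIPS-style record of preconditions and predicted effects, extended with execution statistics that execution feedback can revise. The sketch below is an illustration under that assumption; the field names and the reliability measure are ours, not the exact representation used by LOPE.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    action: str
    preconditions: frozenset  # facts that must hold for the action to apply
    effects: frozenset        # facts predicted to hold after execution
    times_used: int = 0
    times_succeeded: int = 0

    def record(self, succeeded: bool) -> None:
        """Update the operator's statistics after one execution."""
        self.times_used += 1
        if succeeded:
            self.times_succeeded += 1

    @property
    def reliability(self) -> float:
        # Fraction of executions whose observed effects matched the prediction.
        return self.times_succeeded / self.times_used if self.times_used else 0.0

op = Operator("push", frozenset({"at(robot, box)"}), frozenset({"moved(box)"}))
op.record(succeeded=True)
print(op.reliability)  # 1.0 after one successful execution
```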

Cite this article

García-Martínez, R., Borrajo, D. An Integrated Approach of Learning, Planning, and Execution. Journal of Intelligent and Robotic Systems 29, 47–78 (2000). https://doi.org/10.1023/A:1008134010576
