Abstract
This work presents a new algorithm, called Heuristically Accelerated Q–Learning (HAQL), that allows the use of heuristics to speed up the well-known Reinforcement Learning algorithm Q–learning. The HAQL algorithm is characterized by a heuristic function \(\mathcal{H}\) that influences the choice of actions. The heuristic function is strongly associated with the policy: it indicates which action should be taken in preference to another. This work also proposes an automatic method, called Heuristic from Exploration, for extracting the heuristic function \(\mathcal{H}\) from the learning process. Finally, experimental results show that even a very simple heuristic significantly improves the performance of the reinforcement learning algorithm.
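The abstract does not spell out how \(\mathcal{H}\) enters the action choice, so the following is only a minimal sketch under stated assumptions: the heuristic value is added, scaled by a weight, to the Q-value estimate during greedy selection, exploration is epsilon-greedy, and the Q-learning update itself is left unchanged. The names `choose_action`, `q_update`, `xi`, and `epsilon` are hypothetical and chosen for illustration only.

```python
import random


def choose_action(q_row, h_row, xi=1.0, epsilon=0.1, rng=random):
    """Pick an action index for a single state.

    q_row:   Q-value estimates, one per action.
    h_row:   heuristic values H(s, a), one per action (assumed additive form).
    xi:      weight of the heuristic (hypothetical parameter).
    epsilon: probability of taking a uniformly random exploratory action.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Assumption: the heuristic biases the greedy choice, not the stored Q-values.
    scores = [q + xi * h for q, h in zip(q_row, h_row)]
    return max(range(len(scores)), key=scores.__getitem__)


def q_update(q_row, action, reward, next_q_row, alpha=0.1, gamma=0.9):
    """Standard Q-learning update; the heuristic does not appear in it."""
    target = reward + gamma * max(next_q_row)
    q_row[action] += alpha * (target - q_row[action])
    return q_row[action]
```

With `epsilon=0.0`, a heuristic that favors action 0 overrides a slightly higher Q-value for action 1, while setting all heuristic values to zero recovers the plain greedy choice. Keeping the heuristic out of `q_update` is a design choice of this sketch: it means a poor heuristic can misdirect exploration but does not alter the learned value estimates directly.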
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bianchi, R.A.C., Ribeiro, C.H.C., Costa, A.H.R. (2004). Heuristically Accelerated Q–Learning: A New Approach to Speed Up Reinforcement Learning. In: Bazzan, A.L.C., Labidi, S. (eds) Advances in Artificial Intelligence – SBIA 2004. SBIA 2004. Lecture Notes in Computer Science, vol 3171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28645-5_25
DOI: https://doi.org/10.1007/978-3-540-28645-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23237-7
Online ISBN: 978-3-540-28645-5
eBook Packages: Springer Book Archive