Covariant Policy Search

Bagnell, J. Andrew; Schneider, Jeff

doi:10.1184/R1/6552458.v1

journal contribution

Posted on 2003-01-01. Authored by J. Andrew Bagnell and Jeff Schneider.

We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information-geometric methods. This leads us to propose a natural metric on controller parameterization that results from considering the manifold of probability distributions over paths induced by a stochastic controller. Investigation of this approach leads to a covariant gradient ascent rule. Interesting properties of this rule are discussed, including its relation to actor-critic style reinforcement learning algorithms. The algorithms discussed here are computationally quite efficient and, on some interesting problems, lead to dramatic performance improvement over non-covariant rules.
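
As a rough illustration of what a covariant gradient ascent rule can look like, the sketch below preconditions a likelihood-ratio gradient estimate with an empirical Fisher information matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `natural_gradient_step`, its arguments, the `ridge` regularizer, and the use of sampled score vectors to estimate the metric are all hypothetical choices made for this example.

```python
import numpy as np


def natural_gradient_step(scores, returns, step_size=0.1, ridge=1e-6):
    """Return a Fisher-preconditioned parameter update (illustrative only).

    scores  : (N, d) array; row i is grad_theta log p(path_i | theta)
    returns : (N,) array; total reward observed on path_i
    ridge   : small diagonal term (an assumption) to keep the solve stable
    """
    scores = np.asarray(scores, dtype=float)
    returns = np.asarray(returns, dtype=float)
    n, d = scores.shape

    # Ordinary (non-covariant) likelihood-ratio gradient estimate.
    grad = scores.T @ returns / n

    # Empirical Fisher information: mean outer product of score vectors.
    fisher = scores.T @ scores / n + ridge * np.eye(d)

    # Covariant direction: solve F x = g instead of forming F^{-1}.
    return step_size * np.linalg.solve(fisher, grad)


# Toy data: four sampled paths, two controller parameters.
scores = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 0.0], [0.0, -2.0]])
returns = np.array([1.0, 1.0, -1.0, -1.0])
step = natural_gradient_step(scores, returns, step_size=1.0, ridge=0.0)
```

With `ridge=0.0`, applying an invertible linear change of parameters `A` to the controller transforms the resulting step by the same `A`, so the updated controller is the same under either parameterization. That is the sense in which such a rule is covariant; the plain gradient `grad` does not share this property.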

History

Date

2003-01-01

Keywords

Robotics

Licence

In Copyright

