Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search

Bruno Scherrer; Matthieu Geist

doi:10.1007/978-3-662-44845-8_3

Conference paper, Year: 2014

Bruno Scherrer

Role: Author
PersonId: 1406
IdHAL: bruno-scherrer
IdRef: 073360708

Computational Radiology Laboratory [Boston]

Matthieu Geist

Role: Author
PersonId: 6945
IdHAL: matthieu-geist

Georgia Tech Lorraine [Metz]

Abstract

Local Policy Search is a popular reinforcement learning approach for handling large state spaces. Formally, it searches locally in a parameterized policy space in order to maximize the associated value function averaged over some pre-defined distribution. In general, the best one can hope for from such an approach is a local optimum of this criterion. The first contribution of this article is the following surprising result: if the policy space is convex, any (approximate) local optimum enjoys a global performance guarantee. Unfortunately, the convexity assumption is strong: it is not satisfied by commonly used parameterizations, and designing a parameterization that induces this property seems hard. A natural way to alleviate this issue is to derive an algorithm that solves the local policy search problem using a boosting approach (constrained to the convex hull of the policy space). The resulting algorithm turns out to be a slight generalization of conservative policy iteration; thus, our second contribution is to highlight an original connection between local policy search and approximate dynamic programming.

Domains

Computer Science [cs]; Engineering Sciences [physics]

Main file

supelec886.pdf (304.27 KB)

Origin: Files produced by the author(s)

Contributor: Sébastien Van Luchene

https://centralesupelec.hal.science/hal-01086345

Submitted on: Monday, 24 November 2014, 08:38:59

Last modified on: Tuesday, 9 April 2024, 11:58:06

Long-term archiving on: Wednesday, 25 February 2015, 10:15:51

Dates and versions

hal-01086345, version 1 (24-11-2014)

Identifiers

HAL Id: hal-01086345, version 1
DOI: 10.1007/978-3-662-44845-8_3

Cite

Bruno Scherrer, Matthieu Geist. Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search. ECMLPKDD 2014, Sep 2014, Nancy, France. pp. 35-50, ⟨10.1007/978-3-662-44845-8_3⟩. ⟨hal-01086345⟩

Collections

SUPELEC CNRS UNIV-FCOMTE CENTRALESUPELEC UMI-GTL
