A cascaded supervised learning approach to inverse reinforcement learning

Edouard Klein; Bilal Piot; Matthieu Geist; Olivier Pietquin

doi:10.1007/978-3-642-40988-2_1

Conference paper, Year: 2013


Edouard Klein

Role: Author
PersonId : 901877

IMS : Information, Multimodalité & Signal

Bilal Piot

Role: Author

IMS : Information, Multimodalité & Signal

Matthieu Geist

Role: Author
PersonId : 6945
IdHAL : matthieu-geist

IMS : Information, Multimodalité & Signal

Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

IMS : Information, Multimodalité & Signal

Abstract

This paper considers the Inverse Reinforcement Learning (IRL) problem, that is, inferring a reward function for which a demonstrated expert policy is optimal. We propose to break the IRL problem down into two generic Supervised Learning steps: this is the Cascaded Supervised IRL (CSI) approach. A classification step that defines a score function is followed by a regression step that provides a reward function. A theoretical analysis shows that the demonstrated expert policy is near-optimal for the computed reward function. The CSI approach has two important advantages: it does not need to repeatedly solve a Markov Decision Process (MDP), and it can leverage existing techniques for classification and regression. It is furthermore empirically demonstrated to compare favorably to state-of-the-art approaches when using only transitions sampled according to the expert policy, up to the use of some heuristics. This is exemplified on two classical benchmarks (the mountain car problem and a highway driving simulator).
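
The two-step cascade described in the abstract can be sketched on a toy problem. This is a minimal tabular illustration under stated assumptions, not the authors' implementation: it assumes the classifier's score function is read as an action-value function, and that regression targets are obtained by inverting the Bellman equation, r = q(s, a) - gamma * max_b q(s', b). The four-state chain, the softmax classifier, the averaging regressor, and all function names are hypothetical choices made for illustration.

```python
import math

def fit_score_function(samples, n_states, n_actions, epochs=200, lr=0.5):
    """Classification step (assumed form): a tabular softmax classifier.

    Trained to predict the expert action from the state; its per-action
    scores q[s][a] play the role of the score function.
    """
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(epochs):
        for s, a, _next in samples:
            top = max(q[s])  # subtract the max for numerical stability
            exps = [math.exp(v - top) for v in q[s]]
            total = sum(exps)
            for b in range(n_actions):
                target = 1.0 if b == a else 0.0
                # Gradient ascent on the log-likelihood of the expert action.
                q[s][b] += lr * (target - exps[b] / total)
    return q

def regression_targets(samples, q, gamma):
    """Build one reward target per expert transition (assumed Bellman inversion)."""
    return [(s, a, q[s][a] - gamma * max(q[s_next]))
            for s, a, s_next in samples]

def fit_reward(targets):
    """Regression step (assumed form): tabular least squares.

    With one free parameter per (state, action) pair, the least-squares
    solution is the mean target of that pair. Pairs never visited by the
    expert are simply absent from the result.
    """
    sums, counts = {}, {}
    for s, a, r in targets:
        sums[(s, a)] = sums.get((s, a), 0.0) + r
        counts[(s, a)] = counts.get((s, a), 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical expert data: a 4-state chain where the expert always takes
# action 1 ("right"); state 3 is absorbing. Each item is (s, a, s_next).
samples = [(0, 1, 1), (1, 1, 2), (2, 1, 3), (3, 1, 3)]
gamma = 0.9

q = fit_score_function(samples, n_states=4, n_actions=2)
reward = fit_reward(regression_targets(samples, q, gamma))
greedy_policy = [max(range(2), key=lambda a, s=s: q[s][a]) for s in range(4)]
```

Note that neither step solves an MDP: the first is an ordinary classification fit and the second an ordinary regression fit, which is the property the abstract highlights. How rewards are assigned to state-action pairs the expert never visits is left open in this sketch.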

Domains

Machine Learning [stat.ML]

Main file

csi_irl.pdf (591.53 KB)

Origin: Files produced by the author(s)

Contributor: Sébastien Van Luchene

https://centralesupelec.hal.science/hal-00869804

Submitted on: Monday, November 6, 2017, 17:44:27

Last modified on: Monday, February 13, 2023, 08:47:47

Dates and versions

hal-00869804 , version 1 (06-11-2017)

Identifiers

HAL Id : hal-00869804 , version 1
DOI : 10.1007/978-3-642-40988-2_1

Cite

Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. A cascaded supervised learning approach to inverse reinforcement learning. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. pp.1-16, ⟨10.1007/978-3-642-40988-2_1⟩. ⟨hal-00869804⟩

Collections

SUPELEC

