Inverse Reinforcement Learning through Structured Classification

Edouard Klein; Matthieu Geist; Bilal Piot; Olivier Pietquin

Conference paper, Year: 2012

Edouard Klein

Role: Author
PersonId : 901877

IMS : Information, Multimodalité & Signal

Matthieu Geist

Role: Author
PersonId : 6945
IdHAL : matthieu-geist

IMS : Information, Multimodalité & Signal

Bilal Piot

Role: Author

IMS : Information, Multimodalité & Signal

Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

IMS : Information, Multimodalité & Signal

Abstract

This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multi-class classifier. This approach produces a reward function for which the expert policy is provably near-optimal. Unlike most existing IRL algorithms, SCIRL does not require solving the direct RL problem. Moreover, with an appropriate heuristic, it can succeed with only trajectories sampled according to the expert behavior. This is illustrated on a car driving simulator.
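The principle stated in the abstract, using the expert's feature expectation to parameterize a classifier's score function, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the function names, the two-dimensional features, the action labels, the Monte Carlo estimate of the feature expectation from a single trajectory, the linear form of the score and reward, and the hand-picked weight vector `theta`. Training of `theta` (the classification step itself) is deliberately left out.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def discounted_feature_sums(features: Sequence[Sequence[float]], gamma: float) -> List[Vector]:
    """For each step t, return sum over k >= t of gamma**(k - t) * features[k].

    Assumption: this single-trajectory sum stands in for the expert's
    feature expectation at the state visited at step t.
    """
    if not features:
        return []
    running: Vector = [0.0] * len(features[0])
    sums: List[Vector] = [[] for _ in features]
    # Walk the trajectory backwards so each step reuses the tail sum.
    for t in range(len(features) - 1, -1, -1):
        running = [f + gamma * r for f, r in zip(features[t], running)]
        sums[t] = running
    return sums


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def greedy_action(theta: Sequence[float], mu_by_action: Dict[str, Vector]) -> str:
    """Linear multi-class decision rule: pick the action with the highest score theta . mu(s, a)."""
    return max(mu_by_action, key=lambda a: dot(theta, mu_by_action[a]))


def reward(theta: Sequence[float], phi_s: Sequence[float]) -> float:
    """Assumption: the learned reward is linear in the same state features."""
    return dot(theta, phi_s)


# Hypothetical three-step expert trajectory with two features per state.
phi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mu = discounted_feature_sums(phi, gamma=0.5)

# Hypothetical weights and per-action feature expectations for one state.
theta = [1.0, -1.0]
mu_by_action = {"keep_lane": mu[0], "change_lane": mu[1]}
chosen = greedy_action(theta, mu_by_action)
```

In this sketch the same vector `theta` serves both as the classifier's weights and as the reward parameters, which is why fitting the classifier on expert state-action pairs would yield a reward without solving the direct RL problem in the loop.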

Domains

Machine Learning [cs.LG]

Main file

NIPS2012_0491.pdf (155.69 KB)

Origin: Files produced by the author(s)

Contributor: Sébastien Van Luchene

https://centralesupelec.hal.science/hal-00778624

Submitted on: Monday, January 21, 2013, 11:26:06

Last modified on: Tuesday, February 14, 2023, 03:36:09

Long-term archiving on: Monday, April 22, 2013, 03:52:32

Dates and versions

hal-00778624, version 1 (21-01-2013)

Identifiers

HAL Id: hal-00778624, version 1

Cite

Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin. Inverse Reinforcement Learning through Structured Classification. NIPS 2012, Dec 2012, Lake Tahoe, Nevada, United States. pp.1-9. ⟨hal-00778624⟩

Collections

SUPELEC SUP_IMS CENTRALESUPELEC

5986 views

634 downloads
