Journal article, IEEE Transactions on Neural Networks and Learning Systems, Year: 2013

An algorithmic Survey of Parametric Value Function Approximation

Abstract

Reinforcement learning is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. A recurrent subtopic of reinforcement learning is computing an approximation of this value function when the system is too large for an exact representation. This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific minimization method, generally stochastic gradient descent or a recursive least-squares approach.
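To make the abstract's vocabulary concrete, here is a minimal sketch of one well-known algorithm of this kind: semi-gradient TD(0) with a linear parametrization, which bootstraps on its own estimate of the next state's value and updates the parameters with a stochastic-gradient-style step. It is an illustration under stated assumptions, not code from the survey: the five-state random-walk chain, the two-dimensional feature map `features`, and the step size `alpha` are all hypothetical choices made for this example.

```python
import random


def features(state, n_states):
    """Hypothetical feature map: a bias term plus the normalized state index."""
    return [1.0, state / (n_states - 1)]


def value(theta, phi):
    """Linear value estimate: inner product of parameters and features."""
    return sum(t * p for t, p in zip(theta, phi))


def td0_linear(n_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Semi-gradient TD(0) on an assumed random-walk chain.

    States are 0..n_states-1; each step moves left or right with equal
    probability. Leaving the chain on the right yields reward 1, leaving
    on the left yields reward 0, and the episode ends.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(episodes):
        state = n_states // 2  # start each episode in the middle of the chain
        while True:
            next_state = state + rng.choice((-1, 1))
            terminal = next_state < 0 or next_state >= n_states
            reward = 1.0 if next_state >= n_states else 0.0
            phi = features(state, n_states)
            # Bootstrapping: the target uses the current estimate of the next state.
            v_next = 0.0 if terminal else value(theta, features(next_state, n_states))
            td_error = reward + gamma * v_next - value(theta, phi)
            # Stochastic-gradient-style update of the parameter vector.
            theta = [t + alpha * td_error * p for t, p in zip(theta, phi)]
            if terminal:
                break
            state = next_state
    return theta


if __name__ == "__main__":
    theta = td0_linear()
    for s in range(5):
        print(s, round(value(theta, features(s, 5)), 3))
```

On this assumed chain the true value function is linear in the state index, so the two-parameter model can represent it exactly; with a constant step size the estimates keep fluctuating around it rather than settling to a fixed point.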
Main file
vfa_survey.pdf (540.07 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00869725, version 1 (06-11-2017)

Cite

Matthieu Geist, Olivier Pietquin. An algorithmic Survey of Parametric Value Function Approximation. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24 (6), pp.845-867. ⟨10.1109/TNNLS.2013.2247418⟩. ⟨hal-00869725⟩

Collections

SUPELEC
