Online adaptation of dialogue systems
Abstract
This document is a report on online adaptation of dialogue systems (deliverable 1.5), due at month 36 of the CLASSIC project. It consists of four contributions. First, it demonstrates fast policy adaptation using the GP-SARSA algorithm applied to the Hidden Information State (HIS) dialogue manager. Second, it describes online adaptation of dialogue model parameters using the NBC algorithm within the Belief Update of Dialogue State (BUDS) dialogue manager. Third, it proposes the Kalman Temporal Differences algorithm for managing uncertainty in the estimate of the optimal value function. Finally, it details optimisation techniques for industrial spoken dialogue systems based on compliance-based reinforcement learning. Work related to this deliverable has been published in Gašić et al. (2010), Jurčíček et al. (2010b), Laroche et al. (2010b), and Geist and Pietquin (2010, 2011).