Improved algorithms for linear stochastic bandits, NIPS, 2011. ,
Stochastic convex optimization with bandit feedback, NIPS, pp.1035-1043, 2011. ,
The continuum-armed bandit problem, SIAM J. Control Optim, vol.33, issue.6, pp.1926-1951, 1995. ,
Thompson sampling for contextual bandits with linear payoffs, ICML, 2013. ,
Online linear optimization and adaptive routing, J. Comput. Syst. Sci, vol.74, issue.1, pp.97-114, 2008. ,
Regret analysis of stochastic and nonstochastic multi-armed bandit problems. Foundations and Trends in Machine Learning, vol.5, pp.1-122, 2012. ,
Optimal adaptive policies for sequential allocation problems, Advances in Applied Mathematics, vol.17, issue.2, pp.122-142, 1996. ,
Revealing graph bandits for maximizing local influence, AISTATS, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01304020
Combinatorial bandits, J. Comput. Syst. Sci, vol.78, issue.5, pp.1404-1422, 2012. ,
Combinatorial multi-armed bandit: General framework and applications, ICML, 2013. ,
Learning to rank: Regret lower bound and efficient algorithms, SIGMETRICS, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01257894
Unimodal bandits: Regret lower bounds and optimal algorithms, ICML, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01092662
Combinatorial bandits revisited, NIPS, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01257796
Stochastic linear optimization under bandit feedback, COLT, 2008. ,
Thompson sampling for combinatorial bandits and its application to online feature selection, Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014. ,
Parametric bandits: The generalized linear case, NIPS, pp.586-594, 2010. ,
Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations, IEEE/ACM Trans. on Networking, vol.20, issue.5, pp.1466-1478, 2012. ,
The KL-UCB algorithm for bounded stochastic bandits and beyond, COLT, 2011. ,
Linear Optimization and Approximation, 1983. ,
Thompson sampling for complex online problems, ICML, 2014. ,
Asymptotically efficient adaptive choice of control laws in controlled markov chains, SIAM J. Control and Optimization, vol.35, issue.3, pp.715-743, 1997. ,
The on-line shortest path problem under partial monitoring, Journal of Machine Learning Research, vol.8, issue.10, 2007. ,
The n-armed bandit with unimodal structure, Metrika, vol.30, issue.1, pp.195-210, 1983. ,
An asymptotically optimal bandit algorithm for bounded support models, COLT, 2010. ,
On the complexity of best-arm identification in multi-armed bandit models, Journal of Machine Learning Research, vol.17, issue.1, pp.1-42, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01024894
Thompson sampling: An asymptotically optimal finite-time analysis, ALT, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00830033
Regret lower bound and optimal algorithm in dueling bandit problem, COLT, 2015. ,
Cascading bandits: Learning to rank in the cascade model, NIPS, 2015. ,
Tight regret bounds for stochastic combinatorial semi-bandits, AISTATS, 2015. ,
Asymptotically efficient adaptive allocation rules, Advances in Applied Mathematics, vol.6, issue.1, pp.4-22, 1985. ,
The end of optimism? an asymptotic analysis of finite-armed linear bandits, 2016. ,
, Lipschitz bandits: Regret lower bounds and optimal algorithms. COLT, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01092791
Some aspects of the sequential design of experiments, Herbert Robbins Selected Papers, pp.169-177, 1985. ,
Linearly parameterized bandits, Math. Oper. Res, vol.35, issue.2, 2010. ,
Efficient learning in large-scale combinatorial semi-bandits, ICML, 2015. ,
Unimodal bandits, ICML, 2011. ,