Reinforcement Learning Approaches and Evaluation Criteria for Opportunistic Spectrum Access
Abstract
This paper addresses learning and decision making for cognitive radio (CR). It compares two reinforcement-learning algorithms from the literature for opportunistic spectrum access (OSA): the Upper Confidence Bound (UCB) algorithm and the Weight Driven (WD) algorithm. It also introduces two new metrics for evaluating the performance of machine-learning algorithms in CR: effective cumulative regret and percentage of successful trials. These metrics provide a fair means of evaluating CR performance.