arXiv 2209.09419
Multi-armed Bandit Learning on a Graph
By Tianpeng Zhang, Kasper Johansson, et al.
Published 2022-09-20
The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotics, selecting an arm corresponds to a physical action that constrains which arms (actions) are available next. Motivated by this, we study an extension of MAB called the graph bandit, where an…