arXiv 2209.09419

Multi-armed Bandit Learning on a Graph

By Tianpeng Zhang, Kasper Johansson, et al.

Published 2022-09-20

The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotics, selecting an arm corresponds to a physical action that constrains which arms (actions) are available next. Motivated by this, we study an extension of MAB called the graph bandit, where an…
