arXiv 2209.09419

Multi-armed Bandit Learning on a Graph

By Tianpeng Zhang, Kasper Johansson, et al.

Published 2022-09-20

Wiki summary


The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotics, selecting an arm corresponds to a physical action that constrains which arms (actions) are available next. Motivated by this, we study an extension of MAB called the graph bandit, where an…
