arXiv 2010.16011
POMO: Policy Optimization with Multiple Optima for Reinforcement Learning
By Yeong-Dae Kwon, Jinho Choo, et al.
Published 2020-10-30
In neural combinatorial optimization (CO), reinforcement learning (RL) can turn a deep neural net into a fast, powerful heuristic solver of NP-hard problems. This approach has great potential in practical applications because it allows near-optimal solutions to be found without expert guides armed with substantial domain knowledge. We introduce Policy Optimization with Multiple Optima (POMO), an end-to-end approac…