arXiv 2010.16011

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

By Yeong-Dae Kwon, Jinho Choo, et al.

Published 2020-10-30

In neural combinatorial optimization (CO), reinforcement learning (RL) can turn a deep neural net into a fast, powerful heuristic solver of NP-hard problems. This approach has great potential in practical applications because it allows near-optimal solutions to be found without expert guides armed with substantial domain knowledge. We introduce Policy Optimization with Multiple Optima (POMO), an end-to-end approac…
