arXiv 2303.00001
Reward Design with Language Models
By Minae Kwon, Sang Michael Xie, et al.
Published 2023-02-27
Reward design in reinforcement learning (RL) is challenging because specifying human notions of desired behavior can be difficult with reward functions or can require many expert demonstrations. Can we instead design rewards cheaply using a natural language interface? This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 to act as a proxy reward function, where the user provides…
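To make the idea of an LLM acting as a proxy reward function concrete, here is a minimal Python sketch. It is not the paper's implementation: the prompt wording, the Yes/No parsing into a binary reward, and the names `make_proxy_reward`, `stub_llm`, and `episode_summary` are all assumptions made for illustration. The language model is represented by any callable that maps a prompt string to a completion string, so no real API is called.

```python
from typing import Callable


def make_proxy_reward(llm: Callable[[str], str], task_prompt: str) -> Callable[[str], int]:
    """Wrap a text-completion callable as a binary reward function.

    `llm` and `task_prompt` are hypothetical stand-ins: `llm` is any function
    from prompt text to completion text, and `task_prompt` is the user's
    natural-language statement of the desired behavior.
    """

    def reward(episode_summary: str) -> int:
        # Assumed prompt layout: user text, then a description of the episode,
        # then a question the model answers with Yes or No.
        prompt = (
            f"{task_prompt}\n\n"
            f"Episode: {episode_summary}\n"
            "Does this episode match the desired behavior? Answer Yes or No:"
        )
        answer = llm(prompt).strip().lower()
        # Assumed parsing rule: any answer starting with "yes" gives reward 1.
        return 1 if answer.startswith("yes") else 0

    return reward


def stub_llm(prompt: str) -> str:
    """Toy stand-in for a real model; it inspects only the episode line."""
    episode_line = [line for line in prompt.splitlines() if line.startswith("Episode:")][-1]
    return "Yes" if "split evenly" in episode_line else "No"


reward_fn = make_proxy_reward(stub_llm, "Desired behavior: the agent splits resources evenly.")
print(reward_fn("the agent split evenly with its partner"))
print(reward_fn("the agent kept everything"))
```

In an RL loop, a function like `reward_fn` would be called on a text description of each episode in place of a hand-coded reward. Swapping `stub_llm` for a call to an actual model is left out here because it depends on the provider.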