arXiv 2303.00001

Reward Design with Language Models

By Minae Kwon, Sang Michael Xie, et al.

Published 2023-02-27

Wiki summary

Reward design in reinforcement learning (RL) is challenging since specifying human notions of desired behavior may be difficult via reward functions or require many expert demonstrations. Can we instead cheaply design rewards using a natural language interface? This paper explores how to simplify reward design by prompting a large language model (LLM) such as GPT-3 as a proxy reward function, where the user provides…
