arXiv 2310.12931

Eureka: Human-Level Reward Design via Coding Large Language Models

By Yecheng Jason Ma, William Liang, et al.

Published 2023-10-19

Wiki summary

Large Language Models (LLMs) have excelled as high-level semantic planners for sequential decision-making tasks. However, harnessing them to learn complex low-level manipulation tasks, such as dexterous pen spinning, remains an open problem. We bridge this fundamental gap and present Eureka, a human-level reward design algorithm powered by LLMs. Eureka exploits the remarkable zero-shot generation, code-writing, and…