arXiv 2312.14925

A Survey of Reinforcement Learning from Human Feedback

By Timo Kaufmann, Paul Weng, et al.

Published 2023-12-22

Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function. Building on prior work on the related setting of preference-based reinforcement learning (PbRL), it stands at the intersection of artificial intelligence and human-computer interaction. This positioning offers a promising avenue to enhance…
