arXiv 2312.14925
A Survey of Reinforcement Learning from Human Feedback
By Timo Kaufmann, Paul Weng, et al.
Published 2023-12-22
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning (RL) that learns from human feedback instead of relying on an engineered reward function. Building on prior work on the related setting of preference-based reinforcement learning (PbRL), it stands at the intersection of artificial intelligence and human-computer interaction. This positioning offers a promising avenue to enhance…