arXiv 2409.05283
On the Relationship between Truth and Political Bias in Language Models
By Suyash Fulay, William Brannon, et al.
Published 2024-09-09
Language model alignment research often attempts to ensure that models are not only helpful and harmless, but also truthful and unbiased. However, optimizing these objectives simultaneously can obscure how improving one aspect might impact the others. In this work, we focus on analyzing the relationship between two concepts essential in both language model alignment and political science: truthfulness and political…