arXiv 2409.05283

On the Relationship between Truth and Political Bias in Language Models

By Suyash Fulay, William Brannon, et al.

Published 2024-09-09

Wiki summary

Language model alignment research often attempts to ensure that models are not only helpful and harmless, but also truthful and unbiased. However, optimizing these objectives simultaneously can obscure how improving one aspect might impact the others. In this work, we focus on analyzing the relationship between two concepts essential in both language model alignment and political science: truthfulness and political…