arXiv 2511.04962

Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

By Zihao Yi, Qingxuan Jiang, et al.

Published 2025-11-07

Large Language Models (LLMs) are increasingly tasked with creative generation, including the simulation of fictional characters. However, their ability to portray non-prosocial, antagonistic personas remains largely unexamined. We hypothesize that the safety alignment of modern LLMs creates a fundamental conflict with the task of authentically role-playing morally ambiguous or villainous characters. To investigate t…
