arXiv 2511.04962

Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

By Zihao Yi, Qingxuan Jiang, et al.

Published 2025-11-07

Large Language Models (LLMs) are increasingly tasked with creative generation, including the simulation of fictional characters. However, their ability to portray non-prosocial, antagonistic personas remains largely unexamined. We hypothesize that the safety alignment of modern LLMs creates a fundamental conflict with the task of authentically role-playing morally ambiguous or villainous characters. To investigate t…
