arXiv 2310.18913
Debiasing Algorithm through Model Adaptation
By Tomasz Limisiewicz, David Mareček, et al.
Published 2023-10-29
Large language models are becoming the go-to solution for an ever-growing number of tasks. However, as their capacity grows, models become prone to relying on spurious correlations stemming from biases and stereotypes present in the training data. This work proposes a novel method for detecting and mitigating gender bias in language models. We perform causal analysis to identify problematic model components and discover th…