arXiv 2310.18913
Debiasing Algorithm through Model Adaptation
By Tomasz Limisiewicz, David Mareček, et al.
Published 2023-10-29
Wiki summary
Large language models are becoming the go-to solution for an ever-growing number of tasks. However, with growing capacity, models are prone to rely on spurious correlations stemming from biases and stereotypes present in the training data. This work proposes a novel method for detecting and mitigating gender bias in language models. We perform causal analysis to identify problematic model components and discover th…