arXiv 2310.18913

Debiasing Algorithm through Model Adaptation

By Tomasz Limisiewicz, David Mareček, et al.

Published 2023-10-29

Large language models are becoming the go-to solution for an ever-growing number of tasks. However, with growing capacity, models are prone to relying on spurious correlations stemming from biases and stereotypes present in the training data. This work proposes a novel method for detecting and mitigating gender bias in language models. We perform causal analysis to identify problematic model components and discover th…
