Abstract: Bias in natural language processing manifests as disparities in error rates across author demographics, typically disadvantaging minority groups. Although dataset balancing has been shown to be effective in mitigating bias, existing approaches do not directly account for correlations between author demographics and linguistic variables. To achieve Equal Opportunity fairness, this paper introduces a simple but highly effective objective for countering bias using balanced training. We extend the method with a gated model that takes protected attributes as input, and show that it reduces bias in predictions under demographic input perturbation, outperforming all other bias mitigation techniques when combined with balanced training.
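To make the gated idea concrete, below is a minimal sketch of what a model that "incorporates protected attributes as input" could look like: a gate computed from the attribute modulates shared hidden features, so predictions can be probed by perturbing the demographic input at inference time. All names, dimensions, and the gating form (sigmoid elementwise gate) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def gated_forward(x, a, W_h, W_g, W_out):
    """One forward pass of a hypothetical gated classifier.

    x : (d,) text representation; a : (k,) one-hot protected attribute.
    A sigmoid gate derived from `a` scales the shared hidden features,
    so the attribute's influence on the prediction is an explicit,
    perturbable pathway.
    """
    h = np.tanh(W_h @ x)                   # shared hidden features from text
    g = 1.0 / (1.0 + np.exp(-(W_g @ a)))   # attribute-conditioned gate in (0, 1)
    return W_out @ (g * h)                 # gated logits

d, k, hdim, n_classes = 8, 2, 4, 3         # illustrative sizes
W_h = rng.normal(size=(hdim, d))
W_g = rng.normal(size=(hdim, k))
W_out = rng.normal(size=(n_classes, hdim))

x = rng.normal(size=d)
# Demographic input perturbation: same text, swapped protected attribute.
logits_a0 = gated_forward(x, np.array([1.0, 0.0]), W_h, W_g, W_out)
logits_a1 = gated_forward(x, np.array([0.0, 1.0]), W_h, W_g, W_out)
```

Comparing `logits_a0` against `logits_a1` is the perturbation probe: a debiased model should change its prediction little when only the demographic input flips.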
Paper Type: long
Consent To Share Data: yes