Abstract
Large Language Models (LLMs) reproduce and exacerbate the social biases present in their training data, and resources to quantify this issue are limited. While research has attempted to identify and mitigate such biases, most efforts have concentrated on English, lagging behind the rapid advancement of LLMs in multilingual settings. In this paper, we introduce SHADES, a new multilingual parallel dataset designed to help address this issue by examining culturally specific stereotypes that may be learned by LLMs. The dataset includes stereotypes from 20 regions around the world and 16 languages, spanning multiple identity categories subject to discrimination worldwide. We demonstrate its utility in a series of exploratory evaluations of both "base" and "instruction-tuned" language models. Our results suggest that stereotypes are consistently reflected across models and languages, with some languages and models exhibiting much stronger stereotype biases than others.
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies |
| Publisher | Association for Computational Linguistics |
| Publication date | 1 May 2025 |
| Pages | 11995-12041 |
| Publication status | Published - 1 May 2025 |
Projects
- Digital Twins for Abundant Feedback: Novel Feedback Paradigms via Explainable Multilingual Natural Language Processing
Bjerva, J. (PI), Lindsay, E. (PI) & Zhang, M. (Project Participant)
01/01/2024 → 31/12/2025
Project: Research