Abstract
In order to draw generalizable conclusions about the performance of multilingual models across languages, it is important to evaluate on a set of languages that captures linguistic diversity. Linguistic typology is increasingly used to justify language selection, inspired by language sampling in linguistics. However, justifications for ‘typological diversity’ exhibit great variation, as there seems to be no set definition, methodology or consistent link to linguistic typology. In this work, we provide a systematic insight into how previous work in the ACL Anthology uses the term ‘typological diversity’. Our two main findings are: 1) what is meant by typologically diverse language selection is not consistent and 2) the actual typological diversity of the language sets in these papers varies greatly. We argue that, when making claims about ‘typological diversity’, an operationalization of this should be included. A systematic approach that quantifies this claim, also with respect to the number of languages used, would be even better.
Original language | English |
---|---|
Title of host publication | SIGTYP 2024 - 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP, Proceedings of the Workshop |
Editors | Michael Hahn, Alexey Sorokin, Ritesh Kumar, Andreas Shcherbakov, Yulia Otmakhova, Jinrui Yang, Oleg Serikov, Priya Rani, Edoardo M. Ponti, Saliha Muradoglu, Rena Gao, Ryan Cotterell, Ekaterina Vylomova |
Number of pages | 3 |
Publisher | Association for Computational Linguistics |
Publication date | 17 Mar 2024 |
Pages | 75-77 |
ISBN (Print) | 979-8-89176-071-4 |
ISBN (Electronic) | 9798891760714 |
Publication status | Published - 17 Mar 2024 |
Event | The 18th Conference of the European Chapter of the Association for Computational Linguistics - Radisson Blu, St. Julian's, Malta. Duration: 17 Mar 2024 → 22 Mar 2024. https://2024.eacl.org/ |
Conference
Conference | The 18th Conference of the European Chapter of the Association for Computational Linguistics |
---|---|
Location | Radisson Blu |
Country/Territory | Malta |
City | St. Julian's |
Period | 17/03/2024 → 22/03/2024 |
Internet address | https://2024.eacl.org/ |
Projects
- 1 Active
- Multilingual Modelling for Resource-Poor Languages
Bjerva, J. (PI), Lent, H. C. (Project Participant), Chen, Y. (Project Participant), Ploeger, E. (Project Participant), Fekete, M. R. (Project Participant) & Lavrinovics, E. (Project Participant)
01/09/2022 → 31/08/2025
Project: Research
Activities
- 1 Conference presentations
- A Call for Consistency in Reporting Typological Diversity
Ploeger, E. (Lecturer)
22 Mar 2024
Activity: Talks and presentations › Conference presentations
Research output
- 1 Article in proceeding
- What is "Typological Diversity" in NLP?
Ploeger, E., Poelman, W., de Lhoneux, M. & Bjerva, J., Nov 2024, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP. Association for Computational Linguistics, p. 5681-5700 20 p.
Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review