Gradual Language Model Adaptation Using Fine-Grained Typology

Marcell Richard Fekete*, Johannes Bjerva

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

Transformer-based language models (LMs) offer superior performance in a wide range of NLP tasks compared to previous paradigms. However, the vast majority of the world's languages do not have adequate training data available for monolingual LMs (Joshi et al., 2020). While the use of multilingual LMs might address this data imbalance, there is evidence that multilingual LMs struggle when adapted to resource-poor languages (Wu and Dredze, 2020), or to languages which have typological characteristics unseen by the LM (Üstün et al., 2022). Other approaches aim to adapt monolingual LMs to resource-poor languages that are related to the model language. However, findings conflict on whether language relatedness correlates with successful adaptation (de Vries et al., 2021) or not (Ács et al., 2021).

With gradual LM adaptation, the approach presented in this extended abstract, we add to the research direction of monolingual LM adaptation. Instead of direct adaptation to a target language, we propose adaptation in stages, first adapting to one or more intermediate languages before the final adaptation step. Inspired by principles of curriculum learning (Bengio et al., 2009), we search for an ideal ordering of languages that can result in improved LM performance on the target language. We follow evidence that typological similarity might correlate with the success of cross-lingual transfer (Pires et al., 2019; Üstün et al., 2022; de Vries et al., 2021), as we believe the success of this transfer is essential for successful model adaptation. Thus, we order languages based on the typological similarity between them. In our approach, we quantify typological similarity using structural vectors derived from counts of dependency links (Bjerva et al., 2019), as such fine-grained measures can give a more accurate picture of the typological characteristics of languages (Ponti et al., 2019).
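To make the ordering idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a structural vector is the relative frequency of each dependency link type, that similarity is measured with cosine similarity, and that intermediate languages are visited in order of increasing similarity to the target. The abstract does not specify any of these choices. The language names (`src`, `mid_a`, `mid_b`, `tgt`), the link types, and all counts are hypothetical toy values.

```python
import math

# Hypothetical link types: (dependency relation, head position). Assumption only.
LINK_TYPES = [
    ("amod", "head_final"),
    ("amod", "head_initial"),
    ("obj", "head_final"),
    ("obj", "head_initial"),
]

def structural_vector(link_counts, link_types=LINK_TYPES):
    """Convert raw dependency-link counts into relative frequencies."""
    total = sum(link_counts.get(t, 0) for t in link_types)
    if total == 0:
        return [0.0] * len(link_types)
    return [link_counts.get(t, 0) / total for t in link_types]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def adaptation_order(source, target, intermediates, vectors):
    """Source first, then intermediates by increasing similarity to the
    target, then the target itself (one possible ordering heuristic)."""
    ranked = sorted(
        intermediates,
        key=lambda lang: cosine(vectors[lang], vectors[target]),
    )
    return [source] + ranked + [target]

# Toy counts, invented purely for illustration.
toy_counts = {
    "src":   dict(zip(LINK_TYPES, [90, 10, 10, 90])),
    "mid_a": dict(zip(LINK_TYPES, [70, 30, 30, 70])),
    "mid_b": dict(zip(LINK_TYPES, [40, 60, 60, 40])),
    "tgt":   dict(zip(LINK_TYPES, [10, 90, 90, 10])),
}
vectors = {lang: structural_vector(c) for lang, c in toy_counts.items()}

order = adaptation_order("src", "tgt", ["mid_b", "mid_a"], vectors)
print(order)
```

With these toy counts, `mid_a` is less similar to `tgt` than `mid_b` is, so it comes earlier in the sequence. Other heuristics, such as choosing each next language by its similarity to the previous one, would fit the same description equally well.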

We believe that gradual LM adaptation may lead to improved LM performance on a range of resource-poor languages and typologically diverse languages. Additionally, it enables future research to evaluate the correlation between the success of cross-lingual transfer and various typological similarity measures.

Original language: English
Title of host publication: Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP
Number of pages: 6
Publisher: Association for Computational Linguistics
Publication date: May 2023
Pages: 153-158
Publication status: Published - May 2023
Event: The 17th Conference of the European Chapter of the Association for Computational Linguistics - Dubrovnik, Croatia
Duration: 2 May 2023 to 6 May 2023
https://2023.eacl.org/

Conference

Conference: The 17th Conference of the European Chapter of the Association for Computational Linguistics
Country/Territory: Croatia
City: Dubrovnik
Period: 02/05/2023 to 06/05/2023

Keywords

  • Natural Language Processing
  • Computational Typology
  • Artificial Intelligence (AI)
  • Low-resource settings
