Abstract
Bridging the performance gap between high- and low-resource languages has been the focus of much previous work. Typological features from databases such as the World Atlas of Language Structures (WALS) are a prime candidate for this, as such data exists even for very low-resource languages. However, previous work has only found minor benefits from using typological information. Our hypothesis is that a model trained in a cross-lingual setting will pick up on typological cues from the input data, thus overshadowing the utility of explicitly using such features. We verify this hypothesis by blinding a model to typological information, and investigate how cross-lingual sharing and performance are impacted. Our model is based on a cross-lingual architecture in which the latent weights governing the sharing between languages are learnt during training. We show that (i) preventing this model from exploiting typology severely reduces performance, while a control experiment reaffirms that (ii) encouraging sharing according to typology somewhat improves performance.
Original language | English |
---|---|
Title | Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics |
Editors | Paola Merlo, Jorg Tiedemann, Reut Tsarfaty |
Publisher | Association for Computational Linguistics |
Publication date | 21 Apr 2021 |
Pages | 480-486 |
DOI | |
Status | Published - 21 Apr 2021 |
Event | Conference of the European Chapter of the Association for Computational Linguistics. Duration: 21 Apr 2021 → 23 Apr 2021. Conference number: 16. https://2021.eacl.org/ |
Conference

Conference | Conference of the European Chapter of the Association for Computational Linguistics |
---|---|
Number | 16 |
Period | 21/04/2021 → 23/04/2021 |
Internet address | https://2021.eacl.org/ |