Dynamic Ensembles in Named Entity Recognition for Historical Arabic Texts

Muhammad Majadly, Tomer Sagi

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

The use of Named Entity Recognition (NER) over archaic Arabic texts is steadily increasing. However, most tools were either developed for modern English or trained on English-language documents, and they are of limited use on historical Arabic text. Even Arabic NER tools are often trained on modern web-sourced text, making their fit for a historical task questionable. To mitigate the scarcity of historical Arabic NER resources, we propose a dynamic ensemble model that combines several learners. The dynamic aspect is achieved by applying predictors and features to the outputs of the NER algorithms to identify, in real time, which of them performs better on a specific task. We evaluate our approach against state-of-the-art Arabic NER and static ensemble methods on a novel historical Arabic NER task that we created. Our results show that our approach improves upon the state of the art and reaches an F-score of 0.8 on this challenging task.

Original language: English
Title of host publication: WANLP 2021 - 6th Arabic Natural Language Processing Workshop, Proceedings of the Workshop
Editors: Nizar Habash, Houda Bouamor, Hazem Hajj, Walid Magdy, Wajdi Zaghouani, Fethi Bougares, Nadi Tomeh, Ibrahim Abu Farha, Samia Touileb
Number of pages: 11
Publisher: Association for Computational Linguistics, ACL Anthology
Publication date: 2021
Pages: 115-125
ISBN (Electronic): 9781954085091
Publication status: Published - 2021
Externally published: Yes
Event: 6th Arabic Natural Language Processing Workshop, WANLP 2021 - Virtual, Kyiv, Ukraine
Duration: 19 Apr 2021 → …

Conference

Conference: 6th Arabic Natural Language Processing Workshop, WANLP 2021
Country/Territory: Ukraine
City: Virtual, Kyiv
Period: 19/04/2021 → …
Series: WANLP 2021 - 6th Arabic Natural Language Processing Workshop, Proceedings of the Workshop

Bibliographical note

Publisher Copyright:
© WANLP 2021 - 6th Arabic Natural Language Processing Workshop
