Wiki-VEL: Visual Entity Linking for Structured Data on Wikimedia Commons

Philipp Bielefeld, Jasmin Geppert, Necdet Güven, Melna Treesa John, Adrian Ziupka, Lucie Aimée Kaffee, Russa Biswas, Gerard de Melo

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceedings; Research; peer-reviewed

Abstract

Describing images using structured data enables a wide range of automation tasks, such as search and organization, as well as downstream tasks, such as labeling images or training machine learning models. However, there is currently a lack of structured data labels for large image repositories such as Wikimedia Commons. To close this gap, we propose the task of Visual Entity Linking (VEL) for Wikimedia Commons, which involves predicting labels for Wikimedia Commons images based on Wikidata items as the label inventory. We create a novel dataset leveraging community-created structured data on Wikimedia Commons. Additionally, we fine-tune pre-trained models based on the CLIP architecture using this dataset. Although the best-performing models show promising results, the study also identifies key challenges of the data and the task.

Original language: English
Title: ALVR 2024 - 3rd Workshop on Advances in Language and Vision Research, Proceedings of the Workshop
Editors: Jing Gu, Tsu-Jui Fu, Drew Hudson, Asli Celikyilmaz, William Wang
Number of pages: 9
Publisher: Association for Computational Linguistics
Publication date: 2024
Pages: 186-194
ISBN (Electronic): 9798891761537
DOI
Status: Published - 2024
Event: 3rd Workshop on Advances in Language and Vision Research, ALVR 2024 - Bangkok, Thailand
Duration: 16 Aug 2024 → …

Conference

Conference: 3rd Workshop on Advances in Language and Vision Research, ALVR 2024
Country/Territory: Thailand
City: Bangkok
Period: 16/08/2024 → …

Bibliographical note

Publisher Copyright:
© 2024 Association for Computational Linguistics.
