Reproducibility and Analysis of Scientific Dataset Recommendation Methods

Ornella Irrera*, Matteo Lissandrini, Daniele Dell'Aglio, Gianmaria Silvello

*Corresponding author

Publication: Contribution to book/anthology/report/conference proceeding > Conference article in proceedings > Research > Peer-reviewed

Abstract

Datasets play a central role in scholarly communications. However, scholarly graphs are often incomplete, particularly due to the lack of connections between publications and datasets. Therefore, the importance of dataset recommendation—identifying relevant datasets for a scientific paper, an author, or a textual query—is increasing. Although various methods have been proposed for this task, their reproducibility remains unexplored, making it difficult to compare them with new approaches. We reviewed current recommendation methods for scientific datasets, focusing on the most recent and competitive approaches, including an SVM-based model, a bi-encoder retriever, a method leveraging co-authors and citation network embeddings, and a heterogeneous variational graph autoencoder. These approaches underwent a comprehensive analysis under consistent experimental conditions. Our reproducibility efforts show that three methods can be reproduced, while the graph variational autoencoder is challenging due to unavailable code and test datasets. Hence, we re-implemented this method and performed a component-based analysis to examine its strengths and limitations. Furthermore, our study indicated that three out of four considered methods produce subpar results when applied to real-world data instead of specialized datasets with ad-hoc features.
Original language: English
Title: RecSys '24: Proceedings of the 18th ACM Conference on Recommender Systems
Number of pages: 10
Place of publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Publication date: Oct. 2024
Pages: 570-579
ISBN (electronic): 9798400705052
DOI
Status: Published - Oct. 2024
Event: RecSys '24: 18th ACM Conference on Recommender Systems - Bari, Italy
Duration: 14 Oct. 2024 to 18 Oct. 2024
https://dl.acm.org/doi/proceedings/10.1145/3640457

Conference

Conference: RecSys '24: 18th ACM Conference on Recommender Systems
Country/Territory: Italy
City: Bari
Period: 14/10/2024 to 18/10/2024
Internet address
