Optimal citation context window sizes for biomedical retrieval

Boris Lykke Nielsen, Stefan Lavlund Skau, Florian Meier, Birger Larsen*

*Corresponding author for this work

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

2 Citations (Scopus)
102 Downloads (Pure)

Abstract

We investigate the TREC-CDS 2016 test collections as a new resource for citation context and citation-based IR experiments. The collection contains more than 1.25 million biomedical full-text articles in XML. We find that a citation index can easily be extracted and that citation contexts can easily be identified. We conduct initial experiments to determine the optimal citation context window size in this domain and collection. Surprisingly, we find that quite long citation contexts of more than 250 words yield the best performance when combined linearly with the full text and with moderate weight on the citation contexts.
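The two ideas named in the abstract, a fixed-size word window around a citation and a linear combination of full-text and citation-context scores, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `context_window` and `combine`, the symmetric split of the window around the citation marker, and the interpolation form `(1 - weight) * fulltext + weight * context` are all assumptions made for illustration only.

```python
def context_window(text, marker, size):
    """Return up to `size` words around the first word containing `marker`.

    Assumption: the window is split evenly before and after the citation
    marker, and the marker itself is excluded from the returned context.
    """
    words = text.split()
    half = size // 2
    for i, word in enumerate(words):
        if marker in word:
            start = max(0, i - half)
            return words[start:i] + words[i + 1:i + 1 + half]
    return []  # marker not found


def combine(fulltext_score, context_score, weight):
    """Linearly interpolate two retrieval scores.

    Assumption: `weight` in [0, 1] is the share given to the
    citation-context score; the remainder goes to the full-text score.
    """
    return (1 - weight) * fulltext_score + weight * context_score


if __name__ == "__main__":
    sentence = "a b c d [1] e f g h"
    print(context_window(sentence, "[1]", 4))
    print(combine(2.0, 4.0, 0.25))
```

In this sketch, a moderate weight such as 0.25 keeps the full-text score dominant while still letting the citation contexts influence the ranking; the values shown are illustrative and are not taken from the paper.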

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 2345
Pages (from-to): 51-63
Number of pages: 13
ISSN: 1613-0073
Publication status: Published - 1 Jan 2019
Event: 8th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2019 - Cologne, Germany
Duration: 14 Apr 2019 → …

Conference

Conference: 8th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2019
Country/Territory: Germany
City: Cologne
Period: 14/04/2019 → …

Keywords

  • Citation context windows
  • Citation contexts for IR
  • TREC-CDS 2016
