Abstract
We investigate the TREC-CDS 2016 test collection as a new resource for citation context and citation-based IR experiments. The collection contains more than 1.25 million biomedical full-text articles in XML. We find that a citation index can easily be extracted and that citation contexts can easily be identified. We conduct initial experiments to determine the optimal citation context window size in this domain and collection. Surprisingly, we find that quite long citation contexts of more than 250 words yield the best performance when combined linearly with the full text and with moderate weight on the citation contexts.
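As a rough illustration of the two ideas named in the abstract, a fixed-size citation context window and a linear combination of full-text and citation-context scores, a minimal Python sketch follows. The function names, the symmetric window around the citation marker, and the example weight are assumptions made for illustration only; the paper's actual extraction and weighting procedure is not described on this page.

```python
from typing import List


def context_window(tokens: List[str], cite_pos: int, size: int) -> List[str]:
    """Return the citation marker at cite_pos plus up to size // 2 words on each side.

    Assumption: the window is symmetric and clipped at the document boundaries.
    """
    half = size // 2
    start = max(0, cite_pos - half)
    end = min(len(tokens), cite_pos + half + 1)
    return tokens[start:end]


def combine(fulltext_score: float, context_score: float, weight: float) -> float:
    """Linearly interpolate two retrieval scores.

    `weight` is the share given to the citation-context score;
    the remainder (1 - weight) goes to the full-text score.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * fulltext_score + weight * context_score


if __name__ == "__main__":
    # Hypothetical sentence with a citation marker "[12]" at token index 3.
    words = "prior work shows [12] that long contexts help retrieval".split()
    print(context_window(words, cite_pos=3, size=4))
    # Hypothetical scores and a moderate weight on the citation contexts.
    print(combine(fulltext_score=2.0, context_score=4.0, weight=0.25))
```

In this sketch, a larger `size` corresponds to the longer citation contexts discussed in the abstract, and `weight` plays the role of the moderate weight placed on the citation contexts.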
Original language | English |
---|---|
Journal | CEUR Workshop Proceedings |
Volume | 2345 |
Pages (from-to) | 51-63 |
Number of pages | 13 |
ISSN | 1613-0073 |
Publication status | Published - 1 Jan 2019 |
Event | 8th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2019 - Cologne, Germany. Duration: 14 Apr 2019 → … |
Conference
Conference | 8th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2019 |
---|---|
Country/Territory | Germany |
City | Cologne |
Period | 14/04/2019 → … |
Keywords
- Citation context windows
- Citation contexts for IR
- TREC-CDS 2016