Abstract
We investigate the TREC-CDS 2016 test collection as a new resource for citation context and citation-based IR experiments. The collection contains more than 1.25 million biomedical full-text articles in XML. We find that a citation index can easily be extracted and that citation contexts can easily be identified. We conduct initial experiments to determine the optimal citation context window size in this domain and collection. Surprisingly, we find that quite long citation contexts of more than 250 words yield the best performance when combined linearly with the full text and with moderate weight on the citation contexts.
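
The two ideas named in the abstract, a citation context window and a linear combination of full-text and citation-context scores, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the symmetric word window, and the interpolation form `(1 - w) * fulltext + w * context` are assumptions made for illustration only.

```python
from typing import List


def citation_context(tokens: List[str], cite_index: int, words_each_side: int) -> List[str]:
    """Return the tokens around a citation marker (hypothetical helper).

    Takes `words_each_side` tokens before and after position `cite_index`,
    clipped at the start and end of the document.
    """
    start = max(0, cite_index - words_each_side)
    end = min(len(tokens), cite_index + words_each_side + 1)
    return tokens[start:end]


def combined_score(fulltext_score: float, context_score: float, context_weight: float) -> float:
    """Linearly interpolate two retrieval scores (assumed form).

    `context_weight` is the weight on the citation-context score and is
    expected to lie in [0, 1]; the full-text score receives the remainder.
    """
    if not 0.0 <= context_weight <= 1.0:
        raise ValueError("context_weight must be between 0 and 1")
    return (1.0 - context_weight) * fulltext_score + context_weight * context_score


if __name__ == "__main__":
    words = "prior work reported similar gains [12] on a smaller biomedical corpus".split()
    marker = words.index("[12]")
    print(citation_context(words, marker, 2))
    print(combined_score(2.0, 4.0, 0.25))
```

In such a setup, the window size and the weight would be tuned on the collection's relevance judgments; the values used here are placeholders.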
Original language | English |
---|---|
Journal | CEUR Workshop Proceedings |
Volume | 2345 |
Pages (from-to) | 51-63 |
Number of pages | 13 |
ISSN | 1613-0073 |
Status | Published - 1 Jan 2019 |
Event | 8th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2019 - Cologne, Germany. Duration: 14 Apr 2019 → … |
Conference
Conference | 8th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2019 |
---|---|
Country/Territory | Germany |
City | Cologne |
Period | 14/04/2019 → … |