A study of factuality, objectivity and relevance: Three desiderata in large-scale information retrieval?

Christina Lioma, Birger Larsen, Ying-Wei Lu, Yong Huang

Publication: Contribution to book/anthology/report/conference proceeding, Conference article in proceedings, Research, peer-reviewed

4 Citations (Scopus)

Abstract

Much of the information processed by Information Retrieval (IR) systems is unreliable, biased, and generally untrustworthy [15, 45, 48]. Yet, factuality & objectivity detection is not a standard component of IR systems, even though it has been possible in Natural Language Processing (NLP) for the last decade. Motivated by this, we ask if and how factuality & objectivity detection may benefit IR. We answer this in two parts. First, we use state-of-the-art NLP to compute the probability of document factuality & objectivity in two TREC collections, and analyse its relation to document relevance. We find that factuality is strongly and positively correlated with document relevance, but objectivity is not. Second, we study the impact of factuality & objectivity on retrieval effectiveness by treating them as query-independent features that we combine with a competitive language modelling baseline. Experiments with 450 TREC queries show that factuality improves precision by more than 10% over strong baselines, especially for the type of uncurated data typically used in web search; objectivity gives mixed results. A clear overall trend is that document factuality & objectivity are much more beneficial to IR when searching uncurated (e.g. web) documents than curated ones (e.g. state documentation and newswire articles). To our knowledge, this is the first study of factuality & objectivity for back-end IR, contributing novel findings about the relation between relevance and factuality/objectivity, and statistically significant gains in retrieval effectiveness in the competitive web search task.
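
The idea of combining a query-independent feature with a language modelling baseline can be illustrated with a small sketch. The abstract does not state how the combination is done, so everything below is an assumption made for illustration: the baseline is taken to be a Dirichlet-smoothed query likelihood, the factuality probability is folded in as a weighted log prior, and the names query_log_likelihood, score_with_prior, rank, mu, weight and eps are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, coll_tf, coll_len, mu=2000.0):
    """Dirichlet-smoothed log P(q | d); an assumed baseline, not the paper's."""
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        p_coll = coll_tf.get(term, 0) / coll_len
        if p_coll == 0.0:
            continue  # term never seen in the collection: skip it
        score += math.log((tf[term] + mu * p_coll) / (len(doc_terms) + mu))
    return score


def score_with_prior(lm_log_score, prior_prob, weight=1.0, eps=1e-9):
    """Add a weighted log prior (e.g. the document's factuality probability)."""
    # eps keeps the logarithm finite when the estimated probability is 0
    return lm_log_score + weight * math.log(max(prior_prob, eps))


def rank(query_terms, docs, weight=1.0):
    """docs maps doc_id -> (list of terms, factuality probability)."""
    coll_tf = Counter(t for terms, _ in docs.values() for t in terms)
    coll_len = sum(coll_tf.values())
    scored = {
        doc_id: score_with_prior(
            query_log_likelihood(query_terms, terms, coll_tf, coll_len),
            prob,
            weight,
        )
        for doc_id, (terms, prob) in docs.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Toy data (invented): identical text, different factuality estimates.
docs = {
    "d1": ("the report lists measured results".split(), 0.2),
    "d2": ("the report lists measured results".split(), 0.9),
}
print(rank(["measured", "results"], docs))
```

Because the two toy documents have the same text, their baseline scores are equal and the ordering is decided only by the prior, which is the sense in which the feature is query independent. Setting weight to 0 recovers the baseline score unchanged.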

Original language: English
Title: Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies
Number of pages: 11
Publisher: Association for Computing Machinery
Publication date: 6 Dec 2016
Pages: 107-117
ISBN (Electronic): 978-1-4503-4617-7
Status: Published - 6 Dec 2016
Event: 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2016 - Shanghai, China
Duration: 6 Dec 2016 to 9 Dec 2016

Conference

Conference: 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, BDCAT 2016
Country/Territory: China
City: Shanghai
Period: 06/12/2016 to 09/12/2016
