TY - GEN
T1 - Evaluation in context
AU - Kamps, Jaap
AU - Lalmas, Mounia
AU - Larsen, Birger
PY - 2009/12/1
Y1 - 2009/12/1
AB - All search happens in a particular context - such as the particular collection of a digital library, its associated search tasks, and its associated users. Information retrieval researchers usually agree on the importance of context, but they rarely address the issue. In particular, evaluation in the Cranfield tradition requires abstracting away from individual differences between users. This paper investigates whether we can bring some of this context into the Cranfield paradigm. Our approach is as follows: we attempt to record the "context" of the humans already in the loop - the topic authors/assessors - by designing targeted questionnaires. The questionnaire data becomes part of the evaluation test suite as valuable data on the context of the search requests. We have experimented with this questionnaire approach during the evaluation campaign of the INitiative for the Evaluation of XML Retrieval (INEX). The results of this case study demonstrate the viability of the questionnaire approach as a means to capture context in evaluation. This can help explain and control some of the user or topic variation in the test collection. Moreover, it allows us to break down the set of topics into various meaningful categories, e.g., those that suit a particular task scenario, and to zoom in on the relative performance for such a group of topics.
UR - http://www.scopus.com/inward/record.url?scp=77952058526&partnerID=8YFLogxK
DO - 10.1007/978-3-642-04346-8_33
M3 - Article in proceedings
AN - SCOPUS:77952058526
SN - 3642043453
SN - 9783642043451
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 339
EP - 351
BT - Research and Advanced Technology for Digital Libraries - 13th European Conference, ECDL 2009, Proceedings
T2 - 13th European Conference on Research and Advanced Technology for Digital Libraries, ECDL 2009
Y2 - 27 September 2009 through 2 October 2009
ER -