TY - CONF
T1 - Evaluation in Context
AU - Kamps, Jaap
AU - Lalmas, Mounia
AU - Larsen, Birger
N1 - Host publication editors: Maristella Agosti, José Borbinha, Sarantos Kapidakis, Christos Papatheodorou, Giannis Tsakonas. Series: Lecture Notes in Computer Science, Springer, 5714, 978-3-642-04345
PY - 2009
Y1 - 2009
N2 - All search happens in a particular context, such as the particular collection of a digital library, its associated search tasks, and its associated users. Information retrieval researchers usually agree on the importance of context, but they rarely address the issue. In particular, evaluation in the Cranfield tradition requires abstracting away from individual differences between users. This paper investigates whether we can bring some of this context into the Cranfield paradigm. Our approach is the following: we attempt to record the "context" of the humans already in the loop (the topic authors/assessors) by designing targeted questionnaires. The questionnaire data becomes part of the evaluation test suite as valuable data on the context of the search requests. We have experimented with this questionnaire approach during the evaluation campaign of the INitiative for the Evaluation of XML Retrieval (INEX). The results of this case study demonstrate the viability of the questionnaire approach as a means to capture context in evaluation. This can help explain and control some of the user or topic variation in the test collection. Moreover, it allows us to break down the set of topics into various meaningful categories, e.g. those that suit a particular task scenario, and zoom in on the relative performance for such a group of topics.
M3 - Paper without publisher/journal
ER -