Towards Comparing Recommendation to Multiple-Query Search Sessions for Talent Search

Mesut Kaya*, Toine Bogers

*Corresponding author for this work

Research output: Contribution to journal › Conference article in Journal › Research › peer-review


Query-level evaluation metrics such as nDCG that originate from the field of Information Retrieval (IR) have seen widespread adoption in the Recommender Systems (RS) community for comparing the quality of different ranked lists of recommendations with different levels of relevance to the user. However, the traditional (offline) RS evaluation paradigm is typically restricted to evaluating a single results list. In contrast, IR researchers have also developed evaluation metrics over the past decade for the session-based evaluation of more complex search tasks. Here, the sessions consist of multiple queries and multi-round search interactions, and the metrics evaluate the quality of the session as a whole. Despite the popularity of the more traditional single-list evaluation paradigm, RS can also be used to assist users with complex information access tasks. In this paper, we explore the usefulness of session-level evaluation metrics for evaluating and comparing the performance of both recommender systems and search engines. We show that, despite possible misconceptions that comparing both scenarios is akin to comparing apples to oranges, it is indeed possible to compare recommendation results from a single ranked list to the results from a whole search session. In doing so, we address the following questions: (1) how can we fairly and realistically compare the quality of an individual list of recommended items to the quality of an entire manual search session; and (2) how can we measure the contribution that the RS makes to the entire search session. We contextualize our claims by focusing on a particular complex search scenario: the problem of talent search. An example of professional search, talent search involves recruiters searching for relevant candidates given a specific job posting by issuing multiple queries in the course of a search session.
We show that it is possible to compare the search behavior and success of recruiters to that of a matchmaking recommender system that generates a single ranked list of relevant candidates for a given job posting. In particular, we adopt a session-based metric from IR and motivate how it can be used to perform valid and realistic comparisons of recommendation lists to multiple-query search sessions.
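To make the contrast between the two evaluation paradigms concrete, the sketch below computes list-level nDCG alongside sDCG (session DCG; Järvelin et al., 2008), one well-known session-based IR metric of the kind the abstract refers to. The abstract does not name the specific metric the paper adopts, so sDCG and the relevance grades here are purely illustrative assumptions, not the paper's method.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a single ranked list of graded relevances."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg(relevances):
    """Normalized DCG: DCG divided by the DCG of the ideally ordered list."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def sdcg(session, bq=4):
    """Session DCG: per-query DCG, with later queries in the session
    discounted by 1 / (1 + log_bq(query_position)).

    `session` is a list of ranked result lists, one per query, in the
    order the queries were issued.
    """
    return sum(
        dcg(rels) / (1 + math.log(q, bq))
        for q, rels in enumerate(session, start=1)
    )

# Hypothetical example: one RS list for a job posting vs. a recruiter's
# two-query session for the same posting (graded relevance 0-3).
rec_list = [3, 2, 3, 0, 1]           # single recommendation list
search_session = [[1, 0, 2],         # results of query 1
                  [3, 2, 0]]         # results of query 2
print(f"nDCG of recommendation list: {ndcg(rec_list):.3f}")
print(f"sDCG of search session:      {sdcg(search_session):.3f}")
```

Because sDCG reduces to plain DCG for a single-query "session", a recommendation list can be scored on the same session-level scale as a multi-query search session, which is the kind of like-for-like comparison the paper argues for.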

Original language: English
Article number: 6
Journal: CEUR Workshop Proceedings
Publication status: Published - 2022
Event: 2022 Perspectives on the Evaluation of Recommender Systems Workshop, PERSPECTIVES 2022 - Seattle, United States
Duration: 22 Sep 2022 → …


Conference: 2022 Perspectives on the Evaluation of Recommender Systems Workshop, PERSPECTIVES 2022
Country/Territory: United States
Period: 22/09/2022 → …

Bibliographical note

Funding Information:
This research was supported by the Innovation Fund Denmark, grant no. 0175-000005B.

Publisher Copyright:
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)


Keywords:

  • evaluation
  • job recommendation
  • recruitment
  • search
  • session-based recommendation


