A Heuristic Hierarchical Scheme for Academic Search and Retrieval

Emmanouil Amolochitis, Ioannis T. Christou, Zheng-Hua Tan, Ramjee Prasad

Publication: Contribution to journal › Journal article › Research › peer-reviewed

16 Citations (Scopus)

Abstract

We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is a hierarchical combination of three components: a custom implementation of the term frequency heuristic, a time-depreciated citation score, and a graph-theoretically computed score that relates the paper’s index terms to each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate the proposed re-ranking scheme via user feedback against the results of ACM Portal on a total of 58 different queries submitted by 15 different users. The results show that the proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by the most common metrics in Information Retrieval, including Normalized Discounted Cumulative Gain (NDCG) and Expected Reciprocal Rank (ERR), as well as by a newly introduced lexicographic rule (LEX) for ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77% in terms of ERR, by more than 11% in terms of NDCG, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results produced by Google Scholar, Microsoft Academic Search, and ArnetMiner for a subset of the original 58 user queries; the results show that PubSearch compares very well against these search engines too. The proposed scheme can easily be plugged into any existing search engine for the retrieval of academic publications.
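
For readers unfamiliar with the two standard metrics named in the abstract, the following is a minimal sketch of how NDCG and ERR are commonly computed from graded relevance judgments. It assumes the widely used exponential-gain formulations; the function names, the grade scale, and the example judgments are illustrative assumptions, and the exact variants used in the paper (as well as its LEX rule) are not specified in this record.

```python
import math

def dcg(grades):
    """Discounted cumulative gain using the common (2^g - 1) / log2(rank + 1) form."""
    return sum((2 ** g - 1) / math.log2(rank + 1)
               for rank, g in enumerate(grades, start=1))

def ndcg(grades):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(grades, reverse=True))
    return dcg(grades) / ideal if ideal > 0 else 0.0

def err(grades, max_grade):
    """Expected reciprocal rank under the cascade user model.

    Each result satisfies the user with probability (2^g - 1) / 2^max_grade;
    the user stops at the first satisfying result.
    """
    p_continue = 1.0  # probability the user reaches the current rank
    score = 0.0
    for rank, g in enumerate(grades, start=1):
        p_satisfied = (2 ** g - 1) / 2 ** max_grade
        score += p_continue * p_satisfied / rank
        p_continue *= 1 - p_satisfied
    return score

# Hypothetical relevance grades (0 = irrelevant, 3 = highly relevant)
# for the same five results in two different orderings.
baseline = [0, 1, 3, 0, 2]
reranked = [3, 2, 1, 0, 0]
print(ndcg(baseline), ndcg(reranked))
print(err(baseline, 3), err(reranked, 3))
```

Both metrics reward placing highly relevant results near the top, so a re-ranking that moves the best papers upward raises both scores; ERR additionally discounts results that follow an already satisfying one.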
Original language: English
Journal: Information Processing & Management
Volume: 49
Issue number: 6
Pages (from-to): 1326-1343
Number of pages: 18
ISSN: 0306-4573
Status: Published - 2013