A Heuristic Hierarchical Scheme for Academic Search and Retrieval

Emmanouil Amolochitis, Ioannis T. Christou, Zheng-Hua Tan, Ramjee Prasad

Research output: Contribution to journal › Journal article › Research › peer-review

12 Citations (Scopus)

Abstract

We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is based on the hierarchical combination of a custom implementation of the term frequency heuristic, a time-depreciated citation score and a graph-theoretic computed score that relates the paper’s index terms with each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate our proposed reranking scheme via user feedback against the results of ACM Portal on a total of 58 different user queries specified from 15 different users. The results show that our proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by most common metrics in Information Retrieval including Normalized Discounted Cumulative Gain (NDCG), Expected Reciprocal Rank (ERR) as well as a newly introduced lexicographic rule (LEX) of ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77% in terms of ERR, by more than 11% in terms of NDCG, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can be easily plugged in any existing search engine for retrieval of academic publications.
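
As a rough illustration of two of the metrics named in the abstract, the sketch below computes NDCG and ERR for a single ranked list using their commonly cited textbook definitions. It is not the authors' evaluation code: the function names, the 0-2 relevance grades, the exponential gain form, and the sample lists are all assumptions made for this example, and the LEX rule is omitted because the abstract does not define it.

```python
import math


def dcg(gains):
    """Discounted cumulative gain, using the (2^g - 1) / log2(rank + 1) form."""
    return sum((2 ** g - 1) / math.log2(rank + 1)
               for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering of the same gains."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def err(gains, max_grade):
    """Expected Reciprocal Rank under the cascade model of user browsing."""
    p_continue = 1.0  # probability the user reaches the current rank
    total = 0.0
    for rank, g in enumerate(gains, start=1):
        p_stop = (2 ** g - 1) / (2 ** max_grade)  # chance this result satisfies the user
        total += p_continue * p_stop / rank
        p_continue *= 1.0 - p_stop
    return total


# Hypothetical user relevance grades (0 = irrelevant, 2 = highly relevant),
# listed in the order a baseline engine returned the results.
baseline = [0, 1, 0, 2, 0]
# A hypothetical re-ranking that moves the relevant results to the top.
reranked = sorted(baseline, reverse=True)

if __name__ == "__main__":
    print("NDCG baseline:", ndcg(baseline), "reranked:", ndcg(reranked))
    print("ERR  baseline:", err(baseline, 2), "reranked:", err(reranked, 2))
```

Both metrics reward placing highly graded results early, so a re-ranking that promotes the relevant items scores higher on each than the baseline ordering of the same results.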
Original language: English
Journal: Information Processing & Management
Volume: 49
Issue number: 6
Pages (from-to): 1326-1343
Number of pages: 18
ISSN: 0306-4573
DOI: 10.1016/j.ipm.2013.07.002
Publication status: Published - 2013

Fingerprint

Search engines
Heuristics
Digital libraries
Ranking
Information retrieval
Feedback
Query
Reranking
Repository

Keywords

  • Academic search
  • Search and retrieval
  • Heuristic document re-ranking

Cite this

Amolochitis, Emmanouil ; Christou, Ioannis T. ; Tan, Zheng-Hua ; Prasad, Ramjee. / A Heuristic Hierarchical Scheme for Academic Search and Retrieval. In: Information Processing & Management. 2013 ; Vol. 49, No. 6. pp. 1326-1343.
@article{9a20d03dfc2242eeb31acf40e6b1f635,
title = "A Heuristic Hierarchical Scheme for Academic Search and Retrieval",
abstract = "We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is based on the hierarchical combination of a custom implementation of the term frequency heuristic, a time-depreciated citation score and a graph-theoretic computed score that relates the paper’s index terms with each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate our proposed reranking scheme via user feedback against the results of ACM Portal on a total of 58 different user queries specified from 15 different users. The results show that our proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by most common metrics in Information Retrieval including Normalized Discounted Cumulative Gain (NDCG), Expected Reciprocal Rank (ERR) as well as a newly introduced lexicographic rule (LEX) of ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77{\%} in terms of ERR, by more than 11{\%} in terms of NDCG, and by more than 907.5{\%} in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can be easily plugged in any existing search engine for retrieval of academic publications.",
keywords = "Academic search, Search and retrieval, Heuristic document re-ranking",
author = "Emmanouil Amolochitis and Christou, {Ioannis T.} and Zheng-Hua Tan and Ramjee Prasad",
year = "2013",
doi = "10.1016/j.ipm.2013.07.002",
language = "English",
volume = "49",
pages = "1326--1343",
journal = "Information Processing \& Management",
issn = "0306-4573",
publisher = "Pergamon Press",
number = "6",

}

A Heuristic Hierarchical Scheme for Academic Search and Retrieval. / Amolochitis, Emmanouil; Christou, Ioannis T.; Tan, Zheng-Hua; Prasad, Ramjee.

In: Information Processing & Management, Vol. 49, No. 6, 2013, p. 1326-1343.

TY - JOUR

T1 - A Heuristic Hierarchical Scheme for Academic Search and Retrieval

AU - Amolochitis, Emmanouil

AU - Christou, Ioannis T.

AU - Tan, Zheng-Hua

AU - Prasad, Ramjee

PY - 2013

Y1 - 2013

N2 - We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is based on the hierarchical combination of a custom implementation of the term frequency heuristic, a time-depreciated citation score and a graph-theoretic computed score that relates the paper’s index terms with each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate our proposed reranking scheme via user feedback against the results of ACM Portal on a total of 58 different user queries specified from 15 different users. The results show that our proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by most common metrics in Information Retrieval including Normalized Discounted Cumulative Gain (NDCG), Expected Reciprocal Rank (ERR) as well as a newly introduced lexicographic rule (LEX) of ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77% in terms of ERR, by more than 11% in terms of NDCG, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can be easily plugged in any existing search engine for retrieval of academic publications.

AB - We present PubSearch, a hybrid heuristic scheme for re-ranking academic papers retrieved from standard digital libraries such as the ACM Portal. The scheme is based on the hierarchical combination of a custom implementation of the term frequency heuristic, a time-depreciated citation score and a graph-theoretic computed score that relates the paper’s index terms with each other. We designed and developed a meta-search engine that submits user queries to standard digital repositories of academic publications and re-ranks the repository results using the hierarchical heuristic scheme. We evaluate our proposed reranking scheme via user feedback against the results of ACM Portal on a total of 58 different user queries specified from 15 different users. The results show that our proposed scheme significantly outperforms ACM Portal in terms of retrieval precision as measured by most common metrics in Information Retrieval including Normalized Discounted Cumulative Gain (NDCG), Expected Reciprocal Rank (ERR) as well as a newly introduced lexicographic rule (LEX) of ranking search results. In particular, PubSearch outperforms ACM Portal by more than 77% in terms of ERR, by more than 11% in terms of NDCG, and by more than 907.5% in terms of LEX. We also re-rank the top-10 results of a subset of the original 58 user queries produced by Google Scholar, Microsoft Academic Search, and ArnetMiner; the results show that PubSearch compares very well against these search engines as well. The proposed scheme can be easily plugged in any existing search engine for retrieval of academic publications.

KW - Academic search

KW - Search and retrieval

KW - Heuristic document re-ranking

U2 - 10.1016/j.ipm.2013.07.002

DO - 10.1016/j.ipm.2013.07.002

M3 - Journal article

VL - 49

SP - 1326

EP - 1343

JO - Information Processing & Management

JF - Information Processing & Management

SN - 0306-4573

IS - 6

ER -