Efficient Retrieval of the Top-k Most Relevant Spatial Web Objects

Gao Cong, Christian Søndergaard Jensen, Dingming Wu

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

382 Citations (Scopus)

Abstract

The conventional Internet is acquiring a geo-spatial dimension. Web documents are being geo-tagged, and geo-referenced objects such as points of interest are being associated with descriptive text documents. The resulting fusion of geo-location and documents enables a new kind of top-k query that takes into account both location proximity and text relevancy. To our knowledge, only naive techniques exist that are capable of computing a general web information retrieval query while also taking location into account. This paper proposes a new indexing framework for location-aware top-k text retrieval. The framework leverages the inverted file for text retrieval and the R-tree for spatial proximity querying. Several indexing approaches are explored within the framework. The framework encompasses algorithms that utilize the proposed indexes for computing the top-k query, thus taking into account both text relevancy and location proximity to prune the search space. Results of empirical studies with an implementation of the framework demonstrate that the paper’s proposal offers scalability and is capable of excellent performance.
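
As a rough illustration of the kind of query the abstract describes, the following Python sketch ranks objects by a weighted sum of location proximity and text relevancy using a brute-force scan. It is not the paper's indexing framework: the `SpatialObject` type, the `alpha` weight, the `max_dist` normalizer, and the term-overlap relevance score are all assumptions made for this example, and no inverted file or R-tree is involved.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SpatialObject:
    """A geo-referenced object with a descriptive text (hypothetical type)."""
    name: str
    x: float
    y: float
    text: str


def text_relevance(query_terms, text):
    """Fraction of query terms found in the text (stand-in for a real IR score)."""
    words = set(text.lower().split())
    terms = [t.lower() for t in query_terms]
    if not terms:
        return 0.0
    return sum(1 for t in terms if t in words) / len(terms)


def top_k(objects, qx, qy, query_terms, k, alpha=0.5, max_dist=1.0):
    """Brute-force scan: score every object and keep the k highest scores.

    alpha weights proximity against text relevance; max_dist normalizes
    Euclidean distance into [0, 1]. Both are assumed parameters.
    """
    def score(obj):
        dist = math.hypot(obj.x - qx, obj.y - qy)
        proximity = 1.0 - min(dist / max_dist, 1.0)
        relevance = text_relevance(query_terms, obj.text)
        return alpha * proximity + (1.0 - alpha) * relevance

    return heapq.nlargest(k, objects, key=score)


# Made-up sample data for illustration only.
objects = [
    SpatialObject("cafe_near", 0.1, 0.0, "espresso coffee bar"),
    SpatialObject("cafe_far", 0.9, 0.0, "coffee roastery"),
    SpatialObject("museum_near", 0.0, 0.1, "modern art museum"),
]
result = [o.name for o in top_k(objects, 0.0, 0.0, ["coffee"], k=2)]
print(result)
```

With the sample data, the query at (0, 0) for "coffee" with k = 2 ranks `cafe_near` first and `cafe_far` second; the nearby museum scores lower because it matches no query term. A scan like this touches every object, which is the cost an index-based approach aims to avoid through pruning.
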
Original language: English
Journal: International Conference on Very Large Data Bases. Proceedings
Volume: 2
Issue number: 1
Pages (from-to): 337-348
ISSN: 1047-7349
Publication status: Published - 2009
Event: International Conference on Very Large Databases VLDB '09 - Lyon, France
Duration: 24 Aug 2009 → 28 Aug 2009
Conference number: 35

Conference

Conference: International Conference on Very Large Databases VLDB '09
Number: 35
Country: France
City: Lyon
Period: 24/08/2009 → 28/08/2009

Fingerprint

Information retrieval
Scalability
Internet

Cite this

@article{e9395ad0ebbd11deb63d000ea68e967b,
title = "Efficient Retrieval of the Top-k Most Relevant Spatial Web Objects",
abstract = "The conventional Internet is acquiring a geo-spatial dimension. Web documents are being geo-tagged, and geo-referenced objects such as points of interest are being associated with descriptive text documents. The resulting fusion of geo-location and documents enables a new kind of top-k query that takes into account both location proximity and text relevancy. To our knowledge, only naive techniques exist that are capable of computing a general web information retrieval query while also taking location into account. This paper proposes a new indexing framework for location-aware top-k text retrieval. The framework leverages the inverted file for text retrieval and the R-tree for spatial proximity querying. Several indexing approaches are explored within the framework. The framework encompasses algorithms that utilize the proposed indexes for computing the top-k query, thus taking into account both text relevancy and location proximity to prune the search space. Results of empirical studies with an implementation of the framework demonstrate that the paper’s proposal offers scalability and is capable of excellent performance.",
author = "Cong, Gao and Jensen, {Christian S{\o}ndergaard} and Wu, Dingming",
year = "2009",
language = "English",
volume = "2",
pages = "337--348",
journal = "International Conference on Very Large Data Bases. Proceedings",
issn = "1047-7349",
publisher = "A C M Special Interest Group",
number = "1",
}

Efficient Retrieval of the Top-k Most Relevant Spatial Web Objects. / Cong, Gao; Jensen, Christian Søndergaard; Wu, Dingming.

In: International Conference on Very Large Data Bases. Proceedings, Vol. 2, No. 1, 2009, p. 337-348.

TY - JOUR

T1 - Efficient Retrieval of the Top-k Most Relevant Spatial Web Objects

AU - Cong, Gao

AU - Jensen, Christian Søndergaard

AU - Wu, Dingming

PY - 2009

Y1 - 2009

AB - The conventional Internet is acquiring a geo-spatial dimension. Web documents are being geo-tagged, and geo-referenced objects such as points of interest are being associated with descriptive text documents. The resulting fusion of geo-location and documents enables a new kind of top-k query that takes into account both location proximity and text relevancy. To our knowledge, only naive techniques exist that are capable of computing a general web information retrieval query while also taking location into account. This paper proposes a new indexing framework for location-aware top-k text retrieval. The framework leverages the inverted file for text retrieval and the R-tree for spatial proximity querying. Several indexing approaches are explored within the framework. The framework encompasses algorithms that utilize the proposed indexes for computing the top-k query, thus taking into account both text relevancy and location proximity to prune the search space. Results of empirical studies with an implementation of the framework demonstrate that the paper’s proposal offers scalability and is capable of excellent performance.

M3 - Conference article in Journal

VL - 2

SP - 337

EP - 348

JO - International Conference on Very Large Data Bases. Proceedings

JF - International Conference on Very Large Data Bases. Proceedings

SN - 1047-7349

IS - 1

ER -