Parallel Semantic Trajectory Similarity Join

Lisi Chen, Shuo Shang, Christian S. Jensen, Bin Yao, Panos Kalnis

Publication: Contribution to book/anthology/report/conference proceeding › Conference article in proceeding › Research › Peer-reviewed

2 Citations (Scopus)

Abstract

Matching similar pairs of trajectories, called trajectory similarity join, is a fundamental functionality in spatial data management. We consider the problem of semantic trajectory similarity join (STS-Join). Each semantic trajectory is a sequence of points of interest (POIs) with both location and text information. Thus, given two sets of semantic trajectories and a threshold θ, the STS-Join returns all pairs of semantic trajectories from the two sets with spatio-textual similarity no less than θ. This join targets applications such as term-based trajectory near-duplicate detection, geo-text data cleaning, personalized ridesharing recommendation, keyword-aware route planning, and travel itinerary recommendation. With these applications in mind, we provide a purposeful definition of spatio-textual similarity. To enable efficient STS-Join processing on large sets of semantic trajectories, we develop trajectory pair filtering techniques and consider the parallel processing capabilities of modern processors. Specifically, we present a two-phase parallel search algorithm. We first group semantic trajectories based on their text information. The algorithm's per-group searches are independent of each other and thus can be performed in parallel. For each group, the trajectories are further partitioned based on the spatial domain. We generate spatial and textual summaries for each trajectory batch, based on which we develop batch filtering and trajectory-batch filtering techniques to prune unqualified trajectory pairs in a batch mode. Additionally, we propose an efficient divide-and-conquer algorithm to derive bounds on the spatial similarity and textual similarity between two semantic trajectories, which enable us to prune dissimilar trajectory pairs without computing the exact spatio-textual similarity.
An experimental study with large semantic trajectory data confirms that our algorithm for processing semantic trajectory joins can outperform our well-designed baseline by a factor of 8-12.
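
To make the join definition in the abstract concrete, the following is a minimal brute-force sketch of a threshold join over two sets of semantic trajectories. It is not the paper's algorithm and does not use the paper's similarity definition, which the abstract does not spell out. The names `POI`, `poi_similarity`, `trajectory_similarity`, and `sts_join_naive`, the weight `alpha`, the distance `scale`, and the similarity formula itself (a weighted mix of a distance-based score and keyword Jaccard similarity, averaged over best matches) are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import hypot


@dataclass(frozen=True)
class POI:
    """A point of interest: a location plus a set of keywords (hypothetical type)."""
    x: float
    y: float
    keywords: frozenset


def jaccard(a, b):
    """Jaccard similarity of two keyword sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def poi_similarity(p, q, alpha=0.5, scale=1.0):
    """Assumed spatio-textual similarity between two POIs, in [0, 1]."""
    spatial = 1.0 / (1.0 + hypot(p.x - q.x, p.y - q.y) / scale)
    textual = jaccard(p.keywords, q.keywords)
    return alpha * spatial + (1.0 - alpha) * textual


def trajectory_similarity(t1, t2, alpha=0.5):
    """Assumed trajectory similarity: symmetric average of best-match POI scores."""
    def one_way(a, b):
        return sum(max(poi_similarity(p, q, alpha) for q in b) for p in a) / len(a)
    return 0.5 * (one_way(t1, t2) + one_way(t2, t1))


def sts_join_naive(set_p, set_q, theta, alpha=0.5):
    """Return index pairs (i, j) whose similarity is no less than theta.

    Every pair is compared exactly: no grouping, batching, or bound-based pruning.
    """
    return [
        (i, j)
        for (i, tp), (j, tq) in product(enumerate(set_p), enumerate(set_q))
        if trajectory_similarity(tp, tq, alpha) >= theta
    ]


# Usage: one trajectory joined against a copy of itself and a distant, unrelated one.
a = [POI(0, 0, frozenset({"cafe"})), POI(1, 0, frozenset({"museum"}))]
b = [POI(100, 100, frozenset({"gym"}))]
matches = sts_join_naive([a], [a, b], theta=0.9)
```

A sketch like this evaluates every pair, which is the cost that the filtering, batching, and bounding techniques described in the abstract are meant to avoid.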

Original language: English
Title: 2020 IEEE 36th International Conference on Data Engineering
Number of pages: 12
Publisher: IEEE
Publication date: 2020
Pages: 997-1008
Article number: 9101683
ISBN (Print): 978-1-7281-2904-4
ISBN (Electronic): 9781728129037
DOI
Status: Published - 2020
Event: International Conference on Data Engineering - Dallas, USA
Duration: 20 Apr 2020 to 24 Apr 2020
Conference number: 36th

Conference

Conference: International Conference on Data Engineering
Number: 36th
Country: USA
City: Dallas
Period: 20/04/2020 to 24/04/2020
Name: Proceedings of the International Conference on Data Engineering
ISSN: 1063-6382
