Ghost: A General Framework for High-Performance Online Similarity Queries over Distributed Trajectory Streams

Zikuan Fang, Shenghao Gong, Lu Chen, Jiachen Xu, Yunyao Gao, Christian S. Jensen

Research output: Contribution to journal › Conference article in Journal › Research › peer-review

Abstract

Trajectory similarity queries, including similarity search and similarity join, offer a foundation for many geo-spatial applications. With the rapid increase of streaming trajectory data volumes, e.g., data from mobile phones, vessel monitoring, or traffic systems, many location-based services benefit from online similarity analytics over trajectory data streams, where moving objects continually emit real-time position data. However, most existing studies focus on offline settings, and thus several major challenges remain unaddressed in an online setting. To this end, we describe Ghost, a distributed stream processing framework that enables generic, efficient, and scalable online trajectory similarity search and join.

We propose a novel incremental online similarity computation (IOSC) mechanism to accelerate pair-wise streaming trajectory distance calculation, which supports a broad range of trajectory distance metrics. Compared with previous studies, IOSC reduces the complexity from quadratic to linear in terms of trajectory length. Building on this foundation, we propose histogram-based algorithms that exploit histogram indexes and a series of pruning bounds to enable streaming trajectory similarity search and join. Finally, we extend our methods to the distributed platform Flink for scalability, where a CostPartitioner is developed to ensure parallel processing and workload balancing. An experimental study using two real-life datasets and one synthetic dataset shows that Ghost (i) achieves 6–20× efficiency/throughput gains and one order of magnitude memory overhead savings over state-of-the-art baselines, (ii) achieves 3–8× workload balancing gains on Flink, and (iii) exhibits low parameter sensitivity and high robustness.
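
The idea of incremental distance maintenance can be illustrated with a minimal sketch. It is not the paper's IOSC mechanism, whose details the abstract does not give. It uses the standard dynamic time warping (DTW) recurrence as a stand-in metric and assumes a fixed query trajectory compared against a growing stream. The names `IncrementalDTW`, `batch_dtw`, and `point_dist` are hypothetical. Keeping only the last row of the DTW table lets each new stream point be absorbed in time linear in the query length, instead of recomputing the full quadratic table.

```python
import math


def point_dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def batch_dtw(a, b):
    """Reference DTW that recomputes the whole table: O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            table[i][j] = point_dist(a[i - 1], b[j - 1]) + min(
                table[i - 1][j], table[i][j - 1], table[i - 1][j - 1]
            )
    return table[n][m]


class IncrementalDTW:
    """Maintains DTW(stream so far, query) as stream points arrive.

    Illustrative assumption: only the stream grows; the query is fixed.
    """

    def __init__(self, query):
        if not query:
            raise ValueError("query trajectory must be non-empty")
        self.query = list(query)
        # last_row[j] = DTW cost between the stream seen so far and query[:j + 1]
        self.last_row = None

    def append(self, p):
        """Absorb one new stream point in O(len(query)) and return the distance."""
        m = len(self.query)
        row = [0.0] * m
        if self.last_row is None:
            # First stream point: it must align with every query prefix.
            acc = 0.0
            for j, q in enumerate(self.query):
                acc += point_dist(p, q)
                row[j] = acc
        else:
            prev = self.last_row
            row[0] = prev[0] + point_dist(p, self.query[0])
            for j in range(1, m):
                row[j] = point_dist(p, self.query[j]) + min(
                    prev[j], prev[j - 1], row[j - 1]
                )
        self.last_row = row
        return row[-1]


query = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]
stream = [(0.0, 0.5), (0.5, 0.5), (1.5, 0.2), (2.0, 1.0)]

tracker = IncrementalDTW(query)
for i, point in enumerate(stream):
    incremental = tracker.append(point)
    recomputed = batch_dtw(stream[: i + 1], query)
    print(f"after {i + 1} points: {incremental:.4f} (batch: {recomputed:.4f})")
```

The loop at the end checks the incremental value against a full recomputation after each arrival. The two agree, while the incremental path does only one row of work per new point.
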
Original language: English
Article number: 173
Journal: Proceedings of the ACM on Management of Data
Volume: 1
Issue number: 2
Pages (from-to): 1-25
Number of pages: 25
ISSN: 2836-6573
Publication status: Published - 2023
Event: 2023 ACM/SIGMOD International Conference on Management of Data, SIGMOD 2023 - Seattle, United States
Duration: 18 Jun 2023 → 23 Jun 2023

Conference

Conference: 2023 ACM/SIGMOD International Conference on Management of Data, SIGMOD 2023
Country/Territory: United States
City: Seattle
Period: 18/06/2023 → 23/06/2023
Sponsor: ACM SIGMOD
