Efficient and Distributed Temporal Pattern Mining

Nguyen Thi Thao Ho*, Van Ho Long, Torben Bach Pedersen, Mai Vu

*Corresponding author

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceeding; Research; Peer-reviewed

3 Citations (Scopus)

Abstract

The widespread deployment of IoT systems in the real world today has enabled the generation and collection of an enormous amount of sensor time series. One of the important mining techniques to extract patterns from time series is temporal pattern mining (TPM). Unlike sequential pattern mining, TPM adds a temporal dimension, i.e., time intervals, to the extracted patterns, making them more informative. However, the extra temporal dimension adds an exponential factor to the growth of the search space and thus significantly increases the mining complexity. Current TPM approaches work sequentially and therefore cannot scale to large datasets. In this paper, we propose Distributed Hierarchical Pattern Graph TPM (DHPG-TPM), the first distributed solution that supports large-scale TPM using the leading distributed platform Apache Spark. Moreover, DHPG-TPM employs efficient data structures, a distributed bitmap and a distributed Hierarchical Pattern Graph, that are carefully designed to work efficiently in a distributed environment and to enable fast computation of support and confidence. To address the exponential search space of TPM, we design effective distributed pruning techniques based on the Apriori principle and the transitivity property of temporal relations to reduce the search space while minimizing the communication overhead between the cluster nodes. We conduct extensive experiments on real-world and synthetic datasets, showing that DHPG-TPM outperforms the sequential baselines and scales to very large datasets.
Original language: English
Title: 2021 IEEE International Conference on Big Data (Big Data)
Publisher: IEEE
Publication date: 7 Dec 2021
Article number: 9671753
ISBN (Print): 978-1-6654-4599-3
ISBN (Electronic): 978-1-6654-3902-2
DOI
Status: Published - 7 Dec 2021
Event: 2021 IEEE International Conference on Big Data - Virtual Event
Duration: 15 Dec 2021 to 18 Dec 2021
Conference number: 9
https://bigdataieee.org/BigData2021/index.html

Conference

Conference: 2021 IEEE International Conference on Big Data
Number: 9
Location: Virtual Event
Period: 15/12/2021 to 18/12/2021
