Efficient and Distributed Temporal Pattern Mining

Nguyen Thi Thao Ho*, Van Ho Long, Torben Bach Pedersen, Mai Vu

*Corresponding author for this work

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

3 Citations (Scopus)

Abstract

The widespread deployment of IoT systems in the real world has enabled the generation and collection of an enormous amount of sensor time series. One of the important mining techniques for extracting patterns from time series is temporal pattern mining (TPM). Unlike sequential pattern mining, TPM adds a temporal dimension, i.e., time intervals, to the extracted patterns, making them more informative. However, this extra temporal dimension adds an exponential factor to the growth of the search space and thus significantly increases the mining complexity. Current TPM approaches work sequentially and therefore cannot scale to large datasets. In this paper, we propose Distributed Hierarchical Pattern Graph TPM (DHPG-TPM), the first distributed solution that supports large-scale TPM using the leading distributed platform Apache Spark. Moreover, DHPG-TPM employs efficient data structures, a distributed bitmap and a distributed Hierarchical Pattern Graph, that are carefully designed to work efficiently in a distributed environment and enable fast computation of support and confidence. To address the exponential search space of TPM, we design effective distributed pruning techniques based on the Apriori principle and the transitivity property of temporal relations to reduce the search space while minimizing the communication overhead between the cluster nodes. We conduct extensive experiments on real-world and synthetic datasets, showing that DHPG-TPM outperforms the sequential baselines and scales to very large datasets.
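
As a rough illustration of the bitmap-based support and confidence computation and the Apriori-style pruning mentioned in the abstract, the following single-machine Python sketch may help. It is not the DHPG-TPM implementation: it does not use Apache Spark, does not build a Hierarchical Pattern Graph, and does not model temporal relations between intervals. The function names, the toy event names, and the confidence definition (joint support divided by the larger of the two single-event supports) are assumptions made for this example only.

```python
from itertools import combinations


def build_bitmaps(sequences):
    """Map each event to an int bitmap; bit i is set if sequence i contains the event."""
    bitmaps = {}
    for i, seq in enumerate(sequences):
        for event in set(seq):
            bitmaps[event] = bitmaps.get(event, 0) | (1 << i)
    return bitmaps


def support(bitmap, n_sequences):
    """Fraction of sequences whose bit is set in the bitmap."""
    return bin(bitmap).count("1") / n_sequences


def frequent_event_pairs(sequences, min_support, min_confidence):
    """Return {(event_a, event_b): (support, confidence)} for pairs passing both thresholds."""
    n = len(sequences)
    bitmaps = build_bitmaps(sequences)

    # Apriori principle: an infrequent single event cannot occur in a frequent pair,
    # so it is dropped before any pair is generated.
    frequent = {e: b for e, b in bitmaps.items() if support(b, n) >= min_support}

    result = {}
    for a, b in combinations(sorted(frequent), 2):
        # Bitwise AND gives the sequences that contain both events.
        joint_support = support(frequent[a] & frequent[b], n)
        if joint_support < min_support:
            continue
        # Assumed confidence definition: joint support over the larger single support.
        confidence = joint_support / max(support(frequent[a], n), support(frequent[b], n))
        if confidence >= min_confidence:
            result[(a, b)] = (joint_support, confidence)
    return result


# Hypothetical toy data: each inner list is the set of events seen in one sequence.
sequences = [
    ["HeaterOn", "TempHigh", "FanOn"],
    ["HeaterOn", "TempHigh"],
    ["FanOn"],
    ["HeaterOn", "TempHigh", "FanOn"],
]
pairs = frequent_event_pairs(sequences, min_support=0.6, min_confidence=0.8)
print(pairs)
```

With this toy input only the pair of `HeaterOn` and `TempHigh` survives, since each pair involving `FanOn` co-occurs in just two of the four sequences. In a distributed setting the bitmaps would presumably be partitioned across workers, which this sketch does not attempt.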
Original language: English
Title of host publication: 2021 IEEE International Conference on Big Data (Big Data)
Publisher: IEEE
Publication date: 7 Dec 2021
Article number: 9671753
ISBN (Print): 978-1-6654-4599-3
ISBN (Electronic): 978-1-6654-3902-2
Publication status: Published - 7 Dec 2021
Event: 2021 IEEE International Conference on Big Data - Virtual Event
Duration: 15 Dec 2021 to 18 Dec 2021
Conference number: 9
https://bigdataieee.org/BigData2021/index.html

Conference

Conference: 2021 IEEE International Conference on Big Data
Number: 9
Location: Virtual Event
Period: 15/12/2021 to 18/12/2021

Keywords

  • temporal patterns
  • distributed artificial intelligence
