Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insight and value can be obtained from these time series by performing cross-domain analyses, one of which is analyzing time delay temporal correlations across different datasets. Most existing work in this area is limited in the types of relations it detects (e.g., linear relations only), works only at a fixed temporal scale, or does not consider the time delay between time series. This paper presents our Time delaY COrrelation Search (TYCOS) approach, which provides a powerful and robust solution with the following features: (1) TYCOS is based on the concept of mutual information (MI) from information theory, giving it a strong theoretical foundation for detecting all types of relations, including non-linear ones; (2) TYCOS is able to discover time delay correlations at multiple temporal scales; (3) TYCOS works in an efficient, bottom-up fashion, pruning non-interesting time intervals from the search by employing a novel MI-based theory; and (4) TYCOS is designed to efficiently minimize computational redundancy. A comprehensive experimental evaluation using synthetic and real-world datasets from the energy and smart city domains shows that TYCOS is able to find significant time delay correlations across different time intervals among big time series. The performance evaluation shows that TYCOS scales to large datasets and, by using the proposed optimizations, achieves an average speedup of 2 to 3 orders of magnitude compared to the baselines.
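To make the idea of an MI-based time delay correlation concrete, the following is a minimal sketch, not the TYCOS algorithm itself. It assumes a simple histogram (plug-in) MI estimator and an exhaustive scan over candidate delays on a single fixed window, whereas the abstract describes a bottom-up, multi-scale search with pruning. The function names `mutual_information` and `best_delay`, the bin count, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of the MI between x and y, in nats.

    Assumption: equal-width bins; this is not the estimator used in the paper.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                  # joint distribution
    px = pxy.sum(axis=1, keepdims=True)        # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)        # marginal of y, shape (1, bins)
    mask = pxy > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

def best_delay(x, y, max_delay, bins=8):
    """Brute-force scan: pair x[t] with y[t + d] for d = 0..max_delay.

    Returns the delay with the highest estimated MI and that MI value.
    Assumes x and y have the same length.
    """
    scores = []
    for d in range(max_delay + 1):
        n = len(x) - d
        scores.append(mutual_information(x[:n], y[d:d + n], bins))
    d_best = int(np.argmax(scores))
    return d_best, scores[d_best]

# Synthetic example: y depends on x non-linearly (a square) with a delay of 5 steps.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
true_delay = 5
y = np.empty_like(x)
y[true_delay:] = x[:-true_delay] ** 2
y[:true_delay] = rng.normal(size=true_delay) ** 2   # filler for the first samples
y += 0.1 * rng.normal(size=x.size)                  # measurement noise

delay, mi = best_delay(x, y, max_delay=20)
print(f"estimated delay: {delay}, MI: {mi:.3f} nats")
```

The squared dependence is chosen because a linear measure such as Pearson correlation is close to zero for it, while MI still registers the relation. A scan like this recomputes MI from scratch for every delay and every window, which is the kind of redundancy the abstract says TYCOS is designed to avoid.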
Title of host publication: Proceedings - The 23rd International Conference on Extending Database Technology (EDBT), March 30-April 2, 2020
Publisher: Association for Computing Machinery
Publication status: Accepted/In press - 2020
- time delay
- temporal correlation
- mutual information
- hill climbing
Ho, N. T. T., Pedersen, T. B., Ho, L. V., & Vu, M. (Accepted/In press). Efficient Search for Multi-Scale Time Delay Correlations in Big Time Series. In Proceedings - The 23rd International Conference on Extending Database Technology (EDBT), March 30-April 2, 2020. Association for Computing Machinery.