TY - GEN
T1 - Efficient Search for Multi-Scale Time Delay Correlations in Big Time Series
AU - Ho, Nguyen Thi Thao
AU - Pedersen, Torben Bach
AU - Ho, Long Van
AU - Vu, Mai
PY - 2020/4/2
Y1 - 2020/4/2
N2 - Very large time series are increasingly available from an ever wider range of IoT-enabled sensors deployed in different environments. Significant insights and values can be obtained from these time series through performing cross-domain analyses, one of which is analyzing time delay temporal correlations across different datasets. Most existing works in this area are either limited in the type of detected relations, e.g., linear relations alone, only working with a fixed temporal scale, or not considering time delay between time series. This paper presents our Time delaY COrrelation Search (TYCOS) approach which provides a powerful and robust solution with the following features: (1) TYCOS is based on the concept of mutual information (MI) from information theory, giving it a strong theoretical foundation to detect all types of relations including non-linear ones, (2) TYCOS is able to discover time delay correlations at multiple temporal scales, (3) TYCOS works in an efficient, bottom-up fashion, pruning non-interesting time intervals from the search by employing a novel MI-based noise theory, and (4) TYCOS is designed to efficiently minimize computational redundancy. A comprehensive experimental evaluation using synthetic and real-world datasets from the energy and smart city domains shows that TYCOS is able to find significant time delay correlations across different time intervals among big time series. The performance evaluation shows that TYCOS can scale to large datasets, and achieve an average speedup of 2 to 3 orders of magnitude compared to the baselines by using the proposed optimizations.
KW - time delay
KW - temporal correlation
KW - mutual information
KW - hill climbing
UR - http://www.scopus.com/inward/record.url?scp=85084186813&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2020.05
DO - 10.5441/002/edbt.2020.05
M3 - Article in proceeding
T3 - Advances in Database Technology
SP - 37
EP - 48
BT - Advances in Database Technology - EDBT 2020
A2 - Bonifati, Angela
A2 - Zhou, Yongluan
A2 - Vaz Salles, Marcos Antonio
A2 - Böhm, Alexander
A2 - Olteanu, Dan
A2 - Fletcher, George
A2 - Khan, Arijit
PB - OpenProceedings.org
T2 - 23rd International Conference on Extending Database Technology, EDBT 2020
Y2 - 30 March 2020 through 2 April 2020
ER -