TY - GEN
T1 - Extreme-Scale Model-Based Time Series Management with ModelarDB (Invited Talk)
AU - Pedersen, Torben Bach
N1 - Conference code: 28
PY - 2021/9
Y1 - 2021/9
N2 - To monitor critical industrial devices such as wind turbines, high quality sensors sampled at a high frequency are increasingly used. Current technology does not handle these extreme-scale time series well [Søren Kejser Jensen et al., 2017], so only simple aggregates are traditionally stored, removing outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing extreme-scale time series that approximates the time series values using mathematical functions (models) and stores only model coefficients rather than data values. Compression is done both for individual time series and for correlated groups of time series. The keynote will present concepts, techniques, and algorithms from model-based time series management and our implementation of these in the open source Time Series Management System (TSMS) ModelarDB[Søren Kejser Jensen et al., 2018; Søren Kejser Jensen et al., 2019; Søren Kejser Jensen et al., 2021] . Furthermore, it will present our experimental evaluation of ModelarDB on extreme-scale real-world time series, which shows that that compared to widely used Big Data formats, ModelarDB provides up to 14× faster ingestion due to high compression, 113× better compression due to its adaptability, 573× faster aggregatation by using models, and close to linear scale-out scalability. ModelarDB is being commercialized by the spin-out company ModelarData.
AB - To monitor critical industrial devices such as wind turbines, high quality sensors sampled at a high frequency are increasingly used. Current technology does not handle these extreme-scale time series well [Søren Kejser Jensen et al., 2017], so only simple aggregates are traditionally stored, removing outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing extreme-scale time series that approximates the time series values using mathematical functions (models) and stores only model coefficients rather than data values. Compression is done both for individual time series and for correlated groups of time series. The keynote will present concepts, techniques, and algorithms from model-based time series management and our implementation of these in the open source Time Series Management System (TSMS) ModelarDB[Søren Kejser Jensen et al., 2018; Søren Kejser Jensen et al., 2019; Søren Kejser Jensen et al., 2021] . Furthermore, it will present our experimental evaluation of ModelarDB on extreme-scale real-world time series, which shows that that compared to widely used Big Data formats, ModelarDB provides up to 14× faster ingestion due to high compression, 113× better compression due to its adaptability, 573× faster aggregatation by using models, and close to linear scale-out scalability. ModelarDB is being commercialized by the spin-out company ModelarData.
KW - Model-based storage
KW - approximate query processing
KW - time series management
KW - extreme-scale data
U2 - 10.4230/LIPIcs.TIME.2021.2
DO - 10.4230/LIPIcs.TIME.2021.2
M3 - Article in proceeding
T3 - Leibniz International Proceedings in Informatics
SP - 2:1-2:2
BT - 28th International Symposium on Temporal Representation and Reasoning, TIME 2021, September 27-29, 2021, Klagenfurt, Austria.
PB - Schloss Dagstuhl. Leibniz-Zentrum für Informatik
T2 - 28th International Symposium on Temporal Representation and Reasoning, TIME 2021
Y2 - 27 September 2021 through 29 September 2021
ER -