Abstract
Time series data is widely used for decision-making and advanced analytics such as forecasting. However, the vast data volumes make storage challenging. Lossy compression saves more space than lossless methods, but it can degrade forecasting accuracy. Understanding the impact of lossy compression on forecasting accuracy is a multifaceted challenge that requires experimental evaluation across various forecasting models, compression methods, and time series. This paper conducts such an experimental evaluation by combining seven forecasting models, three lossy compression algorithms, and six datasets. By simulating a real-life scenario where forecasting models use lossy compressed data for prediction, we address three main research questions related to compression error and its effects on the time series characteristics and the forecasting models. The results show that the Poor Man’s Compression (PMC) and Swing Filter (SWING) lossy compression algorithms add less error than the Squeeze (SZ) method as the error bound increases. PMC provides the best balance between compression ratio and forecasting accuracy. Specifically, we obtained an average compression ratio of 13.65, 5.56, and 14.97 for PMC, SWING, and SZ with an average impact on forecasting accuracy of 5.56%, 3.3%, and 8.5%, respectively. An analysis of several time series characteristics shows that the maximum Kullback-Leibler divergence between consecutive windows in the time series is the best indicator of the impact of lossy compression on forecasting accuracy. Finally, our results indicate that simple models such as ARIMA are more resilient to lossy compression than complex deep learning models. The source code and data are available at https://github.com/cmcuza/EvalImpLSTS.
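The indicator named in the abstract, the maximum Kullback-Leibler divergence between consecutive windows, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `max_window_kl`, the non-overlapping windows, the fixed-width histogram estimate of each window's distribution, and the defaults for `window`, `bins`, and the smoothing constant `eps` are all assumptions chosen for illustration.

```python
import math


def max_window_kl(series, window=24, bins=10, eps=1e-9):
    """Largest KL divergence between histograms of consecutive windows.

    Assumptions (not taken from the paper): windows do not overlap, each
    window's distribution is estimated with a fixed-width histogram over
    the global value range, and eps smooths empty bins.
    """
    values = [float(v) for v in series]
    n_windows = len(values) // window
    if n_windows < 2:
        raise ValueError("need at least two full windows")

    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has identical windows, so the divergence is zero.
        return 0.0
    width = (hi - lo) / bins

    def histogram(chunk):
        # Start every bin at eps so that log(p / q) is always defined.
        counts = [eps] * bins
        for v in chunk:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1.0
        total = sum(counts)
        return [c / total for c in counts]

    hists = [
        histogram(values[i * window:(i + 1) * window])
        for i in range(n_windows)
    ]
    # KL(P || Q) for each pair of neighbouring windows; keep the maximum.
    return max(
        sum(p * math.log(p / q) for p, q in zip(a, b))
        for a, b in zip(hists, hists[1:])
    )
```

Under these assumptions, a series whose windows all share the same distribution scores close to zero, while an abrupt level shift between two windows produces a large value. The result depends on the chosen window length and bin count, so values are only comparable when computed with the same settings.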
Original language | English |
---|---|
Title of host publication | Advances in Database Technology - EDBT |
Number of pages | 14 |
Publisher | OpenProceedings.org |
Publication date | 18 Mar 2024 |
Pages | 650-663 |
ISBN (Electronic) | 978-3-89318-095-0 |
Publication status | Published - 18 Mar 2024 |
Event | 27th International Conference on Extending Database Technology, EDBT 2024 - Paestum, Italy. Duration: 25 Mar 2024 → 28 Mar 2024 |
Conference
Conference | 27th International Conference on Extending Database Technology, EDBT 2024 |
---|---|
Country/Territory | Italy |
City | Paestum |
Period | 25/03/2024 → 28/03/2024 |
Series | Advances in Database Technology - EDBT |
---|---|
Number | 3 |
Volume | 27 |
Bibliographical note
Publisher Copyright: © 2024 Copyright held by the owner/author(s).