Evaluating the Impact of Error-Bounded Lossy Compression on Time Series Forecasting

Research output: Contribution to book/anthology/report/conference proceeding, Article in proceeding, Research, peer-review

Abstract

Time series data is widely used for decision-making and advanced analytics such as forecasting. However, the vast data volumes make storage challenging. Lossy compression can save more space than lossless methods, but it can affect forecasting accuracy. Understanding the impact of lossy compression on forecasting accuracy is a multifaceted challenge, necessitating experimental evaluation across various forecasting models, compression methods, and time series. This paper conducts such an experimental evaluation by combining seven forecasting models, three lossy compression algorithms, and six datasets. By simulating a real-life scenario where forecasting models use lossy compressed data for prediction, we address three main research questions related to compression error and its effects on the time series characteristics and the forecasting models. The results show that the Poor Man’s Compression (PMC) and Swing Filter (SWING) lossy compression algorithms add less error than the Squeeze (SZ) method as the error bound increases. Poor Man’s Compression provides the best balance between compression ratio and forecasting accuracy. Specifically, we obtained an average compression ratio of 13.65, 5.56, and 14.97 for PMC, SWING, and SZ with an average impact on forecasting accuracy of 5.56%, 3.3%, and 8.5%, respectively. An analysis of several time series characteristics shows that the maximum Kullback-Leibler divergence between consecutive windows in the time series is the best indicator of the impact of lossy compression on forecasting accuracy. Finally, our results indicate that simple models such as ARIMA are more resilient to lossy compression than complex deep learning models. The source code and data are available at https://github.com/cmcuza/EvalImpLSTS.
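
To make the idea of an error bound concrete, the following is a minimal sketch of a PMC-style piecewise-constant compressor. It is not taken from the paper or its repository. It assumes an absolute error bound and the midrange variant of Poor Man's Compression, and the names `pmc_midrange`, `decompress`, and `error_bound` are hypothetical. The ratio computed at the end is a crude count of points per segment, which is not necessarily how the paper measures compression ratio.

```python
from typing import List, Tuple


def pmc_midrange(values: List[float], error_bound: float) -> List[Tuple[int, float]]:
    """Greedy piecewise-constant approximation (assumed midrange variant).

    Returns (run_length, constant) segments such that every original value
    lies within `error_bound` (absolute) of its segment's constant.
    """
    segments: List[Tuple[int, float]] = []
    if not values:
        return segments
    lo = hi = values[0]  # running min and max of the open segment
    length = 1
    for v in values[1:]:
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo <= 2 * error_bound:
            # The midrange still represents every point within the bound.
            lo, hi, length = new_lo, new_hi, length + 1
        else:
            # Close the segment and start a new one at the current value.
            segments.append((length, (lo + hi) / 2))
            lo = hi = v
            length = 1
    segments.append((length, (lo + hi) / 2))
    return segments


def decompress(segments: List[Tuple[int, float]]) -> List[float]:
    """Expand (run_length, constant) segments back into a series."""
    out: List[float] = []
    for length, constant in segments:
        out.extend([constant] * length)
    return out


# Illustrative data only, not from the paper's datasets.
series = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 5.1, 9.0]
segments = pmc_midrange(series, error_bound=0.2)
restored = decompress(segments)
max_error = max(abs(a - b) for a, b in zip(series, restored))
ratio = len(series) / len(segments)  # crude points-per-segment ratio
```

A larger `error_bound` lets each segment absorb more points, which raises the ratio while allowing the restored series to drift further from the original. That is the trade-off the evaluation quantifies in terms of forecasting accuracy.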

Original language: English
Title of host publication: Advances in Database Technology - EDBT
Number of pages: 14
Publisher: OpenProceedings.org
Publication date: 18 Mar 2024
Pages: 650-663
ISBN (Electronic): 978-3-89318-095-0
Publication status: Published - 18 Mar 2024
Event: 27th International Conference on Extending Database Technology, EDBT 2024 - Paestum, Italy
Duration: 25 Mar 2024 to 28 Mar 2024

Conference

Conference: 27th International Conference on Extending Database Technology, EDBT 2024
Country/Territory: Italy
City: Paestum
Period: 25/03/2024 to 28/03/2024
Series: Advances in Database Technology - EDBT
Number: 3
Volume: 27

Bibliographical note

Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
