Evaluating the Impact of Error-Bounded Lossy Compression on Time Series Forecasting

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceeding; Research; Peer-reviewed

1 Citation (Scopus)
1 Download (Pure)

Abstract

Time series data is widely used for decision-making and advanced analytics such as forecasting. However, the vast data volumes make storage challenging. Lossy compression can save more space than lossless methods, but it can affect forecasting accuracy. Understanding the impact of lossy compression on forecasting accuracy is a multifaceted challenge that requires experimental evaluation across various forecasting models, compression methods, and time series. This paper conducts such an experimental evaluation by combining seven forecasting models, three lossy compression algorithms, and six datasets. By simulating a real-life scenario where forecasting models use lossy compressed data for prediction, we address three main research questions related to compression error and its effects on the time series characteristics and the forecasting models. The results show that the Poor Man’s Compression (PMC) and Swing Filter (SWING) lossy compression algorithms add less error than the Squeeze (SZ) method as the error bound increases. Poor Man’s Compression provides the best balance between compression ratio and forecasting accuracy. Specifically, we obtained an average compression ratio of 13.65, 5.56, and 14.97 for PMC, SWING, and SZ with an average impact on forecasting accuracy of 5.56%, 3.3%, and 8.5%, respectively. An analysis of several time series characteristics shows that the maximum Kullback-Leibler divergence between consecutive windows in the time series is the best indicator of the impact of lossy compression on forecasting accuracy. Finally, our results indicate that simple models such as ARIMA are more resilient to lossy compression than complex deep learning models. The source code and data are available at https://github.com/cmcuza/EvalImpLSTS.
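
To make the notion of an error bound concrete, the following is a minimal Python sketch of the midrange variant of Poor Man's Compression, not the implementation evaluated in the paper. It assumes an absolute, pointwise error bound; the function names `pmc_midrange`, `decompress`, and `compression_ratio`, and the storage-cost model (two stored numbers per segment against one per raw point), are hypothetical choices made for illustration only. The abstract does not state which PMC variant or error-bound definition the paper uses.

```python
from typing import List, Tuple


def pmc_midrange(values: List[float], error_bound: float) -> List[Tuple[int, float]]:
    """Compress a series into (segment_length, constant_value) pairs.

    A segment keeps growing while (max - min) <= 2 * error_bound, so that
    the midrange (min + max) / 2 is within error_bound of every point in it.
    Assumption: error_bound is an absolute, pointwise bound.
    """
    segments: List[Tuple[int, float]] = []
    if not values:
        return segments
    lo = hi = values[0]
    length = 1
    for v in values[1:]:
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi - new_lo <= 2 * error_bound:
            # The point fits: extend the current segment.
            lo, hi = new_lo, new_hi
            length += 1
        else:
            # The point would break the bound: close the segment, start a new one.
            segments.append((length, (lo + hi) / 2))
            lo = hi = v
            length = 1
    segments.append((length, (lo + hi) / 2))
    return segments


def decompress(segments: List[Tuple[int, float]]) -> List[float]:
    """Rebuild the approximate series by repeating each segment's constant."""
    out: List[float] = []
    for length, value in segments:
        out.extend([value] * length)
    return out


def compression_ratio(n_points: int, n_segments: int) -> float:
    """Illustrative cost model (assumption): 1 number per raw point,
    2 numbers (length and value) per segment."""
    return n_points / (2 * n_segments)


series = [1.0, 1.2, 0.9, 5.0, 5.1, 5.3, 9.0]
segments = pmc_midrange(series, error_bound=0.25)
restored = decompress(segments)
max_error = max(abs(a - b) for a, b in zip(series, restored))
```

In this sketch a larger `error_bound` yields fewer, longer segments and therefore a higher compression ratio, at the cost of a larger reconstruction error in the data a forecasting model would then consume.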

Original language: English
Title: Advances in Database Technology - EDBT
Number of pages: 14
Publisher: OpenProceedings.org
Publication date: 18 Mar 2024
Pages: 650-663
ISBN (Electronic): 978-3-89318-095-0
DOI
Status: Published - 18 Mar 2024
Event: 27th International Conference on Extending Database Technology, EDBT 2024 - Paestum, Italy
Duration: 25 Mar 2024 to 28 Mar 2024

Conference

Conference: 27th International Conference on Extending Database Technology, EDBT 2024
Country/Territory: Italy
City: Paestum
Period: 25/03/2024 to 28/03/2024
Name: Advances in Database Technology - EDBT
Number: 3
Volume: 27

Bibliographical note

Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
