TY - JOUR
T1 - Variational Inference over Nonstationary Data Streams for Exponential Family Models
AU - Masegosa, Andres R.
AU - Ramos-López, Darío
AU - Cerdán, Antonio Salmerón
AU - Langseth, Helge
AU - Nielsen, Thomas Dyhre
PY - 2020/11
Y1 - 2020/11
N2 - In many modern data analysis problems, the available data is not static but, instead, comes in a streaming fashion. Performing Bayesian inference on a data stream is challenging for several reasons. First, it requires continuous model updating and the ability to handle a posterior distribution conditioned on an unbounded data set. Secondly, the underlying data distribution may drift from one time step to another, and the classic i.i.d. (independent and identically distributed), or data exchangeability assumption does not hold anymore. In this paper, we present an approximate Bayesian inference approach using variational methods that addresses these issues for conjugate exponential family models with latent variables. Our proposal makes use of a novel scheme based on hierarchical priors to explicitly model temporal changes of the model parameters. We show how this approach induces an exponential forgetting mechanism with adaptive forgetting rates. The method is able to capture the smoothness of the concept drift, ranging from no drift to abrupt drift. The proposed variational inference scheme maintains the computational efficiency of variational methods over conjugate models, which is critical in streaming settings. The approach is validated on four different domains (energy, finance, geolocation, and text) using four real-world data sets.
AB - In many modern data analysis problems, the available data is not static but, instead, comes in a streaming fashion. Performing Bayesian inference on a data stream is challenging for several reasons. First, it requires continuous model updating and the ability to handle a posterior distribution conditioned on an unbounded data set. Secondly, the underlying data distribution may drift from one time step to another, and the classic i.i.d. (independent and identically distributed), or data exchangeability assumption does not hold anymore. In this paper, we present an approximate Bayesian inference approach using variational methods that addresses these issues for conjugate exponential family models with latent variables. Our proposal makes use of a novel scheme based on hierarchical priors to explicitly model temporal changes of the model parameters. We show how this approach induces an exponential forgetting mechanism with adaptive forgetting rates. The method is able to capture the smoothness of the concept drift, ranging from no drift to abrupt drift. The proposed variational inference scheme maintains the computational efficiency of variational methods over conjugate models, which is critical in streaming settings. The approach is validated on four different domains (energy, finance, geolocation, and text) using four real-world data sets.
KW - Concept drift
KW - Exponential forgetting
KW - Latent variable models
KW - Nonstationary data streams
KW - Power priors
KW - Variational inference
UR - http://www.scopus.com/inward/record.url?scp=85096046351&partnerID=8YFLogxK
U2 - 10.3390/math8111942
DO - 10.3390/math8111942
M3 - Journal article
SN - 2227-7390
VL - 8
SP - 1
EP - 27
JO - Mathematics
JF - Mathematics
IS - 11
M1 - 1942
ER -