Multivariate online regression analysis with heterogeneous streaming data
Authors: Lan Luo, Peter X-K Song
Institution: 1. Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242-1409, USA; 2. Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-2029, USA
Abstract: New data collection and storage technologies have given rise to a new field of streaming data analytics, called real-time statistical methodology for online data analyses. Most existing online learning methods are based on homogeneity assumptions, which require the samples in a sequence to be independent and identically distributed. However, inter-data batch correlation and dynamically evolving batch-specific effects are among the key defining features of real-world streaming data such as electronic health records and mobile health data. This article is built under a state-space mixed model framework in which the observed data stream is driven by a latent state process that follows a Markov process. In this setting, online maximum likelihood estimation is made challenging by high-dimensional integrals and complex covariance structures. In this article, we develop a real-time Kalman-filter-based regression analysis method that updates both point estimates and their standard errors for fixed population average effects while adjusting for dynamic hidden effects. Both theoretical justification and numerical experiments demonstrate that our proposed online method has statistical properties similar to those of its offline counterpart and enjoys great computational efficiency. We also apply this method to analyze an electronic health record dataset.
Keywords: dynamic effects; Kalman filter; online learning; state-space mixed models; streaming data
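
Illustration: the abstract describes updating point estimates and standard errors batch by batch while adjusting for a latent Markov effect. The sketch below is not the authors' algorithm. It is a textbook linear-Gaussian Kalman filter applied to a deliberately simplified model, assumed here only for illustration: each batch has responses y_t = X_t beta + theta_t + noise, with a scalar batch effect theta_t following an AR(1) process, and the fixed effects beta are carried in the state vector with zero process noise. The function name kalman_batch_update, the variance parameters rho, tau2, sigma2 (treated as known), and the simulated data are all hypothetical.

```python
import numpy as np


def kalman_batch_update(m, P, X, y, rho, tau2, sigma2):
    """One predict/update step for the state (beta, theta) given a new batch.

    Assumed model (illustrative only):
        y = X @ beta + theta_t + eps,   eps ~ N(0, sigma2 * I)
        theta_t = rho * theta_{t-1} + eta,   eta ~ N(0, tau2)
    """
    n, p = X.shape
    # Transition: beta is constant over time, theta follows an AR(1) process.
    F = np.eye(p + 1)
    F[p, p] = rho
    Q = np.zeros((p + 1, p + 1))
    Q[p, p] = tau2
    # Predict step.
    m_pred = F @ m
    P_pred = F @ P @ F.T + Q
    # Observation matrix: covariates plus a column of ones for the batch effect.
    H = np.hstack([X, np.ones((n, 1))])
    S = H @ P_pred @ H.T + sigma2 * np.eye(n)
    # Kalman gain K = P_pred H^T S^{-1}, computed with a linear solve.
    K = np.linalg.solve(S, H @ P_pred).T
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = (np.eye(p + 1) - K @ H) @ P_pred
    # Symmetrize to limit the effect of floating-point drift.
    P_new = 0.5 * (P_new + P_new.T)
    return m_new, P_new


# Simulated stream (hypothetical values chosen only for the demonstration).
rng = np.random.default_rng(0)
beta_true = np.array([1.0, -0.5])
rho, tau2, sigma2 = 0.8, 0.25, 1.0
p = beta_true.size

m = np.zeros(p + 1)        # initial state mean
P = 10.0 * np.eye(p + 1)   # vague initial state covariance
theta = 0.0
for _ in range(200):       # 200 batches of 20 observations each
    theta = rho * theta + rng.normal(scale=np.sqrt(tau2))
    X = rng.normal(size=(20, p))
    y = X @ beta_true + theta + rng.normal(scale=np.sqrt(sigma2), size=20)
    # Each batch is used once and then discarded.
    m, P = kalman_batch_update(m, P, X, y, rho, tau2, sigma2)

beta_hat = m[:p]                  # running estimate of the fixed effects
se = np.sqrt(np.diag(P)[:p])      # running standard errors of the fixed effects
```

Only the current state mean and covariance are stored between batches, which is what makes this kind of update suitable for streaming data. The method in the article additionally has to deal with the high-dimensional integrals and complex covariance structures mentioned in the abstract, which this sketch sidesteps by assuming a fully linear-Gaussian model with known variance parameters.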