Data skeletons: simultaneous estimation of multiple quantiles for massive streaming datasets with applications to density estimation
Authors: James P McDermott, G Jogesh Babu, John C Liechty, Dennis K J Lin
Institution: (1) Department of Statistics, The Pennsylvania State University, 326 Thomas Building, University Park, PA 16802, USA; (2) Departments of Marketing and Statistics, The Pennsylvania State University, 407 Business Building, University Park, PA 16802, USA; (3) Department of Supply Chain and Information Systems, The Pennsylvania State University, 483 Business Building, University Park, PA 16802, USA
Abstract: We consider the problem of density estimation when the data is in the form of a continuous stream with no fixed length. In this setting, implementing the usual methods of density estimation, such as kernel density estimation, is problematic. We propose a method of density estimation for massive datasets that is based upon taking the derivative of a smooth curve that has been fit through a set of quantile estimates. To achieve this, we propose a low-storage, single-pass, sequential method for simultaneous estimation of multiple quantiles for massive datasets, which forms the basis of this method of density estimation. For comparison, we also consider a sequential kernel density estimator. The proposed methods are shown through a simulation study to perform well and to have several distinct advantages over existing methods.
Keywords: Sequential quantile estimation; Sequential density estimation; Online algorithms; Sequential algorithms; Cubic spline
This article is indexed in SpringerLink and other databases.
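The idea in the abstract (track several quantiles in a single pass, then differentiate a curve fitted through them) can be illustrated with a rough sketch. Everything here is an assumption made for illustration and is not the authors' algorithm: the quantile tracker uses a generic stochastic-approximation update, the class name `StreamingQuantiles`, the function `density_from_quantiles`, and the `step_scale` constant are hypothetical, and the density is taken as the slope of a piecewise-linear curve through the quantile estimates instead of the cubic spline named in the keywords.

```python
import random


class StreamingQuantiles:
    """Track several quantiles in one pass using O(k) memory.

    Illustrative stochastic-approximation update; NOT the estimator
    proposed in the paper.
    """

    def __init__(self, probs, step_scale=1.0):
        self.probs = list(probs)
        self.step_scale = step_scale  # hypothetical tuning constant
        self.estimates = None
        self.n = 0

    def update(self, x):
        """Consume one observation from the stream."""
        self.n += 1
        if self.estimates is None:
            # Initialise every quantile estimate at the first observation.
            self.estimates = [x] * len(self.probs)
            return
        step = self.step_scale / self.n
        for i, p in enumerate(self.probs):
            indicator = 1.0 if x <= self.estimates[i] else 0.0
            # Move up when too few observations fall below the estimate,
            # down when too many do.
            self.estimates[i] += step * (p - indicator)

    def quantiles(self):
        # Sorting repairs any crossings between neighbouring estimates.
        return sorted(self.estimates)


def density_from_quantiles(probs, quantiles):
    """Return (midpoint, density) pairs, one per adjacent quantile pair.

    The density is the slope of a piecewise-linear CDF through the
    (quantile, probability) points. Assumes strictly increasing quantiles.
    """
    points = list(zip(probs, quantiles))
    result = []
    for (p0, q0), (p1, q1) in zip(points, points[1:]):
        result.append(((q0 + q1) / 2, (p1 - p0) / (q1 - q0)))
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.1, 0.25, 0.5, 0.75, 0.9]
    tracker = StreamingQuantiles(probs)
    for _ in range(50_000):
        tracker.update(rng.gauss(0.0, 1.0))  # simulated stream
    for mid, dens in density_from_quantiles(probs, tracker.quantiles()):
        print(f"x = {mid:+.3f}  estimated density = {dens:.3f}")
```

The tracker stores only one number per tracked quantile, which is what makes a single pass over an unbounded stream feasible. The step size strongly affects how quickly such an update settles, so `step_scale` would need tuning to the scale of the data in practice.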