Data skeletons: simultaneous estimation of multiple quantiles for massive streaming datasets with applications to density estimation
Authors: James P. McDermott, G. Jogesh Babu, John C. Liechty, Dennis K. J. Lin
Affiliation: (1) Department of Statistics, The Pennsylvania State University, 326 Thomas Building, University Park, PA 16802, USA; (2) Departments of Marketing and Statistics, The Pennsylvania State University, 407 Business Building, University Park, PA 16802, USA; (3) Department of Supply Chain and Information Systems, The Pennsylvania State University, 483 Business Building, University Park, PA 16802, USA
Abstract: We consider the problem of density estimation when the data arrive as a continuous stream with no fixed length. In this setting, the usual methods of density estimation, such as kernel density estimation, are problematic to implement. We propose a method of density estimation for massive datasets that is based on taking the derivative of a smooth curve fitted through a set of quantile estimates. To achieve this, we propose a low-storage, single-pass, sequential method for simultaneously estimating multiple quantiles of massive datasets; these quantile estimates form the basis of the density estimator. For comparison, we also consider a sequential kernel density estimator. A simulation study shows that the proposed methods perform well and have several distinct advantages over existing methods.
Keywords: Sequential quantile estimation; Sequential density estimation; Online algorithms; Sequential algorithms; Cubic spline
This article is indexed in SpringerLink and other databases.
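
The core idea in the abstract, reading a density off the slope of a curve drawn through quantile estimates, can be illustrated with a minimal sketch. This is not the authors' algorithm. Two simplifying assumptions are made: the quantile estimates are stood in for by exact standard-normal quantiles (the paper obtains them sequentially from the stream), and the smooth curve is replaced by a piecewise-linear interpolant of the points (quantile, probability) rather than a cubic spline, so the density is just the slope of each segment. The function name density_from_quantiles is hypothetical.

```python
from statistics import NormalDist


def density_from_quantiles(probs, quantiles):
    """Approximate a density from quantile estimates.

    The points (quantiles[i], probs[i]) lie on the CDF. Joining them with
    straight lines and differentiating gives a constant slope per segment,
    which is reported at the segment midpoint as the density estimate.
    """
    if len(probs) != len(quantiles) or len(probs) < 2:
        raise ValueError("need at least two matching (prob, quantile) pairs")
    estimates = []
    for i in range(len(probs) - 1):
        dq = quantiles[i + 1] - quantiles[i]
        if dq <= 0:
            raise ValueError("quantiles must be strictly increasing")
        dp = probs[i + 1] - probs[i]
        midpoint = (quantiles[i] + quantiles[i + 1]) / 2
        estimates.append((midpoint, dp / dq))
    return estimates


# Assumption: exact standard-normal quantiles stand in for the
# sequential quantile estimates that a streaming method would supply.
std_normal = NormalDist()
probs = [k / 20 for k in range(1, 20)]  # 0.05, 0.10, ..., 0.95
quantiles = [std_normal.inv_cdf(p) for p in probs]
estimate = density_from_quantiles(probs, quantiles)

for x, d in estimate:
    print(f"x = {x:+.3f}  estimated density = {d:.4f}  true density = {std_normal.pdf(x):.4f}")
```

Only the 19 quantile values are needed to produce the density estimate, which is why the approach suits streams too large to store. A smoother interpolant, such as the cubic spline named in the keywords, would give a continuous density rather than a step function.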