首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Estimation of a probability density function using interval aggregated data
Abstract:ABSTRACT

In economics and government statistics, aggregated data instead of individual level data are usually reported for data confidentiality and for simplicity. In this paper we develop a method of flexibly estimating the probability density function of the population using aggregated data obtained as group averages when individual level data are grouped according to quantile limits. The kernel density estimator has been commonly applied to such data without taking into account the data aggregation process and has been shown to perform poorly. Our method models the quantile function as an integral of the exponential of a spline function and deduces the density function from the quantile function. We match the aggregated data to their theoretical counterpart using least squares, and regularize the estimation by using the squared second derivatives of the density function as the penalty function. A computational algorithm is developed to implement the method. Application to simulated data and US household income survey data show that our penalized spline estimator can accurately recover the density function of the underlying population while the common use of kernel density estimation is severely biased. The method is applied to study the dynamic of China's urban income distribution using published interval aggregated data of 1985–2010.
Keywords:Density estimation  Gini index  growth incidence curve  income disparity  Lorenz curve  penalized splines
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号