Similar literature: 20 related items.
1.
This paper considers the problem of selecting optimal bandwidths for variable (sample‐point adaptive) kernel density estimation. A data‐driven variable bandwidth selector is proposed, based on the idea of approximating the log‐bandwidth function by a cubic spline. This cubic spline is optimized with respect to a cross‐validation criterion. The proposed method can be interpreted as a selector for either integrated squared error (ISE) or mean integrated squared error (MISE) optimal bandwidths. This leads to reflection upon some of the differences between ISE and MISE as error criteria for variable kernel estimation. Results from simulation studies indicate that the proposed method outperforms a fixed kernel estimator (in terms of ISE) when the target density has a combination of sharp modes and regions of smooth undulation. Moreover, some detailed data analyses suggest that the gains in ISE may understate the improvements in visual appeal obtained using the proposed variable kernel estimator. These numerical studies also show that the proposed estimator outperforms existing variable kernel density estimators implemented using piecewise constant bandwidth functions.
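A minimal sketch of the idea, assuming a Gaussian kernel and treating the knot locations and log-bandwidth values as given rather than optimized; the function names and the least-squares cross-validation form below are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def adaptive_kde(x_grid, data, knots, log_bw_at_knots):
    """Sample-point adaptive Gaussian KDE.

    The bandwidth at data point X_i is exp(s(X_i)), where s is a cubic
    spline through (knots, log_bw_at_knots).  In the paper the spline is
    chosen by cross-validation; here its values are taken as given.
    """
    s = CubicSpline(knots, log_bw_at_knots)
    h = np.exp(s(data))                      # one bandwidth per data point
    z = (x_grid[:, None] - data[None, :]) / h[None, :]
    dens = np.exp(-0.5 * z**2) / (np.sqrt(2 * np.pi) * h[None, :])
    return dens.mean(axis=1)

def lscv_score(data, knots, log_bw_at_knots, grid):
    """One plausible least-squares cross-validation criterion (illustrative)."""
    f = adaptive_kde(grid, data, knots, log_bw_at_knots)
    int_f2 = np.trapz(f**2, grid)
    loo = 0.0                                # leave-one-out term
    n = len(data)
    for i in range(n):
        rest = np.delete(data, i)
        loo += adaptive_kde(np.array([data[i]]), rest, knots, log_bw_at_knots)[0]
    return int_f2 - 2.0 * loo / n

# illustration on a density with a sharp mode plus a smooth component
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0, 0.1, 300), rng.normal(3, 1.0, 700)])
knots = np.linspace(data.min(), data.max(), 5)
grid = np.linspace(-1, 7, 200)
f_hat = adaptive_kde(grid, data, knots, log_bw_at_knots=np.full(5, np.log(0.3)))
```

In practice the log-bandwidth values at the knots would be varied to minimize the cross-validation score; the constant values above are only a starting point.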

2.
In recent years, the Quintile Share Ratio (or QSR) has become a very popular measure of inequality. In 2001, the European Council decided that income inequality in European Union member states should be described using two indicators: the Gini Index and the QSR. The QSR is generally defined as the ratio of the total income earned by the richest 20% of the population relative to that earned by the poorest 20%. Thus, it can be expressed using quantile shares, where a quantile share is the share of total income earned by all of the units up to a given quantile. The aim of this paper is to propose an improved methodology for the estimation and variance estimation of the QSR in a complex sampling design framework. Because the QSR is a non-linear function of interest, the estimation of its sampling variance requires advanced methodology. Moreover, a non-trivial obstacle in the estimation of quantile shares in finite populations is the non-unique definition of a quantile. Thus, two different conceptions of the quantile share are presented in the paper, leading us to two different estimators of the QSR. Regarding variance estimation, Osier (2006, 2009) proposed a variance estimator based on linearization techniques. However, his method involves Gaussian kernel smoothing of cumulative distribution functions. Our approach, also based on linearization, shows that no smoothing is needed. The construction of confidence intervals is discussed and a proposition is made to account for the skewness of the sampling distribution of the QSR. Finally, simulation studies are run to assess the relevance of our theoretical results.
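A minimal sketch of the design-weighted QSR point estimate (income share of the richest 20% over the poorest 20%); the lower weighted quantile used below is only one of the quantile conventions the paper distinguishes, and the linearization-based variance estimation is not shown.

```python
import numpy as np

def weighted_quantile(y, w, p):
    """Lower (left-continuous) weighted quantile -- one of several conventions."""
    order = np.argsort(y)
    y, w = y[order], w[order]
    cw = np.cumsum(w) / w.sum()
    return y[np.searchsorted(cw, p)]

def qsr(income, weights=None):
    """Quintile Share Ratio: total income of the richest 20% / poorest 20%."""
    income = np.asarray(income, dtype=float)
    w = np.ones_like(income) if weights is None else np.asarray(weights, float)
    q20 = weighted_quantile(income, w, 0.20)
    q80 = weighted_quantile(income, w, 0.80)
    bottom = np.sum(w * income * (income <= q20))
    top = np.sum(w * income * (income > q80))
    return top / bottom

# illustration on simulated incomes with unit design weights
rng = np.random.default_rng(0)
y = rng.lognormal(mean=10, sigma=0.8, size=5000)
print(qsr(y))
```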

3.
Abstract

In this work, we propose a beta prime kernel estimator for estimating probability density functions with nonnegative support. The beta prime probability density function is used as the kernel: it is free of boundary bias, nonnegative, and has a naturally varying shape. We obtain the optimal rate of convergence for the mean squared error (MSE) and the mean integrated squared error (MISE). We also use an adaptive Bayesian bandwidth selection method with Lindley approximation for heavy-tailed distributions and compare its performance with the global least squares cross-validation bandwidth selection method. Monte Carlo simulation studies are performed to compare the average integrated squared error (ISE) of the proposed kernel estimator against some asymmetric competitors. Real data sets are also presented to illustrate the findings.
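A rough sketch of an asymmetric-kernel density estimator built from beta prime kernels, assuming shape parameters that place the kernel's mode at the evaluation point (valid for smoothing parameter b < 1); this parameterization is an illustrative assumption and may differ from the one used in the paper.

```python
import numpy as np
from scipy.stats import betaprime

def betaprime_kde(x_grid, data, b):
    """Density estimate on [0, inf) using beta prime kernels.

    For each evaluation point x the kernel is a beta prime density whose
    shape parameters depend on x and the smoothing parameter b.  The
    choice below (kernel mode at x, requires b < 1) is illustrative only.
    """
    x_grid = np.asarray(x_grid, dtype=float)
    est = np.empty_like(x_grid)
    for j, x in enumerate(x_grid):
        a_shape = x / b + 1.0
        b_shape = 1.0 / b - 1.0
        est[j] = betaprime.pdf(data, a_shape, b_shape).mean()
    return est

# illustration on nonnegative data
rng = np.random.default_rng(0)
data = rng.gamma(2.0, 1.0, size=400)
grid = np.linspace(0.01, 8.0, 80)
f_hat = betaprime_kde(grid, data, b=0.2)
```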

4.
We consider the problem of density estimation when the data is in the form of a continuous stream with no fixed length. In this setting, implementations of the usual methods of density estimation such as kernel density estimation are problematic. We propose a method of density estimation for massive datasets that is based upon taking the derivative of a smooth curve that has been fit through a set of quantile estimates. To achieve this, a low-storage, single-pass, sequential method is proposed for simultaneous estimation of multiple quantiles for massive datasets that form the basis of this method of density estimation. For comparison, we also consider a sequential kernel density estimator. The proposed methods are shown through simulation study to perform well and to have several distinct advantages over existing methods.
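A minimal sketch of the two-stage idea, with a simple stochastic-approximation quantile tracker standing in for the paper's low-storage sequential algorithm; the step-size schedule and the use of a monotone (PCHIP) spline are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

def stream_quantiles(stream, probs, step=0.1):
    """Single-pass quantile tracking via a Robbins-Monro style update.

    Uses O(len(probs)) memory; a stand-in for the paper's sequential method.
    """
    probs = np.asarray(probs, dtype=float)
    q = None
    for t, x in enumerate(stream, start=1):
        if q is None:
            q = np.full(len(probs), x, dtype=float)
        gain = step / np.sqrt(t)
        q += gain * (probs - (x <= q))
    return q

def density_from_quantiles(q, probs):
    """Density estimate as the derivative of a smooth monotone curve fitted
    through the (quantile, probability) pairs."""
    q = np.sort(q)          # guard: PCHIP needs strictly increasing abscissae
    cdf = PchipInterpolator(q, probs)
    return cdf.derivative()  # callable density estimate on [q.min(), q.max()]

# illustration on a simulated stream
rng = np.random.default_rng(1)
probs = np.linspace(0.05, 0.95, 19)
q = stream_quantiles(rng.normal(size=100_000), probs)
f_hat = density_from_quantiles(q, probs)
print(f_hat(0.0))   # should be near 1/sqrt(2*pi) ~ 0.399 for standard normal data
```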

5.
Abstract.  A new semiparametric method for density deconvolution is proposed, based on a model in which only the ratio of the unconvoluted to convoluted densities is specified parametrically. Deconvolution results from reweighting the terms in a standard kernel density estimator, where the weights are defined by the parametric density ratio. We propose that in practice, the density ratio be modelled on the log-scale as a cubic spline with a fixed number of knots. Parameter estimation is based on maximization of a type of semiparametric likelihood. The resulting asymptotic properties for our deconvolution estimator mirror the convergence rates in standard density estimation without measurement error when attention is restricted to our semiparametric class of densities. Furthermore, numerical studies indicate that for practical sample sizes our weighted kernel estimator can provide better results than the classical non-parametric kernel estimator for a range of densities outside the specified semiparametric class.

6.
The estimation of the distribution function of a random variable X measured with error is studied. Let the i-th observation on X be denoted by Yi = Xi + εi, where εi is the measurement error. Let {Yi} (i = 1, 2, …, n) be a sample of independent observations. It is assumed that {Xi} and {εi} are mutually independent and each is identically distributed. As is standard in the literature for this problem, the distribution of ε is assumed known in the development of the methodology. In practice, the measurement error distribution is estimated from replicate observations.

The proposed semiparametric estimator is derived by estimating the quantiles of X on a set of n transformed V-values and smoothing the estimated quantiles using a spline function. The number of parameters of the spline function is determined by the data with a simple criterion, such as AIC. In a simulation study, the semiparametric estimator dominates an optimal kernel estimator and a normal mixture estimator for a wide class of densities.

The proposed estimator is applied to estimate the distribution function of the mean pH value in a field plot. The density function of the measurement error is estimated from repeated measurements of the pH values in a plot, and is treated as known for the estimation of the distribution function of the mean pH value.

7.
Abstract

Based on the Gamma kernel density estimation procedure, this article constructs a nonparametric kernel estimate of the regression function when the covariates are nonnegative. Asymptotic normality and uniform almost sure convergence results for the new estimator are systematically studied, and the finite-sample performance of the proposed estimate is assessed via a simulation study and a comparison with an existing method. Finally, the proposed estimation procedure is applied to the Geyser data set.
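A minimal sketch of Nadaraya-Watson regression with gamma kernels for a nonnegative covariate, using the standard Chen-type shape/scale choice; the article's exact variant (for example near the boundary) may differ, and the simulated data are purely illustrative.

```python
import numpy as np
from scipy.stats import gamma

def gamma_kernel_regression(x_grid, X, Y, b):
    """Nadaraya-Watson regression with gamma kernels on [0, inf).

    Kernel at evaluation point x: Gamma(shape = x/b + 1, scale = b) density
    evaluated at the data points X_i, which avoids boundary bias at zero.
    """
    m_hat = np.empty(len(x_grid))
    for j, x in enumerate(x_grid):
        w = gamma.pdf(X, a=x / b + 1.0, scale=b)
        m_hat[j] = np.sum(w * Y) / np.sum(w)
    return m_hat

# illustration on simulated nonnegative covariates
rng = np.random.default_rng(2)
X = rng.gamma(2.0, 1.5, size=500)
Y = np.log1p(X) + rng.normal(scale=0.2, size=500)
grid = np.linspace(0.0, 8.0, 81)
print(gamma_kernel_regression(grid, X, Y, b=0.3)[:5])
```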

8.
We consider the estimation of the conditional quantile function when the covariates take values in some abstract function space. The main goal of this article is to establish the almost complete convergence and the asymptotic normality of the kernel estimator of the conditional quantile under an α-mixing assumption and concentration conditions on small balls for the probability measure of the functional regressors. Some applications and particular cases are studied. This approach can be applied in time series analysis to prediction and to the construction of confidence bands. We illustrate our methodology with El Niño data.

9.
Variance estimation for a low income proportion
Summary. Proportions below a given fraction of a quantile of an income distribution are often estimated from survey data in comparisons of poverty. We consider the estimation of the variance of such a proportion, estimated from Family Expenditure Survey data. We show how a linearization method of variance estimation may be applied to this proportion, allowing for the effects of both a complex sampling design and weighting by a raking method to population controls. We show that, for data for 1998–1999, the estimated variances are always increased when allowance is made for the design and raking weights, the principal effect arising from the design. We also study the properties of a simplified variance estimator and discuss extensions to a wider class of poverty measures.

10.
Using urban household survey data for Beijing, this study applies quantile regression and decomposition techniques, combined with kernel density estimation, to quantify the contributions of gender, age, years of education, work experience and other factors to urban household income inequality and its evolution. The results show that different income groups have not shared equally in the fruits of economic and social development; that urban household income inequality results from the joint action of a "returns effect" and a "covariate effect", whose relative importance differs across historical periods; that unequal access to education is a key driver of urban household income inequality; that differences in the nature of employment widen the urban household income gap; that differences in personal endowments across income groups are an important influence; and that the effects of job tenure, work experience, age, gender and household size cannot be ignored.
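A minimal sketch of the two building blocks mentioned above, quantile regression of log income on household characteristics and a kernel density estimate of the income distribution; the variable names, coefficients and simulated data are hypothetical stand-ins, not the Beijing survey data, and the decomposition step is not shown.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import gaussian_kde

# hypothetical household-level variables standing in for the survey data
rng = np.random.default_rng(3)
n = 2000
df = pd.DataFrame({
    "edu_years":  rng.integers(6, 20, n),
    "experience": rng.integers(0, 35, n),
    "female":     rng.integers(0, 2, n),
})
df["log_income"] = (8.0 + 0.08 * df["edu_years"] + 0.02 * df["experience"]
                    - 0.15 * df["female"] + rng.normal(0, 0.4, n))

# returns to education at different points of the income distribution
for q in (0.1, 0.5, 0.9):
    fit = smf.quantreg("log_income ~ edu_years + experience + female", df).fit(q=q)
    print(q, round(fit.params["edu_years"], 4))

# kernel density estimate of the (log) income distribution
kde = gaussian_kde(df["log_income"])
print(kde(np.array([8.5, 9.0, 9.5])))
```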

11.
Abstract

Using a model-assisted approach, this paper studies asymptotically design-unbiased (ADU) estimation of a population distribution function and extends this to derive an asymptotically approximately unbiased estimator of a population quantile from a sample drawn with varying probabilities. The corresponding asymptotic standard errors and confidence intervals are then worked out. Numerical findings based on actual data support the theory and show efficient results.
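A minimal sketch of design-weighted estimation of a distribution function and a quantile from a sample drawn with known inclusion probabilities; the Hájek-type ratio form below is an illustrative choice and is not the paper's exact ADU estimator.

```python
import numpy as np

def ht_distribution_function(t, y, pi):
    """Hajek-type (ratio) estimator of the population distribution function
    from a sample with inclusion probabilities pi."""
    d = 1.0 / np.asarray(pi, dtype=float)              # design weights
    return np.sum(d * (y <= t)) / np.sum(d)

def ht_quantile(p, y, pi):
    """Population quantile estimate obtained by inverting the estimated
    distribution function over the observed values."""
    ys = np.sort(np.unique(y))
    F = np.array([ht_distribution_function(t, y, pi) for t in ys])
    return ys[np.searchsorted(F, p)]

# illustration with size-related inclusion probabilities (assumed known)
rng = np.random.default_rng(4)
y = rng.exponential(2.0, size=300)
pi = np.clip(0.1 + 0.05 * y, 0.1, 0.9)
print(ht_quantile(0.5, y, pi))
```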

12.
Nonparametric density estimation in the presence of measurement error is considered. The usual kernel deconvolution estimator seeks to account for the contamination in the data by employing a modified kernel. In this paper a new approach based on a weighted kernel density estimator is proposed. Theoretical motivation is provided by the existence of a weight vector that perfectly counteracts the bias in density estimation without generating an excessive increase in variance. In practice a data driven method of weight selection is required. Our strategy is to minimize the discrepancy between a standard kernel estimate from the contaminated data on the one hand, and the convolution of the weighted deconvolution estimate with the measurement error density on the other hand. We consider a direct implementation of this approach, in which the weights are optimized subject to sum and non-negativity constraints, and a regularized version in which the objective function includes a ridge-type penalty. Numerical tests suggest that the weighted kernel estimation can lead to tangible improvements in performance over the usual kernel deconvolution estimator. Furthermore, weighted kernel estimates are free from the problem of negative estimation in the tails that can occur when using modified kernels. The weighted kernel approach generalizes to the case of multivariate deconvolution density estimation in a very straightforward manner.

13.
Several asymptotically equivalent quantile estimators have recently been proposed as alternatives to the conventional sample quantile. A variety of weight functions have been obtained either from subsampling considerations or from a kernel approach analogous to density estimation techniques. Focusing on the former approach, a unified treatment of quantile estimators derived by subsampling is developed. Closely related to the generalized Harrell-Davis (HD) and Kaigh-Lachenbruch (KL) estimators, a new statistic performed well in the Monte Carlo efficiency comparisons presented here. Moreover, the new estimator shares certain desirable computational and finite-sample theoretical properties with the KL estimator, yielding convenient component representations for tests of uniformity and goodness-of-fit criteria. Similar analytic treatment for the HD statistics and kernel quantile estimators, however, is precluded by intractable eigenvalue problems.
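A minimal sketch of one of the subsampling-based estimators the paper discusses, the Harrell-Davis estimator: a weighted average of all order statistics with weights given by increments of a Beta((n+1)p, (n+1)(1-p)) distribution function. The cross-check against SciPy's built-in implementation is included for illustration.

```python
import numpy as np
from scipy.stats import beta
from scipy.stats.mstats import hdquantiles

def harrell_davis(x, p):
    """Harrell-Davis quantile estimator at probability p."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    a, b = (n + 1) * p, (n + 1) * (1 - p)
    edges = np.arange(n + 1) / n
    w = beta.cdf(edges[1:], a, b) - beta.cdf(edges[:-1], a, b)
    return np.sum(w * x)

rng = np.random.default_rng(5)
x = rng.normal(size=200)
print(harrell_davis(x, 0.5), hdquantiles(x, prob=[0.5])[0])  # should agree
```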

14.
Econometric Reviews, 2012, 31(1), 1-26
Abstract

This paper proposes a nonparametric procedure for testing conditional quantile independence using projections. Relative to existing smoothed nonparametric tests, the resulting test statistic: (i) detects high-frequency local alternatives that converge to the null hypothesis in probability at a faster rate, and (ii) yields improvements in finite-sample power when a large number of variables are included under the alternative. In addition, it allows the researcher to include qualitative information and, if desired, to direct the test against specific subsets of alternatives without imposing any functional form on them. We use the weighted Nadaraya-Watson (WNW) estimator of the conditional quantile function, avoiding boundary problems in estimation and testing, and prove weak uniform consistency (with rate) of the WNW estimator for absolutely regular processes. The procedure is applied to a study of risk spillovers among banks. We show that the methodology generalizes some of the recently proposed measures of systemic risk, and we use the quantile framework to assess the intensity of risk spillovers among individual financial institutions.

15.
Abstract. Although generalized cross‐validation (GCV) has been frequently applied to select bandwidth when kernel methods are used to estimate non‐parametric mixed‐effect models in which non‐parametric mean functions are used to model covariate effects, and additive random effects are applied to account for overdispersion and correlation, the optimality of the GCV has not yet been explored. In this article, we construct a kernel estimator of the non‐parametric mean function. An equivalence between the kernel estimator and a weighted least square type estimator is provided, and the optimality of the GCV‐based bandwidth is investigated. The theoretical derivations also show that kernel‐based and spline‐based GCV give very similar asymptotic results. This provides us with a solid base to use kernel estimation for mixed‐effect models. Simulation studies are undertaken to investigate the empirical performance of the GCV. A real data example is analysed for illustration.
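A minimal sketch of GCV bandwidth selection for a kernel (local constant) smoother; this simplified version ignores the random-effects part of the mixed-effect model studied in the paper, and the simulated data are illustrative only.

```python
import numpy as np

def nw_smoother_matrix(x, h):
    """Nadaraya-Watson (local constant) hat matrix with a Gaussian kernel."""
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return K / K.sum(axis=1, keepdims=True)

def gcv(y, x, h):
    """Generalized cross-validation score for bandwidth h."""
    S = nw_smoother_matrix(x, h)
    resid = y - S @ y
    n = len(y)
    return np.mean(resid ** 2) / (1.0 - np.trace(S) / n) ** 2

rng = np.random.default_rng(6)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=200)
hs = np.linspace(0.01, 0.2, 40)
print(hs[int(np.argmin([gcv(y, x, h) for h in hs]))])   # GCV-selected bandwidth
```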

16.
EMPIRICAL LIKELIHOOD-BASED KERNEL DENSITY ESTIMATION
This paper considers the estimation of a probability density function when extra distributional information is available (e.g. the mean of the distribution is known, or the variance is a known function of the mean). The standard kernel method cannot exploit such extra information systematically as it uses an equal probability weight of 1/n at each data point. The paper suggests using empirical likelihood to choose the probability weights under constraints formulated from the extra distributional information. An empirical likelihood-based kernel density estimator is given by replacing 1/n with the empirical likelihood weights, and it has these advantages: it makes systematic use of the extra information, it is able to reflect the extra characteristics of the density function, and its variance is smaller than that of the standard kernel density estimator.
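A minimal sketch of the simplest case, a known mean constraint: the empirical likelihood weights replace 1/n in a Gaussian kernel estimator. The general constrained case and the bandwidth choice are not shown, and the root-finding bracket assumes the known mean lies strictly inside the data range.

```python
import numpy as np
from scipy.optimize import brentq

def el_weights(x, mu0):
    """Empirical likelihood weights under the constraint E[X] = mu0.

    Solves sum_i z_i / (1 + lam * z_i) = 0 with z_i = x_i - mu0, giving
    p_i = 1 / (n * (1 + lam * z_i)).  Requires mu0 inside the data range.
    """
    z = x - mu0
    lo = (-1.0 + 1e-10) / z.max()     # keep 1 + lam*z_i > 0 for all i
    hi = (-1.0 + 1e-10) / z.min()
    lam = brentq(lambda l: np.sum(z / (1.0 + l * z)), lo, hi)
    return 1.0 / (len(x) * (1.0 + lam * z))

def el_kde(grid, x, mu0, h):
    """Kernel density estimate with 1/n replaced by the EL weights."""
    p = el_weights(x, mu0)
    z = (grid[:, None] - x[None, :]) / h
    return (p[None, :] * np.exp(-0.5 * z ** 2) / (np.sqrt(2 * np.pi) * h)).sum(axis=1)

rng = np.random.default_rng(7)
x = rng.exponential(1.0, size=300)
grid = np.linspace(0, 5, 101)
f_hat = el_kde(grid, x, mu0=1.0, h=0.25)   # exploits the known mean of 1
```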

17.
Abstract

We consider statistical inference for additive partial linear models when the linear covariate is measured with error. A bias-corrected spline-backfitted kernel smoothing method is proposed. Under mild assumptions, the proposed component function and parameter estimators are oracle-efficient and fast to compute. The pointwise distribution of the nonparametric function estimator is asymptotically equivalent to that of a function estimator in a partial linear model. Finite-sample performance of the proposed estimators is assessed by simulation experiments. The proposed methods are applied to the Boston housing data set.

18.
We develop and study in the framework of Pareto-type distributions a class of nonparametric kernel estimators for the conditional second order tail parameter. The estimators are obtained by local estimation of the conditional second order parameter using a moving window approach. Asymptotic normality of the proposed class of kernel estimators is proven under some suitable conditions on the kernel function and the conditional tail quantile function. The nonparametric estimators for the second order parameter are subsequently used to obtain a class of bias-corrected kernel estimators for the conditional tail index. In particular it is shown how for a given kernel function one obtains a bias-corrected kernel function, and that replacing the second order parameter in the latter with a consistent estimator does not change the limiting distribution of the bias-corrected estimator for the conditional tail index. The finite sample behavior of some specific estimators is illustrated with a simulation experiment. The developed methodology is also illustrated on fire insurance claim data.

19.
Because the kernel density estimate of the household income density function is available only pointwise rather than as a closed-form function, the population size in a specific income interval cannot be computed by direct integration; therefore, building on the kernel density estimate, a bisection recursive algorithm is constructed to measure the size of specific income groups. Using micro-level data on rural per capita net income from the China Health and Nutrition Survey, the income distribution of Chinese rural residents is estimated by kernel density estimation, and the rural poverty incidence is then computed with the bisection recursive algorithm. The results show that, allowing for some differences in the micro data sources and content, the computed rural poverty incidence follows the trend published by the National Bureau of Statistics and is close to it in value. Hence, using the bisection recursive algorithm on a kernel density estimate is an effective way to compute the size of specific income groups.
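A rough sketch of the idea, assuming the bisection recursion works like adaptive quadrature: the income interval is halved until two simple approximations of the probability mass agree, and the pieces are summed. The stopping rule, the log-income KDE, and the poverty line value are illustrative assumptions, not the paper's exact algorithm or data.

```python
import numpy as np
from scipy.stats import gaussian_kde

def bisect_mass(f, a, b, tol=1e-6):
    """Recursive bisection quadrature of a density f over [a, b]:
    split the interval in half until the trapezoid and midpoint
    approximations of the mass agree, then accumulate."""
    mid = 0.5 * (a + b)
    trap = 0.5 * (b - a) * (f(a)[0] + f(b)[0])
    midp = (b - a) * f(mid)[0]
    if abs(trap - midp) < tol * max(abs(trap), 1.0):
        return 0.5 * (trap + midp)
    return bisect_mass(f, a, mid, tol / 2) + bisect_mass(f, mid, b, tol / 2)

# headcount ratio below a poverty line z, from a KDE of per-capita income
rng = np.random.default_rng(8)
income = rng.lognormal(mean=8.0, sigma=0.7, size=3000)   # simulated incomes
kde = gaussian_kde(np.log(income))                       # estimate on the log scale
z = 2300.0                                               # hypothetical poverty line
lower = np.log(income).min() - 1.0                       # ignores mass below this point
print(bisect_mass(kde, lower, np.log(z)))
```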

20.
Consider estimation of the population mean of a response variable when the observations are missing at random with respect to the covariate. Two common approaches to imputing the missing values are the nonparametric regression weighting method and the Horvitz-Thompson (HT) inverse weighting approach. The regression approach includes kernel regression imputation and nearest neighbor imputation. The HT approach, employing inverse kernel-estimated weights, includes the basic estimator, the ratio estimator and an estimator using inverse kernel-weighted residuals. Asymptotic normality of the nearest neighbor imputation estimators is derived and compared with that of the kernel regression imputation estimator under standard regularity conditions on the regression function and the missing-pattern function. A comprehensive simulation study shows that the basic HT estimator is most sensitive to discontinuity in the missing data patterns, whereas the nearest neighbor estimators can be insensitive to missing data patterns that are unbalanced with respect to the distribution of the covariate. Empirical studies show that the nearest neighbor imputation method is the most effective of these imputation methods for estimating a finite population mean and for classifying the species in the iris flower data.
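A minimal sketch of two of the estimators compared above, kernel regression imputation and the basic Horvitz-Thompson inverse weighting estimator with a kernel-estimated response propensity; the nearest neighbor variants and the ratio/residual HT estimators are omitted, and the missing-at-random mechanism below is simulated for illustration.

```python
import numpy as np

def gauss_kernel(u):
    return np.exp(-0.5 * u ** 2)

def kernel_regression_imputation(x, y, observed, h):
    """Impute missing y by a Nadaraya-Watson fit on the observed cases,
    then average observed and imputed values."""
    y_filled = y.copy()
    xo, yo = x[observed], y[observed]
    for i in np.flatnonzero(~observed):
        w = gauss_kernel((xo - x[i]) / h)
        y_filled[i] = np.sum(w * yo) / np.sum(w)
    return y_filled.mean()

def ht_inverse_weighting(x, y, observed, h):
    """Basic HT estimator: observed y weighted by 1 / p_hat(x), where the
    response propensity p(x) is estimated by kernel smoothing."""
    n = len(x)
    est = 0.0
    for i in np.flatnonzero(observed):
        w = gauss_kernel((x - x[i]) / h)
        p_hat = np.sum(w * observed) / np.sum(w)     # P(observed | x_i)
        est += y[i] / p_hat
    return est / n

rng = np.random.default_rng(9)
x = rng.uniform(0, 1, 1000)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=1000)
observed = rng.uniform(size=1000) < (0.3 + 0.6 * x)    # missing at random given x
print(kernel_regression_imputation(x, y, observed, h=0.05),
      ht_inverse_weighting(x, y, observed, h=0.05))
```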
