Similar Literature
20 similar documents found.
1.
On boundary correction in kernel density estimation   (total citations: 1; self-citations: 0; citations by others: 1)
It is well known now that kernel density estimators are not consistent when estimating a density near the finite end points of the support of the density to be estimated. This is due to boundary effects that occur in nonparametric curve estimation problems. A number of proposals have been made in the kernel density estimation context with some success. As yet there appears to be no single dominating solution that corrects the boundary problem for all shapes of densities. In this paper, we propose a new general method of boundary correction for univariate kernel density estimation. The proposed method generates a class of boundary corrected estimators. They all possess desirable properties such as local adaptivity and non-negativity. In simulations, the proposed method is observed to perform quite well compared with other existing methods in the literature for most shapes of densities, showing a very important robustness property of the method. The theory behind the new approach and the bias and variance of the proposed estimators are given. Results of a data analysis are also given.

2.
The kernel method of estimation of curves is now popular and widely used in statistical applications. Kernel estimators suffer from boundary effects, however, when the support of the function to be estimated has finite endpoints. Several solutions to this problem have already been proposed. Here the authors develop a new method of boundary correction for kernel density estimation. Their technique is a kind of generalized reflection involving transformed data. It generates a class of boundary corrected estimators having desirable properties such as local smoothness and nonnegativity. Simulations show that the proposed method performs quite well when compared with the existing methods for almost all shapes of densities. The authors present the theory behind this new methodology, and they determine the bias and variance of their estimators.
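The classical reflection device that this generalized-reflection method builds on can be sketched in a few lines. The following is a minimal illustration only, assuming a density supported on [0, ∞) and a Gaussian kernel; it implements plain reflection about the boundary, not the transformed-data generalization proposed by the authors.

import numpy as np

def reflection_kde(x, data, h):
    """Reflection-corrected KDE for a density supported on [0, inf).

    Each observation X_i is augmented with its mirror image -X_i, a
    Gaussian KDE is computed on the augmented sample, and the estimate
    is renormalized so that it integrates to one on [0, inf).
    """
    x = np.atleast_1d(x)
    data = np.asarray(data)
    augmented = np.concatenate([data, -data])          # reflect about the boundary 0
    u = (x[:, None] - augmented[None, :]) / h
    kern = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)    # Gaussian kernel
    est = kern.sum(axis=1) / (len(data) * h)           # divide by n, not 2n
    return np.where(x >= 0, est, 0.0)

# Example: exponential data, estimates at and near the boundary
rng = np.random.default_rng(0)
sample = rng.exponential(size=500)
print(reflection_kde([0.0, 0.5, 1.0], sample, h=0.3))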

3.
A partially balanced nested row-column design, referred to as PBNRC, is defined as an arrangement of v treatments in b p × q blocks for which, with the convention that p ≤ q, the information matrix for the estimation of treatment parameters is equal to that of the column component design, which is itself a partially balanced incomplete block design. In this paper, previously known optimal incomplete block designs, and row-column and nested row-column designs, are utilized to develop some methods of constructing optimal PBNRC designs. In particular, it is shown that an optimal group divisible PBNRC design for v = mn ≥ kn treatments in p × q blocks can be constructed whenever a balanced incomplete block design for m treatments in blocks of size k each and a group divisible PBNRC design for kn treatments in p × q blocks exist. A simple sufficient condition is given under which a group divisible PBNRC design is Ψ_f-better for all f > 0 than the corresponding balanced nested row-column designs having binary blocks. It is also shown that the construction techniques developed particularly for group divisible designs can be generalized to obtain PBNRC designs based on rectangular association schemes.

4.
Simple boundary correction for kernel density estimation   (total citations: 8; self-citations: 0; citations by others: 8)
If a probability density function has bounded support, kernel density estimates often overspill the boundaries and are consequently especially biased at and near these edges. In this paper, we consider the alleviation of this boundary problem. A simple unified framework is provided which covers a number of straightforward methods and allows for their comparison: generalized jackknifing generates a variety of simple boundary kernel formulae. A well-known method of Rice (1984) is a special case. A popular linear correction method is another: it has close connections with the boundary properties of local linear fitting (Fan and Gijbels, 1992). Links with the optimal boundary kernels of Müller (1991) are investigated. Novel boundary kernels involving kernel derivatives and generalized reflection arise too. In comparisons, various generalized jackknifing methods perform rather similarly, so this, together with its existing popularity, makes linear correction as good a method as any. In an as yet unsuccessful attempt to improve on generalized jackknifing, a variety of alternative approaches are considered. A further contribution is to consider generalized jackknife boundary correction for density derivative estimation. En route to all this, a natural analogue of local polynomial regression for density estimation is defined and discussed.
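The "popular linear correction method" mentioned in this abstract has a compact closed form that is worth recording. The sketch below is a hedged illustration only, assuming a density supported on [0, ∞) and the Epanechnikov kernel: in the boundary region the kernel K(u) is replaced by K_c(u) = (a_2(c) − a_1(c) u) K(u) / (a_0(c) a_2(c) − a_1(c)^2), where a_l(c) = ∫_{−1}^{c} u^l K(u) du and c = x/h. It is not an implementation of the paper's generalized jackknifing framework.

import numpy as np

def epan(u):
    """Epanechnikov kernel on [-1, 1]."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

def trunc_moments(c):
    """Closed-form truncated moments a_l(c) = int_{-1}^{c} u^l K(u) du, l = 0, 1, 2."""
    a0 = 0.5 + 0.75 * c - 0.25 * c**3
    a1 = 0.375 * c**2 - 0.1875 * c**4 - 0.1875
    a2 = 0.25 * c**3 - 0.15 * c**5 + 0.1
    return a0, a1, a2

def linear_boundary_kde(x, data, h):
    """Linearly boundary-corrected KDE for a density supported on [0, inf).

    In the interior (x >= h) the corrected kernel reduces to the plain
    Epanechnikov kernel; in the boundary region (0 <= x < h) it becomes
    K_c(u) = (a2 - a1*u) * K(u) / (a0*a2 - a1**2) with c = x / h, which
    restores a zero first moment over the truncated support.
    """
    x = np.atleast_1d(x)
    data = np.asarray(data)
    out = np.empty_like(x, dtype=float)
    for j, xj in enumerate(x):
        c = min(xj / h, 1.0)            # relative distance to the boundary
        u = (xj - data) / h             # all kernel arguments are <= c
        a0, a1, a2 = trunc_moments(c)
        kc = (a2 - a1 * u) * epan(u) / (a0 * a2 - a1**2)
        out[j] = kc.sum() / (len(data) * h)
    return out

rng = np.random.default_rng(1)
sample = rng.exponential(size=1000)
print(linear_boundary_kde([0.0, 0.1, 1.0], sample, h=0.4))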

5.
Kernel smoothing methods are widely used in many research areas in statistics. However, kernel estimators suffer from boundary effects when the support of the function to be estimated has finite endpoints. Boundary effects seriously affect the overall performance of the estimator. In this article, we propose a new method of boundary correction for univariate kernel density estimation. Our technique is based on a data transformation that depends on the point of estimation. The proposed method possesses desirable properties such as local adaptivity and non-negativity. Furthermore, unlike many other transformation methods available, the proposed estimator is easy to implement. In a Monte Carlo study, the accuracy of the proposed estimator is numerically analyzed and compared with the existing methods of boundary correction. We find that it performs well for most shapes of densities. The theory behind the new methodology, along with the bias and variance of the proposed estimator, is presented. Results of a data analysis are also given.

6.
Kernel-based density estimation algorithms are inefficient in the presence of discontinuities at support endpoints. This is largely due to the fact that classic kernel density estimators lead to positive estimates beyond the endpoints. If a nonparametric estimate of a density functional is required in determining the bandwidth, then the problem also affects the bandwidth selection procedure. In this paper, algorithms for bandwidth selection and kernel density estimation are proposed for non-negative random variables. Furthermore, the methods we propose are compared with some of the principal solutions in the literature through a simulation study.

7.
The commonly used survey technique of clustering introduces dependence into sample data. Such data are frequently used in economic analysis, though the dependence induced by the sample structure is often ignored. In this paper, the effect of clustering on the non-parametric kernel estimate of the density, f(x), is examined. The window width commonly used for density estimation with i.i.d. data is shown to be no longer optimal. A new optimal bandwidth using a higher-order kernel is proposed and is shown to give a smaller integrated mean squared error than two window widths that are widely used in the i.i.d. case. Several illustrations from simulation are provided.
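For readers unfamiliar with the term, a "higher-order kernel" of the kind used above can be built from a second-order one by a standard polynomial adjustment. The sketch below shows the usual fourth-order Gaussian kernel K4(u) = 0.5 (3 − u^2) φ(u) in a plain KDE; it is a generic illustration and does not reproduce the clustering-adjusted optimal bandwidth derived in the paper.

import numpy as np

def gauss4_kde(x, data, h):
    """KDE with the fourth-order Gaussian kernel K4(u) = 0.5 * (3 - u^2) * phi(u).

    Higher-order kernels reduce the asymptotic bias (here to O(h^4)) at the
    cost of possibly negative estimates, and they change the optimal
    bandwidth rate, which is the quantity studied for clustered data above.
    """
    x = np.atleast_1d(x)
    data = np.asarray(data)
    u = (x[:, None] - data[None, :]) / h
    phi = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    k4 = 0.5 * (3.0 - u**2) * phi
    return k4.sum(axis=1) / (len(data) * h)

rng = np.random.default_rng(2)
sample = rng.normal(size=800)
print(gauss4_kde([-1.0, 0.0, 1.0], sample, h=0.5))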

8.
The paper considers estimation of the boundary of an elliptical domain when the data without a measurement error are distributed uniformly on this domain but are superimposed by random errors. The problem is solved in two phases. In the first phase the domain is subdivided into thin slices and the endpoints of these slices are estimated within the framework of a corresponding one-dimensional problem. In the second phase the estimated endpoints are used to estimate the boundary using the total least-squares curve fitting procedure.

9.
A local orthogonal polynomial expansion (LOrPE) of the empirical density function is proposed as a novel method to estimate the underlying density. The estimate is constructed by matching localised expectation values of orthogonal polynomials to the values observed in the sample. LOrPE is related to several existing methods, and generalises straightforwardly to multivariate settings. By its manner of construction, it is similar to local likelihood density estimation (LLDE). In the limit of small bandwidths, LOrPE functions as kernel density estimation (KDE) with high-order (effective) kernels that are inherently free of boundary bias, a natural consequence of kernel reshaping to accommodate endpoints. Consistency and faster asymptotic convergence rates follow. In the limit of large bandwidths, LOrPE is equivalent to orthogonal series density estimation (OSDE) with Legendre polynomials, thereby inheriting its consistency. We compare the performance of LOrPE to KDE, LLDE, and OSDE in a number of simulation studies. In terms of mean integrated squared error, the results suggest that with a proper balance of the two tuning parameters, bandwidth and degree, LOrPE generally outperforms these competitors when estimating densities with sharply truncated supports.
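The large-bandwidth limit referred to above, orthogonal series density estimation with Legendre polynomials, is simple to write down and gives a feel for one end of the LOrPE spectrum. The sketch below is a hedged illustration only, assuming data rescaled to [−1, 1]; it is the plain OSDE, not LOrPE itself.

import numpy as np
from numpy.polynomial import legendre

def osde_legendre(x, data, degree):
    """Orthogonal-series density estimate on [-1, 1] with Legendre polynomials.

    Uses the orthonormal basis phi_k(t) = sqrt((2k + 1) / 2) * P_k(t); each
    coefficient c_k is estimated by the sample mean of phi_k(X_i), and the
    estimate is the truncated series sum_k c_k * phi_k(x).
    """
    x = np.atleast_1d(x)
    data = np.asarray(data)
    scale = np.sqrt((2 * np.arange(degree + 1) + 1) / 2.0)
    coeffs = (legendre.legvander(data, degree) * scale).mean(axis=0)   # c_k estimates
    return (legendre.legvander(x, degree) * scale) @ coeffs

rng = np.random.default_rng(3)
sample = 2 * rng.beta(2, 5, size=1000) - 1    # data rescaled to [-1, 1]
print(osde_legendre([-0.9, 0.0, 0.9], sample, degree=8))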

10.
Whereas there are many references on univariate boundary kernels, the construction of boundary kernels for multivariate density and curve estimation has not been investigated in detail. The use of multivariate boundary kernels ensures global consistency of multivariate kernel estimates as measured by the integrated mean-squared error or sup-norm deviation for functions with compact support. We develop a class of boundary kernels which work for any support, regardless of the complexity of its boundary. Our construction yields a boundary kernel for each point in the boundary region where the function is to be estimated. These boundary kernels provide a natural continuation of non-negative kernels used in the interior onto the boundary. They are obtained as solutions of the same kernel-generating variational problem which also produces the kernel function used in the interior as its solution. We discuss the numerical implementation of the proposed boundary kernels and their relationship to locally weighted least squares. Along the way we establish a continuous least squares principle and a continuous analogue of the Gauss–Markov theorem.

11.
Beta-Bernstein Smoothing for Regression Curves with Compact Support   (total citations: 5; self-citations: 0; citations by others: 5)
ABSTRACT. The problem of boundary bias is associated with kernel estimation for regression curves with compact support. This paper proposes a simple and unified approach for remedying boundary bias in non-parametric regression, without dividing the compact support into interior and boundary areas and without explicitly applying different smoothing treatments separately. The approach uses the beta family of density functions as kernels. The shapes of the kernels vary according to the position where the curve estimate is made. They are symmetric at the middle of the support interval, and become more and more asymmetric nearer the boundary points. The kernels never put any weight outside the data support interval, and thus avoid boundary bias. The method is a generalization of classical Bernstein polynomials, one of the earliest methods of statistical smoothing. The proposed estimator has optimal mean integrated squared error of order n^{-4/5}, equivalent to that of standard kernel estimators when the curve has unbounded support.
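The beta-kernel construction described above can be illustrated with a Nadaraya–Watson regression whose weights are beta densities whose shape shifts with the estimation point. The sketch below is a generic, hedged illustration using a Chen-type parametrisation beta(x/b + 1, (1 − x)/b + 1) on [0, 1]; it is not the paper's exact estimator and omits its boundary analysis.

import numpy as np
from scipy.stats import beta

def beta_kernel_regression(x, X, Y, b):
    """Nadaraya-Watson regression on [0, 1] with beta-density kernels.

    At estimation point x, the weights are the beta(x/b + 1, (1 - x)/b + 1)
    density evaluated at the design points X: symmetric near the middle of
    [0, 1], increasingly asymmetric near the endpoints, and always zero
    outside the support, which is what removes the boundary bias.
    """
    x = np.atleast_1d(x)
    out = np.empty_like(x, dtype=float)
    for j, xj in enumerate(x):
        w = beta.pdf(X, xj / b + 1.0, (1.0 - xj) / b + 1.0)
        out[j] = np.sum(w * Y) / np.sum(w)
    return out

rng = np.random.default_rng(4)
X = rng.uniform(size=400)
Y = np.sin(2 * np.pi * X) + 0.2 * rng.normal(size=400)
print(beta_kernel_regression([0.0, 0.5, 1.0], X, Y, b=0.05))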

12.
Abstract. Given n independent and identically distributed observations in a set G = {(x, y) ∈ [0, 1]^p × R : 0 ≤ y ≤ g(x)} with an unknown function g, called a boundary or frontier, it is desired to estimate g from the observations. The problem has several important applications including classification and cluster analysis, and is closely related to edge estimation in image reconstruction. The convex-hull estimator of a boundary or frontier is also very popular in econometrics, where it is a cornerstone of a method known as 'data envelopment analysis'. In this paper, we give a large sample approximation of the distribution of the convex-hull estimator in the general case where p ≥ 1. We discuss ways of using the large sample approximation to correct the bias of the convex-hull and the DEA estimators and to construct confidence intervals for the true function.

13.
The problem of density estimation arises naturally in many contexts. In this paper, we consider the approach using a piecewise constant function to approximate the underlying density. We present a new density estimation method via the random forest method based on the Bayesian Sequential Partition (BSP) of Lu, Jiang, and Wong (2013, Multivariate density estimation by Bayesian Sequential Partitioning, Journal of the American Statistical Association 108(504): 1402–10). Extensive simulations are carried out with comparison to the kernel density estimation method, the BSP method, and four local kernel density estimation methods. The experimental results show that the new method is capable of providing accurate and reliable density estimation, even at the boundary, especially for i.i.d. data. In addition, the likelihood of the out-of-bag density estimation, which is a byproduct of the training process, is an effective hyperparameter selection criterion.

14.
Kernel density classification and boosting: an L2 analysis   (total citations: 1; self-citations: 0; citations by others: 1)
Kernel density estimation is a commonly used approach to classification. However, most of the theoretical results for kernel methods apply to estimation per se and not necessarily to classification. In this paper we show that when estimating the difference between two densities, the optimal smoothing parameters are increasing functions of the sample size of the complementary group, and we provide a small simulation study which examines the relative performance of kernel density methods when the final goal is classification. A relative newcomer to the classification portfolio is boosting, and this paper proposes an algorithm for boosting kernel density classifiers. We note that boosting is closely linked to a previously proposed method of bias reduction in kernel density estimation and indicate how it will enjoy similar properties for classification. We show that boosting kernel classifiers reduces the bias whilst only slightly increasing the variance, with an overall reduction in error. Numerical examples and simulations are used to illustrate the findings, and we also suggest further areas of research.
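The baseline being boosted here, a plain kernel density classifier, is easy to state: estimate one density per class and assign a point to the class with the larger prior-weighted density estimate. The sketch below is a minimal two-class version using scipy's Gaussian KDE with its default bandwidth; the boosting algorithm and the group-size-dependent smoothing analysed in the paper are not implemented.

import numpy as np
from scipy.stats import gaussian_kde

def kde_classify(x, class0, class1):
    """Two-class kernel density classifier.

    Fits a Gaussian KDE to each class and assigns x to the class whose
    prior-weighted density estimate is larger (priors taken as the
    empirical class proportions).
    """
    x = np.atleast_1d(x)
    n0, n1 = len(class0), len(class1)
    f0, f1 = gaussian_kde(class0), gaussian_kde(class1)
    score0 = (n0 / (n0 + n1)) * f0(x)
    score1 = (n1 / (n0 + n1)) * f1(x)
    return (score1 > score0).astype(int)   # 1 if class 1 wins, else 0

rng = np.random.default_rng(5)
c0 = rng.normal(-1.0, 1.0, size=300)
c1 = rng.normal(+1.0, 1.0, size=300)
print(kde_classify([-2.0, 0.0, 2.0], c0, c1))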

15.
16.
This paper addresses the problem of probability density estimation in the presence of covariates when data are missing at random (MAR). The inverse probability weighted method is used to define nonparametric and semiparametric weighted probability density estimators. A regression calibration technique is also used to define an imputed estimator. It is shown that all the estimators are asymptotically normal with the same asymptotic variance as that of the inverse probability weighted estimator with known selection probability function and weights. Also, we establish mean squared error (MSE) bounds and obtain the MSE convergence rates. A simulation is carried out to assess the proposed estimators in terms of bias and standard error.
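The inverse probability weighted construction takes a simple form once selection probabilities are available. The sketch below is an assumption-laden illustration with a Gaussian kernel: the selection probabilities are treated as known (in practice they would be estimated, which is the semiparametric case studied in the paper), and the covariate W and selection model used in the toy example are purely hypothetical.

import numpy as np

def ipw_kde(x, X, delta, pi_hat, h):
    """Inverse-probability-weighted KDE under missing-at-random data.

    X       : outcomes, with arbitrary placeholder values where delta == 0
    delta   : 1 if X_i is observed, 0 if missing
    pi_hat  : selection probabilities P(delta_i = 1 | covariates)
    Each observed point is up-weighted by 1 / pi_hat so that the weighted
    kernel sum mimics the complete-data kernel sum.
    """
    x = np.atleast_1d(x)
    w = delta / pi_hat                                   # zero for missing cases
    u = (x[:, None] - X[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return (k * w[None, :]).sum(axis=1) / (len(X) * h)

# Toy example with a hypothetical, known selection mechanism
rng = np.random.default_rng(6)
W = rng.uniform(size=1000)                               # always-observed covariate
X = rng.normal(loc=W, size=1000)
pi = 0.3 + 0.6 * W                                       # selection probability
delta = (rng.uniform(size=1000) < pi).astype(float)
X = np.where(delta == 1, X, 0.0)                         # placeholders where missing
print(ipw_kde([0.0, 0.5, 1.0], X, delta, pi, h=0.3))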

17.
Assume that in independent two-dimensional random vectors (X_1, θ_1), …, (X_n, θ_n), each θ_i is distributed according to some unknown prior density function g. Also, given θ_i = θ, X_i has the conditional density function q(x − θ), x, θ ∈ (−∞, ∞) (a location parameter case), or θ^{-1} q(x/θ), x, θ ∈ (0, ∞) (a scale parameter case). In each pair the first component is observable, but the second is not. After the (n+1)th pair (X_{n+1}, θ_{n+1}) is obtained, the objective is to construct an empirical Bayes (EB) estimator of θ. In this paper we derive the EB estimators of θ based on a wavelet approximation with Meyer-type wavelets. We show that these estimators provide adaptation not only in the case when g belongs to a Sobolev space with an unknown smoothness index, but also when g is supersmooth.

18.
The standard approach to non-parametric bivariate density estimation is to use a kernel density estimator. Practical performance of this estimator is hindered by the fact that the estimator is not adaptive (in the sense that the level of smoothing is not sensitive to local properties of the density). In this paper a simple, automatic and adaptive bivariate density estimator is proposed based on the estimation of marginal and conditional densities. Asymptotic properties of the estimator are examined, and guidance to practical application of the method is given. Application to two examples illustrates the usefulness of the estimator as an exploratory tool, particularly in situations where the local behaviour of the density varies widely. The proposed estimator is also appropriate for use as a pilot estimate for an adaptive kernel estimate, since it is relatively inexpensive to calculate.
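The decomposition that underlies this proposal, f(x, y) = f_X(x) · f_{Y|X}(y | x), can be sketched with generic kernel building blocks. The code below is a fixed-bandwidth illustration of that product form only; with fixed bandwidths it collapses to an ordinary product-kernel bivariate KDE, whereas the paper's contribution is to choose the two smoothing levels adaptively.

import numpy as np

def gauss(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def marginal_times_conditional(x, y, X, Y, hx, hy):
    """Bivariate density estimate built as f_X(x) * f_{Y|X}(y | x).

    The marginal f_X is an ordinary univariate KDE; the conditional
    f_{Y|X} is a kernel-weighted KDE of Y with weights centred at x.
    With fixed bandwidths this is the usual product-kernel bivariate
    KDE; adaptive choices of hx and hy are what this sketch omits.
    """
    kx = gauss((x - X) / hx)
    f_x = kx.sum() / (len(X) * hx)                          # marginal estimate
    f_y_given_x = (kx * gauss((y - Y) / hy)).sum() / (hy * kx.sum())
    return f_x * f_y_given_x

rng = np.random.default_rng(7)
X = rng.normal(size=1000)
Y = 0.5 * X + rng.normal(scale=0.8, size=1000)
print(marginal_times_conditional(0.0, 0.0, X, Y, hx=0.3, hy=0.3))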

19.
We consider the problem of making inferences about extreme values from a sample. The underlying model distribution is the generalized extreme-value (GEV) distribution, and our interest is in estimating the parameters and quantiles of the distribution robustly. In doing this we find estimates for the GEV parameters based on that part of the data which is well fitted by a GEV distribution. The robust procedure will assign weights between 0 and 1 to each data point. A weight near 0 indicates that the data point is not well modelled by the GEV distribution which fits the points with weights at or near 1. On the basis of these weights we are able to assess the validity of a GEV model for our data. It is important that the observations with low weights be carefully assessed to determine whether they are valid observations or not. If they are, we must examine whether our data could be generated by a mixture of GEV distributions or whether some other process is involved in generating the data. This process will require careful consideration of the subject matter area which led to the data. The robust estimation techniques are based on optimal B-robust estimates. Their performance is compared to the probability-weighted moment estimates of Hosking et al. (1985) in both simulated and real data.

20.
Summary. We consider the problem of estimating the proportion of true null hypotheses, π_0, in a multiple-hypothesis set-up. The tests are based on observed p-values. We first review published estimators based on the estimator that was suggested by Schweder and Spjøtvoll. Then we derive new estimators based on nonparametric maximum likelihood estimation of the p-value density, restricting to decreasing and convex decreasing densities. The estimators of π_0 are all derived under the assumption of independent test statistics. Their performance under dependence is investigated in a simulation study. We find that the estimators are relatively robust with respect to the assumption of independence and work well also for test statistics with moderate dependence.
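The Schweder–Spjøtvoll estimator that these published estimators build on has a one-line form: for a threshold λ in (0, 1), π̂_0(λ) = #{p_i > λ} / ((1 − λ) n). The sketch below is a minimal illustration of that baseline only; the nonparametric maximum likelihood estimators under decreasing and convex decreasing p-value densities are not reproduced.

import numpy as np

def schweder_spjotvoll_pi0(pvals, lam=0.5):
    """Schweder-Spjotvoll estimate of the proportion of true null hypotheses.

    Under the null, p-values are uniform on (0, 1), so the fraction of
    p-values above lam, rescaled by (1 - lam), estimates pi_0
    (conservatively, since some non-nulls also exceed lam).
    """
    pvals = np.asarray(pvals)
    return np.mean(pvals > lam) / (1.0 - lam)

# Toy example: 80% true nulls (uniform p-values), 20% non-nulls (small p-values)
rng = np.random.default_rng(8)
p = np.concatenate([rng.uniform(size=800), rng.beta(0.2, 5.0, size=200)])
print(schweder_spjotvoll_pi0(p, lam=0.5))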

