Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
This paper considers the problem of selecting optimal bandwidths for variable (sample-point adaptive) kernel density estimation. A data-driven variable bandwidth selector is proposed, based on the idea of approximating the log-bandwidth function by a cubic spline. This cubic spline is optimized with respect to a cross-validation criterion. The proposed method can be interpreted as a selector for either integrated squared error (ISE) or mean integrated squared error (MISE) optimal bandwidths. This leads to reflection upon some of the differences between ISE and MISE as error criteria for variable kernel estimation. Results from simulation studies indicate that the proposed method outperforms a fixed kernel estimator (in terms of ISE) when the target density has a combination of sharp modes and regions of smooth undulation. Moreover, some detailed data analyses suggest that the gains in ISE may understate the improvements in visual appeal obtained using the proposed variable kernel estimator. These numerical studies also show that the proposed estimator outperforms existing variable kernel density estimators implemented using piecewise constant bandwidth functions.
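
As a rough illustration of what "sample-point adaptive" means in the abstract above, the following minimal sketch evaluates a kernel density estimate in which each observation carries its own bandwidth. The Gaussian kernel, the function names and the two-level bandwidth rule are assumptions made for the example; the paper's cubic-spline log-bandwidth function and its cross-validation criterion are not reproduced here.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def variable_kde(x_grid, samples, bandwidths):
    """Sample-point adaptive KDE: f(x) = (1/n) * sum_i K((x - X_i) / h_i) / h_i."""
    x_grid = np.asarray(x_grid, dtype=float)
    samples = np.asarray(samples, dtype=float)
    # A scalar bandwidth is broadcast to all samples, giving the fixed-kernel estimator.
    h = np.broadcast_to(np.asarray(bandwidths, dtype=float), samples.shape)
    u = (x_grid[:, None] - samples[None, :]) / h[None, :]
    return np.mean(gaussian_kernel(u) / h[None, :], axis=1)

rng = np.random.default_rng(0)
# A sharp mode near 0 combined with a smooth, wide component near 3.
samples = np.concatenate([rng.normal(0.0, 0.1, 100), rng.normal(3.0, 1.0, 100)])
# Hypothetical bandwidth rule: small bandwidths near the sharp mode, larger elsewhere.
bandwidths = np.where(samples < 1.0, 0.05, 0.5)
grid = np.linspace(-6.0, 10.0, 4001)
density = variable_kde(grid, samples, bandwidths)
```

Passing a single number as `bandwidths` recovers the fixed-bandwidth estimator, which makes it easy to compare the two on the same grid.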

2.
In this paper we study the ideal variable bandwidth kernel density estimator introduced by McKay (1993a, b) and Jones et al. (1994) and the plug-in practical version of the variable bandwidth kernel estimator with two sequences of bandwidths as in Giné and Sang (2013). Based on the bias and variance analysis of the ideal and plug-in variable bandwidth kernel density estimators, we study the central limit theorems for each of them. The simulation study confirms the central limit theorem and demonstrates the advantage of the plug-in variable bandwidth kernel method over the classical kernel method.

3.
This paper demonstrates that cross-validation (CV) and Bayesian adaptive bandwidth selection can be applied in the estimation of associated kernel discrete functions. This idea was originally proposed by Brewer [A Bayesian model for local smoothing in kernel density estimation, Stat. Comput. 10 (2000), pp. 299–309] to derive variable bandwidths in adaptive kernel density estimation. Our approach considers the adaptive binomial kernel estimator and treats the variable bandwidths as parameters with a beta prior distribution. The best variable bandwidth selector is estimated by the posterior mean in the Bayesian sense under squared error loss. Monte Carlo simulations are conducted to examine the performance of the proposed Bayesian adaptive approach in comparison with the performance of the asymptotic mean integrated squared error estimator and the CV technique for selecting a global (fixed) bandwidth proposed in Kokonendji and Senga Kiessé [Discrete associated kernels method and extensions, Stat. Methodol. 8 (2011), pp. 497–516]. The Bayesian adaptive bandwidth estimator performs better than the global bandwidth, in particular for small and moderate sample sizes.

4.
Research in the area of bandwidth selection was an active topic in the 1980s and 1990s; however, recently there has been little research in the area. We re-opened this investigation and have found a new method for estimating mean integrated squared error for kernel density estimators. We provide an overview of other methods to obtain optimal bandwidths and offer a comparison of these methods via a simulation study. In certain situations, our method of estimating an optimal bandwidth yields a smaller MISE than competing methods to compute bandwidths. This procedure is illustrated by an application to two data sets.

5.
In the context of estimating local modes of a conditional density based on kernel density estimators, we show that existing bandwidth selection methods developed for kernel density estimation are unsuitable for mode estimation. We propose two methods to select bandwidths tailored for mode estimation in the regression setting. Numerical studies using synthetic data and a real-life dataset are carried out to demonstrate the performance of the proposed methods in comparison with several well-received bandwidth selection methods for density estimation.

6.
In kernel density estimation, a criticism of bandwidth selection techniques which minimize squared error expressions is that they perform poorly when estimating tails of probability density functions. Techniques minimizing absolute error expressions are thought to result in more uniform performance and be potentially superior. An asymptotic mean absolute error expression for nonparametric kernel density estimators from right-censored data is developed here. This expression is used to obtain local and global bandwidths that are optimal in the sense that they minimize asymptotic mean absolute error and integrated asymptotic mean absolute error, respectively. These estimators are illustrated for eight data sets from known distributions. Computer simulation results are discussed, comparing the estimation methods with squared-error-based bandwidth selection for right-censored data.

7.
Integrated squared density derivatives are important to the plug-in type of bandwidth selector for kernel density estimation. Conventional estimators of these quantities are inefficient when there is a non-smooth boundary in the support of the density. We introduce estimators that utilize density derivative estimators obtained from local polynomial fitting. They retain the rates of convergence in mean-squared error that are familiar from non-boundary cases, and the constant coefficients have similar forms. The estimators and the formula for their asymptotically optimal bandwidths, which depend on integrated products of density derivatives, are applied to automatic bandwidth selection for local linear density estimation. Simulation studies show that the constructed bandwidth rule and the Sheather–Jones bandwidth are competitive in non-boundary cases, but the former overcomes boundary problems whereas the latter does not.

8.
M. C. Jones. Statistics, 2013, 47(1-2): 65-71
Two types of non-global bandwidth, which may be called local and variable, have been defined in attempts to improve the performance of kernel density estimators. In nonparametric regression, local linear fitting has become a method of much popularity. It is natural, therefore, to consider the use of non-global bandwidths in the local linear context, and indeed local bandwidths are often used. In this paper, it is observed that a natural proposal in the literature for combining variable bandwidths with local linear fitting fails in the sense that the resulting mean squared error properties are those normally associated with local rather than variable bandwidths. We are able to understand why this happens in terms of weightings that are involved. We also attempt to investigate how the bias reduction expected of well-chosen variable bandwidths might be achieved in conjunction with local linear fitting.

9.
This paper studies bandwidth selection for kernel estimation of derivatives of multidimensional conditional densities, a non-parametric realm unexplored in the literature. This paper extends Baird [Cross validation bandwidth selection for derivatives of multidimensional densities. RAND Working Paper series, WR-1060; 2014] in its examination of conditional multivariate densities, derives and presents criteria for arbitrary kernel order and density dimension, shows consistency of the estimators, and investigates a minimization criterion which jointly estimates numerator and denominator bandwidths. I conduct a Monte Carlo simulation study for various orders of kernels in the Gaussian family and compare the new cross validation criterion with those implied by Baird [Cross validation bandwidth selection for derivatives of multidimensional densities. RAND Working Paper series, WR-1060; 2014]. The paper finds that higher order kernels become increasingly important as the dimension of the distribution increases. I find that the cross validation criterion developed in this paper that jointly estimates the derivative of the joint density (numerator) and the marginal density (denominator) does orders of magnitude better than criteria that estimate the bandwidths separately. I further find that using the infinite order Dirichlet kernel tends to have the best results.

10.
A great deal of research has focused on improving the bias properties of kernel estimators. One proposal involves removing the restriction of non-negativity on the kernel to construct “higher-order” kernels that eliminate additional terms in the Taylor series expansion of the bias. This paper considers an alternative that uses a local approach to bandwidth selection not only to reduce the bias, but to eliminate it entirely. These so-called “zero-bias bandwidths” are shown to exist for univariate and multivariate kernel density estimation as well as kernel regression. Implications of the existence of such bandwidths are discussed. An estimation strategy is presented, and the extent of the reduction or elimination of bias in practice is studied through simulation and example.

11.
We propose a modification to the regular kernel density estimation method that uses asymmetric kernels to circumvent the spillover problem for densities with positive support. First a pivoting method is introduced for placement of the data relative to the kernel function. This yields a strongly consistent density estimator that integrates to one for each fixed bandwidth, in contrast to most density estimators based on asymmetric kernels proposed in the literature. Then a data-driven Bayesian local bandwidth selection method is presented and lognormal, gamma, Weibull and inverse Gaussian kernels are discussed as useful special cases. Simulation results and a real-data example illustrate the advantages of the new methodology.

12.
It is well established that bandwidths exist that can yield an unbiased non-parametric kernel density estimate at points in particular regions (e.g. convex regions) of the underlying density. These zero-bias bandwidths have superior theoretical properties, including a 1/n convergence rate of the mean squared error. However, the explicit functional form of the zero-bias bandwidth has remained elusive. It is difficult to estimate these bandwidths and virtually impossible to achieve the higher-order rate in practice. This paper addresses these issues by taking a fundamentally different approach to the asymptotics of the kernel density estimator to derive a functional approximation to the zero-bias bandwidth. It develops a simple approximation algorithm that focuses on estimating these zero-bias bandwidths in the tails of densities where the convexity conditions favourable to the existence of the zero-bias bandwidths are more natural. The estimated bandwidths yield density estimates with mean squared error that is $O(n^{-4/5})$, the same rate as the mean squared error of density estimates with other choices of local bandwidths. Simulation studies and an illustrative example with air pollution data show that these estimated zero-bias bandwidths outperform other global and local bandwidth estimators in estimating points in the tails of densities.

13.
Kernel-based density estimation algorithms are inefficient in the presence of discontinuities at support endpoints. This is substantially due to the fact that classic kernel density estimators lead to positive estimates beyond the endpoints. If a nonparametric estimate of a density functional is required in determining the bandwidth, then the problem also affects the bandwidth selection procedure. In this paper algorithms for bandwidth selection and kernel density estimation are proposed for non-negative random variables. Furthermore, the methods we propose are compared with some of the principal solutions in the literature through a simulation study.

14.
Multivariate associated kernel estimators, which depend on both target point and bandwidth matrix, are appropriate for distributions with partially or totally bounded supports and generalize the classical ones such as the Gaussian. Previous studies on multivariate associated kernels have been restricted to products of univariate associated kernels, which can be regarded as having diagonal bandwidth matrices. However, it has been shown in classical cases that, for certain forms of target density such as multimodal ones, the use of full bandwidth matrices offers the potential for significantly improved density estimation. In this paper, general associated kernel estimators with correlation structure are introduced. Asymptotic properties of these estimators are presented; in particular, the boundary bias is investigated. Generalized bivariate beta kernels are treated in more detail. The associated kernel with a correlation structure is built with a variant of the mode-dispersion method and two families of bandwidth matrices are discussed using the least squares cross-validation method. Simulation studies are conducted. In the particular situation of bivariate beta kernels, a very good performance of associated kernel estimators with correlation structure is observed compared to the diagonal case. Finally, an illustration on a real dataset of paired rates in a framework of political elections is presented.

15.
Strategies for improving fixed non-negative kernel estimators have focused on reducing the bias, either by employing higher-order kernels or by adjusting the bandwidth locally. Intuitively, bandwidths in the tails should be relatively larger in order to reduce wiggles since there is less data available in the tails. We show that in regions where the density function is convex, it is theoretically possible to find local bandwidths such that the pointwise bias is exactly zero. The corresponding pointwise mean squared error converges at the parametric rate of $O(n^{-1})$ rather than the slower $O(n^{-4/5})$. These so-called zero-bias bandwidths are constant and are usually orders of magnitude larger than the optimal locally adaptive bandwidths predicted by asymptotic mean squared error analysis. We describe data-based algorithms for estimating zero-bias bandwidths over intervals where the density is convex. We find that our particular density estimator attains the usual $O(n^{-4/5})$ rate. However, we demonstrate that the algorithms can provide significant improvement in mean squared error, often clearly visually superior curves, and a new operating point in the usual bias-variance tradeoff.
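
One informal way to see why a zero-bias bandwidth can exist at a point $x$ where the density is convex is sketched below. The notation is an assumption introduced for this illustration and is not taken from the abstract: $\hat f_h$ is the kernel estimator with bandwidth $h$, $K$ is a bounded, continuous, symmetric, non-negative kernel with finite second moment $\mu_2(K)$, and $f(x) > 0$ with $f''(x) > 0$.

```latex
% Pointwise bias as a function of the bandwidth h
\operatorname{Bias}(x; h) = \mathbb{E}\,\hat f_h(x) - f(x)
  = \int K(u)\,\bigl[f(x - hu) - f(x)\bigr]\,du .

% Small h: the leading Taylor term is positive where f is convex
\operatorname{Bias}(x; h) \approx \tfrac{1}{2}\,h^{2}\,\mu_2(K)\,f''(x) > 0 .

% Large h: the expected estimate is flattened towards zero
\mathbb{E}\,\hat f_h(x) = \int \frac{1}{h}\,K\!\left(\frac{x - y}{h}\right) f(y)\,dy
  \le \frac{\sup K}{h} \longrightarrow 0
  \quad\Longrightarrow\quad
  \operatorname{Bias}(x; h) \longrightarrow -f(x) < 0 .
```

Under these assumptions the bias is continuous in $h$, positive for small $h$ and negative for large $h$, so it crosses zero at some finite bandwidth. That bandwidth does not shrink with the sample size, which is consistent with the abstract's remark that zero-bias bandwidths are much larger than the usual asymptotically optimal ones.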

16.
A data-driven bandwidth choice for a kernel density estimator called critical bandwidth is investigated. This procedure allows the estimate to have as many modes as are assumed for the density to be estimated. Both Gaussian and uniform kernels are considered. For the Gaussian kernel, asymptotic results are given. For the uniform kernel, an argument against these properties is mentioned. These theoretical results are illustrated with a simulation study that compares the kernel estimators that rely on critical bandwidth with another one that uses a plug-in method to select its bandwidth. An estimator that consists of estimates of density contour clusters and takes assumptions on the number of modes into account is also considered. Finally, the methodology is illustrated using environmental monitoring data.
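
The critical bandwidth for k modes is usually understood as the smallest bandwidth at which the estimate has at most k modes. The sketch below approximates it for a Gaussian kernel by bisection, relying on the known fact that the number of modes of a Gaussian kernel estimate does not increase as the bandwidth grows. The function names, the evaluation grid and the tolerance are assumptions for the example, and counting modes on a finite grid is itself only an approximation.

```python
import numpy as np

def make_grid(samples, grid_size=512):
    """Evaluation grid extending one data range beyond each extreme observation."""
    spread = samples.max() - samples.min()
    return np.linspace(samples.min() - spread, samples.max() + spread, grid_size)

def gaussian_kde(grid, samples, h):
    """Fixed-bandwidth Gaussian kernel density estimate evaluated on grid."""
    u = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(samples) * h * np.sqrt(2.0 * np.pi))

def count_modes(density):
    """Count grid points where the slope switches from positive to non-positive."""
    d = np.diff(density)
    return int(np.sum((d[:-1] > 0) & (d[1:] <= 0)))

def critical_bandwidth(samples, k, grid_size=512, tol=1e-3):
    """Approximate the smallest h for which the Gaussian KDE has at most k modes."""
    samples = np.asarray(samples, dtype=float)
    grid = make_grid(samples, grid_size)
    spread = samples.max() - samples.min()
    lo, hi = 1e-3 * spread, spread
    if count_modes(gaussian_kde(grid, samples, lo)) <= k:
        return lo
    # Enlarge the upper end until it is verified to give at most k modes.
    while count_modes(gaussian_kde(grid, samples, hi)) > k:
        hi *= 2.0
    # Bisection: hi always satisfies the mode constraint, lo never does.
    while hi - lo > tol * spread:
        mid = 0.5 * (lo + hi)
        if count_modes(gaussian_kde(grid, samples, mid)) <= k:
            hi = mid
        else:
            lo = mid
    return hi

rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(2.0, 0.5, 150)])
h_one_mode = critical_bandwidth(samples, k=1)
h_two_modes = critical_bandwidth(samples, k=2)
```

For this clearly bimodal sample, forcing a single mode requires much heavier smoothing than allowing two, so `h_one_mode` comes out larger than `h_two_modes`.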

17.
In this note we discuss two-step kernel estimation of varying coefficient regression models that have a common smoothing variable. The method allows one to use different bandwidths for different coefficient functions. We consider local polynomial fitting and present explicit formulas for the asymptotic biases and variances of the estimators.

18.
Discrete associated kernels method and extensions   Cited by: 1 in total (self-citations: 0, citations by others: 1)
Discrete kernel estimation of a probability mass function (p.m.f.), often mentioned in the literature, has been far less investigated in comparison with continuous kernel estimation of a probability density function (p.d.f.). In this paper, we are concerned with a general methodology of discrete kernels for smoothing a p.m.f. f. We give a basic set of mathematical tools for further investigations. First, we point out a generalizable notion of discrete associated kernel which is defined at each point of the support of f and built from any parametric discrete probability distribution. Then, some properties of the corresponding estimators are shown, in particular pointwise and global (asymptotic) properties. Other discrete kernels are constructed from usual discrete probability distributions such as Poisson, binomial and negative binomial. For small sample sizes, underdispersed discrete kernel estimators are more interesting than the empirical estimator; thus, the importance of discrete kernels is illustrated. The choice of smoothing bandwidth is classically investigated according to cross-validation and, as a novelty, to excess-of-zeros methods. Finally, a way of unifying this method for a general probability function is discussed.
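
To make the idea of a discrete associated kernel concrete, the sketch below smooths count data with a binomial kernel. It assumes the commonly cited parametrization in which the kernel at target point x is the Binomial(x + 1, (x + h) / (x + 1)) distribution with bandwidth h in (0, 1]; that form, the function names and the toy data are assumptions for illustration and should be checked against the paper before use.

```python
from math import comb

def binomial_kernel(x, h, y):
    """P(Y = y) for Y ~ Binomial(x + 1, (x + h) / (x + 1)), the assumed kernel at target x."""
    n_trials = x + 1
    if y < 0 or y > n_trials:
        return 0.0
    p = (x + h) / (x + 1)
    return comb(n_trials, y) * p**y * (1.0 - p) ** (n_trials - y)

def discrete_kernel_pmf(x, data, h):
    """Smoothed p.m.f. estimate at x: the average of the kernel at x evaluated at each observation."""
    return sum(binomial_kernel(x, h, xi) for xi in data) / len(data)

counts = [0, 1, 1, 2, 2, 2, 3, 5]  # toy count data, invented for the example
estimate = [discrete_kernel_pmf(x, counts, h=0.2) for x in range(8)]
```

An estimate built this way need not sum exactly to one over the support, so in practice it may be renormalized before being interpreted as a p.m.f.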

19.
The spatially inhomogeneous smoothness of the non-parametric density or regression function to be estimated by non-parametric methods is often modelled by Besov- and Triebel-type smoothness constraints. For such problems, Donoho and Johnstone [D.L. Donoho and I.M. Johnstone, Minimax estimation via wavelet shrinkage. Ann. Stat. 26 (1998), pp. 879–921.], Delyon and Juditsky [B. Delyon and A. Juditsky, On minimax wavelet estimators, Appl. Comput. Harmon. Anal. 3 (1996), pp. 215–228.] studied minimax rates of convergence for wavelet estimators with thresholding, while Lepski et al. [O.V. Lepski, E. Mammen, and V.G. Spokoiny, Optimal spatial adaptation to inhomogeneous smoothness: an approach based on kernel estimators with variable bandwidth selectors, Ann. Stat. 25 (1997), pp. 929–947.] proposed a variable bandwidth selection for kernel estimators that achieved optimal rates over Besov classes. However, a second challenge in many real applications of non-parametric curve estimation is that the function must be positive. Here, we show how to construct estimators under positivity constraints that satisfy these constraints and also achieve minimax rates over the appropriate smoothness class.

20.
A new procedure is proposed for deriving variable bandwidths in univariate kernel density estimation, based upon likelihood cross-validation and an analysis of a Bayesian graphical model. The procedure admits bandwidth selection which is flexible in terms of the amount of smoothing required. In addition, the basic model can be extended to incorporate local smoothing of the density estimate. The method is shown to perform well in both theoretical and practical situations, and we compare our method with those of Abramson (The Annals of Statistics 10: 1217–1223) and Sain and Scott (Journal of the American Statistical Association 91: 1525–1534). In particular, we note that in certain cases, the Sain and Scott method performs poorly even with relatively large sample sizes. We compare various bandwidth selection methods using standard mean integrated square error criteria to assess the quality of the density estimates. We study situations where the underlying density is assumed both known and unknown, and note that in practice, our method performs well when sample sizes are small. In addition, we also apply the methods to real data, and again we believe our methods perform at least as well as existing methods.
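
Likelihood cross-validation, one ingredient named in the abstract above, can be illustrated in its simplest form: choose a single global bandwidth that maximizes the leave-one-out log-likelihood. The sketch below assumes a Gaussian kernel and a hypothetical candidate grid; it does not implement the Bayesian graphical model or the variable bandwidths of the paper.

```python
import numpy as np

def loo_log_likelihood(samples, h):
    """Sum over i of log f_{-i}(X_i), where f_{-i} is the Gaussian KDE built without X_i."""
    x = np.asarray(samples, dtype=float)
    n = len(x)
    u = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2) / (h * np.sqrt(2.0 * np.pi))
    np.fill_diagonal(k, 0.0)  # leave each observation out of its own estimate
    loo_density = k.sum(axis=1) / (n - 1)
    with np.errstate(divide="ignore"):  # an isolated point may give log(0) = -inf
        return float(np.sum(np.log(loo_density)))

def select_bandwidth(samples, candidates):
    """Return the candidate bandwidth with the largest leave-one-out log-likelihood."""
    scores = [loo_log_likelihood(samples, h) for h in candidates]
    return float(candidates[int(np.argmax(scores))])

rng = np.random.default_rng(2)
samples = rng.normal(0.0, 1.0, 200)
candidates = np.geomspace(0.05, 2.0, 40)  # hypothetical search grid
h_lcv = select_bandwidth(samples, candidates)
```

Leaving out the diagonal is what prevents the criterion from degenerating: without it, the likelihood would increase without bound as the bandwidth shrinks towards zero.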

