Similar articles
 Found 20 similar articles (search time: 843 ms)
1.
This paper considers the problem of selecting optimal bandwidths for variable (sample‐point adaptive) kernel density estimation. A data‐driven variable bandwidth selector is proposed, based on the idea of approximating the log‐bandwidth function by a cubic spline. This cubic spline is optimized with respect to a cross‐validation criterion. The proposed method can be interpreted as a selector for either integrated squared error (ISE) or mean integrated squared error (MISE) optimal bandwidths. This leads to reflection upon some of the differences between ISE and MISE as error criteria for variable kernel estimation. Results from simulation studies indicate that the proposed method outperforms a fixed kernel estimator (in terms of ISE) when the target density has a combination of sharp modes and regions of smooth undulation. Moreover, some detailed data analyses suggest that the gains in ISE may understate the improvements in visual appeal obtained using the proposed variable kernel estimator. These numerical studies also show that the proposed estimator outperforms existing variable kernel density estimators implemented using piecewise constant bandwidth functions.
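To make the term "sample‐point adaptive" concrete, the sketch below evaluates a Gaussian kernel density estimate in which every data point carries its own bandwidth. It does not reproduce the paper's cubic‐spline, cross‐validated selector. The function name `variable_kde` and the bandwidth rule used in the demonstration are assumptions made purely for illustration.

```python
import numpy as np

def variable_kde(x_eval, data, bandwidths):
    """Sample-point adaptive Gaussian KDE: data point X_i uses its own bandwidth h_i."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    data = np.asarray(data, dtype=float)
    # A scalar bandwidth is broadcast, which recovers the fixed-bandwidth estimator.
    h = np.broadcast_to(np.asarray(bandwidths, dtype=float), data.shape)
    # z[j, i] = (x_j - X_i) / h_i
    z = (x_eval[:, None] - data[None, :]) / h[None, :]
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * h[None, :])
    return kernels.mean(axis=1)

rng = np.random.default_rng(0)
sample = rng.normal(size=200)
grid = np.linspace(-6.0, 6.0, 1201)
# Hypothetical bandwidth rule (not from the paper): wider kernels away from the median.
h_i = 0.3 * (1.0 + np.abs(sample - np.median(sample)))
density = variable_kde(grid, sample, h_i)
```

Because each kernel is a proper density, the estimate still integrates to one regardless of how the individual bandwidths are chosen; a data‐driven selector such as the one proposed in the abstract would replace the hypothetical rule for `h_i`.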

2.
In this paper, we propose a robust bandwidth selection method for local M-estimates used in nonparametric regression. We study the asymptotic behavior of the resulting estimates. We use the results of a Monte Carlo study to compare the performance of various competitors for moderate sample sizes. It appears that the robust plug-in bandwidth selector we propose compares favorably to its competitors, despite the need to select a pilot bandwidth. The Monte Carlo study shows that the robust plug-in bandwidth selector is very stable and relatively insensitive to the choice of the pilot.

3.
This paper focuses on bivariate kernel density estimation that bridges the gap between univariate and multivariate applications. We propose a subsampling-extrapolation bandwidth matrix selector that improves the reliability of the conventional cross-validation method. The proposed procedure combines a U-statistic expression of the mean integrated squared error and asymptotic theory, and can be used in both cases of diagonal bandwidth matrix and unconstrained bandwidth matrix. In the subsampling stage, one takes advantage of the reduced variability of estimating the bandwidth matrix at a smaller subsample size m (m < n); in the extrapolation stage, a simple linear extrapolation is used to remove the incurred bias. Simulation studies reveal that the proposed method reduces the variability of the cross-validation method by about 50% and achieves an expected integrated squared error that is up to 30% smaller than that of the benchmark cross-validation. It shows comparable or improved performance compared to other competitors across six distributions in terms of the expected integrated squared error. We prove that the components of the selected bivariate bandwidth matrix have an asymptotic multivariate normal distribution, and also present the relative rate of convergence of the proposed bandwidth selector.

4.
Abstract. The performance of multivariate kernel density estimates depends crucially on the choice of bandwidth matrix, but progress towards developing good bandwidth matrix selectors has been relatively slow. In particular, previous studies of cross-validation (CV) methods have been restricted to biased and unbiased CV selection of diagonal bandwidth matrices. However, for certain types of target density the use of full (i.e. unconstrained) bandwidth matrices offers the potential for significantly improved density estimation. In this paper, we generalize earlier work from diagonal to full bandwidth matrices, and develop a smooth cross-validation (SCV) methodology for multivariate data. We consider optimization of the SCV technique with respect to a pilot bandwidth matrix. All the CV methods are studied using asymptotic analysis, simulation experiments and real data analysis. The results suggest that SCV for full bandwidth matrices is the most reliable of the CV methods. We also observe that experience from the univariate setting can sometimes be a misleading guide for understanding bandwidth selection in the multivariate case.
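As a rough illustration of what a cross‐validation criterion for a full bandwidth matrix looks like, the sketch below evaluates the standard unbiased (least‐squares) CV objective for a Gaussian kernel. It covers only the unbiased CV variant mentioned in the abstract, not the SCV method the paper develops. The function names `gaussian_density` and `ucv_criterion` and the example matrices are assumptions for illustration.

```python
import numpy as np

def gaussian_density(diffs, cov):
    """Zero-mean multivariate normal density with covariance `cov`, evaluated at `diffs`."""
    d = diffs.shape[-1]
    inv = np.linalg.inv(cov)
    quad = np.einsum("...i,ij,...j->...", diffs, inv, diffs)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def ucv_criterion(data, H):
    """Unbiased CV objective for a Gaussian-kernel KDE with bandwidth matrix H.

    UCV(H) = integral of fhat^2  -  2 * mean of leave-one-out estimates fhat_{-i}(X_i).
    For Gaussian kernels the integral has the closed form
    n^-2 * sum_{i,j} phi_{2H}(X_i - X_j).
    """
    data = np.asarray(data, dtype=float)
    H = np.asarray(H, dtype=float)
    n = data.shape[0]
    diffs = data[:, None, :] - data[None, :, :]          # shape (n, n, d)
    integral_term = gaussian_density(diffs, 2.0 * H).sum() / n**2
    k = gaussian_density(diffs, H)
    off_diagonal = k.sum() - np.trace(k)                 # exclude i == j
    return integral_term - 2.0 * off_diagonal / (n * (n - 1))

rng = np.random.default_rng(0)
cov_true = np.array([[1.0, 0.8], [0.8, 1.0]])
sample = rng.multivariate_normal([0.0, 0.0], cov_true, size=150)
H_diag = np.diag([0.2, 0.2])                             # diagonal candidate
H_full = 0.2 * cov_true                                  # full candidate aligned with the data
scores = {"diagonal": ucv_criterion(sample, H_diag), "full": ucv_criterion(sample, H_full)}
```

In practice the criterion would be minimized over all symmetric positive‐definite matrices rather than compared at two hand‐picked candidates; the two matrices here only show how diagonal and full parametrizations enter the same objective.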

5.
The choice of the bandwidth is a crucial issue for kernel density estimation. Among all the data-dependent methods for choosing the bandwidth, the direct plug-in method has shown a particularly good performance in practice. This procedure is based on estimating an asymptotic approximation of the optimal bandwidth, using two “pilot” kernel estimation stages. Although two pilot stages seem to be enough for most densities, for a long time the problem of how to choose an appropriate number of stages has remained open. Here we propose an automatic (i.e., data-based) method for choosing the number of stages to be employed in the plug-in bandwidth selector. Asymptotic properties of the method are presented and an extensive simulation study is carried out to compare its small-sample performance with that of the most recommended bandwidth selectors in the literature.

6.
Abstract. The problem of choosing the bandwidth h for kernel density estimation is considered. All the plug-in-type bandwidth selection methods require the use of a pilot bandwidth g. The usual way to make an h-dependent choice of g is by obtaining their asymptotic expressions separately and solving the two equations. In contrast, we obtain the asymptotically optimal value of g for every fixed h, thus making our selection 'less asymptotic'. Exact error expressions show that some usually assumed hypotheses have to be discarded in the asymptotic study in this case. Two versions of a new bandwidth selector based on this idea are proposed, and their properties are analysed through theoretical results and a simulation study.

7.
Non‐parametric estimation and bootstrap techniques play an important role in many areas of Statistics. In the point process context, kernel intensity estimation has been limited to exploratory analysis because of its inconsistency, and some consistent alternatives have been proposed. Furthermore, most authors have considered kernel intensity estimators with scalar bandwidths, which can be very restrictive. This work focuses on a consistent kernel intensity estimator with unconstrained bandwidth matrix. We propose a smooth bootstrap for inhomogeneous spatial point processes. The consistency of the bootstrap mean integrated squared error (MISE) as an estimator of the MISE of the consistent kernel intensity estimator proves the validity of the resampling procedure. Finally, we propose a plug‐in bandwidth selection procedure based on the bootstrap MISE and compare its performance with several methods currently used, through both a simulation study and an application to the spatial pattern of wildfires registered in Galicia (Spain) during 2006.

8.
The Nadaraya–Watson estimator is among the most studied nonparametric regression methods. A classical result is that its convergence rate depends on the number of covariates and deteriorates quickly as the dimension grows. This underscores the “curse of dimensionality” and has limited its use in high‐dimensional settings. In this paper, however, we show that the Nadaraya–Watson estimator has an oracle property such that when the true regression function is single‐ or multi‐index, it discovers the low‐rank dependence structure between the response and the covariates, mitigating the curse of dimensionality. Specifically, we prove that, using K‐fold cross‐validation and a positive‐semidefinite bandwidth matrix, the Nadaraya–Watson estimator has a convergence rate that depends on the number of indices rather than on the number of covariates. This result follows by allowing the bandwidths to diverge to infinity rather than restricting them all to converge to zero at certain rates, as in previous theoretical studies.
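The role of diverging bandwidths can be seen in a small sketch of the Nadaraya–Watson estimator with K‐fold cross‐validation. For simplicity it assumes a Gaussian product kernel with one bandwidth per covariate, which is a special (diagonal) case and not the general positive‐semidefinite bandwidth matrix studied in the paper. The function names and the simulated data are illustrative assumptions.

```python
import numpy as np

def nadaraya_watson(x_eval, x_train, y_train, bandwidths):
    """Nadaraya-Watson fit with a Gaussian product kernel.

    x_eval and x_train are (n, d) arrays; bandwidths holds one value per covariate.
    """
    x_eval = np.atleast_2d(np.asarray(x_eval, dtype=float))
    x_train = np.atleast_2d(np.asarray(x_train, dtype=float))
    y_train = np.asarray(y_train, dtype=float)
    h = np.asarray(bandwidths, dtype=float)
    diff = (x_eval[:, None, :] - x_train[None, :, :]) / h
    log_w = -0.5 * np.sum(diff**2, axis=2)
    # Subtract the row maximum so at least one weight equals 1 (avoids 0/0 underflow).
    log_w -= log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    return (w @ y_train) / w.sum(axis=1)

def kfold_cv_error(x, y, bandwidths, n_folds=5, seed=0):
    """Mean squared prediction error of the estimator under K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    squared_error = 0.0
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = nadaraya_watson(x[test_idx], x[train_mask], y[train_mask], bandwidths)
        squared_error += np.sum((y[test_idx] - pred) ** 2)
    return squared_error / len(y)

rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=(300, 2))
# Simulated single-index-style example: only the first covariate matters.
y = np.sin(2.0 * x[:, 0]) + 0.1 * rng.normal(size=300)
cv_small = kfold_cv_error(x, y, [0.2, 0.2])
cv_large_second = kfold_cv_error(x, y, [0.2, 1e3])
```

A very large bandwidth on a covariate makes its kernel factor nearly constant, so that covariate effectively drops out of the weights; comparing `cv_small` with `cv_large_second` shows how cross‐validation can register this. When every bandwidth is very large the fit collapses to the sample mean of the responses.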

9.
Abstract. We consider the properties of the local polynomial estimators of a counting process intensity function and its derivatives. By expressing the local polynomial estimators in a kernel smoothing form via effective kernels, we show that the bias and variance of the estimators at boundary points are of the same magnitude as at interior points and therefore the local polynomial estimators in the context of intensity estimation also enjoy the automatic boundary correction property as they do in other contexts such as regression. The asymptotically optimal bandwidths and optimal kernel functions are obtained through the asymptotic expressions of the mean square error of the estimators. For practical purposes, we suggest an effective and easy‐to‐calculate data‐driven bandwidth selector. Simulation studies are carried out to assess the performance of the local polynomial estimators and the proposed bandwidth selector. The estimators and the bandwidth selector are applied to estimate the rate of aftershocks of the Sichuan earthquake and the rate of the Personal Emergency Link calls in Hong Kong.

10.
A bandwidth selection method based on the Linex discrepancy is proposed for kernel smoothing of the periodogram. The selection minimizes the Linex discrepancy between the smoothed and true spectra. Two estimators are introduced for the Linex discrepancy. The resulting bandwidth choice outperforms some common bandwidth choices.

11.
Bandwidth selection is an important problem of kernel density estimation. Traditional simple and quick bandwidth selectors usually oversmooth the density estimate. Existing sophisticated selectors usually have computational difficulties and occasionally do not exist. Besides, they may not be robust against outliers in the sample data, and some are highly variable, tending to undersmooth the density. In this paper, a highly robust simple and quick bandwidth selector is proposed, which adapts to different types of densities.
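One widely used example of a "simple and quick" selector is Silverman's rule of thumb, sketched below for a Gaussian kernel. The abstract does not say which traditional selectors it has in mind, so this choice is an assumption; the function name `silverman_bandwidth` is likewise illustrative and the paper's own robust selector is not reproduced here.

```python
import numpy as np

def silverman_bandwidth(data):
    """Silverman's rule of thumb: h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    sd = data.std(ddof=1)                       # sample standard deviation
    q75, q25 = np.percentile(data, [75, 25])    # interquartile range endpoints
    scale = min(sd, (q75 - q25) / 1.34)
    return 0.9 * scale * n ** (-0.2)

rng = np.random.default_rng(0)
# Bimodal sample: the overall spread is dominated by the distance between the modes.
bimodal = np.concatenate([rng.normal(-3.0, 0.5, 200), rng.normal(3.0, 0.5, 200)])
h_rule = silverman_bandwidth(bimodal)
```

The rule is calibrated to a normal reference density, so on a sample like `bimodal` the scale estimate reflects the separation of the modes rather than their individual widths, which is the usual explanation for the oversmoothing the abstract describes.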

12.
Matching estimators and optimal bandwidth choice   Total citations: 1, self-citations: 0, citations by others: 1
Optimal bandwidth choice for matching estimators and their finite sample properties are examined. An approximation to their MSE is derived, as a basis for a plug-in bandwidth selector. In small samples, this approximation is not very accurate, though. Alternatively, conventional cross-validation bandwidth selection is considered and performs rather well in simulation studies: Compared to standard pair-matching, kernel and ridge matching achieve reductions in MSE of about 25 to 40%. Local linear matching and weighting perform poorly. Furthermore, the scope for developing better bandwidth selectors seems to be limited for ridge matching, but non-negligible for kernel and local linear matching.

13.
Abstract. Although generalized cross‐validation (GCV) has been frequently applied to select bandwidth when kernel methods are used to estimate non‐parametric mixed‐effect models in which non‐parametric mean functions are used to model covariate effects, and additive random effects are applied to account for overdispersion and correlation, the optimality of the GCV has not yet been explored. In this article, we construct a kernel estimator of the non‐parametric mean function. An equivalence between the kernel estimator and a weighted least squares type estimator is provided, and the optimality of the GCV‐based bandwidth is investigated. The theoretical derivations also show that kernel‐based and spline‐based GCV give very similar asymptotic results. This provides us with a solid base to use kernel estimation for mixed‐effect models. Simulation studies are undertaken to investigate the empirical performance of the GCV. A real data example is analysed for illustration.

14.
A new, fully data-driven bandwidth selector with a double smoothing (DS) bias term and a data-driven variance estimator is developed following the bootstrap idea. The data-driven variance estimation does not involve any additional bandwidth selection. The proposed bandwidth selector converges faster than a plug-in one due to the DS bias estimate, whereas the data-driven variance clearly improves its finite sample performance and makes it stable. Asymptotic results of the proposals are obtained. A comparative simulation study was done to show the overall gains and the gains obtained by improving either the bias term or the variance estimate, respectively. It is shown that the use of a good variance estimator is more important when the sample size is relatively small.

15.
Multivariate associated kernel estimators, which depend on both target point and bandwidth matrix, are appropriate for distributions with partially or totally bounded supports and generalize the classical ones such as the Gaussian. Previous studies on multivariate associated kernels have been restricted to products of univariate associated kernels, which can also be regarded as having diagonal bandwidth matrices. However, it has been shown in classical cases that, for certain forms of target density such as multimodal ones, the use of full bandwidth matrices offers the potential for significantly improved density estimation. In this paper, general associated kernel estimators with correlation structure are introduced. Asymptotic properties of these estimators are presented; in particular, the boundary bias is investigated. Generalized bivariate beta kernels are handled in more detail. The associated kernel with a correlation structure is built with a variant of the mode-dispersion method and two families of bandwidth matrices are discussed using the least squares cross-validation method. Simulation studies are done. In the particular situation of bivariate beta kernels, a very good performance of associated kernel estimators with correlation structure is observed compared to the diagonal case. Finally, an illustration on a real dataset of paired rates in a framework of political elections is presented.

16.
The penalized spline is a popular method for function estimation when the assumption of “smoothness” is valid. In this paper, methods for estimation and inference are proposed using penalized splines under additional constraints of shape, such as monotonicity or convexity. The constrained penalized spline estimator is shown to have the same convergence rates as the corresponding unconstrained penalized spline, although in practice the squared error loss is typically smaller for the constrained versions. The penalty parameter may be chosen with generalized cross‐validation, which also provides a method for determining if the shape restrictions hold. The method is not a formal hypothesis test, but is shown to have nice large‐sample properties, and simulations show that it compares well with existing tests for monotonicity. Extensions to the partial linear model, the generalized regression model, and the varying coefficient model are given, and examples demonstrate the utility of the methods. The Canadian Journal of Statistics 40: 190–206; 2012 © 2012 Statistical Society of Canada

17.
ABSTRACT

Kernel estimation is a popular approach to estimation of the pair correlation function which is a fundamental spatial point process characteristic. Least squares cross validation was suggested by Guan [A least-squares cross-validation bandwidth selection approach in pair correlation function estimations. Statist Probab Lett. 2007;77(18):1722–1729] as a data-driven approach to select the kernel bandwidth. The method can, however, be computationally demanding for large point pattern data sets. We suggest a modified least squares cross validation approach that is asymptotically equivalent to the one proposed by Guan but is computationally much faster.

18.
Kernel Density Estimation on a Linear Network   Total citations: 1, self-citations: 0, citations by others: 1
This paper develops a statistically principled approach to kernel density estimation on a network of lines, such as a road network. Existing heuristic techniques are reviewed, and their weaknesses are identified. The correct analogue of the Gaussian kernel is the ‘heat kernel’, the occupation density of Brownian motion on the network. The corresponding kernel estimator satisfies the classical time‐dependent heat equation on the network. This ‘diffusion estimator’ has good statistical properties that follow from the heat equation. It is mathematically similar to an existing heuristic technique, in that both can be expressed as sums over paths in the network. However, the diffusion estimate is an infinite sum, which cannot be evaluated using existing algorithms. Instead, the diffusion estimate can be computed rapidly by numerically solving the time‐dependent heat equation on the network. This also enables bandwidth selection using cross‐validation. The diffusion estimate with automatically selected bandwidth is demonstrated on road accident data.

19.
When spatial data are correlated, currently available data‐driven smoothing parameter selection methods for nonparametric regression will often fail to provide useful results. The authors propose a method that adjusts the generalized cross‐validation criterion for the effect of spatial correlation in the case of bivariate local polynomial regression. Their approach uses a pilot fit to the data and the estimation of a parametric covariance model. The method is easy to implement and leads to improved smoothing parameter selection, even when the covariance model is misspecified. The methodology is illustrated using water chemistry data collected in a survey of lakes in the Northeastern United States.

20.
Methods for smoothed isotonic or convex regression are useful in many applications. Sometimes the shape assumptions constitute a priori knowledge about the regression function, but often the shape is part of the research question. The authors propose tests for monotonicity and convexity using constrained and unconstrained regression splines. The tests have good large‐sample properties and the small‐sample behaviour is illustrated through simulations. Extensions to the partial linear model and the generalized regression model are presented. The Canadian Journal of Statistics 39: 89–107; 2011 © 2011 Statistical Society of Canada
