Similar Articles (20 results)
1.
We investigate the asymptotic behaviour of binned kernel density estimators for dependent and locally non-stationary random fields converging to stationary random fields. We focus on the study of the bias and the asymptotic normality of the estimators. A simulation experiment shows that the kernel density estimator and the binned kernel density estimator behave similarly, and that both estimate the true density accurately as the number of fields increases. We apply our results to the 2002 incidence rates of tuberculosis in the departments of France.
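To make the binning idea concrete, here is a minimal one-dimensional sketch (my own simplification; the paper works with spatial random fields): each observation's mass is spread over a fixed grid by linear binning, and the Gaussian kernel smoother is then applied to the grid counts rather than the raw sample.

```python
import numpy as np

def linear_bin_counts(x, grid):
    """Split each observation's unit mass between its two neighbouring
    grid points in proportion to proximity (linear binning)."""
    delta = grid[1] - grid[0]
    pos = (x - grid[0]) / delta
    lo = np.clip(np.floor(pos).astype(int), 0, len(grid) - 2)
    w = np.clip(pos - lo, 0.0, 1.0)
    counts = np.zeros(len(grid))
    np.add.at(counts, lo, 1.0 - w)
    np.add.at(counts, lo + 1, w)
    return counts

def binned_kde(x, grid, h):
    """Gaussian-kernel density estimate evaluated on `grid`, computed
    from the binned counts instead of the raw sample."""
    counts = linear_bin_counts(x, grid)
    u = (grid[:, None] - grid[None, :]) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return (K @ counts) / (len(x) * h)

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
grid = np.linspace(-4, 4, 401)
fhat = binned_kde(x, grid, h=0.3)
```

Because the kernel weights depend only on grid-point differences, the matrix product above can be replaced by a fast convolution, which is what makes binned estimators attractive for large samples.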

2.
We propose kernel density estimators based on prebinned data. We use generalized binning schemes based on the quantile points of a certain auxiliary distribution function; the uniform distribution corresponds to usual binning. The statistical accuracy of the resulting kernel estimators is studied, i.e. we derive mean squared error results for the closeness of these estimators to both the true function and the kernel estimator based on the original data set. Our results show the influence of the choice of the auxiliary density on the binned kernel estimators, and they reveal that non-uniform binning can be worthwhile.
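A minimal sketch of the non-uniform binning idea, under my own simplifying choices: a Gaussian auxiliary distribution and simple nearest-point binning. The grid points are quantiles of the auxiliary CDF, so a uniform auxiliary would recover ordinary equally spaced binning.

```python
import numpy as np
from scipy.stats import norm

def quantile_grid_binned_kde(x, h, m=100, aux_ppf=norm.ppf):
    """Binned KDE on a non-uniform grid whose points are quantiles of an
    auxiliary distribution; observations are assigned to the nearest
    grid point (simple binning)."""
    grid = aux_ppf((np.arange(m) + 0.5) / m)          # quantile grid points
    idx = np.clip(np.searchsorted(grid, x), 1, m - 1)
    nearest = np.where(np.abs(x - grid[idx - 1]) <= np.abs(grid[idx] - x),
                       idx - 1, idx)
    counts = np.bincount(nearest, minlength=m).astype(float)
    u = (grid[:, None] - grid[None, :]) / h
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return grid, (K @ counts) / (len(x) * h)

rng = np.random.default_rng(1)
grid, fhat = quantile_grid_binned_kde(rng.normal(size=2000), h=0.3)
```

Placing more grid points where the auxiliary density is high concentrates the binning resolution there, which is the mechanism behind the paper's finding that non-uniform binning can pay off.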

3.
Two common kernel-based methods for non-parametric regression estimation suffer from well-known drawbacks when the design is random. The Gasser-Müller estimator is inadmissible due to its high variance, while the Nadaraya-Watson estimator has zero asymptotic efficiency because of its poor bias behavior. Under asymptotic considerations, the local linear estimator avoids both drawbacks of kernel estimators and achieves minimax optimality. However, when based on compact-support kernels, its finite-sample behavior is disappointing because sudden kinks may show up in the estimate.

This paper proposes a modification of the kernel estimator, called the binned convolution estimator, leading to a fast O(n) method. Provided the design density is continuously differentiable and the conditional fourth moments exist, the binned convolution estimator has asymptotic properties identical to those of the local linear estimator.
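For reference, here is a direct (un-binned) local linear smoother, the estimator whose asymptotics the binned convolution estimator matches. This naive sketch costs O(nm) for m evaluation points and uses a Gaussian kernel; the paper's binning speedup is not reproduced.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear smoother with a Gaussian kernel: at each evaluation
    point, fit a weighted least-squares line and return its intercept."""
    est = np.empty(len(x0))
    for i, t in enumerate(x0):
        w = np.sqrt(np.exp(-0.5 * ((x - t) / h) ** 2))   # sqrt weights for LS
        X = np.column_stack([np.ones_like(x), x - t])
        beta, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
        est[i] = beta[0]
    return est

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 500)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=500)
x0 = np.linspace(0, 1, 101)
mhat = local_linear(x, y, x0, h=0.05)
```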

4.
Interval-grouped data arise, in general, when the event of interest cannot be directly observed and is only known to have occurred within an interval. In this framework, a nonparametric kernel density estimator is proposed and studied. The approach is based on the classical Parzen–Rosenblatt estimator and on a generalisation of the binned kernel density estimator. The asymptotic bias and variance of the proposed estimator are derived under usual assumptions, and the effect of using non-equally spaced grouped data is analysed. Additionally, a plug-in bandwidth selector is proposed. Through a comprehensive simulation study, the behaviour of both the estimator and the plug-in bandwidth selector is shown under different scenarios of data grouping. An application to real data confirms the simulation results, revealing the good performance of the estimator whenever data are not heavily grouped.

5.
Model checking with discrete data regressions can be difficult because the usual methods such as residual plots have complicated reference distributions that depend on the parameters in the model. Posterior predictive checks have been proposed as a Bayesian way to average the results of goodness-of-fit tests in the presence of uncertainty in estimation of the parameters. We try this approach using a variety of discrepancy variables for generalized linear models fitted to a historical data set on behavioural learning. We then discuss the general applicability of our findings in the context of a recent applied example on which we have worked. We find that the following discrepancy variables work well, in the sense of being easy to interpret and sensitive to important model failures: structured displays of the entire data set, general discrepancy variables based on plots of binned or smoothed residuals versus predictors and specific discrepancy variables created on the basis of the particular concerns arising in an application. Plots of binned residuals are especially easy to use because their predictive distributions under the model are sufficiently simple that model checks can often be made implicitly. The following discrepancy variables did not work well: scatterplots of latent residuals defined from an underlying continuous model and quantile–quantile plots of these residuals.
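A hedged sketch of the binned-residual computation described above: observations are grouped by fitted value and the residuals averaged within groups. The bin count and the rough two-standard-error band are my own choices, not the paper's.

```python
import numpy as np

def binned_residuals(fitted, resid, n_bins=20):
    """Bin observations by fitted value and average the residuals in each
    bin; under a correct model the bin means should hover near zero,
    roughly within +/- 2 * sd / sqrt(bin size)."""
    order = np.argsort(fitted)
    xbar, rbar, half = [], [], []
    for b in np.array_split(order, n_bins):
        xbar.append(fitted[b].mean())
        rbar.append(resid[b].mean())
        half.append(2 * resid[b].std() / np.sqrt(len(b)))
    return np.array(xbar), np.array(rbar), np.array(half)
```

Plotting `rbar` against `xbar` with the `half` band gives the implicit model check the abstract mentions: systematic excursions of bin means outside the band flag model failure.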

6.
In this paper a new method called the EMS algorithm is used to solve Wicksell's corpuscle problem, that is, the determination of the distribution of the sphere radii in a medium given the radii of their profiles in a random slice. The EMS algorithm combines the EM algorithm, a procedure for obtaining maximum likelihood estimates of parameters from incomplete data, with simple smoothing. The method is tested on simulated data from three different sphere radii densities, namely a bimodal mixture of Normals, a Weibull and a Normal. The effect of varying the level of smoothing, the number of classes in which the data is binned and the number of classes for which the estimated density is evaluated, is investigated. Comparisons are made between these results and those obtained by others in this field.
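The Wicksell kernel matrix is not reproduced here, but the EMS iteration itself is generic. The sketch below assumes binned counts linked to the unknown discretized density by a known matrix P whose columns sum to one; the smoothing window is a simple moving average, my own stand-in for the paper's smoother.

```python
import numpy as np

def ems(counts, P, n_iter=200, window=3):
    """Generic EMS iteration for binned indirect observations:
    counts[i] ~ Poisson((P @ lam)[i]), with columns of P summing to 1.
    Each pass does one EM update followed by a moving-average smooth."""
    m = P.shape[1]
    lam = np.full(m, counts.sum() / m)        # flat starting estimate
    box = np.ones(window) / window
    for _ in range(n_iter):
        fitted = P @ lam
        lam = lam * (P.T @ (counts / np.maximum(fitted, 1e-12)))  # E+M step
        lam = np.convolve(lam, box, mode="same")                  # S step
    return lam
```

For Wicksell's problem, P would encode the probability that a sphere in radius class j produces a profile in class i; the EM step alone tends to produce spiky solutions, which is exactly what the added S step damps.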

7.
This paper considers the problem of selecting optimal bandwidths for variable (sample-point adaptive) kernel density estimation. A data-driven variable bandwidth selector is proposed, based on the idea of approximating the log-bandwidth function by a cubic spline. This cubic spline is optimized with respect to a cross-validation criterion. The proposed method can be interpreted as a selector for either integrated squared error (ISE) or mean integrated squared error (MISE) optimal bandwidths. This leads to reflection upon some of the differences between ISE and MISE as error criteria for variable kernel estimation. Results from simulation studies indicate that the proposed method outperforms a fixed kernel estimator (in terms of ISE) when the target density has a combination of sharp modes and regions of smooth undulation. Moreover, some detailed data analyses suggest that the gains in ISE may understate the improvements in visual appeal obtained using the proposed variable kernel estimator. These numerical studies also show that the proposed estimator outperforms existing variable kernel density estimators implemented using piecewise constant bandwidth functions.
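A sketch of the sample-point adaptive estimator that the bandwidth selector targets. The cubic-spline parametrization of the log-bandwidth and the cross-validation search are not reproduced, so `log_h` below is a hand-picked stand-in for the fitted spline.

```python
import numpy as np

def sample_point_kde(x, grid, log_h):
    """Variable-bandwidth KDE: observation x_i contributes a Gaussian
    kernel with its own bandwidth h_i = exp(log_h(x_i))."""
    h = np.exp(log_h(x))
    u = (grid[:, None] - x[None, :]) / h[None, :]
    K = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)
    return (K / h[None, :]).mean(axis=1)

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 0.1, 300), rng.normal(3, 1.0, 300)])
grid = np.linspace(-1, 7, 400)
# crude log-bandwidth: small near the sharp mode, larger in the smooth region
fhat = sample_point_kde(x, grid, lambda t: np.where(t < 1.5, -2.3, -1.0))
```

A smooth spline for `log_h` avoids the discontinuities that make piecewise-constant bandwidth functions underperform, which is the comparison the abstract reports.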

8.
Conditional expectation imputation and local-likelihood methods are contrasted with a midpoint imputation method for bivariate regression involving interval-censored responses. Although the methods can be extended in principle to higher order polynomials, our focus is on the local constant case. Comparisons are based on simulations of data scattered about three target functions with normally distributed errors. Two censoring mechanisms are considered: the first is analogous to current-status data in which monitoring times occur according to a homogeneous Poisson process; the second is analogous to a coarsening mechanism such as would arise when the response values are binned. We find that, according to a pointwise MSE criterion, no method dominates any other when interval sizes are fixed, but when the intervals have a variable width, the local-likelihood method often performs better than the other methods, and midpoint imputation performs the worst. Several illustrative examples are presented.
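Midpoint imputation is the simplest of the three compared methods; here is a minimal sketch with a Gaussian local-constant smoother. The conditional-expectation and local-likelihood competitors are more involved and are not reproduced.

```python
import numpy as np

def nadaraya_watson(x, y, x0, h):
    """Local-constant (Nadaraya-Watson) smoother with a Gaussian kernel."""
    w = np.exp(-0.5 * ((x0[:, None] - x[None, :]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

def midpoint_fit(x, lo, hi, x0, h):
    """Midpoint imputation: replace each censoring interval [lo, hi]
    by its centre, then smooth as if the responses were exact."""
    return nadaraya_watson(x, (lo + hi) / 2.0, x0, h)
```

When interval widths vary with the covariate, the midpoint is a biased stand-in for the latent response, which is consistent with the abstract's finding that midpoint imputation performs worst in that regime.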

9.
CORRECTING FOR KURTOSIS IN DENSITY ESTIMATION
Using a global window width kernel estimator to estimate an approximately symmetric probability density with high kurtosis usually leads to poor estimation because good estimation of the peak of the distribution leads to unsatisfactory estimation of the tails and vice versa. The technique proposed corrects for kurtosis via a transformation of the data before using a global window width kernel estimator. The transformation depends on a “generalised smoothing parameter” consisting of two real-valued parameters and a window width parameter which can be selected either by a simple graphical method or, for a completely data-driven implementation, by minimising an estimate of mean integrated squared error. Examples of real and simulated data demonstrate the effectiveness of this approach, which appears suitable for a wide range of symmetric, unimodal densities. Its performance is similar to ordinary kernel estimation in situations where the latter is effective, e.g. Gaussian densities. For densities like the Cauchy where ordinary kernel estimation is not satisfactory, our methodology offers a substantial improvement.
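The paper's two-parameter transformation family is not reproduced here; the sketch below only illustrates the general transformation-KDE identity f_X(x) = f_Y(t(x)) t'(x), using asinh as a stand-in tail-compressing transform.

```python
import numpy as np

def transform_kde(x, grid, h, s=1.0):
    """Transformation KDE: smooth y = asinh(x/s) with an ordinary
    fixed-bandwidth Gaussian kernel, then map back through
    f_X(x) = f_Y(t(x)) * t'(x), where t'(x) = 1 / sqrt(s^2 + x^2)."""
    y = np.arcsinh(x / s)
    ty = np.arcsinh(grid / s)
    u = (ty[:, None] - y[None, :]) / h
    f_y = np.exp(-0.5 * u**2).sum(axis=1) / (len(x) * h * np.sqrt(2 * np.pi))
    return f_y / np.sqrt(s**2 + grid**2)

rng = np.random.default_rng(4)
x = rng.standard_cauchy(2000)          # heavy-tailed target density
grid = np.linspace(-10, 10, 501)
fhat = transform_kde(x, grid, h=0.25, s=1.0)
```

Compressing the tails before smoothing lets one global bandwidth serve both the peak and the tails, which is the mechanism behind the reported improvement for Cauchy-like densities.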

10.
Intensity functions, which describe the spatial distribution of the occurrences of point processes, are useful for risk assessment. This paper deals with the robust nonparametric estimation of the intensity function of space–time data from events such as earthquakes. The basic approach consists of smoothing the frequency histograms with the local polynomial regression (LPR) estimator. This method allows for automatic boundary corrections, and its jump-preserving ability can be improved through robustness. We derive a robust local smoother from the weighted-average approach to M-estimation, and we select its bandwidths with robust cross-validation (RCV). Further, we develop a robust recursive algorithm for sequential processing of the data binned in time. An extensive application to the Northern California earthquake catalog in the San Francisco, CA, area illustrates the method and demonstrates its validity.

11.
Nonparametric density estimation in the presence of measurement error is considered. The usual kernel deconvolution estimator seeks to account for the contamination in the data by employing a modified kernel. In this paper a new approach based on a weighted kernel density estimator is proposed. Theoretical motivation is provided by the existence of a weight vector that perfectly counteracts the bias in density estimation without generating an excessive increase in variance. In practice a data-driven method of weight selection is required. Our strategy is to minimize the discrepancy between a standard kernel estimate from the contaminated data on the one hand, and the convolution of the weighted deconvolution estimate with the measurement error density on the other hand. We consider a direct implementation of this approach, in which the weights are optimized subject to sum and non-negativity constraints, and a regularized version in which the objective function includes a ridge-type penalty. Numerical tests suggest that weighted kernel estimation can lead to tangible improvements in performance over the usual kernel deconvolution estimator. Furthermore, weighted kernel estimates are free from the problem of negative estimation in the tails that can occur when using modified kernels. The weighted kernel approach generalizes to the case of multivariate deconvolution density estimation in a very straightforward manner.
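A rough sketch of the weight-selection idea under strong simplifying assumptions: a Gaussian kernel, known N(0, sigma^2) measurement error, and a sample small enough for a generic constrained optimizer. For a Gaussian kernel, re-convolving with Gaussian error simply widens the bandwidth from h to sqrt(h^2 + sigma^2), which is what the code exploits; all function and parameter names are mine.

```python
import numpy as np
from scipy.optimize import minimize

def gauss(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def deconv_weights(x, grid, h, sigma, ridge=0.0):
    """Pick non-negative weights summing to one so that the weighted
    estimate, re-convolved with the N(0, sigma^2) error density, is close
    (least squares on `grid`) to the ordinary KDE of the contaminated
    data; `ridge` adds the optional ridge-type penalty on the weights."""
    n = len(x)
    target = (gauss((grid[:, None] - x[None, :]) / h) / h).mean(axis=1)
    hs = np.sqrt(h**2 + sigma**2)                  # widened bandwidth
    Kc = gauss((grid[:, None] - x[None, :]) / hs) / hs
    def obj(w):
        r = Kc @ w - target
        return r @ r + ridge * (w @ w)
    res = minimize(obj, np.full(n, 1.0 / n),
                   bounds=[(0.0, None)] * n,
                   constraints=({"type": "eq", "fun": lambda w: w.sum() - 1.0},))
    return res.x
```

Because the weights are constrained to be non-negative, the resulting density estimate cannot go negative in the tails, in line with the advantage the abstract claims over modified-kernel deconvolution.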

12.
We provide a common approach for studying several nonparametric estimators used for smoothing functional time series data. Linear filters based on different building assumptions are transformed into kernel functions via reproducing kernel Hilbert spaces. For each estimator, we identify a density function or second order kernel, from which a hierarchy of higher order estimators is derived. These are shown to give excellent representations for the currently applied symmetric filters. In particular, we derive equivalent kernels of smoothing splines in Sobolev and polynomial spaces. The asymmetric weights are obtained by adapting the kernel functions to the length of the various filters, and a theoretical and empirical comparison is made with the classical estimators used in real time analysis. The former are shown to be superior in terms of signal passing, noise suppression and speed of convergence to the symmetric filter.

13.
There are several levels of sophistication when specifying the bandwidth matrix H to be used in a multivariate kernel density estimator, including taking H to be a positive multiple of the identity matrix, a diagonal matrix with positive elements or, in its most general form, a symmetric positive-definite matrix. In this paper, the author proposes a data-based method for choosing the smoothing parametrization to be used in the kernel density estimator. The procedure is fully illustrated by a simulation study and some real data examples.
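To see what the three parametrizations mean in practice, here is a minimal multivariate KDE in which H enters as the kernel covariance; the paper's data-based selection procedure itself is not reproduced, and the bandwidth values below are arbitrary.

```python
import numpy as np

def mv_kde(data, pts, H):
    """Multivariate Gaussian-kernel density estimate; H may be c*I, a
    diagonal matrix, or a full symmetric positive-definite matrix."""
    d = data.shape[1]
    Hinv = np.linalg.inv(H)
    norm_const = np.sqrt((2 * np.pi) ** d * np.linalg.det(H))
    diff = pts[:, None, :] - data[None, :, :]
    q = np.einsum("mnd,de,mne->mn", diff, Hinv, diff)  # quadratic forms
    return np.exp(-0.5 * q).mean(axis=1) / norm_const

rng = np.random.default_rng(5)
data = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=500)
H_full = np.array([[0.09, 0.05], [0.05, 0.09]])   # full smoothing matrix
fhat = mv_kde(data, data[:5], H_full)
```

A full H can orient the smoothing along the correlation structure of the data, which neither the scalar nor the diagonal parametrization can do; the paper's contribution is choosing among these levels from the data.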

14.
An approach to non-linear principal components using radially symmetric kernel basis functions is described. The procedure consists of two steps. The first is a projection of the data set to a reduced dimension using a non-linear transformation whose parameters are determined by the solution of a generalized symmetric eigenvector equation; this is achieved by demanding a maximum-variance transformation subject to a normalization condition (Hotelling's approach) and can be related to the homogeneity analysis approach of Gifi through the minimization of a loss function. The transformed variables are the principal components, whose values define contours, or more generally hypersurfaces, in the data space. The second stage defines the fitting surface, the principal surface, in the data space (again as a weighted sum of kernel basis functions) using the self-consistency definition of Hastie and Stuetzle. The parameters of this principal surface are determined by a singular value decomposition, and cross-validation is used to obtain the kernel bandwidths. The approach is assessed on four data sets.

15.
In this article, we propose a new estimator for the density of objects using line transect data. The proposed estimator combines a nonparametric kernel estimator with a parametric detection function, either the negative exponential or the half-normal, to estimate the density of objects. The choice of detection function depends on a test of the shoulder condition assumption: if the shoulder condition holds, the half-normal detection function is combined with the kernel estimator; otherwise, the negative exponential is used. Under these assumptions, the proposed estimator is asymptotically unbiased and strongly consistent for the density of objects from line transect data. The simulation results indicate that the proposed estimator is very successful in taking advantage of the available parametric detection function.
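A sketch of the two ingredients, assuming the standard line-transect setup with perpendicular distances; the paper's exact combination rule and the shoulder-condition test are not reproduced here.

```python
import numpy as np

def halfnormal_f0(dists):
    """MLE of f(0) under a half-normal detection function
    g(x) = exp(-x^2 / (2 s^2)), with s^2 estimated by mean(x^2)."""
    return np.sqrt(2.0 / (np.pi * np.mean(dists**2)))

def kernel_f0(dists, h):
    """Boundary-reflected Gaussian-kernel estimate of the density of
    perpendicular distances at zero."""
    return 2.0 * np.mean(np.exp(-0.5 * (dists / h) ** 2)) / (h * np.sqrt(2 * np.pi))

def transect_density(dists, L, f0):
    """Classical line-transect abundance estimate D = n * f(0) / (2L)."""
    return len(dists) * f0 / (2.0 * L)
```

Both routes reduce density estimation to estimating f(0), the density of perpendicular detection distances at the line; the paper's hybrid swaps the parametric form in or out depending on the shoulder-condition test.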

16.
The purpose of this paper is to examine the multiple group (>2) discrimination problem in which the group sizes are unequal and the variables used in the classification are correlated with skewed distributions. Using statistical simulation based on data from a clinical study, we compare the performances, in terms of misclassification rates, of nine statistical discrimination methods. These methods are linear and quadratic discriminant analysis applied to untransformed data, rank transformed data, and inverse normal scores data, as well as fixed kernel discriminant analysis, variable kernel discriminant analysis, and variable kernel discriminant analysis applied to inverse normal scores data. It is found that the parametric methods with transformed data generally outperform the other methods, and the parametric methods applied to inverse normal scores usually outperform the parametric methods applied to rank transformed data. Although the kernel methods often have very biased estimates, the variable kernel method applied to inverse normal scores data provides considerable improvement in terms of total nonerror rate.
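A minimal sketch of the inverse-normal-scores preprocessing step, after which standard LDA or QDA would be applied; the Blom-type offsets (3/8 and 1/4) are an assumption of mine, since the abstract does not pin down the constants.

```python
import numpy as np
from scipy.stats import norm, rankdata

def inverse_normal_scores(X):
    """Column-wise rank-based inverse normal transform: replace each
    value by the normal quantile of its (offset) rank, which maps
    skewed marginals to approximately Gaussian ones."""
    n = X.shape[0]
    ranks = rankdata(X, axis=0)
    return norm.ppf((ranks - 0.375) / (n + 0.25))
```

Because the transform forces near-Gaussian marginals, the normality assumptions behind LDA/QDA become far more plausible, which is consistent with the paper's finding that parametric methods on inverse normal scores do best.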

17.
We consider the problem of clustering gamma-ray bursts (from the “BATSE” catalogue) through kernel principal component analysis, in which our proposed kernel outperforms other competing kernels in terms of clustering accuracy, yielding three physically interpretable groups of gamma-ray bursts. The effectiveness of the suggested kernel, in combination with kernel principal component analysis, in revealing natural clusters in noisy and nonlinear data while reducing the dimension of the data is also explored in two simulated data sets.
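The proposed kernel itself is not given in the abstract, so the sketch below uses a standard RBF kernel as a stand-in to show the kernel-PCA mechanics: centre the Gram matrix in feature space, then project onto its leading eigenvectors before clustering the scores.

```python
import numpy as np

def kernel_pca(X, n_comp=2, gamma=1.0):
    """Kernel PCA with an RBF kernel: double-centre the Gram matrix and
    return the component scores (eigenvectors scaled by sqrt(eigenvalue))."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                          # centring in feature space
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_comp]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 0.3, (50, 3)), rng.normal(2, 0.3, (50, 3))])
scores = kernel_pca(X, n_comp=2, gamma=0.5)   # feed these to k-means etc.
```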

18.
By building a DGM(1,1) model of the "kernel" sequence of grey heterogeneous data, prediction of the "kernel" of doubly heterogeneous data is achieved. Taking the "kernel" as the basis, and taking the larger information domain of the interval grey numbers in the doubly heterogeneous data sequence as the information domain of the prediction result, a grey prediction model for doubly heterogeneous data sequences composed of interval grey numbers and real numbers is constructed, effectively extending the modelling objects of grey prediction models from "homogeneous data" to "doubly heterogeneous data". The results are of positive significance for enriching the theoretical system of grey prediction models.
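A minimal sketch of the DGM(1,1) recursion on a plain real-valued sequence; the paper applies it to the "kernel" sequence extracted from interval grey numbers, and the data below are made up for illustration.

```python
import numpy as np

def dgm11(x0, steps=3):
    """Discrete grey model DGM(1,1): least-squares fit of
    x1(k+1) = b1 * x1(k) + b2 on the accumulated (cumulative-sum)
    series, then difference back to the original scale."""
    x1 = np.cumsum(x0)
    B = np.column_stack([x1[:-1], np.ones(len(x1) - 1)])
    b1, b2 = np.linalg.lstsq(B, x1[1:], rcond=None)[0]
    k = np.arange(len(x0) + steps)
    x1_hat = b1**k * x0[0] + (1 - b1**k) / (1 - b1) * b2
    return np.diff(x1_hat, prepend=0.0)   # recover the original-scale series

x0 = np.array([2.87, 3.28, 3.34, 3.77, 3.85, 4.17])   # made-up data
print(dgm11(x0, steps=2))   # fitted values plus a two-step-ahead forecast
```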

19.
In this article, we propose a nonparametric estimator for percentiles of the time-to-failure distribution obtained from a linear degradation model using the kernel density method. The properties of the proposed kernel estimator are investigated and compared with the well-known maximum likelihood and ordinary least squares estimators via simulation. The mean squared error and the length of the bootstrap confidence interval are used as the comparison criteria. The simulation study shows that the performance of the kernel estimator is acceptable as a general estimator. When the distribution of the data is assumed known, the maximum likelihood and ordinary least squares estimators perform better than the kernel estimator, while the kernel estimator is superior when that distributional assumption is violated. A comparison among the different estimators is carried out using a real data set.
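A sketch of one natural reading of the kernel percentile estimator: smooth the pseudo failure times from the degradation model with a Gaussian kernel and invert the resulting CDF. The bandwidth and the data below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def kde_percentile(t, p, h):
    """p-th percentile of the time-to-failure distribution, obtained by
    inverting the kernel-smoothed CDF of the pseudo failure times t."""
    grid = np.linspace(t.min() - 4 * h, t.max() + 4 * h, 4000)
    F = norm.cdf((grid[:, None] - t[None, :]) / h).mean(axis=1)
    return np.interp(p, F, grid)

t = np.array([151.0, 164.2, 170.9, 178.5, 186.4, 201.7, 215.3])  # hypothetical
print(kde_percentile(t, 0.10, h=12.0))   # estimated 10th percentile life
```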

20.
Li Shuangbo (李双博), 《统计研究》 (Statistical Research), 2018, 35(6): 117-128
Functional data analysis has attracted increasing attention in recent years, with important applications in astronomy, medicine, economic phenomena, ecology and environment, and industrial manufacturing, among other areas. Nonparametric statistics is an important branch of statistical research, in which kernel estimation and local polynomial methods are standard tools. Among nonparametric methods for functional data, kernel estimation is the most common, and theoretical results on its convergence rate and limiting distribution exist in both the independent and the dependent settings. By contrast, local polynomial methods have rarely been studied for functional data, because extending them to the functional setting has long been a difficult problem. Marin, Ferraty and Vieu [Journal of Nonparametric Statistics, 22(5) (2010), pp. 617-632] proposed a local regression estimator for nonparametric functional models, which can be viewed as a generalization of local polynomial estimation to functional data. Since then, many authors have studied this method further, examining its convergence rate and limiting distribution and applying it to various models to meet practical needs. Previous work, however, requires the data to be independent and identically distributed, an assumption that many real data sets violate. This paper establishes the asymptotic normality of the local regression estimator for dependent functional data. Because the estimation method differs, the techniques used for kernel estimation cannot be carried over directly, and the dependence structure poses additional challenges; we adopt Bernstein's blocking method to reduce the dependent problem to an asymptotically independent one, from which the asymptotic normality follows. A simulation study further verifies the asymptotic normality result.
