Similar literature
 20 similar articles found
1.
In this paper, we consider the problem of estimating f(0), the value of the density f at the left endpoint 0. Nonparametric estimation of f(0) is difficult because of the boundary effects that arise in nonparametric curve estimation. It is well known that the usual kernel density estimates require modification when estimating the density near the endpoints of the support. Here we investigate local polynomial smoothing as a possible alternative. We observe that the resulting density estimator has desirable properties, such as automatic adaptation to boundary effects near the endpoints. We also obtain an ‘optimal kernel’ for estimating the density at the endpoints as the solution of a variational problem. Two bandwidth variation schemes are discussed and investigated in a Monte Carlo study.

2.
It is often critical to accurately model the upper tail behaviour of a random process. Nonparametric density estimation methods are commonly used as exploratory data analysis techniques for this purpose, and can avoid the model specification biases implied by parametric estimators. In particular, kernel-based estimators place minimal assumptions on the data and provide improved visualisation over scatterplots and histograms. However, kernel density estimators can perform poorly when estimating tail behaviour above a threshold, and can over-emphasise bumps in the density for heavy-tailed data. We develop a transformation kernel density estimator that can handle heavy-tailed and bounded data and is robust to threshold choice. We derive closed-form expressions for its asymptotic bias and variance, which demonstrate its good performance in the tail region. Finite-sample performance is illustrated in numerical studies and in an expanded analysis of the performance of global climate models.

3.
In this article we propose an automatic bandwidth selection method for the recursive kernel density estimators for spatial data defined by the stochastic approximation algorithm. We show that, using the selected bandwidth and the stepsize that minimize the MWISE (Mean Weighted Integrated Squared Error), the recursive estimator is quite similar to the nonrecursive one in terms of estimation error and much better in terms of computational cost. In addition, we obtain a central limit theorem for the nonparametric recursive density estimator under some mild conditions.
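
As a rough illustration of what a recursive kernel density estimator looks like, the sketch below uses the common stochastic approximation update f_n(x) = (1 - γ_n) f_{n-1}(x) + γ_n h_n^{-1} K((x - X_n)/h_n). It is a one-dimensional, i.i.d. simplification, not the spatial estimator or the MWISE-based selector of the article. The Gaussian kernel, the stepsize γ_n = 1/n, the bandwidth h_n = n^(-1/5), and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def recursive_kde(samples, grid,
                  stepsize=lambda n: 1.0 / n,
                  bandwidth=lambda n: n ** -0.2):
    """Update a density estimate on `grid` one observation at a time.

    stepsize(n) and bandwidth(n) are illustrative defaults, not the
    data-driven choices proposed in the article.
    """
    grid = np.asarray(grid, dtype=float)
    estimate = np.zeros_like(grid)
    for n, x_n in enumerate(samples, start=1):
        gamma, h = stepsize(n), bandwidth(n)
        # Kernel bump contributed by the newest observation only.
        update = gaussian_kernel((grid - x_n) / h) / h
        # Stochastic approximation step: old estimate is reweighted, never recomputed.
        estimate = (1.0 - gamma) * estimate + gamma * update
    return estimate
```

Each new observation costs one pass over the grid, which is where the computational advantage over recomputing a nonrecursive estimate comes from. With the stepsize 1/n the update reduces to a running average of the per-observation kernel bumps.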

4.
Since the late 1980s, several methods have been considered in the literature to reduce the sample variability of the least-squares cross-validation bandwidth selector for kernel density estimation. In this article, a weighted version of this classical method is proposed and its asymptotic and finite-sample behavior is studied. The simulation results attest that the weighted cross-validation bandwidth performs quite well, presenting a better finite-sample performance than the standard cross-validation method for “easy-to-estimate” densities, and retaining the good finite-sample performance of the standard cross-validation method for “hard-to-estimate” ones.
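
For readers unfamiliar with the classical selector this abstract builds on, the following is a minimal sketch of standard (unweighted) least-squares cross-validation for a one-dimensional Gaussian-kernel estimate. It minimizes LSCV(h) = ∫ f̂_h² - (2/n) Σ_i f̂_{h,-i}(X_i) over a grid of candidate bandwidths. The weighted variant proposed in the article is not reproduced here; the Gaussian kernel, the grid search, and the function names are assumptions.

```python
import numpy as np


def gaussian_pdf(u, scale):
    """Density of N(0, scale^2) evaluated at u."""
    return np.exp(-0.5 * (u / scale) ** 2) / (scale * np.sqrt(2.0 * np.pi))


def lscv_score(data, h):
    """Least-squares cross-validation criterion for bandwidth h."""
    x = np.asarray(data, dtype=float)
    n = x.size
    diffs = x[:, None] - x[None, :]
    # For a Gaussian kernel, the integral of the squared estimate has a
    # closed form: the convolution of two N(0, h^2) kernels is N(0, 2 h^2).
    integral_f2 = gaussian_pdf(diffs, h * np.sqrt(2.0)).sum() / n ** 2
    # Leave-one-out estimates at each data point (drop the diagonal term).
    k = gaussian_pdf(diffs, h)
    leave_one_out = (k.sum(axis=1) - np.diag(k)) / (n - 1)
    return integral_f2 - 2.0 * leave_one_out.mean()


def lscv_bandwidth(data, grid):
    """Return the candidate bandwidth in `grid` with the smallest score."""
    scores = np.array([lscv_score(data, h) for h in grid])
    return float(grid[int(np.argmin(scores))])
```

A typical call would be `lscv_bandwidth(sample, np.linspace(0.05, 1.5, 60))`. The selected value can vary considerably from sample to sample, which is the variability that the methods surveyed in the abstract aim to reduce.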

5.
Spatial point pattern data sets are commonplace in a variety of different research disciplines. The use of kernel methods to smooth such data is a flexible way to explore spatial trends and make inference about underlying processes without, or perhaps prior to, the design and fitting of more intricate semiparametric or parametric models to quantify specific effects. The long-standing issue of ‘optimal’ data-driven bandwidth selection is complicated in these settings by issues such as high heterogeneity in observed patterns and the need to consider edge correction factors. We scrutinize bandwidth selectors built on leave-one-out cross-validation approximation to likelihood functions. A key outcome relates to previously unconsidered adaptive smoothing regimens for spatiotemporal density and multitype conditional probability surface estimation, whereby we propose a novel simultaneous pilot-global selection strategy. Motivated by applications in epidemiology, the results of both simulated and real-world analyses suggest this strategy to be largely preferable to classical fixed-bandwidth estimation for such data.

6.
By approximating the nonparametric component with a regression spline in generalized partial linear models (GPLM), robust generalized estimating equations (GEE), involving a bounded score function and a leverage-based weighting function, can be used to estimate the regression parameters in GPLM robustly for longitudinal or clustered data. In this paper, robust score test statistics are proposed for testing the regression parameters, and their asymptotic distributions under the null hypothesis and a class of local alternative hypotheses are studied. The proposed score tests rely on the estimation of a smaller model that does not involve the parameters being tested, and they perform well in the simulation studies and real data analysis conducted in this paper.

7.
A bandwidth selection method that combines least-squares cross-validation with the plug-in approach is introduced for kernel density estimation. A simulation study reveals that this hybrid methodology outperforms some commonly used bandwidth selection rules. It is shown that the proposed approach can also be readily employed in the context of variable kernel density estimation. We conclude with two illustrative examples.

8.
A spline-backfitted kernel smoothing method is proposed for the partially linear additive model. Under assumptions of stationarity and geometric mixing, the proposed function and parameter estimators are oracally efficient and fast to compute. These properties are achieved by applying spline smoothing and then kernel smoothing to the data. Simulation experiments with both moderate and large numbers of variables confirm the asymptotic results. An application to the Boston housing data serves as a practical illustration of the method.

9.
In this paper, we extend Choi and Hall's [Data sharpening as a prelude to density estimation. Biometrika. 1999;86(4):941–947] data sharpening algorithm for kernel density estimation to interval-censored data. Data sharpening has several advantages, including bias and mean integrated squared error (MISE) reduction as well as increased robustness to bandwidth misspecification. Several interval metrics are explored for use with the kernel function in the data sharpening transformation. A simulation study based on randomly generated data is conducted to assess and compare the performance of each interval metric. It is found that the bias is reduced by sharpening, often with little effect on the variance, thus maintaining or reducing overall MISE. Applications involving time to onset of HIV and running distances subject to measurement error are used for illustration.

10.
The Amoroso kernel density estimator of Igarashi and Kakizawa [Amoroso kernel density estimation for nonnegative data and its bias reduction. Department of Policy and Planning Sciences Discussion Paper Series No. 1345, University of Tsukuba; 2017] for non-negative data is boundary-bias-free and has a mean integrated squared error (MISE) of order $O(n^{-4/5})$, where n is the sample size. In this paper, we construct a linear combination of the Amoroso kernel density estimator and its derivative with respect to the smoothing parameter. We also propose a related multiplicative estimator. We show that the MISEs of these bias-reduced estimators achieve the convergence rate $n^{-8/9}$ if the underlying density is four times continuously differentiable. We illustrate the finite-sample performance of the proposed estimators through simulations.

11.
Likelihood methods are often difficult to use with large, irregularly sited spatial data sets, owing to the computational burden. Even for Gaussian models, exact calculations of the likelihood for n observations require $O(n^3)$ operations. Since any joint density can be written as a product of conditional densities based on some ordering of the observations, one way to lessen the computations is to condition on only some of the 'past' observations when computing the conditional densities. We show how this approach can be adapted to approximate the restricted likelihood and we demonstrate how an estimating equations approach allows us to judge the efficacy of the resulting approximation. Previous work has suggested conditioning on those past observations that are closest to the observation whose conditional density we are approximating. Through theoretical, numerical and practical examples, we show that there can often be considerable benefit in conditioning on some distant observations as well.
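
The conditioning idea described in this abstract can be sketched as follows for a zero-mean Gaussian process: the joint log-density is written as a sum of conditional log-densities, and each conditional uses at most m earlier observations. This sketch approximates the ordinary likelihood, not the restricted likelihood treated in the article, and it conditions on the m nearest past observations, which is the earlier suggestion the abstract refers to rather than the article's own recommendation. The exponential covariance, its parameters, and all function names are assumptions.

```python
import numpy as np


def exp_cov(a, b, range_=1.0, variance=1.0):
    """Exponential covariance between two sets of locations (an assumed model)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return variance * np.exp(-dist / range_)


def conditional_loglik(y, locs, m, cov=exp_cov):
    """Approximate Gaussian log-likelihood using at most m past neighbours."""
    y = np.asarray(y, dtype=float)
    locs = np.asarray(locs, dtype=float)
    total = 0.0
    for i in range(y.size):
        past = np.arange(i)
        if past.size > m:
            # Keep only the m nearest earlier observations.
            dist = np.linalg.norm(locs[past] - locs[i], axis=1)
            past = past[np.argsort(dist)[:m]]
        c_ii = cov(locs[i:i + 1], locs[i:i + 1])[0, 0]
        if past.size == 0:
            mean, var = 0.0, c_ii
        else:
            c_pp = cov(locs[past], locs[past])
            c_ip = cov(locs[i:i + 1], locs[past])[0]
            weights = np.linalg.solve(c_pp, c_ip)
            mean = weights @ y[past]      # conditional mean given the neighbours
            var = c_ii - weights @ c_ip   # conditional variance
        total += -0.5 * (np.log(2.0 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total


def exact_loglik(y, locs, cov=exp_cov):
    """Exact zero-mean Gaussian log-likelihood, for comparison on small n."""
    y = np.asarray(y, dtype=float)
    locs = np.asarray(locs, dtype=float)
    c = cov(locs, locs)
    _, logdet = np.linalg.slogdet(c)
    quad = y @ np.linalg.solve(c, y)
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + quad)
```

When m is at least n - 1, every conditional uses the full past and the sum reproduces the exact log-likelihood. Smaller m replaces one n-by-n solve with n solves of size at most m, trading accuracy for cost.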

12.
In this paper, we investigate nonparametric robust estimation for spatial regression. More precisely, given a strictly stationary random field $Z_i = (X_i, Y_i)$, $i \in \mathbb{N}^N$, $N \ge 1$, we consider a family of robust nonparametric estimators of a regression function based on the kernel method. Under some general mixing assumptions, the almost complete consistency and the asymptotic normality of these estimators are obtained. A robust procedure for selecting the smoothing parameter, adapted to spatial data, is also discussed.

13.
14.
In this paper, we investigate the asymptotic properties of the kernel estimator of the nonparametric regression operator for functional stationary ergodic data subject to random censoring. More precisely, we introduce a kernel-type estimator of the nonparametric regression operator when the responses are randomly censored, and we obtain its almost sure convergence with rate as well as its asymptotic normality. As an application, the asymptotic (1 − ζ) confidence interval for the regression operator is also presented (0 < ζ < 1). Finally, a simulation study is carried out to show the finite-sample performance of the estimator.

15.
Cause-specific hazard functions are employed to analyze a semi-Markov model which could be used to describe data arising from clinical trials or certain types of observational studies. The use of these hazard functions to fit a set of data arising from N possibly incomplete case histories is shown to have several notable advantages over the approach adopted by Lagakos, Sommer, and Zelen (1978).

16.
Let $X_1, \ldots, X_n$ be i.i.d. random variables with distribution function F, and let $Y_1, \ldots, Y_n$ be i.i.d. with distribution function G. For $i = 1, 2, \ldots, n$, set $\delta_i = 1$ if $X_i \le Y_i$ and 0 otherwise, and $\tilde{X}_i = \min\{X_i, Y_i\}$. A kernel-type density estimate of f, the density function of F with respect to Lebesgue measure on the Borel σ-field, based on the censored data $(\delta_i, \tilde{X}_i)$, $i = 1, \ldots, n$, is considered. Weak and strong uniform consistency properties over the whole real line are studied. Rates of convergence are established under higher-order differentiability assumptions on f. A procedure for relaxing such assumptions is also proposed.

17.
Likelihood cross-validation for kernel density estimation is known to be sensitive to extreme observations and heavy-tailed distributions. We propose a robust likelihood-based cross-validation method to select bandwidths in multivariate density estimation. We derive this bandwidth selector within the framework of robust maximum likelihood estimation. This method establishes a smooth transition from likelihood cross-validation for nonextreme observations to least-squares cross-validation for extreme observations, thereby combining the efficiency of likelihood cross-validation and the robustness of least-squares cross-validation. We also suggest a simple rule to select the transition threshold. We demonstrate the finite-sample performance and practical usefulness of the proposed method via Monte Carlo simulations and a real data application on Chinese air pollution.

18.
This paper focuses on bivariate kernel density estimation that bridges the gap between univariate and multivariate applications. We propose a subsampling-extrapolation bandwidth matrix selector that improves the reliability of the conventional cross-validation method. The proposed procedure combines a U-statistic expression of the mean integrated squared error and asymptotic theory, and can be used in both cases of diagonal bandwidth matrix and unconstrained bandwidth matrix. In the subsampling stage, one takes advantage of the reduced variability of estimating the bandwidth matrix at a smaller subsample size m (m < n); in the extrapolation stage, a simple linear extrapolation is used to remove the incurred bias. Simulation studies reveal that the proposed method reduces the variability of the cross-validation method by about 50% and achieves an expected integrated squared error that is up to 30% smaller than that of the benchmark cross-validation. It shows comparable or improved performance compared to other competitors across six distributions in terms of the expected integrated squared error. We prove that the components of the selected bivariate bandwidth matrix have an asymptotic multivariate normal distribution, and also present the relative rate of convergence of the proposed bandwidth selector.

19.
20.
Statistical learning is emerging as a promising field in which a number of machine learning algorithms are interpreted as statistical methods and vice versa. Owing to its good practical performance, boosting is one of the most studied machine learning techniques. We propose algorithms for multivariate density estimation and classification that are generated by using traditional kernel techniques as weak learners in boosting algorithms. Our algorithms take the form of multistep estimators whose first step is a standard kernel method. Some strategies for bandwidth selection are also discussed, with regard both to the standard kernel density classification problem and to our 'boosted' kernel methods. Extensive experiments, using real and simulated data, show the encouraging practical relevance of the findings. The first boosting iterations often outperform standard kernel methods, and do so for several bandwidth values. In addition, the practical effectiveness of our classification algorithm is confirmed by a comparative study on two real datasets, the competitors being tree-based methods, including AdaBoost with trees.
