Similar literature
 20 similar articles found (search time: 15 ms)
1.
Length-biased data are a particular case of weighted data, which arise in many situations: biomedicine, quality control, and epidemiology, among others. In this paper we study the theoretical properties of kernel density estimation in the context of length-biased data, proposing two consistent bootstrap methods that we use for bandwidth selection. Apart from the bootstrap bandwidth selectors we suggest a rule-of-thumb. These bandwidth selection proposals are compared with a least-squares cross-validation method. A simulation study is carried out to understand the behaviour of the procedures in finite samples.
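
As a rough illustration of kernel density estimation under length bias, the sketch below implements a commonly used weighted kernel estimator, f_hat(y) = (mu_hat / n) * sum_i K_h(y - Y_i) / Y_i with mu_hat = n / sum_i (1 / Y_i). It is an assumption that this is the estimator the abstract has in mind; the function names, the Gaussian kernel, and the fixed bandwidth `h` are illustrative choices, and none of the bootstrap or rule-of-thumb selectors from the paper are reproduced here.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an illustrative choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def length_biased_kde(y, sample, h):
    """Estimate the underlying (unbiased) density at y from a length-biased sample.

    Assumes every observation in `sample` is strictly positive and h > 0.
    """
    n = len(sample)
    # Harmonic-mean estimate of the mean of the underlying density.
    mu_hat = n / sum(1.0 / yi for yi in sample)
    # Each kernel bump is down-weighted by 1 / Y_i to undo the length bias.
    weighted_sum = sum(gaussian_kernel((y - yi) / h) / yi for yi in sample)
    return mu_hat * weighted_sum / (n * h)
```

Because the weights mu_hat / (n * Y_i) sum to one, the estimate integrates to one over the real line, which gives a quick sanity check for any implementation.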

2.
The problem of selecting the bandwidth for optimal kernel density estimation at a point is considered. A class of local bandwidth selectors which minimize smoothed bootstrap estimates of mean-squared error in density estimation is introduced. It is proved that the bandwidth selectors in the class achieve optimal relative rates of convergence, dependent upon the local smoothness of the target density. Practical implementation of the bandwidth selection methodology is discussed. The use of Gaussian-based kernels to facilitate computation of the smoothed bootstrap estimate of mean-squared error is proposed. The performance of the bandwidth selectors is investigated empirically.

3.
We consider the problem of data-based choice of the bandwidth of a kernel density estimator, with an aim to estimate the density optimally at a given design point. The existing local bandwidth selectors seem to be quite sensitive to the underlying density and location of the design point. For instance, some bandwidth selectors perform poorly when estimating a density with bounded support at the median. Others struggle to estimate a density in the tail region or at the trough between the two modes of a multimodal density. We propose a scale-invariant bandwidth selection method such that the resulting density estimator performs reliably irrespective of the density or the design point. We choose the bandwidth by minimizing a bootstrap estimate of the mean squared error (MSE) of a density estimator. Our bootstrap MSE estimator is different in the sense that we estimate the variance and squared bias components separately. We provide insight into the asymptotic accuracy of the proposed density estimator.

4.
The periodic multiplicative intensity model is considered. A new bootstrap method for nonstationary counting processes whose intensity function has some periodicity properties is presented. Its main advantage is that it does not destroy the temporal order and the original periodicity of the underlying counting process. The proposed algorithm is used to construct a bootstrap version of the maximum likelihood hazard function estimator. The consistency of the bootstrap method is shown. A possible modification of the proposed bootstrap method is discussed. Bootstrap simultaneous confidence intervals for the hazard function are presented. A real-data example on telecommunication network traffic is discussed.

5.
The single bootstrap is implemented by using a saddlepoint approximation to determine estimates for the survival and hazard functions of first-passage times in complicated semi-Markov processes. The double bootstrap is also implemented by resampling saddlepoint inversions and provides BCa confidence bands for these functions. Confidence intervals for the mean and variance of first-passage times are easily computed. A new characterization of the asymptotic hazard rate for survival times is presented and leads to an indirect method for constructing its bootstrap confidence interval.

6.
Our goal is to find a regression technique that can be used in a small-sample situation with possible model misspecification. The development of a new bandwidth selector allows nonparametric regression (in conjunction with least squares) to be used in this small-sample problem, where nonparametric procedures have previously proven to be inadequate. Considered here are two new semiparametric (model-robust) regression techniques that combine parametric and nonparametric techniques when there is partial information present about the underlying model. A general overview is given of how typical concerns for bandwidth selection in nonparametric regression extend to the model-robust procedures. A new penalized PRESS criterion (with a graphical selection strategy for applications) is developed that overcomes these concerns and is able to maintain the beneficial mean squared error properties of the new model-robust methods. It is shown that this new selector outperforms standard and recently improved bandwidth selectors. Comparisons of the selectors are made via numerous generated data examples and a small simulation study.

7.
We consider kernel density estimation when the observations are contaminated by measurement errors. It is well known that the success of kernel estimators depends heavily on the choice of a smoothing parameter called the bandwidth. A number of data-driven bandwidth selectors exist, but they are all global. Such techniques are appropriate when the density is relatively simple, but local bandwidth selectors can be more attractive in more complex settings. We suggest several data-driven local bandwidth selectors and illustrate via simulations the significant improvement they can bring over a global bandwidth.

8.
This article is concerned with one discrete nonparametric kernel and two parametric regression approaches for providing the evolution law of pavement deterioration. The first parametric approach is a survival data analysis method, and the second is a nonlinear mixed-effects model. The nonparametric approach consists of a regression estimator using the discrete associated kernels. Some asymptotic properties of the discrete nonparametric kernel estimator are shown, in particular its almost sure consistency. Moreover, two data-driven bandwidth selection methods are also given, with a new theoretical explicit expression of the optimal bandwidth provided for this nonparametric estimator. A comparative simulation study is carried out with an application of bootstrap methods to a measure of statistical accuracy.

9.
Bandwidth selection is an important problem in kernel density estimation. Traditional simple and quick bandwidth selectors usually oversmooth the density estimate. Existing sophisticated selectors usually have computational difficulties and occasionally do not exist. Besides, they may not be robust against outliers in the sample data, and some are highly variable, tending to undersmooth the density. In this paper, a highly robust, simple, and quick bandwidth selector is proposed, which adapts to different types of densities.
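
For context, one familiar example of a "simple and quick" selector is the normal-reference rule of thumb, h = 0.9 * min(sd, IQR / 1.34) * n^(-1/5). The abstract does not name the traditional selectors it refers to, so treating this rule as one of them is an assumption, and the sketch below is not the robust selector the paper proposes. The function name is hypothetical.

```python
import statistics


def rule_of_thumb_bandwidth(sample):
    """Normal-reference rule of thumb for a Gaussian-kernel density estimate.

    Needs at least two observations; uses the smaller of the standard
    deviation and the rescaled interquartile range as the spread estimate.
    """
    n = len(sample)
    sd = statistics.stdev(sample)
    q1, _, q3 = statistics.quantiles(sample, n=4)  # quartile cut points
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)
```

The rule is calibrated to a normal density, which is one reason selectors of this kind tend to oversmooth multimodal or heavily skewed data.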

10.
Recurrent event data are largely characterized by the rate function but smoothing techniques for estimating the rate function have never been rigorously developed or studied in the statistical literature. This paper considers the moment and least squares methods for estimating the rate function from recurrent event data. With an independent censoring assumption on the recurrent event process, we study statistical properties of the proposed estimators and propose bootstrap procedures for the bandwidth selection and for the approximation of confidence intervals in the estimation of the occurrence rate function. It is identified that the moment method without resmoothing via a smaller bandwidth will produce a curve with nicks occurring at the censoring times, whereas there is no such problem with the least squares method. Furthermore, the asymptotic variance of the least squares estimator is shown to be smaller under regularity conditions. However, in the implementation of the bootstrap procedures, the moment method is computationally more efficient than the least squares method because the former approach uses condensed bootstrap data. The performance of the proposed procedures is studied through Monte Carlo simulations and an epidemiological example on intravenous drug users.

11.
Two very effective data-based procedures which are simple and fast to compute are proposed for selecting the number of bins in a histogram. The idea is to choose the number of bins that minimizes the circumference (or a bootstrap estimate of the expected circumference) of the frequency histogram. Contrary to most rules derived in the literature, our method is therefore not dependent on precise asymptotic analyses. It is shown by means of an extensive Monte-Carlo study that our selectors perform well in comparison with recently suggested selectors in the literature, for a wide range of density functions and sample sizes. The behaviour of one of the proposed rules is also illustrated on real data sets.

12.
We propose a new modified (biased) cross-validation method for adaptively determining the bandwidth in a nonparametric density estimation setup. It is shown that the method provides consistent minimizers. Some simulation results are reported which compare the small-sample behavior of the new and the classical cross-validation selectors.
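
The classical selector used as the baseline here can be sketched as least-squares (unbiased) cross-validation. For a Gaussian kernel the criterion has a closed form, LSCV(h) = (1 / n^2) * sum_i sum_j phi(X_i - X_j; sd = h * sqrt(2)) - (2 / (n (n - 1))) * sum_{i != j} phi(X_i - X_j; sd = h), where phi is a zero-mean normal density. The Gaussian kernel, the grid search, and all function names are assumptions made for illustration; the modified (biased) criterion proposed in the abstract is not reproduced.

```python
import math


def normal_pdf(u, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def lscv_score(sample, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel estimate."""
    n = len(sample)
    # Integral of the squared estimate: two Gaussian kernels with bandwidth h
    # convolve to a normal density with standard deviation h * sqrt(2).
    squared_term = sum(
        normal_pdf(xi - xj, h * math.sqrt(2.0)) for xi in sample for xj in sample
    ) / n ** 2
    # Sum of the kernel over all ordered pairs with i != j (leave-one-out part).
    pair_sum = sum(
        normal_pdf(xi - xj, h)
        for i, xi in enumerate(sample)
        for j, xj in enumerate(sample)
        if i != j
    )
    return squared_term - 2.0 * pair_sum / (n * (n - 1))


def lscv_bandwidth(sample, grid):
    """Return the candidate bandwidth in `grid` with the smallest LSCV score."""
    return min(grid, key=lambda h: lscv_score(sample, h))
```

A grid search keeps the sketch simple; the criterion can have several local minima, so a numerical optimizer would need a sensible starting point.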

13.
A new kernel distribution function (df) estimator based on a non-parametric transformation of the data is proposed. It is shown that the asymptotic bias and mean squared error of the estimator are considerably smaller than those of the standard kernel df estimator. For the practical implementation of the new estimator a data-based choice of the bandwidth is proposed. Two possible areas of application are the non-parametric smoothed bootstrap and survival analysis. In the latter case new estimators for the survival function and the mean residual life function are derived.

14.
The local linear estimator is a popular method for estimating non-parametric regression functions, and many methods have been derived to estimate the smoothing parameter, or the bandwidth in this case. In this article, we propose an information criterion-based bandwidth selection method, with the degrees of freedom originally derived for non-parametric inferences. Unlike the plug-in method, the new method does not require preliminary parameters to be chosen in advance, and is computationally efficient compared to the cross-validation (CV) method. A numerical study shows that the new method performs better than or comparably to the existing plug-in and CV methods in terms of the estimation of the mean functions, and has lower variability than CV selectors. Real data applications are also provided to illustrate the effectiveness of the new method.

15.
The choice of the bandwidth is a crucial issue for kernel density estimation. Among all the data-dependent methods for choosing the bandwidth, the direct plug-in method has shown a particularly good performance in practice. This procedure is based on estimating an asymptotic approximation of the optimal bandwidth, using two “pilot” kernel estimation stages. Although two pilot stages seem to be enough for most densities, for a long time the problem of how to choose an appropriate number of stages has remained open. Here we propose an automatic (i.e., data-based) method for choosing the number of stages to be employed in the plug-in bandwidth selector. Asymptotic properties of the method are presented and an extensive simulation study is carried out to compare its small-sample performance with that of the most recommended bandwidth selectors in the literature.

16.
Spatial point pattern data sets are commonplace in a variety of different research disciplines. The use of kernel methods to smooth such data is a flexible way to explore spatial trends and make inference about underlying processes without, or perhaps prior to, the design and fitting of more intricate semiparametric or parametric models to quantify specific effects. The long-standing issue of ‘optimal’ data-driven bandwidth selection is complicated in these settings by issues such as high heterogeneity in observed patterns and the need to consider edge correction factors. We scrutinize bandwidth selectors built on leave-one-out cross-validation approximation to likelihood functions. A key outcome relates to previously unconsidered adaptive smoothing regimens for spatiotemporal density and multitype conditional probability surface estimation, whereby we propose a novel simultaneous pilot-global selection strategy. Motivated by applications in epidemiology, the results of both simulated and real-world analyses suggest this strategy to be largely preferable to classical fixed-bandwidth estimation for such data.

17.
Alternative methods of estimating properties of unknown distributions include the bootstrap and the smoothed bootstrap. In the standard bootstrap setting, Johns (1988) introduced an importance resampling procedure that results in more accurate approximation to the bootstrap estimate of a distribution function or a quantile. With a suitable “exponential tilting” similar to that used by Johns, we derived a smoothed version of importance resampling in the framework of the smoothed bootstrap. Smoothed importance resampling procedures were developed for the estimation of distribution functions of the Studentized mean, the Studentized variance, and the correlation coefficient. Implementation of these procedures is presented via simulation results which concentrate on the problem of estimation of distribution functions of the Studentized mean and Studentized variance for different sample sizes and various pre-specified smoothing bandwidths for normal data; additional simulations were conducted for the estimation of quantiles of the distribution of the Studentized mean under an optimal smoothing bandwidth when the original data were simulated from three different parent populations: lognormal, t(3) and t(10). These results suggest that in cases where it is advantageous to use the smoothed bootstrap rather than the standard bootstrap, the amount of resampling necessary might be substantially reduced by the use of importance resampling methods and the efficiency gains depend on the bandwidth used in the kernel density estimation.
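
The difference between the standard and the smoothed bootstrap can be sketched in a few lines: the smoothed version resamples the data with replacement and then adds kernel noise scaled by the bandwidth, which amounts to sampling from a kernel density estimate. The Gaussian kernel and the function name are assumptions for illustration; the importance resampling and exponential tilting steps described in the abstract are not included.

```python
import random


def smoothed_bootstrap_sample(sample, h, rng):
    """Draw one smoothed bootstrap resample of the same size as `sample`.

    Each draw picks an observation uniformly at random and perturbs it with
    Gaussian noise of standard deviation h. With h = 0 this reduces to the
    standard (unsmoothed) bootstrap.
    """
    return [rng.choice(sample) + h * rng.gauss(0.0, 1.0) for _ in sample]
```

A typical call would be `smoothed_bootstrap_sample(data, 0.5, random.Random(0))`, repeated many times to approximate the sampling distribution of a statistic.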

18.
The performance of multivariate kernel density estimates depends crucially on the choice of bandwidth matrix, but progress towards developing good bandwidth matrix selectors has been relatively slow. In particular, previous studies of cross-validation (CV) methods have been restricted to biased and unbiased CV selection of diagonal bandwidth matrices. However, for certain types of target density the use of full (i.e. unconstrained) bandwidth matrices offers the potential for significantly improved density estimation. In this paper, we generalize earlier work from diagonal to full bandwidth matrices, and develop a smooth cross-validation (SCV) methodology for multivariate data. We consider optimization of the SCV technique with respect to a pilot bandwidth matrix. All the CV methods are studied using asymptotic analysis, simulation experiments and real data analysis. The results suggest that SCV for full bandwidth matrices is the most reliable of the CV methods. We also observe that experience from the univariate setting can sometimes be a misleading guide for understanding bandwidth selection in the multivariate case.

19.
The random censorship model (RCM) is commonly used in biomedical science for modeling life distributions. The popular non-parametric Kaplan–Meier estimator and some semiparametric models such as Cox proportional hazard models are extensively discussed in the literature. In this paper, we propose to fit the RCM with the assumption that the actual life distribution and the censoring distribution have a proportional odds relationship. The parametric model is defined using Marshall–Olkin's extended Weibull distribution. We utilize the maximum-likelihood procedure to estimate model parameters, the survival distribution, the mean residual life function, and the hazard rate as well. The proportional odds assumption is also justified by the newly proposed bootstrap Kolmogorov–Smirnov type goodness-of-fit test. A simulation study on the MLE of model parameters and the median survival time is carried out to assess the finite sample performance of the model. Finally, we implement the proposed model on two real-life data sets.
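
As background for the non-parametric baseline mentioned in this abstract, the Kaplan–Meier estimator multiplies, over the distinct event times, the factors 1 - d_i / n_i, where d_i is the number of events at time t_i and n_i the number still at risk just before t_i. The sketch below is a minimal version under the usual right-censoring setup; the function name and input layout are illustrative assumptions, and the proportional odds model from the paper is not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    `times` holds observed times; `events` holds 1 for an observed event
    and 0 for a right-censored observation, in the same order.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Events and censored observations both leave the risk set after t.
        at_risk -= leaving
    return curve
```

For example, with times 1, 2, 3, 4 and the second observation censored, the estimate drops to 0.75 at t = 1, to 0.375 at t = 3, and to 0 at t = 4.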

20.
In this article, we give the asymptotic mean integrated squared error and the mean squared error for the kernel estimator of the hazard rate from truncated and censored data. Martingale techniques and combinatory calculus are used to obtain these results. A probability bound and the optimal bandwidth choice are also given.
