首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
In order to explore and compare a finite number T of data sets by applying functional principal component analysis (FPCA) to the T associated probability density functions, we estimate these density functions by using the multivariate kernel method. The data set sizes being fixed, we study the behaviour of this FPCA under the assumption that all the bandwidth matrices used in the estimation of densities are proportional to a common parameter h and proportional to either the variance matrices or the identity matrix. In this context, we propose a selection criterion of the parameter h which depends only on the data and the FPCA method. Then, on simulated examples, we compare the quality of approximation of the FPCA when the bandwidth matrices are selected using either the previous criterion or two other classical bandwidth selection methods, that is, a plug-in or a cross-validation method.  相似文献   

2.
This paper demonstrates that cross-validation (CV) and Bayesian adaptive bandwidth selection can be applied in the estimation of associated kernel discrete functions. This idea is originally proposed by Brewer [A Bayesian model for local smoothing in kernel density estimation, Stat. Comput. 10 (2000), pp. 299–309] to derive variable bandwidths in adaptive kernel density estimation. Our approach considers the adaptive binomial kernel estimator and treats the variable bandwidths as parameters with beta prior distribution. The best variable bandwidth selector is estimated by the posterior mean in the Bayesian sense under squared error loss. Monte Carlo simulations are conducted to examine the performance of the proposed Bayesian adaptive approach in comparison with the performance of the Asymptotic mean integrated squared error estimator and CV technique for selecting a global (fixed) bandwidth proposed in Kokonendji and Senga Kiessé [Discrete associated kernels method and extensions, Stat. Methodol. 8 (2011), pp. 497–516]. The Bayesian adaptive bandwidth estimator performs better than the global bandwidth, in particular for small and moderate sample sizes.  相似文献   

3.
This paper studies a functional coe?cient time series model with trending regressors, where the coe?cients are unknown functions of time and random variables. We propose a local linear estimation method to estimate the unknown coe?cient functions, and establish the corresponding asymptotic theory under mild conditions. We also develop a test procedure to see if the functional coe?cients take particular parametric forms. For practical use, we further propose a Bayesian approach to select the bandwidths, and conduct several numerical experiments to examine the finite sample performance of our proposed local linear estimator and the test procedure. The results show that the local linear estimator works well and the proposed test has satisfactory size and power. In addition, our simulation studies show that the Bayesian bandwidth selection method performs better than the cross-validation method. Furthermore, we use the functional coe?cient model to study the relationship between consumption per capita and income per capita in United States, and it was shown that the functional coe?cient model with our proposed local linear estimator and Bayesian bandwidth selection method performs well in both in-sample fitting and out-of-sample forecasting.  相似文献   

4.
Research in the area of bandwidth selection was an active topic in the 1980s and 1990s, however, recently there has been little research in the area. We re-opened this investigation and have found a new method for estimating mean integrated squared error for kernel density estimators. We provide an overview of other methods to obtain optimal bandwidths and offer a comparison of these methods via a simulation study. In certain situations, our method of estimating an optimal bandwidth yields a smaller MISE than competing methods to compute bandwidths. This procedure is illustrated by an application to two data sets.  相似文献   

5.
In this paper, we investigate the problem of testing semiparametric hypotheses in locally stationary processes. The proposed method is based on an empirical version of the L2‐distance between the true time varying spectral density and its best approximation under the null hypothesis. As this approach only requires estimation of integrals of the time varying spectral density and its square, we do not have to choose a smoothing bandwidth for the local estimation of the spectral density – in contrast to most other procedures discussed in the literature. Asymptotic normality of the test statistic is derived both under the null hypothesis and the alternative. We also propose a bootstrap procedure to obtain critical values in the case of small sample sizes. Additionally, we investigate the finite sample properties of the new method and compare it with the currently available procedures by means of a simulation study. Finally, we illustrate the performance of the new test in two data examples, one regarding log returns of the S&P 500 and the other a well‐known series of weekly egg prices.  相似文献   

6.
Summary.  We construct approximate confidence intervals for a nonparametric regression function, using polynomial splines with free-knot locations. The number of knots is determined by generalized cross-validation. The estimates of knot locations and coefficients are obtained through a non-linear least squares solution that corresponds to the maximum likelihood estimate. Confidence intervals are then constructed based on the asymptotic distribution of the maximum likelihood estimator. Average coverage probabilities and the accuracy of the estimate are examined via simulation. This includes comparisons between our method and some existing methods such as smoothing spline and variable knots selection as well as a Bayesian version of the variable knots method. Simulation results indicate that our method works well for smooth underlying functions and also reasonably well for discontinuous functions. It also performs well for fairly small sample sizes.  相似文献   

7.
Summary.  The paper introduces a new local polynomial estimator and develops supporting asymptotic theory for nonparametric regression in the presence of covariate measurement error. We address the measurement error with Cook and Stefanski's simulation–extrapolation (SIMEX) algorithm. Our method improves on previous local polynomial estimators for this problem by using a bandwidth selection procedure that addresses SIMEX's particular estimation method and considers higher degree local polynomial estimators. We illustrate the accuracy of our asymptotic expressions with a Monte Carlo study, compare our method with other estimators with a second set of Monte Carlo simulations and apply our method to a data set from nutritional epidemiology. SIMEX was originally developed for parametric models. Although SIMEX is, in principle, applicable to nonparametric models, a serious problem arises with SIMEX in nonparametric situations. The problem is that smoothing parameter selectors that are developed for data without measurement error are no longer appropriate and can result in considerable undersmoothing. We believe that this is the first paper to address this difficulty.  相似文献   

8.
Typically, parametric approaches to spatial problems require restrictive assumptions. On the other hand, in a wide variety of practical situations nonparametric bivariate smoothing techniques has been shown to be successfully employable for estimating small or large scale regularity factors, or even the signal content of spatial data taken as a whole.We propose a weighted local polynomial regression smoother suitable for fitting of spatial data. To account for spatial variability, we both insert a spatial contiguity index in the standard formulation, and construct a spatial-adaptive bandwidth selection rule. Our bandwidth selector depends on the Gearys local indicator of spatial association. As illustrative example, we provide a brief Monte Carlo study case on equally spaced data, the performances of our smoother and the standard polynomial regression procedure are compared.This note, though it is the result of a close collaboration, was specifically elaborated as follows: paragraphs 1 and 2 by T. Sclocco and the remainder by M. Di Marzio. The authors are grateful to the referees for constructive comments and suggestions.  相似文献   

9.
In this article, we are interested in comparing growth curves for the Red Delicious apple in several locations to that of a reference site. Although such multiple comparisons are common for linear models, statistical techniques for nonlinear models are not prolific. We theoretically derive a test statistic, considering the issues of sample size and design points. Under equal sample sizes and same design points, our test statistic is based on the maximum of an equi-correlated multivariate chi-square distribution. Under unequal sample sizes and design points, we derive a general correlation structure, and then utilize the multivariate normal distribution to numerically compute critical points for the maximum of the multivariate chi-square. We apply this statistical technique to compare the growth of Red Delicious apples at six locations to a reference site in the state of Washington in 2009. Finally, we perform simulations to verify the performance of our proposed procedure for Type I error and marginal power. Our proposed method performs well in regard to both.  相似文献   

10.
ABSTRACT

Local linear estimator is a popularly used method to estimate the non-parametric regression functions, and many methods have been derived to estimate the smoothing parameter, or the bandwidth in this case. In this article, we propose an information criterion-based bandwidth selection method, with the degrees of freedom originally derived for non-parametric inferences. Unlike the plug-in method, the new method does not require preliminary parameters to be chosen in advance, and is computationally efficient compared to the cross-validation (CV) method. Numerical study shows that the new method performs better or comparable to existing plug-in method or CV method in terms of the estimation of the mean functions, and has lower variability than CV selectors. Real data applications are also provided to illustrate the effectiveness of the new method.  相似文献   

11.
Kernel smoothing of spatial point data can often be improved using an adaptive, spatially varying bandwidth instead of a fixed bandwidth. However, computation with a varying bandwidth is much more demanding, especially when edge correction and bandwidth selection are involved. This paper proposes several new computational methods for adaptive kernel estimation from spatial point pattern data. A key idea is that a variable-bandwidth kernel estimator for d-dimensional spatial data can be represented as a slice of a fixed-bandwidth kernel estimator in \((d+1)\)-dimensional scale space, enabling fast computation using Fourier transforms. Edge correction factors have a similar representation. Different values of global bandwidth correspond to different slices of the scale space, so that bandwidth selection is greatly accelerated. Potential applications include estimation of multivariate probability density and spatial or spatiotemporal point process intensity, relative risk, and regression functions. The new methods perform well in simulations and in two real applications concerning the spatial epidemiology of primary biliary cirrhosis and the alarm calls of capuchin monkeys.  相似文献   

12.

Motivated by the study of traffic accidents on a road network, we discuss the estimation of the relative risk, the ratio of rates of occurrence of different types of events occurring on a network of lines. Methods developed for two-dimensional spatial point patterns can be adapted to a linear network, but their requirements and performance are very different on a network. Computation is slow and we introduce new techniques to accelerate it. Intensities (occurrence rates) are estimated by kernel smoothing using the heat kernel on the network. The main methodological problem is bandwidth selection. Binary regression methods, such as likelihood cross-validation and least squares cross-validation, perform tolerably well in our simulation experiments, but the Kelsall–Diggle density-ratio cross-validation method does not. We find a theoretical explanation, and propose a modification of the Kelsall–Diggle method which has better performance. The methods are applied to traffic accidents in a regional city, and to protrusions on the dendritic tree of a neuron.

  相似文献   

13.
We consider the problem of data-based choice of the bandwidth of a kernel density estimator, with an aim to estimate the density optimally at a given design point. The existing local bandwidth selectors seem to be quite sensitive to the underlying density and location of the design point. For instance, some bandwidth selectors perform poorly while estimating a density, with bounded support, at the median. Others struggle to estimate a density in the tail region or at the trough between the two modes of a multimodal density. We propose a scale invariant bandwidth selection method such that the resulting density estimator performs reliably irrespective of the density or the design point. We choose bandwidth by minimizing a bootstrap estimate of the mean squared error (MSE) of a density estimator. Our bootstrap MSE estimator is different in the sense that we estimate the variance and squared bias components separately. We provide insight into the asymptotic accuracy of the proposed density estimator.  相似文献   

14.
A crucial problem in kernel density estimates of a probability density function is the selection of the bandwidth. The aim of this study is to propose a procedure for selecting both fixed and variable bandwidths. The present study also addresses the question of how different variable bandwidth kernel estimators perform in comparison with each other and to the fixed type of bandwidth estimators. The appropriate algorithms for implementation of the proposed method are given along with a numerical simulation.The numerical results serve as a guide to determine which bandwidth selection method is most appropriate for a given type of estimator over a vide class of probability density functions, Also, we obtain a numerical comparison of the different types of kernel estimators under various types of bandwidths.  相似文献   

15.
In many medical studies patients are nested or clustered within doctor. With many explanatory variables, variable selection with clustered data can be challenging. We propose a method for variable selection based on random forest that addresses clustered data through stratified binary splits. Our motivating example involves the detection orthopedic device components from a large pool of candidates, where each patient belongs to a surgeon. Simulations compare the performance of survival forests grown using the stratified logrank statistic to conventional and robust logrank statistics, as well as a method to select variables using a threshold value based on a variable's empirical null distribution. The stratified logrank test performs superior to conventional and robust methods when data are generated to have cluster-specific effects, and when cluster sizes are sufficiently large, perform comparably to the splitting alternatives in the absence of cluster-specific effects. Thresholding was effective at distinguishing between important and unimportant variables.  相似文献   

16.
Abstract.  In this paper, we consider a semiparametric time-varying coefficients regression model where the influences of some covariates vary non-parametrically with time while the effects of the remaining covariates follow certain parametric functions of time. The weighted least squares type estimators for the unknown parameters of the parametric coefficient functions as well as the estimators for the non-parametric coefficient functions are developed. We show that the kernel smoothing that avoids modelling of the sampling times is asymptotically more efficient than a single nearest neighbour smoothing that depends on the estimation of the sampling model. The asymptotic optimal bandwidth is also derived. A hypothesis testing procedure is proposed to test whether some covariate effects follow certain parametric forms. Simulation studies are conducted to compare the finite sample performances of the kernel neighbourhood smoothing and the single nearest neighbour smoothing and to check the empirical sizes and powers of the proposed testing procedures. An application to a data set from an AIDS clinical trial study is provided for illustration.  相似文献   

17.
The performances of data-driven bandwidth selection procedures in local polynomial regression are investigated by using asymptotic methods and simulation. The bandwidth selection procedures considered are based on minimizing 'prelimit' approximations to the (conditional) mean-squared error (MSE) when the MSE is considered as a function of the bandwidth h . We first consider approximations to the MSE that are based on Taylor expansions around h=0 of the bias part of the MSE. These approximations lead to estimators of the MSE that are accurate only for small bandwidths h . We also consider a bias estimator which instead of using small h approximations to bias naïvely estimates bias as the difference of two local polynomial estimators of different order and we show that this estimator performs well only for moderate to large h . We next define a hybrid bias estimator which equals the Taylor-expansion-based estimator for small h and the difference estimator for moderate to large h . We find that the MSE estimator based on this hybrid bias estimator leads to a bandwidth selection procedure with good asymptotic and, for our Monte Carlo examples, finite sample properties.  相似文献   

18.
In this article we consider data-sharpening methods for nonparametric regression. In particular modifications are made to existing methods in the following two directions. First, we introduce a new tuning parameter to control the extent to which the data are to be sharpened, so that the amount of sharpening is adaptive and can be tuned to best suit the data at hand. We call this new parameter the sharpening parameter. Second, we develop automatic methods for jointly choosing the value of this sharpening parameter as well as the values of other required smoothing parameters. These automatic parameter selection methods are shown to be asymptotically optimal in a well defined sense. Numerical experiments were also conducted to evaluate their finite-sample performances. To the best of our knowledge, there is no bandwidth selection method developed in the literature for sharpened nonparametric regression.  相似文献   

19.
In non-parametric function estimation selection of a smoothing parameter is one of the most important issues. The performance of smoothing techniques depends highly on the choice of this parameter. Preferably the bandwidth should be determined via a data-driven procedure. In this paper we consider kernel estimators in a white noise model, and investigate whether locally adaptive plug-in bandwidths can achieve optimal global rates of convergence. We consider various classes of functions: Sobolev classes, bounded variation function classes, classes of convex functions and classes of monotone functions. We study the situations of pilot estimation with oversmoothing and without oversmoothing. Our main finding is that simple local plug-in bandwidth selectors can adapt to spatial inhomogeneity of the regression function as long as there are no local oscillations of high frequency. We establish the pointwise asymptotic distribution of the regression estimator with local plug-in bandwidth.  相似文献   

20.
We consider small sample equivalence tests for exponentialy. Statistical inference in this setting is particularly challenging since equivalence testing procedures typically require much larger sample sizes, in comparison with classical “difference tests,” to perform well. We make use of Butler's marginal likelihood for the shape parameter of a gamma distribution in our development of small sample equivalence tests for exponentiality. We consider two procedures using the principle of confidence interval inclusion, four Bayesian methods, and the uniformly most powerful unbiased (UMPU) test where a saddlepoint approximation to the intractable distribution of a canonical sufficient statistic is used. We perform small sample simulation studies to assess the bias of our various tests and show that all of the Bayes posteriors we consider are integrable. Our simulation studies show that the saddlepoint-approximated UMPU method performs remarkably well for small sample sizes and is the only method that consistently exhibits an empirical significance level close to the nominal 5% level.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号