Similar articles
20 similar articles found; search took 593 ms
1.
In order to explore and compare a finite number T of data sets by applying functional principal component analysis (FPCA) to the T associated probability density functions, we estimate these density functions by using the multivariate kernel method. The data set sizes being fixed, we study the behaviour of this FPCA under the assumption that all the bandwidth matrices used in the estimation of densities are proportional to a common parameter h and proportional to either the variance matrices or the identity matrix. In this context, we propose a selection criterion for the parameter h that depends only on the data and the FPCA method. Then, on simulated examples, we compare the quality of approximation of the FPCA when the bandwidth matrices are selected using either the previous criterion or two other classical bandwidth selection methods, that is, a plug-in or a cross-validation method.

2.
In non-parametric function estimation, the selection of a smoothing parameter is one of the most important issues. The performance of smoothing techniques depends highly on the choice of this parameter. Preferably, the bandwidth should be determined via a data-driven procedure. In this paper we consider kernel estimators in a white noise model, and investigate whether locally adaptive plug-in bandwidths can achieve optimal global rates of convergence. We consider various classes of functions: Sobolev classes, bounded variation function classes, classes of convex functions and classes of monotone functions. We study the situations of pilot estimation with oversmoothing and without oversmoothing. Our main finding is that simple local plug-in bandwidth selectors can adapt to spatial inhomogeneity of the regression function as long as there are no local oscillations of high frequency. We establish the pointwise asymptotic distribution of the regression estimator with local plug-in bandwidth.

3.
Abstract

An exact, closed-form, and easy-to-compute expression for the mean integrated squared error (MISE) of a kernel estimator of a normal mixture cumulative distribution function is derived for the class of arbitrary-order Gaussian-based kernels. Comparisons are made with the MISE of the empirical distribution function, the infeasible minimum MISE, and the uniform kernel. A simple plug-in method of simultaneously selecting the optimal bandwidth and kernel order is proposed based on a non-asymptotic approximation of the unknown distribution by a normal mixture. A simulation study shows that the method provides a viable alternative to existing bandwidth selection procedures.

4.
ABSTRACT

A frequently encountered statistical problem is to determine whether the variability among k populations is heterogeneous. If the populations are measured on different scales, comparing variances may not be appropriate. In this case, the coefficients of variation (CVs) can be compared instead, because the CV is unitless. In this paper, a non-parametric test is introduced to test whether the CVs from k populations are different. Under the assumption that the populations are independent and normally distributed, the Miller test, the Feltz and Miller test, the saddlepoint-based test, the log likelihood ratio test and the proposed simulated Bartlett-corrected log likelihood ratio test are derived. Simulation results show the extreme accuracy of the simulated Bartlett-corrected log likelihood ratio test if the model is correctly specified. If the model is mis-specified and the sample size is small, the proposed test still gives good results. However, with a mis-specified model and a large sample size, the non-parametric test is recommended.
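The claim that the CV is unitless can be checked with a small sketch. This is only an illustration and not part of the paper: the function name `coefficient_of_variation` and the height data are made up, and the sample CV is taken as the sample standard deviation divided by the sample mean.

```python
import statistics

def coefficient_of_variation(values):
    """Sample CV: sample standard deviation divided by the sample mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean

# Hypothetical data: the same heights recorded in centimetres and in inches.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
heights_in = [x / 2.54 for x in heights_cm]

# The variances differ by the squared conversion factor, the CVs do not.
variance_ratio = statistics.variance(heights_cm) / statistics.variance(heights_in)
cv_cm = coefficient_of_variation(heights_cm)
cv_in = coefficient_of_variation(heights_in)
```

Rescaling multiplies the standard deviation and the mean by the same constant, so the ratio is unchanged, which is why the CV remains comparable across populations measured on different scales.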

5.
The geographical relative risk function is a useful tool for investigating the spatial distribution of disease based on case and control data. The most common way of estimating this function is using the ratio of bivariate kernel density estimates constructed from the locations of cases and controls, respectively. An alternative is to use a local-linear (LL) estimator of the log-relative risk function. In both cases, the choice of bandwidth is critical. In this article, we examine the relative performance of the two estimation techniques using a variety of data-driven bandwidth selection methods, including likelihood cross-validation (CV), least-squares CV, rule-of-thumb reference methods, and a new approximate plug-in (PI) bandwidth for the LL estimator. Our analysis includes the comparison of asymptotic results; a simulation study; and application of the estimators to two real data sets. Our findings suggest that the density ratio method implemented with the least-squares CV bandwidth selector is generally best, with the LL estimator with PI bandwidth being competitive in applications with strong large-scale trends but much worse in situations with elliptical clusters.
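The density ratio approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses an isotropic bivariate Gaussian kernel, takes the two bandwidths as given rather than selecting them by least-squares CV, and the names `kde2d` and `log_relative_risk` and the coordinates are hypothetical.

```python
import math

def kde2d(point, sample, h):
    """Bivariate Gaussian kernel density estimate with scalar bandwidth h."""
    x, y = point
    total = 0.0
    for sx, sy in sample:
        sq_dist = (x - sx) ** 2 + (y - sy) ** 2
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(sample) * 2.0 * math.pi * h * h)

def log_relative_risk(point, cases, controls, h_cases, h_controls):
    """Log of the ratio of the case density to the control density."""
    return math.log(kde2d(point, cases, h_cases)) - math.log(
        kde2d(point, controls, h_controls)
    )

# Made-up locations: cases cluster near (0, 0), controls near (1, 1).
cases = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
controls = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
risk_near_cases = log_relative_risk((0.0, 0.0), cases, controls, 0.5, 0.5)
```

A positive value indicates a location where cases are over-represented relative to controls. Far from all observations both densities can underflow to zero, so a practical version would need to guard the logarithms.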

6.
Abstract

The gambler's ruin problem is one of the most important problems in the emergence of probability. The problem has long been considered “solved” from a probabilistic viewpoint. However, we do not find the solution satisfactory. In this paper, the problem is recast as a statistical problem. Bounds of the estimate are derived over wide classes of priors. Interestingly, the probabilistic estimates ω(1/2) are identified as the most conservative solutions while the plug-in estimates are found to be out of range of the bounds. This implies that, although conservative, the probabilistic estimates ω(1/2) are justified by our analysis while the plug-in estimates are too extreme for estimating the gambler's ruin probability.
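For orientation, the textbook ruin probability and a naive plug-in estimate can be sketched as below. This is an assumption-laden illustration: the abstract does not define ω(1/2) or its plug-in estimator, so the sketch only uses the classical formula for a unit-stake game with win probability p, and the names `ruin_probability` and `plug_in_ruin_probability` are hypothetical.

```python
def ruin_probability(p, start, target):
    """P(reach 0 before `target`) for a +/-1 game with win probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= start <= target or target <= 0:
        raise ValueError("need 0 <= start <= target and target > 0")
    if p == 0.5:
        return 1.0 - start / target
    r = (1.0 - p) / p  # odds against winning a single round
    return (r ** start - r ** target) / (1.0 - r ** target)

def plug_in_ruin_probability(wins, games, start, target):
    """Naive plug-in: replace p by the observed win frequency."""
    return ruin_probability(wins / games, start, target)

fair_game = ruin_probability(0.5, 3, 10)
estimated = plug_in_ruin_probability(12, 20, 3, 10)  # 12 wins in 20 games
```

Because the formula is steep in p near 1/2, a small sampling error in the observed frequency moves the plug-in value a long way, which is one way to read the abstract's remark that plug-in estimates are too extreme.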

7.
This article introduces a new specification for the heterogeneous autoregressive (HAR) model for the realized volatility of S&P 500 index returns. In this modeling framework, the coefficients of the HAR are allowed to be time-varying with unspecified functional forms. The local linear method with the cross-validation (CV) bandwidth selection is applied to estimate the time-varying coefficient HAR (TVC-HAR) model, and a bootstrap method is used to construct the point-wise confidence bands for the coefficient functions. Furthermore, the asymptotic distribution of the proposed local linear estimators of the TVC-HAR model is established under some mild conditions. The results of the simulation study show that the local linear estimator with CV bandwidth selection has favorable finite sample properties. The outcomes of the conditional predictive ability test indicate that the proposed nonparametric TVC-HAR model outperforms the parametric HAR and its extension to HAR with jumps and/or GARCH in terms of multi-step out-of-sample forecasting, in particular in the post-2003 crisis and 2007 global financial crisis (GFC) periods, during which financial market volatilities were unduly high.
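The regressors of the parametric HAR model that the article generalizes can be sketched as follows. This assumes the usual daily, weekly (5-day) and monthly (22-day) averages of realized volatility; the abstract does not state the window lengths, the function name `har_regressors` is hypothetical, and the time-varying coefficient estimation itself is not shown.

```python
def har_regressors(rv, week=5, month=22):
    """Build (daily, weekly, monthly, next_day) rows from a realized-volatility series."""
    rows = []
    # Start once a full monthly window is available; stop one step early
    # so that the next-day target rv[t + 1] exists.
    for t in range(month - 1, len(rv) - 1):
        daily = rv[t]
        weekly = sum(rv[t - week + 1 : t + 1]) / week
        monthly = sum(rv[t - month + 1 : t + 1]) / month
        rows.append((daily, weekly, monthly, rv[t + 1]))
    return rows

# Made-up series of 30 "realized volatilities" to show the shapes only.
series = [float(i) for i in range(1, 31)]
rows = har_regressors(series)
```

In the fixed-coefficient HAR these rows feed an ordinary least-squares regression of the next-day value on the three averages; the TVC-HAR model of the article instead lets the three coefficients vary over time.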

8.
Abstract

In this paper, we deal with the problem of estimating the delayed renewal and variance functions in delayed renewal processes. Two parametric plug-in estimators for these functions are proposed and their unbiasedness, asymptotic unbiasedness and consistency properties are investigated. The asymptotic normality of these estimators is established. Further, a method for the computation of the estimators is given. Finally, the performance of the estimators is evaluated for small sample sizes by a simulation study.

9.
Relative potency estimations in both multiple parallel-line and slope-ratio assays involve construction of simultaneous confidence intervals for ratios of linear combinations of general linear model parameters. The key problem here is that of determining multiplicity-adjusted percentage points of a multivariate t-distribution, the correlation matrix R of which depends on the unknown relative potency parameters. Several methods have been proposed in the literature on how to deal with R. In this article, we introduce a method based on an estimate of R (also called the plug-in approach) and compare it with various methods including conservative procedures based on probability inequalities. Attention is restricted to parallel-line assays though the theory is applicable to any ratios of coefficients in the general linear model. Extension of the plug-in method to linear mixed effect models is also discussed. The methods are compared with respect to their simultaneous coverage probabilities via Monte Carlo simulations. We also evaluate the methods in terms of confidence interval width through application to data from a multiple parallel-line assay.

10.
We propose a fast data-driven procedure for decomposing seasonal time series using the Berlin Method, the procedure used, e.g., by the German Federal Statistical Office in this context. The formula for the asymptotically optimal bandwidth h_A is obtained. Methods for estimating the unknowns in h_A are proposed. The algorithm is developed by adapting the well-known iterative plug-in idea to time series decomposition. The asymptotic behaviour of the proposal is investigated. Some computational aspects are discussed in detail. Data examples show that the proposal works very well in practice and that data-driven bandwidth selection offers new possibilities to improve the Berlin Method. Deep insights into the iterative plug-in rule are also provided.

11.
Functional principal component analysis (FPCA) as a data reduction technique for a finite number T of functions can be used to identify the dominant modes of variation of numeric three-way data.

We carry out the FPCA on multidimensional probability density functions, relate this method to other standard methods and define its centered or standardized versions. Grounded on the relationship between FPCA of densities, FPCA of their corresponding characteristic functions, PCA of the MacLaurin expansions of these characteristic functions and the dual STATIS method applied to their variance matrices, we propose a method for interpreting the results of the FPCA of densities. This method is based on the investigation of the relationships between the scores of the FPCA and the moments associated with the densities.

The method is illustrated using known Gaussian densities. In practice, FPCA of densities deals with observations of multidimensional variables on T occasions. These observations can be used to estimate the T associated densities (i) by estimating the parameters of these densities, assuming that they are Gaussian, or (ii) by using the Gaussian kernel method and choosing the bandwidth matrix by the normal reference rule. Thereafter, the FPCA estimate is derived from these estimates and the interpretation method is carried out to explore the dominant modes of variation of the types of three-way data encountered in sensory analysis and archaeology.
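Option (ii) can be illustrated with a short sketch. It assumes the commonly used form of the normal reference rule for a Gaussian kernel, H = (4 / (n (d + 2)))^(2 / (d + 4)) times the sample variance matrix; whether the article uses exactly this constant is an assumption, and the names `normal_reference_factor` and `gaussian_kde_1d` and the data are hypothetical. The density estimate is written for one dimension only to stay within the standard library.

```python
import math
import statistics

def normal_reference_factor(n, d):
    """Scalar c in H = c * Sigma_hat for n observations in d dimensions."""
    return (4.0 / (n * (d + 2))) ** (2.0 / (d + 4))

def gaussian_kde_1d(x, sample):
    """Univariate Gaussian kernel density estimate at x, normal reference bandwidth."""
    n = len(sample)
    # In one dimension H is the squared bandwidth, so h = sqrt(c) * sd.
    h = math.sqrt(normal_reference_factor(n, 1)) * statistics.stdev(sample)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return total / (n * h * math.sqrt(2.0 * math.pi))

sample = [4.1, 4.7, 5.0, 5.2, 5.9, 6.3, 6.8]  # made-up illustrative data
density_at_5 = gaussian_kde_1d(5.0, sample)
```

For d = 1 the factor reduces to the familiar h = (4/3)^(1/5) times the standard deviation times n^(-1/5), roughly 1.06 times the standard deviation times n^(-1/5).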

12.
Consider a regression model where the regression function is the sum of a linear and a nonparametric component. Assuming that the errors of the model follow a stationary strong mixing process with mean zero, the problem of bandwidth selection for a kernel estimator of the nonparametric component is addressed here. We obtain an asymptotic expression for an optimal bandwidth and we propose to use a plug-in methodology in order to estimate this bandwidth through preliminary estimates of the unknown quantities. Asymptotic optimality for the plug-in bandwidth is established.

13.
Abstract

Estimation of quantiles from two normal populations is considered under the assumption of common mean and ordered variances. Several new estimators have been proposed using certain estimators of the common mean, including the plug-in type restricted MLE. A sufficient condition for improving equivariant estimators is proved and, as a result, improved estimators are derived. The percentage risk improvements for each of the improved estimators have been computed numerically and are quite significant. All the improved estimators have been compared numerically using the Monte Carlo simulation method. Finally, recommendations have been made for the use of estimators in practice.

14.
A new, fully data-driven bandwidth selector with a double smoothing (DS) bias term and a data-driven variance estimator is developed following the bootstrap idea. The data-driven variance estimation does not involve any additional bandwidth selection. The proposed bandwidth selector converges faster than a plug-in one due to the DS bias estimate, whereas the data-driven variance clearly improves its finite sample performance and makes it stable. Asymptotic results for the proposals are obtained. A comparative simulation study was done to show the overall gains and the gains obtained by improving either the bias term or the variance estimate, respectively. It is shown that the use of a good variance estimator is more important when the sample size is relatively small.

15.
ABSTRACT

This paper discusses the detailed performance of an iterative plug-in (IPI) bandwidth selector for estimating the diurnal duration pattern in a Semi-ACD (semiparametric autoregressive conditional duration) model. For this purpose a large simulation study was carried out. The effects of different factors that affect the selected bandwidth are discussed in detail. The simulated results and data examples show that the proposed IPI algorithm works very well in practice and that the Semi-ACD model in general is clearly superior to the parametric ACD model if there is a deterministic trend in the duration data. It is also shown that the bandwidth selection, the estimation of the diurnal pattern and the estimation of the model parameters will all be clearly improved if the sample size is enlarged. According to the goodness-of-fit of the estimated diurnal pattern, a best combination of the above-mentioned factors is found. Moreover, a comparative study shows that our proposal usually outperforms the commonly used cubic spline.

16.
Abstract. A kernel regression imputation method for missing response data is developed. A class of bias-corrected empirical log-likelihood ratios for the response mean is defined. It is shown that any member of our class of ratios is asymptotically chi-squared, and the corresponding empirical likelihood confidence interval for the response mean is constructed. Our ratios share some of the desired features of the existing methods: they are self-scale invariant and no plug-in estimators for the adjustment factor and asymptotic variance are needed; when estimating the non-parametric function in the model, undersmoothing to ensure root-n consistency of the estimator for the parameter is avoided. Since the range of bandwidths contains the optimal bandwidth for estimating the regression function, the existing data-driven algorithm is valid for selecting an optimal bandwidth. We also study the normal approximation-based method. A simulation study is undertaken to compare the empirical likelihood with the normal approximation method in terms of coverage accuracies and average lengths of confidence intervals.

17.
This paper demonstrates that cross-validation (CV) and Bayesian adaptive bandwidth selection can be applied in the estimation of associated kernel discrete functions. This idea was originally proposed by Brewer [A Bayesian model for local smoothing in kernel density estimation, Stat. Comput. 10 (2000), pp. 299–309] to derive variable bandwidths in adaptive kernel density estimation. Our approach considers the adaptive binomial kernel estimator and treats the variable bandwidths as parameters with beta prior distribution. The best variable bandwidth selector is estimated by the posterior mean in the Bayesian sense under squared error loss. Monte Carlo simulations are conducted to examine the performance of the proposed Bayesian adaptive approach in comparison with the performance of the asymptotic mean integrated squared error estimator and the CV technique for selecting a global (fixed) bandwidth proposed in Kokonendji and Senga Kiessé [Discrete associated kernels method and extensions, Stat. Methodol. 8 (2011), pp. 497–516]. The Bayesian adaptive bandwidth estimator performs better than the global bandwidth, in particular for small and moderate sample sizes.

18.
In this paper we study the ideal variable bandwidth kernel density estimator introduced by McKay (1993a, b) and Jones et al. (1994) and the plug-in practical version of the variable bandwidth kernel estimator with two sequences of bandwidths as in Giné and Sang (2013). Based on the bias and variance analysis of the ideal and plug-in variable bandwidth kernel density estimators, we study the central limit theorems for each of them. The simulation study confirms the central limit theorem and demonstrates the advantage of the plug-in variable bandwidth kernel method over the classical kernel method.

19.
ABSTRACT

The most important factor in kernel regression is the choice of bandwidth. Considerable attention has been paid to extending the idea of an iterative method known for kernel density estimation to kernel regression. Data-driven selectors of the bandwidth for kernel regression are considered. The proposed method is based on an optimally balanced relation between the integrated variance and the integrated squared bias. This approach leads to an iterative, quadratically convergent process. The analysis of statistical properties shows the rationale of the proposed method, and the consistency of the method is established. The utility of the method is illustrated through a simulation study and real data applications.

20.
Abstract

Based on the Gamma kernel density estimation procedure, this article constructs a nonparametric kernel estimate for the regression function when the covariates are nonnegative. Asymptotic normality and uniform almost sure convergence results for the new estimator are systematically studied, and the finite-sample performance of the proposed estimate is discussed via a simulation study and a comparison study with an existing method. Finally, the proposed estimation procedure is applied to the Geyser data set.
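One way to picture a Gamma-kernel regression estimate is the ratio-type sketch below. It is an assumption, not the article's estimator: it weights each observation by a gamma density with shape x/b + 1 and scale b evaluated at the covariate value (the form often attributed to Chen's gamma kernel density estimator) and combines the responses in Nadaraya-Watson fashion. The name `gamma_kernel_regression`, the smoothing parameter `b` and the data are hypothetical.

```python
import math

def gamma_kernel_regression(x, xs, ys, b):
    """Weighted average of ys with gamma-kernel weights evaluated at positive xs."""
    if x < 0 or b <= 0 or any(t <= 0 for t in xs):
        raise ValueError("need x >= 0, b > 0 and strictly positive covariates")
    # Log of t^(x/b) * exp(-t/b); the normalising constant depends only on
    # x and b, so it cancels in the ratio and is omitted.
    log_w = [(x / b) * math.log(t) - t / b for t in xs]
    shift = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

xs = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]  # made-up nonnegative covariate values
ys = [1.2, 1.9, 2.4, 3.1, 4.2, 4.8]  # made-up responses
fit_at_2 = gamma_kernel_regression(2.0, xs, ys, b=0.3)
```

Because the gamma density has support on the positive half-line, no weight is placed on negative covariate values, which is the usual motivation for such kernels near the boundary at zero.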
