Similar articles
1.
This paper demonstrates that cross-validation (CV) and Bayesian adaptive bandwidth selection can be applied in the estimation of associated kernel discrete functions. This idea was originally proposed by Brewer [A Bayesian model for local smoothing in kernel density estimation, Stat. Comput. 10 (2000), pp. 299–309] to derive variable bandwidths in adaptive kernel density estimation. Our approach considers the adaptive binomial kernel estimator and treats the variable bandwidths as parameters with a beta prior distribution. The best variable bandwidth selector is estimated by the posterior mean in the Bayesian sense under squared error loss. Monte Carlo simulations are conducted to compare the performance of the proposed Bayesian adaptive approach with that of the asymptotic mean integrated squared error estimator and the CV technique for selecting a global (fixed) bandwidth proposed in Kokonendji and Senga Kiessé [Discrete associated kernels method and extensions, Stat. Methodol. 8 (2011), pp. 497–516]. The Bayesian adaptive bandwidth estimator performs better than the global bandwidth, in particular for small and moderate sample sizes.
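As a rough illustration of the kind of estimator this abstract refers to, the sketch below evaluates a fixed-bandwidth binomial kernel estimate of a probability mass function. The kernel form used here, a Binomial(x + 1, (x + h)/(x + 1)) distribution attached to each target point x, is the definition commonly given in the discrete associated kernel literature and is an assumption on our part, since the abstract does not state it. The function names and the toy data are hypothetical, and the Bayesian adaptive bandwidth step is not reproduced.

```python
import math


def binomial_kernel(x, h, y):
    """P(Y = y) for Y ~ Binomial(x + 1, (x + h) / (x + 1)); zero off the support."""
    if y < 0 or y > x + 1:
        return 0.0
    p = (x + h) / (x + 1)
    return math.comb(x + 1, y) * p**y * (1.0 - p) ** (x + 1 - y)


def binomial_kernel_pmf(data, h, x):
    """Fixed-bandwidth estimate at x: average of the kernel at x over the sample."""
    return sum(binomial_kernel(x, h, xi) for xi in data) / len(data)


# Hypothetical count sample and bandwidth h in (0, 1], for illustration only.
counts = [0, 1, 1, 2, 2, 2, 3, 4]
estimate = [binomial_kernel_pmf(counts, 0.1, x) for x in range(7)]
```

The raw estimate is not guaranteed to sum to one over the support, so in practice it may need to be normalized before being read as a probability mass function.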

2.
Since the late 1980s, several methods have been considered in the literature to reduce the sample variability of the least-squares cross-validation bandwidth selector for kernel density estimation. In this article, a weighted version of this classical method is proposed and its asymptotic and finite-sample behavior is studied. The simulation results attest that the weighted cross-validation bandwidth performs quite well, presenting a better finite-sample performance than the standard cross-validation method for “easy-to-estimate” densities, and retaining the good finite-sample performance of the standard cross-validation method for “hard-to-estimate” ones.
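For readers unfamiliar with the classical selector this abstract builds on, here is a minimal sketch of standard (unweighted) least-squares cross-validation for a univariate Gaussian kernel density estimate. It minimizes the usual criterion, the integral of the squared estimate minus twice the average leave-one-out density, over a grid of candidate bandwidths. The weighted variant proposed in the article is not implemented; the function names, the simulated sample, and the grid are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, sd):
    """Density of N(0, sd^2) evaluated at x."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def lscv_score(data, h):
    """Least-squares CV criterion for a Gaussian-kernel KDE with bandwidth h."""
    n = len(data)
    int_f2 = 0.0  # accumulates the closed-form integral of the squared KDE
    loo = 0.0     # accumulates the leave-one-out kernel sums
    for i in range(n):
        for j in range(n):
            d = data[i] - data[j]
            # The convolution of two N(0, h^2) kernels is N(0, 2 h^2).
            int_f2 += normal_pdf(d, math.sqrt(2.0) * h)
            if i != j:
                loo += normal_pdf(d, h)
    return int_f2 / n**2 - 2.0 * loo / (n * (n - 1))


def lscv_bandwidth(data, grid):
    """Grid value that minimizes the LSCV criterion."""
    return min(grid, key=lambda h: lscv_score(data, h))


# Hypothetical example: 100 standard normal draws and a grid from 0.05 to 2.0.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.05 * k for k in range(1, 41)]
h_cv = lscv_bandwidth(sample, grid)
```

A grid search is used instead of a numerical optimizer because the criterion can have several local minima, which is one source of the sample variability the abstract mentions.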

3.
The mean squared error (MSE)-minimizing local variable bandwidth for the univariate local linear (LL) estimator is well known. This bandwidth does not stabilize variance over the domain. Moreover, in regions where a regression function has zero curvature, the LL estimator is discontinuous. In this paper, we propose a variance-stabilizing (VS) local variable diagonal bandwidth matrix for the multivariate LL estimator. Theoretically, the VS bandwidth can outperform the multivariate extension of the MSE-minimizing local variable scalar bandwidth in terms of asymptotic mean integrated squared error and can avoid the discontinuity created by the MSE-minimizing bandwidth. We present an algorithm for estimating the VS bandwidth and simulation studies.

4.
This paper considers the problem of selecting optimal bandwidths for variable (sample‐point adaptive) kernel density estimation. A data‐driven variable bandwidth selector is proposed, based on the idea of approximating the log‐bandwidth function by a cubic spline. This cubic spline is optimized with respect to a cross‐validation criterion. The proposed method can be interpreted as a selector for either integrated squared error (ISE) or mean integrated squared error (MISE) optimal bandwidths. This leads to reflection upon some of the differences between ISE and MISE as error criteria for variable kernel estimation. Results from simulation studies indicate that the proposed method outperforms a fixed kernel estimator (in terms of ISE) when the target density has a combination of sharp modes and regions of smooth undulation. Moreover, some detailed data analyses suggest that the gains in ISE may understate the improvements in visual appeal obtained using the proposed variable kernel estimator. These numerical studies also show that the proposed estimator outperforms existing variable kernel density estimators implemented using piecewise constant bandwidth functions.

5.
This work treats non-parametric estimation of multivariate probability mass functions, using multivariate discrete associated kernels. We propose a Bayesian local approach to select the matrix of bandwidths considering the multivariate Dirac Discrete Uniform and the product of binomial kernels, and treating the bandwidths as a diagonal matrix of parameters with some prior distribution. The performance of this approach and that of the cross-validation method are compared using simulations and real count data sets. The obtained results show that the Bayes local method performs better than cross-validation in terms of integrated squared error.

6.
In this work, we propose a beta prime kernel estimator for probability density functions with nonnegative support. The proposed estimator uses the beta prime probability density function as its kernel. It is free of boundary bias and nonnegative with a naturally varying shape. We obtain the optimal rate of convergence for the mean squared error (MSE) and the mean integrated squared error (MISE). We also use an adaptive Bayesian bandwidth selection method with the Lindley approximation for heavy-tailed distributions and compare its performance with the global least squares cross-validation bandwidth selection method. Monte Carlo simulation studies are performed to evaluate the average integrated squared error (ISE) of the proposed kernel estimator against some asymmetric competitors. Moreover, real data sets are presented to illustrate the findings.

7.
Recently, Kokonendji et al. have adapted the well-known Nadaraya–Watson kernel estimator for estimating the count function m in the context of nonparametric discrete regression. The authors have also investigated bandwidth selection using the cross-validation method. In this article, we propose a Bayesian approach in the context of nonparametric count regression for estimating the bandwidth and the variance of the model error, which was not estimated in Kokonendji et al. The model error is assumed to be Gaussian with mean zero and variance σ². The Bayes estimates cannot be obtained in closed form, so we use the well-known Markov chain Monte Carlo (MCMC) technique to compute them under the squared error loss function. The performance of the proposed approach and that of the cross-validation method are compared through simulation and real count data.

8.
Risk estimation is an important statistical question for the purposes of selecting a good estimator (i.e., model selection) and assessing its performance (i.e., estimating generalization error). This article introduces a general framework for cross-validation and derives distributional properties of cross-validated risk estimators in the context of estimator selection and performance assessment. Arbitrary classes of estimators are considered, including density estimators and predictors for both continuous and polychotomous outcomes. Results are provided for general full data loss functions (e.g., absolute and squared error, indicator, negative log density). A broad definition of cross-validation is used in order to cover leave-one-out cross-validation, V-fold cross-validation, Monte Carlo cross-validation, and bootstrap procedures. For estimator selection, finite sample risk bounds are derived and applied to establish the asymptotic optimality of cross-validation, in the sense that a selector based on a cross-validated risk estimator performs asymptotically as well as an optimal oracle selector based on the risk under the true, unknown data generating distribution. The asymptotic results are derived under the assumption that the size of the validation sets converges to infinity and hence do not cover leave-one-out cross-validation. For performance assessment, cross-validated risk estimators are shown to be consistent and asymptotically linear for the risk under the true data generating distribution and confidence intervals are derived for this unknown risk. Unlike previously published results, the theorems derived in this and our related articles apply to general data generating distributions, loss functions (i.e., parameters), estimators, and cross-validation procedures.

9.
The estimation of a multivariate function from a stationary m-dependent process is investigated, with a special focus on the case where m is large or unbounded. We develop an adaptive estimator based on wavelet methods. Under flexible assumptions on the nonparametric model, we prove the good performance of our estimator by determining sharp rates of convergence under two kinds of errors: the pointwise mean squared error and the mean integrated squared error. We illustrate our theoretical result by considering the multivariate density estimation problem, the density derivatives estimation problem, the density estimation problem in a GARCH-type model and the multivariate regression function estimation problem. The performance of the proposed estimator is shown in a numerical study on simulated and real data sets.

10.
The geographical relative risk function is a useful tool for investigating the spatial distribution of disease based on case and control data. The most common way of estimating this function is using the ratio of bivariate kernel density estimates constructed from the locations of cases and controls, respectively. An alternative is to use a local-linear (LL) estimator of the log-relative risk function. In both cases, the choice of bandwidth is critical. In this article, we examine the relative performance of the two estimation techniques using a variety of data-driven bandwidth selection methods, including likelihood cross-validation (CV), least-squares CV, rule-of-thumb reference methods, and a new approximate plug-in (PI) bandwidth for the LL estimator. Our analysis includes the comparison of asymptotic results, a simulation study, and application of the estimators to two real data sets. Our findings suggest that the density ratio method implemented with the least-squares CV bandwidth selector is generally best, with the LL estimator with PI bandwidth being competitive in applications with strong large-scale trends but much worse in situations with elliptical clusters.

11.
A plug-in selector for the number of interior knots (NIKs) is proposed for polynomial spline estimation in nonparametric regression. The existence and properties of the optimal NIKs for spline regression are established by minimising the weighted mean integrated squared error. We obtain plug-in formulae for the optimal NIKs based on the theoretical results of asymptotic optimality, and develop strategies for choosing the NIKs of the spline estimator. The proposed NIKs selection method is tested on our simulated data with quite satisfactory performance, and is illustrated by analysing a fossil data set.

12.
We propose a new nonparametric estimator for the density function of multivariate bounded data. As frequently observed in practice, the variables may be partially bounded (e.g. nonnegative) or completely bounded (e.g. in the unit interval). In addition, the variables may have a point mass. We reduce the conditions on the underlying density to a minimum by proposing a nonparametric approach. By using a gamma, a beta, or a local linear kernel (also called boundary kernels) in a product kernel, the suggested estimator becomes simple to implement and robust to the well-known boundary bias problem. We investigate the mean integrated squared error properties, including the rate of convergence, uniform strong consistency and asymptotic normality. We establish consistency of the least squares cross-validation method to select optimal bandwidth parameters. A detailed simulation study investigates the performance of the estimators. Applications using lottery and corporate finance data are provided.

13.
Likelihood cross-validation for kernel density estimation is known to be sensitive to extreme observations and heavy-tailed distributions. We propose a robust likelihood-based cross-validation method to select bandwidths in multivariate density estimation. We derive this bandwidth selector within the framework of robust maximum likelihood estimation. This method establishes a smooth transition from likelihood cross-validation for nonextreme observations to least squares cross-validation for extreme observations, thereby combining the efficiency of likelihood cross-validation and the robustness of least-squares cross-validation. We also suggest a simple rule to select the transition threshold. We demonstrate the finite sample performance and practical usefulness of the proposed method via Monte Carlo simulations and a real data application on Chinese air pollution.
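The sensitivity mentioned in this abstract can be seen in a small sketch of plain (non-robust) likelihood cross-validation for a univariate Gaussian kernel density estimate. The criterion is the average log leave-one-out density, maximized over a bandwidth grid. The robust selector proposed in the article is not implemented here; the function names, the toy data with one extreme point, and the grid are all assumptions made for illustration.

```python
import math


def loo_density(data, i, h):
    """Gaussian-kernel density at data[i], estimated without observation i."""
    n = len(data)
    total = 0.0
    for j, xj in enumerate(data):
        if j != i:
            u = (data[i] - xj) / h
            total += math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return total / ((n - 1) * h)


def likelihood_cv(data, h):
    """Average log leave-one-out density; -inf if any density underflows to zero."""
    score = 0.0
    for i in range(len(data)):
        dens = loo_density(data, i, h)
        if dens <= 0.0:
            return float("-inf")
        score += math.log(dens)
    return score / len(data)


# Hypothetical sample with a single extreme observation at 6.0.
data = [-1.2, -0.4, 0.0, 0.3, 0.5, 1.1, 6.0]
grid = [0.1 * k for k in range(1, 51)]
h_lcv = max(grid, key=lambda h: likelihood_cv(data, h))
```

With a small bandwidth the leave-one-out density at the extreme point is numerically zero, so its log-likelihood term dominates the criterion and pushes the selected bandwidth upward for the whole sample.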

14.
The performance of multivariate kernel density estimates depends crucially on the choice of bandwidth matrix, but progress towards developing good bandwidth matrix selectors has been relatively slow. In particular, previous studies of cross-validation (CV) methods have been restricted to biased and unbiased CV selection of diagonal bandwidth matrices. However, for certain types of target density the use of full (i.e. unconstrained) bandwidth matrices offers the potential for significantly improved density estimation. In this paper, we generalize earlier work from diagonal to full bandwidth matrices, and develop a smooth cross-validation (SCV) methodology for multivariate data. We consider optimization of the SCV technique with respect to a pilot bandwidth matrix. All the CV methods are studied using asymptotic analysis, simulation experiments and real data analysis. The results suggest that SCV for full bandwidth matrices is the most reliable of the CV methods. We also observe that experience from the univariate setting can sometimes be a misleading guide for understanding bandwidth selection in the multivariate case.

15.
An exact, closed-form, and easy-to-compute expression for the mean integrated squared error (MISE) of a kernel estimator of a normal mixture cumulative distribution function is derived for the class of arbitrary-order Gaussian-based kernels. Comparisons are made with the MISE of the empirical distribution function, the infeasible minimum MISE, and the uniform kernel. A simple plug-in method of simultaneously selecting the optimal bandwidth and kernel order is proposed based on a non-asymptotic approximation of the unknown distribution by a normal mixture. A simulation study shows that the method provides a viable alternative to existing bandwidth selection procedures.

16.
Two of the most useful multivariate bandwidth selection techniques are the plug‐in and cross‐validation methods. The smoothed version of the cross‐validation method is known to reduce the variability of its non‐smoothed counterpart; however, it shares with the plug‐in choice the need for a pilot bandwidth matrix. Owing to the mathematical difficulties encountered in the optimal pilot choice, it is common to restrict this pilot matrix to be a scalar multiple of the identity matrix, at the expense of losing the flexibility afforded by the unconstrained approach. Here we show how to overcome these difficulties and propose a smoothed cross‐validation selector using an unconstrained pilot matrix. Our numerical results indicate that the unconstrained selector outperforms the constrained one in practice, and is a viable competitor to unconstrained plug‐in selectors.

17.
Generalized additive models provide a way of circumventing the curse of dimensionality in a wide range of nonparametric regression problems. In this paper, we present a multiplicative model for conditional variance functions to which a generalized additive regression method can be applied. This approach extends Fan and Yao (1998) to multivariate cases with a multiplicative structure. In this approach, we use squared residuals instead of log-transformed squared residuals. This idea gives a smaller variance than Yu (2017) when the variance of the squared error is smaller than the variance of the log-transformed squared error. We provide estimators based on quasi-likelihood and an iterative algorithm based on smooth backfitting for generalized additive models. We also provide some asymptotic properties of the estimators and the convergence of the proposed algorithm. A numerical study provides empirical evidence for the theory.

18.
In this article, we give the asymptotic mean integrated squared error and the mean squared error for the kernel estimator of the hazard rate from truncated and censored data. Martingale techniques and combinatory calculus are used to obtain these results. A probability bound and the optimal bandwidth choice are also given.

19.
This paper studies nonparametric regression with long memory (LRD) errors and predictors. First, we formulate general conditions which guarantee the standard rate of convergence for a nonparametric kernel estimator. Second, we calculate the mean integrated squared error (MISE). In particular, we show that LRD of errors may influence MISE. On the other hand, an estimator for a shape function is typically not influenced by LRD in errors. Finally, we investigate properties of a data-driven bandwidth choice. We show that averaged squared error (ASE) is a good approximation of MISE; however, this is not the case for a cross-validation criterion.

20.
We present a multi-stage conditional quantile predictor for time series of Markovian structure. It is proved that at any quantile level, p ∈ (0, 1), the asymptotic mean squared error (MSE) of the new predictor is smaller than that of the single-stage conditional quantile predictor. A simulation study confirms this result in a small sample situation. Because the improvement by the proposed predictor increases for quantiles at the tails of the conditional distribution function, the multi-stage predictor can be used to compute better predictive intervals with smaller variability. Applying this predictor to the changes in the U.S. short-term interest rate, rather smooth out-of-sample predictive intervals are obtained.
