Cross-validation Revisited

Authors:	Santanu Dutta

Institution:	Mathematical Sciences Department, Tezpur University, Tezpur, Assam, India

Abstract:	Data-based choice of the bandwidth is an important problem in kernel density estimation. The pseudo-likelihood and least-squares cross-validation bandwidth selectors are well known but widely criticized in the literature. For heavy-tailed distributions, the L₁ distance between the pseudo-likelihood-based estimator and the density does not seem to converge in probability to zero with increasing sample size. Even for normal-tailed densities, the rate of L₁ convergence is disappointingly slow. In this article, we report an interesting finding: with minor modifications, both cross-validation methods can be implemented effectively, even for heavy-tailed densities. For both of these estimators, the L₁ distance from the density is shown to converge completely to zero irrespective of the tail of the density. The expected L₁ distance also goes to zero. These results hold even in the presence of a strongly mixing-type dependence. Monte Carlo simulations and an analysis of the Old Faithful geyser data suggest that, if implemented appropriately and contrary to the traditional belief, the cross-validation estimators compare well with the sophisticated plug-in and bootstrap-based estimators.

Keywords:	Density estimation; Least-squares cross-validation; Pseudo-likelihood
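
For readers unfamiliar with the two selectors named in the abstract, the following is a minimal sketch of their standard textbook forms for a Gaussian kernel. It does not reproduce the modifications proposed in the article, which the abstract does not specify. The function names (`lscv_score`, `log_likelihood_cv_score`, `select_bandwidth`), the Gaussian kernel, and the grid search are all assumptions made for illustration.

```python
import numpy as np


def _leave_one_out_density(x, h):
    """Leave-one-out Gaussian kernel density estimate at each sample point."""
    n = x.size
    u = (x[:, None] - x[None, :]) / h            # pairwise scaled differences
    k = np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)  # standard normal kernel
    np.fill_diagonal(k, 0.0)                      # drop the i == j terms
    return k.sum(axis=1) / ((n - 1) * h)


def lscv_score(x, h):
    """Least-squares cross-validation criterion (smaller is better).

    LSCV(h) = integral of fhat^2  -  (2/n) * sum_i fhat_{-i}(X_i).
    For a Gaussian kernel the integral has a closed form, because the
    convolution of two standard normal densities is the N(0, 2) density.
    """
    n = x.size
    u = (x[:, None] - x[None, :]) / h
    int_f2 = np.exp(-0.25 * u**2).sum() / (2 * np.sqrt(np.pi) * n**2 * h)
    return int_f2 - 2 * _leave_one_out_density(x, h).mean()


def log_likelihood_cv_score(x, h):
    """Pseudo-likelihood cross-validation criterion (larger is better)."""
    return np.log(_leave_one_out_density(x, h)).mean()


def select_bandwidth(x, grid, score, maximize=False):
    """Pick the bandwidth on a grid that optimizes the given criterion."""
    values = np.array([score(x, h) for h in grid])
    best = values.argmax() if maximize else values.argmin()
    return grid[best]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.normal(size=200)                 # synthetic data, not the geyser data
    grid = np.linspace(0.1, 1.5, 29)
    print("LSCV bandwidth:", select_bandwidth(sample, grid, lscv_score))
    print("Pseudo-likelihood bandwidth:",
          select_bandwidth(sample, grid, log_likelihood_cv_score, maximize=True))
```

The log term in the pseudo-likelihood criterion shows why that selector is sensitive to heavy tails: an isolated observation forces a large bandwidth, since its leave-one-out density estimate must stay away from zero.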