Cross-validation Revisited |
| |
Authors: | Santanu Dutta |
| |
Institution: | Mathematical Sciences Department, Tezpur University, Tezpur, Assam, India |
| |
Abstract: | Data-based choice of the bandwidth is an important problem in kernel density estimation. The pseudo-likelihood and least-squares cross-validation bandwidth selectors are well known but widely criticized in the literature. For heavy-tailed distributions, the L1 distance between the pseudo-likelihood-based estimator and the density does not seem to converge in probability to zero as the sample size increases. Even for normal-tailed densities, the rate of L1 convergence is disappointingly slow. In this article, we report an interesting finding: with minor modifications, both cross-validation methods can be implemented effectively, even for heavy-tailed densities. For both of these estimators, the L1 distance from the density is shown to converge completely to zero irrespective of the tail of the density. The expected L1 distance also goes to zero. These results hold even in the presence of a strongly mixing-type dependence. Monte Carlo simulations and an analysis of the Old Faithful geyser data suggest that, contrary to traditional belief, the cross-validation estimators compare well with the sophisticated plug-in and bootstrap-based estimators if implemented appropriately. |
| |
Keywords: | Density estimation; Least-squares cross-validation; Pseudo-likelihood |
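
For readers unfamiliar with the two selectors named in the abstract, the following is a minimal sketch of their textbook (unmodified) forms. It is not the modified procedure proposed in the article, which the abstract does not spell out. The Gaussian kernel, the grid search, and all function names (`lscv_score`, `log_pseudo_likelihood`, `select_bandwidths`) are assumptions made for illustration.

```python
import numpy as np


def _gaussian_kernel_matrix(x, h):
    """Gaussian kernel K((x_i - x_j) / h) for all pairs, diagonal set to zero."""
    u = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    np.fill_diagonal(k, 0.0)  # drop the i == j terms for leave-one-out sums
    return k


def _leave_one_out_density(x, h):
    """Leave-one-out kernel density estimate evaluated at each sample point."""
    n = x.size
    return _gaussian_kernel_matrix(x, h).sum(axis=1) / ((n - 1) * h)


def lscv_score(x, h):
    """Least-squares CV criterion: int fhat^2 - (2/n) * sum_i fhat_{-i}(x_i)."""
    n = x.size
    u = (x[:, None] - x[None, :]) / h
    # A Gaussian kernel convolved with itself is the N(0, 2) density.
    conv = np.exp(-0.25 * u**2) / (2.0 * np.sqrt(np.pi))
    integral_term = conv.sum() / (n**2 * h)
    return integral_term - 2.0 * _leave_one_out_density(x, h).mean()


def log_pseudo_likelihood(x, h):
    """Pseudo-likelihood CV criterion: sum_i log fhat_{-i}(x_i)."""
    with np.errstate(divide="ignore"):  # log(0) = -inf for very small h
        return np.log(_leave_one_out_density(x, h)).sum()


def select_bandwidths(x, grid):
    """Grid search: minimise the LSCV score, maximise the pseudo-likelihood."""
    lscv = np.array([lscv_score(x, h) for h in grid])
    pl = np.array([log_pseudo_likelihood(x, h) for h in grid])
    return grid[np.argmin(lscv)], grid[np.argmax(pl)]


# Illustrative usage on simulated data (the sample and grid are arbitrary).
rng = np.random.default_rng(0)
sample = rng.standard_normal(200)
bandwidth_grid = np.linspace(0.05, 1.5, 30)
h_lscv, h_pl = select_bandwidths(sample, bandwidth_grid)
print("LSCV bandwidth:", h_lscv, "pseudo-likelihood bandwidth:", h_pl)
```

A grid search is used here only because it is simple to read; a numerical optimiser could be substituted. For heavy-tailed samples, the unmodified pseudo-likelihood criterion in this sketch is the one the abstract describes as problematic.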