
1.

On kernel density estimation near endpoints

Shunpu Zhang Rohana J. Karunamuni 《Journal of statistical planning and inference》1998,70(2):1-316

In this paper, we consider the estimation problem of f(0), the value of density f at the left endpoint 0. Nonparametric estimation of f(0) is rather formidable due to boundary effects that occur in nonparametric curve estimation. It is well known that the usual kernel density estimates require modifications when estimating the density near endpoints of the support. Here we investigate the local polynomial smoothing technique as a possible alternative method for the problem. It is observed that our density estimator also possesses desirable properties such as automatic adaptability for boundary effects near endpoints. We also obtain an ‘optimal kernel’ in order to estimate the density at endpoints as a solution of a variational problem. Two bandwidth variation schemes are discussed and investigated in a Monte Carlo study.
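The boundary effect described in this abstract can be seen in a small simulation. The sketch below is illustrative only: it uses Exp(1) data (so the true f(0) is 1), a Gaussian kernel, and an untuned bandwidth of 0.1, all of which are assumptions made here. The correction shown is simple reflection about 0, a standard textbook device, and not the local polynomial estimator studied in the paper.

```python
import numpy as np

def gaussian_kde_at(x0, data, h):
    """Standard Gaussian kernel density estimate at a single point x0."""
    u = (x0 - data) / h
    return np.exp(-0.5 * u**2).sum() / (len(data) * h * np.sqrt(2.0 * np.pi))

def reflected_kde_at(x0, data, h):
    """Reflection-corrected estimate for a density supported on [0, inf).

    Each observation x is mirrored to -x, so kernel mass that would leak
    below 0 is folded back into the support.
    """
    return gaussian_kde_at(x0, data, h) + gaussian_kde_at(x0, -data, h)

rng = np.random.default_rng(0)
sample = rng.exponential(scale=1.0, size=20_000)  # true density at 0 is 1
h = 0.1  # illustrative bandwidth, not selected by any data-driven rule

plain_at_0 = gaussian_kde_at(0.0, sample, h)
reflected_at_0 = reflected_kde_at(0.0, sample, h)
print(f"plain KDE at 0:     {plain_at_0:.3f}")
print(f"reflected KDE at 0: {reflected_at_0:.3f}")
```

The uncorrected estimate at 0 comes out near half the true value because about half of each kernel's mass falls outside the support. Reflection exactly doubles the estimate at 0, which removes most of the bias but not all of it.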

2.

Tail density estimation for exploratory data analysis using kernel methods

B. Béranger T. Duong S. E. Perkins-Kirkpatrick S. A. Sisson 《Journal of nonparametric statistics》2019,31(1):144-174

It is often critical to accurately model the upper tail behaviour of a random process. Nonparametric density estimation methods are commonly implemented as exploratory data analysis techniques for this purpose and can avoid model specification biases implied by using parametric estimators. In particular, kernel-based estimators place minimal assumptions on the data, and provide improved visualisation over scatterplots and histograms. However, kernel density estimators can perform poorly when estimating tail behaviour above a threshold, and can over-emphasise bumps in the density for heavy tailed data. We develop a transformation kernel density estimator which is able to handle heavy tailed and bounded data, and is robust to threshold choice. We derive closed form expressions for its asymptotic bias and variance, which demonstrate its good performance in the tail region. Finite sample performance is illustrated in numerical studies, and in an expanded analysis of the performance of global climate models.

3.

Bandwidth selector for nonparametric recursive density estimation for spatial data defined by stochastic approximation method

Salim Bouzebda Yousri Slaoui 《Communications in Statistics - Theory and Methods》2020,49(12):2942-2963

Abstract

In this article we propose an automatic selection of the bandwidth of the recursive kernel density estimators for spatial data defined by the stochastic approximation algorithm. We showed that, using the selected bandwidth and the stepsize which minimize the MWISE (Mean Weighted Integrated Squared Error), the recursive estimator will be quite similar to the nonrecursive one in terms of estimation error and much better in terms of computational costs. In addition, we obtain the central limit theorem for the nonparametric recursive density estimator under some mild conditions.
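For orientation, a recursive kernel density estimator of the stochastic approximation type updates the previous estimate with each new observation: f_n(x) = (1 − γ_n) f_{n−1}(x) + γ_n h_n⁻¹ K((x − X_n)/h_n). The sketch below is a one-dimensional simplification, not the spatial estimator or the bandwidth selector of the article. The stepsize γ_n = 1/n, the bandwidth sequence h_n = n^(−1/5), the Gaussian kernel, and the evaluation grid are all assumptions chosen for illustration.

```python
import numpy as np

def recursive_kde(stream, grid):
    """Update a density estimate on `grid` one observation at a time."""
    f = np.zeros_like(grid)
    for n, x in enumerate(stream, start=1):
        gamma = 1.0 / n   # stepsize gamma_n (assumed choice)
        h = n ** (-0.2)   # bandwidth h_n (assumed choice)
        kernel = np.exp(-0.5 * ((grid - x) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
        # Stochastic approximation step: convex combination of the old
        # estimate and the kernel centred at the new observation.
        f = (1.0 - gamma) * f + gamma * kernel
    return f

rng = np.random.default_rng(2)
stream = rng.normal(size=2_000)
grid = np.linspace(-8.0, 8.0, 1601)
f_hat = recursive_kde(stream, grid)

# Riemann-sum check that the estimate still integrates to about 1.
mass = f_hat.sum() * (grid[1] - grid[0])
print(f"total mass: {mass:.4f}, estimate at 0: {f_hat[800]:.4f}")
```

Each update costs one kernel evaluation per grid point, and earlier observations are never revisited, which is the source of the computational advantage over the nonrecursive estimator mentioned in the abstract.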

4.

A weighted least-squares cross-validation bandwidth selector for kernel density estimation

C. Tenreiro 《Communications in Statistics - Theory and Methods》2017,46(7):3438-3458

Since the late 1980s, several methods have been considered in the literature to reduce the sample variability of the least-squares cross-validation bandwidth selector for kernel density estimation. In this article, a weighted version of this classical method is proposed and its asymptotic and finite-sample behavior is studied. The simulation results attest that the weighted cross-validation bandwidth performs quite well, presenting a better finite-sample performance than the standard cross-validation method for “easy-to-estimate” densities, and retaining the good finite-sample performance of the standard cross-validation method for “hard-to-estimate” ones.
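As background, the classical (unweighted) least-squares cross-validation criterion that this article modifies is LSCV(h) = ∫ f̂_h² − (2/n) Σ_i f̂_{h,−i}(X_i), where f̂_{h,−i} is the leave-one-out estimate. The sketch below implements only this standard criterion for a Gaussian kernel; it is not the weighted selector proposed in the article. The sample, the bandwidth grid, and the function names are illustrative assumptions.

```python
import numpy as np

def normal_pdf(u, s):
    """Density of N(0, s^2) evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def lscv_score(h, data):
    """Standard least-squares cross-validation criterion, Gaussian kernel."""
    n = len(data)
    d = data[:, None] - data[None, :]  # all pairwise differences
    # Integral of fhat^2: two Gaussian kernels of scale h convolve
    # to a Gaussian of scale h * sqrt(2).
    int_f2 = normal_pdf(d, h * np.sqrt(2.0)).sum() / n**2
    # Leave-one-out estimate at each X_i: drop the diagonal (i == j) term.
    k = normal_pdf(d, h)
    loo = (k.sum(axis=1) - normal_pdf(0.0, h)) / (n - 1)
    return int_f2 - 2.0 * loo.mean()

rng = np.random.default_rng(1)
data = rng.normal(size=500)              # illustrative sample
grid = np.linspace(0.05, 1.0, 96)        # illustrative bandwidth grid
scores = np.array([lscv_score(h, data) for h in grid])
h_cv = grid[np.argmin(scores)]
print(f"LSCV bandwidth on this grid: {h_cv:.3f}")
```

Rerunning with different seeds gives a feel for the sample variability of the selected bandwidth, which is the weakness that the methods surveyed in the abstract aim to reduce.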

5.

An evaluation of likelihood-based bandwidth selectors for spatial and spatiotemporal kernel estimates

Tilman M. Davies Andrew B. Lawson 《Journal of Statistical Computation and Simulation》2019,89(7):1131-1152

Spatial point pattern data sets are commonplace in a variety of different research disciplines. The use of kernel methods to smooth such data is a flexible way to explore spatial trends and make inference about underlying processes without, or perhaps prior to, the design and fitting of more intricate semiparametric or parametric models to quantify specific effects. The long-standing issue of ‘optimal’ data-driven bandwidth selection is complicated in these settings by issues such as high heterogeneity in observed patterns and the need to consider edge correction factors. We scrutinize bandwidth selectors built on leave-one-out cross-validation approximation to likelihood functions. A key outcome relates to previously unconsidered adaptive smoothing regimens for spatiotemporal density and multitype conditional probability surface estimation, whereby we propose a novel simultaneous pilot-global selection strategy. Motivated by applications in epidemiology, the results of both simulated and real-world analyses suggest this strategy to be largely preferable to classical fixed-bandwidth estimation for such data.

6.

Robust testing with generalized partial linear models for longitudinal data

Jianhui Zhou Zhongyi Zhu Wing K. Fung 《Journal of statistical planning and inference》2008

By approximating the nonparametric component using a regression spline in generalized partial linear models (GPLM), robust generalized estimating equations (GEE), involving bounded score function and leverage-based weighting function, can be used to estimate the regression parameters in GPLM robustly for longitudinal data or clustered data. In this paper, score test statistics are proposed for testing the regression parameters with robustness, and their asymptotic distributions under the null hypothesis and a class of local alternative hypotheses are studied. The proposed score tests rely on the estimation of a smaller model without the testing parameters involved, and perform well in the simulation studies and real data analysis conducted in this paper.

7.

A hybrid bandwidth selection methodology for kernel density estimation

《Journal of Statistical Computation and Simulation》2012,82(3):614-627

A bandwidth selection method that combines the concept of least-squares cross-validation and the plug-in approach is being introduced in connection with kernel density estimation. A simulation study reveals that this hybrid methodology outperforms some commonly used bandwidth selection rules. It is shown that the proposed approach can also be readily employed in the context of variable kernel density estimation. We conclude with two illustrative examples.

8.

Spline-backfitted kernel smoothing of partially linear additive model

Shujie Ma Lijian Yang 《Journal of statistical planning and inference》2011,141(1):204-219

A spline-backfitted kernel smoothing method is proposed for partially linear additive model. Under assumptions of stationarity and geometric mixing, the proposed function and parameter estimators are oracally efficient and fast to compute. Such superior properties are achieved by applying to the data spline smoothing and kernel smoothing consecutively. Simulation experiments with both moderate and large number of variables confirm the asymptotic results. Application to the Boston housing data serves as a practical illustration of the method.

9.

Interval-censored unimodal kernel density estimation via data sharpening

Devan G. Becker Bethany J. G. White 《Journal of Statistical Computation and Simulation》2017,87(10):2023-2037

In this paper, we extend Choi and Hall's [Data sharpening as a prelude to density estimation. Biometrika. 1999;86(4):941–947] data sharpening algorithm for kernel density estimation to interval-censored data. Data sharpening has several advantages, including bias and mean integrated squared error (MISE) reduction as well as increased robustness to bandwidth misspecification. Several interval metrics are explored for use with the kernel function in the data sharpening transformation. A simulation study based on randomly generated data is conducted to assess and compare the performance of each interval metric. It is found that the bias is reduced by sharpening, often with little effect on the variance, thus maintaining or reducing overall MISE. Applications involving time to onset of HIV and running distances subject to measurement error are used for illustration.

10.

Limiting bias-reduced Amoroso kernel density estimators for non-negative data

Gaku Igarashi Yoshihide Kakizawa 《Communications in Statistics - Theory and Methods》2018,47(20):4905-4937

The Amoroso kernel density estimator (Igarashi and Kakizawa 2017) [Amoroso kernel density estimation for nonnegative data and its bias reduction. Department of Policy and Planning Sciences Discussion Paper Series No. 1345, University of Tsukuba] for non-negative data is boundary-bias-free and has the mean integrated squared error (MISE) of order O(n^{-4/5}), where n is the sample size. In this paper, we construct a linear combination of the Amoroso kernel density estimator and its derivative with respect to the smoothing parameter. Also, we propose a related multiplicative estimator. We show that the MISEs of these bias-reduced estimators achieve the convergence rate n^{-8/9} if the underlying density is four times continuously differentiable. We illustrate the finite sample performance of the proposed estimators through simulations.

11.

Approximating likelihoods for large spatial data sets

Michael L. Stein Zhiyi Chi Leah J. Welty 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2004,66(2):275-296

Summary. Likelihood methods are often difficult to use with large, irregularly sited spatial data sets, owing to the computational burden. Even for Gaussian models, exact calculations of the likelihood for n observations require O(n³) operations. Since any joint density can be written as a product of conditional densities based on some ordering of the observations, one way to lessen the computations is to condition on only some of the 'past' observations when computing the conditional densities. We show how this approach can be adapted to approximate the restricted likelihood and we demonstrate how an estimating equations approach allows us to judge the efficacy of the resulting approximation. Previous work has suggested conditioning on those past observations that are closest to the observation whose conditional density we are approximating. Through theoretical, numerical and practical examples, we show that there can often be considerable benefit in conditioning on some distant observations as well.

12.

Robust nonparametric estimation for spatial regression

Abdelkader Gheriballah Ali Laksaci Rachida Rouane 《Journal of statistical planning and inference》2010

In this paper, we investigate a nonparametric robust estimation for spatial regression. More precisely, given a strictly stationary random field $(Z_i = (X_i, Y_i))_{i \in \mathbb{N}^N}$, $N \geq 1$, we consider a family of robust nonparametric estimators for a regression function based on the kernel method. Under some general mixing assumptions, the almost complete consistency and the asymptotic normality of these estimators are obtained. A robust procedure to select the smoothing parameter adapted to the spatial data is also discussed.

13.

Recursive kernel density estimators under missing data

Yousri Slaoui 《Communications in Statistics - Theory and Methods》2017,46(18):9101-9125


14.

The kernel regression estimation for randomly censored functional stationary ergodic data

Nengxiang Ling Yang Liu 《Communications in Statistics - Theory and Methods》2017,46(17):8557-8574

In this paper, we investigate the asymptotic properties of the kernel estimator for the nonparametric regression operator when functional stationary ergodic data with random censorship are considered. More precisely, we introduce the kernel-type estimator of the nonparametric regression operator with the responses randomly censored and obtain the almost sure convergence with rate as well as the asymptotic normality of the estimator. As an application, the asymptotic (1 − ζ) confidence interval of the regression operator is also presented (0 < ζ < 1). Finally, a simulation study is carried out to show the finite-sample performance of the estimator.

15.

Some observations on semi-markov models for partially censored data

David E. Matthews 《Revue canadienne de statistique》1984,12(3):201-205

Cause-specific hazard functions are employed to analyze a semi-Markov model which could be used to describe data arising from clinical trials or certain types of observational studies. The use of these hazard functions to fit a set of data arising from N possibly incomplete case histories is shown to have several notable advantages over the approach adopted by Lagakos, Sommer, and Zelen (1978).

16.

Weak and strong uniform consistency rates of kernel density estimates for randomly censored data

R. J. Karunamuni Song Yang 《Revue canadienne de statistique》1991,19(4):349-359

Let X₁, …, X_n be i.i.d. random variables with distribution function F, and let Y₁, …, Y_n be i.i.d. with distribution function G. For i = 1, 2, …, n, set δ_i = 1 if X_i ≤ Y_i, and 0 otherwise, and X̃_i = min{X_i, Y_i}. A kernel-type density estimate of f, the density function of F w.r.t. Lebesgue measure on the Borel σ-field, based on the censored data (δ_i, X̃_i), i = 1, …, n, is considered. Weak and strong uniform consistency properties over the whole real line are studied. Rates of convergence results are established under higher-order differentiability assumptions on f. A procedure for relaxing such assumptions is also proposed.

17.

Robust Likelihood Cross-Validation for Kernel Density Estimation

Ximing Wu 《Journal of Business & Economic Statistics》2013,31(4):761-770

Likelihood cross-validation for kernel density estimation is known to be sensitive to extreme observations and heavy-tailed distributions. We propose a robust likelihood-based cross-validation method to select bandwidths in multivariate density estimations. We derive this bandwidth selector within the framework of robust maximum likelihood estimation. This method establishes a smooth transition from likelihood cross-validation for nonextreme observations to least squares cross-validation for extreme observations, thereby combining the efficiency of likelihood cross-validation and the robustness of least-squares cross-validation. We also suggest a simple rule to select the transition threshold. We demonstrate the finite sample performance and practical usefulness of the proposed method via Monte Carlo simulations and a real data application on Chinese air pollution.

18.

Subsampling-extrapolation bandwidth selection in bivariate kernel density estimation

Qing Wang Adriano Z. Zambom 《Journal of Statistical Computation and Simulation》2019,89(9):1740-1759

This paper focuses on bivariate kernel density estimation that bridges the gap between univariate and multivariate applications. We propose a subsampling-extrapolation bandwidth matrix selector that improves the reliability of the conventional cross-validation method. The proposed procedure combines a U-statistic expression of the mean integrated squared error and asymptotic theory, and can be used in both cases of diagonal bandwidth matrix and unconstrained bandwidth matrix. In the subsampling stage, one takes advantage of the reduced variability of estimating the bandwidth matrix at a smaller subsample size m (m < n); in the extrapolation stage, a simple linear extrapolation is used to remove the incurred bias. Simulation studies reveal that the proposed method reduces the variability of the cross-validation method by about 50% and achieves an expected integrated squared error that is up to 30% smaller than that of the benchmark cross-validation. It shows comparable or improved performance compared to other competitors across six distributions in terms of the expected integrated squared error. We prove that the components of the selected bivariate bandwidth matrix have an asymptotic multivariate normal distribution, and also present the relative rate of convergence of the proposed bandwidth selector.

19.

On kernel density and mode estimates for associated and censored data

Yacine Ferrani Abdelkader Tatachak 《Communications in Statistics - Theory and Methods》2013,42(7):1853-1862


20.

On boosting kernel density methods for multivariate data: density estimation and classification

Marco Di Marzio Charles C. Taylor 《Statistical Methods and Applications》2005,14(2):163-178

Statistical learning is emerging as a promising field where a number of algorithms from machine learning are interpreted as statistical methods and vice-versa. Due to good practical performance, boosting is one of the most studied machine learning techniques. We propose algorithms for multivariate density estimation and classification. They are generated by using the traditional kernel techniques as weak learners in boosting algorithms. Our algorithms take the form of multistep estimators, whose first step is a standard kernel method. Some strategies for bandwidth selection are also discussed with regard both to the standard kernel density classification problem, and to our 'boosted' kernel methods. Extensive experiments, using real and simulated data, show an encouraging practical relevance of the findings. Standard kernel methods are often outperformed by the first boosting iterations and in correspondence of several bandwidth values. In addition, the practical effectiveness of our classification algorithm is confirmed by a comparative study on two real datasets, the competitors being trees including AdaBoosting with trees.