1.

EMPIRICAL INFLUENCE FOR ROBUST SMOOTHING

Lise Manchester 《Australian & New Zealand Journal of Statistics》1996,38(3):275-290

A simple graphical method is presented to display the sensitivity of a scatter plot smoother (e.g. loess, kernel smoothers) to perturbations in the data. This enables the robustness of smoothers which have been designed to be robust to be examined directly in particular examples. Graphs are shown of various robust smoothers on several standard datasets, so that the robustness of the smoothers can be compared. The method is found to be useful in revealing features of the smoothers. Related graphs for displaying the sensitivity of a smoother to k > 1 outliers are also presented.
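The idea of perturbing the data and watching the fitted curve move can be sketched in a few lines. The following is only an illustration, not the graphical method of the paper: the two smoothers (a Gaussian-kernel Nadaraya-Watson smoother and a running median), the function names, the bandwidths and the simulated data are all assumptions chosen for the example.

```python
import numpy as np

def kernel_smooth(x, y, grid, bandwidth):
    """Nadaraya-Watson smoother with a Gaussian kernel (a non-robust smoother)."""
    u = (grid[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    return (w @ y) / w.sum(axis=1)

def running_median(x, y, grid, half_width):
    """Median of the responses whose x lies within half_width of each grid point."""
    return np.array([np.median(y[np.abs(x - g) <= half_width]) for g in grid])

def empirical_influence(smoother, x, y, index, shifts, grid, **kwargs):
    """Change in the fitted curve when y[index] is shifted by each value in shifts."""
    base = smoother(x, y, grid, **kwargs)
    out = np.empty((len(shifts), len(grid)))
    for k, delta in enumerate(shifts):
        y_pert = y.copy()
        y_pert[index] += delta          # perturb a single observation
        out[k] = smoother(x, y_pert, grid, **kwargs) - base
    return out

# Hypothetical data: a noisy sine curve on an equally spaced design.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 101)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)
grid = x[[50]]                          # evaluate the fit at x = 0.5
shifts = np.array([1.0, 10.0, 100.0])   # size of the perturbation to y[50]

infl_kernel = empirical_influence(kernel_smooth, x, y, 50, shifts, grid, bandwidth=0.05)
infl_median = empirical_influence(running_median, x, y, 50, shifts, grid, half_width=0.05)
```

Plotting the rows of `infl_kernel` and `infl_median` against `shifts` gives one simple sensitivity display: the kernel smoother responds linearly to the size of the perturbation, while the running median stops responding once the perturbed point has moved past its neighbours.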

2.

A comparison of six smoothers when there are multiple predictors

Rand R. Wilcox 《Statistical Methodology》2005,2(1):379

The paper compares six smoothers, in terms of mean squared error and bias, when there are multiple predictors and the sample size is relatively small. The focus is on methods that use robust measures of location (primarily a 20% trimmed mean) and where there are four predictors. To add perspective, some methods designed for means are included. The smoothers include the locally weighted (loess) method derived by Cleveland and Devlin [W.S. Cleveland, S.J. Devlin, Locally-weighted regression: an approach to regression analysis by local fitting, Journal of the American Statistical Association 83 (1988) 596–610], a variation of a so-called running interval smoother where distances from a point are measured via a particular group of projections of the data, a running interval smoother where distances are measured based in part using the minimum volume ellipsoid estimator, a generalized additive model based on the running interval smoother, a generalized additive model based on the robust version of the smooth derived by Cleveland [W.S. Cleveland, Robust locally weighted regression and smoothing scatterplots, Journal of the American Statistical Association 74 (1979) 829–836], and a kernel regression method stemming from [J. Fan, Local linear smoothers and their minimax efficiencies, The Annals of Statistics 21 (1993) 196–216]. When the regression surface is a plane, the method stemming from [J. Fan, Local linear smoothers and their minimax efficiencies, The Annals of Statistics 21 (1993) 196–216] was found to dominate, and indeed offers a substantial advantage in various situations, even when the error term has a heavy-tailed distribution. However, if there is curvature, this method can perform poorly compared to the other smooths considered. Now the projection-type smoother used in conjunction with a 20% trimmed mean is recommended with the minimum volume ellipsoid method a close second.
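To make the combination of a running interval smoother and a 20% trimmed mean concrete, here is a minimal single-predictor sketch. It is a simplification under stated assumptions, not the multi-predictor methods compared in the paper: the neighbourhood rule (points within `span` times the normalised median absolute deviation of x), the default `span`, the function names and the simulated data are all illustrative choices.

```python
import numpy as np

def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of the sorted values."""
    v = np.sort(np.asarray(values, dtype=float))
    g = int(np.floor(proportion * v.size))
    return v[g:v.size - g].mean()

def running_interval_smoother(x, y, x_eval, span=1.0, proportion=0.2):
    """Trimmed mean of y over the points whose x is close to each evaluation point.

    Assumed closeness rule: |x_i - x0| <= span * MADN, where MADN is the
    median absolute deviation of x divided by 0.6745.
    """
    madn = np.median(np.abs(x - np.median(x))) / 0.6745
    fitted = np.full(len(x_eval), np.nan)
    for i, x0 in enumerate(x_eval):
        near = np.abs(x - x0) <= span * madn
        if near.any():
            fitted[i] = trimmed_mean(y[near], proportion)
    return fitted

# Hypothetical data: a linear signal with five gross outliers placed near x = 0.
rng = np.random.default_rng(2)
x = rng.standard_normal(200)
y = x + 0.1 * rng.standard_normal(200)
closest = np.argsort(np.abs(x))[:5]
y[closest] += 50.0

fit_at_zero = running_interval_smoother(x, y, np.array([0.0]))[0]

# For comparison: the untrimmed mean over the same neighbourhood.
madn = np.median(np.abs(x - np.median(x))) / 0.6745
plain_mean_at_zero = y[np.abs(x) <= madn].mean()
```

In this example the trimmed fit at zero stays near the true value of 0, while the untrimmed neighbourhood mean is pulled well away from it by the five contaminated responses.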

3.

An Extended Single‐index Model with Missing Response at Random

Qihua Wang, Tao Zhang, Wolfgang Karl Härdle 《Scandinavian Journal of Statistics》2016,43(4):1140-1152

An extended single‐index model is considered when responses are missing at random. A three‐step estimation procedure is developed to define an estimator for the single‐index parameter vector by a joint estimating equation. The proposed estimator is shown to be asymptotically normal. An algorithm for computing this estimator is proposed. This algorithm only involves one‐dimensional nonparametric smoothers, thereby avoiding the data sparsity problem caused by high model dimensionality. Some simulation studies are conducted to investigate the finite sample performances of the proposed estimators.

4.

Robust estimation of multivariate regression model

Jiantao Li, Min Zheng 《Statistical Papers》2009,50(1):81-100

This paper studies robust estimation of a multivariate regression model using kernel weighted local linear regression. A robust estimation procedure is proposed for estimating the regression function and its partial derivatives. The proposed estimators are jointly asymptotically normal and attain the optimal nonparametric convergence rate. One-step approximations to the robust estimators are introduced to reduce the computational burden. The one-step local M-estimators are shown to achieve the same efficiency as the fully iterative local M-estimators as long as the initial estimators are good enough. The proposed estimators inherit the excellent edge-effect behavior of the local polynomial methods in the univariate case and at the same time overcome the disadvantages of the local least-squares based smoothers. Simulations are conducted to demonstrate the performance of the proposed estimators. Real data sets are analyzed to illustrate the practical utility of the proposed methodology. This work was supported by the National Natural Science Foundation of China (Grant No. 10471006).

5.

Some Statistical Properties of Efficiency Robust Tests with Applications to Genetic Association Studies

Gang Zheng, Qizhai Li, Ao Yuan 《Scandinavian Journal of Statistics》2014,41(3):762-774

Although efficiency robust tests are preferred for genetic association studies when the genetic model is unknown, their statistical properties have been studied for different study designs separately under special situations. We study some statistical properties of the maximin efficiency robust test and a maximum‐type robust test (MAX3) under a general setting and obtain unified results. The results can also be applied to testing hypotheses with a constrained two‐dimensional parameter space. The results are applied to genetic association studies using case–parents trio data.

6.

Robust estimators for additive models using backfitting

Graciela Boente, Alejandra Martínez, Matías Salibián-Barrera 《Journal of nonparametric statistics》2017,29(4):744-767

Additive models provide an attractive setup to estimate regression functions in a nonparametric context. They provide a flexible and interpretable model, where each regression function depends only on a single explanatory variable and can be estimated at an optimal univariate rate. Most estimation procedures for these models are highly sensitive to the presence of even a small proportion of outliers in the data. In this paper, we show that a relatively simple robust version of the backfitting algorithm (consisting of using robust local polynomial smoothers) corresponds to the solution of a well-defined optimisation problem. This formulation allows us to find mild conditions to show Fisher consistency and to study the convergence of the algorithm. Our numerical experiments show that the resulting estimators have good robustness and efficiency properties. We illustrate the use of these estimators on a real data set where the robust fit reveals the presence of influential outliers.

7.

Iterative bias reduction: a comparative study

P.-A. Cornillon, N. Hengartner, N. Jegou, E. Matzner-Løber 《Statistics and Computing》2013,23(6):777-791

Multivariate nonparametric smoothers, such as kernel based smoothers and thin plate spline smoothers, are adversely impacted by the sparseness of data in high dimension, also known as the curse of dimensionality. Adaptive smoothers, which can exploit the underlying smoothness of the regression function, may partially mitigate this effect. This paper presents a comparative simulation study of a novel adaptive smoother (IBR) with competing multivariate smoothers available as packages or functions within the R language and environment for statistical computing. Comparisons between the methods are made on simulated datasets of moderate size, from 50 to 200 observations, with two, five or 10 potential explanatory variables, and on a real dataset. The results show that the good asymptotic properties of IBR are complemented by very good behavior on moderate sized datasets, with results similar to those obtained with Duchon low rank splines.

8.

Global and local statistical properties of fixed-length nonparametric smoothers

Estela Bee Dagum, Alessandra Luati 《Statistical Methods and Applications》2002,11(3):313-333

The main purpose of this study is to analyze the global and local statistical properties of nonparametric smoothers subject to an a priori fixed length restriction. In order to do so, we introduce a set of local statistical measures based on their weighting system shapes and weight values. In this way, the local statistical measures of bias, variance and mean square error are intrinsic to the smoothers and independent of the data to which they will be applied. One major advantage of the statistical measures relative to the classical spectral ones is their ease of calculation. However, in this paper we use both in a complementary manner. The smoothers studied are based on two broad classes of weighting generating functions, local polynomials and probability distributions. We consider within the first class, the locally weighted regression smoother (loess) of degree 1 and 2 (L1 and L2), the cubic smoothing spline (CSS), and the Henderson smoothing linear filter (H); and in the second class, the Gaussian kernel (GK). The weighting systems of these estimators depend on a smoothing parameter that traditionally is estimated by means of data dependent optimization criteria. However, by imposing on all of them the condition of an equal number of weights, it will be shown that some of their optimal statistical properties are no longer valid. Without any loss of generality, the analysis is carried out for 13- and 9-term lengths because these are the most often selected for the Henderson filters in the context of monthly time series decomposition. We would like to thank an Associate Editor and an anonymous referee for their valuable comments on an earlier version of this paper. Financing from MURST is also gratefully acknowledged.
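The point that bias and variance measures can be read off the weights alone, without reference to any data, can be illustrated with a short sketch. This is not the set of measures defined in the paper: it uses only the Gaussian kernel, the choice `sigma = length / 6` is an arbitrary assumption, and the three quantities computed are generic properties of any symmetric linear filter whose weights sum to one.

```python
import numpy as np

def gaussian_weights(length, sigma):
    """Symmetric Gaussian-kernel weights of odd `length`, normalised to sum to 1."""
    half = length // 2
    j = np.arange(-half, half + 1)
    w = np.exp(-0.5 * (j / sigma) ** 2)
    return j, w / w.sum()

results = {}
for length in (9, 13):
    # sigma = length / 6 is an illustrative choice, not taken from the paper.
    j, w = gaussian_weights(length, sigma=length / 6)
    results[length] = {
        # Var(output) / Var(input) when the input is white noise.
        "variance_factor": float(np.sum(w ** 2)),
        # Zero means a linear trend passes through the filter unchanged.
        "linear_bias": float(np.sum(w * j)),
        # Bias at every t for the input c * t**2, per unit of c.
        "quadratic_bias": float(np.sum(w * j ** 2)),
    }
```

All three numbers depend only on the weight vector, so filters of the same length can be compared on a common footing: here the 13-term filter reduces noise variance more than the 9-term filter, at the price of a larger bias on curved trends.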

9.

Thin plate regression splines

Simon N. Wood 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2003,65(1):95-114

Summary. I discuss the production of low rank smoothers for d ≥ 1 dimensional data, which can be fitted by regression or penalized regression methods. The smoothers are constructed by a simple transformation and truncation of the basis that arises from the solution of the thin plate spline smoothing problem and are optimal in the sense that the truncation is designed to result in the minimum possible perturbation of the thin plate spline smoothing problem given the dimension of the basis used to construct the smoother. By making use of Lanczos iteration the basis change and truncation are computationally efficient. The smoothers allow the use of approximate thin plate spline models with large data sets, avoid the problems that are associated with 'knot placement' that usually complicate modelling with regression splines or penalized regression splines, provide a sensible way of modelling interaction terms in generalized additive models, provide low rank approximations to generalized smoothing spline models, appropriate for use with large data sets, provide a means for incorporating smooth functions of more than one variable into non-linear models and improve the computational efficiency of penalized likelihood models incorporating thin plate splines. Given that the approach produces spline-like models with a sparse basis, it also provides a natural way of incorporating unpenalized spline-like terms in linear and generalized linear models, and these can be treated just like any other model terms from the point of view of model selection, inference and diagnostics.

10.

Generalised Rank Regression Estimator with Standard Error Adjusted Lasso

A.S. Turkmen, O. Ozturk 《Australian & New Zealand Journal of Statistics》2016,58(1):121-135

One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least‐squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank‐based procedures have desirable robustness properties compared to LS procedures, we propose a rank‐based adaptive lasso‐type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta‐Carotene Level data set.

11.

Robust Kalman tracking and smoothing with propagating and non-propagating outliers

Peter Ruckdeschel, Bernhard Spangl, Daria Pupashenko 《Statistical Papers》2014,55(1):93-123

A common situation in filtering where classical Kalman filtering does not perform particularly well is tracking in the presence of propagating outliers. This calls for robustness understood in a distributional sense, i.e., we enlarge the distribution assumptions made in the ideal model by suitable neighborhoods. Based on optimality results for distributional-robust Kalman filtering from Ruckdeschel (Ansätze zur Robustifizierung des Kalman-Filters, vol 64, 2001; Optimally (distributional-)robust Kalman filtering, arXiv: 1004.3393, 2010a), we propose new robust recursive filters and smoothers designed for this purpose as well as specialized versions for non-propagating outliers. We apply these procedures in the context of a GPS problem arising in the car industry. To better understand these filters, we study their behavior at stylized outlier patterns (for which they are not designed) and compare them to other approaches for the tracking problem. Finally, in a simulation study we discuss the efficiency of our procedures in comparison to competitors.

12.

A Robust Score Test for Testing Several Coefficients of Variation with Unknown Underlying Distributions

Tsung-Shan Tsou 《Communications in Statistics - Theory and Methods》2013,42(9):1350-1360

A parametric robust test is proposed for comparing several coefficients of variation. This test is derived by properly correcting the normal likelihood function according to the technique suggested by Royall and Tsou. The proposed test statistic is asymptotically valid for general random variables, as long as their underlying distributions have finite fourth moments.

Simulation studies and real data analyses are provided to demonstrate the effectiveness of the novel robust procedure.

13.

Focused information criterion and model averaging based on weighted composite quantile regression

Ganggang Xu, Suojin Wang, Jianhua Z. Huang 《Scandinavian Journal of Statistics》2014,41(2):365-381

We study the focused information criterion and frequentist model averaging and their application to post‐model‐selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non‐parametric functions approximated by polynomial splines, we show that, under certain conditions, the asymptotic distribution of the frequentist model averaging WCQR‐estimator of a focused parameter is a non‐linear mixture of normal distributions. This asymptotic distribution is used to construct confidence intervals that achieve the nominal coverage probability. With properly chosen weights, the focused information criterion based WCQR estimators are not only robust to outliers and non‐normal residuals but also can achieve efficiency close to the maximum likelihood estimator, without assuming the true error distribution. Simulation studies and a real data analysis are used to illustrate the effectiveness of the proposed procedure.

14.

Weighted quantile regression with nonelliptically structured covariates

Matías Salibián‐Barrera, Ying Wei 《Revue canadienne de statistique》2008,36(4):595-611

Although quantile regression estimators are robust against low leverage observations with atypically large responses (Koenker & Bassett 1978), they can be seriously affected by a few points that deviate from the majority of the sample covariates. This problem can be alleviated by downweighting observations with high leverage. Unfortunately, when the covariates are not elliptically distributed, Mahalanobis distances may not be able to correctly identify atypical points. In this paper the authors discuss the use of weights based on a new leverage measure constructed using Rosenblatt's multivariate transformation which is able to reflect nonelliptical structures in the covariate space. The resulting weighted estimators are consistent, asymptotically normal, and have a bounded influence function. In addition, the authors also discuss a selection criterion for choosing the downweighting scheme. They illustrate their approach with child growth data from Finland. Finally, their simulation studies suggest that this methodology has good finite‐sample properties.

15.

Design of kernel M-smoothers for spatial data

Carlo Grillenzoni 《Statistical Methodology》2008,5(3):220-237

Robust nonparametric smoothers have proved effective at preserving edges in image denoising. As an extension, they should be capable of estimating multivariate surfaces containing discontinuities on the basis of a random spatial sampling. A crucial problem is the design of their coefficients, in particular those of the kernels which concern robustness. In this paper it is shown that bandwidths which regard smoothness can consistently be estimated, whereas those which concern robustness cannot be estimated with plug-in and cross-validation criteria. Heuristic and graphical methods are proposed for their selection and their efficacy is shown in simulation experiments.

16.

Simplex Mixed‐Effects Models for Longitudinal Proportional Data

Zhenguo Qiu, Peter X.‐K. Song, Ming Tan 《Scandinavian Journal of Statistics》2008,35(4):577-596

Abstract. Continuous proportional outcomes are collected from many practical studies, where responses are confined within the unit interval (0,1). Utilizing Barndorff‐Nielsen and Jørgensen's simplex distribution, we propose a new type of generalized linear mixed‐effects model for longitudinal proportional data, where the expected value of proportion is directly modelled through a logit function of fixed and random effects. We establish statistical inference along the lines of Breslow and Clayton's penalized quasi‐likelihood (PQL) and restricted maximum likelihood (REML) in the proposed model. We derive the PQL/REML using the high‐order multivariate Laplace approximation, which gives satisfactory estimation of the model parameters. The proposed model and inference are illustrated by simulation studies and a data example. The simulation studies conclude that the fourth order approximate PQL/REML performs satisfactorily. The data example shows that Aitchison's technique of the normal linear mixed model for logit‐transformed proportional outcomes is not robust against outliers.

17.

For censored and truncated data

Chul-Ki Kim, Tze Leung Lai 《Communications in Statistics - Theory and Methods》2013,42(11):2717-2747

In this paper we develop nonparametric methods for regression analysis when the response variable is subject to censoring and/or truncation. The development is based on a data completion principle that enables us to apply, via an iterative scheme, nonparametric regression techniques to iteratively completed data from a given sample with censored and/or truncated observations. In particular, locally weighted regression smoothers and additive regression models are extended to left-truncated and right-censored data. Nonparametric regression analysis is applied to the Stanford heart transplant data, which have been analyzed by previous authors using semiparametric regression methods, and provides new insights into the relationship between expected survival time after a heart transplant and explanatory variables.

18.

Robust Bayesian nonlinear mixed‐effects modeling of time to positivity in tuberculosis trials

Divan Aristo Burger, Robert Schall, Ding‐Geng Chen 《Pharmaceutical statistics》2018,17(5):615-628

Early phase 2 tuberculosis (TB) trials are conducted to characterize the early bactericidal activity (EBA) of anti‐TB drugs. The EBA of anti‐TB drugs has conventionally been calculated as the rate of decline in colony forming unit (CFU) count during the first 14 days of treatment. The measurement of CFU count, however, is expensive and prone to contamination. As an alternative to CFU count, time to positivity (TTP), which is a potential biomarker for long‐term efficacy of anti‐TB drugs, can be used to characterize EBA. The current Bayesian nonlinear mixed‐effects (NLME) regression model for TTP data, however, lacks robustness to gross outliers that often are present in the data. The conventional way of handling such outliers involves their identification by visual inspection and subsequent exclusion from the analysis. However, this process can be questioned because of its subjective nature. For this reason, we fitted robust versions of the Bayesian nonlinear mixed‐effects regression model to a wide range of TTP datasets. The performance of the explored models was assessed through model comparison statistics and a simulation study. We conclude that fitting a robust model to TTP data obviates the need for explicit identification and subsequent “deletion” of outliers but ensures that gross outliers exert no undue influence on model fits. We recommend that the current practice of fitting conventional normal theory models be abandoned in favor of fitting robust models to TTP data.

19.

Bandwidth selection for local linear regression smoothers

Nicolas W. Hengartner, Marten H. Wegkamp, Eric Matzner-Løber 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(4):791-804

Summary. The paper presents a general strategy for selecting the bandwidth of nonparametric regression estimators and specializes it to local linear regression smoothers. The procedure requires the sample to be divided into a training sample and a testing sample. Using the training sample we first compute a family of regression smoothers indexed by their bandwidths. Next we select the bandwidth by minimizing the empirical quadratic prediction error on the testing sample. The resulting bandwidth satisfies a finite sample oracle inequality which holds for all bounded regression functions. This permits asymptotically optimal estimation for nearly any regression function. The practical performance of the method is illustrated by a simulation study which shows good finite sample behaviour of our method compared with other bandwidth selection procedures.
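The split-sample strategy described in this abstract (fit a family of smoothers on a training sample, then pick the bandwidth with the smallest quadratic prediction error on a testing sample) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the Gaussian kernel, the 50/50 random split, the candidate bandwidths, the function names and the simulated data are all choices made for the example.

```python
import numpy as np

def local_linear(x_train, y_train, x_eval, bandwidth):
    """Local linear regression with a Gaussian kernel, evaluated at x_eval."""
    fitted = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        d = x_train - x0
        w = np.exp(-0.5 * (d / bandwidth) ** 2)
        # Weighted least squares of y on (1, x - x0); the intercept is the fit at x0.
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * y_train).sum(), (w * d * y_train).sum()
        fitted[i] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)
    return fitted

def select_bandwidth(x, y, bandwidths, train_fraction=0.5, seed=0):
    """Pick the bandwidth minimising mean squared prediction error on a held-out sample."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(x))
    n_train = int(train_fraction * len(x))
    train, test = perm[:n_train], perm[n_train:]
    errors = np.array([
        np.mean((y[test] - local_linear(x[train], y[train], x[test], h)) ** 2)
        for h in bandwidths
    ])
    return bandwidths[np.argmin(errors)], errors

# Hypothetical data: a noisy sine curve on a random design.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(200)

bandwidths = np.array([0.03, 0.05, 0.1, 0.2, 0.5])
best_h, errors = select_bandwidth(x, y, bandwidths)
```

Because the testing sample plays no part in fitting, `errors` is an honest estimate of prediction error for each candidate bandwidth; the cost is that each smoother is fitted on only part of the data.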

20.

Robust small area estimation

Sanjoy K. Sinha, J. N. K. Rao 《Revue canadienne de statistique》2009,37(3):381-399

Small area estimation has received considerable attention in recent years because of growing demand for small area statistics. Basic area‐level and unit‐level models have been studied in the literature to obtain empirical best linear unbiased prediction (EBLUP) estimators of small area means. Although this classical method is useful for estimating the small area means efficiently under normality assumptions, it can be highly influenced by the presence of outliers in the data. In this article, the authors investigate the robustness properties of the classical estimators and propose a resistant method for small area estimation, which is useful for downweighting any influential observations in the data when estimating the model parameters. To estimate the mean squared errors of the robust estimators of small area means, a parametric bootstrap method is adopted here, which is applicable to models with block diagonal covariance structures. Simulations are carried out to study the behaviour of the proposed robust estimators in the presence of outliers, and these estimators are also compared to the EBLUP estimators. Performance of the bootstrap mean squared error estimator is also investigated in the simulation study. The proposed robust method is also applied to some real data to estimate crop areas for counties in Iowa, using farm‐interview data on crop areas and LANDSAT satellite data as auxiliary information. The Canadian Journal of Statistics 37: 381–399; 2009 © 2009 Statistical Society of Canada