首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
In this paper we consider weighted generalized‐signed‐rank estimators of nonlinear regression coefficients. The generalization allows us to include popular estimators such as the least squares and least absolute deviations estimators but by itself does not give bounded influence estimators. Adding weights results in estimators with bounded influence function. We establish conditions needed for the consistency and asymptotic normality of the proposed estimator and discuss how weight functions can be chosen to achieve bounded influence function of the estimator. Real life examples and Monte Carlo simulation experiments demonstrate the robustness and efficiency of the proposed estimator. An example shows that the weighted signed‐rank estimator can be useful to detect outliers in nonlinear regression. The Canadian Journal of Statistics 40: 172–189; 2012 © 2012 Statistical Society of Canada  相似文献   

2.
The authors propose a two‐stage estimation procedure for the partially linear model Y = fo(T) + X'βo + ψ. They show how to estimate consistently the location of the nonzero components of βo. Their approach turns out to be compatible with minimax adaptive estimation of fo over Besov balls in the case of penalized least squares. Their proofs are based on a new type of oracle inequality.  相似文献   

3.
We consider various robust estimators for the extended Burr Type III (EBIII) distribution for complete data with outliers. The considered robust estimators are M-estimators, least absolute deviations, Theil, Siegel's repeated median, least trimmed squares, and least median of squares. Before we perform the aforementioned estimators for the EBIII, we adapt the quantiles method to the estimation of the shape parameter k of the EBIII. The simulation results show that the considered robust estimators generally outperform the existing estimation approaches for data with upper outliers, with certain of them retaining a relatively high degree of efficiency for small sample sizes.  相似文献   

4.
Estimators derived from the expectation‐maximization (EM) algorithm are not robust since they are based on the maximization of the likelihood function. We propose an iterative proximal‐point algorithm based on the EM algorithm to minimize a divergence criterion between a mixture model and the unknown distribution that generates the data. The algorithm estimates in each iteration the proportions and the parameters of the mixture components in two separate steps. Resulting estimators are generally robust against outliers and misspecification of the model. Convergence properties of our algorithm are studied. The convergence of the introduced algorithm is discussed on a two‐component Weibull mixture entailing a condition on the initialization of the EM algorithm in order for the latter to converge. Simulations on Gaussian and Weibull mixture models using different statistical divergences are provided to confirm the validity of our work and the robustness of the resulting estimators against outliers in comparison to the EM algorithm. An application to a dataset of velocities of galaxies is also presented. The Canadian Journal of Statistics 47: 392–408; 2019 © 2019 Statistical Society of Canada  相似文献   

5.
One of the standard variable selection procedures in multiple linear regression is to use a penalisation technique in least‐squares (LS) analysis. In this setting, many different types of penalties have been introduced to achieve variable selection. It is well known that LS analysis is sensitive to outliers, and consequently outliers can present serious problems for the classical variable selection procedures. Since rank‐based procedures have desirable robustness properties compared to LS procedures, we propose a rank‐based adaptive lasso‐type penalised regression estimator and a corresponding variable selection procedure for linear regression models. The proposed estimator and variable selection procedure are robust against outliers in both response and predictor space. Furthermore, since rank regression can yield unstable estimators in the presence of multicollinearity, in order to provide inference that is robust against multicollinearity, we adjust the penalty term in the adaptive lasso function by incorporating the standard errors of the rank estimator. The theoretical properties of the proposed procedures are established and their performances are investigated by means of simulations. Finally, the estimator and variable selection procedure are applied to the Plasma Beta‐Carotene Level data set.  相似文献   

6.
When the data contain outliers or come from population with heavy-tailed distributions, which appear very often in spatiotemporal data, the estimation methods based on least-squares (L2) method will not perform well. More robust estimation methods are required. In this article, we propose the local linear estimation for spatiotemporal models based on least absolute deviation (L1) and drive the asymptotic distributions of the L1-estimators under some mild conditions imposed on the spatiotemporal process. The simulation results for two examples, with outliers and heavy-tailed distribution, respectively, show that the L1-estimators perform better than the L2-estimators.  相似文献   

7.
In the multiple linear regression analysis, the ridge regression estimator and the Liu estimator are often used to address multicollinearity. Besides multicollinearity, outliers are also a problem in the multiple linear regression analysis. We propose new biased estimators based on the least trimmed squares (LTS) ridge estimator and the LTS Liu estimator in the case of the presence of both outliers and multicollinearity. For this purpose, a simulation study is conducted in order to see the difference between the robust ridge estimator and the robust Liu estimator in terms of their effectiveness; the mean square error. In our simulations, the behavior of the new biased estimators is examined for types of outliers: X-space outlier, Y-space outlier, and X-and Y-space outlier. The results for a number of different illustrative cases are presented. This paper also provides the results for the robust ridge regression and robust Liu estimators based on a real-life data set combining the problem of multicollinearity and outliers.  相似文献   

8.
The problem of multicollinearity and outliers in the data set produce undesirable effects on the ordinary least squares estimator. Therefore, robust two parameter ridge estimation based on M-estimator (ME) is introduced to deal with multicollinearity and outliers in the y-direction. The proposed estimator outperforms ME, two parameter ridge estimator and robust ridge M-estimator according to mean square error criterion. Moreover, a numerical example and a Monte Carlo simulation experiment are presented.  相似文献   

9.
We developed robust estimators that minimize a weighted L1 norm for the first-order bifurcating autoregressive model. When all of the weights are fixed, our estimate is an L1 estimate that is robust against outlying points in the response space and more efficient than the least squares estimate for heavy-tailed error distributions. When the weights are random and depend on the points in the factor space, the weighted L1 estimate is robust against outlying points in the factor space. Simulated and artificial examples are presented. The behavior of the proposed estimate is modeled through a Monte Carlo study.  相似文献   

10.
We conducted confirmatory factor analysis (CFA) of responses (N=803) to a self‐reported measure of optimism, using full‐information estimation via adaptive quadrature (AQ), an alternative estimation method for ordinal data. We evaluated AQ results in terms of the number of iterations required to achieve convergence, model fit, parameter estimates, standard errors (SE), and statistical significance, across four link‐functions (logit, probit, log‐log, complimentary log‐log) using 3–10 and 20 quadrature points. We compared AQ results with those obtained using maximum likelihood, robust maximum likelihood, and robust diagonally weighted least‐squares estimation. Compared to the other two link‐functions, logit and probit not only produced fit statistics, parameters estimates, SEs, and levels of significance that varied less across numbers of quadrature points, but also fitted the data better and provided larger completely standardised loadings than did maximum likelihood and diagonally weighted least‐squares. Our findings demonstrate the viability of using full‐information AQ to estimate CFA models with real‐world ordinal data.  相似文献   

11.
This paper proposes a new robust Bayes factor for comparing two linear models. The factor is based on a pseudo‐model for outliers and is more robust to outliers than the Bayes factor based on the variance‐inflation model for outliers. If an observation is considered an outlier for both models this new robust Bayes factor equals the Bayes factor calculated after removing the outlier. If an observation is considered an outlier for one model but not the other then this new robust Bayes factor equals the Bayes factor calculated without the observation, but a penalty is applied to the model considering the observation as an outlier. For moderate outliers where the variance‐inflation model is suitable, the two Bayes factors are similar. The new Bayes factor uses a single robustness parameter to describe a priori belief in the likelihood of outliers. Real and synthetic data illustrate the properties of the new robust Bayes factor and highlight the inferior properties of Bayes factors based on the variance‐inflation model for outliers.  相似文献   

12.
The generalized method of moments (GMM) and empirical likelihood (EL) are popular methods for combining sample and auxiliary information. These methods are used in very diverse fields of research, where competing theories often suggest variables satisfying different moment conditions. Results in the literature have shown that the efficient‐GMM (GMME) and maximum empirical likelihood (MEL) estimators have the same asymptotic distribution to order n?1/2 and that both estimators are asymptotically semiparametric efficient. In this paper, we demonstrate that when data are missing at random from the sample, the utilization of some well‐known missing‐data handling approaches proposed in the literature can yield GMME and MEL estimators with nonidentical properties; in particular, it is shown that the GMME estimator is semiparametric efficient under all the missing‐data handling approaches considered but that the MEL estimator is not always efficient. A thorough examination of the reason for the nonequivalence of the two estimators is presented. A particularly strong feature of our analysis is that we do not assume smoothness in the underlying moment conditions. Our results are thus relevant to situations involving nonsmooth estimating functions, including quantile and rank regressions, robust estimation, the estimation of receiver operating characteristic (ROC) curves, and so on.  相似文献   

13.
Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p n , their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n≥20 and when dealing with p-variate data where p≤5. But with p=8 it can be unsatisfactory, even with n=200. A projection method as well the minimum generalized variance method generally perform best, but with p≤5 conditions where the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.  相似文献   

14.
A general class of rank statistics based on the characteristic function is introduced for testing goodness‐of‐fit hypotheses about the copula of a continuous random vector. These statistics are defined as L 2 weighted functional distances between a nonparametric estimator and a semi‐parametric estimator of the characteristic function associated with a copula. It is shown that these statistics behave asymptotically as degenerate V ‐statistics of order four and that the limit distributions have representations in terms of weighted sums of independent chi‐square variables. The consistency of the tests against general alternatives is established and an asymptotically valid parametric bootstrap is suggested for the computation of the critical values of the tests. The behaviour of the new tests in small and moderate sample sizes is investigated with the help of simulations and compared with a competing test based on the empirical copula. Finally, the methodology is illustrated on a five‐dimensional data set.  相似文献   

15.
To perform regression analysis in high dimensions, lasso or ridge estimation are a common choice. However, it has been shown that these methods are not robust to outliers. Therefore, alternatives as penalized M-estimation or the sparse least trimmed squares (LTS) estimator have been proposed. The robustness of these regression methods can be measured with the influence function. It quantifies the effect of infinitesimal perturbations in the data. Furthermore, it can be used to compute the asymptotic variance and the mean-squared error (MSE). In this paper we compute the influence function, the asymptotic variance and the MSE for penalized M-estimators and the sparse LTS estimator. The asymptotic biasedness of the estimators make the calculations non-standard. We show that only M-estimators with a loss function with a bounded derivative are robust against regression outliers. In particular, the lasso has an unbounded influence function.  相似文献   

16.
The L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling. Various algorithms have been proposed to solve optimization problems for L1-type regularization. Especially the coordinate descent algorithm has been shown to be effective in sparse regression modeling. Although the algorithm shows a remarkable performance to solve optimization problems for L1-type regularization, it suffers from outliers, since the procedure is based on the inner product of predictor variables and partial residuals obtained from a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, especially focusing on the high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and real data analysis are conducted to examine the efficiency of the proposed robust algorithm. We observe that our robust coordinate descent algorithm effectively performs for the high-dimensional regression modeling even in the presence of outliers.  相似文献   

17.
Numerous estimation techniques for regression models have been proposed. These procedures differ in how sample information is used in the estimation procedure. The efficiency of least squares (OLS) estimators implicity assumes normally distributed residuals and is very sensitive to departures from normality, particularly to "outliers" and thick-tailed distributions. Lead absolute deviation (LAD) estimators are less sensitive to outliers and are optimal for laplace random disturbances, but not for normal errors. This paper reports monte carlo comparisons of OLS,LAD, two robust estimators discussed by huber, three partially adaptiveestimators, newey's generalized method of moments estimator, and an adaptive maximum likelihood estimator based on a normal kernal studied by manski. This paper is the first to compare the relative performance of some adaptive robust estimators (partially adaptive and adaptive procedures) with some common nonadaptive robust estimators. The partially adaptive estimators are based on three flxible parametric distributions for the errors. These include the power exponential (Box-Tiao) and generalized t distributions, as well as a distribution for the errors, which is not necessarily symmetric. The adaptive procedures are "fully iterative" rather than one step estimators. The adaptive estimators have desirable large sample properties, but these properties do not necessarily carry over to the small sample case.

The monte carlo comparisons of the alternative estimators are based on four different specifications for the error distribution: a normal, a mixture of normals (or variance-contaminated normal), a bimodal mixture of normals, and a lognormal. Five hundred samples of 50 are used. The adaptive and partially adaptive estimators perform very well relative to the other estimation procedures considered, and preliminary results suggest that in some important cases they can perform much better than OLS with 50 to 80% reductions in standard errors.

  相似文献   

18.
In this paper, we suggest a least squares procedure for the determination of the number of upper outliers in an exponential sample by minimizing sample mean squared error. Moreover, the method can reduce the masking or “swamping” effects. In addition, we have also found that the least squares procedure is easy and simple to compute than test test procedure T k suggested by Zhang (1998) for determining the number of upper outliers, since Zhang (1998) need to use the complicated null distribution of T k . Moreover, we give three practical examples and a simulated example to illustrate the procedures. Further, simulation studies are given to show the advantages of the proposed method. Finally, the proposed least squares procedure can also determine the number of upper outliers in other continuous univariate distributions (for example, Pareto, Gumbel, Weibull, etc.). Received: May 10, 1999; revised version: June 5, 2000  相似文献   

19.
The authors propose a robust transformation linear mixed‐effects model for longitudinal continuous proportional data when some of the subjects exhibit outlying trajectories over time. It becomes troublesome when including or excluding such subjects in the data analysis results in different statistical conclusions. To robustify the longitudinal analysis using the mixed‐effects model, they utilize the multivariate t distribution for random effects or/and error terms. Estimation and inference in the proposed model are established and illustrated by a real data example from an ophthalmology study. Simulation studies show a substantial robustness gain by the proposed model in comparison to the mixed‐effects model based on Aitchison's logit‐normal approach. As a result, the data analysis benefits from the robustness of making consistent conclusions in the presence of influential outliers. The Canadian Journal of Statistics © 2009 Statistical Society of Canada  相似文献   

20.
We propose the Laplace Error Penalty (LEP) function for variable selection in high‐dimensional regression. Unlike penalty functions using piecewise splines construction, the LEP is constructed as an exponential function with two tuning parameters and is infinitely differentiable everywhere except at the origin. With this construction, the LEP‐based procedure acquires extra flexibility in variable selection, admits a unified derivative formula in optimization and is able to approximate the L0 penalty as close as possible. We show that the LEP procedure can identify relevant predictors in exponentially high‐dimensional regression with normal errors. We also establish the oracle property for the LEP estimator. Although not being convex, the LEP yields a convex penalized least squares function under mild conditions if p is no greater than n. A coordinate descent majorization‐minimization algorithm is introduced to implement the LEP procedure. In simulations and a real data analysis, the LEP methodology performs favorably among competitive procedures.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号