Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Abstract. In this article, we maximize the efficiency of a multivariate S‐estimator under a constraint on the breakdown point. In the linear regression model, it is known that the highest possible efficiency of a maximum breakdown S‐estimator is bounded above by 33 per cent for Gaussian errors. We prove the surprising result that in dimensions larger than one, the efficiency of a maximum breakdown S‐estimator of location and scatter can get arbitrarily close to 100 per cent, by an appropriate selection of the loss function.

2.
We consider a new class of scale estimators with 50% breakdown point. The estimators are defined as order statistics of certain subranges. They all have a finite-sample breakdown point of [n/2]/n, which is the best possible value. (Here, [...] denotes the integer part.) One estimator in this class has the same influence function as the median absolute deviation and the least median of squares (LMS) scale estimator (i.e., the length of the shortest half), but its finite-sample efficiency is higher. If we consider the standard deviation of a subsample instead of its range, we obtain a different class of 50% breakdown estimators. This class contains the least trimmed squares (LTS) scale estimator. Simulation shows that the LTS scale estimator is nearly unbiased, so it does not need a small-sample correction factor. Surprisingly, the efficiency of the LTS scale estimator is less than that of the LMS scale estimator.
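
As a rough illustration of the two reference estimators named in this abstract, the sketch below computes the median absolute deviation and the length of the shortest half of a sample. It is not the new subrange-based class from the paper. The function names `mad` and `shortest_half_length`, the half size h = n // 2 + 1, and the omission of any consistency or small-sample correction factor are assumptions made here for illustration.

```python
import statistics


def mad(xs):
    """Median absolute deviation about the median (no consistency factor)."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def shortest_half_length(xs):
    """Length of the shortest interval covering h = n // 2 + 1 observations.

    The choice of h is an assumed convention, not taken from the abstract.
    """
    s = sorted(xs)
    n = len(s)
    h = n // 2 + 1
    # Slide a window of h consecutive order statistics and keep the narrowest.
    return min(s[i + h - 1] - s[i] for i in range(n - h + 1))


sample = [1, 2, 3, 4, 100]
print(mad(sample), shortest_half_length(sample))
```

Replacing the value 100 by a much larger number leaves both results unchanged, which is the kind of outlier resistance a high breakdown point describes.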

3.
Several authors have taken the worst case breakdown measures in analyzing the robustness of a test. In general, these kinds of measures give only a rough picture of breakdown robustness of a test. To overcome this limitation, a new kind of breakdown measure of a test is defined as the smallest proportion of arbitrary outliers in the sample that can distort the test decision. In this paper it is called the sample breakdown point of a test. A distinct advantage of this new measure is that it is directly concerned with the test decision based on the present sample and with the critical region of the test. The sample breakdown points of several commonly used tests of one-sided or two-sided hypotheses are calculated and their asymptotic properties are also established. By Monte Carlo simulations and asymptotic analysis, we show that the acceptance breakdown of the t-test and the Hotelling T²-test is slightly better than that of the sample mean test. Finally, we prove that, for one-sided hypothesis testing of location, the sign test asymptotically has the maximum sample breakdown point within a class of M-tests and score-tests.

4.
We consider the problem of choosing among a class of possible estimators by selecting the estimator with the smallest bootstrap estimate of finite sample variance. This is an alternative to using cross-validation to choose an estimator adaptively. The problem of a confidence interval based on such an adaptive estimator is considered. We illustrate the ideas by applying the method to the problem of choosing the trimming proportion of an adaptive trimmed mean. It is shown that a bootstrap adaptive trimmed mean is asymptotically normal with an asymptotic variance equal to the smallest among trimmed means. The asymptotic coverage probability of a bootstrap confidence interval based on such adaptive estimators is shown to have the nominal level. The intervals based on the asymptotic normality of the estimator share the same asymptotic result, but have poor small-sample properties compared to the bootstrap intervals. A small-sample simulation demonstrates that bootstrap adaptive trimmed means adapt themselves rather well even for samples of size 10.
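
The selection rule described in this abstract can be sketched as follows: for each candidate trimming proportion, estimate the variance of the trimmed mean by the bootstrap and keep the proportion with the smallest estimate. This is only a minimal sketch. The candidate grid, the number of bootstrap replicates, the fixed seed, the trimming convention (dropping floor(alpha * n) points from each tail) and all function names are assumptions, not details from the paper.

```python
import random
import statistics


def trimmed_mean(xs, alpha):
    """Mean after dropping int(alpha * n) observations from each tail (alpha < 0.5)."""
    s = sorted(xs)
    k = int(alpha * len(s))
    kept = s[k:len(s) - k]
    return sum(kept) / len(kept)


def bootstrap_variance(xs, alpha, n_boot, rng):
    """Bootstrap estimate of the variance of the alpha-trimmed mean."""
    n = len(xs)
    reps = [trimmed_mean(rng.choices(xs, k=n), alpha) for _ in range(n_boot)]
    return statistics.pvariance(reps)


def adaptive_trimmed_mean(xs, alphas=(0.0, 0.1, 0.2, 0.3, 0.4), n_boot=500, seed=0):
    """Return (chosen alpha, trimmed mean at that alpha)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best = min(alphas, key=lambda a: bootstrap_variance(xs, a, n_boot, rng))
    return best, trimmed_mean(xs, best)


data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 55.0]
print(adaptive_trimmed_mean(data))
```

With a sample of size 10 containing one gross outlier, the untrimmed mean has by far the largest bootstrap variance, so the rule selects a positive trimming proportion.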

5.
Simultaneous robust estimates of location and scale parameters are derived from a class of M-estimating equations. A coefficient p (p > 0), which plays a role similar to that of a tuning constant in the theory of M-estimation, determines the estimating equations. These estimating equations may be obtained as the gradient of a strictly convex criterion function. This article shows that the estimators are uniquely defined, asymptotically bivariate normal and have positive breakdown for some choices of p. When p = 0.12 and p = 0.3, the estimators are almost fully efficient for normal and exponential distributions: efficiencies with respect to the maximum likelihood estimators are 1.00 and 0.99, respectively. It is shown that the location estimator for known scale has the maximum breakdown point 0.5 independent of p, when the target model is symmetric. Also it is shown that the scale estimator has a positive breakdown point which depends on the choice of p. A simulation study finds that the proposed location estimator has smaller variance than the Hodges–Lehmann estimator, Huber's minimax and bisquare M-estimators.

6.
This paper considers the robustness properties in the time series context of the least median of squares (LMS) estimator. The influence function of the LMS estimator is derived under additive outlier contamination. This influence function is redescending and bounded for fixed values of the AR parameters. The gross-error sensitivity, however, is an unbounded function of the AR parameters. In order to assess the global robustness behavior of the LMS estimator, we consider several notions of breakdown. The breakdown points of the LMS estimator depend on the value of the underlying AR parameter. Generally, the breakdown point is below one half for high values of the AR parameter. The bias curves of the LMS estimator reveal, however, that the magnitude of outliers has to be considerable in order to cause breakdown.

7.
Trimming principles play an important role in robust statistics. However, their use for clustering typically requires some preliminary information about the contamination rate and the number of groups. We suggest a fresh approach to trimming that does not rely on this knowledge and that proves to be particularly suited for solving problems in robust cluster analysis. Our approach replaces the original K‐population (robust) estimation problem with K distinct one‐population steps, which take advantage of the good breakdown properties of trimmed estimators when the trimming level exceeds the usual bound of 0.5. In this setting, we prove that exact affine equivariance is lost on one hand but, on the other hand, an arbitrarily high breakdown point can be achieved by “anchoring” the robust estimator. We also support the use of adaptive trimming schemes, in order to infer the contamination rate from the data. A further bonus of our methodology is its ability to provide a reliable choice of the usually unknown number of groups.

8.
The paper considers generalized maximum likelihood asymptotic power one tests which aim to detect a change point in logistic regression when the alternative specifies that a change occurred in parameters of the model. A guaranteed non-asymptotic upper bound for the significance level of each of the tests is presented. For cases in which the test supports the conclusion that there was a change point, we propose a maximum likelihood estimator of that point and present results regarding the asymptotic properties of the estimator. An important field of application of this approach is occupational medicine, where for many chemical compounds and other agents, so-called threshold limit values (or TLVs) are specified. We demonstrate applications of the test and the maximum likelihood estimation of the change point using an actual problem that was encountered with real data.

9.
The inverse hypergeometric distribution is of interest in applications of inverse sampling without replacement from a finite population where a binary observation is made on each sampling unit. Thus, sampling is performed by randomly choosing units sequentially one at a time until a specified number of one of the two types is selected for the sample. Assuming the total number of units in the population is known but the number of each type is not, we consider the problem of estimating this parameter. We use the Delta method to develop approximations for the variance of three parameter estimators. We then propose three large sample confidence intervals for the parameter. Based on these results, we selected a sampling of parameter values for the inverse hypergeometric distribution to empirically investigate performance of these estimators. We evaluate their performance in terms of expected probability of parameter coverage and confidence interval length calculated as means of possible outcomes weighted by the appropriate outcome probabilities for each parameter value considered. The unbiased estimator of the parameter is the preferred estimator relative to the maximum likelihood estimator and an estimator based on a negative binomial approximation, as evidenced by empirical estimates of closeness to the true parameter value. Confidence intervals based on the unbiased estimator tend to be shorter than the two competitors because of its relatively small variance but at a slight cost in terms of coverage probability.

10.
This paper discusses the robustness of discriminant analysis against contamination in the training data; the test data are assumed uncontaminated. The concept of training data breakdown point for discriminant analysis is introduced. It is quite different from the usual breakdown point in robust statistics. In the robust location parameter estimation problem, outliers are the main concern, but in discriminant analysis, not only are outliers a concern, but also inliers.

11.
Asymptotically valid inference in linear regression models is easily achieved under mild conditions using the well-known Eicker–White heteroskedasticity–robust covariance matrix estimator or one of its variants. In finite samples, however, such inferences can suffer from substantial size distortion. Indeed, it is well established in the literature that the finite sample accuracy of a test may depend on which variant of the Eicker–White estimator is used, on the underlying data generating process (DGP) and on the desired level of the test.

This paper develops a new variant of the Eicker–White estimator which explicitly aims to minimize the finite sample null error in rejection probability (ERP) of the test. This is made possible by selecting the transformation of the squared residuals which results in the smallest possible ERP through a numerical algorithm based on the wild bootstrap. Monte Carlo evidence indicates that this new procedure achieves a level of robustness to the DGP, sample size and nominal testing level unequaled by any other Eicker–White estimator based asymptotic test.


12.
The problem of estimating a smooth distribution function F at a point t is treated under the proportional hazard model of random censorship. It is shown that a certain class of properly chosen kernel-type estimators of F asymptotically performs better than the maximum likelihood estimator. It is shown that the relative deficiency of the maximum likelihood estimator of F under the proportional hazard model with respect to the properly chosen kernel-type estimator tends to infinity as the sample size tends to infinity.

13.
We show that the jackknife technique fails badly when applied to the problem of estimating the variance of a sample quantile. When viewed as a point estimator, the jackknife estimator is known to be inconsistent. We show that the ratio of the jackknife variance estimate to the true variance has an asymptotic Weibull distribution with parameters 1 and 1/2. We also show that if the jackknife variance estimate is used to Studentize the sample quantile, the asymptotic distribution of the resulting Studentized statistic is markedly nonnormal, having infinite mean. This result is in stark contrast with that obtained in simpler problems, such as that of constructing confidence intervals for a mean, where the jackknife-Studentized statistic has an asymptotic standard normal distribution.
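
To make the object of study concrete, the sketch below computes the standard delete-one jackknife variance estimate for an arbitrary statistic and applies it to the sample mean and the sample median. It only illustrates the estimator and does not reproduce the paper's asymptotic results. The function name `jackknife_variance` and the toy data are assumptions.

```python
import statistics


def jackknife_variance(xs, stat):
    """Delete-one jackknife variance estimate of stat(xs); xs must be a list."""
    n = len(xs)
    # Recompute the statistic with each observation left out in turn.
    loo = [stat(xs[:i] + xs[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    return (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)


xs = [1, 2, 3, 4, 5, 6]
# For the mean, the jackknife reproduces the usual s^2 / n.
print(jackknife_variance(xs, statistics.mean), statistics.variance(xs) / len(xs))
# For the median, the leave-one-out values take only two distinct values here.
print(jackknife_variance(xs, statistics.median))
```

For a smooth statistic such as the mean, the leave-one-out values vary gradually with the omitted point. For the median of an even-sized sample they jump between just two order statistics, which hints at why the estimate behaves erratically for quantiles.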

14.
A well-known problem in multiple regression is that it is possible to reject the hypothesis that all slope parameters are equal to zero, yet when applying the usual Student's T-test to the individual parameters, no significant differences are found. An alternative strategy is to estimate prediction error via the 0.632 bootstrap method for all models of interest and declare the parameters associated with the model that yields the smallest prediction error to differ from zero. The main results in this paper are that this latter strategy can have practical value versus Student's T; replacing squared error with absolute error can be beneficial in some situations and replacing least squares with an extension of the Theil-Sen estimator can substantially increase the probability of identifying the correct model under circumstances that are described.

15.
This paper considers the problem of estimating the error density and distribution functions in nonparametric regression models. The asymptotic distribution of a suitably standardized density estimator at a fixed point is shown to be normal while that of the maximum of a suitably normalized deviation of the density estimator from the true density function is the same as in the case of the one-sample setup. Finally, the standardized residual empirical process is shown to be uniformly close to the similarly standardized empirical process of the errors. This paper thus generalizes some of the well-known results about the residual density estimators and the empirical process in parametric regression models to nonparametric regression models, thereby enhancing the domain of their applications.

16.
The usual (global) breakdown point describes the worst effect that a given number of gross errors can have. In a two-way layout, without interaction, one is frustrated by the small number of gross errors such a design can tolerate. However, neither the whole fit nor all parameter estimates need to be affected by such a breakdown. An example from molecular spectroscopy serves to illustrate such partial breakdown in a large, “sparse” two-factor model. Because the global finite sample breakdown point is zero for all usual estimators in this example, this concept does not make sense in such problems. The more appropriate concept of partial breakdown point is discussed in this paper. It also provides a crude quantification of the robustness properties of an estimator, but does so for any linear combination of the estimated parameters. The maximum number of gross errors that the linear combination of the estimated parameters can resist is related to the minimum number of observations that must be omitted to make the linear function a non-estimable function. In the example, we are mainly interested in differences of parameters. Then the maximal partial breakdown point for regression equivariant estimators is one half, and Huber-type regression M-estimators with bounded ψ-function reach this limit.

17.
18.
The estimated test error of a learned classifier is the most commonly reported measure of classifier performance. However, constructing a high quality point estimator of the test error has proved to be very difficult. Furthermore, common interval estimators (e.g. confidence intervals) are based on the point estimator of the test error and thus inherit all the difficulties associated with the point estimation problem. As a result, these confidence intervals do not reliably deliver nominal coverage. In contrast we directly construct the confidence interval by use of smooth data-dependent upper and lower bounds on the test error. We prove that for linear classifiers, the proposed confidence interval automatically adapts to the non-smoothness of the test error, is consistent under fixed and local alternatives, and does not require that the Bayes classifier be linear. Moreover, the method provides nominal coverage on a suite of test problems using a range of classification algorithms and sample sizes.

19.
Researchers often report point estimates of turning point(s) obtained in polynomial regression models but rarely assess the precision of these estimates. We discuss three methods to assess the precision of such turning point estimates. The first is the delta method that leads to a normal approximation of the distribution of the turning point estimator. The second method uses the exact distribution of the turning point estimator of quadratic regression functions. The third method relies on Markov chain Monte Carlo methods to provide a finite sample approximation of the exact distribution of the turning point estimator. We argue that the delta method may lead to misleading inference and that the other two methods are more reliable. We compare the three methods using two data sets from the environmental Kuznets curve literature, where the presence and location of a turning point in the income-pollution relationship are the focus of much empirical work.
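
The first of the three methods can be sketched for the quadratic case y = b0 + b1 x + b2 x², where the turning point is -b1 / (2 b2). The delta method linearizes this ratio around the coefficient estimates to obtain an approximate standard error. The function name, its argument names and the numbers in the usage line are hypothetical. The abstract itself cautions that this approximation may lead to misleading inference.

```python
import math


def turning_point_delta(b1, b2, var_b1, var_b2, cov_b12):
    """Turning point -b1 / (2 * b2) of a quadratic fit and its delta-method standard error.

    Inputs are the estimated linear and quadratic coefficients together with
    their estimated variances and covariance (illustrative argument names).
    """
    theta = -b1 / (2.0 * b2)
    # Partial derivatives of theta with respect to b1 and b2.
    g1 = -1.0 / (2.0 * b2)
    g2 = b1 / (2.0 * b2 ** 2)
    var_theta = g1 ** 2 * var_b1 + 2.0 * g1 * g2 * cov_b12 + g2 ** 2 * var_b2
    return theta, math.sqrt(var_theta)


# Hypothetical coefficient estimates, for illustration only.
print(turning_point_delta(b1=4.0, b2=-1.0, var_b1=0.04, var_b2=0.01, cov_b12=0.0))
```

The approximation degrades when b2 is close to zero relative to its standard error, because the ratio is then far from linear in the coefficients.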

20.
Based on a general progressively type II censored sample, the maximum likelihood estimator (MLE), Bayes estimator under squared error loss and credible intervals for the scale parameter and the reliability function of the Rayleigh distribution are derived. Also, the Bayes predictive estimator and highest posterior density (HPD) prediction interval for a future observation are considered. Comparisons among estimators are investigated through Monte Carlo simulations. An illustrative example with real data concerning 23 ball bearings in a life test is presented.
