Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
We propose a robust version of Cox-type test statistics for the choice between two non-nested hypotheses. We first show that the influence of small amounts of contamination in the data on the test decision can be very large. Secondly, we build a robust test statistic by using the results on robust parametric tests that are available in the literature and show that the level of the robust test is stable. Finally, we show numerically not only the robustness of this new test statistic but also that its asymptotic distribution is a good approximation of its sample distribution, unlike for the classical test statistic. We apply our results to the choice between a Pareto and an exponential distribution as well as between two competing regressors in the simple linear regression model without intercept.

2.
Control charts are frequently used tools for monitoring and controlling processes. Classical control charts are sensitive to contaminated observations that may be present in the data collected from the processes. Thus, these charts cannot control the processes precisely when the data are contaminated. Robust control charts are those which are less sensitive to contamination. Some robust control charts for monitoring process variability have been proposed in the past, but they are robust only to certain kinds of contamination. In this paper a new robust R control chart is proposed which is less sensitive to a wide range of contamination, i.e. both general and local contamination. Simulation studies are performed to compare the performance of the proposed control chart with some classical and robust control charts, using ARL and MSD as comparison criteria. The simulation results show a very good performance of the proposed chart when both types of contamination exist.

3.
Detecting outliers in a multivariate point cloud is not trivial, especially when dealing with a sizable fraction of contamination. Over time, it has increasingly been recognized that the safest and most feasible approach to exposing outliers starts by computing a highly robust estimator of location and scatter that can withstand a large proportion of contamination. Many such estimators have been proposed in recent years. We will compare the worst-case bias of several prominent robust multivariate estimators by means of simulation. We also propose a new tool to compare robust estimators on real data sets, and illustrate it.

4.
Robust parameter design (RPD) is an effective tool, which involves experimental design and strategic modeling to determine the optimal operating conditions of a system. RPD usually assumes that the experimental data are normally distributed and free of contamination due to outliers. In addition, the parameter uncertainties in response models are generally neglected. However, using normal theory modeling methods for skewed data and ignoring parameter uncertainties can create a chain of degradation in the optimization and production phases, such as a misleading fit, poorly estimated optimal operating conditions, and poor-quality products. This article presents a new approach based on confidence interval (CI) response modeling for the process mean. The proposed interval robust design makes the system median unbiased for the mean and uses the midpoint of the interval as a measure of location performance response. As an alternative robust estimator for process variance response modeling, the biweight midvariance is proposed, which is both resistant and robust in efficiency when normality is not met. The results further show that the proposed interval robust design gives a robust solution to the skewed structure of the data and to contaminated data. The procedure and its advantages are illustrated using two experimental design studies.

5.
This paper discusses the robustness of discriminant analysis against contamination in the training data; the test data are assumed uncontaminated. The concept of training data breakdown point for discriminant analysis is introduced. It is quite different from the usual breakdown point in robust statistics. In the robust location parameter estimation problem, outliers are the main concern, but in discriminant analysis, not only outliers but also inliers are a concern.

6.
Contamination, often in the form of outliers, is very common in data. Among other effects, outliers in a homoscedastic model make the model heteroscedastic. Moreover, outliers distort diagnostic tools for heteroscedasticity such that it may not be correctly identified. In this article, we show how outliers affect heteroscedasticity diagnostics. We then propose a robust procedure for detecting heteroscedasticity in the presence of outliers by robustifying the non-robust component of the Goldfeld–Quandt (GQ) test. The performance of the proposed procedure is examined using simulation experiments and real data sets. The proposed procedure offers great improvement where the conventional GQ and other procedures fail.

7.
Robust control charts are useful in statistical process control (SPC) when there is limited knowledge about the underlying process distribution, especially for multivariate observations. This article develops a new robust and self-starting multivariate procedure based on the multivariate Smirnov test (MST), which integrates a multivariate two-sample goodness-of-fit (GOF) test based on the multivariate empirical distribution function (MEDF) and the change-point model. As expected, simulation results show that our proposed control chart is robust to nonnormally distributed data, and moreover, it is efficient in detecting process shifts, especially large shifts, a task at which most robust control charts in the literature perform poorly. As it avoids the need for a lengthy data-gathering step, the proposed chart is particularly useful in start-up or short-run situations. Comparison results and a real data example show that our proposed chart has great potential for application.

8.
The author introduces robust techniques for estimation, inference and variable selection in the analysis of longitudinal data. She first addresses the problem of the robust estimation of the regression and nuisance parameters, for which she derives the asymptotic distribution. She uses weighted estimating equations to build robust quasi‐likelihood functions. These functions are then used to construct a class of test statistics for variable selection. She derives the limiting distribution of these tests and shows its robustness properties in terms of stability of the asymptotic level and power under contamination. An application to a real data set allows her to illustrate the benefits of a robust analysis.

9.
Loosely speaking, a robust projection index is one that prefers projections involving true clusters over projections consisting of a cluster and an outlier. We introduce a mathematical definition of one-dimensional index robustness and describe a numerical experiment to measure it. We design five new indices based on measuring divergence from Student's t-distribution which are intended to be especially robust: the experiment shows that they are more robust than several established indices. The experiment also reveals more generally that the robustness of moment indices depends on the number of approximation terms, providing additional practical guidance for existing projection pursuit implementations. We investigate the theoretical properties of one new Student t-index and Hall's index and show that the new index automatically adapts its robustness to the degree of outlier contamination. We conclude by outlining the possibilities for extending our experiments to both higher dimensions and other new indices.

10.
Diagnostic tools must rely on robust high-breakdown methodologies to avoid distortion in the presence of contamination by outliers. However, a disadvantage of having a single, even if robust, summary of the data is that important choices concerning parameters of the robust method, such as breakdown point, have to be made prior to the analysis. The effect of such choices may be difficult to evaluate. We argue that an effective solution is to look at several pictures, and possibly at a whole movie, of the available data. This can be achieved by monitoring, over a range of parameter values, the results computed through the robust methodology of choice. We show the information gain that monitoring provides in the study of complex data structures through the analysis of multivariate datasets using different high-breakdown techniques. Our findings support the claim that the principle of monitoring is very flexible and that it can lead to robust estimators that are as efficient as possible. We also address through simulation some of the tricky inferential issues that arise from monitoring.

11.
The main purpose of this paper is to introduce first a new family of empirical test statistics for testing a simple null hypothesis when the vector of parameters of interest is defined through a specific set of unbiased estimating functions. This family of test statistics is based on a distance between two probability vectors, with the first probability vector obtained by maximizing the empirical likelihood (EL) on the vector of parameters, and the second vector defined from the fixed vector of parameters under the simple null hypothesis. The distance considered for this purpose is the phi-divergence measure. The asymptotic distribution is then derived for this family of test statistics. The proposed methodology is illustrated through the well-known data of Newcomb's measurements on the passage time for light. A simulation study is carried out to compare its performance with that of the EL ratio test when confidence intervals are constructed based on the respective statistics for small sample sizes. The results suggest that the ‘empirical modified likelihood ratio test statistic’ provides a competitive alternative to the EL ratio test statistic, and is also more robust than the EL ratio test statistic in the presence of contamination in the data. Finally, we propose empirical phi-divergence test statistics for testing a composite null hypothesis and present some asymptotic as well as simulation results for evaluating the performance of these test procedures.

12.
This article develops a new test based on Spearman’s rank correlation coefficients for total independence in high dimensions. The test is robust to non-normality and heavy tails in the data, a merit not shared by the existing tests in the literature. Simulation results suggest that the new test performs well under several typical null and alternative hypotheses. In addition, we employ a real data set to illustrate the use of the new test.
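
As a rough illustration of why rank-based statistics such as the Spearman coefficient mentioned in item 12 tolerate heavy tails, the sketch below compares the Pearson and Spearman correlations on a small synthetic sample. It is not the test proposed in the article; the helper names (`average_ranks`, `pearson`, `spearman`) and the toy data are assumptions made only for this example.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Synthetic toy data: y is a monotone but strongly right-skewed function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [v ** 3 for v in x]

print("Pearson :", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because the ranks are unchanged by any strictly increasing transformation, the Spearman coefficient here reflects the perfect monotone relationship, while the Pearson coefficient is pulled down by the skewed scale of `y`.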

13.
This article develops a new distribution-free multivariate procedure for statistical process control based on a minimal spanning tree (MST), which integrates a multivariate two-sample goodness-of-fit (GOF) test based on the MST and a change-point model. Simulation results show that our proposed procedure is quite robust to nonnormally distributed data, and moreover, it is efficient in detecting process shifts, especially moderate to large shifts, a task at which most distribution-free procedures in the literature perform poorly. The proposed procedure is particularly useful in start-up situations. Comparison results and a real data example show that our proposed procedure has great potential for application.

14.
This article focuses on estimation of multivariate simple linear profiles. Because outliers may hamper the expected performance of the ordinary regression estimators, this study resorts to robust estimators as a remedy for the estimation problem in the presence of contaminated observations. More specifically, three robust estimators, M, S and MM, are employed. Extensive simulation runs show that in the absence of outliers or for small amounts of contamination, the robust methods perform as well as the classical least squares method, while for medium and large amounts of contamination the proposed estimators perform considerably better than the classical method.

15.
We use the forward search to provide robust Mahalanobis distances to detect the presence of outliers in a sample of multivariate normal data. Theoretical results on order statistics and on estimation in truncated samples provide the distribution of our test statistic. We also introduce several new robust distances with associated distributional results. Comparisons of our procedure with tests using other robust Mahalanobis distances show the good size and high power of our procedure. We also provide a unification of results on correction factors for estimation from truncated samples.

16.
Although the poor performance of the mean as a location estimate when outliers are present in the data is well known, there has been no clear consensus as to whether robust estimation or outlier detection is the appropriate corrective procedure. In this paper, the estimation accuracy of the sample mean and 27 robust estimation and outlier detection techniques are compared by computer simulation. Both symmetric and asymmetric contamination are considered. It is shown that the proper class of estimates depends on the degree of contamination, whether the contamination is symmetric or asymmetric, and the sample size. Several data sets considered previously by Rocke et al. (1982) are also examined.
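
The sensitivity of the sample mean described in item 16 can be seen on a toy sample. The following minimal sketch compares the mean, the median, and a trimmed mean after a single observation is replaced by a gross outlier. The data values and the helper `trimmed_mean` are assumptions for illustration only and are not taken from the paper, which compares 27 techniques by simulation.

```python
import statistics


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return sum(kept) / len(kept)


# Synthetic toy sample centred near 10 (made-up values).
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
# Asymmetric contamination: replace the last observation with a gross outlier.
contaminated = clean[:-1] + [100.0]

for name, sample in (("clean", clean), ("contaminated", contaminated)):
    print(
        name,
        "mean:", round(statistics.mean(sample), 3),
        "median:", round(statistics.median(sample), 3),
        "10% trimmed mean:", round(trimmed_mean(sample, 0.1), 3),
    )
```

In this sketch one outlier out of ten observations moves the mean far from the bulk of the data, while the median and the trimmed mean stay close to 10.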

17.
In genome-wide association studies (GWASs) that aim to detect disease-associated genetic variants, the two-stage design has received much attention because of its cost effectiveness and high efficiency. Under the framework of a two-stage design, it has been shown that joint analysis is more powerful than replication-based analysis. Several robust tests have been proposed for joint analysis to handle the problem of an unknown genetic mode of inheritance. However, existing joint analyses that combine test statistics from both stages might suffer from a loss of efficiency if the combined test statistics are not sufficient or the weight of the statistic for each stage is not appropriate. In this article, we propose a new strategy for joint analysis by combining the raw data rather than the test statistics across stages and construct a robust MAX3-based test for two-stage GWASs, which can make full use of the information in the data from both stages. Our numerical results show that the proposed procedure is more powerful and computationally much faster than the existing joint analysis procedures. An application to a type 2 diabetes dataset is used to illustrate the proposed approach.

18.
We study the problem of merging homogeneous groups of pre-classified observations from a robust perspective motivated by the anti-fraud analysis of international trade data. This problem may be seen as a clustering task which exploits preliminary information on the potential clusters, available in the form of group-wise linear regressions. Robustness is then needed because of the sensitivity of likelihood-based regression methods to deviations from the postulated model. Through simulations run under different contamination scenarios, we assess the impact of outliers both on group-wise regression fitting and on the quality of the final clusters. We also compare alternative robust methods that can be adopted to detect the outliers and thus to clean the data. One major conclusion of our study is that the use of robust procedures for preliminary outlier detection is generally recommended, except perhaps when contamination is weak and the identification of cluster labels is more important than the estimation of group-specific population parameters. We also apply the methodology to find homogeneous groups of transactions in one empirical example that illustrates our motivating anti-fraud framework.

19.
This work studies outlier detection and robust estimation with data that are naturally distributed into groups and which follow approximately a linear regression model with fixed group effects. For this, several methods are considered. First, the robust fitting method of Peña and Yohai [A fast procedure for outlier diagnostics in large regression problems. J Am Stat Assoc. 1999;94:434–445], called the principal sensitivity components (PSC) method, is adapted to the grouped data structure and the mentioned model. The robust methods RDL1 of Hubert and Rousseeuw [Robust regression with both continuous and binary regressors. J Stat Plan Inference. 1997;57:153–163] and M-S of Maronna and Yohai [Robust regression with both continuous and categorical predictors. J Stat Plan Inference. 2000;89:197–214] are also considered. These three methods are compared in terms of their effectiveness in outlier detection and their robustness through simulations, considering several contamination scenarios and growing contamination levels. Results indicate that the adapted PSC procedure is able to detect a high percentage of true outliers and a small number of false outliers. It is appropriate when the contamination is in the error term or in the covariates, and it can also detect possibly masked high leverage points. Moreover, in simulations the final robust regression estimator preserved good efficiency under normality while keeping good robustness properties.

20.
Many studies treat robust tests in homoscedastic linear models. However, robust testing procedures in heteroscedastic linear models have not been examined. In this article, three classes of procedures for testing subhypotheses in heteroscedastic linear models are developed. These are Wald-type, score-type, and drop-in-dispersion tests. The asymptotic distributions of these tests are obtained under the null hypothesis and contiguous alternatives. As a robustness criterion, the maximum asymptotic bias of the level of the test for distributions in a shrinking contamination neighborhood is used, and the most efficient robust test is derived. Finally, the performance of these tests in small samples is studied by Monte Carlo simulation.
