Similar articles
 Found 20 similar articles (search time: 406 ms)
1.
Control charts are among the most widely used techniques in statistical process control. In Phase I, historical observations are analysed in order to construct a control chart. Because the masking effect can leave multiple outliers undetected by control charts such as Hotelling's T², robust alternatives to Hotelling's T² have been developed based on minimum volume ellipsoid (MVE) estimators, minimum covariance determinant (MCD) estimators, reweighted MCD estimators or trimmed estimators. In this paper, we use a simulation study to analyse the performance of each alternative in various situations and offer guidance for the correct use of each estimator.
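The masking effect described in this abstract can be illustrated with a small sketch. It is not the simulation from the paper: the data set, the helper names (`mean_cov`, `t2`, `exact_mcd`) and the use of a chi-square cutoff are all assumptions made for illustration, and the MCD is computed by brute force over every subset of size h (feasible only for tiny samples) without the usual consistency correction.

```python
from itertools import combinations

def mean_cov(points):
    """Sample mean and covariance (divisor n - 1) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def det(cov):
    """Determinant of a symmetric 2x2 matrix stored as (sxx, sxy, syy)."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy ** 2

def t2(point, mean, cov):
    """Squared Mahalanobis distance of one observation (its T^2 value)."""
    sxx, sxy, syy = cov
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det(cov)

def exact_mcd(points, h):
    """Raw MCD: mean and covariance of the h-subset with the smallest
    covariance determinant, found by exhaustive search."""
    best = min(combinations(points, h), key=lambda s: det(mean_cov(s)[1]))
    return mean_cov(best)

# Hypothetical Phase I data: 7 in-control points and a cluster of 3 outliers.
clean = [(0, 0), (1, 0.5), (-1, -0.4), (0.5, -1), (-0.5, 1), (1, -1), (-1, 1)]
outliers = [(10, 10), (10.5, 9.5), (9.5, 10.5)]
data = clean + outliers

# Illustrative cutoff only: 0.975 quantile of chi-square with 2 degrees of freedom.
CUTOFF = 7.378

classical = [t2(p, *mean_cov(data)) for p in data]
robust = [t2(p, *exact_mcd(data, h=7)) for p in data]

for p, c, r in zip(data, classical, robust):
    print(p, "classical T2 = %.2f" % c, "robust T2 = %.2f" % r)
```

With the classical estimates, the outlier cluster inflates the covariance matrix enough that its own T² values stay below the cutoff; with the MCD-based estimates the same points stand out clearly.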

2.
This article proposes new regression-type estimators by considering Tukey-M, Hampel M, Huber MM, LTS, LMS and LAD robust methods and MCD and MVE robust covariance matrices in stratified sampling. We obtain the theoretical mean square error (MSE) of these estimators and use the MSE equations to compare the efficiency of the proposed estimators with that of the traditional combined and separate regression estimators. These comparisons show that the proposed estimators are more efficient than the traditional approaches. The theoretical results are supported by numerical examples and simulations based on data sets that include outliers.

3.
Traditional multivariate control charts are based upon the assumption that the observations follow a multivariate normal distribution. In many practical applications, however, this assumption may be difficult to verify. In this paper, we use control charts based on robust estimators of location and scale to improve the ability to detect out-of-control observations under non-normality in the presence of multiple outliers. Specifically, we use a simulation study to analyse the behaviour of the robust alternatives to Hotelling's T², which use the minimum volume ellipsoid (MVE) and minimum covariance determinant (MCD) estimators, when the observations follow a Student's t-distribution. The results show that these robust control charts are good alternatives for small deviations from normality, because the percentage of out-of-control observations that they detect in Phase II is higher.

4.
To overcome the main flaw of the minimum covariance determinant (MCD) estimator, namely the difficulty of determining its main parameter h, a modified MCD (M-MCD) algorithm is proposed. M-MCD uses a self-adaptive iteration that adjusts the parameter h to minimize the discrepancy between the standard deviation of the robust squared Mahalanobis distance, calculated from the sample by MCD with parameter h, and the standard deviation of the theoretical squared Mahalanobis distance. The optimal parameter h of M-MCD is the one at which this discrepancy is smallest. A convergence analysis demonstrates that M-MCD has good convergence properties. Further, M-MCD and MCD were applied to detect outliers in two typical data sets and in chemical process data. The results show that M-MCD can find the optimal parameter h through the self-adaptive iteration and therefore detects outliers better than MCD.

5.
In the past decade, several researchers have proposed different robust estimators to improve the ability of multivariate control charts to detect non-random patterns such as trends, process mean shifts, and outliers. However, a T² control chart based on the sample mean vector and the mean square successive difference matrix is sensitive in detecting a process mean shift or trend but less sensitive in detecting outliers. On the other hand, a T² control chart based on the minimum volume ellipsoid (MVE) estimators is sensitive in detecting multiple outliers but less sensitive in detecting a trend or process mean shift. Therefore, new robust estimators that combine the merits of the mean square successive difference matrix and the MVE estimators are developed to modify Hotelling's T² control chart. To compare the detection performance of the various control charts, a simulation approach for establishing control limits and calculating signal probabilities is provided as well. Our simulation results show that a multivariate control chart using the new robust estimators achieves a well-balanced sensitivity in detecting the above-mentioned non-random patterns. Finally, three numerical examples further demonstrate the usefulness of the new robust estimators.
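As a rough illustration of the mean square successive difference matrix mentioned in this abstract, the sketch below uses the commonly cited form S_D = (1 / (2(n - 1))) Σ v_i v_iᵀ with v_i = x_{i+1} - x_i. This is assumed to match the estimator the authors refer to; the function name `mssd_cov` and the example series are hypothetical.

```python
from statistics import variance

def mssd_cov(points):
    """Mean square successive difference covariance estimate:
    sum of outer products of consecutive differences, divided by 2(n - 1)."""
    n = len(points)
    p = len(points[0])
    total = [[0.0] * p for _ in range(p)]
    for prev, curr in zip(points, points[1:]):
        v = [c - q for c, q in zip(curr, prev)]
        for i in range(p):
            for j in range(p):
                total[i][j] += v[i] * v[j]
    return [[value / (2 * (n - 1)) for value in row] for row in total]

# Hypothetical one-dimensional series with a step shift in the mean half way through.
shifted = [(0.0,), (0.1,), (-0.1,), (0.0,), (5.0,), (5.1,), (4.9,), (5.0,)]

mssd_var = mssd_cov(shifted)[0][0]
sample_var = variance(x[0] for x in shifted)
print("MSSD variance:  ", mssd_var)
print("sample variance:", sample_var)
```

Because only one successive difference spans the shift, the MSSD estimate is inflated far less than the ordinary sample variance, which is consistent with the abstract's remark that charts built on it remain sensitive to mean shifts.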

6.
Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p_n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p_n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n ≥ 20 and when dealing with p-variate data where p ≤ 5. But with p = 8 it can be unsatisfactory, even with n = 200. A projection method as well as the minimum generalized variance method generally perform best, but conditions with p ≤ 5 under which the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.

7.
In this paper, we consider the asymptotic distributions of functionals of the sample covariance matrix and the sample mean vector obtained under the assumption that the matrix of observations has a matrix-variate location mixture of normal distributions. The central limit theorem is derived for the product of the sample covariance matrix and the sample mean vector. Moreover, we consider the product of the inverse sample covariance matrix and the mean vector for which the central limit theorem is established as well. All results are obtained under the large-dimensional asymptotic regime, where the dimension p and the sample size n approach infinity such that p/n → c ∈ [0, +∞) when the sample covariance matrix does not need to be invertible and p/n → c ∈ [0, 1) otherwise.

8.
A criterion for robust estimation of location and covariance matrix is considered, and its application in outlier labeling is discussed. This method, unlike the methods based on MVE and MCD, is applicable to large and high-dimensional data sets. The method proposed here is also robust and has the same breakdown point as the MVE- and MCD-based methods. Furthermore, the computational complexity of the proposed method is significantly smaller than that of other methods.

9.
10.
Let X_n = (x_ij) be a k×n data matrix with complex-valued, independent and standardized entries satisfying a Lindeberg-type moment condition. We consider simultaneously R sample covariance matrices , where the Q_r's are non-random real matrices with common dimensions p×k (k p ). Assuming that both the dimension p and the sample size n grow to infinity, the limiting distributions of the eigenvalues of the matrices {B_nr} are identified, and as the main result of the paper, we establish a joint central limit theorem (CLT) for linear spectral statistics of the R matrices {B_nr}. Next, this new CLT is applied to the problem of testing a high-dimensional white noise in time series modelling. In experiments, the derived test has a controlled size and is significantly faster than the classical permutation test, although it does have lower power. This application highlights the necessity of such a joint CLT in the presence of several dependent sample covariance matrices. In contrast, all the existing works on CLT for linear spectral statistics of large sample covariance matrices deal with a single sample covariance matrix (R = 1).

11.
Communications in Statistics - Theory and Methods, 2012, 41(13-14): 2465-2489
The Akaike information criterion, AIC, and Mallows' C_p statistic have been proposed for selecting a smaller number of regressors in multivariate regression models with a fully unknown covariance matrix. These criteria are, however, based on the implicit assumption that the sample size is substantially larger than the dimension of the covariance matrix. To obtain a stable estimator of the covariance matrix, the dimension of the covariance matrix must be much smaller than the sample size. When the dimension is close to the sample size, it is necessary to use ridge-type estimators for the covariance matrix. In this article, we use ridge-type estimators for the covariance matrix and obtain the modified AIC and modified C_p statistic under the asymptotic theory that both the sample size and the dimension go to infinity. It is numerically shown that these modified procedures perform very well in the sense of selecting the true model in large-dimensional cases.
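To make the idea of a ridge-type covariance estimator concrete, here is a minimal sketch that adds a multiple of the identity to the sample covariance matrix, S + λI. The abstract does not state the exact form the authors use, so this form, the value of `lam`, and the helper names are assumptions for illustration only.

```python
def sample_cov(rows):
    """Sample covariance matrix (divisor n - 1) of a list of p-dimensional rows."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]

def ridge_cov(s, lam):
    """Ridge-type estimate: add lam to every diagonal entry of s (assumed form)."""
    return [
        [s[i][j] + (lam if i == j else 0.0) for j in range(len(s))]
        for i in range(len(s))
    ]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Two observations in two dimensions: the sample covariance matrix is singular.
rows = [(0.0, 0.0), (2.0, 2.0)]
s = sample_cov(rows)
s_ridge = ridge_cov(s, lam=0.5)

print("det of sample covariance:", det2(s))
print("det of ridge estimate:   ", det2(s_ridge))
```

The sample covariance matrix here cannot be inverted, while the ridge-type estimate can, which is the stability property the abstract relies on when the dimension is close to the sample size.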

12.
We propose a method that integrates the bootstrap into the forward search algorithm to construct robust confidence intervals for elements of the eigenvectors of the correlation matrix in the presence of outliers. In a simulation study, the coverage probability of the bootstrap simultaneous confidence intervals was compared with the coverage probabilities of the regular asymptotic confidence region and of the asymptotic confidence region based on the minimum covariance determinant (MCD) approach. The method produced more stable coverage probabilities for datasets with or without outliers and across several sample sizes than the approaches based on asymptotic confidence regions.

13.
When the data contain outliers or come from populations with heavy-tailed distributions, which occurs very often in spatiotemporal data, estimation methods based on least squares (L2) do not perform well, and more robust estimation methods are required. In this article, we propose local linear estimation for spatiotemporal models based on least absolute deviation (L1) and derive the asymptotic distributions of the L1-estimators under some mild conditions imposed on the spatiotemporal process. The simulation results for two examples, with outliers and a heavy-tailed distribution, respectively, show that the L1-estimators perform better than the L2-estimators.
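The contrast between L2 and L1 estimation can be seen in the simplest possible setting, estimating a single location parameter: the mean minimizes the sum of squared deviations and the median minimizes the sum of absolute deviations. The sketch below is only this simplified analogue, not the local linear spatiotemporal estimator proposed in the article, and the data are made up.

```python
from statistics import mean, median

def l2_loss(c, ys):
    """Sum of squared deviations from c (minimized by the mean)."""
    return sum((y - c) ** 2 for y in ys)

def l1_loss(c, ys):
    """Sum of absolute deviations from c (minimized by the median)."""
    return sum(abs(y - c) for y in ys)

clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = clean[:-1] + [500.0]  # replace the last value with a gross outlier

for name, ys in (("clean", clean), ("contaminated", contaminated)):
    print(name, "L2 estimate (mean):", mean(ys), "L1 estimate (median):", median(ys))
```

A single outlier drags the L2 estimate far from the bulk of the data while the L1 estimate does not move, which is the behaviour the abstract reports for the regression setting.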

14.
Summary.  We consider the problem of obtaining population-based inference in the presence of missing data and outliers in the context of estimating the prevalence of obesity and body mass index measures from the 'Healthy for life' study. Identifying multiple outliers in a multivariate setting is problematic because of problems such as masking, in which groups of outliers inflate the covariance matrix in a fashion that prevents their identification when included, and swamping, in which outliers skew covariances in a fashion that makes non-outlying observations appear to be outliers. We develop a latent class model that assumes that each observation belongs to one of K unobserved latent classes, with each latent class having a distinct covariance matrix. We consider the latent class covariance matrix with the largest determinant to form an 'outlier class'. By separating the covariance matrix for the outliers from the covariance matrices for the remainder of the data, we avoid the problems of masking and swamping. As did Ghosh-Dastidar and Schafer, we use a multiple-imputation approach, which allows us simultaneously to conduct inference after removing cases that appear to be outliers and to promulgate uncertainty in the outlier status through the model inference. We extend the work of Ghosh-Dastidar and Schafer by embedding the outlier class in a larger mixture model, consider penalized likelihood and posterior predictive distributions to assess model choice and model fit, and develop the model in a fashion to account for the complex sample design. We also consider the repeated sampling properties of the multiple imputation removal of outliers.

15.
The estimation of the covariance matrix is important in the analysis of bivariate longitudinal data. A good estimator of the covariance matrix can improve the efficiency of the estimators of the mean regression coefficients. Furthermore, the covariance estimation itself is also of interest, but modelling the covariance matrix of bivariate longitudinal data is challenging because of its complex structure and the positive definiteness constraint. In addition, most existing approaches are based on maximum likelihood, which is very sensitive to outliers and heavy-tailed error distributions. In this article, an adaptive robust estimation method is proposed for bivariate longitudinal data. Unlike the existing likelihood-based methods, the proposed method can adapt to different error distributions. Specifically, we first use the modified Cholesky block decomposition to parameterize the covariance matrices. Second, we apply the bounded Huber score function to develop a set of robust generalized estimating equations that estimate the parameters of the mean and covariance models simultaneously. A data-driven approach is presented to select the parameter c in the Huber score function, which ensures that the proposed method is both robust and efficient. A simulation study and a real data analysis are conducted to illustrate the robustness and efficiency of the proposed approach.
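For readers unfamiliar with the bounded Huber score function, a minimal sketch follows. It shows only the standard function ψ_c(r) = max(-c, min(c, r)) and the implied weight ψ_c(r)/r; the default c = 1.345 is the conventional fixed choice and is an assumption here, since the article selects c with a data-driven procedure that the abstract does not describe.

```python
def huber_psi(r, c=1.345):
    """Huber score function: identity on [-c, c], clipped to +/-c outside."""
    return max(-c, min(c, r))

def huber_weight(r, c=1.345):
    """Weight psi(r) / r given to a standardized residual r (1 when r is 0)."""
    return 1.0 if r == 0 else huber_psi(r, c) / r

# Hypothetical standardized residuals, including one gross outlier.
residuals = [-0.5, 0.2, 1.0, -1.2, 10.0]

for r in residuals:
    print(r, "psi =", huber_psi(r), "weight = %.3f" % huber_weight(r))
```

Small residuals pass through unchanged, while the contribution of a large residual to the estimating equations is capped at c, which is what bounds the influence of outliers.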

16.
Let X be a p_0-variate normal random vector with unknown mean µ and unknown covariance matrix Σ, and let X be partitioned as X = (X_(1), …, X_(r))′, where X_(j) is a subvector of X with dimension p_j such that ∑_{j=1}^{r} p_j = p_0. Some admissible tests are derived for testing H_0: µ = 0 versus H_1: µ ≠ 0 based on a sample drawn from the whole vector X of dimension p_0 and r additional samples drawn from X_(1), X_(2), …, X_(r), respectively. All (r + 1) samples are assumed to be independent. The distributions of some of the test statistics involved are also derived.

17.
Suppose m and V are respectively the vector of expected values and the covariance matrix of the order statistics of a sample of size n from a continuous distribution F. A method is presented to calculate asymptotic values of functions of m and V⁻¹, for distributions F which are sufficiently regular. Values are given for the normal, logistic, and extreme-value distributions; also, for completeness, for the uniform and exponential distributions, although for these other methods must be used.

18.
In this paper, we suggest a least squares procedure for determining the number of upper outliers in an exponential sample by minimizing the sample mean squared error. The method can reduce the masking and "swamping" effects. We have also found that the least squares procedure is easier and simpler to compute than the test procedure T_k suggested by Zhang (1998) for determining the number of upper outliers, since Zhang's procedure requires the complicated null distribution of T_k. We give three practical examples and a simulated example to illustrate the procedures. Further, simulation studies are given to show the advantages of the proposed method. Finally, the proposed least squares procedure can also determine the number of upper outliers in other continuous univariate distributions (for example, Pareto, Gumbel, Weibull, etc.). Received: May 10, 1999; revised version: June 5, 2000

19.
Let {x_ij (1 ≤ j ≤ n_i) | i = 1, 2, …, k} be k independent samples of sizes n_i from respective distribution functions F_i(x) (1 ≤ i ≤ k). A classical statistical problem is to test whether these k samples came from a common distribution function F(x), whose form may or may not be known. In this paper, we consider the complementary problem of estimating the distribution functions suspected to be homogeneous in order to improve the basic estimator known as the "empirical distribution function" (edf), in an asymptotic setup. Accordingly, we consider four additional estimators, namely, the restricted estimator (RE), the preliminary test estimator (PTE), the shrinkage estimator (SE), and the positive rule shrinkage estimator (PRSE), and study their characteristic properties based on the mean squared error (MSE) and relative risk efficiency (RRE) with tables and graphs. We observed that for k ≥ 4, the positive rule SE performs uniformly better than both the shrinkage and the unrestricted estimators, while the PTE works reasonably well for k < 4.

20.
Let X be a normally distributed p-dimensional column vector with mean μ and positive definite covariance matrix Σ, and let X^α, α = 1, …, N, be a random sample of size N from this distribution. Partition X as (X_1, X′_(2), X′_(3))′, where X_1 is one-dimensional, X_(2) is p2-dimensional, and so 1 + p1 + p2 = p. Let ρ_1 and ρ be the multiple correlation coefficients of X_1 with X_(2) and with (X′_(2), X′_(3))′, respectively. Write ρ_2² = ρ² − ρ_1². We shall consider the following two problems

