Similar literature
1.
A test for homogeneity of g ≥ 2 covariance matrices is presented when the dimension, p, may exceed the sample size, n_i, i = 1, …, g, and the populations may not be normal. Under some mild assumptions on the covariance matrices, the asymptotic distribution of the test is shown to be normal when n_i, p → ∞. Under the null hypothesis, the test is extended to the case where the common covariance matrix has a specified structure, including sphericity. The theory of U-statistics is employed in constructing the tests and deriving their limits. Simulations are used to show the accuracy of the tests.

2.
In this paper, we consider the asymptotic distributions of functionals of the sample covariance matrix and the sample mean vector obtained under the assumption that the matrix of observations has a matrix‐variate location mixture of normal distributions. The central limit theorem is derived for the product of the sample covariance matrix and the sample mean vector. Moreover, we consider the product of the inverse sample covariance matrix and the mean vector, for which the central limit theorem is established as well. All results are obtained under the large‐dimensional asymptotic regime, where the dimension p and the sample size n approach infinity such that p/n → c ∈ [0, +∞) when the sample covariance matrix does not need to be invertible and p/n → c ∈ [0, 1) otherwise.

3.
Two new statistics are proposed for testing the identity of a high-dimensional covariance matrix. Applying large-dimensional random matrix theory, we study the asymptotic distributions of the proposed statistics when the dimension p and the sample size n tend to infinity proportionally. The proposed tests can accommodate both the situation where the data dimension is much larger than the sample size and the situation where the population distribution is non-Gaussian. Numerical studies demonstrate that the proposed tests have good empirical power for a wide range of dimensions and sample sizes.

4.
Let X =(x)ij=(111, …, X,)T, i = 1, …, n, be an n × p random matrix having multivariate symmetrical distributions with parameters μ, Σ. The p-variate normal with mean μ and covariance matrix Σ is a member of this family. Let be the squared multiple correlation coefficient between the first and the succeeding p1 components, and let p2 = + be the squared multiple correlation coefficient between the first and the remaining p1 + p2 = p – 1 components of the p-variate normal vector. We shall consider here three testing problems for multivariate symmetrical distributions. They are (A) to test p2 =0 against; (B) to test against =0, 0; (C) to test against p2 =0. We have shown here that for problem (A) the uniformly most powerful invariant (UMPI) and locally minimax test for the multivariate normal is UMPI and is locally minimax as p2 0 for multivariate symmetrical distributions. For problem (B) the UMPI and locally minimax test is UMPI and locally minimax as for multivariate symmetrical distributions. For problem (C) the locally best invariant (LBI) and locally minimax test for the multivariate normal is also LBI and is locally minimax as for multivariate symmetrical distributions.

5.
A control procedure is presented for monitoring changes in variation for a multivariate normal process in a Phase II operation where the subgroup size, m, is less than p, the number of variates. The methodology is based on a form of Wilks' statistic, which can be expressed as a function of the ratio of the determinants of two separate estimates of the covariance matrix. One estimate is based on the historical data set from Phase I and the other is based on an augmented data set including new data obtained in Phase II. The proposed statistic is shown to be distributed as the product of independent beta distributions that can be approximated using either a chi-square or F-distribution. An ARL study of the statistic is presented for a range of conditions for the population covariance matrix. Cases are considered where a p-variate process is being monitored using a sample of m observations per subgroup and m < p. Data from an industrial multivariate process are used to illustrate the proposed technique.

6.
Suppose m and V are respectively the vector of expected values and the covariance matrix of the order statistics of a sample of size n from a continuous distribution F. A method is presented to calculate asymptotic values of functions of m and V⁻¹, for distributions F which are sufficiently regular. Values are given for the normal, logistic, and extreme-value distributions; also, for completeness, for the uniform and exponential distributions, although for these other methods must be used.

7.
Ahmad and von Rosen (2014) presented test statistics for sphericity and identity of the covariance matrix of a multivariate normal distribution when the dimension, p, exceeds the sample size, n. In this note, we show that their statistics are robust to the normality assumption when normality is replaced with certain mild assumptions on the traces of the covariance matrix. Under such assumptions, the test statistics are shown to follow the same asymptotic normal distribution as under normality for large p, also when p ≫ n. The asymptotic normality is proved using the theory of U-statistics, and is based on very general conditions, particularly avoiding any relationship between n and p.

8.
Many multivariate statistical procedures are based on the assumption of normality and different approaches have been proposed for testing this assumption. The vast majority of these tests, however, are exclusively designed for cases when the sample size n is larger than the dimension of the variable p, and the null distributions of their test statistics are usually derived under the asymptotic case when p is fixed and n increases. In this article, a test that utilizes principal components to test for nonnormality is proposed for cases when p/n → c. The power and size of the test are examined through Monte Carlo simulations, and it is argued that the test remains well behaved and consistent against most nonnormal distributions under this type of asymptotics.

9.
A Gaussian process (GP) can be thought of as an infinite collection of random variables with the property that any subset, say of dimension n, of these variables has a multivariate normal distribution of dimension n, mean vector β and covariance matrix Σ [O'Hagan, A., 1994, Kendall's Advanced Theory of Statistics, Vol. 2B, Bayesian Inference (John Wiley & Sons, Inc.)]. The elements of the covariance matrix are routinely specified through the multiplication of a common variance by a correlation function. It is important to use a correlation function that provides a valid covariance matrix (positive definite). Further, it is well known that the smoothness of a GP is directly related to the specification of its correlation function. Also, from a Bayesian point of view, a prior distribution must be assigned to the unknowns of the model. Therefore, when using a GP to model a phenomenon, the researcher faces two challenges: the need to specify a correlation function and a prior distribution for its parameters. In the literature there are many classes of correlation functions which provide a valid covariance structure. Also, there are many suggestions of prior distributions to be used for the parameters involved in these functions. We aim to investigate how sensitive GPs are to the (sometimes arbitrary) choices of their correlation functions. For this, we have simulated 25 sets of data, each of size 64, over the square [0, 5]×[0, 5] with a specific correlation function and fixed values of the GP's parameters. We then fit different correlation structures to these data, with different prior specifications, and check the performance of the adjusted models using different model comparison criteria.
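The construction described in this abstract (a common variance multiplied by a correlation function, with a positive-definiteness check) can be sketched as follows. This is only an illustration under stated assumptions: the exponential correlation function, the regular 8 × 8 grid, and the parameter values `sigma2` and `phi` are hypothetical choices and are not taken from the paper.

```python
import numpy as np

def exponential_correlation(dist, phi):
    """Exponential correlation function (an assumed choice): exp(-d / phi)."""
    return np.exp(-dist / phi)

def gp_covariance(points, sigma2, phi):
    """Covariance matrix = common variance * correlation of pairwise distances."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return sigma2 * exponential_correlation(dist, phi)

# 64 locations on an 8 x 8 grid over the square [0, 5] x [0, 5] (grid layout is assumed).
axis = np.linspace(0.0, 5.0, 8)
points = np.array([(x, y) for x in axis for y in axis])

cov = gp_covariance(points, sigma2=2.0, phi=1.5)

# Cholesky factorisation raises LinAlgError if the matrix is not positive definite,
# so it doubles as a validity check for the chosen correlation function.
chol = np.linalg.cholesky(cov)

# One zero-mean GP realisation at the 64 locations.
rng = np.random.default_rng(0)
sample = chol @ rng.standard_normal(len(points))
```

Swapping `exponential_correlation` for another valid correlation function changes the smoothness of the simulated surface, which is the kind of sensitivity the abstract sets out to study.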

10.
11.
This article is concerned with testing diagonality of a high-dimensional covariance matrix under non-normality, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared to the sphericity and identity hypotheses, the asymptotic properties under the diagonality hypothesis are more involved, and more care is needed in dealing with bias. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.

12.
The paper considers a significance test of regression variables in the high-dimensional linear regression model when the dimension of the regression variables p, together with the sample size n, tends to infinity. Under two slightly different cases, we prove that the likelihood ratio test statistic converges in distribution to a Gaussian random variable, and the explicit expressions of the asymptotic mean and covariance are also obtained. The simulations demonstrate that our high-dimensional likelihood ratio test method outperforms traditional methods in analyzing high-dimensional data.

13.
For two or more multivariate distributions with a common covariance matrix, test statistics for certain special structures of the common covariance matrix are presented when the dimension of the multivariate vectors may exceed the number of such vectors. The test statistics are constructed as functions of location‐invariant estimators defined as U‐statistics, and the corresponding asymptotic theory is used to derive the limiting distributions of the proposed tests. The properties of the test statistics are established under mild and practical assumptions, and are demonstrated numerically using simulation results with small or moderate sample sizes and large dimensions.

14.
Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's T² statistic is calculated for each new observation and an out‐of‐control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high‐dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave‐one‐out re‐sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling T² approach in power, and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days.
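To make the p > n problem concrete, the sketch below shows that the sample covariance matrix is rank deficient when p exceeds n, and that a regularised version still yields a usable T²-type statistic. The ridge-type regularisation (adding `ridge` times the identity) and the function name `regularised_t2` are assumptions for illustration only; the abstract does not say which regulariser the authors use, and the leave-one-out step is not reproduced here.

```python
import numpy as np

def regularised_t2(baseline, new_obs, ridge):
    """T^2-type distance of new_obs from the baseline mean,
    using a ridge-regularised sample covariance (assumed regulariser)."""
    mean = baseline.mean(axis=0)
    cov = np.cov(baseline, rowvar=False)
    cov_reg = cov + ridge * np.eye(cov.shape[0])  # invertible even when p > n
    diff = new_obs - mean
    return float(diff @ np.linalg.solve(cov_reg, diff))

rng = np.random.default_rng(1)
n, p = 20, 50                                   # more variables than observations
baseline = rng.standard_normal((n, p))

# The unregularised sample covariance has rank at most n - 1 < p, so it cannot be inverted.
rank = np.linalg.matrix_rank(np.cov(baseline, rowvar=False))

obs = rng.standard_normal(p)
in_control = regularised_t2(baseline, obs, ridge=0.5)
shifted = regularised_t2(baseline, obs + 3.0, ridge=0.5)  # mean shift in every variable
```

In practice the control limit for such a statistic would have to be calibrated empirically, which is the role the abstract assigns to its re-sampling step.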

15.
In this article, we consider the problem of testing (a) sphericity and (b) intraclass covariance structure under a growth curve model. The maximum likelihood estimator (MLE) for the mean in a growth curve model is a weighted estimator with the inverse of the sample covariance matrix which is unstable for large p close to N and singular for p larger than N. The MLE for the covariance matrix is based on the MLE for the mean, which can be very poor for p close to N. For both structures (a) and (b), we modify the MLE for the mean to an unweighted estimator and based on this estimator we propose a new estimator for the covariance matrix. This new estimator leads to new tests for (a) and (b). We also propose two other tests for each structure, which are just based on the sample covariance matrix.

To compare the performance of all four tests, we compute for each structure (a) and (b) the attained significance level and the empirical power. We show that one of the tests based on the sample covariance matrix is better than the likelihood ratio test based on the MLE.


16.
Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p_n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p_n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n≥20 and when dealing with p-variate data where p≤5. But with p=8 it can be unsatisfactory, even with n=200. A projection method as well as the minimum generalized variance method generally perform best, but with p≤5 conditions where the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.

17.
This paper derives first-order sampling moments of individual Mahalanobis distances (MDs) in cases when the dimension p of the variable is proportional to the sample size n. Asymptotic expected values when n, p → ∞ are derived under the assumption p/n → c, 0 ≤ c < 1. It is shown that some types of standard estimators remain unbiased in this case, while others are asymptotically biased, a property that appears to be unnoticed in the literature. Second-order moments are also supplied to give some additional insight into the matter.

18.
To bootstrap a regression problem, pairs of response and explanatory variables or residuals can be resampled, according to whether we believe that the explanatory variables are random or fixed. In the latter case, different residuals have been proposed in the literature, including the ordinary residuals (Efron 1979), standardized residuals (Bickel & Freedman 1983) and Studentized residuals (Weber 1984). Freedman (1981) has shown that the bootstrap from ordinary residuals is asymptotically valid when the number of cases increases and the number of variables is fixed. Bickel & Freedman (1983) have shown the asymptotic validity for ordinary residuals when the number of variables and the number of cases both increase, provided that the ratio of the two converges to zero at an appropriate rate. In this paper, the authors introduce the use of BLUS (Best Linear Unbiased with Scalar covariance matrix) residuals in bootstrapping regression models. The main advantage of the BLUS residuals, introduced in Theil (1965), is that they are uncorrelated. The main disadvantage is that only n − p residuals can be computed for a regression problem with n cases and p variables. The asymptotic results of Freedman (1981) and Bickel & Freedman (1983) for the ordinary (and standardized) residuals are generalized to the BLUS residuals. A small simulation study shows that even though only n − p residuals are available, in small samples bootstrapping BLUS residuals can be as good as, and sometimes better than, bootstrapping from standardized or Studentized residuals.
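The fixed-design case mentioned in this abstract, where residuals rather than pairs are resampled, can be sketched as below. The sketch uses ordinary least-squares residuals, not the BLUS residuals the paper proposes; the function name `residual_bootstrap`, the simulated data, and the coefficient values are all hypothetical.

```python
import numpy as np

def residual_bootstrap(X, y, n_boot, rng):
    """Bootstrap regression coefficients by resampling ordinary residuals
    while keeping the design matrix X fixed."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta_hat
    residuals = y - fitted
    residuals = residuals - residuals.mean()    # centre before resampling
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        # New response: fitted values plus residuals drawn with replacement.
        y_star = fitted + rng.choice(residuals, size=n, replace=True)
        draws[b], *_ = np.linalg.lstsq(X, y_star, rcond=None)
    return beta_hat, draws

rng = np.random.default_rng(2)
n = 50
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
true_beta = np.array([1.0, 2.0, -1.0])          # illustrative values only
y = X @ true_beta + 0.5 * rng.standard_normal(n)

beta_hat, draws = residual_bootstrap(X, y, n_boot=200, rng=rng)
boot_se = draws.std(axis=0)                     # bootstrap standard errors
```

Replacing the ordinary residuals with standardized, Studentized, or BLUS residuals changes only the vector passed to `rng.choice`; with BLUS residuals that vector would have n − p entries.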

19.
Consider the canonical-form MANOVA setup with X: n × p = (+ E, Xi ni × p, i = 1, 2, 3, Mi: ni × p, i = 1, 2, n1 + n2 + n3) p, where E is a normally distributed error matrix with mean zero and dispersion In (> 0 (positive definite). Assume (in contrast with the usual case) that M1i is normal with mean zero and dispersion In1) and M22 is either fixed or random normal with mean zero and different dispersion matrix In2 (being unknown. It is also assumed that M1, E, and M2 (if random) are all independent. For testing H0) = 0 versus H1: (> 0, it is shown that when either n2 = 0 or M2 is fixed if n2 > 0, the trace test of Pillai (1955) is uniformly most powerful invariant (UMPI) if min(n1, p) = 1 and locally best invariant (LBI) if min(n1, p) > 1 under the action of the full linear group Gl(p). When p > 1, the LBI test is also derived under a somewhat smaller group GT(p) of p × p lower triangular matrices with positive diagonal elements. However, such results do not hold if n2 > 0 and M2 is random. The null, nonnull, and optimality robustness of Pillai's trace test under Gl(p) for suitable deviations from normality is pointed out.

20.
The Dirichlet-multinomial model is considered as a model for cluster sampling. The model assumes that the design's covariance matrix is a constant times the covariance under multinomial sampling. The use of this model requires estimating a parameter C that measures the clustering effect. In this paper, a regression estimate for C is obtained. An approximate distribution of this estimator is obtained through the use of asymptotic techniques. A goodness-of-fit statistic for testing the fit of the Dirichlet-multinomial model is also obtained, based on those asymptotic techniques. These statistics provide a means of knowing when the data satisfy the model assumption. These results are used to analyze data concerning the authorship of Greek prose.
