This paper considers a significance test for regression variables in the high-dimensional linear regression model when the dimension of the regression variables p, together with the sample size n, tends to infinity. Under two slightly different cases, we prove that the likelihood ratio test statistic converges in distribution to a Gaussian random variable, and we obtain explicit expressions for the asymptotic mean and covariance. Simulations demonstrate that our high-dimensional likelihood ratio test outperforms traditional methods in analyzing high-dimensional data.

Under non-normality, this article is concerned with testing diagonality of a high-dimensional covariance matrix, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared to the sphericity and identity hypotheses, the asymptotic properties under the diagonality hypothesis are more involved, and the bias must be handled more carefully. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.

Statistical inference for high-dimensional precision matrices is as important as statistical inference for high-dimensional covariance matrices. In the literature, much attention has been paid to the latter, and significant advances have been achieved, especially in estimating and testing banded structure. This paper proposes a new test for banded structures of precision matrices without assuming any specific parametric distribution. The test is adapted to large p, small n problems, for which we derive the asymptotic distribution under the null hypothesis of bandedness. Simulation results show that the proposed test performs well with finite sample sizes. The test is applied to real data from a phone call centre.

Maclean et al. (1976) applied a specific Box-Cox transformation to test for mixtures of distributions against a single distribution. Their null hypothesis is that a sample of n observations is from a normal distribution with unknown mean and variance after a restricted Box-Cox transformation. The alternative is that the sample is from a mixture of two normal distributions, each with unknown mean and unknown, but equal, variance after another restricted Box-Cox transformation. We developed a computer program that calculated the maximum likelihood estimates (MLEs) and likelihood ratio test (LRT) statistic for the above. Our algorithm for the calculation of the MLEs of the unknown parameters used multiple starting points to protect against convergence to a local rather than global maximum. We then simulated the distribution of the LRT for samples drawn from a normal distribution and five Box-Cox transformations of a normal distribution. The null distribution appeared to be the same for the Box-Cox transformations studied and appeared to be distributed as a chi-square random variable for samples of 25 or more. The degrees of freedom parameter appeared to be a monotonically decreasing function of the sample size. The null distribution of this LRT appeared to converge to a chi-square distribution with 2.5 degrees of freedom. We estimated the critical values for the 0.10, 0.05, and 0.01 levels of significance.
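
As a rough illustration of the transformation family mentioned above, the following sketch implements the standard Box-Cox transform. The abstract does not give the exact "restricted" form used by Maclean et al., so the function name `box_cox`, the parameter name `lam`, and the sample values are illustrative assumptions only.

```python
import math


def box_cox(x, lam):
    """Standard Box-Cox transform of a positive value x with power lam.

    Assumption: this is the textbook form, not necessarily the restricted
    variant used in the study summarized above.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        # Limit of (x**lam - 1) / lam as lam -> 0.
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical sample, transformed with lam = 0.5 (a square-root-type transform).
sample = [1.0, 4.0, 9.0]
transformed = [box_cox(x, 0.5) for x in sample]
print(transformed)  # [0.0, 2.0, 4.0]
```

With `lam = 1` the transform only shifts the data by one, and with `lam = 0` it reduces to the natural logarithm, which is why the family is a common choice for moving skewed data toward normality.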

In this paper, we consider the problem of testing the mean vector in the multivariate setting where the dimension p is greater than the sample size n, namely a large p, small n problem. We propose a new test that is invariant under scalar transformations and derive its asymptotic null distribution and power under weaker conditions than those of Srivastava (2009). We also present numerical studies, including simulations and a real example of microarray data, with comparison to existing tests developed for large p, small n problems.

This paper investigates a new family of goodness-of-fit tests based on the negative exponential disparities. This family includes the popular Pearson's chi-square as a member and is a subclass of the general class of disparity tests (Basu and Sarkar, 1994), which also contains the family of power divergence statistics. Pitman efficiency and finite sample power comparisons between different members of this new family are made. Three asymptotic approximations of the exact null distributions of the negative exponential disparity family of tests are discussed. Some numerical results on the small sample performance of this family of tests are presented for the symmetric null hypothesis. It is shown that the negative exponential disparity family, like the power divergence family, produces a new goodness-of-fit test statistic that can be a very attractive alternative to Pearson's chi-square. Some numerical results suggest that application of this test statistic, as an alternative to Pearson's chi-square, could be preferable to the I^{2/3} statistic of Cressie and Read (1984) when chi-square critical values are used.

Two new statistics are proposed for testing the identity of a high-dimensional covariance matrix. Applying large dimensional random matrix theory, we study the asymptotic distributions of the proposed statistics when the dimension p and the sample size n tend to infinity proportionally. The proposed tests can accommodate situations in which the data dimension is much larger than the sample size and in which the population distribution is non-Gaussian. Numerical studies demonstrate that the proposed tests have good empirical power for a wide range of dimensions and sample sizes.

Generalized variance is a measure of dispersion of multivariate data. Comparing the dispersion of multivariate data is a common problem in multivariate quality control, generalized homogeneity of multidimensional scatter, etc. In this article, Bartlett's modified likelihood ratio test (BMLRT) is proposed for testing equality of the generalized variances of k multivariate normal populations. Simulations are performed to compare the Type I error rate and power of the BMLRT and the likelihood ratio test (LRT). These simulations show that the BMLRT has a better chi-square approximation under the null hypothesis. Finally, a practical example is given.
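
For readers unfamiliar with the term, the generalized variance of a sample is usually taken to be the determinant of its sample covariance matrix. The sketch below computes it with the standard library only; the helper names (`sample_covariance`, `determinant`, `generalized_variance`) and the toy data are illustrative assumptions, and the code does not implement the BMLRT itself.

```python
def sample_covariance(data):
    """Unbiased sample covariance matrix of rows of observations."""
    n = len(data)
    p = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    cov = [[0.0] * p for _ in range(p)]
    for row in data:
        dev = [row[j] - means[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += dev[i] * dev[j] / (n - 1)
    return cov


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0  # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def generalized_variance(data):
    """Determinant of the sample covariance matrix."""
    return determinant(sample_covariance(data))


# Hypothetical bivariate sample with three observations.
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
print(generalized_variance(data))  # 0.75
```

Note that when the number of observations does not exceed the dimension, the sample covariance matrix is singular and the generalized variance computed this way is zero.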

We study the problem of testing H0 : μ ∈ P against H1 : μ ∉ P, based on a random sample of N observations from a p-dimensional normal distribution Np(μ, Σ) with Σ > 0 and P a closed convex positively homogeneous set. We develop the likelihood-ratio test (LRT) for this problem. We show that the union-intersection principle leads to a test equivalent to the LRT. It also gives a large class of tests which are shown to be admissible by Stein's theorem (1956). Finally, we give the α-level cutoff points for the LRT.

For high-dimensional data, it is a tedious task to determine anomalies such as outliers. We present a novel outlier detection method for high-dimensional contingency tables. We use the class of decomposable graphical models to model the relationship among the variables of interest, which can be depicted by an undirected graph called the interaction graph. Given an interaction graph, we derive a closed-form expression of the likelihood ratio test (LRT) statistic and an exact distribution for efficient simulation of the test statistic. An observation is declared an outlier if it deviates significantly from the approximated distribution of the test statistic under the null hypothesis. We demonstrate the use of the LRT outlier detection framework on genetic data modeled by Chow–Liu trees.

A. Roy, D. Klein. Statistics, 2018, 52(2): 393-408
Testing hypotheses about the structure of a covariance matrix for doubly multivariate data is often considered in the literature. In this paper, Rao's score test (RST) is derived to test the block exchangeable covariance matrix or block compound symmetry (BCS) covariance structure under the assumption of multivariate normality. It is shown that the empirical distribution of the RST statistic under the null hypothesis is independent of the true values of the mean and the matrix components of a BCS structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Simulation studies are performed to examine sample size considerations and to estimate the empirical quantiles of the null distribution of the test statistic. The RST procedure is illustrated on a real data set from medical studies.

In high-dimensional data, one often seeks a few interesting low-dimensional projections which reveal important aspects of the data. Projection pursuit for classification finds projections that reveal differences between classes. Even though projection pursuit is used to bypass the curse of dimensionality, most indexes will not work well when there are a small number of observations relative to the number of variables, known as a large p (dimension) small n (sample size) problem. This paper discusses the relationship between the sample size and dimensionality on classification and proposes a new projection pursuit index that overcomes the problem of small sample size for exploratory classification.

In their recent work, Jiang and Yang studied six classical likelihood ratio test statistics in the high-dimensional setting. Assuming that a random sample of size n is observed from a p-dimensional normal population, they derive central limit theorems (CLTs) when p and n are proportional to each other, which differ from the classical chi-square limits obtained as n goes to infinity while p remains fixed. In this paper, by developing a new tool, we prove that the six CLTs hold in a more widely applicable setting: p goes to infinity, and p can be very close to n. This is an almost necessary and sufficient condition for the CLTs. Simulated histograms, comparisons of sizes and powers with those of the classical chi-square approximations, and discussions are presented afterwards.

Many multivariate statistical procedures are based on the assumption of normality and different approaches have been proposed for testing this assumption. The vast majority of these tests, however, are exclusively designed for cases when the sample size n is larger than the dimension of the variable p, and the null distributions of their test statistics are usually derived under the asymptotic case when p is fixed and n increases. In this article, a test that utilizes principal components to test for nonnormality is proposed for cases when p/n → c. The power and size of the test are examined through Monte Carlo simulations, and it is argued that the test remains well behaved and consistent against most nonnormal distributions under this type of asymptotics.

The classical unconditional exact p-value test can be used to compare two multinomial distributions with small samples. This general hypothesis requires parameter estimation under the null, which makes the test severely conservative. A similar property has been observed for Fisher's exact test, with Barnard and Boschloo providing distinct adjustments that produce more powerful testing approaches. In this study, we develop a novel adjustment for the conservativeness of the unconditional multinomial exact p-value test that produces a nominal type I error rate and increased power in comparison to all alternative approaches. We used a large simulation study to empirically estimate the 5th percentiles of the distributions of the p-values of the exact test over a range of scenarios and implemented a regression model to predict the values for two-sample multinomial settings. Our results show that the new test is uniformly more powerful than Fisher's, Barnard's, and Boschloo's tests, with gains in power as large as several hundred percent in certain scenarios. Lastly, we provide a real-life data example where the unadjusted unconditional exact test wrongly fails to reject the null hypothesis and the corrected unconditional exact test rejects the null appropriately.

Large sample tests for the standard Tobit model versus the p-Tobit model by Deaton and Irish (1984) are studied. The normalized one-tailed score test by Deaton and Irish (1984) is shown to be a version of Neyman's C(α) test that is valid for the non-standard problem of the null hypothesis lying on the boundary of the parameter space. Then, this paper reports the results of Monte Carlo experiments designed to study the small sample performance of large sample tests for the standard Tobit specification versus the p-Tobit specification.

A control procedure is presented for monitoring changes in variation for a multivariate normal process in a Phase II operation where the subgroup size, m, is less than p, the number of variates. The methodology is based on a form of Wilks' statistic, which can be expressed as a function of the ratio of the determinants of two separate estimates of the covariance matrix. One estimate is based on the historical data set from Phase I and the other is based on an augmented data set including new data obtained in Phase II. The proposed statistic is shown to be distributed as the product of independent beta distributions that can be approximated using either a chi-square or F-distribution. An ARL study of the statistic is presented for a range of conditions for the population covariance matrix. Cases are considered where a p-variate process is being monitored using a sample of m observations per subgroup and m < p. Data from an industrial multivariate process are used to illustrate the proposed technique.

Test statistics for sphericity and identity of the covariance matrix are presented, when the data are multivariate normal and the dimension, p, can exceed the sample size, n. Under certain mild conditions mainly on the traces of the unknown covariance matrix, and using the asymptotic theory of U-statistics, the test statistics are shown to follow an approximate normal distribution for large p, also when p ≫ n. The accuracy of the statistics is shown through simulation results, particularly emphasizing the case when p can be much larger than n. A real data set is used to illustrate the application of the proposed test statistics.
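
To make the sphericity hypothesis concrete, the sketch below computes John's classical statistic U = (1/p) tr[(S / ((1/p) tr S) - I)^2], which is zero exactly when the covariance matrix S is a multiple of the identity. This is a swapped-in, well-known measure and not the U-statistic-based statistics of the abstract above, whose exact form is not given here; the function name `john_u` and the example matrices are illustrative assumptions.

```python
def john_u(cov):
    """John's sphericity measure for a p x p covariance matrix.

    Assumption: cov is a symmetric matrix (list of lists) with positive trace.
    Returns 0.0 when cov is proportional to the identity matrix.
    """
    p = len(cov)
    mean_var = sum(cov[i][i] for i in range(p)) / p
    # D = cov / mean_var - I
    d = [
        [cov[i][j] / mean_var - (1.0 if i == j else 0.0) for j in range(p)]
        for i in range(p)
    ]
    # tr(D @ D) equals the sum over i, j of d[i][j] * d[j][i].
    trace_sq = sum(d[i][j] * d[j][i] for i in range(p) for j in range(p))
    return trace_sq / p


print(john_u([[5.0, 0.0], [0.0, 5.0]]))  # 0.0 (spherical)
print(john_u([[1.0, 0.0], [0.0, 3.0]]))  # 0.25 (unequal variances)
```

Because the matrix is rescaled by its average variance, the measure is unaffected by multiplying the whole covariance matrix by a constant, so it separates the shape of the covariance from its overall scale.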

A test for homogeneity of g ≥ 2 covariance matrices is presented when the dimension, p, may exceed the sample size, n_i, i = 1, …, g, and the populations may not be normal. Under some mild assumptions on covariance matrices, the asymptotic distribution of the test is shown to be normal when n_i, p → ∞. Under the null hypothesis, the test is extended for common covariance matrix to be of a specified structure, including sphericity. Theory of U-statistics is employed in constructing the tests and deriving their limits. Simulations are used to show the accuracy of tests.
