1.

A modified C_p statistic in a system-of-equations model

Vichit Lorchirachoonkul Jirawan Jitthavech 《Journal of statistical planning and inference》2012

A new statistic, SΓ(p), is developed for variable selection in a system-of-equations model. The standardized total mean square error in the SΓ(p) statistic is weighted by the covariance matrix of the dependent variables instead of the error covariance matrix of the true model used in the original definition. The new statistic can also be used for model selection among non-nested models. The estimate of SΓ(p), SC(p), is derived and shown to reduce to SC_ε(p), which has a form similar to that of C_p in a single-equation model, when the covariance matrix of the sampled dependent variables is replaced by the error covariance matrix under the full model.

2.

Using a Truncated C_p Statistic for Variable Selection in Multiple Linear Regression

D. W. Uys S. J. Steel 《Communications in Statistics - Simulation and Computation》2013,42(2):420-432

In multiple linear regression analysis each lower-dimensional subspace L of a known linear subspace M of ℝⁿ corresponds to a nonempty subset of the columns of the regressor matrix. For a fixed subspace L, the C_p statistic is an unbiased estimator of the mean square error if the projection of the response vector onto L is used to estimate the expected response. In this article, we consider two truncated versions of the C_p statistic that can also be used to estimate this mean square error. The C_p statistic and its truncated versions are compared in two example data sets, illustrating that use of the truncated versions may result in models different from those selected by standard C_p.
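For orientation, the standard (untruncated) statistic can be sketched as follows. This is a minimal illustration assuming the common textbook form C_p = RSS_p / s² − n + 2p, with s² estimated from the full model; it is not the article's truncated version, and the helper names (`solve`, `rss`, `mallows_cp`) and the data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(x, y):
    """Residual sum of squares after projecting y onto the columns of x."""
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def mallows_cp(x_full, y, cols):
    """Textbook C_p for the submodel that uses the columns listed in cols."""
    n, p_full = len(y), len(x_full[0])
    s2 = rss(x_full, y) / (n - p_full)  # error variance estimate from the full model
    x_sub = [[r[j] for j in cols] for r in x_full]
    return rss(x_sub, y) / s2 - n + 2 * len(cols)


# Hypothetical data: an intercept column plus two regressors.
X_FULL = [[1, 1, 2], [1, 2, 1], [1, 3, 4], [1, 4, 3], [1, 5, 6], [1, 6, 5]]
Y = [4.3, 3.6, 8.4, 7.9, 11.8, 12.5]

cp_full = mallows_cp(X_FULL, Y, [0, 1, 2])  # full model
cp_sub = mallows_cp(X_FULL, Y, [0, 1])      # submodel without the last regressor
```

Under this form the full model always gives C_p equal to its own number of columns, so only the values for proper submodels carry information for selection.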

3.

A model selection criterion for discriminant analysis of high-dimensional data with fewer observations

Masashi Hyodo Takayuki Yamada Muni S. Srivastava 《Journal of statistical planning and inference》2012

This paper is concerned with the problem of selecting variables in two-group discriminant analysis for high-dimensional data with fewer observations than the dimension. We consider a selection criterion based on an approximately unbiased estimator of an AIC-type risk. When the dimension is large compared to the sample size, the AIC-type risk cannot be defined. We propose an AIC obtained by replacing the maximum likelihood estimator with a ridge-type estimator. This idea follows Srivastava and Kubokawa (2008) and was further extended by Yamamura et al. (2010). Simulations show that the proposed AIC performs well.

4.

Testing diagonality of high-dimensional covariance matrix under non-normality

Kai Xu 《Journal of Statistical Computation and Simulation》2017,87(16):3208-3224

Under non-normality, this article is concerned with testing diagonality of a high-dimensional covariance matrix, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared to the sphericity and identity hypotheses, the asymptotic properties of the diagonality hypothesis are more involved, and the bias must be handled more carefully. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.

5.

Central limit theorems for functionals of large sample covariance matrix and mean vector in matrix‐variate location mixture of normal distributions

Taras Bodnar Stepan Mazur Nestor Parolya 《Scandinavian Journal of Statistics》2019,46(2):636-660

In this paper, we consider the asymptotic distributions of functionals of the sample covariance matrix and the sample mean vector obtained under the assumption that the matrix of observations has a matrix‐variate location mixture of normal distributions. The central limit theorem is derived for the product of the sample covariance matrix and the sample mean vector. Moreover, we consider the product of the inverse sample covariance matrix and the mean vector for which the central limit theorem is established as well. All results are obtained under the large‐dimensional asymptotic regime, where the dimension p and the sample size n approach infinity such that p/n → c ∈ [0, +∞) when the sample covariance matrix does not need to be invertible and p/n → c ∈ [0, 1) otherwise.

6.

Monitoring Variation in a Multivariate Process When the Dimension is Large Relative to the Sample Size

Robert L. Mason Youn-Min Chou John C. Young 《Communications in Statistics - Theory and Methods》2013,42(6):939-951

A control procedure is presented for monitoring changes in variation for a multivariate normal process in a Phase II operation where the subgroup size, m, is less than p, the number of variates. The methodology is based on a form of Wilks' statistic, which can be expressed as a function of the ratio of the determinants of two separate estimates of the covariance matrix. One estimate is based on the historical data set from Phase I and the other is based on an augmented data set including new data obtained in Phase II. The proposed statistic is shown to be distributed as the product of independent beta distributions that can be approximated using either a chi-square or F-distribution. An ARL study of the statistic is presented for a range of conditions for the population covariance matrix. Cases are considered where a p-variate process is being monitored using a sample of m observations per subgroup and m < p. Data from an industrial multivariate process are used to illustrate the proposed technique.

7.

Testing homogeneity of several covariance matrices and multi-sample sphericity for high-dimensional data under non-normality

M. Rauf Ahmad 《Communications in Statistics - Theory and Methods》2017,46(8):3738-3753

A test for homogeneity of g ≥ 2 covariance matrices is presented when the dimension, p, may exceed the sample size, n_i, i = 1, …, g, and the populations may not be normal. Under some mild assumptions on covariance matrices, the asymptotic distribution of the test is shown to be normal when n_i, p → ∞. Under the null hypothesis, the test is extended to the case where the common covariance matrix has a specified structure, including sphericity. The theory of U-statistics is employed in constructing the tests and deriving their limits. Simulations are used to show the accuracy of the tests.

8.

Akaike Information Criterion for Selecting Variables in the Nested Error Regression Model

Tatsuya Kubokawa Muni S. Srivastava 《Communications in Statistics - Theory and Methods》2013,42(15):2626-2642

The Akaike Information Criterion (AIC) is developed for selecting the variables of the nested error regression model where an unobservable random effect is present. Using the idea of decomposing the likelihood into two parts of “within” and “between” analysis of variance, we derive the AIC when the number of groups is large and the ratio of the variances of the random effects and the random errors is an unknown parameter. The proposed AIC is compared, using simulation, with Mallows' C_p, Akaike's AIC, and Sugiura's exact AIC. Based on the rates of selecting the true model, it is shown that the proposed AIC performs better.

9.

A new test for the mean vector in large dimension and small samples

Junguang Zhao 《Communications in Statistics - Simulation and Computation》2017,46(8):6115-6128

In this article, we consider the problem of testing the mean vector in the multivariate normal distribution, where the dimension p is greater than the sample size N. We propose a new test T_Block and obtain its asymptotic distribution. We also compare the proposed test with two other tests. The simulation results suggest that the performance of the new test is comparable to that of the two existing tests, and under some circumstances it may have higher power. Therefore, the new statistic can be employed in practice as an alternative choice.

10.

Decomposition of Scatter Ratios Used in Monitoring Multivariate Process Variability

Robert L. Mason Youn-Min Chou John C. Young 《Communications in Statistics - Theory and Methods》2013,42(12):2128-2145

Wilks’ ratio statistic can be defined in terms of the ratio of the sample generalized variances of two non-independent estimators of the same covariance matrix. Recently this statistic has been proposed as a control statistic for monitoring changes in the covariance matrix of a multivariate normal process in a Phase II situation, particularly when the dimension is larger than the sample size. In this article we derive a technique for decomposing Wilks’ ratio statistic into the product of independent factors that can be associated with the components of the covariance matrix. With these results, we demonstrate that, when a signal is detected in a control procedure for the Phase II monitoring of process variability using the ratio statistic, the signaling value can be decomposed and the process variables contributing to the signal can be specifically identified.

11.

Testing identity of high-dimensional covariance matrix

Hao Wang Baisen Liu Ning-Zhong Shi Shurong Zheng 《Journal of Statistical Computation and Simulation》2018,88(13):2600-2611

Two new statistics are proposed for testing the identity of a high-dimensional covariance matrix. Applying large-dimensional random matrix theory, we study the asymptotic distributions of the proposed statistics when the dimension p and the sample size n tend to infinity proportionally. The proposed tests can accommodate the situation in which the data dimension is much larger than the sample size, and the situation in which the population distribution is non-Gaussian. The numerical studies demonstrate that the proposed tests have good empirical power for a wide range of dimensions and sample sizes.

12.

On the sufficient statistics for multivariate ARMA models: approximate approach

M. Kharrati-Kopaei A. R. Nematollahi Z. Shishebor 《Statistical Papers》2009,50(2):261-276

This paper investigates the sufficient statistic for the parameters of vector-valued (multivariate) ARMA models when a finite sample is available. In the simplest case, ARMA(1,1), by using the factorization theorem, we present a sufficient statistic whose dimension depends on the sample size and is even larger than the sample size. In this case and under some restrictions, we have solved this problem and have presented a sufficient statistic whose dimension does not depend on the sample size. In the general case, due to the complexity of the problem, we use modified versions of the likelihood function to find an approximate sufficient statistic in terms of the periodogram. The dimension of this sufficient statistic depends on the sample size; however, this dimension is much lower than the sample size.

13.

Tests for high-dimensional covariance matrices using the theory of U-statistics

M. Rauf Ahmad D. von Rosen 《Journal of Statistical Computation and Simulation》2015,85(13):2619-2631

Test statistics for sphericity and identity of the covariance matrix are presented, when the data are multivariate normal and the dimension, p, can exceed the sample size, n. Under certain mild conditions mainly on the traces of the unknown covariance matrix, and using the asymptotic theory of U-statistics, the test statistics are shown to follow an approximate normal distribution for large p, also when p ≫ n. The accuracy of the statistics is shown through simulation results, particularly emphasizing the case when p can be much larger than n. A real data set is used to illustrate the application of the proposed test statistics.

14.

High dimensional asymptotics for the naive Hotelling T² statistic in pattern recognition

Mitsuru Tamatani Kanta Naito 《Communications in Statistics - Theory and Methods》2013,42(22):5637-5656

This paper examines the high dimensional asymptotics of the naive Hotelling T² statistic. Naive Bayes has been utilized in high dimensional pattern recognition as a method to avoid singularities in the estimated covariance matrix. The naive Hotelling T² statistic, which is equivalent to the estimator of the naive canonical correlation, is a statistically important quantity in naive Bayes and its high dimensional behavior has been studied under several conditions. In this paper, asymptotic normality of the naive Hotelling T² statistic under a high dimension low sample size setting is developed using the central limit theorem of a martingale difference sequence.

15.

M. Roozbeh M. Arashi M. Gasparini 《Communications in Statistics - Theory and Methods》2013,42(8):1364-1386

This article is concerned with the problem of multicollinearity in the linear part of a seemingly unrelated semiparametric (SUS) model. It is also suspected that some additional nonstochastic linear constraints hold on the whole parameter space. We propose semiparametric ridge-type and non-ridge-type estimators that incorporate the restricted least squares method in the model under study. For practical purposes, the covariance matrix of the error terms is assumed to be unknown, so feasible estimators are proposed and their asymptotic distributional properties are derived. Also, necessary and sufficient conditions for the superiority of the ridge-type estimator over the non-ridge-type estimator for selecting the ridge parameter K are derived. Lastly, a Monte Carlo simulation study is conducted to estimate the parametric and nonparametric parts. In this regard, kernel smoothing and cross-validation methods are used to estimate the nonparametric function.

16.

ROBUST RIDGE REGRESSION BASED ON AN M-ESTIMATOR

MERVYN J. SILVAPULLE 《Australian & New Zealand Journal of Statistics》1991,33(3):319-333

Consider the linear regression model y = β₀1 + Xβ + ε in the usual notation. It is argued that the class of ordinary ridge estimators obtained by shrinking the least squares estimator by the matrix (X'X + kI)⁻¹X'X is sensitive to outliers in the y-variable. To overcome this problem, we propose a new class of ridge-type M-estimators, obtained by shrinking an M-estimator (instead of the least squares estimator) by the same matrix. Since the optimal value of the ridge parameter k is unknown, we suggest a procedure for choosing it adaptively. In a reasonably large-scale simulation study with a particular M-estimator, we found that if the conditions are such that the M-estimator is more efficient than the least squares estimator, then the corresponding ridge-type M-estimator proposed here is better, in terms of a mean squared error criterion, than the ordinary ridge estimator with k chosen suitably. An example illustrates that the estimators proposed here are less sensitive to outliers in the y-variable than ordinary ridge estimators.
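The shrinkage step described in this abstract can be sketched as follows. This is a minimal illustration of the ordinary ridge estimator only (least squares shrunk by (X'X + kI)⁻¹X'X), assuming centered data so that the intercept drops out; the M-estimator variant and the adaptive choice of k are not reproduced, and all function names and data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gram(x):
    """Return X'X for a matrix X stored as a list of rows."""
    p = len(x[0])
    return [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]


def cross(x, y):
    """Return X'y."""
    return [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(len(x[0]))]


def add_ridge(xtx, k):
    """Return X'X + kI."""
    return [[v + (k if i == j else 0.0) for j, v in enumerate(row)]
            for i, row in enumerate(xtx)]


def ridge_by_shrinkage(x, y, k):
    """Shrink the least squares estimate by (X'X + kI)^{-1} X'X."""
    xtx = gram(x)
    beta_ls = solve(xtx, cross(x, y))
    rhs = [sum(a * b for a, b in zip(row, beta_ls)) for row in xtx]  # X'X beta_ls
    return solve(add_ridge(xtx, k), rhs)


# Hypothetical centered data (each column of X and y sums to zero).
X = [[-2, -1], [-1, -2], [0, 1], [1, 0], [2, 2]]
Y = [-3.1, -2.8, 0.9, 1.2, 3.8]

beta_ls = ridge_by_shrinkage(X, Y, 0.0)     # k = 0 gives back least squares
beta_ridge = ridge_by_shrinkage(X, Y, 0.5)  # k = 0.5 is an arbitrary illustrative value
```

Replacing `beta_ls` inside `ridge_by_shrinkage` with a robust estimate would correspond to the ridge-type M-estimator idea, though the specific M-estimator used in the article is not stated here.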

17.

Statistical analysis of process capability indices with measurement errors: The case of C_p

Silvano Bordignon Michele Scagliarini 《Statistical Methods and Applications》2001,10(1-3):273-285

Process capability indices (PCIs) have been widely used in manufacturing industries to provide a quantitative measure of process potential and performance. While some efforts have been dedicated in the literature to the statistical properties of PCI estimators, scarce attention has been given to the evaluation of these properties when sample data are affected by measurement errors. In this work we deal with the effects of measurement errors on the performance of PCIs. The analysis is illustrated with reference to C_p, i.e. the simplest and most common measure suggested to evaluate process capability. The authors would like to thank two anonymous referees for their comments and suggestions, which were useful in the preparation and improvement of this paper. This work was partially supported by a MURST research grant.
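As a small illustration of the index discussed here, the sketch below computes C_p using its usual definition, (USL − LSL) / (6σ), and a plug-in estimate based on the sample standard deviation. The `cp_observed` function assumes an additive measurement error that is independent of the true value, which is an assumption made for this example rather than the article's stated model; all names and numbers are hypothetical.

```python
import math
import statistics


def cp_index(lsl, usl, sigma):
    """Process capability C_p = (USL - LSL) / (6 * sigma)."""
    if usl <= lsl:
        raise ValueError("usl must exceed lsl")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (usl - lsl) / (6.0 * sigma)


def cp_estimate(sample, lsl, usl):
    """Plug-in estimate: sigma is replaced by the sample standard deviation."""
    return cp_index(lsl, usl, statistics.stdev(sample))


def cp_observed(lsl, usl, sigma, sigma_error):
    """C_p computed from error-contaminated data.

    Assumes observed = true + error with independent error, so the
    observed standard deviation is sqrt(sigma^2 + sigma_error^2).
    """
    return cp_index(lsl, usl, math.sqrt(sigma ** 2 + sigma_error ** 2))


# Hypothetical measurements and specification limits, for illustration only.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
cp_hat = cp_estimate(measurements, lsl=9.4, usl=10.6)
```

Under the additive-error assumption the observed spread is never smaller than the true spread, so an index computed from contaminated data cannot exceed the index of the underlying process.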

18.

The likelihood ratio test for high-dimensional linear regression model

Junshan Xie 《Communications in Statistics - Theory and Methods》2017,46(17):8479-8492

The paper considers a significance test of regression variables in the high-dimensional linear regression model when the dimension of the regression variables p, together with the sample size n, tends to infinity. Under two slightly different cases, we prove that the likelihood ratio test statistic converges in distribution to a Gaussian random variable, and explicit expressions for the asymptotic mean and covariance are also obtained. The simulations demonstrate that our high-dimensional likelihood ratio test outperforms traditional methods in analyzing high-dimensional data.

19.

Heteroskedasticity-consistent covariance matrix estimation: White's estimator and the bootstrap

《Journal of Statistical Computation and Simulation》2012,82(4):391-411

This paper considers the issue of estimating the covariance matrix of ordinary least squares estimates in a linear regression model when heteroskedasticity is suspected. We perform Monte Carlo simulation on the White estimator, which is commonly used in empirical research, and also on some alternatives based on different bootstrapping schemes. Our results reveal that the White estimator can be considerably biased when the sample size is not very large, that bias correction via bootstrap does not work well, and that the weighted bootstrap estimators tend to display smaller biases than the White estimator and its variants, under both homoskedasticity and heteroskedasticity. Our results also reveal that the presence of (potentially) influential observations in the design matrix plays an important role in the finite-sample performance of the heteroskedasticity-consistent estimators.

20.

Estimation of the process incapability index

K.S. Chen 《Communications in Statistics - Theory and Methods》2013,42(5):1263-1274

Greenwich and Jahr-Schaffrath (1995) introduced a new index C_pp, a simple transformation of the index C_pm, which provides an uncontaminated separation between information concerning process accuracy and process precision. Under the assumption of normality, we first show that the estimators of C_pp proposed by Greenwich and Jahr-Schaffrath (1995) are UMVU estimators. We also show that for the inaccuracy index, the variance of the unbiased estimator is smaller than the mean squared error (MSE) of the natural (biased) estimator for n > 3. In addition, we obtain the r-th moment and the probability density function of these estimators.