Found 20 similar articles (search time: 15 ms)
1.
Classical multivariate methods are often based on the sample covariance matrix, which is very sensitive to outlying observations. One alternative to the covariance matrix is the affine equivariant rank covariance matrix (RCM) studied in Visuri et al. [2003. Affine equivariant multivariate rank methods. J. Statist. Plann. Inference 114, 161–185]. In this article we assume that the covariance matrix is partially known and study how to estimate the corresponding RCM. We use the properties that the RCM is affine equivariant and that the RCM is proportional to the inverse of the regular covariance matrix, and hence reduce the problem of estimating the original RCM to estimating marginal rank covariance matrices. This is a great computational advantage when the dimension of the original data vector is large.
2.
In this paper we review some recent developments in high-dimensional data analysis, especially the estimation of covariance and precision matrices, asymptotic results on the eigenstructure in principal components analysis, and some related issues such as tests of the equality of two covariance matrices, determination of the number of principal components, and detection of hubs in a complex network.
3.
Hao Wang Baisen Liu Ning-Zhong Shi Shurong Zheng 《Journal of Statistical Computation and Simulation》2018,88(13):2600-2611
Two new statistics are proposed for testing the identity of a high-dimensional covariance matrix. Applying large-dimensional random matrix theory, we study the asymptotic distributions of the proposed statistics when the dimension p and the sample size n tend to infinity proportionally. The proposed tests can accommodate the situation in which the data dimension is much larger than the sample size, and the situation in which the population distribution is non-Gaussian. Numerical studies demonstrate that the proposed tests have good empirical power for a wide range of dimensions and sample sizes.
4.
5.
Guoyou Qin 《Journal of applied statistics》2015,42(6):1240-1254
In this paper, we study estimation of linear models in the framework of longitudinal data with dropouts. Under the assumptions that random errors follow an elliptical distribution and that all subjects share the same within-subject covariance matrix, which does not depend on covariates, we develop a robust method for simultaneous estimation of the mean and covariance. The proposed method is robust against outliers and does not require modeling the covariance or the missing data process. Theoretical properties of the proposed estimator are established, and simulation studies show its good performance. Finally, the proposed method is applied to a real data analysis for illustration.
6.
This article evaluates the economic benefit of methods that have been suggested to optimally sample (in an MSE sense) high-frequency return data for the purpose of realized variance/covariance estimation in the presence of market microstructure noise (Bandi and Russell, 2005a, 2008). We compare certainty equivalents derived from volatility-timing trading strategies relying on optimally-sampled realized variances and covariances, on realized variances and covariances obtained by sampling every 5 minutes, and on realized variances and covariances obtained by sampling every 15 minutes. In our sample, we show that a risk-averse investor who is given the option of choosing variance/covariance forecasts derived from MSE-based optimal sampling methods versus forecasts obtained from 5- and 15-minute intervals (as generally proposed in the literature) would be willing to pay up to about 80 basis points per year to achieve the level of utility that is guaranteed by optimal sampling. We find that the gains yielded by optimal sampling are economically large, statistically significant, and robust to realistic transaction costs.
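The sampling-frequency trade-off described in this abstract can be illustrated with a minimal sketch. It assumes a simulated one-second log-price path contaminated by i.i.d. noise; the function name `realized_variance`, the noise level, and the daily variance are hypothetical choices for illustration, and the MSE-optimal sampling rule of Bandi and Russell is not implemented here.

```python
import numpy as np

def realized_variance(log_prices, step):
    """Sum of squared log returns when sampling every `step` observations."""
    sampled = log_prices[::step]
    returns = np.diff(sampled)
    return float(np.sum(returns ** 2))

# Simulated trading day (assumption): 23400 one-second observations.
rng = np.random.default_rng(1)
n_seconds = 23400
true_daily_var = 1e-4  # hypothetical integrated variance for the day

# Efficient log price: random walk whose increments sum to the daily variance.
efficient = np.cumsum(
    rng.normal(scale=np.sqrt(true_daily_var / n_seconds), size=n_seconds)
)
# Observed log price: efficient price plus i.i.d. microstructure noise.
observed = efficient + rng.normal(scale=5e-4, size=n_seconds)

rv_1s = realized_variance(observed, 1)       # every second
rv_5min = realized_variance(observed, 300)   # every 5 minutes
rv_15min = realized_variance(observed, 900)  # every 15 minutes

print(f"1 s: {rv_1s:.6f}  5 min: {rv_5min:.6f}  15 min: {rv_15min:.6f}")
```

Under these assumptions the one-second estimate is dominated by the noise term, which accumulates with the number of sampled returns, while sparser sampling reduces that bias at the cost of using fewer observations.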
7.
8.
Jin Hyun Nam 《Communications in Statistics - Simulation and Computation》2017,46(3):1796-1807
Among many classification methods, linear discriminant analysis (LDA) is a favored tool due to its simplicity, robustness, and predictive accuracy, but when the number of genes is larger than the number of observations it cannot be applied directly because the within-class covariance matrix is singular. Diagonal LDA (DLDA) is a simpler model than LDA and performs better in some cases. In reality, however, DLDA requires a strong assumption of mutual independence. In this article, we propose the modified LDA (MLDA). MLDA is based on independence, but uses the information in the dependence structure that affects classification performance. We suggest two approaches: one uses gene rank, and the other does not. We found that MLDA performs better than LDA, DLDA, or K-nearest neighbors and is comparable with support vector machines in real data analysis and the simulation study.
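The singularity issue mentioned in this abstract, and the diagonal workaround, can be sketched as follows. This is a minimal illustration on simulated Gaussian data, assuming equal class priors; the helper `dlda_predict` is a hypothetical name, and the MLDA method proposed in the article is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, 50  # fewer observations than variables (assumed sizes)
X = rng.normal(size=(n, p))

# Sample covariance is p x p but has rank at most n - 1, so it is singular.
S = np.cov(X, rowvar=False)
rank = np.linalg.matrix_rank(S)

# Diagonal LDA keeps only the per-variable variances, which are invertible.
diag_var = np.var(X, axis=0, ddof=1)

def dlda_predict(x, class_means, pooled_var):
    """Assign x to the class with the smallest variance-scaled squared distance.

    Assumes equal class priors and a shared diagonal covariance.
    """
    scores = [np.sum((x - m) ** 2 / pooled_var) for m in class_means]
    return int(np.argmin(scores))

print(f"rank of S: {rank} (dimension p = {p})")
```

Because the diagonal estimate ignores all correlations, it stays usable when p exceeds n, which is the independence assumption the abstract describes as strong.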
9.
Thomas J. Fisher 《Journal of statistical planning and inference》2012,142(1):312-326
This article explores the problem of testing the hypothesis that the covariance matrix is an identity matrix when the dimensionality is equal to the sample size or larger. Two new test statistics are proposed under assumptions comparable to those of the statistics in the literature. The asymptotic distributions of the proposed test statistics are found, and the tests are shown to be consistent in the general asymptotic framework. An extensive simulation study shows that the newly proposed tests are comparable to, and in some cases more powerful than, the tests for an identity covariance matrix currently in the literature.
10.
We suggest a shrinkage-based technique for estimating the covariance matrix in the high-dimensional normal model with missing data. Our approach is based on the monotone missing scheme assumption, meaning that missing-value patterns occur completely at random. Our asymptotic framework allows the dimensionality p to grow to infinity together with the sample size N, and extends the methodology of Ledoit and Wolf (2004) to the case of two-step monotone missing data. Two new shrinkage-type estimators are derived and their dominance properties over the Ledoit and Wolf (2004) estimator are shown under the expected quadratic loss. We perform a simulation study and conclude that the proposed estimators are successful for a range of missing data scenarios.
11.
Test statistics for sphericity and identity of the covariance matrix are presented, when the data are multivariate normal and the dimension, p, can exceed the sample size, n. Under certain mild conditions mainly on the traces of the unknown covariance matrix, and using the asymptotic theory of U-statistics, the test statistics are shown to follow an approximate normal distribution for large p, also when p ≫ n. The accuracy of the statistics is shown through simulation results, particularly emphasizing the case when p can be much larger than n. A real data set is used to illustrate the application of the proposed test statistics.
12.
Kai Xu 《Journal of Statistical Computation and Simulation》2017,87(16):3208-3224
Under non-normality, this article is concerned with testing diagonality of a high-dimensional covariance matrix, which is more practical than testing sphericity and identity in the high-dimensional setting. The existing testing procedure for diagonality is not robust against either the data dimension or the data distribution, producing tests with distorted type I error rates much larger than nominal levels. This is mainly due to bias from estimating some functions of the high-dimensional covariance matrix under non-normality. Compared to the sphericity and identity hypotheses, the asymptotic properties under the diagonality hypothesis are more involved, and the bias must be handled more carefully. We develop a correction that makes the existing test statistic robust against both the data dimension and the data distribution. We show that the proposed test statistic is asymptotically normal without the normality assumption and without specifying an explicit relationship between the dimension p and the sample size n. Simulations show that it has good size and power for a wide range of settings.
13.
Testing hypotheses about the structure of a covariance matrix for doubly multivariate data is often considered in the literature. In this paper Rao's score test (RST) is derived to test for a block exchangeable covariance matrix, or block compound symmetry (BCS) covariance structure, under the assumption of multivariate normality. It is shown that the empirical distribution of the RST statistic under the null hypothesis is independent of the true values of the mean and the matrix components of a BCS structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Simulation studies are performed for the sample size consideration, and for the estimation of the empirical quantiles of the null distribution of the test statistic. The RST procedure is illustrated on a real data set from medical studies.
14.
Estimation of the population spectral distribution from a large dimensional sample covariance matrix
Weiming Li Jiaqi Chen Yingli Qin Zhidong Bai Jianfeng Yao 《Journal of statistical planning and inference》2013
This paper introduces a new method to estimate the spectral distribution of a population covariance matrix from high-dimensional data. The method is founded on a meaningful generalization of the seminal Marčenko–Pastur equation, originally defined in the complex plane, to the real line. Beyond its easy implementation and the established asymptotic consistency, the new estimator outperforms two existing estimators from the literature in almost all the situations tested in a simulation experiment. An application to the analysis of the correlation matrix of S&P 500 daily stock returns is also given.
15.
《Journal of Statistical Computation and Simulation》2012,82(1-3):115-128
In heteroskedastic regression models, the ordinary least squares (OLS) covariance matrix estimator is inconsistent and inference is not reliable. To deal with this inconsistency, one can estimate the regression coefficients by OLS and then implement a heteroskedasticity-consistent covariance matrix (HCCM) estimator. Unfortunately, the HCCM estimator is biased. The bias is reduced by implementing a robust regression and using the robust residuals to compute the HCCM estimator (RHCCM). A Monte Carlo study analyzes the behavior of RHCCM and of other HCCM estimators in the presence of systematic and random heteroskedasticity, and of outliers in the explanatory variables.
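The contrast between the classical OLS covariance estimator and an HCCM estimator can be sketched as below. This is a minimal illustration using White's HC0 sandwich form on simulated data whose error standard deviation grows with the regressor; the function name `ols_hc0` and the data-generating process are assumptions, and the robust-residual variant (RHCCM) studied in the article is not implemented here.

```python
import numpy as np

def ols_hc0(X, y):
    """Return OLS coefficients, the classical covariance, and the HC0 covariance."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical estimator: assumes a single common error variance.
    sigma2 = resid @ resid / (n - k)
    cov_classical = sigma2 * XtX_inv

    # HC0 sandwich: (X'X)^-1 X' diag(e_i^2) X (X'X)^-1.
    meat = (X * resid[:, None] ** 2).T @ X
    cov_hc0 = XtX_inv @ meat @ XtX_inv
    return beta, cov_classical, cov_hc0

# Simulated heteroskedastic data (assumption): error sd proportional to x.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(scale=x, size=n)

beta, cov_classical, cov_hc0 = ols_hc0(X, y)
print("classical se:", np.sqrt(np.diag(cov_classical)))
print("HC0 se:      ", np.sqrt(np.diag(cov_hc0)))
```

The two sets of standard errors differ because the classical formula averages the squared residuals, while the sandwich form weights each observation by its own squared residual.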
16.
17.
An approximation is given to calculate V, the covariance matrix for normal order statistics. The approximation gives considerable improvement over previous approximations, and the computing algorithm is available from the authors.
18.
We propose a method to estimate the intraday volatility of a stock by integrating the instantaneous conditional return variance per unit time obtained from the autoregressive conditional duration (ACD) model, called the ACD-ICV method. We compare the daily volatility estimated using the ACD-ICV method against several versions of the realized volatility (RV) method, including the bipower variation RV with subsampling, the realized kernel estimate, and the duration-based RV. Our Monte Carlo results show that the ACD-ICV method has lower root mean-squared error than the RV methods in almost all cases considered. This article has online supplementary material.
19.
20.
We study high-dimensional covariance/precision matrix estimation under the assumption that the covariance/precision matrix can be decomposed into a low-rank component and a diagonal component. The rank of the low-rank component can either be chosen to be small or controlled by a penalty function. Under moderate conditions on the population covariance/precision matrix itself and on the penalty function, we prove some consistency results for our estimators. A block-wise coordinate descent algorithm, which iteratively updates the two components, is then proposed to obtain the estimator in practice. Finally, various numerical experiments are presented; using simulated data, we show that our estimator performs quite well in terms of the Kullback–Leibler loss; using stock return data, we show that our method can be applied to obtain enhanced solutions to the Markowitz portfolio selection problem. The Canadian Journal of Statistics 48: 308–337; 2020 © 2019 Statistical Society of Canada