Similar literature
 20 similar documents found (search time: 31 ms)
1.
In this paper we comment on and review some unexpected but interesting features of the BLUE (best linear unbiased estimator) of the expectation vector in the general linear model and, in particular, of the BLUE's covariance matrix. Most of these features appear in the literature but are rather scattered or hidden.

2.
In this paper, we investigate some properties of 2-principal points for location mixtures of spherically symmetric distributions with focus on a linear subspace in which a set of 2-principal points must lie. Our results can be viewed as an extension of those of Yamamoto and Shinozaki [2000. Two principal points for multivariate location mixtures of spherically symmetric distributions. J. Japan Statist. Soc. 30, 53–63], where a finite location mixture of spherically symmetric distributions is treated. As an extension of their paper, this paper defines a wider class of distributions, and derives a linear subspace in which a set of 2-principal points must exist. A theorem useful for comparing the mean squared distances is also established.

3.
One strategy of exploratory factor analysis is to decide on the number of factors to extract by means of the eigenvalues of an initial principal component analysis. The present article proves that there is a nonzero covariance of the factors with the rejected components when the number of factors to extract is determined by means of principal component analysis. Thus, some of the variance declared irrelevant or unwanted in an initial principal component analysis is again part of the final factor model.

4.
We consider a regularized D-classification rule for high dimensional binary classification, which adopts the linear shrinkage estimator of a covariance matrix as an alternative to the sample covariance matrix in the D-classification rule (D-rule in short). We find an asymptotic expression for the misclassification rate of the regularized D-rule when the sample size n and the dimension p both increase and their ratio p/n approaches a positive constant γ. In addition, we compare its misclassification rate to that of the standard D-rule under various settings via simulation.
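The linear shrinkage idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a scaled-identity shrinkage target and a caller-supplied intensity `lam`; both are illustrative assumptions, since the abstract does not specify the exact estimator or how the intensity is chosen. The function name `linear_shrinkage_cov` is hypothetical.

```python
import numpy as np

def linear_shrinkage_cov(X, lam):
    """Shrink the sample covariance of X (n x p) toward a scaled identity.

    Assumption: the target is (trace(S) / p) * I and lam in [0, 1] is
    supplied by the caller; the paper's own estimator may differ.
    """
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    S = np.cov(X, rowvar=False)              # p x p sample covariance
    target = (np.trace(S) / p) * np.eye(p)   # scaled-identity target
    return (1.0 - lam) * S + lam * target

# High dimensional toy setting: n = 5 observations, p = 10 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
S = np.cov(X, rowvar=False)                  # singular because n < p
S_shrunk = linear_shrinkage_cov(X, lam=0.5)  # invertible for lam > 0
```

When p exceeds n the sample covariance is singular, so it cannot be inverted in a classification rule; the shrunken version keeps the same trace but has strictly positive eigenvalues.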

5.
In the analysis of stationary stochastic processes, one has to deal with covariance matrices of Toeplitz (or Laurent) structure. Such a structure has the feature that not only the elements on the principal diagonal but also those lying on each of the parallel sub-diagonals are equal. The present investigation concerns the problem of large-sample testing of the Toeplitz pattern of the population covariance matrix. Apart from the usual application of the likelihood ratio and Rao's efficient score criteria, some heuristic two-stage tests are suggested. The results of a Monte Carlo experiment are reported for the size of the proposed tests.
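The Toeplitz pattern described in this abstract can be made concrete with a small sketch. The helper names and the autocovariance values are hypothetical and chosen only for illustration; the sketch shows the structure itself, not the tests proposed in the paper.

```python
def toeplitz_cov(autocov):
    """Build a symmetric Toeplitz matrix from autocovariances gamma_0..gamma_{p-1}."""
    p = len(autocov)
    return [[autocov[abs(i - j)] for j in range(p)] for i in range(p)]

def is_toeplitz(m, tol=1e-12):
    """Check that every diagonal parallel to the principal one is constant."""
    p = len(m)
    return all(
        abs(m[i][j] - m[i - 1][j - 1]) <= tol
        for i in range(1, p)
        for j in range(1, p)
    )

# Hypothetical autocovariances that halve with each additional lag.
sigma = toeplitz_cov([1.0, 0.5, 0.25, 0.125])
```

Entry (i, j) depends only on the lag |i - j|, which is why a stationary process leads to this structure.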

6.
In this paper, an unstructured principal fitted response reduction approach is proposed. The new approach differs from two existing model-based approaches mainly in that the required condition is imposed on the covariance matrix of the responses instead of that of the random error. It is also invariant under one of the popular ways of standardizing responses, in which the sample covariance equals the identity matrix. According to numerical studies, the proposed approach yields more robust estimation than the two existing methods, in the sense that its asymptotic performance is not severely sensitive to various situations. It is therefore recommended as a default model-based method.

7.
In this article we study the problem of classification of three-level multivariate data, where multiple q-variate observations are measured on u sites and over p time points, under the assumption of multivariate normality. The new classification rules with certain structured and unstructured mean vectors and covariance structures are very efficient in small-sample scenarios, when the number of observations is not adequate to estimate the unknown variance–covariance matrix. These classification rules successfully model the correlation structure on successive repeated measurements over time. Computation algorithms for maximum likelihood estimates of the unknown population parameters are presented. Simulation results show that the introduction of sites in the classification rules improves their performance over the existing classification rules without the sites.

8.
In practice, when a principal component analysis is applied to a large number of variables, the resultant principal components may not be easy to interpret, as each principal component is a linear combination of all the original variables. Selection of a subset of variables that contains, in some sense, as much information as possible and enhances the interpretation of the first few covariance principal components is one possible approach to tackle this problem. This paper describes several variable selection criteria and investigates which criteria are best for this purpose. Although some criteria are shown to be better than others, the main message of this study is that it is unwise to rely on only one or two criteria. It is also clear that the interdependence between variables and the choice of how to measure closeness between the original components and those using subsets of variables are both important in determining the best criteria to use.

9.
For two or more populations whose covariance matrices have a common set of eigenvectors but different sets of eigenvalues, the common principal components (CPC) model is appropriate. Pepler, Uys and Nel [2015. Regularised covariance matrix estimation under the common principal components model. Communications in Statistics: Simulation and Computation (in press)] proposed a regularized CPC covariance matrix estimator and showed that this estimator outperforms the unbiased and pooled estimators in situations where the CPC model is applicable. This article extends their work to the context of discriminant analysis for two groups, by plugging the regularized CPC estimator into the ordinary quadratic discriminant function. Monte Carlo simulation results show that CPC discriminant analysis offers significant improvements in misclassification error rates in certain situations, and at worst performs similarly to ordinary quadratic and linear discriminant analysis. Based on these results, CPC discriminant analysis is recommended for situations where the sample size is small compared to the number of variables, in particular for cases where there is uncertainty about the population covariance matrix structures.

10.
Canonical discriminant functions are defined here as linear combinations that separate groups of observations, and canonical variates are defined as linear combinations associated with canonical correlations between two sets of variables. In standardized form, the coefficients in either type of canonical function provide information about the joint contribution of the variables to the canonical function. The standardized coefficients can be converted to correlations between the variables and the canonical function. These correlations generally alter the interpretation of the canonical functions. For canonical discriminant functions, the standardized coefficients are compared with the correlations, with partial t and F tests, and with rotated coefficients. For canonical variates, the discussion includes standardized coefficients, correlations between variables and the function, rotation, and redundancy analysis. Various approaches to interpretation of principal components are compared: the choice between the covariance and correlation matrices, the conversion of coefficients to correlations, the rotation of the coefficients, and the effect of special patterns in the covariance and correlation matrices.

11.
An alternative form of the Watson efficiency (cited 1 time in total: 0 self-citations, 1 citation by others)
Watson [1951. Serial correlation in regression analysis. Ph.D. Thesis, Department of Experimental Statistics, North Carolina State College, Raleigh] introduced a relative efficiency, often called the Watson efficiency in the literature, to measure the inefficiency of least squares in linear regression models. The Watson efficiency is defined via a determinant, but we show by two examples that such a criterion does not always work well. In this paper, an alternative form of the Watson efficiency based on the Euclidean norm is proposed, and some examples are given to illustrate the superiority of the new relative efficiency.
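For orientation, the determinant-based Watson efficiency can be sketched as follows. The sketch assumes the usual setting y = Xβ + ε with Cov(ε) = σ²V, V positive definite and X of full column rank, in which the efficiency is |Cov(BLUE)| / |Cov(OLSE)| = |X'X|² / (|X'VX| · |X'V⁻¹X|). The design matrix and the AR(1)-type V below are hypothetical, and the Euclidean-norm alternative proposed in the paper is not reproduced because the abstract does not give its formula.

```python
import numpy as np

def watson_efficiency(X, V):
    """Determinant form: |X'X|^2 / (|X'VX| * |X'V^{-1}X|)."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    XtX = X.T @ X
    XtVX = X.T @ V @ X
    XtVinvX = X.T @ np.linalg.solve(V, X)   # X' V^{-1} X without forming V^{-1}
    return np.linalg.det(XtX) ** 2 / (np.linalg.det(XtVX) * np.linalg.det(XtVinvX))

# Hypothetical example: intercept plus linear trend, AR(1)-type error covariance.
n = 6
t = np.arange(n, dtype=float)
X = np.column_stack([np.ones(n), t])
V = 0.6 ** np.abs(np.subtract.outer(t, t))
eff = watson_efficiency(X, V)               # lies in (0, 1]; equals 1 when V = I
```

A value close to 1 indicates that ordinary least squares loses little relative to the BLUE under this criterion.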

12.
In this paper the problem of estimating the scale matrix in a complex elliptically contoured distribution (complex ECD) is addressed. An extended Haff–Stein identity for this model is derived. It is shown that the minimax estimators of the covariance matrix obtained under the complex normal model remain robust under the complex ECD model when the Stein loss function is employed.

13.
In this paper the estimation of the unknown parameters is considered in the standard growth curve model with special covariance structures. Based on unbiased estimating equations, some new methods are proposed. The resulting estimators can be expressed in explicit forms. The statistical properties of the proposed estimators are investigated. Some simulation results are presented to compare the performance of the proposed estimators with that of the existing approaches. Finally, these methods are applied to the general extended growth curve model with special covariance structures.

14.
The common principal components (CPC) model provides a way to model the population covariance matrices of several groups by assuming a common eigenvector structure. When appropriate, this model can provide covariance matrix estimators whose elements have smaller standard errors than when using either the pooled covariance matrix or the per-group unbiased sample covariance matrix estimators. In this article, a regularized CPC estimator under the assumption of a common (or partially common) eigenvector structure in the populations is proposed. After estimation of the common eigenvectors using the Flury–Gautschi (or other) algorithm, the off-diagonal elements of the nearly diagonalized covariance matrices are shrunk towards zero and multiplied with the orthogonal common eigenvector matrix to obtain the regularized CPC covariance matrix estimates. The optimal shrinkage intensity per group can be estimated using cross-validation. The efficiency of these estimators compared to the pooled and unbiased estimators is investigated in a Monte Carlo simulation study, and the regularized CPC estimator is applied to a real dataset to demonstrate the utility of the method.
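The shrink-and-back-transform step described in this abstract can be sketched roughly as below. Two simplifications are assumptions of this sketch, not the paper's procedure: the common eigenvectors are approximated by the eigenvectors of the average covariance matrix instead of the Flury–Gautschi algorithm, and the shrinkage intensity `lam` is fixed by hand instead of being chosen by cross-validation. The function name `regularized_cpc` is hypothetical.

```python
import numpy as np

def regularized_cpc(cov_list, B, lam):
    """Shrink off-diagonals of B' S_i B toward zero, then rotate back.

    B is an orthogonal matrix of (approximate) common eigenvectors and
    lam in [0, 1] is the shrinkage intensity (lam = 1 removes all
    off-diagonal elements in the rotated basis).
    """
    estimates = []
    for S in cov_list:
        F = B.T @ S @ B                      # nearly diagonal under the CPC model
        D = np.diag(np.diag(F))              # diagonal part is kept as is
        F_shrunk = D + (1.0 - lam) * (F - D)
        estimates.append(B @ F_shrunk @ B.T)
    return estimates

rng = np.random.default_rng(1)
groups = [rng.normal(size=(20, 3)) for _ in range(2)]
covs = [np.cov(g, rowvar=False) for g in groups]

# Assumption: eigenvectors of the average covariance stand in for the
# common eigenvectors that a CPC algorithm would estimate.
_, B = np.linalg.eigh(sum(covs) / len(covs))
est = regularized_cpc(covs, B, lam=1.0)
```

With `lam=0.0` the per-group sample covariance matrices are returned unchanged, and with `lam=1.0` every estimate is exactly diagonal in the common basis, so intermediate values interpolate between the unbiased and the fully CPC-structured estimates.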

15.
We propose optimal procedures for partitioning k multivariate normal populations into two disjoint subsets with respect to a given standard vector. A multivariate normal population is defined as good or bad according to whether its Mahalanobis distance to a known standard vector is small or large. Partitioning k multivariate normal populations is reduced to partitioning k non-central chi-square or non-central F distributions with respect to the corresponding non-centrality parameters, depending on whether the covariance matrices are known or unknown. The minimum required sample size for each population is determined to ensure that the probability of a correct decision attains a certain level. An example is given to illustrate our procedures.
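The notion of good and bad populations in this abstract rests on the Mahalanobis distance to the standard vector. A minimal sketch follows; it works with known mean vectors and a hypothetical cut-off `threshold`, whereas the paper's procedures operate on samples and control the probability of a correct decision. The function names are illustrative only.

```python
import numpy as np

def mahalanobis_sq(mu, mu0, sigma):
    """Squared Mahalanobis distance (mu - mu0)' sigma^{-1} (mu - mu0)."""
    d = np.asarray(mu, dtype=float) - np.asarray(mu0, dtype=float)
    return float(d @ np.linalg.solve(np.asarray(sigma, dtype=float), d))

def partition(means, mu0, sigma, threshold):
    """Split population indices into (good, bad) by a hypothetical cut-off."""
    good, bad = [], []
    for idx, mu in enumerate(means):
        if mahalanobis_sq(mu, mu0, sigma) <= threshold:
            good.append(idx)
        else:
            bad.append(idx)
    return good, bad

# Three hypothetical population means compared with the standard vector (0, 0).
means = [[0.1, 0.0], [3.0, 4.0], [0.5, 0.5]]
good, bad = partition(means, mu0=[0.0, 0.0], sigma=np.eye(2), threshold=1.0)
```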

16.
An asymptotic expansion is given for the distribution of the α-th largest latent root of a correlation matrix, when the observations are from a multivariate normal distribution. An asymptotic expansion for the distribution of a test statistic based on a correlation matrix, which is useful in dimensionality reduction in principal component analysis, is also given. These expansions hold when the corresponding latent root of the population correlation matrix is simple. The approach here is based on a perturbation method.

17.
Summary.  Although the covariance matrices corresponding to different populations are unlikely to be exactly equal they can still exhibit a high degree of similarity. For example, some pairs of variables may be positively correlated across most groups, whereas the correlation between other pairs may be consistently negative. In such cases much of the similarity across covariance matrices can be described by similarities in their principal axes, which are the axes that are defined by the eigenvectors of the covariance matrices. Estimating the degree of across-population eigenvector heterogeneity can be helpful for a variety of estimation tasks. For example, eigenvector matrices can be pooled to form a central set of principal axes and, to the extent that the axes are similar, covariance estimates for populations having small sample sizes can be stabilized by shrinking their principal axes towards the across-population centre. To this end, the paper develops a hierarchical model and estimation procedure for pooling principal axes across several populations. The model for the across-group heterogeneity is based on a matrix-valued antipodally symmetric Bingham distribution that can flexibly describe notions of 'centre' and 'spread' for a population of orthogonal matrices.

18.
We discuss a general application of categorical data analysis to mutations along the HIV genome. We consider a multidimensional table for several positions at the same time. Due to the complexity of the multidimensional table, we may collapse it by pooling some categories. However, the association between the remaining variables may not be the same as before collapsing. We discuss the collapsibility of tables and the change in the meaning of parameters after collapsing categories. We also address this problem with a log-linear model. We present a parameterization with the consensus output as the reference cell as is appropriate to explain genomic mutations in HIV. We also consider five null hypotheses and some classical methods to address them. We illustrate methods for six positions along the HIV genome, through consideration of all triples of positions.

19.
The paper describes two regression models, principal components and maximum-likelihood factor analysis, which may be used when the stochastic predictor variables are highly intercorrelated and/or contain measurement error. The two problems can occur jointly, for example in social-survey data where the true (but unobserved) covariance matrix can be singular. Departure from singularity of the sample dispersion matrix is then due to measurement error. We first consider the more elementary principal components regression model, where it is shown that it can be derived as a special case of (i) canonical correlation, and (ii) restricted least squares. The second part consists of the more general maximum-likelihood factor-analysis regression model, which is derived from the generalized inverse of the product of two singular matrices. Also, it is proved that factor-analysis regression can be considered as an instrumental variables estimator and therefore does not depend on whether factors have been “properly” identified in terms of substantive behaviour. Consequently the additional task of rotating factors to “simple structure” does not arise.
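Principal components regression, the more elementary of the two models in this abstract, can be sketched as regressing the centered response on the first k principal component scores and mapping the result back to the original predictors. This is a generic textbook sketch and not the paper's derivation; the function name `pcr_fit` and the simulated data are hypothetical.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Coefficients on centered predictors from regression on the first k PCs."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Rows of Vt are the principal axes of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vk = Vt[:k].T                        # p x k loading matrix
    Z = Xc @ Vk                          # n x k component scores
    gamma, *_ = np.linalg.lstsq(Z, yc, rcond=None)
    return Vk @ gamma                    # back to the original predictor space

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))
X[:, 2] = X[:, 0] + 0.05 * rng.normal(size=30)   # highly intercorrelated pair
y = X @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=30)

beta_k2 = pcr_fit(X, y, k=2)             # drops the smallest-variance component
beta_full = pcr_fit(X, y, k=3)           # keeps all components
```

Keeping all components reproduces ordinary least squares on the centered data, which is one way to see principal components regression as least squares under the restriction that the discarded components receive zero weight.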

20.