Similar Literature

20 similar documents found.
1.
The independent exploratory factor analysis method is introduced for recovering independent latent sources from their observed mixtures. The new model is viewed as a method of factor rotation in exploratory factor analysis (EFA). First, estimates for all EFA model parameters are obtained simultaneously. Then, an orthogonal rotation matrix is sought that minimizes the dependence between the common factors. The rotation of the scores is compensated by a rotation of the initial loading matrix. The proposed approach is applied to study winter monthly sea-level pressure anomalies over the Northern Hemisphere. The North Atlantic Oscillation, the North Pacific Oscillation, and the Scandinavian pattern are identified among the rotated spatial patterns with a physically interpretable structure.
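The compensation step described above is a one-line identity: rotating the scores by an orthogonal matrix T and the loadings by the same T leaves the fitted model unchanged, since (FT)(LT)' = F T T' L' = F L'. A minimal NumPy sketch with synthetic scores and loadings (not the paper's actual EFA fit):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy EFA fit: scores F (n x k) and loadings L (p x k), model X ~ F L'.
n, p, k = 200, 6, 2
F = rng.standard_normal((n, k))
L = rng.standard_normal((p, k))
X = F @ L.T

# Any orthogonal rotation T of the scores is compensated by rotating
# the loadings with the same T: (F T)(L T)' = F T T' L' = F L'.
T, _ = np.linalg.qr(rng.standard_normal((k, k)))
X_rotated = (F @ T) @ (L @ T).T
print(np.allclose(X, X_rotated))  # True
```

This is why the dependence-minimizing rotation can be applied to the scores without degrading the fit of the EFA model.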

2.
Many experiments in the physical and engineering sciences study complex processes in which bias due to model inadequacy dominates random error. A noteworthy example of this situation is the use of computer experiments, in which scientists simulate the phenomenon being studied by a computer code. Computer experiments are deterministic: replicate observations from running the code with the same inputs will be identical. Such high-bias settings demand different techniques for design and prediction. This paper focuses on the experimental design problem, introducing a new class of designs called rotation designs. Rotation designs are found by taking an orthogonal starting design D and rotating it to obtain a new design matrix D_R = DR, where R is any orthonormal matrix. The new design is still orthogonal for a first-order model. In this paper, we study some of the properties of rotation designs and present a method to generate rotation designs that have some appealing symmetry properties.
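The defining property, that an orthogonal first-order design stays orthogonal under any orthonormal rotation, follows from D_R'D_R = R'(D'D)R = nR'R = nI, and is easy to check numerically. A minimal NumPy sketch; the 2^3 factorial starting design is just one convenient choice of orthogonal D:

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthogonal starting design D: 2^3 full factorial in +/-1 coding (8 runs, 3 factors).
levels = np.array([-1.0, 1.0])
D = np.array([[a, b, c] for a in levels for b in levels for c in levels])
n, k = D.shape
assert np.allclose(D.T @ D, n * np.eye(k))  # columns are orthogonal

# Random orthonormal rotation R obtained from a QR decomposition.
R, _ = np.linalg.qr(rng.standard_normal((k, k)))

# Rotated design D_R = D R is still orthogonal for a first-order model:
# D_R' D_R = R' (D'D) R = n R'R = n I.
DR = D @ R
print(np.allclose(DR.T @ DR, n * np.eye(k)))  # True
```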

3.
We propose an algorithmic framework for computing sparse components from rotated principal components. This methodology, called SIMPCA, replaces the unreliable practice of ignoring small coefficients of rotated components when interpreting them. The algorithm computes genuinely sparse components by projecting rotated principal components onto subsets of variables. The components simplified in this way are highly correlated with the corresponding components. By choosing different simplification strategies, different sparse solutions can be obtained and used to compare alternative interpretations of the principal components. We give examples of how effective simplified solutions can be achieved with SIMPCA on publicly available data sets.
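The projection idea can be illustrated with a toy version: select the variables with large loadings, then regress the component scores on that subset only, which yields genuinely zero loadings elsewhere. This is an illustrative NumPy sketch of the projection step, not the SIMPCA algorithm itself; the thresholding rule and the data are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: a dominant block of 3 variables plus a weaker, unrelated block.
n = 300
z1, z2 = rng.standard_normal((2, n))
X = np.column_stack([2*z1 + 0.1*rng.standard_normal(n) for _ in range(3)] +
                    [z2 + 0.1*rng.standard_normal(n) for _ in range(3)])
X -= X.mean(axis=0)

# First principal component via SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
v = Vt[0]                      # loadings
score = X @ v                  # component scores

# Project the component onto the variables with large loadings:
# regress the scores on those columns only (hypothetical 0.3 threshold).
subset = np.abs(v) > 0.3
coef, *_ = np.linalg.lstsq(X[:, subset], score, rcond=None)
v_sparse = np.zeros_like(v)
v_sparse[subset] = coef

# The simplified component stays highly correlated with the original one.
sparse_score = X @ v_sparse
r = np.corrcoef(score, sparse_score)[0, 1]
print(r)
print(np.count_nonzero(v_sparse), "of", v.size, "loadings retained")
```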

4.
Common factor analysis (CFA) and principal component analysis (PCA) are widely used multivariate techniques. Using simulations, we compared CFA with PCA loadings for distortions of a perfect cluster configuration. Results showed that nonzero PCA loadings were higher and more stable than nonzero CFA loadings. Compared to CFA loadings, PCA loadings correlated weakly with the true factor loadings for underextraction, overextraction, and heterogeneous loadings within factors. The pattern of differences between CFA and PCA was consistent across sample sizes, levels of loadings, principal axis factoring versus maximum likelihood factor analysis, and blind versus target rotation.

5.
Sparse principal components analysis (SPCA) is a technique for finding principal components with a small number of non-zero loadings. Our contribution to this methodology is twofold. First we derive the sparse solutions that minimise the least squares criterion subject to sparsity requirements. Second, recognising that sparsity is not the only requirement for achieving simplicity, we suggest a backward elimination algorithm that computes sparse solutions with large loadings. This algorithm can be run without specifying the number of non-zero loadings in advance. It is also possible to impose the requirement that a minimum amount of variance be explained by the components. We give thorough comparisons with existing SPCA methods and present several examples using real datasets.
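A stripped-down version of such a backward elimination, dropping the variable with the smallest absolute loading and re-solving the leading eigenvector until a variance floor is reached, can be sketched as follows. This is an illustrative sketch, not the authors' algorithm; the 90% variance floor and the toy data are assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy data: three "strong" variables driven by one factor, three weak ones
# driven by another, so the first PC concentrates on the strong block.
n = 400
z1, z2 = rng.standard_normal((2, n))
X = np.column_stack([3*z1 + 0.2*rng.standard_normal(n) for _ in range(3)] +
                    [z2 + 0.2*rng.standard_normal(n) for _ in range(3)])
X -= X.mean(axis=0)
S = X.T @ X / n

w, V = np.linalg.eigh(S)
v = V[:, -1]                  # leading eigenvector = first PC loadings
full_var = w[-1]

# Backward elimination: drop the variable with the smallest absolute loading,
# re-solve the leading eigenvector on the remaining variables, and stop once
# less than 90% of the original component variance would be retained.
p = X.shape[1]
active = np.ones(p, dtype=bool)
v_sparse = v.copy()
while active.sum() > 1:
    cand = active.copy()
    drop = np.where(active)[0][np.argmin(np.abs(v_sparse[active]))]
    cand[drop] = False
    wa, Va = np.linalg.eigh(S[np.ix_(cand, cand)])
    if wa[-1] < 0.9 * full_var:
        break                 # too much variance lost; keep current solution
    active = cand
    v_sparse = np.zeros(p)
    v_sparse[active] = Va[:, -1]

print(int(active.sum()), "variables kept with large loadings")
```

Note that no target number of non-zero loadings is specified in advance; the variance floor alone decides where elimination stops, mirroring the property highlighted in the abstract.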

6.
In document clustering, a document may be assigned to multiple clusters, and the probabilities of the document belonging to different clusters are directly normalized. We propose a new Posterior Probabilistic Clustering (PPC) model that has this normalization property. The clustering model is based on Nonnegative Matrix Factorization (NMF) and is flexible: if class conditional probability normalization is used instead, the model reduces to Probabilistic Latent Semantic Indexing (PLSI). Systematic comparison and evaluation indicate that PPC is competitive with other state-of-the-art clustering methods. Furthermore, the results of PPC are sparser and more orthogonal, both of which are highly desirable.
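The normalization property at the heart of PPC, each document's cluster memberships summing to one, can be illustrated on top of a plain NMF fit. This is a minimal sketch using standard multiplicative updates on synthetic data, not the PPC estimation procedure itself:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy document-term matrix (20 documents, 10 terms).
V = rng.random((20, 10))

# Plain NMF V ~ W H via multiplicative updates (Lee-Seung style).
k = 2
W = rng.random((20, k)) + 0.1
H = rng.random((k, 10)) + 0.1
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

# Posterior-style normalization: each document's cluster memberships
# are normalized to sum to one, as in the PPC model's constraint.
P = W / W.sum(axis=1, keepdims=True)
print(np.allclose(P.sum(axis=1), 1.0) and (P >= 0).all())  # True
```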

7.
Canonical discriminant functions are defined here as linear combinations that separate groups of observations, and canonical variates are defined as linear combinations associated with canonical correlations between two sets of variables. In standardized form, the coefficients in either type of canonical function provide information about the joint contribution of the variables to the canonical function. The standardized coefficients can be converted to correlations between the variables and the canonical function. These correlations generally alter the interpretation of the canonical functions. For canonical discriminant functions, the standardized coefficients are compared with the correlations, with partial t and F tests, and with rotated coefficients. For canonical variates, the discussion includes standardized coefficients, correlations between variables and the function, rotation, and redundancy analysis. Various approaches to interpretation of principal components are compared: the choice between the covariance and correlation matrices, the conversion of coefficients to correlations, the rotation of the coefficients, and the effect of special patterns in the covariance and correlation matrices.

8.
The methods of John and of Draper et al. for partitioning the blends (runs) of four mixture components into two or more orthogonal blocks when fitting quadratic models are extended to mixtures of five components. The characteristics of Latin squares of side five are used to derive rules for reliably and quickly obtaining designs with specific properties. The designs also produce orthogonal blocks when higher-order models are fitted.

9.
This study examines the practical implications of the fact that structural changes in factor loadings can produce spurious (irrelevant) factors in forecasting exercises. These spurious factors can induce an overfitting problem in factor-augmented forecasting models. To address this concern, we propose a method to estimate nonspurious factors by identifying the set of response variables that have no structural changes in their factor loadings. Our theoretical results show that the obtained set may include a fraction of unstable response variables; however, this fraction is small enough that the original factors can still be identified and estimated consistently. Moreover, using this approach, we find that a significant portion of 132 U.S. macroeconomic time series have structural changes in their factor loadings. Although traditional principal components provide eight or more factors, there are significantly fewer nonspurious factors. Forecasts using the nonspurious factors significantly improve out-of-sample performance.

10.
Factor analysis (FA) is the most commonly used pattern recognition methodology in social and health research. A technique that may help FA retrieve the true underlying information is rotation of the information axes. The main goal here is to test the reliability of the results derived through FA and to identify the best rotation method under various scenarios. Based on the simulation results, applying non-orthogonal (oblique) rotation produced more repeatable results than either orthogonal rotation or no rotation.

11.
Quadratic forms capture multivariate information in a single number, making them useful, for example, in hypothesis testing. When a quadratic form is large and hence interesting, it might be informative to partition the quadratic form into contributions of individual variables. In this paper it is argued that meaningful partitions can be formed, though the precise partition that is determined will depend on the criterion used to select it. An intuitively reasonable criterion is proposed and the partition to which it leads is determined. The partition is based on a transformation that maximises the sum of the correlations between individual variables and the variables to which they transform under a constraint. Properties of the partition, including optimality properties, are examined. The contributions of individual variables to a quadratic form are less clear-cut when variables are collinear, and forming new variables through rotation can lead to greater transparency. The transformation is adapted so that it has an invariance property under such rotation, whereby the assessed contributions are unchanged for variables that the rotation does not affect directly. Application of the partition to Hotelling's one- and two-sample test statistics, Mahalanobis distance and discriminant analysis is described and illustrated through examples. It is shown that bootstrap confidence intervals for the contributions of individual variables to a partition are readily obtained.

12.
A flexible semi-parametric regression model is proposed for modelling the relationship between a response and multivariate predictor variables. The proposed multiple-index model includes smooth unknown link and variance functions that are estimated non-parametrically. Data-adaptive methods for automatic smoothing parameter selection and for the choice of the number of indices M are considered. This model adapts to complex data structures and provides efficient adaptive estimation through the variance function component in the sense that the asymptotic distribution is the same as if the non-parametric components are known. We develop iterative estimation schemes, which include a constrained projection method for the case where the regression parameter vectors are mutually orthogonal. The proposed methods are illustrated with the analysis of data from a growth bioassay and a reproduction experiment with medflies. Asymptotic properties of the estimated model components are also obtained.

13.
The effect of nonstationarity in the time series columns of the input data in principal components analysis is examined. Nonstationarity is very common among economic indicators collected over time, which are subsequently summarized into fewer indices for monitoring purposes. Because the nonstationary time series drift simultaneously, usually through a trend, the first component averages all the variables without necessarily reducing dimensionality. Sparse principal components analysis can be used instead, but attainment of sparsity among the loadings (and hence dimension reduction) depends on the choice of the tuning parameters λ1,j. Simulated data with more variables than observations and with different patterns of cross-correlations and autocorrelations are used to illustrate the advantages of sparse principal components analysis over ordinary principal components analysis. Sparse component loadings for nonstationary time series data can be achieved provided that appropriate values of λ1,j are used. We provide the range of values of λ1,j that ensures convergence of the sparse principal components algorithm and, consequently, sparsity of the component loadings.
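The trend-dominated first component is easy to reproduce. A minimal NumPy sketch with ten series sharing a random-walk trend; this shows only the problem with ordinary PCA, not the sparse-PCA remedy:

```python
import numpy as np

rng = np.random.default_rng(4)

# Ten nonstationary series sharing a stochastic trend plus idiosyncratic noise.
n, p = 200, 10
trend = 2 * np.cumsum(rng.standard_normal(n))   # common random-walk trend
X = trend[:, None] + rng.standard_normal((n, p))
X -= X.mean(axis=0)

# Ordinary PCA: the simultaneous drift dominates, so the first component
# loads almost equally on every variable -- no real dimension reduction.
_, s, Vt = np.linalg.svd(X, full_matrices=False)
v1 = Vt[0] * np.sign(Vt[0].sum())               # fix the sign for readability
print(v1.round(2))                              # loadings all close to 1/sqrt(10)
print(s[0]**2 / (s**2).sum())                   # variance share of the first PC
```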

14.
The penalized likelihood principal component method of Park (2005) offers flexibility in the choice of the penalty function. This flexibility allows the method to be tailored to enhance interpretation in special cases. Of particular interest is a penalty function in the style of the Lasso that can be used to produce exactly zero loadings. Also of interest is a penalty function for cases in which interpretability is best represented by alignment with orthogonal subspaces, rather than with axis directions. In each case, a data example is presented.

15.
The hierarchically orthogonal functional decomposition of any measurable function η of a random vector X = (X1, …, Xp) consists in decomposing η(X) into a sum of functions of increasing dimension, each depending only on a subvector of X. Even when X1, …, Xp are dependent, this decomposition is unique if the components are hierarchically orthogonal; that is, two components are orthogonal whenever all the variables involved in one of the summands are a subset of the variables involved in the other. Setting Y = η(X), this decomposition leads to the definition of generalized sensitivity indices able to quantify the uncertainty of Y due to each dependent input in X [Chastaing G, Gamboa F, Prieur C. Generalized Hoeffding–Sobol decomposition for dependent variables – application to sensitivity analysis. Electron J Statist. 2012;6:2420–2448]. In this paper, a numerical method is developed to identify the component functions of the decomposition using the hierarchical orthogonality property. Furthermore, the asymptotic properties of the component estimators are studied, as is the numerical estimation of the generalized sensitivity indices for a toy model. Lastly, the method is applied to a model arising from a real-world problem.

16.
Besides the basic model, Kronecker products of rotated models are used to isolate the variance components as parameters of a linear model. A characterization of BLUE given by Zmyślony (1980) is applied to the different models. Generalized least squares are used to complete the estimation.

17.
We consider the problem of constructing search designs for 3^m factorial designs. Using projection properties of some three-level orthogonal arrays, search designs are obtained for 3 ≤ m ≤ 11. The newly obtained orthogonal search designs are capable of searching for and identifying up to four two-factor interactions and estimating them along with the general mean and main effects. The resulting designs have very high search probabilities; that is, besides their well-known orthogonal structure, they have high ability to detect the true effects.

18.
Nested block designs and block design properties such as orthogonality, orthogonal block structure and general balance are examined using the concept of a commutative quadratic subspace and standard properties of orthogonal projectors. In this geometrical context, conditions for the existence of best linear unbiased estimators of treatment contrasts are also discussed.

19.
Exact influence measures are applied in the evaluation of a principal component decomposition for high dimensional data. Some data used for classifying samples of rice from their near infra-red transmission profiles, following a preliminary principal component analysis, are examined in detail. A normalization of eigenvalue influence statistics is proposed which ensures that measures reflect the relative orientations of observations, rather than their overall Euclidean distance from the sample mean. Thus, the analyst obtains more information from an analysis of eigenvalues than from approximate approaches to eigenvalue influence. This is particularly important for high dimensional data where a complete investigation of eigenvector perturbations may be cumbersome. The results are used to suggest a new class of influence measures based on ratios of Euclidean distances in orthogonal spaces.

20.
Interpretation of principal components is difficult because their weights (loadings, coefficients) vary in size. Whereas very small or very large weights give a clear indication of a variable's importance, weights that are neither large nor small ('grey area' weights) are problematic. This is a particular issue in the fast-moving goods industries, where large amounts of multivariate panel data are collected on products. These panel data are subjected to univariate and multivariate analyses in which principal components (PCs) are key to the interpretation of the data. Several authors have suggested alternatives to PCs, seeking simplified components such as sparse PCs. Here, components termed simple components (SCs) are sought under the Thurstonian criteria that a component should have only a few variables highly weighted on it and that each variable should be weighted heavily on just a few components. An algorithm is presented that finds SCs efficiently. Simple components are found for panel data consisting of responses to a questionnaire on efficacy and other features of deodorants. It is shown that five SCs can explain an amount of variation within the data comparable to that explained by the PCs, but with easier interpretation.
