首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 453 毫秒
1.
Fujikoshi (1982) obtained the necessary and sufficient conditions for the increased number of variables in the two sets of vectors not affecting the original nonzero canonical correlations and used these to obtain the likelihood ratio test procedure. He assumed a nonsingular covariance matrix due to random variables. Here, we study the same problem when the covariance matrix is singular and establish some further results. In this study, we note that the unit canonical correlations have to be separated in some of the situations. These results are valid for complex random vector variables and in some situations, the test for redundancy is given for complex random variables.  相似文献   

2.
In this paper some hierarchical methods for identifying groups of variables are illustrated and compared. It is shown that the use of multivariate association measures between two sets of variables can overcome the drawbacks of the usually employed bivariate correlation coefficient, but the resulting methods are generally not monotonic. Thus a new multivariate association measure is proposed, based on the links existing between canonical correlation analysis and principal component analysis, which can be more suitably used for the purpose at hand. The hierarchical method based on the suggested measure is illustrated and compared with other possible solutions by analysing simulated and real data sets. Finally an extension of the suggested method to the more general situation of mixed (qualitative and quantitative) variables is proposed and theoretically discussed.  相似文献   

3.
Canonical discriminant functions are defined here as linear combinations that separate groups of observations, and canonical variates are defined as linear combinations associated with canonical correlations between two sets of variables. In standardized form, the coefficients in either type of canonical function provide information about the joint contribution of the variables to the canonical function. The standardized coefficients can be converted to correlations between the variables and the canonical function. These correlations generally alter the interpretation of the canonical functions. For canonical discriminant functions, the standardized coefficients are compared with the correlations, with partial t and F tests, and with rotated coefficients. For canonical variates, the discussion includes standardized coefficients, correlations between variables and the function, rotation, and redundancy analysis. Various approaches to interpretation of principal components are compared: the choice between the covariance and correlation matrices, the conversion of coefficients to correlations, the rotation of the coefficients, and the effect of special patterns in the covariance and correlation matrices.  相似文献   

4.
We propose new affine invariant tests for multivariate normality, based on independence characterizations of the sample moments of the normal distribution. The test statistics are obtained using canonical correlations between sets of sample moments in a way that resembles the construction of Mardia’s skewness measure and generalizes the Lin–Mudholkar test for univariate normality. The tests are compared to some popular tests based on Mardia’s skewness and kurtosis measures in an extensive simulation power study and are found to offer higher power against many of the alternatives.  相似文献   

5.
Consider a population of individuals who are free of a disease under study, and who are exposed simultaneously at random exposure levels, say X,Y,Z,… to several risk factors which are suspected to cause the disease in the populationm. At any specified levels X=x, Y=y, Z=z, …, the incidence rate of the disease in the population ot risk is given by the exposure–response relationship r(x,y,z,…) = P(disease|x,y,z,…). The present paper examines the relationship between the joint distribution of the exposure variables X,Y,Z, … in the population at risk and the joint distribution of the exposure variables U,V,W,… among cases under the linear and the exponential risk models. It is proven that under the exponential risk model, these two joint distributions belong to the same family of multivariate probability distributions, possibly with different parameters values. For example, if the exposure variables in the population at risk have jointly a multivariate normal distribution, so do the exposure variables among cases; if the former variables have jointly a multinomial distribution, so do the latter. More generally, it is demonstrated that if the joint distribution of the exposure variables in the population at risk belongs to the exponential family of multivariate probability distributions, so does the joint distribution of exposure variables among cases. If the epidemiologist can specify the differnce among the mean exposure levels in the case and control groups which are considered to be clinically or etiologically important in the study, the results of the present paper may be used to make sample size determinations for the case–control study, corresponding to specified protection levels, i.e., size α and 1–β of a statistical test. The multivariate normal, the multinomial, the negative multinomial and Fisher's multivariate logarithmic series exposure distributions are used to illustrate our results.  相似文献   

6.
Measures of association between two sets of random variables have long been of interest to statisticians. The classical canonical correlation analysis (LCCA) can characterize, but also is limited to, linear association. This article introduces a nonlinear and nonparametric kernel method for association study and proposes a new independence test for two sets of variables. This nonlinear kernel canonical correlation analysis (KCCA) can also be applied to the nonlinear discriminant analysis. Implementation issues are discussed. We place the implementation of KCCA in the framework of classical LCCA via a sequence of independent systems in the kernel associated Hilbert spaces. Such a placement provides an easy way to carry out the KCCA. Numerical experiments and comparison with other nonparametric methods are presented.  相似文献   

7.
An exploratory tool is introduced to examine potential non-linear relation-ships between two sets of variables, X andY, in a sample of multivariate data. Simulated annealing is applied to find canonical coefficient vectors a and b such that a squared non-linear correlation between a'Xand b'Y is maximiSed. A measure of non-linear correlation is developed for this optimization which utilies a nearest-neighbor regression estimate for the unknown functional relationship. In addition to examining potential relations between the canonical variables, this method can identify the important variables in each set.  相似文献   

8.
Air quality control usually requires a monitoring system of multiple indicators measured at various points in space and time. Hence, the use of space–time multivariate techniques are of fundamental importance in this context, where decisions and actions regarding environmental protection should be supported by studies based on either inter-variables relations and spatial–temporal correlations. This paper describes how canonical correlation analysis can be combined with space–time geostatistical methods for analysing two spatial–temporal correlated aspects, such as air pollution concentrations and meteorological conditions. Hourly averages of three pollutants (nitric oxide, nitrogen dioxide and ozone) and three atmospheric indicators (temperature, humidity and wind speed) taken for two critical months (February and August) at several monitoring stations are considered and space–time variograms for the variables are estimated. Simultaneous relationships between such sample space–time variograms are determined through canonical correlation analysis. The most correlated canonical variates are used for describing synthetically the underlying space–time behaviour of the components of the two sets.  相似文献   

9.
ABSTRACT

Canonical correlations are maximized correlation coefficients indicating the relationships between pairs of canonical variates that are linear combinations of the two sets of original variables. The number of non-zero canonical correlations in a population is called its dimensionality. Parallel analysis (PA) is an empirical method for determining the number of principal components or factors that should be retained in factor analysis. An example is given to illustrate for adapting proposed procedures based on PA and bootstrap modified PA to the context of canonical correlation analysis (CCA). The performances of the proposed procedures are evaluated in a simulation study by their comparison with traditional sequential test procedures with respect to the under-, correct- and over-determination of dimensionality in CCA.  相似文献   

10.
《统计学通讯:理论与方法》2012,41(13-14):2342-2355
We propose a distance-based method to relate two data sets. We define and study some measures of multivariate association based on distances between observations. The proposed approach can be used to deal with general data sets (e.g., observations on continuous, categorical or mixed variables). An application, using Hellinger distance, provides the relationships between two regions of hyperspectral images.  相似文献   

11.
In this article, a useful proposition relating the canonical correlations in multi-way layout to the singular values of a specific matrix is proved and also a geometrical explanation of canonical correlations as an angle between subspaces is given.  相似文献   

12.
Motivated from problems in canonical correlation analysis, reduced rank regression and sufficient dimension reduction, we introduce a double dimension reduction model where a single index of the multivariate response is linked to the multivariate covariate through a single index of these covariates, hence the name double single index model. Because nonlinear association between two sets of multivariate variables can be arbitrarily complex and even intractable in general, we aim at seeking a principal one‐dimensional association structure where a response index is fully characterized by a single predictor index. The functional relation between the two single‐indices is left unspecified, allowing flexible exploration of any potential nonlinear association. We argue that such double single index association is meaningful and easy to interpret, and the rest of the multi‐dimensional dependence structure can be treated as nuisance in model estimation. We investigate the estimation and inference of both indices and the regression function, and derive the asymptotic properties of our procedure. We illustrate the numerical performance in finite samples and demonstrate the usefulness of the modelling and estimation procedure in a multi‐covariate multi‐response problem concerning concrete.  相似文献   

13.
Linear combinations of random variables play a crucial role in multivariate analysis. Two extension of this concept are considered for functional data and shown to coincide using the Loève–Parzen reproducing kernel Hilbert space representation of a stochastic process. This theory is then used to provide an extension of the multivariate concept of canonical correlation. A solution to the regression problem of best linear unbiased prediction is obtained from this abstract canonical correlation formulation. The classical identities of Lawley and Rao that lead to canonical factor analysis are also generalized to the functional data setting. Finally, the relationship between Fisher's linear discriminant analysis and canonical correlation analysis for random vectors is extended to include situations with function-valued random elements. This allows for classification using the canonical Y scores and related distance measures.  相似文献   

14.
In this study, our aim was to investigate the changes of different data structures and different sample sizes on the structural equation modeling and the influence of these factors on the model fit measures. Examining the created structural equation modeling under different data structures and sample sizes, the evaluation of model fit measures were performed with a simulation study. As a result of the simulation study, optimization and negative variance estimation problems have been encountered depending on the sample size and changing correlations. It was observed that these problems disappeared either by increasing the sample size or the correlations between the variables in factor. For upcoming studies, the choice of RMSEA and IFI model fit measures can be suggested in all sample sizes and the correlation values for data sets are ensured the multivariate normal distribution assumption.  相似文献   

15.
This paper investigates the roles of partial correlation and conditional correlation as measures of the conditional independence of two random variables. It first establishes a sufficient condition for the coincidence of the partial correlation with the conditional correlation. The condition is satisfied not only for multivariate normal but also for elliptical, multivariate hypergeometric, multivariate negative hypergeometric, multinomial and Dirichlet distributions. Such families of distributions are characterized by a semigroup property as a parametric family of distributions. A necessary and sufficient condition for the coincidence of the partial covariance with the conditional covariance is also derived. However, a known family of multivariate distributions which satisfies this condition cannot be found, except for the multivariate normal. The paper also shows that conditional independence has no close ties with zero partial correlation except in the case of the multivariate normal distribution; it has rather close ties to the zero conditional correlation. It shows that the equivalence between zero conditional covariance and conditional independence for normal variables is retained by any monotone transformation of each variable. The results suggest that care must be taken when using such correlations as measures of conditional independence unless the joint distribution is known to be normal. Otherwise a new concept of conditional independence may need to be introduced in place of conditional independence through zero conditional correlation or other statistics.  相似文献   

16.
17.
Generalized discriminant analysis based on distances   总被引:14,自引:1,他引:13  
This paper describes a method of generalized discriminant analysis based on a dissimilarity matrix to test for differences in a priori groups of multivariate observations. Use of classical multidimensional scaling produces a low‐dimensional representation of the data for which Euclidean distances approximate the original dissimilarities. The resulting scores are then analysed using discriminant analysis, giving tests based on the canonical correlations. The asymptotic distributions of these statistics under permutations of the observations are shown to be invariant to changes in the distributions of the original variables, unlike the distributions of the multi‐response permutation test statistics which have been considered by other workers for testing differences among groups. This canonical method is applied to multivariate fish assemblage data, with Monte Carlo simulations to make power comparisons and to compare theoretical results and empirical distributions. The paper proposes classification based on distances. Error rates are estimated using cross‐validation.  相似文献   

18.
We study estimation of multivariate densities p of the form p(x) = h(g(x)) for x ∈ ?(d) and for a fixed monotone function h and an unknown convex function g. The canonical example is h(y) = e(-y) for y ∈ ?; in this case, the resulting class of densities [Formula: see text]is well known as the class of log-concave densities. Other functions h allow for classes of densities with heavier tails than the log-concave class.We first investigate when the maximum likelihood estimator p? exists for the class P(h) for various choices of monotone transformations h, including decreasing and increasing functions h. The resulting models for increasing transformations h extend the classes of log-convex densities studied previously in the econometrics literature, corresponding to h(y) = exp(y).We then establish consistency of the maximum likelihood estimator for fairly general functions h, including the log-concave class P(e(-y)) and many others. In a final section, we provide asymptotic minimax lower bounds for the estimation of p and its vector of derivatives at a fixed point x(0) under natural smoothness hypotheses on h and g. The proofs rely heavily on results from convex analysis.  相似文献   

19.
In this article we propose methodology for inference of binary-valued adjacency matrices from various measures of the strength of association between pairs of network nodes, or more generally pairs of variables. This strength of association can be quantified by sample covariance and correlation matrices, and more generally by test-statistics and hypothesis test p-values from arbitrary distributions. Community detection methods such as block modeling typically require binary-valued adjacency matrices as a starting point. Hence, a main motivation for the methodology we propose is to obtain binary-valued adjacency matrices from such pairwise measures of strength of association between variables. The proposed methodology is applicable to large high-dimensional data sets and is based on computationally efficient algorithms. We illustrate its utility in a range of contexts and data sets.  相似文献   

20.
Familial binary data occur in a wide range of scientific investigations. Numerous measures of association have been proposed in the literature for the study of intra-family dependence of the binary variables. These measures include correlations, odd ratios, kappa statistics, and relative risks. We study the permissible ranges of these measures of association such that a joint distribution exists for the familial binary variables. Our results are useful for developing efficient estimation methods for the measures of association.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号