Found 20 similar articles (search time: 32 ms)
This paper discusses biplots of the between-set correlation matrix obtained by canonical correlation analysis. It is shown that these biplots can be enriched with the representation of the cases of the original data matrices. A representation of the cases that is optimal in the generalized least squares sense is obtained by the superposition of a scatterplot of the canonical variates on the biplot of the between-set correlation matrix. Goodness of fit statistics for all correlation and data matrices involved in canonical correlation analysis are discussed. It is shown that adequacy and redundancy coefficients are in fact statistics that express the goodness of fit of the original data matrices in the biplot. The within-set correlation matrix that is represented in standard coordinates always has a better goodness of fit than the within-set correlation matrix that is represented in principal coordinates. Given certain scalings, the scalar products between variable vectors approximate correlations better than the cosines of angles between variable vectors. Several data sets are used to illustrate the results.

Canonical discriminant functions are defined here as linear combinations that separate groups of observations, and canonical variates are defined as linear combinations associated with canonical correlations between two sets of variables. In standardized form, the coefficients in either type of canonical function provide information about the joint contribution of the variables to the canonical function. The standardized coefficients can be converted to correlations between the variables and the canonical function. These correlations generally alter the interpretation of the canonical functions. For canonical discriminant functions, the standardized coefficients are compared with the correlations, with partial t and F tests, and with rotated coefficients. For canonical variates, the discussion includes standardized coefficients, correlations between variables and the function, rotation, and redundancy analysis. Various approaches to interpretation of principal components are compared: the choice between the covariance and correlation matrices, the conversion of coefficients to correlations, the rotation of the coefficients, and the effect of special patterns in the covariance and correlation matrices.

The quadratic inference function (QIF) method is increasingly popular for the marginal analysis of correlated data due to its advantages over generalized estimating equations. Asymptotic theory is used to derive analytical results from the QIF, and we, therefore, study three asymptotically equivalent weighting matrices in terms of finite-sample parameter estimation. Furthermore, to improve small-sample estimation, we study modifications to the estimation procedure. Examples are presented via simulations and application. Results show that although theoretical weighting matrices work best, the proposed estimation procedure, in which initial estimates are held constant within the matrix of estimated empirical covariances, is preferable in practice.

Regularization is a well-known and widely used statistical approach covering individual points or limit approximations. In this study, the canonical correlation analysis (CCA) process of the paths is discussed with partial least squares (PLS) as the other boundary covering transformation to a symmetric eigenvalue (or singular value) problem dependent on a parameter. Two regularizations of the original criterion in the parameterization domain are compared, i.e. using projection and by identity matrix. We discuss the existence and uniqueness of the analytic path for eigenvalues and corresponding elements of eigenvectors. Specifically, canonical analysis is applied to an ill-conditioned case of singular within-sets input matrices encompassing tourism accommodation data.
Keywords: Multivariate analysis, canonical correlation analysis, optimization, analytic decomposition, paths of eigenvalues and eigenvectors, tourism
MSC Classifications: 62H20, 46N10, 62P20

M. Nussbaum 《Statistics》2013,47(2):173-198
For the problem of estimating a linear functional relation when the ratio of the error variances is known, a general class of estimators is introduced. They include as special cases the instrumental variable and replication cases and some others. Conditions are given for consistency, asymptotic normality and asymptotic optimality within this class based on the variance of the limit distribution. Fisher's lower bound for asymptotic variances is established, and under normality the asymptotically optimal estimators are shown to be best asymptotically normal. For an inhomogeneous linear relation only estimators which are invariant with respect to a translation of the origin are considered, and asymptotically optimal invariant and, under normality, best asymptotically normal invariant estimators are obtained. Several special cases are discussed.

Matthias Kohl 《Statistics》2013,47(4):473-488
Bednarski and Müller [Optimal bounded influence regression and scale M-estimators in the context of experimental design, Statistics 35 (2001), pp. 349–369] introduced a class of bounded influence M estimates for the simultaneous estimation of regression and scale in the linear model with normal errors by solving the corresponding normal location and scale problem at each design point. This limits the proposal to regressor distributions with finite support. Based on their approach, we propose a slightly extended class of M estimates that is not restricted to finite support and is numerically easier to handle. Moreover, we employ the even more general class of asymptotically linear (AL) estimators which, in addition, is not restricted to normal errors. The superiority of AL estimates is demonstrated by numerical comparisons of the maximum asymptotic mean-squared error over infinitesimal contamination neighbourhoods.

Linear combinations of random variables play a crucial role in multivariate analysis. Two extensions of this concept are considered for functional data and shown to coincide using the Loève–Parzen reproducing kernel Hilbert space representation of a stochastic process. This theory is then used to provide an extension of the multivariate concept of canonical correlation. A solution to the regression problem of best linear unbiased prediction is obtained from this abstract canonical correlation formulation. The classical identities of Lawley and Rao that lead to canonical factor analysis are also generalized to the functional data setting. Finally, the relationship between Fisher's linear discriminant analysis and canonical correlation analysis for random vectors is extended to include situations with function-valued random elements. This allows for classification using the canonical Y scores and related distance measures.


Canonical correlations are maximized correlation coefficients indicating the relationships between pairs of canonical variates that are linear combinations of the two sets of original variables. The number of non-zero canonical correlations in a population is called its dimensionality. Parallel analysis (PA) is an empirical method for determining the number of principal components or factors that should be retained in factor analysis. An example illustrates how the proposed procedures, based on PA and bootstrap-modified PA, are adapted to the context of canonical correlation analysis (CCA). The performance of the proposed procedures is evaluated in a simulation study by comparison with traditional sequential test procedures with respect to the under-, correct- and over-determination of dimensionality in CCA.
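
The definition of canonical correlations in the abstract above can be made concrete with a short numerical sketch. This is not the authors' PA procedure: it only computes sample canonical correlations, using the QR-plus-SVD route, and it assumes both data matrices have full column rank. The function name `canonical_correlations` and the simulated data are illustrative assumptions.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Sample canonical correlations between the column sets of X and Y.

    Assumes X (n x p) and Y (n x q) have full column rank after centring.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centred data.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    # Singular values of Qx' Qy are the cosines of the principal angles
    # between the two column spaces, i.e. the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding slightly above 1

# Hypothetical data: the first column of Y is an exact linear function of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = np.column_stack([2.0 * X[:, 0] + 1.0, rng.normal(size=200)])
print(canonical_correlations(X, Y))
```

With two sets of sizes p and q there are at most min(p, q) canonical correlations; the dimensionality question in the abstract is how many of the population values are non-zero, which sample values alone do not settle.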

Longitudinal data analysis requires a proper estimation of the within-cluster correlation structure in order to achieve efficient estimates of the regression parameters. When applying likelihood-based methods one may select an optimal correlation structure by the AIC or BIC. However, such information criteria are not applicable for estimating-equation-based approaches. In this paper we develop a model averaging approach to estimate the correlation matrix by a weighted sum of a group of patterned correlation matrices under the GEE framework. The optimal weight is determined by minimizing the difference between the weighted sum and a consistent yet inefficient estimator of the correlation structure. The computation of our proposed approach only involves a standard quadratic programming on top of the standard GEE procedure and can be easily implemented in practice. We provide theoretical justifications and extensive numerical simulations to support the application of the proposed estimator. A couple of well-known longitudinal data sets are revisited where we implement and illustrate our methodology.

This paper extends the results of canonical correlation analysis of Anderson [2002. Canonical correlation analysis and reduced-rank regression in autoregressive models. Ann. Statist. 30, 1134–1154] to a vector AR(1) process with vector ARCH(1) innovations. We obtain the limiting distributions of the sample matrices, the canonical correlations and the canonical vectors of the process. The extension is important because many time series in economics and finance exhibit conditional heteroscedasticity. We also use simulation to demonstrate the effects of ARCH innovations on the canonical correlation analysis in finite samples. Both the limiting distributions and simulation results show that overlooking the ARCH effects in canonical correlation analysis can easily lead to erroneous inference.

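In canonical correlation analysis, obtaining the expressions for the canonical variates does not complete the task; for example, the number of canonical variates must still be determined and variables must be selected. For the problem of the number of canonical variates, shortcomings of the commonly used chi-square test and redundancy analysis are identified, and a new algorithm is proposed. For the problem of selecting the original variables, three possible approaches are proposed. Finally, the conclusions are verified using human body dimension data.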

In this article, we develop estimation procedures for partially linear quantile regression models, where some of the responses are censored by another random variable. The nonparametric function is estimated by basis function approximations. The estimation procedure is easy to implement through existing weighted quantile regression, and it requires no specification of the error distributions. We establish the large-sample properties of the resulting estimates: the proposed estimator of the regression parameter is root-n consistent and asymptotically normal, and the estimator of the functional component achieves the optimal convergence rate of the nonparametric function. The proposed method is studied via simulations and illustrated with the analysis of a primary biliary cirrhosis (PBC) data set.

Robust estimation of location vectors and scatter matrices is studied under the assumption that the unknown error distribution is spherically symmetric in a central region and completely unknown in the tail region. A precise formulation of the model is given, an analysis of the identifiable parameters in the model is presented, and consistent initial estimators of the identifiable parameters are constructed. Consistent and asymptotically normal M-estimators are constructed (solved iteratively beginning with the initial estimates) based on “influence functions” which vanish outside specified compact sets. Finally M-estimators which are asymptotically minimax (in the sense of Huber) are derived.

In modeling complex longitudinal data, semiparametric nonlinear mixed-effects (SNLME) models are very flexible and useful. Covariates are often introduced in the models to partially explain the inter-individual variations. In practice, data are often incomplete in the sense that there are often measurement errors and missing data in longitudinal studies. The likelihood method is a standard approach for inference for these models but it can be computationally very challenging, so computationally efficient approximate methods are quite valuable. However, the performance of these approximate methods is often based on limited simulation studies, and theoretical results are unavailable for many approximate methods. In this article, we consider a computationally efficient approximate method for a class of SNLME models with incomplete data and investigate its theoretical properties. We show that the estimates based on the approximate method are consistent and asymptotically normally distributed.

We propose a universal robust likelihood that is able to accommodate correlated binary data without any information about the underlying joint distributions. This likelihood function is asymptotically valid for the regression parameter for any underlying correlation configurations, including varying under- or over-dispersion situations, which undermines one of the regularity conditions ensuring the validity of crucial large sample theories. This robust likelihood procedure can be easily implemented by using any statistical software that provides naïve and sandwich covariance matrices for regression parameter estimates. Simulations and real data analyses are used to demonstrate the efficacy of this parametric robust method.

Some modern approaches for the analysis of non-normally distributed and correlated data, including Liang and Zeger's (1986) method of generalized estimating equations (GEE), model the pattern of association among outcomes by assuming a structure for their correlation matrix. A number of relatively simple patterned correlation matrices are available for measurements with one level of correlation. However, modeling the correlation structure of data with multiple levels, or causes, of association is not as straightforward; this note discusses some of the difficulties and presents a simple class of correlation models that may prove useful in this endeavor.
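
As a hedged illustration of the "relatively simple patterned correlation matrices" mentioned above for one level of correlation, the sketch below builds two commonly used working correlation structures, exchangeable and AR(1). It does not reproduce the multi-level class of models proposed in the note; the function names are assumptions made for this example.

```python
import numpy as np

def exchangeable_corr(n, rho):
    """n x n correlation matrix with 1 on the diagonal and rho elsewhere."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))

def ar1_corr(n, rho):
    """n x n correlation matrix with entry (i, j) equal to rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

print(exchangeable_corr(3, 0.3))
print(ar1_corr(3, 0.5))
```

Each structure is governed by a single parameter, which is what makes them convenient when there is only one source of association within a cluster.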

Many procedures have been developed to deal with the high-dimensional problem that is emerging in various business and economics areas. To evaluate and compare these procedures, modeling uncertainty caused by model selection and parameter estimation has to be assessed and integrated into a modeling process. To do this, a data perturbation method estimates the modeling uncertainty inherent in a selection process by perturbing the data. Critical to data perturbation is the size of perturbation, as the perturbed data should resemble the original dataset. To account for the modeling uncertainty, we derive the optimal size of perturbation, which adapts to the data, the model space, and other relevant factors in the context of linear regression. On this basis, we develop an adaptive data-perturbation method that, unlike its nonadaptive counterpart, performs well in different situations. This leads to a data-adaptive model selection method. Both theoretical and numerical analysis suggest that the data-adaptive model selection method adapts to distinct situations in that it yields consistent model selection and optimal prediction, without knowing which situation exists a priori. The proposed method is applied to real data from the commodity market and outperforms its competitors in terms of price forecasting accuracy.

In this article, we consider the problem of sequentially estimating the mean of a Poisson distribution under LINEX (linear exponential) loss function and fixed cost per observation within a Bayesian framework. An asymptotically pointwise optimal rule with a prior distribution is proposed and shown to be asymptotically optimal for arbitrary priors. The proposed asymptotically pointwise optimal rule is illustrated using a real data set.
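
For readers unfamiliar with the LINEX loss named above, the following sketch shows its standard form, b(exp(a·d) − a·d − 1) with d = estimate − θ, together with the fixed-sample Bayes estimate of a Poisson mean that it implies. The gamma prior (shape `alpha`, rate `beta`) is an assumption made here so that a closed form exists; the abstract does not specify the prior, and the sequential stopping rule it proposes is not reproduced. All function and parameter names are illustrative.

```python
import math

def linex_loss(estimate, theta, a, b=1.0):
    """LINEX loss b * (exp(a*d) - a*d - 1), where d = estimate - theta."""
    d = estimate - theta
    return b * (math.exp(a * d) - a * d - 1.0)

def linex_bayes_poisson_gamma(xs, a, alpha, beta):
    """Bayes estimate of a Poisson mean under LINEX loss.

    Assumes a Gamma(alpha, rate=beta) prior, so the posterior is
    Gamma(A, rate=B) with A = alpha + sum(xs) and B = beta + len(xs).
    The Bayes rule -(1/a) * log E[exp(-a*theta) | xs] then reduces to
    (A / a) * log(1 + a / B), which requires a != 0 and a > -B.
    """
    A = alpha + sum(xs)
    B = beta + len(xs)
    if a == 0 or a <= -B:
        raise ValueError("need a != 0 and a > -(beta + n)")
    return (A / a) * math.log1p(a / B)

counts = [3, 1, 4, 2]  # hypothetical Poisson observations
print(linex_bayes_poisson_gamma(counts, a=1.0, alpha=2.0, beta=1.0))
```

With a > 0 the loss penalizes overestimation more heavily, so the estimate falls below the posterior mean A/B; with a < 0 it lies above it, and as a approaches 0 it tends to the posterior mean.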

The article concerns covariance estimates in a replicated measurement error model with correlated, heteroscedastic errors. Freedman has conjectured that using more of the data will improve estimates of covariance matrices and result in a more efficient estimate of the coefficient of the regression model. The paper confirms the conjecture asymptotically for the case that all random variables are normally distributed, but the gain is not substantial.

In this article, we consider the Bayes and empirical Bayes estimation of the current population mean of a finite population when sample data are available from (m-1) other similar finite populations. We investigate a general class of linear estimators and obtain the optimal linear Bayes estimator of the finite population mean under a squared error loss function that accounts for the cost of sampling. The optimal linear Bayes estimator and the sample size are obtained as a function of the parameters of the prior distribution. The corresponding empirical Bayes estimates are obtained by replacing the unknown hyperparameters with their respective consistent estimates. A Monte Carlo study is conducted to evaluate the performance of the proposed empirical Bayes procedure.
