1.

A Variable Selection Criterion for Two Sets of Principal Component Scores in Principal Canonical Correlation Analysis

Toru Ogura Yasunori Fujikoshi Takakazu Sugiyama 《Communications in Statistics - Theory and Methods》2013,42(12):2118-2135

Canonical correlation analysis (CCA) is often used to analyze the correlation between two random vectors. However, CCA results can sometimes be hard to interpret. Principal canonical correlation analysis (PCCA) was proposed to address these difficulties. PCCA is CCA between two sets of principal component (PC) scores. We consider the problem of selecting useful PC scores in CCA. A variable selection criterion for one set of PC scores has been proposed by Ogura (2010); here, we propose a variable selection criterion for two sets of PC scores in PCCA. Furthermore, we demonstrate the effectiveness of this criterion.
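The definition of PCCA above (CCA applied to two sets of PC scores) can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names `pc_scores`, `canonical_correlations`, and `pcca`, the simulated data, and the fixed choice of two PC scores per set are all assumptions made for the example, and the selection criterion proposed in the paper is not implemented.

```python
import numpy as np

def pc_scores(X, k):
    """First k principal component scores of X (rows are observations)."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: the columns of U * S are the PC scores.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]

def canonical_correlations(A, B):
    """Canonical correlations between two full-column-rank data blocks."""
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))
    # The singular values of Qa^T Qb are the canonical correlations.
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

def pcca(X, Y, kx, ky):
    """CCA between the first kx PC scores of X and the first ky of Y."""
    return canonical_correlations(pc_scores(X, kx), pc_scores(Y, ky))

# Simulated data (hypothetical): both blocks share one latent signal z.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=(n, 1))
X = z @ rng.normal(size=(1, 5)) + rng.normal(size=(n, 5))
Y = z @ rng.normal(size=(1, 4)) + rng.normal(size=(n, 4))

rho = pcca(X, Y, kx=2, ky=2)
print(rho)  # two canonical correlations, largest first
```

How many PC scores to keep in each set (`kx`, `ky`) is exactly the question a selection criterion is meant to answer; here they are fixed by hand.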

2.

Parallel analysis approach for determining dimensionality in canonical correlation analysis

《Journal of Statistical Computation and Simulation》2012,82(17):3419-3431

ABSTRACT

Canonical correlations are maximized correlation coefficients indicating the relationships between pairs of canonical variates that are linear combinations of the two sets of original variables. The number of non-zero canonical correlations in a population is called its dimensionality. Parallel analysis (PA) is an empirical method for determining the number of principal components or factors that should be retained in factor analysis. An example is given to illustrate how the proposed procedures, based on PA and bootstrap-modified PA, are adapted to the context of canonical correlation analysis (CCA). The performance of the proposed procedures is evaluated in a simulation study by comparing them with traditional sequential test procedures with respect to the under-, correct- and over-determination of dimensionality in CCA.
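For orientation, classical parallel analysis for principal components can be sketched as follows: observed eigenvalues of the correlation matrix are compared with a quantile of eigenvalues obtained from random data of the same size. This is a sketch of the standard PA idea only; the function name `parallel_analysis`, the 95% quantile, the number of simulations, and the simulated one-factor data are assumptions, and the paper's adaptation to CCA and its bootstrap variant are not reproduced here.

```python
import numpy as np

def parallel_analysis(X, n_sim=200, quantile=0.95, seed=0):
    """Number of components whose eigenvalue exceeds the random-data threshold."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Observed eigenvalues of the correlation matrix, largest first.
    obs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    sims = np.empty((n_sim, p))
    for s in range(n_sim):
        Z = rng.normal(size=(n, p))  # uncorrelated reference data
        sims[s] = np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False))[::-1]
    thresh = np.quantile(sims, quantile, axis=0)
    keep = 0
    for o, t in zip(obs, thresh):
        if o <= t:
            break  # stop at the first component not above its threshold
        keep += 1
    return keep, obs, thresh

# Simulated data (hypothetical): six variables driven by one common factor.
rng = np.random.default_rng(1)
z = rng.normal(size=(300, 1))
X = z + rng.normal(size=(300, 6))

keep, obs, thresh = parallel_analysis(X)
print(keep)
```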

3.

Influence functions of the Spearman and Kendall correlation measures

Christophe Croux Catherine Dehon 《Statistical Methods and Applications》2010,19(4):497-515

Nonparametric correlation estimators such as the Kendall and Spearman correlations are widely used in the applied sciences. They are often said to be robust, in the sense of being resistant to outlying observations. In this paper we formally study their robustness by means of their influence functions and gross-error sensitivities. Since robustness of an estimator often comes at the price of an increased variance, we also compute statistical efficiencies at the normal model. We conclude that both the Spearman and Kendall correlation estimators combine a bounded and smooth influence function with a high efficiency. In a simulation experiment we compare these nonparametric estimators with correlations based on a robust covariance matrix estimator.

4.

Bayesian spatial regression models with closed skew normal correlated errors and missing observations

Omid Karimi Mohsen Mohammadzadeh 《Statistical Papers》2012,53(1):205-218

This paper is concerned with Bayesian estimation of a spatial regression model with skew non-Gaussian errors. The regression parameters are estimated by using a closed skew normal (CSN) distribution, which is closed under conditioning and linear combination. The proposed model captures skewness in the response variable. Sometimes we may encounter missing observations in the response variable; accordingly, we model and predict the missing observations by a Bayesian approach using Gibbs sampling methods. Next, a simulation study is performed to assess the validity of our model. The proposed model is also applied to CO data from Tehran, the capital city of Iran. Then, the accuracy of the CSN and Gaussian models is compared by a cross-validation criterion.

5.

On the skew-normal calibration model

C.C. Figueiredo H. Bolfarine M.C. Sandoval C.R.O.P. Lima 《Journal of applied statistics》2010,37(3):435-451

In this article, we present the EM-algorithm for performing maximum likelihood estimation of an asymmetric linear calibration model with the assumption of skew-normally distributed error. A simulation study is conducted for evaluating the performance of the calibration estimator with interpolation and extrapolation situations. As one application in a real data set, we fitted the model studied in a dimensional measurement method used for calculating the testicular volume through a caliper and its calibration by using ultrasonography as the standard method. By applying this methodology, we do not need to transform the variables to have symmetrical errors. Another interesting aspect of the approach is that the developed transformation to make the information matrix nonsingular, when the skewness parameter is near zero, leaves the parameter of interest unchanged. Model fitting is implemented and the best choice between the usual calibration model and the model proposed in this article was evaluated by developing the Akaike information criterion, Schwarz’s Bayesian information criterion and Hannan–Quinn criterion.

6.

Linear and non-linear canonical correlation analysis: an exploratory tool for the analysis of group-structured data

K. Luijtens F. Symons M. Vuylsteke-Wauters 《Journal of applied statistics》1994,21(3):43-61

Confronted with multivariate group-structured data, one is in fact always interested in describing differences between groups. In this paper, canonical correlation analysis (CCA) is used as an exploratory data analysis tool to detect and describe differences between groups of objects. CCA allows for the construction of Gabriel biplots, relating representations of objects and variables in the plane that best represents the distinction of the groups of object points. In the case of non-linear CCA, transformations of the original variables are suggested to achieve a better group separation compared with that obtained by linear CCA. One can detect which (transformed) variables are responsible for this separation. The separation itself might be due to several characteristics of the data (e.g. distances between the centres of gravity of the original or transformed groups of object points, or differences in the structure of the original groups). Four case studies give an overview of an exploration of the possibilities offered by linear and non-linear CCA.

7.

Focused information criterion and model averaging based on weighted composite quantile regression

Ganggang Xu Suojin Wang Jianhua Z. Huang 《Scandinavian Journal of Statistics》2014,41(2):365-381

We study the focused information criterion and frequentist model averaging and their application to post‐model‐selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non‐parametric functions approximated by polynomial splines, we show that, under certain conditions, the asymptotic distribution of the frequentist model averaging WCQR‐estimator of a focused parameter is a non‐linear mixture of normal distributions. This asymptotic distribution is used to construct confidence intervals that achieve the nominal coverage probability. With properly chosen weights, the focused information criterion based WCQR estimators are not only robust to outliers and non‐normal residuals but also can achieve efficiency close to the maximum likelihood estimator, without assuming the true error distribution. Simulation studies and a real data analysis are used to illustrate the effectiveness of the proposed procedure.

8.

On the rank-deficient canonical correlation technique solved by analytic spectral decomposition

Lukáš Malec 《Journal of applied statistics》2022,49(4):819

Regularization is a well-known and widely used statistical approach covering individual points or limit approximations. In this study, the canonical correlation analysis (CCA) process of the paths is discussed with partial least squares (PLS) as the other boundary covering transformation to a symmetric eigenvalue (or singular value) problem dependent on a parameter. Two regularizations of the original criterion in the parameterization domain are compared, i.e. using projection and by identity matrix. We discuss the existence and uniqueness of the analytic path for eigenvalues and corresponding elements of eigenvectors. Specifically, canonical analysis is applied to an ill-conditioned case of singular within-sets input matrices encompassing tourism accommodation data. KEYWORDS: Multivariate analysis, canonical correlation analysis, optimization, analytic decomposition, paths of eigenvalues and eigenvectors, tourism. MSC Classifications: 62H20, 46N10, 62P20

9.

A comparative study for robust canonical correlation methods

Ali Alkenani Keming Yu 《Journal of Statistical Computation and Simulation》2013,83(4):692-720

The aim of this study is to obtain robust canonical vectors and correlation coefficients based on the percentage bend correlation and winsorized correlation in the correlation matrix and fast consistent high breakdown (FCH), reweighted fast consistent high breakdown (RFCH), and reweighted multivariate normal (RMVN) estimators to estimate the covariance matrix and then compare these estimators with the existing estimators. In the correlation matrix of canonical correlation analysis (CCA), we present an approach that substitutes the percentage bend correlation and the winsorized correlation in place of the widely employed Pearson correlation. Moreover, we employ the FCH, RFCH, and RMVN estimators to estimate the covariance matrix in the CCA. We conduct a simulation study and employ real data with the objective of comparing the performance of the different estimators for canonical vectors and correlation with that of our proposed approaches. The breakdown plots and independent tests are employed as differentiating criteria of the robustness and performance of the estimators. Based on our computational and real data studies, we propose suggestions and guidelines on the practical implications of our findings.

10.

On distribution of AIC in linear regression models

《Journal of statistical planning and inference》2005,133(2):417-433

This paper investigates an asymptotic distribution of the Akaike information criterion (AIC) and presents its characteristics in normal linear regression models. The bias correction of the AIC has been studied. It may be noted that the bias is only the mean, i.e., the first moment. Higher moments are important for investigating the behavior of the AIC. The variance increases as the number of explanatory variables increases. The skewness and kurtosis imply a favorable accuracy of the normal approximation. An asymptotic expansion of the distribution function of a standardized AIC is also derived.

11.

A flexible semiparametric regression model for bimodal, asymmetric and censored data

Thiago G. Ramires Niel Hens Gauss M. Cordeiro Gilberto A. Paula 《Journal of applied statistics》2018,45(7):1303-1324

In this paper, we propose a new semiparametric heteroscedastic regression model allowing for positive and negative skewness and bimodal shapes using the B-spline basis for nonlinear effects. The proposed distribution is based on the generalized additive models for location, scale and shape framework in order to model any or all parameters of the distribution using parametric linear and/or nonparametric smooth functions of explanatory variables. We motivate the new model by means of Monte Carlo simulations, which show that ignoring the skewness and bimodality of the random errors in semiparametric regression models may introduce biases on the parameter estimates and/or on the estimation of the associated variability measures. An iterative estimation process and some diagnostic methods are investigated. Applications to two real data sets are presented and the method is compared to the usual regression methods.

12.

Understanding Canonical Correlation through the General Linear Model and Principal Components

Keith E. Muller 《The American statistician》2013,67(4):342-354

Canonical correlation has been little used and little understood, even by otherwise sophisticated analysts. An alternative approach to canonical correlation, based on a general linear multivariate model, is presented. Properties of principal component analysis are used to help explain the method. Standard computational methods for full rank canonical correlation, techniques for canonical correlation on component scores, and canonical correlation with less than full rank are discussed. They are seen to be essentially equivalent when the model equation for canonical correlation on component scores is presented. The two approaches to less than full rank situations are equivalent in some senses, but quite different in usefulness, depending on the application. An example dataset is analyzed in detail to help demonstrate the conclusions.

13.

Correction methods for ties in rank correlations

Ilaria L. Amerise Agostino Tarsitano 《Journal of applied statistics》2015,42(12):2584-2596

Equal values are common when rank methods are applied to rounded data or data consisting solely of small integers. A popular technique for resolving ties in rank correlation is the mid-rank method: the mean of the rankings remains unaltered, but the variance is reduced and modified according to the number and location of ties. Although other methods for breaking ties were proposed in the literature as early as 1939, no such procedure has gained such wide acceptance as mid-ranks. This research analyses various techniques for assigning ranks to tied values, with two objectives: (1) to enable the computation of rank correlation coefficients, such as those of Spearman, Kendall and Gini, by using the usual definition applied in the absence of ties, and (2) to determine whether it really makes a difference which of the various techniques is selected and, if so, which technique is most appropriate for a given application.
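The mid-rank method described above can be illustrated with a short sketch: tied values share the average of the ranks they would otherwise occupy, which leaves the mean rank unchanged and reduces the variance. The function names `mid_ranks` and `spearman` and the toy data are assumptions for illustration; the alternative tie-breaking techniques compared in the paper are not shown.

```python
import numpy as np

def mid_ranks(x):
    """Assign ranks 1..n, giving tied values the average of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    sorted_x = x[order]
    ranks = np.empty(len(x), dtype=float)
    i = 0
    while i < len(x):
        j = i
        # Extend j to the end of the run of values tied with sorted_x[i].
        while j + 1 < len(x) and sorted_x[j + 1] == sorted_x[i]:
            j += 1
        # Sorted positions i..j (0-based) share the mean of ranks i+1..j+1.
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of mid-ranks."""
    return float(np.corrcoef(mid_ranks(x), mid_ranks(y))[0, 1])

r = mid_ranks([1, 2, 2, 3])          # mid-ranks: 1, 2.5, 2.5, 4
untied = np.arange(1, 5)             # ranks 1..4 if there were no ties
print(r.mean(), untied.mean())       # the mean rank is the same
print(r.var(), untied.var())         # the variance is smaller with ties
```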

14.

On multivariate Gaussian copulas

Ivan Žežula 《Journal of statistical planning and inference》2009,139(11):3942

Gaussian copulas are a handy tool in many applications. However, when the dimension of the data is large, there are too many parameters to estimate. Use of a special variance structure can facilitate the task. In many cases, especially when different data types are used, Pearson correlation is not a suitable measure of dependence. We study the properties of Kendall and Spearman correlation coefficients, which have better properties and are invariant under monotone transformations, used in place of Pearson coefficients. The Spearman correlation coefficient appears to be more suitable for use in such complex applications.
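As background to the abstract above (a standard result, not quoted from the paper): for a bivariate Gaussian copula with correlation parameter \rho, Kendall's \tau and Spearman's \rho_S have closed forms, which is what makes it possible to estimate the copula parameter from rank correlations.

```latex
\tau = \frac{2}{\pi}\arcsin(\rho),
\qquad
\rho_S = \frac{6}{\pi}\arcsin\!\left(\frac{\rho}{2}\right),
\qquad\text{so that}\qquad
\rho = \sin\!\left(\frac{\pi\tau}{2}\right) = 2\sin\!\left(\frac{\pi\rho_S}{6}\right).
```

Because both rank correlations depend only on the ordering of the observations, these relations are unaffected by monotone transformations of the margins.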

15.

Hierarchical clustering of variables: a comparison among strategies of analysis

Gabriele Soffritti 《Communications in Statistics - Simulation and Computation》2013,42(4):977-999

In this paper some hierarchical methods for identifying groups of variables are illustrated and compared. It is shown that the use of multivariate association measures between two sets of variables can overcome the drawbacks of the usually employed bivariate correlation coefficient, but the resulting methods are generally not monotonic. Thus a new multivariate association measure is proposed, based on the links existing between canonical correlation analysis and principal component analysis, which can be more suitably used for the purpose at hand. The hierarchical method based on the suggested measure is illustrated and compared with other possible solutions by analysing simulated and real data sets. Finally an extension of the suggested method to the more general situation of mixed (qualitative and quantitative) variables is proposed and theoretically discussed.

16.

Interpretation of Canonical Discriminant Functions, Canonical Variates, and Principal Components

Alvin C. Rencher 《The American statistician》2013,67(3):217-225

Canonical discriminant functions are defined here as linear combinations that separate groups of observations, and canonical variates are defined as linear combinations associated with canonical correlations between two sets of variables. In standardized form, the coefficients in either type of canonical function provide information about the joint contribution of the variables to the canonical function. The standardized coefficients can be converted to correlations between the variables and the canonical function. These correlations generally alter the interpretation of the canonical functions. For canonical discriminant functions, the standardized coefficients are compared with the correlations, with partial t and F tests, and with rotated coefficients. For canonical variates, the discussion includes standardized coefficients, correlations between variables and the function, rotation, and redundancy analysis. Various approaches to interpretation of principal components are compared: the choice between the covariance and correlation matrices, the conversion of coefficients to correlations, the rotation of the coefficients, and the effect of special patterns in the covariance and correlation matrices.

17.

Spatial Mallows model averaging for geostatistical models

Jun Liao Guohua Zou Yan Gao 《Revue canadienne de statistique》2019,47(3):336-351

Important progress has been made with model averaging methods over the past decades. For spatial data, however, the idea of model averaging has not been applied well. This article studies model averaging methods for the spatial geostatistical linear model. A spatial Mallows criterion is developed to choose weights for the model averaging estimator. The resulting estimator can achieve asymptotic optimality in terms of L₂ loss. Simulation experiments reveal that our proposed estimator is superior to the model averaging estimator by the Mallows criterion developed for ordinary linear models [Hansen, 2007] and the model selection estimator using the corrected Akaike's information criterion, developed for geostatistical linear models [Hoeting et al., 2006]. The Canadian Journal of Statistics 47: 336–351; 2019 © 2019 Statistical Society of Canada

18.

Model-based segmentation of spatial cylindrical data

Francesco Lagona Marco Picone 《Journal of Statistical Computation and Simulation》2016,86(13):2598-2610

ABSTRACT

A new hidden Markov random field model is proposed for the analysis of cylindrical spatial series, i.e. bivariate spatial series of intensities and angles. It allows us to segment cylindrical spatial series according to a finite number of latent classes that represent the conditional distributions of the data under specific environmental conditions. The model parsimoniously accommodates circular–linear correlation, multimodality, skewness and spatial autocorrelation. A numerically tractable expectation–maximization algorithm is provided to compute parameter estimates by exploiting a mean-field approximation of the complete-data log-likelihood function. These methods are illustrated on a case study of marine currents in the Adriatic Sea.

19.

An extension of Banerjee and Rahim model in economic and economic statistical designs for multivariate quality characteristics under Burr XII distribution

A. A. Heydari M. B. Moghadam F. Eskandari 《Communications in Statistics - Theory and Methods》2017,46(16):7855-7871

The design parameters of the economic and economic statistical designs of control charts depend on the distribution of the process failure mechanism, or shock model. So far, only a small number of failure distributions, such as exponential, gamma, and Weibull with fixed or increasing hazard rates, have been used as a shock model in the economic and economic statistical designs of the Hotelling T² control charts. For both theoretical and practical reasons, the lifetime of the process under study may not follow a distribution with a fixed or increasing hazard rate. A proper alternative for this situation may be the Burr distribution, in which the hazard rate can be fixed, increasing, decreasing, single mode, or even U-shaped. In this research article, economic and economic statistical designs of the Hotelling T² control charts under the Burr XII shock models were proposed, constructed, and compared under two sampling schemes, uniform and non-uniform. The obtained design models were implemented in a numerical example, and a sensitivity analysis was conducted to evaluate the effect of changing the parameters of the shock model distribution on the optimum values of the proposed design models. The results showed that, first, the proposed designs under the non-uniform sampling scheme perform better and, second, the optimum values of the designs are not significantly sensitive to changes in the Burr XII distribution parameters. We showed that the obtained design models also hold for the beta Burr XII shock model.

20.

Approximate and exact distributions of rank tests for balanced incomplete block designs

Mayer Alvo Paul Cabilio 《Communications in Statistics - Theory and Methods》2013,42(12):3073-3121

Judges rank k out of t objects according to m replications of a basic balanced incomplete block design with b blocks. In Alvo and Cabilio (1991), it is shown that the Durbin test, which is the usual test in this situation, can be written in terms of Spearman correlations between the blocks, and using a Kendall correlation, they generated a new statistic for this situation. This Kendall tau based statistic has a richer support than the Durbin statistic, and is at least as efficient. In the present paper, exact and simulation based tables are generated for both statistics, and various approximations to these null distributions are considered and compared.