A New Approach to the Estimation of Inter-Variable Correlation
Authors:Marc Sobel  Bud Mishra
Institution:1. Department of Statistics, Temple University, Philadelphia, Pennsylvania, USA (marc.sobel@temple.edu); 3. Bioinformatics Unit, Courant Institute, New York, New York, USA
Abstract:The use of different measures of similarity between observed vectors for the purposes of classifying or clustering them has been expanding dramatically in recent years. One result of this expansion has been the use of many new similarity measures, designed for the purpose of satisfying various criteria. A noteworthy application involves estimating the relationships between genes using microarray experimental data. We consider the class of ‘correlation-type’ similarity measures. The use of these new measures of similarity suggests that the whole problem needs to be formulated in statistical terms to clarify their relative benefits. Pursuant to this need, we define, for each given observed vector, a baseline representing the ‘true’ value common to each of the component observations. These ‘true’ values are taken to be parameters. We define the ‘true correlation’ between each pair of observed vectors as the average (over the distribution of the observations for given baseline parameters) of Pearson's correlation with sample means replaced by the corresponding baseline parameters. Estimators of this true correlation are assessed using their mean squared error (MSE). Proper Bayes estimators of this true correlation, being based on the predictive posterior distribution of the data, are both difficult to calculate and analyze and highly non-robust. By contrast, empirical Bayes estimators are: (i) close to their Bayesian counterparts; (ii) easy to analyze; and (iii) strongly robust. For these reasons, we employ empirical Bayes estimators of correlation in place of their Bayesian counterparts. We show how to construct two different kinds of simultaneous Bayes correlation estimators: the first assumes no a priori correlation between baseline parameters; the second assumes a common unknown correlation between them. Estimators of the latter type frequently have significantly smaller MSE than those of the former type, which, in turn, frequently have significantly smaller MSE than their Pearson estimator counterparts. For purposes of illustrating our results, we examine the problem of inferring the relationships between gene expression level vectors, in the context of observing microarray experimental data.
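The abstract describes the quantity inside the ‘true correlation’ as Pearson's correlation with the sample means replaced by baseline parameters. The following is a minimal sketch of that plug-in form only, assuming the baselines are supplied as plain numbers. The function names `baseline_correlation` and `shrunken_mean`, the shrinkage `weight`, and the toy data are hypothetical illustrations; they are not the empirical Bayes estimators constructed in the paper, whose details the abstract does not give.

```python
import math


def baseline_correlation(x, y, mu_x, mu_y):
    """Pearson-type correlation with the sample means replaced by
    supplied baseline values mu_x and mu_y (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    sxy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    sxx = sum((a - mu_x) ** 2 for a in x)
    syy = sum((b - mu_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined: zero spread about the baseline")
    return sxy / math.sqrt(sxx * syy)


def shrunken_mean(values, weight):
    """Illustrative plug-in baseline (an assumption, not the paper's rule):
    shrink the sample mean toward 0. weight=1 keeps the sample mean,
    weight=0 uses a baseline of 0."""
    return weight * sum(values) / len(values)


# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

# Plugging in the sample means recovers the ordinary Pearson correlation.
pearson = baseline_correlation(x, y, shrunken_mean(x, 1.0), shrunken_mean(y, 1.0))

# Plugging in a baseline of 0 for both vectors gives a different value.
zero_baseline = baseline_correlation(x, y, shrunken_mean(x, 0.0), shrunken_mean(y, 0.0))

print(pearson, zero_baseline)
```

With the sample means as baselines the function reduces to ordinary Pearson correlation, so the choice of baseline estimate is the only thing that distinguishes the two values printed here.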
Keywords:Admissibility  Bayes estimation  Bioinformatics  Correlation  Empirical Bayes