Similar literature
Found 20 similar articles (search time: 31 ms)
1.
ABSTRACT

The analysis of variance of cross-classified (categorical) data (CATANOVA) is a technique designed to identify the variation between treatments of interest to the researcher. There are well-established links between CATANOVA and the Goodman and Kruskal tau statistic as well as the Light and Margolin R² for the purposes of the graphical identification of this variation.

The aim of this article is to present a partition of the numerator of the tau statistic, or equivalently, the BSS measure in the CATANOVA framework, into location, dispersion, and higher order components. Even if a CATANOVA identifies an overall lack of variation, by considering this partition and calculations derived from it, it is possible to identify hidden, but statistically significant, sources of variation.
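As a rough illustration of the quantity being partitioned, the sketch below computes the Goodman and Kruskal tau and its numerator for a two-way table, treating rows as treatments and columns as response categories. The function name and the example tables are hypothetical, every row total is assumed to be nonzero, and the partition into location, dispersion, and higher order components described in the abstract is not implemented here.

```python
import numpy as np

def goodman_kruskal_tau(table):
    """Return (tau, numerator) for predicting the column variable from the rows."""
    counts = np.asarray(table, dtype=float)
    p = counts / counts.sum()      # joint proportions p_ij
    p_row = p.sum(axis=1)          # row (treatment) margins p_i.
    p_col = p.sum(axis=0)          # column (response) margins p_.j
    # Numerator: gain in predictability of the response once the row is known.
    numerator = (p**2 / p_row[:, None]).sum() - (p_col**2).sum()
    # Denominator: overall variation of the response margins.
    denominator = 1.0 - (p_col**2).sum()
    return numerator / denominator, numerator

# Rows proportional to each other: treatments carry no information.
tau_indep, num_indep = goodman_kruskal_tau([[10, 20], [20, 40]])
# Each treatment determines the response exactly.
tau_perfect, num_perfect = goodman_kruskal_tau([[10, 0], [0, 10]])
print(tau_indep, tau_perfect)
```

The two example tables bracket the range of the statistic: tau is zero when the rows are proportional and one when each row falls in a single response category.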

2.
Variability explained by covariates or explained variance is a well‐known concept in assessing the importance of covariates for dependent outcomes. In this paper we study R² statistics of explained variance pertinent to longitudinal data under linear mixed‐effect models, where the R² statistics are computed at two different levels to measure, respectively, within‐ and between‐subject variabilities explained by the covariates. By deriving the limits of R² statistics, we find that the interpretation of explained variance for the existing R² statistics is clear only in the case where the covariance matrix of the outcome vector is compound symmetric. Two new R² statistics are proposed to address the effect of time‐dependent covariate means. In the general case where the outcome covariance matrix is not compound symmetric, we introduce the concept of compound symmetry projection and use it to define level‐one and level‐two R² statistics. Numerical results are provided to support the theoretical findings and demonstrate the performance of the R² statistics. The Canadian Journal of Statistics 38: 352–368; 2010 © 2010 Statistical Society of Canada

3.
The linear mixed effects model (LMEM) is efficient in modeling repeated measures longitudinal data. However, little research has been done in developing goodness-of-fit measures that can evaluate the models, particularly those that can be interpreted in an absolute sense without referencing a null model. This paper proposes three coefficients of determination (R²) as goodness-of-fit measures for LMEM with repeated measures longitudinal data. Theorems are presented describing the properties of R² and relationships between the R² statistics. A simulation study was conducted to evaluate and compare the R² statistics along with other criteria from the literature. Finally, we applied the proposed R² statistics to real virologic response data from an HIV-patient cohort. We conclude that our proposed R² statistics have more advantages than other goodness-of-fit measures in the literature, in terms of robustness to sample size, intuitive interpretation, a well-defined range, and no need to specify a null model.

4.
5.
This note discusses the effect of autocorrelated disturbances, when they are not modelled, on the statistics used in drawing inferences in the multiple linear regression model. It derives biases for the F and R² statistics and evaluates them numerically for an example. The note concludes with a few brief reflections for empirical research on the causes, detection and treatment of autocorrelation.

6.
The coefficient of determination (R²) is perhaps the single most extensively used measure of goodness of fit for regression models. It is also widely misused. The primary source of the problem is that except for linear models with an intercept term, the several alternative R² statistics are not generally equivalent. This article discusses various considerations and potential pitfalls in using the R²'s. Specific points are exemplified by means of empirical data. A new resistant statistic is also introduced.
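The non-equivalence mentioned above can be seen in a small numerical sketch. The three formulas below are common textbook definitions chosen for illustration, not necessarily the set examined in the article, and the simulated data, seed, and function name are assumptions. With an intercept the three agree; when the same data are fitted through the origin they diverge.

```python
import numpy as np

def r2_variants(X, y):
    """Compute three common R^2 formulas for a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    sst = ((y - y.mean())**2).sum()   # total sum of squares about the mean
    return {
        "one_minus_sse_over_sst": 1.0 - (resid**2).sum() / sst,
        "ssr_over_sst": ((fitted - y.mean())**2).sum() / sst,
        "squared_correlation": np.corrcoef(y, fitted)[0, 1]**2,
    }

# Simulated data with a clearly nonzero intercept (illustrative values only).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 3.0, size=50)
y = 5.0 + 0.5 * x + rng.normal(scale=0.5, size=50)

with_intercept = r2_variants(np.column_stack([np.ones_like(x), x]), y)
without_intercept = r2_variants(x[:, None], y)

print(with_intercept)
print(without_intercept)
```

In the through-the-origin fit the first formula can even turn negative, which is one reason a reported R² is hard to interpret unless the definition is stated.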

7.
Comparison of two-way contingency tables using measures of association is considered. Multiple comparison procedures for dependent tables are proposed, enabling us to compare tables that are faces from larger multi-dimensional tables. An example is given to illustrate the analysis of two 2 × 2 tables formed from a 2⁴ table.

8.
Two methods are suggested for generating R² measures for a wide class of models. These measures are linked to the R² of the standard linear regression model through Wald and likelihood ratio statistics for testing the joint significance of the explanatory variables. Some currently used R²'s are shown to be special cases of these methods.

9.
R-squared (R²) and adjusted R-squared (R²adj) are sometimes viewed as statistics detached from any target parameter, and sometimes as estimators for the population multiple correlation. The latter interpretation is meaningful only if the explanatory variables are random. This article proposes an alternative perspective for the case where the x’s are fixed. A new parameter is defined, in a similar fashion to the construction of R², but relying on the true parameters rather than their estimates. (The parameter definition also includes the fixed x values.) This parameter is referred to as the “parametric” coefficient of determination, and denoted by ρ²*. The proposed ρ²* remains stable when irrelevant variables are removed (or added), unlike the unadjusted R², which always goes up when variables, either relevant or not, are added to the model (and goes down when they are removed). The value of the traditional R²adj may go up or down with added (or removed) variables, either relevant or not. It is shown that the unadjusted R² overestimates ρ²*, while the traditional R²adj underestimates it. It is also shown that for simple linear regression the magnitude of the bias of R²adj can be as high as the bias of the unadjusted R² (while their signs are opposite). Asymptotic convergence in probability of R²adj to ρ²* is demonstrated. The effects of model parameters on the bias of R² and R²adj are characterized analytically and numerically. An alternative bi-adjusted estimator is presented and evaluated.

10.
The coefficient of determination, known also as the R², is a common measure in regression analysis. Many scientists use the R² and the adjusted R² on a regular basis. In most cases, the researchers treat the coefficient of determination as an index of ‘usefulness’ or ‘goodness of fit,’ and in some cases, they even treat it as a model selection tool. In cases in which the data is incomplete, most researchers and common statistical software will use complete case analysis in order to estimate the R², a procedure that might lead to biased results. In this paper, I introduce the use of multiple imputation for the estimation of R² and adjusted R² in incomplete data sets. I illustrate my methodology using a biomedical example.

11.
We develop a ‘robust’ statistic T²_R, based on Tiku's (1967, 1980) MML (modified maximum likelihood) estimators of location and scale parameters, for testing an assumed mean vector of a symmetric multivariate distribution. We show that T²_R is on the whole considerably more powerful than the prominent Hotelling T² statistic. We also develop a robust statistic T²_D for testing that two multivariate distributions (skew or symmetric) are identical; T²_D seems to be usually more powerful than nonparametric statistics. The only assumption we make is that the marginal distributions are of the type (1/σ_k) f((x - μ_k)/σ_k) and the means and variances of these marginal distributions exist.

12.
The identity of the Rao score and Pearson X² statistics is well known in the areas where the latter was first introduced: goodness-of-fit in contingency tables and binary responses. We show in this paper that the same identity holds when the two statistics are used for testing goodness-of-fit of Generalized Linear Models. We also highlight the connections that exist between the two statistics when they are used for the comparison of nested models. Finally, we discuss some merits of these unifying results. Work financially supported by cofin. MIUR grants 2000 and 2002.

13.
In genetic studies of complex diseases, multiple measures of related phenotypes are often collected. Jointly analyzing these phenotypes may improve statistical power to detect sets of rare variants affecting multiple traits. In this work, we consider association testing between a set of rare variants and multiple phenotypes in family‐based designs. We use a mixed linear model to express the correlations among the phenotypes and between related individuals. Given the many sources of correlations in this situation, deriving an appropriate test statistic is not straightforward. We derive a vector of score statistics, whose joint distribution is approximated using a copula. This allows us to have closed‐form expressions for the p‐values of several test statistics. A comprehensive simulation study and an application to Genetic Analysis Workshop 18 data highlight the gains associated with joint testing over univariate approaches, especially in the presence of pleiotropy or highly correlated phenotypes. The Canadian Journal of Statistics 47: 90–107; 2019 © 2018 Statistical Society of Canada

14.
Consider developing a regression model in a context where substantive theory is weak. To focus on an extreme case, suppose that in fact there is no relationship between the dependent variable and the explanatory variables. Even so, if there are many explanatory variables, the R² will be high. If explanatory variables with small t statistics are dropped and the equation refitted, the R² will stay high and the overall F will become highly significant. This is demonstrated by simulation and by asymptotic calculation.
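The screening effect described above can be reproduced with a short simulation sketch. The sample size, number of regressors, seed, and the |t| > 1.15 cutoff are illustrative assumptions rather than values taken from the article, and the helper name is hypothetical.

```python
import numpy as np

def ols_summary(X, y):
    """Fit OLS with an intercept; return R^2, the overall F, and slope t statistics."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sse = (resid**2).sum()
    sst = ((y - y.mean())**2).sum()
    r2 = 1.0 - sse / sst
    # Overall F statistic for the joint significance of the p slopes.
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    # Standard errors from the usual sigma^2 (X'X)^{-1} formula.
    sigma2 = sse / (n - p - 1)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(design.T @ design)))
    return r2, f_stat, beta[1:] / se[1:]

# Pure noise: y is unrelated to every column of X by construction.
rng = np.random.default_rng(1)
n, p = 100, 50
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

r2_full, f_full, t_full = ols_summary(X, y)

# Screening step: keep only regressors with a "large" t statistic, then refit.
keep = np.abs(t_full) > 1.15   # hypothetical cutoff
r2_kept, f_kept, _ = ols_summary(X[:, keep], y)

print(f"full model:     R2={r2_full:.2f}  F={f_full:.2f}  regressors={p}")
print(f"screened model: R2={r2_kept:.2f}  F={f_kept:.2f}  regressors={int(keep.sum())}")
```

Because the refit reuses the data that drove the selection, its F statistic no longer follows the nominal F distribution, which is the mechanism behind the apparent significance.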

15.
Many robust regression estimators are defined by minimizing a measure of spread of the residuals. An accompanying R² measure, or multiple correlation coefficient, is then easily obtained. In this paper, local robustness properties of these robust R² coefficients are investigated. It is also shown how confidence intervals for the population multiple correlation coefficient can be constructed in the case of multivariate normality.

16.
Testing for the difference in the strength of bivariate association in two independent contingency tables is an important issue that finds applications in various disciplines. Currently, many of the commonly used tests are based on single-index measures of association. More specifically, one obtains single-index measurements of association from two tables and compares them based on asymptotic theory. Although they are usually easy to understand and use, often much of the information contained in the data is lost with single-index measures. Accordingly, they fail to fully capture the association in the data. To remedy this shortcoming, we introduce a new summary statistic measuring various types of association in a contingency table. Based on this new summary statistic, we propose a likelihood ratio test comparing the strength of association in two independent contingency tables. The proposed test examines the stochastic order between summary statistics. We derive its asymptotic null distribution and demonstrate that the least favorable distributions are chi-bar distributions. We numerically compare the power of the proposed test to that of the tests based on single-index measures. Finally, we provide two examples illustrating the new summary statistics and the related tests.

17.
Fisher's A statistic, often called the adjusted R² statistic, is shown to be a close approximation to the maximum likelihood estimate of the multiple correlation coefficient, ρ², based on the marginal distribution of R². Expansions for the estimate are obtained. The same methods lead to maximum marginal likelihood estimators for the noncentrality parameters for noncentral χ² and F.

18.
19.
A recent article in this journal presented a variety of expressions for the coefficient of determination (R²) and demonstrated that these expressions were generally not equivalent. The article discussed potential pitfalls in interpreting the R² statistic in ordinary least-squares regression analysis. The current article extends this discussion to the case in which regression models are fit by weighted least squares and points out an additional pitfall that awaits the unwary data analyst. We show that unthinking reliance on the R² statistic can lead to an overly optimistic interpretation of the proportion of variance accounted for in the regression. We propose a modification of the estimator and demonstrate its utility by example.

20.
In an informal way, some dilemmas in connection with hypothesis testing in contingency tables are discussed. The body of the article concerns the numerical evaluation of Cochran's Rule about the minimum expected value in r × c contingency tables with fixed margins when testing independence with Pearson's X² statistic using the χ² distribution.
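A minimal sketch of the check being evaluated: the code below computes expected counts under independence, Pearson's X², and a flag for one commonly quoted form of Cochran's Rule (no expected count below 1 and at most 20% of cells below 5). That wording of the rule is an assumption and may differ from the version studied in the article; the function name and example tables are hypothetical.

```python
import numpy as np

def pearson_x2_with_cochran_check(table):
    """Return (X^2, degrees of freedom, rule_ok) for an r x c table of counts."""
    observed = np.asarray(table, dtype=float)
    row = observed.sum(axis=1, keepdims=True)   # r x 1 row totals
    col = observed.sum(axis=0, keepdims=True)   # 1 x c column totals
    expected = row @ col / observed.sum()       # expected counts under independence
    x2 = ((observed - expected)**2 / expected).sum()
    dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
    # Assumed form of the rule: all expected >= 1, at most 20% of cells below 5.
    rule_ok = bool(expected.min() >= 1.0 and (expected < 5.0).mean() <= 0.20)
    return x2, dof, rule_ok

x2_dense, dof_dense, ok_dense = pearson_x2_with_cochran_check([[10, 20], [30, 40]])
x2_sparse, dof_sparse, ok_sparse = pearson_x2_with_cochran_check([[1, 0], [0, 9]])
print(x2_dense, dof_dense, ok_dense)
print(x2_sparse, dof_sparse, ok_sparse)
```

When the flag is false, the χ² approximation to the distribution of X² is usually treated with caution, which is the situation the numerical evaluation in the article addresses.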

