1.

Goodness-of-fit measures of R² for repeated measures mixed effect models

Honghu Liu Yan Zheng Jie Shen 《Journal of applied statistics》2008,35(10):1081-1092

The linear mixed effects model (LMEM) is efficient for modeling repeated measures longitudinal data. However, little research has been done on developing goodness-of-fit measures that can evaluate such models, particularly measures that can be interpreted in an absolute sense without reference to a null model. This paper proposes three coefficients of determination (R²) as goodness-of-fit measures for LMEMs with repeated measures longitudinal data. Theorems are presented describing the properties of the R² statistics and the relationships between them. A simulation study was conducted to evaluate and compare the R² statistics along with other criteria from the literature. Finally, we applied the proposed R² statistics to real virologic response data from an HIV-patient cohort. We conclude that our proposed R² statistics have advantages over other goodness-of-fit measures in the literature in terms of robustness to sample size, intuitive interpretation, a well-defined range, and no need to specify a null model.

2.

A Coefficient of Determination for Generalized Linear Models

Dabao Zhang 《The American statistician》2017,71(4):310-316

The coefficient of determination, a.k.a. R², is well defined in linear regression models and measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it to generalized linear models, we use the variance function to define the total variation of the dependent variable, as well as the variation remaining after modeling the predictive effects of the independent variables. Unlike other definitions that demand complete specification of the likelihood function, our definition of R² requires only the mean and variance functions, and so is applicable to more general quasi-models. It is consistent with the classical measure of uncertainty using variance, and reduces to the classical definition of the coefficient of determination when linear regression models are considered.

3.

Another Cautionary Note about R²: Its Use in Weighted Least-Squares Regression Analysis

John B. Willett Judith D. Singer 《The American statistician》2013,67(3):236-238

A recent article in this journal presented a variety of expressions for the coefficient of determination (R²) and demonstrated that these expressions were generally not equivalent. The article discussed potential pitfalls in interpreting the R² statistic in ordinary least-squares regression analysis. The current article extends this discussion to the case in which regression models are fit by weighted least squares and points out an additional pitfall that awaits the unwary data analyst. We show that unthinking reliance on the R² statistic can lead to an overly optimistic interpretation of the proportion of variance accounted for in the regression. We propose a modification of the estimator and demonstrate its utility by example.
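
To make the pitfall concrete, the sketch below fits a weighted least-squares line and computes R² two ways: from weighted sums of squares (the transformed metric) and from unweighted sums of squares using the same WLS coefficients (the original metric). The data, the weights 1/x², and all function names are hypothetical, and the original-metric version is one common pseudo-R², not necessarily the exact modified estimator the article proposes.

```python
def wls_line(x, y, w):
    """Fit y = a + b*x by weighted least squares; return (a, b)."""
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return yw - b * xw, b


def r2_weighted_metric(x, y, w):
    """R^2 from weighted sums of squares (the transformed scale)."""
    a, b = wls_line(x, y, w)
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    rss = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    tss = sum(wi * (yi - yw) ** 2 for wi, yi in zip(w, y))
    return 1.0 - rss / tss


def r2_original_metric(x, y, w):
    """R^2 from unweighted sums of squares, using the WLS coefficients."""
    a, b = wls_line(x, y, w)
    ybar = sum(y) / len(y)
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - rss / tss


# Hypothetical data: the spread of y grows with x, so weights 1/x^2 are used.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 1.9, 3.2, 3.7, 5.9, 4.8, 8.6, 6.1]
w = [1.0 / xi ** 2 for xi in x]

print("weighted-metric R^2:", round(r2_weighted_metric(x, y, w), 4))
print("original-metric R^2:", round(r2_original_metric(x, y, w), 4))
```

With equal weights the two quantities coincide; with unequal weights the original-metric value can never exceed the ordinary least-squares R², because OLS minimizes the unweighted residual sum of squares.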

4.

R² Measures Based on Wald and Likelihood Ratio Joint Significance Tests

Lonnie Magee 《The American statistician》2013,67(3):250-253

Two methods are suggested for generating R² measures for a wide class of models. These measures are linked to the R² of the standard linear regression model through Wald and likelihood ratio statistics for testing the joint significance of the explanatory variables. Some currently used R²'s are shown to be special cases of these methods.
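
The link to the linear-model R² can be checked numerically. The sketch below assumes the transformations usually associated with this approach, R² = 1 − exp(−LR/n) for a likelihood ratio statistic and R² = W/(n + W) for a Wald statistic; the abstract does not state them, so treat both formulas, the function names, and the data as assumptions. For a Gaussian linear regression with maximum-likelihood variance estimates, both reduce to the classical R².

```python
import math


def r2_from_lr(lr_stat, n):
    """R^2-type measure from a likelihood ratio statistic (assumed form)."""
    return 1.0 - math.exp(-lr_stat / n)


def r2_from_wald(wald_stat, n):
    """R^2-type measure from a Wald statistic (assumed form)."""
    return wald_stat / (n + wald_stat)


# Hypothetical data for a simple linear regression y = a + b*x + error.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [1.4, 1.1, 2.3, 2.0, 3.4, 2.9, 4.1, 4.6]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
a = ybar - b * xbar

rss1 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))  # fitted model
rss0 = sum((yi - ybar) ** 2 for yi in y)                    # intercept only
r2_ols = 1.0 - rss1 / rss0

# Gaussian likelihood ratio and Wald statistics for H0: b = 0,
# both using the ML variance estimate rss1 / n.
lr_stat = n * math.log(rss0 / rss1)
wald_stat = b ** 2 * sxx / (rss1 / n)

print(r2_ols, r2_from_lr(lr_stat, n), r2_from_wald(wald_stat, n))
```

The appeal of this construction is that LR and Wald statistics exist for many models without residual sums of squares, so the same two functions can be applied wherever a joint significance test is available.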

5.

Pseudo latent models: Goodness of fit measures, residuals, estimation, testing, and simulation

Olaf Hübler 《Statistical Papers》1997,38(3):271-285

Binary response models consider pseudo-R² measures which are not based on residuals, while several concepts of residuals were developed for tests. In this paper the endogenous variable of the latent model corresponding to the binary observable model is substituted by a pseudo variable. Goodness-of-fit measures and tests can then be based on a joint concept of residuals, as for linear models. Different kinds of residuals based on probit ML estimates are employed. The analytical investigations and the simulation results lead to the recommendation to use standardized residuals where there is no difference between observed and generalized residuals. In none of the investigated situations is this estimator far from the best result. While in large samples all considered estimators are very similar, small-sample properties speak in favour of residuals which are modifications of those suggested in the literature. An empirical application demonstrates that it is not necessary to develop new testing procedures for the observable models with dichotomous regressands. Well-known approaches for linear models with continuous endogenous variables, which are implemented in the usual econometric packages, can be used for pseudo latent models. An erratum to this article is available.

6.

Quantifying R² bias in the presence of measurement error

Karl D. Majeske Terri Lynch-Caris Janet Brelin-Fornari 《Journal of applied statistics》2010,37(4):667-677

7.

The Prediction Sum of Squares as a General Measure for Regression Diagnostics

Nguyen T. Quan 《Journal of Business & Economic Statistics》2013,31(4):501-504

Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of the individual observations on the estimates. In this study, the Q² statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, the squared sum of which is generally known as the prediction sum of squares or PRESS. This article compares R² with Q² and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q² statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between Q² and R² values is negligible.
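
A minimal sketch of the comparison, assuming the common definition Q² = 1 − PRESS/TSS (the abstract does not spell it out). PRESS is computed from the leave-one-out residuals via the leverage shortcut e_i/(1 − h_i) and cross-checked by actually refitting without each observation. The data set, including its high-leverage last point, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x; return (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b


def r2_and_q2(x, y):
    """Return (R^2, Q^2), with PRESS from the leverage shortcut."""
    n = len(x)
    a, b = fit_line(x, y)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    tss = sum((yi - ybar) ** 2 for yi in y)
    rss = 0.0
    press = 0.0
    for xi, yi in zip(x, y):
        e = yi - a - b * xi                    # ordinary residual
        h = 1.0 / n + (xi - xbar) ** 2 / sxx   # leverage of this point
        rss += e ** 2
        press += (e / (1.0 - h)) ** 2          # jackknifed residual, squared
    return 1.0 - rss / tss, 1.0 - press / tss


def press_brute_force(x, y):
    """PRESS by refitting the line with each observation left out."""
    total = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - a - b * x[i]) ** 2
    return total


# Hypothetical data; the last point is far from the rest in x and off-trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 12.0]
y = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 2.0]

r2, q2 = r2_and_q2(x, y)
print("R^2:", round(r2, 4), "Q^2:", round(q2, 4))
```

Because each jackknifed residual is at least as large in magnitude as the ordinary residual, Q² never exceeds R², and nothing bounds it below by zero.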

8.

A nonparametric R² test for the presence of relevant variables

Feng Yao Aman Ullah 《Journal of statistical planning and inference》2013

9.

Estimators of the multiple correlation coefficient: Local robustness and confidence intervals

Christophe Croux Catherine Dehon 《Statistical Papers》2003,44(3):315-334

Many robust regression estimators are defined by minimizing a measure of spread of the residuals. An accompanying R²-measure, or multiple correlation coefficient, is then easily obtained. In this paper, local robustness properties of these robust R²-coefficients are investigated. It is also shown how confidence intervals for the population multiple correlation coefficient can be constructed in the case of multivariate normality.

10.

An Asymptotic Characterization of Finite Degree U-statistics With Sample Size-Dependent Kernels: Applications to Nonparametric Estimators and Test Statistics

Feng Yao 《Communications in Statistics - Theory and Methods》2013,42(15):3251-3265

We provide a simple result on the H-decomposition of a U-statistic that allows for easy determination of its magnitude when the statistic’s kernel depends on the sample size n. The result provides a direct and convenient method to characterize the asymptotic magnitude of semiparametric and nonparametric estimators or test statistics involving high-dimensional sums. We illustrate the use of our result in previously studied estimators/test statistics and in a novel nonparametric R² test for overall significance of a nonparametric regression model.

11.

On Rao score and Pearson X² statistics in generalized linear models

Gianfranco Lovison 《Statistical Papers》2005,46(4):555-574

The identity of the Rao score and Pearson X² statistics is well known in the areas where the latter was first introduced: goodness-of-fit in contingency tables and binary responses. We show in this paper that the same identity holds when the two statistics are used for testing goodness-of-fit of Generalized Linear Models. We also highlight the connections that exist between the two statistics when they are used for the comparison of nested models. Finally, we discuss some merits of these unifying results. Work financially supported by cofin. MIUR grants 2000 and 2002.

12.

Misspecified T² tests. I. Location and scale

D.R. Jensen D.E. Ramirez 《Communications in Statistics - Theory and Methods》2013,42(1):249-259

Properties of Hotelling's (1931) T² are studied under model misspecification in the model for a multivariate experiment. Stochastic bounds on T² and further properties of the T² test are studied under misspecified location and scale. The bounds are evaluated numerically in selected cases.
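
For reference, the correctly specified baseline that such studies perturb is the one-sample statistic T² = n (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀), which under multivariate normality satisfies (n − p) T² / (p (n − 1)) ~ F(p, n − p). The sketch below computes it for the bivariate case only; the sample values, the hypothesized mean, and the function names are hypothetical, and nothing here reproduces the paper's stochastic bounds.

```python
def hotelling_t2(sample, mu0):
    """One-sample Hotelling T^2 for bivariate data against the mean mu0."""
    n = len(sample)
    m1 = sum(r[0] for r in sample) / n
    m2 = sum(r[1] for r in sample) / n
    # Unbiased sample covariance matrix entries (divisor n - 1).
    s11 = sum((r[0] - m1) ** 2 for r in sample) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in sample) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in sample) / (n - 1)
    det = s11 * s22 - s12 ** 2
    d1, d2 = m1 - mu0[0], m2 - mu0[1]
    # Quadratic form d' S^{-1} d using the closed-form 2x2 inverse.
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return n * quad


def t2_to_f(t2, n, p=2):
    """Rescale T^2 to the statistic referred to an F(p, n - p) distribution."""
    return (n - p) / (p * (n - 1)) * t2


# Hypothetical bivariate sample and hypothesized mean vector.
sample = [(4.1, 2.0), (5.3, 2.9), (3.8, 1.7), (6.0, 3.4), (5.1, 2.2), (4.6, 2.8)]
mu0 = (4.0, 2.0)

t2 = hotelling_t2(sample, mu0)
print("T^2:", round(t2, 4), "F:", round(t2_to_f(t2, len(sample)), 4))
```

The statistic is unchanged when a coordinate is rescaled together with the hypothesized mean, which is one reason misspecified scale is studied separately from misspecified location.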

13.

The large-sample performance of backwards variable elimination

Peter C. Austin 《Journal of applied statistics》2008,35(12):1355-1370

Prior studies have shown that automated variable selection results in models with substantially inflated estimates of the model R², and that a large proportion of selected variables are truly noise variables. These earlier studies used simulated data sets whose sample sizes were at most 100. We used Monte Carlo simulations to examine the large-sample performance of backwards variable elimination. We found that in large samples, backwards variable elimination resulted in estimates of R² that were at most marginally biased. However, even in large samples, backwards elimination tended to identify the correct regression model in a minority of the simulated data sets.

14.

A Markov-chain-based regression model with random effects for the analysis of ¹⁸O-labelled mass spectra

Qi Zhu Tomasz Burzykowski 《Journal of Statistical Computation and Simulation》2013,83(1):145-157

Enzymatic ¹⁸O-labelling is a useful technique for reducing the influence of between-spectra variability on the results of mass-spectrometry experiments. A difficulty in applying the technique lies in the quantification of the corresponding peptides due to the possibility of incomplete labelling, which may result in biased estimates of the relative peptide abundance. To address the problem, Zhu et al. [A Markov-chain-based heteroscedastic regression model for the analysis of high-resolution enzymatically ¹⁸O-labeled mass spectra, J. Proteome Res. 9(5) (2010), pp. 2669–2677] proposed a Markov-chain-based regression model with heteroscedastic residual variance, which corrects for the possible bias. In this paper, we extend the model by allowing for the estimation of the technical and/or biological variability in the mass spectra data. To this end, we use a mixed-effects version of the model. The performance of the model is evaluated based on the results of an application to real-life mass spectra data and a simulation study.

15.

A New Measure of Fit for Equations With Dichotomous Dependent Variables

Arturo Estrella 《Journal of Business & Economic Statistics》2013,31(2):198-205

The econometrics literature contains many alternative measures of goodness of fit, roughly analogous to R², for use with equations with dichotomous dependent variables. There is, however, no consensus as to the measures' relative merits or about which ones should be reported in empirical work. This article proposes a new measure that possesses several useful properties that the other measures lack. The new measure may be interpreted intuitively in a similar way to R² in the linear regression context.

16.

A study of R² measure under the accelerated failure time models

Priscilla H. Chan Christina D. Chambers 《Communications in Statistics - Simulation and Computation》2018,47(2):380-391

For right-censored data, the accelerated failure time (AFT) model is an alternative to the commonly used proportional hazards regression model. It is a linear model for the (log-transformed) outcome of interest, and is particularly useful for censored outcomes that are not time-to-event, such as laboratory measurements. We provide a general and easily computable definition of the R² measure of explained variation under the AFT model for right-censored data. We study its behavior under different censoring scenarios and under different error distributions; in particular, we also study its robustness when the parametric error distribution is misspecified. Based on Monte Carlo investigation results, we recommend the log-normal distribution as a robust error distribution to be used in practice for the parametric AFT model, when the R² measure is of interest. We apply our methodology to an alcohol consumption during pregnancy data set from Ukraine.

17.

Diagnosis of Multivariate Control Chart Signal Based on Dummy Variable Regression Technique

《Communications in Statistics - Theory and Methods》2013,42(8):1665-1684

It is common to monitor several correlated quality characteristics using Hotelling's T² statistic. However, T² confounds a location shift with a scale shift, and consequently it is often difficult to determine the factors responsible for an out-of-control signal in terms of the process mean vector and/or process covariance matrix. In this paper, we propose a diagnostic procedure called the ‘D-technique’ to detect the nature of the shift. For this purpose, two sets of regression equations, each consisting of the regression of a variable on the remaining variables, are used to characterize the ‘structure’ of the ‘in control’ process and that of the ‘current’ process. To determine the sources responsible for an out-of-control state, it is shown that it is enough to compare these two structures using the dummy variable multiple regression equation. The proposed method is operationally simpler than, and computationally advantageous over, existing diagnostic tools. The technique is illustrated with various examples.

18.

The Target Parameter of Adjusted R-Squared in Fixed-Design Experiments

Hillel Bar-Gera 《The American statistician》2017,71(2):112-119

R-squared (R²) and adjusted R-squared (R²_Adj) are sometimes viewed as statistics detached from any target parameter, and sometimes as estimators for the population multiple correlation. The latter interpretation is meaningful only if the explanatory variables are random. This article proposes an alternative perspective for the case where the x’s are fixed. A new parameter is defined, in a similar fashion to the construction of R², but relying on the true parameters rather than their estimates. (The parameter definition also includes the fixed x values.) This parameter is referred to as the “parametric” coefficient of determination, and denoted by ρ²_*. The proposed ρ²_* remains stable when irrelevant variables are removed (or added), unlike the unadjusted R², which always goes up when variables, either relevant or not, are added to the model (and goes down when they are removed). The value of the traditional R²_Adj may go up or down with added (or removed) variables, either relevant or not. It is shown that the unadjusted R² overestimates ρ²_*, while the traditional R²_Adj underestimates it. It is also shown that for simple linear regression the magnitude of the bias of R²_Adj can be as high as the bias of the unadjusted R² (while their signs are opposite). Asymptotic convergence in probability of R²_Adj to ρ²_* is demonstrated. The effects of model parameters on the bias of R² and R²_Adj are characterized analytically and numerically. An alternative bi-adjusted estimator is presented and evaluated.
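
The traditional adjustment discussed here is the textbook formula R²_Adj = 1 − (1 − R²)(n − 1)/(n − p − 1), with n observations and p predictors besides the intercept. The short sketch below only illustrates that formula; the function name and the numbers are hypothetical, and the parameter ρ²_* and the bi-adjusted estimator from the article are not reproduced.

```python
def adjusted_r2(r2, n, p):
    """Traditional adjusted R^2 for n observations and p predictors
    (the intercept is not counted in p)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Same unadjusted R^2 and sample size, increasing number of predictors:
# the penalty grows with p, so the adjusted value falls.
for p in (1, 3, 6):
    print(p, round(adjusted_r2(0.40, 20, p), 4))
```

Since the penalty factor (n − 1)/(n − p − 1) exceeds one whenever p ≥ 1, the adjusted value lies below the unadjusted one and can be negative for weak fits.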

19.

On the distribution of Hotelling's T² and multiple correlation R² when sampling from a mixture of two normals

M.S. Srivastava 《Communications in Statistics - Theory and Methods》2013,42(13):1481-1497

In this paper the non-null distribution of Hotelling's T² and the null distribution of the multiple correlation R² are derived when the sample is taken from a mixture of two p-component multivariate normal distributions with mean vectors μ₁ and μ₂, respectively, and common covariance matrix Σ. In a special case the non-null distribution of R² is also given, while the general noncentral distribution is given in Awan (1981). These results have been used to study the robustness of the T² and R² tests by Srivastava and Awan (1982) and Awan and Srivastava (1982), respectively.

20.

The Empirical Likelihood Goodness-of-Fit Test for a Regression Model with Randomly Censored Data

Yiping Yang Liugen Xue Weihu Cheng 《Communications in Statistics - Theory and Methods》2013,42(3):424-435

The regression model with randomly censored data has been intensively investigated. In this article, we consider a goodness-of-fit test for this model. Empirical likelihood (EL) tests are constructed. The asymptotic distributions of the test statistic under the null hypothesis and the local alternative hypothesis are given. Simulations are carried out to illustrate the methodology.