Similar literature
20 similar articles found (search time: 31 ms)
1.
2.
This article examines several goodness-of-fit measures in the binary probit regression model. Existing pseudo-R² measures are reviewed, and two modified measures and one new pseudo-R² measure are proposed. For the probit regression model, the different goodness-of-fit measures are compared empirically with the squared sample correlation coefficient of the observed response and the predicted probabilities. As an illustration, the goodness-of-fit measures are applied to a “paid labor force” data set.

3.
A recent article in this journal presented a variety of expressions for the coefficient of determination (R²) and demonstrated that these expressions were generally not equivalent. The article discussed potential pitfalls in interpreting the R² statistic in ordinary least-squares regression analysis. The current article extends this discussion to the case in which regression models are fit by weighted least squares and points out an additional pitfall that awaits the unwary data analyst. We show that unthinking reliance on the R² statistic can lead to an overly optimistic interpretation of the proportion of variance accounted for in the regression. We propose a modification of the estimator and demonstrate its utility by example.

4.
In this paper the non-null distribution of Hotelling's T² and the null distribution of the multiple correlation R² are derived when the sample is taken from a mixture of two p-component multivariate normal distributions with mean vectors μ1 and μ2 respectively and common covariance matrix Σ. In a special case the non-null distribution of R² is also given, while the general noncentral distribution is given in Awan (1981). These results have been used to study the robustness of the T² and R² tests by Srivastava and Awan (1982), and Awan and Srivastava (1982) respectively.

5.
The coefficient of determination, also known as R², is well-defined in linear regression models, and measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it to generalized linear models, we use the variance function to define the total variation of the dependent variable, as well as the remaining variation of the dependent variable after modeling the predictive effects of the independent variables. Unlike other definitions that demand complete specification of the likelihood function, our definition of R² needs only the mean and variance functions, so it is applicable to more general quasi-models. It is consistent with the classical measure of uncertainty using variance, and reduces to the classical definition of the coefficient of determination when linear regression models are considered.

6.
Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of the individual observations on the estimates. In this study, the Q² statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, the sum of squares of which is generally known as the prediction sum of squares or PRESS. This article compares R² with Q² and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q² statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between Q² and R² values is negligible.
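
The comparison described in the abstract above can be sketched as follows. This is an illustration only, not the authors' code: it assumes the common definition Q² = 1 − PRESS/TSS, and it obtains the leave-one-out (jackknifed) residuals from the ordinary residuals and the leverages rather than by refitting. The function name `r2_and_q2` and the toy data are hypothetical.

```python
import numpy as np

def r2_and_q2(X, y):
    """Return (R^2, Q^2) for an OLS fit of y on X; an intercept is added here.

    Assumption: Q^2 = 1 - PRESS / TSS, where PRESS is the sum of squared
    leave-one-out prediction residuals e_i / (1 - h_ii).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    # Leverages h_ii: diagonal of the hat matrix A (A'A)^-1 A'
    h = np.diag(A @ np.linalg.pinv(A.T @ A) @ A.T)
    loo_resid = resid / (1.0 - h)               # leave-one-out residuals
    tss = np.sum((y - y.mean()) ** 2)
    rss = np.sum(resid ** 2)
    press = np.sum(loo_resid ** 2)
    return 1.0 - rss / tss, 1.0 - press / tss

# Toy data (hypothetical): a weak fit in which Q^2 falls below zero.
r2, q2 = r2_and_q2([1, 2, 3, 4], [1, -1, 1, -1])
print(f"R^2 = {r2:.3f}, Q^2 = {q2:.3f}")
```

Because each leave-one-out residual is at least as large in magnitude as the ordinary residual, Q² never exceeds R² under this definition, and it drops below zero whenever PRESS exceeds the total sum of squares.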

7.
Recently, researchers have tried to design the T² chart economically to achieve the minimum possible quality cost; however, when a T² chart is designed, it is important to consider multiple scenarios. This research presents robust economic designs of the T² chart for the case where there is more than one scenario. An illustrative example is used to demonstrate the effect of the model parameters on the optimal designs. The genetic algorithm optimization method is employed to obtain the optimal designs. Simulation studies show that the robust economic designs of the T² chart are more effective than the traditional economic design in practice.

8.
A loss function proposed by Wasan (1970) is well suited as a measure of inaccuracy for an estimator of a scale parameter of a distribution defined on R⁺ = (0, ∞). We refer to this loss function as the K-loss function. A relationship between the K-loss and squared error loss functions is discussed. An optimal estimator for a scale parameter with known coefficient of variation under the K-loss function is also presented.

9.
Summary: L^p-norm weighted depth functions are introduced and the local and global robustness of these weighted L^p-depth functions and their induced multivariate medians are investigated via influence function and finite sample breakdown point. To study the global robustness of depth functions, a notion of finite sample breakdown point is introduced. The weighted L^p-depth functions turn out to have the same low breakdown point as some other popular depth functions. Their influence functions are also unbounded. On the other hand, the weighted L^p-depth induced medians are globally robust with the highest possible breakdown point for any reasonable estimator. The weighted L^p-medians are also locally robust with bounded influence functions for suitable weight functions. Unlike other existing depth functions and multivariate medians, the weighted L^p depth and medians are easy to calculate in high dimensions. The price for this advantage is the lack of affine invariance and equivariance of the weighted L^p depth and medians, respectively. *The author thanks the referees for their very insightful and constructive comments and suggestions which led to corrections and substantial improvements. Supported in part by NSF Grants DMS-0071976 and DMS-0134628.

10.
Fisher's A statistic, often called the adjusted R² statistic, is shown to be a close approximation to the maximum likelihood estimate of the multiple correlation coefficient, ρ², based on the marginal distribution of R². Expansions for the estimate are obtained. The same methods lead to maximum marginal likelihood estimators for the noncentrality parameters for noncentral χ² and F.

11.
We develop a ‘robust’ statistic T²_R, based on Tiku's (1967, 1980) MML (modified maximum likelihood) estimators of location and scale parameters, for testing an assumed mean vector of a symmetric multivariate distribution. We show that T²_R is on the whole considerably more powerful than the prominent Hotelling T² statistic. We also develop a robust statistic T²_D for testing that two multivariate distributions (skew or symmetric) are identical; T²_D seems to be usually more powerful than nonparametric statistics. The only assumption we make is that the marginal distributions are of the type (1/σ_k) f((x − μ_k)/σ_k) and that the means and variances of these marginal distributions exist.

12.
In the past decade, different robust estimators have been proposed by several researchers to improve the ability to detect non-random patterns such as trend, process mean shift, and outliers in multivariate control charts. However, the use of the sample mean vector and the mean square successive difference matrix in the T² control chart is sensitive in detecting process mean shift or trend but less sensitive in detecting outliers. On the other hand, the minimum volume ellipsoid (MVE) estimators in the T² control chart are sensitive in detecting multiple outliers but less sensitive in detecting trend or process mean shift. Therefore, new robust estimators using both merits of the mean square successive difference matrix and the MVE estimators are developed to modify Hotelling's T² control chart. To compare the detection performance among various control charts, a simulation approach for establishing control limits and calculating signal probabilities is provided as well. Our simulation results show that a multivariate control chart using the new robust estimators can achieve a well-balanced sensitivity in detecting the above-mentioned non-random patterns. Finally, three numerical examples further demonstrate the usefulness of our new robust estimators.
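
For orientation, the two classical ingredients named in the abstract above (the sample mean vector paired with either the usual sample covariance or the mean square successive difference matrix) can be sketched as below. This is a hedged illustration, not the authors' new robust estimator, and it omits the MVE estimators entirely. The successive-difference form S_d = V'V / (2(n − 1)), with the rows of V the successive differences, is assumed here as the standard definition; the function name `t2_statistics` is hypothetical.

```python
import numpy as np

def t2_statistics(X, use_successive_differences=False):
    """Hotelling-type T^2 value for each row of X (n observations, p variables).

    T^2_i = (x_i - xbar)' S^-1 (x_i - xbar), where S is either the usual
    sample covariance or (assumed form) the mean square successive
    difference matrix V'V / (2(n - 1)).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    center = X.mean(axis=0)
    if use_successive_differences:
        V = np.diff(X, axis=0)                  # rows are x_{i+1} - x_i
        S = V.T @ V / (2.0 * (n - 1))
    else:
        S = np.cov(X, rowvar=False)             # divisor n - 1
    D = X - center
    # Quadratic form D_i' S^-1 D_i for every row i
    return np.einsum("ij,jk,ik->i", D, np.linalg.inv(S), D)

# Simulated in-control data (hypothetical), compared under both covariance choices.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
print(t2_statistics(X).round(2))
print(t2_statistics(X, use_successive_differences=True).round(2))
```

Control limits are not computed in this sketch; the abstract obtains them by simulation.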

13.
Hotelling’s T² control chart with double warning lines   Cited by: 1 (self-citations: 1, citations by others: 0)
Recent studies have shown that the T² control chart with variable sampling intervals (VSI) and/or variable sample sizes (VSS) detects process shifts faster than the traditional T² chart. This article extends these studies to processes that are monitored with VSI and VSS using double warning lines (T²-DWL). It is assumed that the length of time the process remains in control has an exponential distribution. The properties of the T²-DWL chart are obtained using Markov chains. The results show that the T²-DWL chart is quicker than VSI and/or VSS charts in detecting almost all shifts in the process mean.

14.
To assess the quality of the fit in a multiple linear regression, the coefficient of determination or R² is a very simple tool, yet the most used by practitioners. Indeed, it is reported in most statistical analyses, and although it is not recommended as a final model selection tool, it provides an indication of the suitability of the chosen explanatory variables in predicting the response. In the classical setting, it is well known that the least-squares fit and coefficient of determination can be arbitrary and/or misleading in the presence of a single outlier. In many applied settings, the assumption of normality of the errors and the absence of outliers are difficult to establish. In these cases, robust procedures for estimation and inference in linear regression are available and provide a suitable alternative.

15.
The present paper studies the normality of five transformations suggested in the literature to normalize the sample correlation coefficient. The parent populations are the bivariate t and the bivariate χ². The results in the previous work of Subrahmaniam and Gajjar are exploited to assess their performance. The density estimation procedure of Tarter and Kronmal is used to provide empirical support to the asymptotic results.

16.
Economic statistical designs aim at minimizing the cost of process monitoring when a specific scenario or a set of estimated process and cost parameters is given. In practice, however, the process may be affected by more than one scenario, which may lead to severe cost penalties if the wrong design is used. Here, we investigate the robust economic statistical design (RESD) of the T² chart in an attempt to reduce these cost penalties when there are multiple scenarios. Our method is to employ the genetic algorithm (GA) optimization method to minimize the total expected monitoring cost across all distinct scenarios. We illustrate the effectiveness of the method using two numerical examples. Simulation studies indicate that robust economic statistical designs should be encouraged in practice.

17.
It is well known that Yates' algorithm can be used to estimate the effects in a factorial design. We develop a modification of this algorithm, which we call the modified Yates' algorithm, together with its inverse. We show that the intermediate steps in our algorithm have a direct interpretation as estimated level-specific mean values and effects. We also show how Yates' or our modified algorithm can be used to construct the blocks in a 2^k factorial design and to generate the layout sheet of a 2^(k−p) fractional factorial design and the confounding pattern in such a design. In a final example we put together all these methods by generating and analysing a 2^(6−2) design with 2 blocks.
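
As background for the abstract above, the classical Yates' algorithm (not the authors' modified version, which the abstract does not specify) can be sketched as follows. The function name `yates` and the example responses are hypothetical; the responses are assumed to be given in standard order: (1), a, b, ab, c, ac, and so on.

```python
def yates(responses):
    """Classical Yates' algorithm for a full 2^k factorial design.

    `responses` must be in standard order: (1), a, b, ab, c, ac, bc, abc, ...
    Returns (grand mean, {effect label: estimated effect}).
    """
    n = len(responses)
    k = n.bit_length() - 1
    if n < 2 or 2 ** k != n:
        raise ValueError("number of responses must be a power of two (>= 2)")
    col = [float(r) for r in responses]
    for _ in range(k):
        # Each pass: sums of adjacent pairs on top, differences (second - first) below
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    mean = col[0] / n
    effects = {}
    for j in range(1, n):
        # Bit i of the position j tells whether factor i (A, B, C, ...) is in the effect
        label = "".join(chr(ord("A") + i) for i in range(k) if (j >> i) & 1)
        effects[label] = col[j] / (n / 2)       # contrast divided by 2^(k-1)
    return mean, effects

# 2^2 example with responses (1)=20, a=30, b=40, ab=52
mean, effects = yates([20, 30, 40, 52])
print(mean, effects)   # 35.5 {'A': 11.0, 'B': 21.0, 'AB': 1.0}
```

The A effect can be checked by hand as the average response at the high level of A minus the average at the low level: (30 + 52)/2 − (20 + 40)/2 = 11.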

18.
R-squared (R²) and adjusted R-squared (R²_Adj) are sometimes viewed as statistics detached from any target parameter, and sometimes as estimators for the population multiple correlation. The latter interpretation is meaningful only if the explanatory variables are random. This article proposes an alternative perspective for the case where the x’s are fixed. A new parameter is defined, in a similar fashion to the construction of R², but relying on the true parameters rather than their estimates. (The parameter definition includes also the fixed x values.) This parameter is referred to as the “parametric” coefficient of determination, and denoted by ρ²*. The proposed ρ²* remains stable when irrelevant variables are removed (or added), unlike the unadjusted R², which always goes up when variables, either relevant or not, are added to the model (and goes down when they are removed). The value of the traditional R²_Adj may go up or down with added (or removed) variables, either relevant or not. It is shown that the unadjusted R² overestimates ρ²*, while the traditional R²_Adj underestimates it. It is also shown that for simple linear regression the magnitude of the bias of R²_Adj can be as high as the bias of the unadjusted R² (while their signs are opposite). Asymptotic convergence in probability of R²_Adj to ρ²* is demonstrated. The effects of model parameters on the bias of R² and R²_Adj are characterized analytically and numerically. An alternative bi-adjusted estimator is presented and evaluated.
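
The behavior of the two traditional statistics described in the abstract above can be illustrated with a small simulation. This sketch covers only the unadjusted R² and the usual adjusted form, assumed here to be 1 − (1 − R²)(n − 1)/(n − p − 1); it does not implement the parametric coefficient or the bi-adjusted estimator, whose definitions the abstract does not give. The function name and the simulated data are hypothetical.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit with intercept.

    X has shape (n, p); the adjusted form assumed here is
    1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])        # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, r2_adj

# Simulated data (hypothetical): one relevant regressor, one irrelevant one.
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
irrelevant = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

r2_small, adj_small = r2_and_adjusted(x[:, None], y)
r2_big, adj_big = r2_and_adjusted(np.column_stack([x, irrelevant]), y)
print(f"x only:        R^2 = {r2_small:.4f}, adjusted = {adj_small:.4f}")
print(f"x + irrelevant: R^2 = {r2_big:.4f}, adjusted = {adj_big:.4f}")
```

Adding the irrelevant column cannot lower the unadjusted R², because the larger model's residual sum of squares is never greater; the adjusted value may move in either direction.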

19.
The coefficient of determination, also known as R², is a common measure in regression analysis. Many scientists use the R² and the adjusted R² on a regular basis. In most cases, researchers treat the coefficient of determination as an index of ‘usefulness’ or ‘goodness of fit,’ and in some cases, they even treat it as a model selection tool. When the data are incomplete, most researchers and common statistical software will use complete case analysis in order to estimate the R², a procedure that might lead to biased results. In this paper, I introduce the use of multiple imputation for the estimation of R² and adjusted R² in incomplete data sets. I illustrate my methodology using a biomedical example.

20.
We provide a simple result on the H-decomposition of a U-statistic that allows for easy determination of its magnitude when the statistic’s kernel depends on the sample size n. The result provides a direct and convenient method to characterize the asymptotic magnitude of semiparametric and nonparametric estimators or test statistics involving high dimensional sums. We illustrate the use of our result in previously studied estimators/test statistics and in a novel nonparametric R² test for overall significance of a nonparametric regression model.
