Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of the individual observations on the estimates. In this study, the Q² statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, the squared sum of which is generally known as the prediction sum of squares or PRESS. This article compares R² with Q² and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q² statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between Q² and R² values is negligible.
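
As a rough illustration of the comparison this abstract describes, the sketch below computes R² and Q² for a one-regressor least-squares fit. It assumes the common definition Q² = 1 − PRESS/SST and obtains PRESS by refitting the line with each observation left out in turn. The function names and the toy data are hypothetical and are not taken from the article.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r2_and_q2(xs, ys):
    """Return (R2, Q2), with Q2 = 1 - PRESS/SST (assumed definition)."""
    n = len(xs)
    my = sum(ys) / n
    sst = sum((y - my) ** 2 for y in ys)

    # Ordinary residual sum of squares from the full-sample fit.
    a, b = fit_line(xs, ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

    # PRESS: predict each point from a fit that excludes that point.
    press = 0.0
    for i in range(n):
        a_i, b_i = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a_i + b_i * xs[i])) ** 2

    return 1 - rss / sst, 1 - press / sst


# Hypothetical toy data: four points on y = x plus one high-leverage outlier.
with_outlier = r2_and_q2([1, 2, 3, 4, 10], [1, 2, 3, 4, 0])
without_outlier = r2_and_q2([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
print("with outlier    (R2, Q2):", with_outlier)
print("without outlier (R2, Q2):", without_outlier)
```

In this toy case the single outlier pushes Q² below zero while R² stays between 0 and 1; with the outlier replaced by a point on the line, the two statistics coincide.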

2.
3.
We provide a simple result on the H-decomposition of a U-statistic that allows for easy determination of its magnitude when the statistic's kernel depends on the sample size n. The result provides a direct and convenient method to characterize the asymptotic magnitude of semiparametric and nonparametric estimators or test statistics involving high-dimensional sums. We illustrate the use of our result in previously studied estimators/test statistics and in a novel nonparametric R² test for overall significance of a nonparametric regression model.

4.
This article examines several goodness-of-fit measures in the binary probit regression model. Existing pseudo-R² measures are reviewed, and two modified measures and one new pseudo-R² measure are proposed. For the probit regression model, empirical comparisons are made for different goodness-of-fit measures with the squared sample correlation coefficient of the observed response and the predicted probabilities. As an illustration, the goodness-of-fit measures are applied to a "paid labor force" data set.
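
To make the benchmark in this abstract concrete, the sketch below computes the squared sample correlation between a binary response and predicted probabilities, alongside McFadden's likelihood-ratio pseudo-R² as one example of an existing measure. The abstract does not say which measures the article reviews, so the choice of McFadden's measure is an assumption, and the predicted probabilities are made-up numbers rather than output from a fitted probit model.

```python
import math


def squared_correlation(y, p):
    """Squared sample correlation of observed 0/1 responses and predicted probabilities.

    Undefined (division by zero) if either y or p is constant.
    """
    n = len(y)
    my = sum(y) / n
    mp = sum(p) / n
    syy = sum((a - my) ** 2 for a in y)
    spp = sum((b - mp) ** 2 for b in p)
    syp = sum((a - my) * (b - mp) for a, b in zip(y, p))
    return syp * syp / (syy * spp)


def bernoulli_log_likelihood(y, p):
    """Log-likelihood of 0/1 responses y under success probabilities p."""
    return sum(math.log(b) if a == 1 else math.log(1 - b) for a, b in zip(y, p))


def mcfadden_pseudo_r2(y, p):
    """McFadden's pseudo-R2: 1 - lnL(model) / lnL(intercept-only model)."""
    p0 = sum(y) / len(y)  # intercept-only model predicts the sample proportion
    ll_null = bernoulli_log_likelihood(y, [p0] * len(y))
    return 1 - bernoulli_log_likelihood(y, p) / ll_null


# Hypothetical responses and predicted probabilities (not from a real probit fit).
y = [0, 0, 1, 1]
p_hat = [0.2, 0.3, 0.7, 0.8]
print("squared correlation:", squared_correlation(y, p_hat))
print("McFadden pseudo-R2: ", mcfadden_pseudo_r2(y, p_hat))
```

The two numbers generally differ for the same fit, which is why empirical comparisons of the kind described in the abstract are informative.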

5.
When a process is monitored with a T² control chart in a Phase II setting, the MYT decomposition is a valuable diagnostic tool for interpreting signals in terms of the process variables. The decomposition splits a signaling T² statistic into independent components that can be associated with either individual variables or groups of variables. Since these components are T² statistics with known distributions, they can be used to determine which of the process variable(s) contribute to the signal. However, this procedure cannot be applied directly to Phase I since the distributions of the individual components are unknown. In this article, we develop the MYT decomposition procedure for a Phase I operation, when monitoring a random sample of individual observations and identifying outliers. We use a relationship between the T² statistic in Phase I with the corresponding T² statistic resulting when an observation is omitted from this sample to derive the distributions of these components and demonstrate the Phase I application of the MYT decomposition.

6.
The coefficient of determination (R²) is perhaps the single most extensively used measure of goodness of fit for regression models. It is also widely misused. The primary source of the problem is that except for linear models with an intercept term, the several alternative R² statistics are not generally equivalent. This article discusses various considerations and potential pitfalls in using the R² statistics. Specific points are exemplified by means of empirical data. A new resistant statistic is also introduced.
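
The non-equivalence noted in this abstract is easy to see for a regression through the origin. The sketch below fits y = b·x without an intercept and evaluates three commonly used R² formulas; the selection of these three formulas, the function name, and the toy data are assumptions made for illustration and are not taken from the article.

```python
def r2_variants_no_intercept(xs, ys):
    """Fit y = b*x (no intercept) and return three alternative R2 values."""
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    fitted = [b * x for x in xs]
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))

    n = len(ys)
    my = sum(ys) / n
    mf = sum(fitted) / n
    sst_centered = sum((y - my) ** 2 for y in ys)  # variation about the mean
    sst_raw = sum(y * y for y in ys)               # variation about zero
    cov = sum((y - my) * (f - mf) for y, f in zip(ys, fitted))
    var_f = sum((f - mf) ** 2 for f in fitted)

    return {
        "centered": 1 - rss / sst_centered,        # 1 - RSS / sum (y - ybar)^2
        "uncentered": 1 - rss / sst_raw,           # 1 - RSS / sum y^2
        "squared_corr": cov * cov / (sst_centered * var_f),  # corr(y, fitted)^2
    }


# Hypothetical data with a perfectly linear but decreasing relationship.
for name, value in r2_variants_no_intercept([1, 2, 3], [3, 2, 1]).items():
    print(name, value)
```

For these data the three formulas give a negative value, a value near one half, and exactly one, respectively, even though they agree for a least-squares fit that includes an intercept.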

7.
8.
It is common to monitor several correlated quality characteristics using Hotelling's T² statistic. However, T² confounds a location shift with a scale shift, and consequently it is often difficult to determine the factors responsible for an out-of-control signal in terms of the process mean vector and/or process covariance matrix. In this paper, we propose a diagnostic procedure called the 'D-technique' to detect the nature of the shift. For this purpose, two sets of regression equations, each consisting of the regression of a variable on the remaining variables, are used to characterize the 'structure' of the 'in control' process and that of the 'current' process. To determine the sources responsible for an out-of-control state, it is shown that it is enough to compare these two structures using the dummy variable multiple regression equation. The proposed method is operationally simpler than existing diagnostic tools and has computational advantages over them. The technique is illustrated with various examples.

9.
Many robust regression estimators are defined by minimizing a measure of spread of the residuals. An accompanying R²-measure, or multiple correlation coefficient, is then easily obtained. In this paper, local robustness properties of these robust R²-coefficients are investigated. It is also shown how confidence intervals for the population multiple correlation coefficient can be constructed in the case of multivariate normality.

10.
We develop a 'robust' statistic T²_R, based on Tiku's (1967, 1980) MML (modified maximum likelihood) estimators of location and scale parameters, for testing an assumed mean vector of a symmetric multivariate distribution. We show that T²_R is on the whole considerably more powerful than the prominent Hotelling T² statistic. We also develop a robust statistic T²_D for testing that two multivariate distributions (skew or symmetric) are identical; T²_D seems to be usually more powerful than nonparametric statistics. The only assumption we make is that the marginal distributions are of the type (1/σ_k) f((x − μ_k)/σ_k) and that the means and variances of these marginal distributions exist.

11.
The one-sample Wilcoxon signed rank test was originally designed to test for a specified median, under the assumption that the distribution is symmetric, but it can also serve as a test for symmetry if the median is known. In this article we derive the Wilcoxon statistic as the first component of Pearson's X² statistic for independence in a particularly constructed contingency table. The second and third components are new test statistics for symmetry. In the second part of the article, the Wilcoxon test is extended so that symmetry around the median and symmetry in the tails can be examined separately. A trimming proportion is used to split the observations in the tails from those around the median. We further extend the method so that no arbitrary choice for the trimming proportion has to be made. Finally, the new tests are compared to other tests for symmetry in a simulation study. It is concluded that our tests often have substantially greater powers than most other tests.
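
For orientation, the classical statistic that this abstract starts from can be sketched as follows. This is only the standard one-sample signed rank statistic W⁺ (the sum of the ranks of the positive differences from the hypothesized median, with average ranks for ties); it does not implement the article's contingency-table components or its trimmed extension. The function name and the sample values are hypothetical.

```python
def signed_rank_statistic(data, median0):
    """Wilcoxon signed rank statistic W+ for a hypothesized median.

    Observations equal to median0 are dropped; tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [x - median0 for x in data if x != median0]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)

    i = 0
    while i < len(order):
        # Find the end of the current block of tied absolute differences.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # positions i..j correspond to ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    return sum(r for r, d in zip(ranks, diffs) if d > 0)


# Hypothetical sample tested against a median of 0.
sample = [-1.0, 2.0, -3.0, 4.0]
n = len(sample)
print("W+ =", signed_rank_statistic(sample, 0.0))
print("expected value under symmetry about 0:", n * (n + 1) / 4)
```

Under symmetry about the hypothesized median, W⁺ has expected value n(n + 1)/4, so values far from that point toward a shifted median or, when the median is known, toward asymmetry.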

12.
Much research has been performed in the area of multiple linear regression, with the result that the field is well developed. This is not true of logistic regression, however. The latter presents special problems because the response is not continuous. Some of these problems are: the difficulty of developing a suitable R² statistic, possibly poor results produced by the method of maximum likelihood, and the challenge of developing suitable graphical techniques. We describe recent work in some of these directions, and discuss the need for additional research.

13.
The coefficient of determination, a.k.a. R², is well-defined in linear regression models, and measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it for generalized linear models, we use the variance function to define the total variation of the dependent variable, as well as the remaining variation of the dependent variable after modeling the predictive effects of the independent variables. Unlike other definitions that demand complete specification of the likelihood function, our definition of R² only needs the mean and variance functions, so it is applicable to more general quasi-models. It is consistent with the classical measure of uncertainty using variance, and reduces to the classical definition of the coefficient of determination when linear regression models are considered.

14.
A "spurious regression" is one in which the time-series variables are nonstationary and independent. It is well known that in this context the OLS parameter estimates and the R² converge to functionals of Brownian motions, the "t-ratios" diverge in distribution, and the Durbin–Watson statistic converges in probability to zero. We derive corresponding results for some common tests for the normality and homoskedasticity of the errors in a spurious regression.

15.
In an informal way, some dilemmas in connection with hypothesis testing in contingency tables are discussed. The body of the article concerns the numerical evaluation of Cochran's Rule about the minimum expected value in r × c contingency tables with fixed margins when testing independence with Pearson's X² statistic using the χ² distribution.

16.
A power study suggests that a good test of fit analysis for the binomial distribution is provided by a data-dependent Chernoff–Lehmann X² test with class expectations greater than unity, and its components. These data-dependent statistics involve arithmetically simple parameter estimation and convenient approximate distributions, and provide a comprehensive assessment of how well the data agree with a binomial distribution. We suggest that a well-performing single test of fit statistic is the Anderson–Darling statistic.

17.
Two procedures for testing equality of two proportions are compared in terms of asymptotic efficiency. The comparison favors use of a statistic equivalent to Goodman's Y² over the usual X² statistic in some cases including that of equal sample sizes. Numerical comparisons indicate that the asymptotic results have some relevance for moderate sample sizes.

18.
Statistics Rᵃ based on power divergence can be used for testing the homogeneity of a product multinomial model. All Rᵃ have the same chi-square limiting distribution under the null hypothesis of homogeneity. R⁰ is the log likelihood ratio statistic and R¹ is Pearson's X² statistic. In this article, we consider improving the approximation of the distribution of Rᵃ under the homogeneity hypothesis. The expression of the asymptotic expansion of the distribution of Rᵃ under the homogeneity hypothesis is investigated. The expression consists of continuous and discontinuous terms. Using the continuous term of the expression, a new approximation of the distribution of Rᵃ is proposed. A moment-corrected type of chi-square approximation is also derived. By numerical comparison, we show that both approximations perform much better than the usual chi-square approximation for the statistics Rᵃ when a ≤ 0, which include the log likelihood ratio statistic.
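
The family of statistics in this abstract can be sketched numerically. The code below assumes that Rᵃ follows the usual Cressie–Read power-divergence form, 2/(a(a + 1)) · Σ O·[(O/E)ᵃ − 1], with the a = 0 member taken as the limit 2·Σ O·ln(O/E); that parameterization is an assumption, chosen because it matches the abstract's statement that a = 0 gives the log likelihood ratio statistic and a = 1 gives Pearson's X². Function names and the example table are hypothetical.

```python
import math


def expected_under_homogeneity(table):
    """Expected counts for an r x c table whose rows are independent multinomial samples."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def power_divergence(observed, expected, a):
    """Power-divergence statistic for flat lists of observed and expected counts."""
    if a == -1:
        raise ValueError("a = -1 requires a separate limiting form, not handled in this sketch")
    if a == 0:
        # Limit a -> 0: the log likelihood ratio statistic.
        return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    return 2 / (a * (a + 1)) * sum(o * ((o / e) ** a - 1) for o, e in zip(observed, expected))


# Hypothetical 2 x 2 table: two samples classified into two categories.
table = [[10, 20], [20, 10]]
obs = [count for row in table for count in row]
exp = [count for row in expected_under_homogeneity(table) for count in row]

for a in (0, 2 / 3, 1):
    print("a =", a, "statistic =", power_divergence(obs, exp, a))
```

All members would be referred to the same chi-square limiting distribution; the article's contribution concerns better approximations to their finite-sample distributions, which this sketch does not attempt.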

19.
In this paper, we consider testing the equality of two mean vectors with unequal covariance matrices. In the case of equal covariance matrices, we can use Hotelling's T² statistic, which follows the F distribution under the null hypothesis. Meanwhile, in the case of unequal covariance matrices, the T² type test statistic does not follow the F distribution, and it is also difficult to derive the exact distribution. In this paper, we propose an approximate solution to the problem by adjusting the degrees of freedom of the F distribution. Asymptotic expansions up to the term of order N⁻² for the first and second moments of the U statistic are given, where N is the total sample size minus two. New approximate degrees of freedom and a bias correction for them are obtained. Finally, a numerical comparison is presented using a Monte Carlo simulation.
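
The equal-covariance baseline mentioned in this abstract can be written out for two bivariate samples. The sketch below uses the standard pooled-covariance form of the two-sample Hotelling T² and its scaling to an F statistic with (p, n₁ + n₂ − p − 1) degrees of freedom; it does not implement the article's adjusted degrees of freedom for unequal covariance matrices. The restriction to p = 2, the function names, and the data are assumptions made to keep the example short.

```python
def mean_vector(rows):
    """Column means of a list of (x, y) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the mean (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def hotelling_t2_two_sample(x1, x2):
    """Return (T2, F, (df1, df2)) for two bivariate samples, assuming equal covariances."""
    n1, n2, p = len(x1), len(x2), 2
    m1, m2 = mean_vector(x1), mean_vector(x2)
    s1, s2 = scatter_matrix(x1, m1), scatter_matrix(x2, m2)

    # Pooled covariance estimate with n1 + n2 - 2 degrees of freedom.
    pooled = [[(s1[a][b] + s2[a][b]) / (n1 + n2 - 2) for b in range(2)] for a in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]

    d = [m1[0] - m2[0], m1[1] - m2[1]]
    quad = sum(d[a] * inv[a][b] * d[b] for a in range(2) for b in range(2))
    t2 = n1 * n2 / (n1 + n2) * quad
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat, (p, n1 + n2 - p - 1)


# Hypothetical samples: the second is the first shifted by (1, 1).
sample1 = [(1, 2), (2, 1), (3, 4), (4, 3)]
sample2 = [(2, 3), (3, 2), (4, 5), (5, 4)]
print(hotelling_t2_two_sample(sample1, sample2))
```

When the two covariance matrices differ, this F reference distribution no longer holds, which is the situation the article addresses.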

20.
In statistical process control applications, the multivariate T² control chart based on Hotelling's T² statistic is useful for detecting the presence of special causes of variation. In particular, use of the T² statistic based on the successive differences covariance matrix estimator has been shown to be very effective in detecting the presence of a sustained step or ramp shift in the mean vector. However, the exact distribution of this statistic is unknown. In this article, we derive the maximum value of the T² statistic based on the successive differences covariance matrix estimator. This distributional property is crucial for calculating an approximate upper control limit of a T² control chart based on successive differences, as described in Williams et al. (2006) [Williams, J. D., Woodall, W. H., Birch, J. B., Sullivan, J. H. (2006). On the distribution of T² statistics based on successive differences. J. Qual. Technol. 38: 217–229].
