Similar Literature
 Found 20 similar articles (search time: 953 ms)
1.
Statistics that usually accompany the regression model do not provide insight into the quality of the data or the potential influence of individual observations on the estimates. In this study, the Q2 statistic is used as a criterion for detecting influential observations or outliers. The statistic is derived from the jackknifed residuals, the sum of squares of which is generally known as the prediction sum of squares, or PRESS. This article compares R2 with Q2 and suggests that the latter be used as part of the data-quality check. It is shown, for two separate data sets obtained from regional cost of living and U.S. food industry studies, that in the presence of outliers the Q2 statistic can be negative, because it is sensitive to the choice of regressors and the inclusion of influential observations. Once the outliers are dropped from the sample, the discrepancy between the Q2 and R2 values is negligible.
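The comparison of R2 with Q2 can be illustrated with a minimal sketch for simple linear regression. It assumes the common definition Q2 = 1 - PRESS/SST, which the abstract does not spell out, and uses the standard identity that the leave-one-out (jackknifed) residual equals e_i / (1 - h_ii). The function name `r2_and_q2` and the data in the usage note are hypothetical and are not taken from the article.

```python
def r2_and_q2(x, y):
    """Return (R2, Q2) for the least squares line y = a + b*x.

    Assumption: Q2 = 1 - PRESS / SST, where PRESS is the sum of squared
    leave-one-out residuals e_i / (1 - h_ii).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    sse = 0.0    # ordinary residual sum of squares
    press = 0.0  # prediction sum of squares
    for xi, yi in zip(x, y):
        resid = yi - (intercept + slope * xi)
        # Leverage of observation i in simple linear regression.
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        sse += resid ** 2
        press += (resid / (1.0 - leverage)) ** 2
    return 1.0 - sse / sst, 1.0 - press / sst
```

For example, `r2_and_q2([0, 1, 2, 3, 10], [0, 1, 2, 3, 0])` gives a small positive R2 but a negative Q2, because the single high-leverage point at x = 10 is predicted very badly when it is left out of the fit.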

2.
We distinguish between three types of outliers in a one-way random effects model. These are formally described in terms of their position relative to the main part of the observations. We propose simple rules for identifying such outliers and give an example that involves median-based statistics.

3.
A large number of statistics are used in the literature to detect outliers and influential observations in the linear regression model. In this paper, comparison studies are carried out to determine which statistic performs better than the others. These include (i) a detailed simulation study and (ii) analyses of several data sets studied by different authors. Different choices of the design matrix of the regression model are considered. Design A studies the performance of the various statistics for detecting scale shift outliers, and designs B and C provide information on the performance of the statistics for identifying influential observations. For each statistic, we use cutoff points based on the exact distributions and Bonferroni's inequality. The results show that the studentized residual, which is used for detecting mean shift outliers, is also appropriate for detecting scale shift outliers, and that Welsch's statistic and Cook's distance are appropriate for detecting influential observations.
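Two of the statistics named above can be sketched for simple linear regression as follows. This is an illustrative sketch using the textbook formulas for the externally studentized residual and Cook's distance, not the authors' simulation code. The function name `regression_diagnostics` is hypothetical, and the cutoff points based on exact distributions and Bonferroni's inequality are not reproduced here.

```python
import math


def regression_diagnostics(x, y):
    """Return a list of (studentized_residual, cooks_distance) pairs.

    Textbook formulas for the line y = a + b*x (p = 2 parameters):
      t_i = e_i / sqrt(s2_del_i * (1 - h_i))      (externally studentized)
      D_i = e_i**2 * h_i / (p * s2 * (1 - h_i)**2)
    Assumes the fit is not exact once observation i is deleted;
    otherwise s2_del_i is zero and t_i is undefined.
    """
    n = len(x)
    p = 2
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    resids = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    sse = sum(e ** 2 for e in resids)
    s2 = sse / (n - p)  # residual variance from the full fit

    diagnostics = []
    for e, h in zip(resids, leverages):
        # Residual variance with observation i deleted.
        s2_del = (sse - e ** 2 / (1.0 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_del * (1.0 - h))
        cook = e ** 2 * h / (p * s2 * (1.0 - h) ** 2)
        diagnostics.append((t, cook))
    return diagnostics
```

In practice each |t_i| would be compared with a cutoff, such as a Bonferroni-adjusted t quantile, before an observation is declared an outlier.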

4.
Because outliers and leverage observations unduly affect least squares regression, the identification of influential observations is considered an important and integral part of the analysis. However, very few techniques have been developed for residual analysis and diagnostics in minimum sum of absolute errors (L1) regression. Although L1 regression is more resistant to outliers than least squares regression, it appears that outliers in the predictor variables (leverage points) may affect it. In this paper, our objective is to develop an influence measure for L1 regression based on the likelihood displacement function. We illustrate the proposed influence measure with examples.

5.
This article studies how to identify influential observations in univariate autoregressive integrated moving average time series models and how to measure their effects on the estimated parameters of the model. The sensitivity of the parameters to the presence of either additive or innovational outliers is analyzed, and influence statistics based on the Mahalanobis distance are presented. The statistic linked to additive outliers is shown to be very useful for indicating the robustness of the fitted model to the given data set. Its application is illustrated using a relevant set of historical data.

6.
The usefulness of an extra sum of squares statistic QK for detecting K outliers has been discussed previously in the context of two-way tables. (See Gentleman and Wilk, 1975a, 1975b; John and Draper, 1978; and Draper and John, 1980.) That work is extended here to straight line regression situations arising from, and motivated by, a specific set of research data. Percentage points for the appropriate test statistics are obtained by simulation, and approximations for these percentage points are suggested. Power calculations made for various designs and outlier situations are briefly summarized.

7.
This paper investigates hypothesis testing for the parametric component in the partially linear errors-in-variables (EV) model with random censorship. We construct two test statistics based on the difference of the corrected residual sums of squares and on the empirical likelihood ratio under the null and alternative hypotheses. It is shown that, under the null hypothesis, the limiting distributions of both proposed test statistics are weighted sums of independent chi-squared distributions with one degree of freedom. Based on the adjusted test statistics, we further develop two new types of test procedures. The finite sample performance of the proposed test procedures is evaluated by extensive simulation studies.

8.
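Liu Hong, Jin Lin. Statistical Research (《统计研究》), 2012, 29(10): 99-104
Based on economic growth theory, this paper builds a semiparametric regression model for China's GDP data from 1953 to 2010 together with factors such as labor input, capital input, and human capital. It then carries out a statistical diagnostic analysis of the model, computes the relevant diagnostic statistics, and uses them to identify the model's outliers. On this basis, the accuracy of China's GDP data is discussed: the outliers in China's GDP data are concentrated mainly in two periods, 1958-1961 and 1991-1994. Finally, the paper comments on the method of assessing the accuracy of statistical data through statistical diagnostics of a semiparametric regression model.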

9.
In this paper, we use the previously defined alpha-divergence and gamma-divergence to construct goodness-of-fit tests for exponentiality. These divergence measures are very robust to outliers. Because outliers in statistical data can lead to misleading results, using these divergence measures can be important. To construct the test statistics, two estimators of the alpha-divergence and gamma-divergence are used. The first considers the alpha-divergence and gamma-divergence of the equilibrium distribution function, which is well defined on the empirical distribution function (EDF), and is proposed as an EDF-based goodness-of-fit test statistic. The second is an estimator in the manner of the Vasicek entropy estimator. Simulation results indicate that, compared with other test statistics, the proposed test statistics have higher power in most cases. Finally, two examples containing outliers illustrate the importance and use of the proposed tests.

10.
In this paper, we derive some recurrence relations for the single and product moments of order statistics from n independent and non-identically distributed Lomax and right-truncated Lomax random variables. These recurrence relations are simple in nature and can be used systematically to compute all the single and product moments of all order statistics in a simple recursive manner. The results for order statistics from the multiple-outlier model (with a slippage of p observations) are deduced as special cases. We then apply these results to examine the robustness of censored BLUEs to the presence of multiple outliers. Received: November 30, 1998; revised version: March 8, 2000

11.
The Extended Growth Curve model is considered. It turns out that the estimated mean of the model is the projection of the observations on the space generated by the design matrices, which is the sum of two tensor product spaces. The orthogonal complement of this space is decomposed into four orthogonal spaces, and residuals are defined by projecting the observation matrix on the resulting components. The residuals are interpreted, and some remarks are given as to why ordinary residuals should not be used, what kind of information the proposed residuals give, and how this information might be used to validate model assumptions and detect outliers and influential observations. It is shown that the residuals are symmetrically distributed around zero and are uncorrelated with each other. The covariance between the residuals and the estimated model, as well as the dispersion matrices for the residuals, are also given.

12.
A large number of statistics have been proposed to study the influence of individual observations in the linear mixed model. An extensive Monte Carlo simulation study is used to evaluate the appropriateness of these influence diagnostic measures. The sensitivity of the diagnostic measures to outliers and leverages is examined, and helpful results are obtained.

13.
Under a randomization model for a completely randomized design, permutation tests are considered based on the usual F statistic and on a multi-response permutation procedure statistic. For the first statistic, the first two moments are obtained so that a comparison with the distribution under the normal theory model can be made. The second statistic is shown to converge in distribution to an infinite weighted sum of chi-squared variates, the weights being the limits of the eigenvalues of a matrix depending on the distance measure used and the order statistics of the observations.

14.
Lifetime data analysis is regarded as one of the significant offshoots of statistics. Classical statistical techniques treat lifetime observations as precise numbers and cover only the variation among the observations. In fact, there are two types of uncertainty in data: variation among the observations and fuzziness. Analysis techniques that do not consider fuzziness and are based only on precise lifetime observations therefore use incomplete information and hence lead to pseudo results. This study aims to generalize parameter estimation, survival functions, and hazard rates to fuzzy lifetime data.

15.
This study explores the correct identification of seasonal outliers using the most commonly applied test statistics. We evaluate the performance of the seasonal level shift (SLS) test by means of the empirical level of significance, the power of the test for sensitivity in detecting changes, and the vulnerability to masking of outliers through misspecification frequencies. We observe that the size of the SLS affects the sampling distribution of ηSLS (the test statistic for SLS detection) in the case of the SAR(1) and SMA(1) models. The empirical critical values for the 1%, 5%, and 10% upper percentiles are higher than the usual cutoff points, and the empirical level of significance is inversely related to the sample size and the model coefficients. The empirical power of the test statistic is not satisfactory for small sample sizes or for large model coefficients. ηSLS can be confused with IO. The list of potential outlier types should therefore retain both IO and SLS as part of the outlier detection procedure for the most efficient results. We apply the method suggested by Kaiser and Maravall with five possible types of outliers, that is, AO, IO, LS, TC, and SLS, to a number of quarterly and monthly time series from Pakistan.

16.
This article considers the derivation of approximate distributions for two types of statistics that can be used in developing new tests of discordance in circular samples from the von Mises distribution. An alternative test of discordance is proposed based on the circular distance between sample points. The advantage of the test is that it allows users to detect possible outliers in both univariate and bivariate circular data. For illustration, the test is applied to two real circular data sets.

17.
The Cook statistic is developed for detecting outliers in two situations in which outliers are likely to occur in multi-response experiments. In the first situation, more than one outlying observation vector is considered. Each of these vectors is obtained on the assumption that a particular observation from each of the responses is an outlier. A general expression of the Cook statistic for detecting any t such outlying observation vectors is obtained, and some particular cases are then considered. The second situation is one in which the observations from all the responses may not be outliers. Here also a general expression of the Cook statistic is obtained for detecting any t observations from each of any k responses as outliers. In both cases the Cook statistic is applied to real experimental data.

18.
The problem of testing for total independence of the variates of a stochastic p-component vector (p ≥ 3) using rank correlation statistics is considered. Two distribution-free statistics are considered, one based on the determinant of the matrix of rank correlation statistics and the second on their sum of squares. Tables of critical values are given for p = 3, 4 for the cases when (a) ranks and (b) exponential scores are used to replace the ordered observations within each variate. Some approximations to the critical values are proposed and evaluated.

19.
Five widely used test statistics for detecting outliers and influential observations were studied using the Monte Carlo method. The test statistic based on Studentized residuals, with critical values given by Tietjen, Moore and Beckman (1973), appears to be the best procedure for detecting a single outlier in simple linear regression.

20.
Testing and Diagnosis of Serial Correlation   Total citations: 4, self-citations: 0, citations by others: 4
Zhao Jinwen. Statistical Research (《统计研究》), 1997, 14(2): 63-65
In general opinion, the models we built have powerful practical forecast ability so long as they are accepted by significan...
