首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Efficient score tests exist among others, for testing the presence of additive and/or innovative outliers that are the result of the shifted mean of the error process under the regression model. A sample influence function of autocorrelation-based diagnostic technique also exists for the detection of outliers that are the result of the shifted autocorrelations. The later diagnostic technique is however not useful if the outlying observation does not affect the autocorrelation structure but is generated due to an inflation in the variance of the error process under the regression model. In this paper, we develop a unified maximum studentized type test which is applicable for testing the additive and innovative outliers as well as variance shifted outliers that may or may not affect the autocorrelation structure of the outlier free time series observations. Since the computation of the p-values for the maximum studentized type test is not easy in general, we propose a Satterthwaite type approximation based on suitable doubly non-central F-distributions for finding such p-values [F.E. Satterthwaite, An approximate distribution of estimates of variance components, Biometrics 2 (1946), pp. 110–114]. The approximations are evaluated through a simulation study, for example, for the detection of additive and innovative outliers as well as variance shifted outliers that do not affect the autocorrelation structure of the outlier free time series observations. Some simulation results on model misspecification effects on outlier detection are also provided.  相似文献   

2.
This paper proposes a new robust Bayes factor for comparing two linear models. The factor is based on a pseudo‐model for outliers and is more robust to outliers than the Bayes factor based on the variance‐inflation model for outliers. If an observation is considered an outlier for both models this new robust Bayes factor equals the Bayes factor calculated after removing the outlier. If an observation is considered an outlier for one model but not the other then this new robust Bayes factor equals the Bayes factor calculated without the observation, but a penalty is applied to the model considering the observation as an outlier. For moderate outliers where the variance‐inflation model is suitable, the two Bayes factors are similar. The new Bayes factor uses a single robustness parameter to describe a priori belief in the likelihood of outliers. Real and synthetic data illustrate the properties of the new robust Bayes factor and highlight the inferior properties of Bayes factors based on the variance‐inflation model for outliers.  相似文献   

3.
The presence of outliers in the data sets affects the structure of multicollinearity which arises from a high degree of correlation between explanatory variables in a linear regression analysis. This affect could be seen as an increase or decrease in the diagnostics used to determine multicollinearity. Thus, the cases of outliers reduce the reliability of diagnostics such as variance inflation factors, condition numbers and variance decomposition proportions. In this study, we propose to use a robust estimation of the correlation matrix obtained by the minimum covariance determinant method to determine the diagnostics of multicollinearity in the presence of outliers. As a result, the present paper demonstrates that the diagnostics of multicollinearity obtained by the robust estimation of the correlation matrix are more reliable in the presence of outliers.  相似文献   

4.
Cook距离公式常用于回归模型的异常值诊断,但由于公式中的样本方差■对异常值敏感,导致公式缺乏稳健性,使得诊断效果不理想。基于以上问题,文章选取绝对离差中位数作为样本标准差的稳健估计量,得到了样本方差■的稳健估计量,进而构造出稳健Cook距离公式;借鉴传统Cook距离的回归模型异常值诊断理论,将稳健Cook距离公式应用于时间序列异常值诊断,拓展了传统Cook距离公式的异常值诊断领域。通过选取模拟样本量分别为50、100、200,污染率分别为0、1%、5%、10%的ARMA(1,1)序列及金融时间序列进行实例分析,结果发现:(1)在无污染时,稳健Cook距离法与常规Cook距离法的诊断正确率均为100%,两者没有出现"误诊"现象;(2)在样本量、污染率同时增大时,常规Cook距离诊断正确率急剧下降,当污染率达到5%及以上时,已基本无诊断力,而稳健Cook距离法依然能保持较高的诊断力。稳健Cook距离法不仅能应用于时间序列异常值诊断,也能应用于回归分析的异常值诊断。  相似文献   

5.
In this article, we propose an outlier detection approach in a multiple regression model using the properties of a difference-based variance estimator. This type of a difference-based variance estimator was originally used to estimate error variance in a non parametric regression model without estimating a non parametric function. This article first employed a difference-based error variance estimator to study the outlier detection problem in a multiple regression model. Our approach uses the leave-one-out type method based on difference-based error variance. The existing outlier detection approaches using the leave-one-out approach are highly affected by other outliers, while ours is not because our approach does not use the regression coefficient estimator. We compared our approach with several existing methods using a simulation study, suggesting the outperformance of our approach. The advantages of our approach are demonstrated using a real data application. Our approach can be extended to the non parametric regression model for outlier detection.  相似文献   

6.
In this article, we model the relationship between two circular variables using the circular regression models, to be called JS circular regression model, which was proposed by Jammalamadaka and Sarma (1993). The model has many interesting properties and is sensitive enough to detect the occurrence of outliers. We focus our attention on the problem of identifying outliers in this model. In particular, we extend the use of the COVRATIO statistic, which has been successfully used in the linear case for the same purpose, to the JS circular regression model via a row deletion approach. Through simulation studies, the cut-off points for the new procedure are obtained and its power of performance is investigated. It is found that the performance improves when the resulting residuals have small variance and when the sample size gets larger. An example of the application of the procedure is presented using a real dataset.  相似文献   

7.
Fox (1972), Box and Tiao (1975), and Abraham and Box (1979) have proposed methods for detecting outliers in time series whose ARMA form is known (or identified). We show that the existence of a single aberrant observation, innovation, or intervention causes an ARMA model to be misidentified using unadjusted autocorrelation (acf) and partial autocorrelation estimates. The magnitude, location, type of outlier, and in some cases the ARMA's parameters, affect the identification outcome. We use variance inflation, signal-to-noise ratios, and acf critical values to determine an ARMA model's susceptibility to misidentifi-cation. Numerical and simulation examples suggest how to iteratively use the outlier detection methods in practice.  相似文献   

8.
This paper considers residuals for time series regression. Despite much literature on visual diagnostics for uncorrelated data, there is little on the autocorrelated case. To examine various aspects of the fitted time series regression model, three residuals are considered. The fitted regression model can be checked using orthogonal residuals; the time series error model can be analysed using marginal residuals; and the white noise error component can be tested using conditional residuals. When used together, these residuals allow identification of outliers, model mis‐specification and mean shifts. Due to the sensitivity of conditional residuals to model mis‐specification, it is suggested that the orthogonal and marginal residuals be examined first.  相似文献   

9.
The aim of this article is the estimation of annual food expenditures with limited information about bulk purchases with data from a Spanish household-budget survey for 1990—1991. Three alternatives are compared. The first, currently used for official purposes, does not use all the information. The second uses all the available information in a rough way. The third assumes a formal model for the unknown frequency of purchases. The three alternatives are compared by a regression model that should be homogeneous with respect to the dummy variables that represent the partial information of the groups and should show a distinct pattern of outliers under each alternative. Finally, we study the effect of the official and the best alternative on food inflation and inequality measures. We find that they lead to similar inflation rates but to different inequality estimates.  相似文献   

10.
This paper presents variance extraction procedures for univariate time series. The volatility of a times series is monitored allowing for non-linearities, jumps and outliers in the level. The volatility is measured using the height of triangles formed by consecutive observations of the time series. This idea was proposed by Rousseeuw and Hubert [1996. Regression-free and robust estimation of scale for bivariate data. Comput. Statist. Data Anal. 21, 67–85] in the bivariate setting. This paper extends their procedure to apply for online scale estimation in time series analysis. The statistical properties of the new methods are derived and finite sample properties are given. A financial and a medical application illustrate the use of the procedures.  相似文献   

11.
Zero adjusted regression models are used to fit variables that are discrete at zero and continuous at some interval of the positive real numbers. Diagnostic analysis in these models is usually performed using the randomized quantile residual, which is useful for checking the overall adequacy of a zero adjusted regression model. However, it may fail to identify some outliers. In this work, we introduce a class of residuals for outlier identification in zero adjusted regression models. Monte Carlo simulation studies and two applications suggest that one of the residuals of the class introduced here has good properties and detects outliers that are not identified by the randomized quantile residual.  相似文献   

12.
In his recent paper, Ali (1991) has shown that the mixed regression estimator, when data contain mean-shift or variance inflation outliers, is uniformly superior to the ordinary least squares estimator in terms of scalar-valued mean square error. However, when using the matrix-valued mean square error criterion, this dominance fails to hold in general. The subsequent investigation gives a complete characterization of the situation where the mixed estimator is superior to the LS-estimator when the comparison is made with respect to this stronger MSE-property. Vice versa, the LS-estimator never dominates the mixed estimator relative to this criterion.  相似文献   

13.
In this paper we present a "model free' method of outlier detection for Gaussian time series by using the autocorrelation structure of the time series. We also present a graphic diagnostic method in order to distinguish an additive outlier (AO) from an innovation outlier (IO). The test statistic for detecting the outlier has a χ ² distribution with one degree of freedom. We show that this method works well when the time series contain either one type of the outliers or both additive and innovation type outliers, and this method has the advantage that no time series model needs to be estimated from the data. Simulation evidence shows that different types of outliers can be graphically distinguished by using the techniques proposed.  相似文献   

14.
Censored median regression has proved useful for analyzing survival data in complicated situations, say, when the variance is heteroscedastic or the data contain outliers. In this paper, we study the sparse estimation for censored median regression models, which is an important problem for high dimensional survival data analysis. In particular, a new procedure is proposed to minimize an inverse-censoring-probability weighted least absolute deviation loss subject to the adaptive LASSO penalty and result in a sparse and robust median estimator. We show that, with a proper choice of the tuning parameter, the procedure can identify the underlying sparse model consistently and has desired large-sample properties including root-n consistency and the asymptotic normality. The procedure also enjoys great advantages in computation, since its entire solution path can be obtained efficiently. Furthermore, we propose a resampling method to estimate the variance of the estimator. The performance of the procedure is illustrated by extensive simulations and two real data applications including one microarray gene expression survival data.  相似文献   

15.
We propose a new regression-based filter for extracting signals online from multivariate high frequency time series. It separates relevant signals of several variables from noise and (multivariate) outliers.

Unlike parallel univariate filters, the new procedure takes into account the local covariance structure between the single time series components. It is based on high-breakdown estimates, which makes it robust against (patches of) outliers in one or several of the components as well as against outliers with respect to the multivariate covariance structure. Moreover, the trade-off problem between bias and variance for the optimal choice of the window width is approached by choosing the size of the window adaptively, depending on the current data situation.

Furthermore, we present an advanced algorithm of our filtering procedure that includes the replacement of missing observations in real time. Thus, the new procedure can be applied in online-monitoring practice. Applications to physiological time series from intensive care show the practical effect of the proposed filtering technique.  相似文献   

16.
Toxicologists and pharmacologists often describe toxicity of a chemical using parameters of a nonlinear regression model. Thus estimation of parameters of a nonlinear regression model is an important problem. The estimates of the parameters and their uncertainty estimates depend upon the underlying error variance structure in the model. Typically, a priori the researcher would not know if the error variances are homoscedastic (i.e., constant across dose) or if they are heteroscedastic (i.e., the variance is a function of dose). Motivated by this concern, in this paper we introduce an estimation procedure based on preliminary test which selects an appropriate estimation procedure accounting for the underlying error variance structure. Since outliers and influential observations are common in toxicological data, the proposed methodology uses M-estimators. The asymptotic properties of the preliminary test estimator are investigated; in particular its asymptotic covariance matrix is derived. The performance of the proposed estimator is compared with several standard estimators using simulation studies. The proposed methodology is also illustrated using a data set obtained from the National Toxicology Program.  相似文献   

17.
The Forward Search is a powerful general method, incorporating flexible data-driven trimming, for the detection of outliers and unsuspected structure in data and so for building robust models. Starting from small subsets of data, observations that are close to the fitted model are added to the observations used in parameter estimation. As this subset grows we monitor parameter estimates, test statistics and measures of fit such as residuals. The paper surveys theoretical development in work on the Forward Search over the last decade. The main illustration is a regression example with 330 observations and 9 potential explanatory variables. Mention is also made of procedures for multivariate data, including clustering, time series analysis and fraud detection.  相似文献   

18.
Early phase 2 tuberculosis (TB) trials are conducted to characterize the early bactericidal activity (EBA) of anti‐TB drugs. The EBA of anti‐TB drugs has conventionally been calculated as the rate of decline in colony forming unit (CFU) count during the first 14 days of treatment. The measurement of CFU count, however, is expensive and prone to contamination. Alternatively to CFU count, time to positivity (TTP), which is a potential biomarker for long‐term efficacy of anti‐TB drugs, can be used to characterize EBA. The current Bayesian nonlinear mixed‐effects (NLME) regression model for TTP data, however, lacks robustness to gross outliers that often are present in the data. The conventional way of handling such outliers involves their identification by visual inspection and subsequent exclusion from the analysis. However, this process can be questioned because of its subjective nature. For this reason, we fitted robust versions of the Bayesian nonlinear mixed‐effects regression model to a wide range of TTP datasets. The performance of the explored models was assessed through model comparison statistics and a simulation study. We conclude that fitting a robust model to TTP data obviates the need for explicit identification and subsequent “deletion” of outliers but ensures that gross outliers exert no undue influence on model fits. We recommend that the current practice of fitting conventional normal theory models be abandoned in favor of fitting robust models to TTP data.  相似文献   

19.
That outliers or influential observations can affect the results in a regression is well-known. But it is not clear how much influence a specific observation can have on other statistics. In time series, especially in predictive situations, the effect of additional observations is of singular importance. We here examine bounds for the effect of an additional observation on the mean, variance, Mahalanobis distance, product moment correlation, and coefficients of linearity and monotonicity.  相似文献   

20.
The multiprocess dynamic model provides a good framework for the modeling and analysis of the time series that contains outliers arid is subject to abrupt changes in pattern. In this paper we extend the multiprocess dynamic generalized linear model to allow for a known non-linear parameter evolution and predictor functions. This is done by approximating the non-linear function by a linear function based on a first order Taylor series expansions. This model has nice properties such as insensitivity to outliers and quick reaction to abrupt changes of pattern.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号