期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Stepwise local influence in generalized autoregressive conditional heteroskedasticity models

Lei Shi Md. Mostafizur Rahman Wen Gan Jianhua Zhao 《Journal of applied statistics》2015,42(2):428-444

Detection of outliers or influential observations is an important work in statistical modeling, especially for the correlated time series data. In this paper we propose a new procedure to detect patch of influential observations in the generalized autoregressive conditional heteroskedasticity (GARCH) model. Firstly we compare the performance of innovative perturbation scheme, additive perturbation scheme and data perturbation scheme in local influence analysis. We find that the innovative perturbation scheme give better result than other two schemes although this perturbation scheme may suffer from masking effects. Then we use the stepwise local influence method under innovative perturbation scheme to detect patch of influential observations and uncover the masking effects. The simulated studies show that the new technique can successfully detect a patch of influential observations or outliers under innovative perturbation scheme. The analysis based on simulation studies and two real data sets show that the stepwise local influence method under innovative perturbation scheme is efficient for detecting multiple influential observations and dealing with masking effects in the GARCH model. 相似文献

2.

Outliers in Multi-Response Experiments

Lalmohan Bhar Sankalpa Ojha 《统计学通讯:理论与方法》2014,43(13):2782-2798

Cook-statistic has been developed for detecting outliers in two likely situations of occurrence of outliers in multi-response experiments. In the first situation, more than one outlying observations vector has been considered. Each of these vectors is obtained on the assumption that a particular observation from each of the responses is an outlier. A general expression of Cook-statistic for detecting any such t outlying observations vectors has been obtained. Then some particular cases have been considered. In the second case a situation is considered where observations from all the responses may not be outliers. Here also a general expression of Cook-statistic is obtained for detecting any t observations from each of any k responses as outliers. In both the cases Cook-statistic is applied to real experimental data. 相似文献

3.

Multiple outliers detection in sparse high-dimensional regression

Tao Wang Qun Li Bin Chen 《Journal of Statistical Computation and Simulation》2018,88(1):89-107

The presence of outliers would inevitably lead to distorted analysis and inappropriate prediction, especially for multiple outliers in high-dimensional regression, where the high dimensionality of the data might amplify the chance of an observation or multiple observations being outlying. Noting that the detection of outliers is not only necessary but also important in high-dimensional regression analysis, we, in this paper, propose a feasible outlier detection approach in sparse high-dimensional linear regression model. Firstly, we search a clean subset by use of the sure independence screening method and the least trimmed square regression estimates. Then, we define a high-dimensional outlier detection measure and propose a multiple outliers detection approach through multiple testing procedures. In addition, to enhance efficiency, we refine the outlier detection rule after obtaining a relatively reliable non-outlier subset based on the initial detection approach. By comparison studies based on Monte Carlo simulation, it is shown that the proposed method performs well for detecting multiple outliers in sparse high-dimensional linear regression model. We further illustrate the application of the proposed method by empirical analysis of a real-life protein and gene expression data. 相似文献

4.

A comparative study on detection of influential observations in linear regression

A. Hossain D. N. Naik 《Statistical Papers》1991,32(1):55-69

A large number of statistics are used in the literature to detect outliers and influential observations in the linear regression model. In this paper comparison studies have been made for determining a statistic which performs better than the other. This includes: (i) a detailed simulation study, and (ii) analyses of several data sets studied by different authors. Different choices of the design matrix of regression model are considered. Design A studies the performance of the various statistics for detecting the scale shift type outliers, and designs B and C provide information on the performance of the statistics for identifying the influential observations. We have used cutoff points using the exact distributions and Bonferroni's inequality for each statistic. The results show that the studentized residual which is used for detection of mean shift outliers is appropriate for detection of scale shift outliers also, and the Welsch's statistic and the Cook's distance are appropriate for detection of influential observations. 相似文献

5.

Outliers and influential observations in the structural errors-in-variables model

Myung Geun Kim 《Journal of applied statistics》2000,27(4):451-460

The influence of observations on the parameter estimates for the simple structural errors-in-variables model with no equation error is investigated using the local influence method. Residuals themselves are not sufficient for detecting outliers. The likelihood displacement approach is useful for outlier detection especially when a masking phenomenon is present. An illustrative example is provided. 相似文献

6.

Robust multivariate diagnostics for PLSR and application on high dimensional spectrally overlapped drug systems

Aylin Alin Claudio Agostinelli Georgi Gergov Plamen Katsarov Yahya Al-Degs 《Journal of Statistical Computation and Simulation》2019,89(6):966-984

ABSTRACT

Statistical methods are effectively used in the evaluation of pharmaceutical formulations instead of laborious liquid chromatography. However, signal overlapping, nonlinearity, multicollinearity and presence of outliers deteriorate the performance of statistical methods. The Partial Least Squares Regression (PLSR) is a very popular method in the quantification of high dimensional spectrally overlapped drug formulations. The SIMPLS is the mostly used PLSR algorithm, but it is highly sensitive to outliers that also effect the diagnostics. In this paper, we propose new robust multivariate diagnostics to identify outliers, influential observations and points causing non-normality for a PLSR model. We study performances of the proposed diagnostics on two everyday use highly overlapping drug systems: Paracetamol–Caffeine and Doxylamine Succinate–Pyridoxine Hydrochloride. 相似文献

7.

Outliers detection in multivariate spatial linear models

《Journal of statistical planning and inference》2006,136(1):125-146

In geostatistics, detecting atypical observations is of special interest due to the changes they can cause in environmental and geological patterns. Several methods for detecting them have been already suggested for the univariate spatial case. However, the problem is more complicated when various variables are observed simultaneously and the spatial correlation among them must be taken into account. The aim of this paper is to detect outliers and influential observations in multivariate spatial linear models. For this purpose, we derive and explore two different methods. First, a multivariate version of the forward search algorithm is given, where locations with outliers are detected in the last steps of the procedure. Next, we derive influence measures to assess the impact of the observations on the multivariate spatial linear model. The procedures are easy to compute and to interpret by means of graphical representations. Finally, an example and a Monte Carlo study illustrate the performance of these methods for identification of outliers in multivariate spatial linear models. 相似文献

8.

Simultaneous variable selection and outlier identification in linear regression using the mean-shift outlier model

Sung-Soo Kim Sung H. Park W. J. Krzanowski 《Journal of applied statistics》2008,35(3):283-291

We provide a method for simultaneous variable selection and outlier identification using the mean-shift outlier model. The procedure consists of two steps: the first step is to identify potential outliers, and the second step is to perform all possible subset regressions for the mean-shift outlier model containing the potential outliers identified in step 1. This procedure is helpful for model selection while simultaneously considering outlier identification, and can be used to identify multiple outliers. In addition, we can evaluate the impact on the regression model of simultaneous omission of variables and interesting observations. In an example, we provide detailed output from the R system, and compare the results with those using posterior model probabilities as proposed by Hoeting et al. [Comput. Stat. Data Anal. 22 (1996), pp. 252-270] for simultaneous variable selection and outlier identification. 相似文献

9.

Distributions of test statistics for multiple outliers in exponential samples

M.S. Chikkagoudar S. H Kunchur 《统计学通讯:理论与方法》2013,42(18):2127-2142

Null as well as alternative distributions of two types of statistics used to test for multiple outliers in exponential samples are obtained. Of these two types one is based on the ratio of sum of the observations suspected to be outliers to sum of the sample observations and the other is Dixon's type. Powers of the tests based on these statistics are compared.

相似文献

10.

Combining Bayesian method and Kalman smoother for detection additive outlier patches in autoregressive time series

Farideh Mohammadinia Rahim Chinipardaz 《统计学通讯:模拟与计算》2013,42(7):2191-2209

ABSTRACT

This article proposes a development of detecting patches of additive outliers in autoregressive time series models. The procedure improves the existing detection methods via Gibbs sampling. We combine the Bayesian method and the Kalman smoother to present some candidate models of outlier patches and the best model with the minimum Bayesian information criterion (BIC) is selected among them. We propose that this combined Bayesian and Kalman method (CBK) can reduce the masking and swamping effects about detecting patches of additive outliers. The correctness of the method is illustrated by simulated data and then by analyzing a real set of observations. 相似文献

11.

Testing normality in the presence of outliers

Daniele Coin 《Statistical Methods and Applications》2008,17(1):3-12

Statistical models are often based on normal distributions and procedures for testing this distributional assumption are needed. Many goodness-of-fit tests suffer from the presence of outliers, in the sense that they may reject the null hypothesis even in the case of a single extreme observation. We show a possible extension of the Shapiro-Wilk test that is not affected by such a problem. The presented method is inspired by the forward search (FS), a new, recently proposed, diagnostic tool. An application to univariate observations shows how the procedure is able to capture the structure of the data, even in the presence of outliers. Other properties are also investigated. 相似文献

12.

Outlier detection based on extreme value theory and applications

Shrijita Bhattacharya Francois Kamper Jan Beirlant 《Scandinavian Journal of Statistics》2023,50(3):1466-1502

Whether an extreme observation is an outlier or not depends strongly on the corresponding tail behavior of the underlying distribution. We develop an automatic, data-driven method rooted in the mathematical theory of extremes to identify observations that deviate from the intermediate and central characteristics. The proposed algorithm is an extension of a method previously proposed in the literature for the specific case of heavy tailed Pareto-type distributions to all max-domains of attraction. We propose some applications such as a tail-adjusted boxplot which yields a more accurate representation of possible outliers, and the identification of outliers in a multivariate context through an analysis of associated random variables such as local outlier factors. Several examples and simulation results illustrate the finite sample behavior of the algorithm and its applications. 相似文献

13.

Pseudo–Modelling of Outliers for Estimation of Regression And Time Series Models

Ross H. Taplin 《Australian & New Zealand Journal of Statistics》2002,44(3):295-310

This paper documents situations where the variance inflation model for outliers has undesirable properties. The model is commonly used to accommodate outliers in a Bayesian analysis of regression and time series models. The alternative approach provided here does not suffer from these undesirable properties but gives inferences similar to those of the variance inflation model when this is appropriate. It can be used with regression, time series, and regression with correlated errors in a unified way, and adheres to the scientific principle that inference should be based on the data after obvious outliers have been discarded. Only one parameter is required for outliers; it is interpretable as the a priori willingness to remove observations from the analysis. 相似文献

14.

Identification and classification of multiple outliers,high leverage points and influential observations in linear regression

A.A.M. Nurunnabi M. Nasser A.H.M.R. Imon 《Journal of applied statistics》2016,43(3):509-525

Detection of multiple unusual observations such as outliers, high leverage points and influential observations (IOs) in regression is still a challenging task for statisticians due to the well-known masking and swamping effects. In this paper we introduce a robust influence distance that can identify multiple IOs, and propose a sixfold plotting technique based on the well-known group deletion approach to classify regular observations, outliers, high leverage points and IOs simultaneously in linear regression. Experiments through several well-referred data sets and simulation studies demonstrate that the proposed algorithm performs successfully in the presence of multiple unusual observations and can avoid masking and/or swamping effects. 相似文献

15.

Influence measure for the L1 regression

Silvia N. Elian Carmen D.S. André Subhash C. Narula 《统计学通讯:理论与方法》2013,42(4):837-849

Because outliers and leverage observations unduly affect the least squares regression, the identification of influential observations is considered an important and integrai part of the analysis. However, very few techniques have been developed for the residual analysis and diagnostics for the minimum sum of absolute errors, L₁ regression. Although the L₁ regression is more resistant to the outliers than the least squares regression, it appears that outliers (leverage) in the predictor variables may affect it. In this paper, our objective is to develop an influence measure for the L₁ regression based on the likelihood displacement function. We illustrate the proposed influence measure with examples. 相似文献

16.

Outlier detection in linear models: a comparative study in simple linear regression

Uditha Balasooriya Y.K. Tse 《统计学通讯:理论与方法》2013,42(12):3589-3597

Five widely used test statistics for detecting outliers and influential observations were studied using Monte Carlo method . The test statistic based on Studentized residuals, with critical values given by Tietjen, Moore and Beckman (1973), appears to be the best procedure for detecting a single outlier in simple linear regression. 相似文献

17.

Monitoring multivariate simple linear profiles using robust estimators

Moslem Kordestani Farid Hassanvand Hamid Shahriari 《统计学通讯:理论与方法》2020,49(12):2964-2989

Brief Abstract

This article focuses on estimation of multivariate simple linear profiles. While outliers may hamper the expected performance of the ordinary regression estimators, this study resorts to robust estimators as the remedy of the estimation problem in presence of contaminated observations. More specifically, three robust estimators M, S and MM are employed. Extensive simulation runs show that in the absence of outliers or for small amount of contamination, the robust methods perform as well as the classical least square method, while for medium and large amounts of contamination the proposed estimators perform considerably better than classical method. 相似文献

18.

Detecting outliers: power and some other considerations

Ram B. Jain 《统计学通讯:理论与方法》2013,42(22):2299-2314

The general problem of outlier detection and five recursive outlier detection procedures considered in the study are defined. The methods to compute powers, probabilities of detecting ≥1 outliers, and >1 observations including at least one inlier as outliers are computed and results are discussed. Results show that no procedure is most powerful when the actual number of outlier present in the sample is exactly, under-, and overestimated. The probabilities of inliers being detected as outliers are also substantial particularly when outliers occur only on one side of the sample 相似文献

19.

The forward search: Theory and data analysis

Anthony C. Atkinson Marco Riani Andrea Cerioli 《Journal of the Korean Statistical Society》2010,39(2):117-134

The Forward Search is a powerful general method, incorporating flexible data-driven trimming, for the detection of outliers and unsuspected structure in data and so for building robust models. Starting from small subsets of data, observations that are close to the fitted model are added to the observations used in parameter estimation. As this subset grows we monitor parameter estimates, test statistics and measures of fit such as residuals. The paper surveys theoretical development in work on the Forward Search over the last decade. The main illustration is a regression example with 330 observations and 9 potential explanatory variables. Mention is also made of procedures for multivariate data, including clustering, time series analysis and fraud detection. 相似文献

20.

Maximum studentized score tests for the detection of outliers in time series regression models

《Journal of Statistical Computation and Simulation》2012,82(12):1355-1372

Efficient score tests exist among others, for testing the presence of additive and/or innovative outliers that are the result of the shifted mean of the error process under the regression model. A sample influence function of autocorrelation-based diagnostic technique also exists for the detection of outliers that are the result of the shifted autocorrelations. The later diagnostic technique is however not useful if the outlying observation does not affect the autocorrelation structure but is generated due to an inflation in the variance of the error process under the regression model. In this paper, we develop a unified maximum studentized type test which is applicable for testing the additive and innovative outliers as well as variance shifted outliers that may or may not affect the autocorrelation structure of the outlier free time series observations. Since the computation of the p-values for the maximum studentized type test is not easy in general, we propose a Satterthwaite type approximation based on suitable doubly non-central F-distributions for finding such p-values [F.E. Satterthwaite, An approximate distribution of estimates of variance components, Biometrics 2 (1946), pp. 110–114]. The approximations are evaluated through a simulation study, for example, for the detection of additive and innovative outliers as well as variance shifted outliers that do not affect the autocorrelation structure of the outlier free time series observations. Some simulation results on model misspecification effects on outlier detection are also provided. 相似文献