1.

Mean-shift outliers model in skew scale-mixtures of normal distributions

《Journal of Statistical Computation and Simulation》2012,82(12):2346-2361

ABSTRACT

Asymmetric models have been discussed quite extensively in recent years, in situations where the normality assumption is suspect due to lack of symmetry in the data. Techniques for assessing the quality of fit and diagnostic analysis are important for model validation. This paper presents a study of the mean-shift method for the detection of outliers in regression models under skew scale-mixtures of normal distributions. Analytical solutions for the estimators of the parameters are obtained through the use of the Expectation–Maximization algorithm. The observed information matrix for the calculation of standard errors is obtained for each distribution. Simulation studies and an application to the analysis of a data set have been carried out, showing the efficiency of the proposed method in detecting outliers.
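As a point of reference, the mean-shift idea can be sketched in ordinary normal simple linear regression. This is a deliberate simplification and not the skew scale-mixture model or the EM-based estimators of the paper. Adding a shift parameter for observation i and testing whether it is zero is equivalent to computing that observation's externally studentized residual. The function name `mean_shift_t` and the data are hypothetical.

```python
import math

def mean_shift_t(x, y):
    """Externally studentized residuals for the fit y = a + b*x.

    t[i] is the t statistic for gamma_i = 0 in the mean-shift model
    y_j = a + b*x_j + gamma_i * 1[j == i] + error_j (normal errors assumed).
    """
    n, p = len(x), 2  # p = number of regression coefficients
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    rss = sum(e * e for e in resid)
    t = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx              # leverage of this case
        s2_del = (rss - e * e / (1.0 - h)) / (n - p - 1)  # error variance with the case deleted
        t.append(e / math.sqrt(s2_del * (1.0 - h)))
    return t

# Hypothetical data: a straight line with small noise and one shifted response.
x = list(range(10))
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.1, -0.1, 0.2, -0.15, 0.05]
y = [2 + 3 * xi + ei for xi, ei in zip(x, noise)]
y[5] += 10  # planted mean shift
t = mean_shift_t(x, y)
suspect = max(range(len(t)), key=lambda i: abs(t[i]))
print(suspect, round(t[suspect], 2))
```

Under normal errors each statistic can be compared with a t distribution on n - p - 1 degrees of freedom, with a multiplicity adjustment when all observations are screened.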

2.

Outlier detection and accommodation in general spatial models

Xiaowen Dai Libin Jin Anqi Shi Lei Shi 《Statistical Methods and Applications》2016,25(3):453-475

This paper studies outlier detection and accommodation in general spatial models, which include spatial autoregressive models and spatial error models as special cases. Using mean-shift and variance-weight models respectively, test statistics for multiple outliers are derived and detection procedures are proposed. In addition, several key diagnostic measures such as standardized residuals and leverage measures are defined in general spatial models. Outlier modified models are proposed to accommodate outliers in the data set. The performance of the test statistics, including size and power, is examined via simulation studies. Three real examples are analyzed and the results show that the proposed methodology is useful for identifying and accommodating outliers in general spatial models.

3.

Robust Detection of Multiple Outliers in Grouped Multivariate Data

Chrys Caroni Nedret Billor 《Journal of Applied Statistics》2007,34(10):1241-1250

Many methods have been developed for detecting multiple outliers in a single multivariate sample, but very few for the case where there may be groups in the data. We propose a method of simultaneously determining groups (as in cluster analysis) and detecting outliers, which are points that are distant from every group. Our method is an adaptation of the BACON algorithm proposed by Billor, Hadi and Velleman for the robust detection of multiple outliers in a single group of multivariate data. There are two versions of our method, depending on whether or not the groups can be assumed to have equal covariance matrices. The effectiveness of the method is illustrated by its application to two real data sets and further shown by a simulation study for different sample sizes and dimensions for 2 and 3 groups, with and without planted outliers in the data. When the number of groups is not known in advance, the algorithm could be used as a robust method of cluster analysis, by running it for various numbers of groups and choosing the best solution.
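The single-group idea behind BACON can be illustrated with a much simplified sketch for bivariate data. It is not the grouped method of this paper and not a faithful copy of the original algorithm: the small-sample correction factor applied to the cutoff is omitted, the initial subset size `m` is an assumed default, and all function names and data are hypothetical. The sketch starts from the points closest to the coordinatewise median and repeatedly re-estimates the mean and covariance from the points whose squared Mahalanobis distance falls below a chi-square cutoff.

```python
import math
from statistics import median

def mean_cov(points):
    """Sample mean and 2x2 covariance (divisor n - 1) of bivariate points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(p, mean, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # assumed nonzero (subset not collinear)
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def bacon_outliers(points, alpha=0.05, m=8, max_iter=50):
    """Indices flagged as outliers by the simplified iteration."""
    n = len(points)
    # Upper alpha/n quantile of chi-square with 2 df has the closed form -2*ln(alpha/n).
    cutoff = -2.0 * math.log(alpha / n)
    med = (median(p[0] for p in points), median(p[1] for p in points))
    order = sorted(range(n), key=lambda i: (points[i][0] - med[0]) ** 2
                                           + (points[i][1] - med[1]) ** 2)
    subset = sorted(order[:m])  # initial basic subset
    for _ in range(max_iter):
        mean, cov = mean_cov([points[i] for i in subset])
        new = [i for i in range(n) if mahalanobis_sq(points[i], mean, cov) < cutoff]
        if new == subset:  # subset no longer changes
            break
        subset = new
    return [i for i in range(n) if i not in subset]

# Hypothetical data: a 5 x 4 grid of regular points plus two distant points.
points = [(i % 5, i // 5) for i in range(20)] + [(20, 20), (22, 21)]
flagged = bacon_outliers(points)
print(flagged)
```

Starting from a small central subset is what limits masking: the planted points never enter the estimates that are used to judge them.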

4.

A New Discordancy Test in Circular Data Using Spacings Theory

I. B. Mohamed A. Rambli N. Khaliddin A. I. N. Ibrahim 《Communications in Statistics - Simulation and Computation》2016,45(8):2904-2916

In this article, we propose a new test of discordancy based on spacings theory in circular data. The test should provide a good alternative to existing tests of discordancy for detecting single or well-separated multiple outliers. In addition, the new method can be generalized to identify a patch of outliers in data. The percentage points are calculated and the performance is examined. We first investigate the performance of the test for detecting a single outlier and show that the new test performs well compared to other known tests. We then show that the generalized test works well in detecting a patch of outliers in the data. As an illustration, a practical example based on an eye dataset obtained from a glaucoma clinic at the University of Malaya Medical Center, Malaysia is presented.

5.

Combining Bayesian method and Kalman smoother for detection additive outlier patches in autoregressive time series

Farideh Mohammadinia Rahim Chinipardaz 《Communications in Statistics - Simulation and Computation》2013,42(7):2191-2209

ABSTRACT

This article proposes a method for detecting patches of additive outliers in autoregressive time series models. The procedure improves the existing detection methods via Gibbs sampling. We combine the Bayesian method and the Kalman smoother to present some candidate models of outlier patches, and the best model, with the minimum Bayesian information criterion (BIC), is selected among them. We propose that this combined Bayesian and Kalman method (CBK) can reduce the masking and swamping effects in detecting patches of additive outliers. The correctness of the method is illustrated by simulated data and then by analyzing a real set of observations.

6.

Detection of outliers in mixed regressive-spatial autoregressive models

Libin Jin Xiaowen Dai Anqi Shi 《Communications in Statistics - Theory and Methods》2013,42(17):5179-5192

ABSTRACT

This article studies the outlier detection problem in the mixed regressive-spatial autoregressive model. The formulae for testing outliers and their approximate distributions are derived under the mean-shift model and the variance-weight model, respectively. Simulation studies are conducted to examine the power and size of the test, as well as the detection of outliers when a simulated data set contains several outliers. A real data set is analyzed to illustrate the proposed method, and modified models based on mean-shift and variance-weight models in which detected outliers are taken into account are suggested to deal with the outliers and confirm the conclusions.

7.

Local influence in multivariate normal data

Myung Geun Kim 《Journal of Applied Statistics》1996,23(5):535-542

The local influence method introduced by Cook is adapted to multivariate normal data for the purpose of detecting outliers. The method allows simultaneous perturbations on all observations, so that it can identify multiple outliers. An illustrative example is given to show the effectiveness of the method for the identification of influential observations.

8.

The Identification of Multiple Outliers in ARIMA Models

《Communications in Statistics - Theory and Methods》2013,42(6):1265-1287

Abstract

There are three main problems in the existing procedures for detecting outliers in ARIMA models. The first one is the biased estimation of the initial parameter values that may strongly affect the power to detect outliers. The second problem is the confusion between level shifts and innovative outliers when the series has a level shift. The third problem is masking. We propose a procedure that keeps the powerful features of previous methods but improves the initial parameter estimate, avoids the confusion between innovative outliers and level shifts and includes joint tests for sequences of additive outliers in order to solve the masking problem. A Monte Carlo study and one example of the performance of the proposed procedure are presented.

9.

Outliers detection in multivariate spatial linear models

《Journal of Statistical Planning and Inference》2006,136(1):125-146

In geostatistics, detecting atypical observations is of special interest due to the changes they can cause in environmental and geological patterns. Several methods for detecting them have already been suggested for the univariate spatial case. However, the problem is more complicated when various variables are observed simultaneously and the spatial correlation among them must be taken into account. The aim of this paper is to detect outliers and influential observations in multivariate spatial linear models. For this purpose, we derive and explore two different methods. First, a multivariate version of the forward search algorithm is given, where locations with outliers are detected in the last steps of the procedure. Next, we derive influence measures to assess the impact of the observations on the multivariate spatial linear model. The procedures are easy to compute and to interpret by means of graphical representations. Finally, an example and a Monte Carlo study illustrate the performance of these methods for identification of outliers in multivariate spatial linear models.

10.

Outliers in Multi-Response Experiments

Lalmohan Bhar Sankalpa Ojha 《Communications in Statistics - Theory and Methods》2014,43(13):2782-2798

The Cook-statistic has been developed for detecting outliers in two likely situations of occurrence of outliers in multi-response experiments. In the first situation, more than one outlying observation vector is considered. Each of these vectors is obtained on the assumption that a particular observation from each of the responses is an outlier. A general expression of the Cook-statistic for detecting any such t outlying observation vectors is obtained, and some particular cases are then considered. In the second situation, observations from all the responses may not be outliers. Here also a general expression of the Cook-statistic is obtained for detecting any t observations from each of any k responses as outliers. In both cases the Cook-statistic is applied to real experimental data.
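For orientation, the classical single-response Cook's distance in simple linear regression can be sketched as follows. This is the textbook single-case statistic, not the multi-response generalization derived in the paper, and the function name `cooks_distance` and the data are hypothetical.

```python
def cooks_distance(x, y):
    """Cook's distance D_i for each case in the fit y = a + b*x.

    D_i = e_i^2 * h_i / (p * s^2 * (1 - h_i)^2), with p = 2 coefficients,
    e_i the ordinary residual, h_i the leverage and s^2 = RSS / (n - p).
    """
    n, p = len(x), 2
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)
    d = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx  # leverage of this case
        d.append(e * e * h / (p * s2 * (1.0 - h) ** 2))
    return d

# Hypothetical data: the last case combines high leverage with a shifted response.
x = list(range(10))
noise = [0.1, -0.2, 0.15, -0.05, 0.0, 0.1, -0.1, 0.2, -0.15, 0.05]
y = [1 + 2 * xi + ei for xi, ei in zip(x, noise)]
y[9] += 8
d = cooks_distance(x, y)
most_influential = max(range(len(d)), key=lambda i: d[i])
print(most_influential, round(d[most_influential], 2))
```

The statistic weights the squared residual by leverage, so a case at the edge of the design with a large residual dominates the ranking.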

11.

Simultaneous rank tests for detecting differentially expressed genes

《Journal of Statistical Computation and Simulation》2012,82(5):959-972

Rank tests are known to be robust to outliers and violation of distributional assumptions. Two major issues besetting microarray data are violation of the normality assumption and contamination by outliers. In this article, we formulate the normal theory simultaneous tests and their aligned rank transformation (ART) analog for detecting differentially expressed genes. These tests are based on the least-squares estimates of the effects when data follow a linear model. Application of the two methods is then demonstrated on a real data set. To evaluate the performance of the aligned rank transform method against the corresponding normal theory method, data were simulated according to the characteristics of a real gene expression data set. These simulated data are then used to compare the two methods with respect to their sensitivity to the distributional assumption and to outliers for controlling the family-wise Type I error rate, power, and false discovery rate. It is demonstrated that the ART generally possesses the robustness of validity property even for microarray data with a small number of replications. Although these methods can be applied to more general designs, in this article the simulation study is carried out for a dye-swap design since this design is broadly used in cDNA microarray experiments.

12.

On detecting outliers in the Pareto distribution

Mehdi Jabbari Nooghabi 《Journal of Statistical Computation and Simulation》2019,89(8):1466-1481

In this paper, we introduce two new statistics for detecting outliers in the Pareto distribution. These new statistics are extensions of the statistics for detecting outliers in exponential and gamma distributions. We compare the power of our test statistics with that of other statistics and select the best test statistic for detecting outliers in the Pareto distribution. Finally, numerical examples of different insurance claims are used to assess the performance of the test.

13.

Bayesian change-point problem using Bayes factor with hierarchical prior distribution

Myoungjin Jung Seongho Song 《Communications in Statistics - Theory and Methods》2017,46(3):1352-1366

We consider hierarchical Bayesian models of the change-point problem in a sequence of random variables having either a normal population or a skew-normal population. Further, we consider the problem of detecting an influential point concerning the change point using Bayes factors. Our proposed models are illustrated with a real data example, the annual flow volume data of the Nile River at Aswan from 1871 to 1970. The results from our proposed models indicate that the most influential observation among the outliers occurs in the year 1888. We have shown that it is useful to measure the influence of observations on Bayes factors. Here, we consider omitting a single observation as well.

14.

A clustering approach to detect multiple outliers in linear functional relationship model for circular data

Nurkhairany Amyra Mokhtar Abdul Ghapor Hussin 《Journal of Applied Statistics》2018,45(6):1041-1051

Outlier detection has been used extensively in data analysis to detect anomalous observations in data. It has important applications such as fraud detection and robust analysis, among others. In this paper, we propose a method for detecting multiple outliers in the linear functional relationship model for circular variables. Using the residual values of the Caires and Wyatt model, we apply the hierarchical clustering approach. With the use of a tree diagram, we illustrate the detection of outliers graphically. A Monte Carlo simulation study is done to verify the accuracy of the proposed method. Low probabilities of masking and swamping effects indicate the validity of the proposed approach. Illustrations using two sets of real data are also given to show its practical applicability.

15.

Multiple outliers detection in sparse high-dimensional regression

Tao Wang Qun Li Bin Chen 《Journal of Statistical Computation and Simulation》2018,88(1):89-107

The presence of outliers would inevitably lead to distorted analysis and inappropriate prediction, especially for multiple outliers in high-dimensional regression, where the high dimensionality of the data might amplify the chance of an observation or multiple observations being outlying. Noting that the detection of outliers is not only necessary but also important in high-dimensional regression analysis, we propose in this paper a feasible outlier detection approach for the sparse high-dimensional linear regression model. Firstly, we search for a clean subset by use of the sure independence screening method and the least trimmed squares regression estimates. Then, we define a high-dimensional outlier detection measure and propose a multiple outliers detection approach through multiple testing procedures. In addition, to enhance efficiency, we refine the outlier detection rule after obtaining a relatively reliable non-outlier subset based on the initial detection approach. By comparison studies based on Monte Carlo simulation, it is shown that the proposed method performs well for detecting multiple outliers in the sparse high-dimensional linear regression model. We further illustrate the application of the proposed method by empirical analysis of a real-life protein and gene expression data set.

16.

A simple diagnostic method of outlier detection for stationary Gaussian time series (total citations: 1; self-citations: 0; citations by others: 1)

Yuzhi Cai Neville Davies 《Journal of Applied Statistics》2003,30(2):205-223

In this paper we present a "model-free" method of outlier detection for Gaussian time series by using the autocorrelation structure of the time series. We also present a graphic diagnostic method in order to distinguish an additive outlier (AO) from an innovation outlier (IO). The test statistic for detecting the outlier has a χ² distribution with one degree of freedom. We show that this method works well when the time series contains either one type of outlier or both additive and innovation outliers, and this method has the advantage that no time series model needs to be estimated from the data. Simulation evidence shows that different types of outliers can be graphically distinguished by using the techniques proposed.

17.

Detection of outliers in bivariate time series data

Ravindra Khattree Dayanand N. Naik 《Communications in Statistics - Theory and Methods》2013,42(12):3701-3714

In this article, we use the influence function matrix of auto and cross-correlations of a bivariate (multivariate) time series for detecting outliers. The multivariate analog of the graphical method of Chernick et al. (1982) to detect outliers and partial outliers is presented. A simulation study illustrating the method is also given.

18.

DETECTION AND TESTING OF DIFFERENT TYPES OF OUTLIER IN LINEAR STRUCTURAL RELATIONSHIPS

Vic Barnett 《Australian & New Zealand Journal of Statistics》1985,27(2):151-162

The linear structural model provides one way of modelling a linear relationship between two random variables. It is well known that problems of unidentifiability arise for unreplicated observations and normal error structure. As in all data sets, outliers can arise and methods are needed for detecting and testing them. An outlier-generating model of mean-slippage type can be used to characterise four different forms of outlier manifestation. It is interesting to find that the unidentifiability problem provides no obstacle for detecting or testing the outliers for three of the four forms. Detection principles, and specific discordancy tests, are derived and illustrated by application to some data on physical measurements of Pacific squid.

19.

Approximate bounded influence estimation for longitudinal data with outliers and measurement errors

Lang Wu Jin Qiu 《Journal of Statistical Planning and Inference》2011,141(7):2321-2330

Mixed effects models or random effects models are popular for the analysis of longitudinal data. In practice, longitudinal data are often complex since there may be outliers in both the response and the covariates and there may be measurement errors. The likelihood method is a common approach for these problems but it can be computationally very intensive and sometimes may even be computationally infeasible. In this article, we consider approximate robust methods for nonlinear mixed effects models to simultaneously address outliers and measurement errors. The approximate methods are computationally very efficient. We show the consistency and asymptotic normality of the approximate estimates. The methods can also be extended to missing data problems. An example is used to illustrate the methods and a simulation is conducted to evaluate the methods.

20.

Robust mixture regression modeling using the least trimmed squares (LTS)-estimation method

Fatma Zehra Doğru Olcay Arslan 《Communications in Statistics - Simulation and Computation》2018,47(7):2184-2196

Mixture regression models are used to investigate the relationship between variables that come from unknown latent groups and to model heterogeneous datasets. In general, the error terms are assumed to be normal in the mixture regression model. However, the estimators under the normality assumption are sensitive to outliers. In this article, we introduce a robust mixture regression procedure based on the LTS-estimation method to combat outliers in the data. We give a simulation study and a real data example to illustrate the performance of the proposed estimators over their counterparts in terms of dealing with outliers.
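The LTS criterion itself can be sketched for a single straight line, outside any mixture model. The version below is an assumption-laden illustration rather than the authors' procedure: it starts from every pair of points and applies concentration steps (refit ordinary least squares on the h cases with the smallest squared residuals), which is only practical for small samples. The function names, the default h, and the data are hypothetical.

```python
from itertools import combinations

def ols(x, y):
    """Ordinary least squares intercept and slope (x values must not all be equal)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b * xbar, b

def trimmed_objective(x, y, a, b, h):
    """Sum of the h smallest squared residuals and the indices of those cases."""
    r2 = [(yk - a - b * xk) ** 2 for xk, yk in zip(x, y)]
    keep = sorted(range(len(x)), key=lambda k: r2[k])[:h]
    return sum(r2[k] for k in keep), keep

def lts_line(x, y, h=None, max_steps=20):
    """Approximate LTS fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    if h is None:
        h = (n + 2 + 1) // 2  # assumed default coverage for two coefficients
    best = None
    for i, j in combinations(range(n), 2):
        if x[i] == x[j]:
            continue  # a vertical pair defines no line
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        obj, keep = trimmed_objective(x, y, a, b, h)
        for _ in range(max_steps):  # concentration steps
            a_new, b_new = ols([x[k] for k in keep], [y[k] for k in keep])
            obj_new, keep_new = trimmed_objective(x, y, a_new, b_new, h)
            if obj_new >= obj:
                break  # no further decrease of the trimmed sum
            a, b, obj, keep = a_new, b_new, obj_new, keep_new
        if best is None or obj < best[0]:
            best = (obj, a, b)
    return best[1], best[2]

# Hypothetical data: nine points on y = 1 + 2x and three gross outliers.
x = list(range(12))
y = [1 + 2 * xi for xi in x]
y[9], y[10], y[11] = 60, -30, 55
a_ols, b_ols = ols(x, y)
a_lts, b_lts = lts_line(x, y)
print(b_ols, b_lts)
```

Because the trimmed sum ignores the largest residuals, the three planted points do not pull the LTS line, whereas the ordinary least squares slope is visibly distorted.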