Similar articles
 20 similar articles found
1.
In this article, we present an alternative test of discordancy for samples of univariate circular data. The new technique is based on the effect that an outlier has on the sum of the circular distances from the point of interest to all other points. The percentage points are calculated and the performance is examined. We compare the performance of the test in detecting an outlier with that of other tests and show that the new approach performs relatively better than other known tests. A practical example is presented as an illustration.
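
The summation described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact test statistic or its percentage points, so the code merely computes, for each observation, the sum of its circular distances to all other observations and reports the point with the largest sum as the candidate outlier. The function names (`circular_distance`, `distance_sums`, `most_extreme`) are hypothetical and are not taken from the article.

```python
import math

def circular_distance(a, b):
    """Shortest arc length between two angles given in radians; lies in [0, pi]."""
    return math.pi - abs(math.pi - abs(a - b) % (2 * math.pi))

def distance_sums(angles):
    """For each angle, the sum of circular distances to every other angle."""
    return [
        sum(circular_distance(a, b) for j, b in enumerate(angles) if j != i)
        for i, a in enumerate(angles)
    ]

def most_extreme(angles):
    """Index and distance sum of the observation farthest, in total, from the rest.

    Assumption: the largest sum marks the candidate outlier. Judging whether it
    is discordant would require the article's percentage points, which are not
    reproduced here.
    """
    sums = distance_sums(angles)
    idx = max(range(len(angles)), key=sums.__getitem__)
    return idx, sums[idx]

if __name__ == "__main__":
    sample = [0.1, 0.2, 0.15, 0.05, 3.0]  # radians; the last value is far from the rest
    print(most_extreme(sample))
```

Using the shortest arc rather than the raw difference keeps angles near 0 and near 2π close to each other, which is the property that distinguishes circular from linear data here.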

2.
As the Watson distribution is frequently used for modeling axial data, it is important to investigate the existence of possible outliers in samples from this distribution. For the bipolar Watson distribution defined on the hypersphere, we therefore develop tests of discordancy, based on the likelihood ratio, for a single outlier or for several outliers en bloc, assuming an alternative contamination model of slippage type. We evaluate the performance of these tests for a single outlier and compare them with other single-outlier discordancy tests available for this distribution.

3.
Detection of outliers or influential observations is an important task in statistical modeling, especially for correlated time series data. In this paper we propose a new procedure to detect a patch of influential observations in the generalized autoregressive conditional heteroskedasticity (GARCH) model. First, we compare the performance of the innovative perturbation scheme, the additive perturbation scheme and the data perturbation scheme in local influence analysis. We find that the innovative perturbation scheme gives better results than the other two schemes, although it may suffer from masking effects. We then use the stepwise local influence method under the innovative perturbation scheme to detect a patch of influential observations and uncover the masking effects. The simulation studies show that the new technique can successfully detect a patch of influential observations or outliers under the innovative perturbation scheme. The analysis based on simulation studies and two real data sets shows that the stepwise local influence method under the innovative perturbation scheme is efficient for detecting multiple influential observations and dealing with masking effects in the GARCH model.

4.
Data from recordings of ore assays from the Western Australian goldfields provide motivation to devise new tests for outliers when observations are distributed with the same mean but differing variances. In the case of equal variances, tests for a single outlier reduce to well-known tests of discordancy. A block discordancy test for k outliers is also described. The question of whether or not one should omit any observation(s) in the calculation of the mean recoverable gold content is addressed in the context of whether or not the data contain outliers, as judged by a normal model for the 'logged' ore assay values. The given data suggest that models with 'logged' values that follow long-tailed approximately normal distributions may be appropriate.

5.
The linear structural model provides one way of modelling a linear relationship between two random variables. It is well known that problems of unidentifiability arise for unreplicated observations and normal error structure. As in all data sets, outliers can arise and methods are needed for detecting and testing them. An outlier-generating model of mean-slippage type can be used to characterise four different forms of outlier manifestation. It is interesting to find that the unidentifiability problem provides no obstacle for detecting or testing the outliers for three of the four forms. Detection principles, and specific discordancy tests, are derived and illustrated by application to some data on physical measurements of Pacific squid.

6.
The study of multivariate outliers raises many problems of definition, principle and manipulation. Well-authenticated tests of discordancy exist only for the multivariate normal distribution. Detection of outliers in non-normal distributions involves the adoption of appropriate criteria to represent 'extremeness' of observations in a sample; corresponding tests of discordancy usually require tedious, or even intractable, distributional and computational manipulations. A class of transformations of the data is considered with a view to transferring some of the familiar and desirable features of discordancy tests for normal samples to non-normal situations.

7.
Statistical models are often based on normal distributions and procedures for testing this distributional assumption are needed. Many goodness-of-fit tests suffer from the presence of outliers, in the sense that they may reject the null hypothesis even in the case of a single extreme observation. We show a possible extension of the Shapiro-Wilk test that is not affected by such a problem. The presented method is inspired by the forward search (FS), a new, recently proposed, diagnostic tool. An application to univariate observations shows how the procedure is able to capture the structure of the data, even in the presence of outliers. Other properties are also investigated.

8.
We show that the existing tests for asymptotic independence are sensitive to outliers. A robust test is proposed. The new test is made stable under contamination through a shrinkage scheme. Simulations show that the new test performs well in the presence of contaminated data while maintaining good properties when there is no contamination. An application to real data shows the added value of our new robust approach.

9.
Outliers can as readily arise in sample survey (i.e. finite population) data as in samples from infinite populations. For infinite populations, an extensive methodology exists; very little has been written on the finite population case. We shall explore matters of definition and concept to formulate some basic principles for handling outliers in sample survey data. Some existing methods for outlier accommodation are reviewed and proposals are made for the dual problem of outlier tests of discordancy.

10.
Logistic regression modeling has received a great deal of attention in the literature in recent years. This covers all aspects of the logistic regression model, including the identification of outliers. A variety of methods for the identification of outliers, such as the standardized Pearson residuals, are now available in the literature. These methods, however, are successful only if the data contain a single outlier. In the presence of multiple outliers in the data, which is often the case in practice, these methods fail to detect the outliers. This is due to the well-known problems of masking (false negative) and swamping (false positive) effects. In this article, we propose a new method for the identification of multiple outliers in logistic regression. We develop a generalized version of standardized Pearson residuals based on group deletion and then propose a technique for identifying multiple outliers. The performance of the proposed method is then investigated through several examples.

11.
Contamination, often in the form of outliers, is very common in data. Among other causes, outliers in a homoscedastic model make the model heteroscedastic. Moreover, outliers distort diagnostic tools for heteroscedasticity such that it may not be correctly identified. In this article, we show how outliers affect heteroscedasticity diagnostics. We then propose a robust procedure for detecting heteroscedasticity in the presence of outliers by robustifying the non-robust component of the Goldfeld–Quandt (GQ) test. The performance of the proposed procedure is examined using simulation experiments and real data sets. The proposed procedure offers great improvement where the conventional GQ and other procedures fail.

12.
We propose a new procedure for detecting a patch of outliers or influential observations in the autoregressive integrated moving average (ARIMA) model using local influence analysis. It is shown that the dependency aspects of time series data give rise to masking or smearing effects when the local influence analysis is performed using current perturbation schemes. We suggest a new perturbation scheme that takes into account the dependent structure of time series data, and employ the stepwise local influence method to give a diagnostic procedure. We show that the new perturbation scheme can avoid the smearing effects, and the stepwise technique of local influence can successfully deal with masking effects. Various simulation studies are performed to show the efficiency of the proposed methodology and a real example is used for illustration.

13.
We study model selection for linear models when there are possible outliers both in the response and the predictor variables. We derive a new criterion based on generalized Huberization and on the newly developed theory of stochastic complexity. For the purpose of comparison, several other criteria are also studied. Some asymptotic properties concerning strong consistency of selecting the optimal model by these criteria are given under general conditions. Other features, such as robustness against outliers and the effect of the signal-to-noise ratio, are discussed as well. Finally, an example and a simulation study are presented to evaluate the finite sample performance.

14.
Cylindrical data are bivariate data formed from the combination of circular and linear variables. However, up to now no work has been done on the detection of outliers in cylindrical data. We introduce a definition of an outlier for cylindrical data and present a new test of discordancy to detect outliers in this type of data, based on the k-nearest-neighbor distance. Cut-off points of the new test statistic based on the Johnson-Wehrly distribution are calculated and its performance is examined using simulation. A practical example is presented using wind speed and wind direction data obtained from the Malaysian Meteorological Department.

15.
In this paper we present a "model free" method of outlier detection for Gaussian time series that uses the autocorrelation structure of the time series. We also present a graphical diagnostic method to distinguish an additive outlier (AO) from an innovation outlier (IO). The test statistic for detecting the outlier has a χ² distribution with one degree of freedom. We show that this method works well when the time series contains either one type of outlier or both additive and innovation outliers, and it has the advantage that no time series model needs to be estimated from the data. Simulation evidence shows that different types of outliers can be graphically distinguished by using the techniques proposed.

16.
In this paper, we introduce two new statistics for detecting outliers in the Pareto distribution. These new statistics are extensions of the statistics for detecting outliers in exponential and gamma distributions. We compare the power of our test statistics with that of the other statistics and select the best test statistic for detecting outliers in the Pareto distribution. Finally, numerical examples of different insurance claims are used to assess the performance of the test.

17.
Recently, several new robust multivariate estimators of location and scatter have been proposed that provide new and improved methods for detecting multivariate outliers. But for small sample sizes, there are no results on how these new multivariate outlier detection techniques compare in terms of p_n, their outside rate per observation (the expected proportion of points declared outliers) under normality. And there are no results comparing their ability to detect truly unusual points based on the model that generated the data. Moreover, there are no results comparing these methods to two fairly new techniques that do not rely on some robust covariance matrix. It is found that for an approach based on the orthogonal Gnanadesikan–Kettenring estimator, p_n can be very unsatisfactory with small sample sizes, but a simple modification gives much more satisfactory results. Similar problems were found when using the median ball algorithm, but a modification proved to be unsatisfactory. The translated-biweights (TBS) estimator generally performs well with a sample size of n≥20 and when dealing with p-variate data where p≤5. But with p=8 it can be unsatisfactory, even with n=200. A projection method as well as the minimum generalized variance method generally perform best, but conditions with p≤5 where the TBS method is preferable are described. In terms of detecting truly unusual points, the methods can differ substantially depending on where the outliers happen to be, the number of outliers present, and the correlations among the variables.

18.
Repeating measurements of efficacy variables in clinical trials may be desirable when the measurement may be affected by ambient conditions. When such measurements are repeated at baseline and at the end of therapy, statistical questions relate to: (1) the best summary measurement to use for a subject when there is a possibility that some observations are contaminated and have increased variances; and (2) the effect of screening procedures which exclude outliers based on within- and between-subject contamination tests. We study these issues in two stages, each using a different set of models. The first stage deals only with the choice of the summary measure. The simulation results show that in some cases of contamination, the power achieved by the tests based on the median exceeds that achieved by the tests based on the mean of the replicates. However, even when we use the median, there are cases when contamination leads to a considerable loss in power. The combined issue of the best summary measurement and the effect of screening is studied in the second stage. The tests use either the observed data or the data after screening for outliers. The simulation results demonstrate that the power depends on the screening procedure as well as on the test statistic used in the study. We found that for the extent and magnitude of contamination considered, within-subject screening has a minimal effect on the power of the tests when there are at least three replicates; as a result, we found no advantage in the use of screening procedures for within-subject contamination. On the other hand, the use of a between-subject screening for outliers increases the power of the test procedures. However, even with the use of screening procedures, heterogeneity of variances can greatly reduce the power of the study.

19.
We construct and investigate robust nonparametric tests for the two-sample location problem. A test based on a suitable scaling of the median of the set of differences between the two samples, which is the Hodges-Lehmann shift estimator corresponding to the Wilcoxon two-sample rank test, leads to higher robustness against outliers than the Wilcoxon test itself, while preserving its efficiency under a broad range of distributions. The performance of the constructed test is investigated under different distributions and outlier configurations and compared to alternatives such as the two-sample t-test, the Wilcoxon test and the median test, as well as to tests based on the difference of the sample medians or the one-sample Hodges-Lehmann estimators.
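
The Hodges-Lehmann shift estimator named in this abstract, the median of all pairwise differences between the two samples, can be sketched in a few lines. This covers only the location estimate: the "suitable scaling" that turns it into a test statistic is not specified in the abstract and is not attempted here. The function name `hodges_lehmann_shift` is a hypothetical choice for this illustration.

```python
from statistics import median

def hodges_lehmann_shift(x, y):
    """Two-sample Hodges-Lehmann shift estimate.

    Returns the median of all len(x) * len(y) pairwise differences x_i - y_j.
    Only the estimator is computed; no scaling or p-value is produced.
    """
    return median(xi - yj for xi in x for yj in y)

if __name__ == "__main__":
    clean = hodges_lehmann_shift([1, 2, 3], [0, 0, 0])
    contaminated = hodges_lehmann_shift([1, 2, 300], [0, 0, 0])
    print(clean, contaminated)
```

In the second call one observation is replaced by a gross outlier, yet the estimate does not move, whereas the difference of the sample means would jump from 2 to 101. This is the robustness property the abstract relies on.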

20.
Five widely used test statistics for detecting outliers and influential observations were studied using the Monte Carlo method. The test statistic based on Studentized residuals, with critical values given by Tietjen, Moore and Beckman (1973), appears to be the best procedure for detecting a single outlier in simple linear regression.
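
As a rough illustration of the Studentized-residual approach mentioned here, the sketch below fits a simple linear regression by least squares and computes internally Studentized residuals, e_i / (s * sqrt(1 - h_i)). It assumes that the observation with the largest absolute Studentized residual is the candidate outlier; the critical values of Tietjen, Moore and Beckman (1973) are not reproduced, so no formal decision is made. The function name `studentized_residuals` is hypothetical.

```python
import math

def studentized_residuals(x, y):
    """Internally Studentized residuals for the least-squares line y = a + b*x.

    Requires at least three points, distinct x values, and a fit that is not
    exact (otherwise the residual standard deviation is zero).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))
    # Leverage of each observation in simple linear regression.
    leverage = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return [e / (s * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]

if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6, 7]
    ys = [1.1, 1.9, 3.0, 9.0, 5.1, 5.9, 7.0]  # the fourth response is inflated
    r = studentized_residuals(xs, ys)
    suspect = max(range(len(r)), key=lambda i: abs(r[i]))
    print(suspect, r[suspect])
```

Dividing by sqrt(1 - h_i) puts residuals at high-leverage and low-leverage x values on a comparable scale before the maximum is taken.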
