Similar literature
 20 similar documents found (search time: 31 ms)
1.
In this paper we assess the sensitivity of the multivariate extreme deviate test for a single multivariate outlier to non-normality in the form of heavy tails. We find that the empirical significance levels can be markedly affected by even modest departures from multivariate normality. The effects are particularly severe when the sample size is large relative to the dimension. Finally, by way of example we demonstrate that certain graphical techniques may prove useful in identifying the source of rejection for the multivariate extreme deviate test.

2.
The Institute of Mathematical Statistics has published a table of critical values for the multivariate extreme deviate test. However, the critical values, derived by a Monte Carlo simulation, are given for only the dimensions 2 through 5. We present new critical values for the dimensions 6 through 10, 12, 15, and 20. The results are presented in both table and graphical form. All critical values for the test statistic have been generated by a Monte Carlo simulation using 10,000 observations per case. An example is presented using the new critical values.
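The Monte Carlo approach described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the test statistic is taken to be the largest squared Mahalanobis distance from the sample mean, "10,000 observations per case" is read as 10,000 simulated samples per (n, p) combination, and the function names `max_mahalanobis` and `mc_critical_value` are hypothetical.

```python
import numpy as np


def max_mahalanobis(sample):
    """Largest squared Mahalanobis distance of any row from the sample mean.

    Assumes sample has shape (n, p) with p >= 2 and n > p.
    """
    centered = sample - sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False)            # unbiased sample covariance S
    solved = np.linalg.solve(cov, centered.T).T   # rows are S^{-1} (x_i - mean)
    d2 = np.einsum("ij,ij->i", centered, solved)  # squared distances, one per row
    return d2.max()


def mc_critical_value(n, p, alpha=0.05, reps=10_000, seed=0):
    """Upper-alpha critical value of the statistic under multivariate normality.

    The statistic is affine invariant, so simulating standard normal data
    covers every mean vector and covariance matrix.
    """
    rng = np.random.default_rng(seed)
    stats = np.array(
        [max_mahalanobis(rng.standard_normal((n, p))) for _ in range(reps)]
    )
    return np.quantile(stats, 1.0 - alpha)
```

A call such as `mc_critical_value(n=30, p=6)` would give a simulated 5% critical value for one cell of such a table; the number of replications controls the Monte Carlo error of the estimate.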

3.
We propose a new approach for outlier detection, based on a ranking measure that focuses on the question of whether a point is ‘central’ for its nearest neighbours. In our notation, a low cumulative rank implies that the point is central. For instance, a point centrally located in a cluster has a relatively low cumulative sum of ranks because it is among the nearest neighbours of its own nearest neighbours, but a point at the periphery of a cluster has a high cumulative sum of ranks because its nearest neighbours are closer to each other than to the point. Using ranks avoids the need to calculate density in the neighbourhood of the point, and this improves performance. Our method performs better than several density-based methods on some synthetic data sets as well as on some real data sets.
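The ranking idea in this abstract can be illustrated with a short sketch. It assumes Euclidean distance and a fixed neighbourhood size `k`, and it scores each point by the sum of the ranks it receives from its own k nearest neighbours; the function name `cumulative_neighbour_ranks` is hypothetical, and any normalisation or thresholding used in the paper itself is not reproduced here.

```python
import numpy as np


def cumulative_neighbour_ranks(points, k):
    """Score each point by how its k nearest neighbours rank it.

    For a point x with nearest neighbours q_1..q_k, the score is the sum over
    q_j of the rank of x among all other points ordered by distance from q_j.
    A low score means x is 'central'; a high score suggests an outlier.
    """
    n = len(points)
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)            # a point is never its own neighbour

    order = np.argsort(dist, axis=1)          # order[q] lists points nearest-first
    ranks = np.empty((n, n), dtype=int)
    rows = np.arange(n)[:, None]
    ranks[rows, order] = np.arange(1, n + 1)  # ranks[q, x] = rank of x as seen from q

    neighbours = order[:, :k]                 # k nearest neighbours of each point
    return ranks[neighbours, rows].sum(axis=1)
```

For a tight cluster plus one distant point, the distant point is ranked last by every cluster member and so receives the largest score, while a point in the middle of the cluster receives the smallest.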

4.
Cylindrical data are bivariate data formed by combining a circular and a linear variable. To date, no work has been done on the detection of outliers in cylindrical data. We introduce a definition of an outlier for cylindrical data and present a new test of discordancy to detect outliers in this type of data, based on the k-nearest-neighbour distance. Cut-off points of the new test statistic based on the Johnson-Wehrly distribution are calculated and its performance is examined using simulation. A practical example is presented using wind speed and wind direction data obtained from the Malaysian Meteorological Department.

5.
In this paper we consider the multiple outlier problem in time series analysis. The underlying undisturbed time series is assumed to be an autoregressive process. The location of the suspicious values is supposed to be known. We introduce conditional least squares estimators for the parameters. The estimates are shown to be strongly consistent. Using arguments similar to those in the theory of linear models, we obtain a test statistic for the general linear hypothesis. Its asymptotic distribution is derived.

6.
A new type of procedure for estimating the number of outliers in a sample is presented and compared with existing procedures. The probabilities of exact, under-, and overestimation with the different procedures are examined for two different contamination schemes.

7.
Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's T² statistic is calculated for each new observation and an out‐of‐control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high‐dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave‐one‐out re‐sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling T² approach in power, and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days.
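A rough sketch of the two ingredients named in this abstract, a regularised covariance estimate and a leave-one-out empirical control limit, is given below. The abstract does not specify the regulariser, so a simple ridge term `lam * I` is assumed here purely for illustration; the function names and the default values of `lam` and `alpha` are hypothetical and are not taken from the paper.

```python
import numpy as np


def ridge_precision(data, lam):
    """Inverse of the ridge-regularised sample covariance, (S + lam * I)^{-1}.

    Adding lam * I keeps the matrix invertible even when p exceeds n.
    """
    cov = np.cov(data, rowvar=False)
    return np.linalg.inv(cov + lam * np.eye(cov.shape[0]))


def t2_statistic(x, mean, precision):
    """Hotelling-type quadratic form (x - mean)' P (x - mean)."""
    d = x - mean
    return float(d @ precision @ d)


def loo_control_limit(baseline, lam=1.0, alpha=0.05):
    """Empirical upper control limit from leave-one-out T^2 values.

    Each baseline row is scored against the mean and regularised precision
    estimated from the remaining rows, mimicking a future observation.
    """
    n = baseline.shape[0]
    loo = np.empty(n)
    for i in range(n):
        rest = np.delete(baseline, i, axis=0)
        loo[i] = t2_statistic(
            baseline[i], rest.mean(axis=0), ridge_precision(rest, lam)
        )
    return np.quantile(loo, 1.0 - alpha)
```

A new observation would then be flagged when its `t2_statistic`, computed against the full baseline mean and `ridge_precision(baseline, lam)`, exceeds the value returned by `loo_control_limit`. How `lam` should be chosen is left open in this sketch.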

8.
This paper investigates how classical measurement error and additive outliers (AO) influence tests for structural change based on F-statistics. We derive theoretically the impact of general additive disturbances in the regressors on the asymptotic distribution of these tests for structural change. The small sample properties in the case of classical measurement error and AO are investigated via Monte Carlo simulations, revealing that sizes are biased upwards and that powers are reduced. Two wavelet-based denoising methods are used to reduce these distortions. We show that these two methods can significantly improve the performance of structural break tests.

9.
A general way of detecting multivariate outliers involves using robust depth functions, or, equivalently, the corresponding ‘outlyingness’ functions; the more outlying an observation, the more extreme (less deep) it is in the data cloud and thus potentially an outlier. Most outlier detection studies in the literature assume that the underlying distribution is multivariate normal. This paper deals with the case of multivariate skewed data, specifically when the data follow the multivariate skew-normal [1] distribution. We compare the outlier detection capabilities of four robust outlier detection methods through their outlyingness functions in a simulation study. Two scenarios are considered for the occurrence of outliers: ‘the cluster’ and ‘the radial’. Conclusions and recommendations are offered for each scenario.

10.
11.
Procedures for the detection of outliers in familial data are given for the mean-slippage and dispersion-slippage outlier models, for both equal and unequal family sizes. The distributions of the test statistics are derived and Bonferroni bounds for the significance probabilities are given.

12.
We show that deviance residuals derived using the proportional hazards assumption (including Cox regression) are not asymptotically standard normal, but that a scale-location adjustment makes them nearly standard normal, even for moderate sample sizes. This adjustment should aid in outlier detection, as it allows a more exact assessment of when a deviance residual is unusually large.

13.
14.
This study explores the correct identification of seasonal outliers using the most commonly applied test statistics. We evaluate the performance of the seasonal level shift (SLS) test by means of the empirical level of significance, the power of the test for sensitivity in detecting changes, and the vulnerability to masking of outliers by misspecification frequencies. We observe that the size of the SLS affects the sampling distribution of ηSLS (the test statistic for SLS detection) in the case of SAR(1) and SMA(1) models. The empirical critical values for the 1%, 5%, and 10% upper percentiles are higher than the usual cut-off points, and the empirical level of significance is inversely related to sample size and the model coefficients. The empirical power of the test statistic is not satisfactory at small sample sizes or for large model coefficients. ηSLS can be confused with IO. The candidate list of outlier types should therefore retain both IO and SLS as part of the outlier detection procedure for the most efficient results. We apply the method suggested by Kaiser and Maravall with five possible types of outliers, that is, AO, IO, LS, TC, and SLS, to a number of quarterly and monthly time series from Pakistan.

15.
The growth curve model introduced by Potthoff and Roy (1964) is a general statistical model which includes as special cases regression models and both univariate and multivariate analysis of variance models. In this paper, we discuss procedures for the detection of outliers in growth curve models under the mean-slippage and dispersion-slippage outlier models. The distributions of the test statistics are discussed and the significance probabilities are given using Bonferroni's bounds. Some simulation results are also presented.

16.
The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) obtained the influence function for the Mahalanobis distance between two populations, which can be used for detecting outliers in discriminant analysis. In this paper influence functions for a variety of parametric functions in multivariate analysis are obtained. Influence functions are obtained for the generalized variance, the matrix of regression coefficients, the noncentrality matrix Σ⁻¹δ in multivariate analysis of variance and its eigenvalues, the matrix L, which is a generalization of 1 − R², canonical correlations, principal components, and parameters that correspond to Pillai’s statistic (1955), Hotelling’s (1951) generalized T₀² and Wilks’s Λ (1932), all of which can be used for outlier detection in multivariate analysis. Devlin, Gnanadesikan and Kettenring (1975) obtained the influence function for the population correlation coefficient in the bivariate case. It is shown in this paper that influence functions for parameters corresponding to r², R², and Mahalanobis D² can be obtained as particular cases.

17.
Binomial integer-valued AR processes have been well studied in the literature, but there has been little progress in modeling bounded integer-valued time series with outliers. In this paper, we first review some basic properties of the binomial integer-valued AR(1) process and then introduce binomial integer-valued AR(1) processes with two classes of innovational outliers. We focus on the joint conditional least squares (CLS) and the joint conditional maximum likelihood (CML) estimates of the model parameters and the probability of occurrence of the outlier. Their large-sample properties are illustrated by simulation studies. Artificial and real data examples are used to demonstrate the good performance of the proposed models.

18.
This article provides a procedure for the detection and identification of outliers in the spectral domain where the Whittle maximum likelihood estimator of the panel data model proposed by Chen [W.D. Chen, Testing for spurious regression in a panel data model with the individual number and time length growing, J. Appl. Stat. 33(88) (2006b), pp. 759–772] is implemented. We extend the approach of Chang and co-workers [I. Chang, G.C. Tiao, and C. Chen, Estimation of time series parameters in the presence of outliers, Technometrics 30 (2) (1988), pp. 193–204] to the spectral domain, and through the Whittle approach we can quickly detect and identify the type of outliers. A fixed effects panel data model is used, in which the remainder disturbance is assumed to be a fractional autoregressive integrated moving-average (ARFIMA) process and the likelihood ratio criterion is obtained directly through the modified inverse Fourier transform. This saves much time, especially when the model is estimated on a very large data set.

Through Monte Carlo experiments, the consistency of the estimator is examined by growing the individual number N and time length T, in which the long memory remainder disturbances are contaminated with two types of outliers: additive outlier and innovation outlier. From the power tests, we see that the estimators are quite successful and powerful.

In the empirical study, we apply the model to Taiwan's computer motherboard industry. Weekly data from 1 January 2000 to 31 October 2006 for nine well-known companies are used. The proposed model has a smaller mean square error and shows more distinctive aggressive properties than the raw data model does.


19.
Srivastava (1980) has shown that Grubbs's (1950) test for a univariate outlier is robust against the effect of equicorrelation. In this note we extend Srivastava's result by giving a more general covariance structure, which relaxes both the covariance structure and the assumption of equal variances. We also show that under the more general covariance structure, the power of Grubbs's test, as well as the significance level, is identical to the independently and identically distributed case.

20.
In the conventional hypothesis-testing approach to the detection of a unit root and a trend break, selections of the outlier type (additive or innovational) and of the break type (jump or kink) are carried out arbitrarily, because there is no generally accepted statistical technique. To overcome this problem, a model-selection approach using the modified Bayesian information criterion (MBIC) is proposed. Whether the observed time series contains a unit root and a trend break is determined as a result of model selection from among alternative models with and without unit root and trend break. The efficacy of the proposed approach is verified using comprehensive simulations.
