A note on contamination models and outliers |
| |
Authors: | Jürgen Wellmann; Ursula Gather |
| |
Affiliation: | 1. Institute of Epidemiology, GSF-National Research Center for Environment and Health, Ingolstädter Landstraße 1, Neuherberg, D-85764, Germany; 2. Department of Statistics, Vogelpothsweg 87, University of Dortmund, Dortmund, D-44221, Germany |
| |
Abstract: | In order to describe or generate so-called outliers in univariate statistical data, contamination models are often used. These models assume that k out of n independent random variables are shifted or multiplied by some constant, whereas the other observations still come i.i.d. from some common target distribution. Of course, these contaminants do not necessarily stick out as the extremes in the sample. Moreover, it is the amount and magnitude of ‘contamination’ which determines the number of obvious outliers. Using the concept of Davies and Gather (1993) to formalize the outlier notion, we quantify the amount of contamination needed to produce a prespecified expected number of ‘genuine’ outliers. In particular, we demonstrate that for samples of moderate size from a normal target distribution a rather large shift of the contaminants is necessary to yield a certain expected number of outliers. Such an insight is of interest when designing simulation studies where outliers should occur as well as in theoretical investigations on outliers. |
| |
Keywords: | slippage model; univariate data; expected number of outliers |
|
|
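The relationship between shift size and expected number of outliers described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes a known standard normal target distribution, a slippage model in which the first k of n observations are shifted by a constant, and an outlier region of the form |x| > z_{1 - alpha_n/2} with alpha_n = 1 - (1 - alpha)^(1/n), which is one common way to set up an outlier region in the sense of Davies and Gather (1993). The function names, the default alpha = 0.05, and the simulation settings are all illustrative assumptions.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)  # assumed target distribution N(0, 1)


def outlier_cutoff(n, alpha=0.05):
    """Cutoff c such that |x| > c is the assumed outlier region for sample size n.

    Uses alpha_n = 1 - (1 - alpha)**(1/n), so that a clean sample of size n
    contains no outlier with probability 1 - alpha.
    """
    alpha_n = 1.0 - (1.0 - alpha) ** (1.0 / n)
    return STD_NORMAL.inv_cdf(1.0 - alpha_n / 2.0)


def expected_outliers(n, k, shift, alpha=0.05):
    """Expected number of observations in the outlier region when k of n are shifted."""
    c = outlier_cutoff(n, alpha)
    # Probability that a regular N(0, 1) observation falls in the region.
    p_clean = 2.0 * (1.0 - STD_NORMAL.cdf(c))
    # Probability that a contaminant distributed as N(shift, 1) falls in the region.
    p_contaminant = (1.0 - STD_NORMAL.cdf(c - shift)) + STD_NORMAL.cdf(-c - shift)
    return (n - k) * p_clean + k * p_contaminant


def simulate_outliers(n, k, shift, alpha=0.05, reps=20000, seed=0):
    """Monte Carlo estimate of the same expectation, as a sanity check."""
    rng = random.Random(seed)
    c = outlier_cutoff(n, alpha)
    total = 0
    for _ in range(reps):
        for i in range(n):
            x = rng.gauss(0.0, 1.0) + (shift if i < k else 0.0)
            if abs(x) > c:
                total += 1
    return total / reps


if __name__ == "__main__":
    n, k = 20, 2
    for shift in (1.0, 2.0, 3.0, 4.0, 5.0):
        exact = expected_outliers(n, k, shift)
        approx = simulate_outliers(n, k, shift, reps=5000)
        print(f"shift={shift:.1f}  expected={exact:.3f}  simulated={approx:.3f}")
```

Under these assumptions, the printed table shows how the expected count grows with the shift for a fixed number of contaminants. The sketch treats the target parameters as known; replacing them with estimates from the contaminated sample would change the results.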