Boxplot-Based Outlier Detection for the Location-Scale Family
Authors: Y. H. Dovoedo
Affiliation: Department of Mathematics, University of North Alabama, Florence, Alabama, USA
Abstract: Boxplots are among the most widely used exploratory data analysis (EDA) tools in statistical practice. Typical applications include eliciting information about the underlying distribution (shape, location, etc.) and identifying possible outliers. This article focuses on a modification whose lower and upper fences are similar in concept to those of a traditional boxplot. Instead of constructing the fences from the lower and upper quartiles and a multiple of the interquartile range (IQR), the modification uses multiples of the lower and upper semi-interquartile ranges (SIQR), respectively, measured from the sample median. Any observation beyond the proposed fences is labeled a potential outlier. An exact expression for the probability that at least one sample observation is wrongly classified as an outlier, the so-called “some-outside rate per sample” (Hoaglin et al., 1986), is derived for the family of location-scale distributions and is used to determine the fence constants. Tables of fence constants are provided for a number of well-known location-scale distributions, along with some illustrations with data; the performance of the outlier detection rule is explored in a simulation study.
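The fence rule described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the article's implementation: the function names, the choice of quartile estimator (`statistics.quantiles` with `method="inclusive"`), and the constants `k_lower` and `k_upper` are all hypothetical. The article tabulates its own fence constants, which are not reproduced here. The second part estimates the some-outside rate by Monte Carlo simulation for normal samples; it is a stand-in for, not a reconstruction of, the exact expression the article derives.

```python
import random
import statistics


def siqr_fences(data, k_lower, k_upper):
    """Return (lower_fence, upper_fence) measured from the sample median.

    k_lower and k_upper are hypothetical fence constants supplied by the
    caller; the article's tabulated constants are not reproduced here.
    """
    # Assumption: quartiles from the "inclusive" interpolation method.
    q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")
    siqr_lower = med - q1  # lower semi-interquartile range
    siqr_upper = q3 - med  # upper semi-interquartile range
    return med - k_lower * siqr_lower, med + k_upper * siqr_upper


def flag_outliers(data, k_lower, k_upper):
    """Return the observations lying strictly beyond either fence."""
    lower, upper = siqr_fences(data, k_lower, k_upper)
    return [x for x in data if x < lower or x > upper]


def some_outside_rate(n, k_lower, k_upper, trials=2000, seed=0):
    """Monte Carlo estimate of the some-outside rate per sample.

    Counts the fraction of outlier-free normal samples of size n in which
    at least one observation is flagged. The standard normal suffices
    because the fences shift and scale with the data.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if flag_outliers(sample, k_lower, k_upper):
            hits += 1
    return hits / trials


data = [1, 2, 3, 4, 5, 6, 7, 8, 50]
print(siqr_fences(data, 4.0, 4.0))
print(flag_outliers(data, 4.0, 4.0))
print(some_outside_rate(20, 4.0, 4.0))
```

The value 4.0 is used only for illustration. When the two semi-interquartile ranges are equal, a constant of 4 reproduces the traditional fences at 1.5 IQR beyond the quartiles, since Q1 - 1.5 IQR = median - SIQR - 3 SIQR.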
Keywords: Boxplot; Fences; Interquartile range (IQR); Order statistics; Outlier identification; Semi-interquartile range (SIQR)