Nonparametric depth-based multivariate outlier identifiers, and masking robustness properties |
| |
Authors: | Xin Dang Robert Serfling |
| |
Affiliation: | aDepartment of Mathematics, University of Mississippi, Oxford, MS 38677-1848, USA;bDepartment of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75083-0688, USA |
| |
Abstract: | In extending univariate outlier detection methods to higher dimension, various issues arise: limited visualization methods, inadequacy of marginal methods, lack of a natural order, limited parametric modeling, and, when using Mahalanobis distance, restriction to ellipsoidal contours. To address and overcome such limitations, we introduce nonparametric multivariate outlier identifiers based on multivariate depth functions, which can generate contours following the shape of the data set. Also, we study masking robustness, that is, robustness against misidentification of outliers as nonoutliers. In particular, we define a masking breakdown point (MBP), adapting to our setting certain ideas of Davies and Gather [1993. The identification of multiple outliers (with discussion). Journal of the American Statistical Association 88, 782–801] and Becker and Gather [1999. The masking breakdown point of multivariate outlier identification rules. Journal of the American Statistical Association 94, 947–955] based on the Mahalanobis distance outlyingness. We then compare four affine invariant outlier detection procedures, based on Mahalanobis distance, halfspace or Tukey depth, projection depth, and “Mahalanobis spatial” depth. For the goal of threshold type outlier detection, it is found that the Mahalanobis distance and projection procedures are distinctly superior in performance, each with very high MBP, while the halfspace approach is quite inferior. When a moderate MBP suffices, the Mahalanobis spatial procedure is competitive in view of its contours not constrained to be elliptical and its computational burden relatively mild. A small sampling experiment yields findings completely in accordance with the theoretical comparisons. While these four depth procedures are relatively comparable for the purpose of robust affine equivariant location estimation, the halfspace depth is not competitive with the others for the quite different goal of robust setting of an outlyingness threshold. |
| |
Keywords: | Multivariate analysis Nonparametric Robust Outlier identification Depth functions |
本文献已被 ScienceDirect 等数据库收录! |
|