Improving the detection of unusual observations in high‐dimensional settings |
| |
Authors: | Insha Ullah, Matthew D.M. Pawley, Adam N.H. Smith, Beatrix Jones |
| |
Affiliation: | 1. School of Mathematical Sciences, Queensland University of Technology, Brisbane, QLD, Australia; 2. Institute of Natural and Mathematical Sciences, Massey University, Auckland, New Zealand |
| |
Abstract: | Multivariate control charts are used to monitor stochastic processes for changes and unusual observations. Hotelling's T² statistic is calculated for each new observation and an out‐of‐control signal is issued if it goes beyond the control limits. However, this classical approach becomes unreliable as the number of variables p approaches the number of observations n, and impossible when p exceeds n. In this paper, we devise an improvement to the monitoring procedure in high‐dimensional settings. We regularise the covariance matrix to estimate the baseline parameter and incorporate a leave‐one‐out re‐sampling approach to estimate the empirical distribution of future observations. An extensive simulation study demonstrates that the new method outperforms the classical Hotelling T² approach in power, and maintains appropriate false positive rates. We demonstrate the utility of the method using a set of quality control samples collected to monitor a gas chromatography–mass spectrometry apparatus over a period of 67 days. |
| |
Keywords: | high‐dimensional data; outlier detection; shrinkage estimator |
|
|
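
The procedure outlined in the abstract (a regularised covariance estimate combined with a leave‐one‐out estimate of the distribution of the T² statistic) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the shrinkage estimator, so the code assumes a simple linear shrinkage toward a scaled identity matrix with a fixed, hypothetical intensity `lam`, and it takes the control limit as an empirical quantile of the leave‐one‐out T² values. All function names and parameters are illustrative.

```python
import numpy as np


def shrunk_covariance(X, lam=0.5):
    """Linear shrinkage of the sample covariance toward a scaled identity.

    `lam` is a hypothetical fixed shrinkage intensity in (0, 1]; the paper's
    estimator may choose the target and intensity differently.
    """
    p = X.shape[1]
    S = np.cov(X, rowvar=False)              # p x p sample covariance
    target = (np.trace(S) / p) * np.eye(p)   # average variance on the diagonal
    return (1.0 - lam) * S + lam * target    # invertible even when p > n


def t2_statistic(x, mean, cov):
    """Hotelling-type squared distance of x from `mean` under `cov`."""
    d = x - mean
    return float(d @ np.linalg.solve(cov, d))


def loo_control_limit(X, lam=0.5, alpha=0.05):
    """Empirical (1 - alpha) quantile of leave-one-out T2 values.

    Each baseline observation is held out in turn, the mean and shrunk
    covariance are re-estimated from the rest, and the held-out point is
    scored as if it were a future observation.
    """
    n = X.shape[0]
    t2 = np.empty(n)
    for i in range(n):
        rest = np.delete(X, i, axis=0)
        t2[i] = t2_statistic(X[i], rest.mean(axis=0), shrunk_covariance(rest, lam))
    return float(np.quantile(t2, 1.0 - alpha)), t2


def is_out_of_control(x_new, X, limit, lam=0.5):
    """Flag a new observation whose T2 exceeds the empirical limit."""
    t2_new = t2_statistic(x_new, X.mean(axis=0), shrunk_covariance(X, lam))
    return t2_new > limit
```

With a baseline matrix `X` of shape (n, p), `loo_control_limit(X)` returns the limit and the leave‐one‐out scores, and `is_out_of_control(x_new, X, limit)` issues the signal for a new observation. Because the shrunk estimate is positive definite for any `lam > 0`, the sketch also runs when p exceeds n, the case in which the classical statistic cannot be computed.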