Identification of multiple high leverage points in logistic regression

Authors:	A.H.M. Rahmatullah Imon, Ali S. Hadi

Affiliation:	1. Department of Mathematical Sciences, Ball State University, Muncie, IN, USA; 2. Department of Mathematics, The American University in Cairo, Cairo, Egypt

Abstract:	Leverage values are used in regression diagnostics as measures of unusual observations in the X-space. Detecting high leverage points (HLPs) is crucial because they can mask outliers. In linear regression, HLPs are the points that lie far from the center (mean) of the data, so the most extreme points in the covariate space receive the highest leverage. But Hosmer and Lemeshow [Applied logistic regression, Wiley, New York, 1980] pointed out that in logistic regression the leverage measure contains a component that can make the leverage values of genuine HLPs misleadingly small, which hampers the correct identification of such cases. Attempts have been made to identify HLPs based on the median distances from the mean, but because these methods are designed to identify a single high leverage point, they may not be very effective in the presence of multiple HLPs owing to masking (false-negative) and swamping (false-positive) effects. In this paper we propose a new method for identifying multiple HLPs in logistic regression, in which the suspect cases are identified by a robust group deletion technique and then confirmed using diagnostic techniques. The usefulness of the proposed method is investigated through several well-known examples and a Monte Carlo simulation.

Keywords:	logistic regression; covariates; high leverage points; masking; swamping; group deletion; robust regression; deletion median distance from the median; Monte Carlo simulation
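
The abstract's remark that logistic-regression leverage can be misleadingly small for a genuine HLP can be illustrated with a short numerical sketch. It assumes the standard logistic hat matrix H = V^{1/2} X (X'VX)^{-1} X' V^{1/2} with V = diag(p_i(1 - p_i)). The data, the coefficient vector `beta`, and the function names are hypothetical choices made for illustration only; this is not the method proposed in the paper.

```python
import numpy as np


def linear_leverage(X):
    """Diagonal of the linear-regression hat matrix X (X'X)^{-1} X'."""
    XtX_inv_Xt = np.linalg.solve(X.T @ X, X.T)      # shape (p, n)
    return np.einsum("ij,ji->i", X, XtX_inv_Xt)


def logistic_leverage(X, beta):
    """Diagonal of V^{1/2} X (X'VX)^{-1} X' V^{1/2}, V = diag(p(1-p))."""
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))           # fitted probabilities
    v = p * (1.0 - p)                               # binomial variance weights
    XtVX_inv_Xt = np.linalg.solve(X.T @ (v[:, None] * X), X.T)
    return v * np.einsum("ij,ji->i", X, XtVX_inv_Xt)


# Hypothetical data: 20 ordinary points plus one extreme covariate value.
x = np.append(np.linspace(-2.0, 2.0, 20), 10.0)
X = np.column_stack([np.ones_like(x), x])           # intercept + one covariate
beta = np.array([0.0, 1.5])                         # assumed coefficients

h_lin = linear_leverage(X)
h_logit = logistic_leverage(X, beta)

print("linear leverage of extreme point:  ", h_lin[-1])
print("logistic leverage of extreme point:", h_logit[-1])
```

In this sketch the extreme point has the largest linear-regression leverage, but its fitted probability is so close to 1 that the weight p(1 - p) is nearly zero, and its logistic leverage collapses toward zero. This is the component of the leverage measure that the abstract describes as hiding genuine HLPs.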
