Dealing with big data: comparing dimension reduction and shrinkage regression methods

Authors:	Hamideh D Hamedani; Sara Sadat Moosavi

Institution:	Statistics Department, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran

Abstract:	In the past decades, the number of variables explaining observations in different practical applications has increased gradually. This has led to heavy computational tasks, despite the wide use of provisional variable selection methods in data processing. Therefore, more methodological techniques have appeared to reduce the number of explanatory variables without losing much of the information. In these techniques, two distinct approaches are apparent: ‘shrinkage regression’ and ‘sufficient dimension reduction’. Surprisingly, there has not been any communication or comparison between these two methodological categories, and it is not clear when each of these two approaches is appropriate. In this paper, we fill some of this gap by first reviewing each category in brief, paying special attention to the most commonly used methods in each category. We then compare commonly used methods from both categories based on their accuracy, computation time, and their ability to select effective variables. A simulation study on the performance of the methods in each category is conducted as well. The selected methods are concurrently tested on two sets of real data, which allows us to recommend conditions under which one approach is more appropriate for high-dimensional data.

Keywords:	Sufficient dimension reduction; central subspace; SPICE method; shrinkage regression; LASSO; Elastic-Net; FLASH; OSCAR; SCAD; Ridge regression
