Dealing with big data: comparing dimension reduction and shrinkage regression methods
Authors:Hamideh D Hamedani  Sara Sadat Moosavi
Institution:Statistics Department, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran
Abstract:Over the past decades, the number of variables explaining observations in different practical applications has increased gradually. This has led to heavy computational tasks, despite the wide use of provisional variable selection methods in data processing. Therefore, more methodological techniques have appeared to reduce the number of explanatory variables without losing much of the information. Among these techniques, two distinct approaches are apparent: ‘shrinkage regression’ and ‘sufficient dimension reduction’. Surprisingly, there has not been any communication or comparison between these two methodological categories, and it is not clear when each of the two approaches is appropriate. In this paper, we fill some of this gap by first briefly reviewing each category, paying special attention to its most commonly used methods. We then compare commonly used methods from both categories based on their accuracy, computation time, and ability to select effective variables. A simulation study on the performance of the methods in each category is conducted as well. The selected methods are tested concurrently on two real data sets, which allows us to recommend conditions under which one approach is more appropriate for high-dimensional data.
Keywords:Sufficient dimension reduction  central subspace  SPICE method  shrinkage regression  LASSO  Elastic-Net  FLASH  OSCAR  SCAD  Ridge regression