Semivarying coefficient least-squares support vector regression for analyzing high-dimensional gene-environmental data |
| |
Authors: | Jooyong Shim, Changha Hwang, Sunjoo Jeong |
| |
Institution: | 1. Institute of Statistical Information, Department of Statistics, Inje University, Kyungnam, South Korea; 2. Department of Applied Statistics, Dankook University, Gyeonggido, South Korea; 3. Department of Bioconvergent Science and Technology, Dankook University, Gyeonggido, South Korea |
| |
Abstract: | In genetics and genomic medicine, gene-environment (G×E) interactions have a great impact on the risk of human diseases. Some existing methods for identifying G×E interactions are limited because they analyze only one or a few G factors at a time, assume linear effects of E factors, and use inefficient selection methods. In this paper, we propose a new method to identify significant main effects and G×E interactions. It is based on a semivarying coefficient least-squares support vector regression (LS-SVR) technique, which is devised by utilizing a flexible semiparametric LS-SVR approach for censored survival data. The semivarying coefficient model is used to handle the nonlinear effects of E factors. We also derive a generalized cross validation (GCV) function for determining the optimal values of the hyperparameters of the proposed method. The same GCV function is used to identify significant main effects and G×E interactions. The proposed method is evaluated through numerical studies. |
| |
Keywords: | Generalized cross validation; gene-environment interaction; least-squares support vector regression; main effect; semiparametric regression; semivarying coefficient; survival data; variable selection; varying coefficient regression |
|
|