Similar literature
 20 similar documents found (search time: 0 ms)
1.
2.
Summary. The importance of variable selection in regression has grown in recent years as computing power has encouraged the modelling of data sets of ever-increasing size. Data mining applications in finance, marketing and bioinformatics are obvious examples. A limitation of nearly all existing variable selection methods is the need to specify the correct model before selection. When the number of predictors is large, model formulation and validation can be difficult or even infeasible. On the basis of the theory of sufficient dimension reduction, we propose a new class of model-free variable selection approaches. The methods proposed assume no model of any form, require no nonparametric smoothing and allow for general predictor effects. The efficacy of the methods proposed is demonstrated via simulation, and an empirical example is given.

3.
A new estimation method for the dimension of a regression at the outset of an analysis is proposed. A linear subspace spanned by projections of the regressor vector X, which contains part or all of the modelling information for the regression of a vector Y on X, and its dimension are estimated by means of parametric inverse regression. Smooth parametric curves are fitted to the p inverse regressions via a multivariate linear model. No restrictions are placed on the distribution of the regressors. The estimate of the dimension of the regression is based on optimal estimation procedures. A simulation study shows the method to be more powerful than sliced inverse regression in some situations.

4.
Sliced regression is an effective dimension reduction method that replaces the original high-dimensional predictors with an appropriate low-dimensional projection. It is free from any probabilistic assumption and can exhaustively estimate the central subspace. In this article, we propose to incorporate shrinkage estimation into sliced regression so that variable selection can be achieved simultaneously with dimension reduction. The new method can improve the estimation accuracy and achieve better interpretability for the reduced variables. The efficacy of the proposed method is shown through both simulation and real data analysis.

5.
In the area of sufficient dimension reduction, two structural conditions are often assumed: the linearity condition, which is close to assuming ellipticity of the underlying distribution of the predictors, and the constant variance condition, which is close to assuming multivariate normality of the predictors. Imposing these conditions is considered a necessary trade-off for overcoming the “curse of dimensionality”. However, it is very hard to check whether these conditions hold. When they are violated, methods such as marginal transformation and re-weighting are suggested so that the data fulfill them approximately. In this article, we assume an independence condition between the projected predictors and their orthogonal complements, which ensures that the commonly used inverse regression methods identify the central subspace of interest. The independence condition can be checked by the gridded chi-square test. Thus, we extend the scope of many inverse regression methods and broaden their applicability in the literature. Simulation studies and an application to the car price data are presented for illustration.
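
For readers unfamiliar with the two structural conditions named in this abstract, they are usually written as follows. The notation is an assumption for illustration and is not taken from the abstract: X is the p-dimensional predictor with covariance matrix Σ, and the columns of B span the central subspace S_{Y|X}. The last line is the standard result that makes sliced inverse regression work under the linearity condition.

```latex
% Assumed notation: X \in \mathbb{R}^p, \Sigma = \operatorname{Cov}(X),
% and the columns of B span the central subspace \mathcal{S}_{Y \mid X}.
\begin{align*}
\text{Linearity condition:} \quad
  & E[X \mid B^\top X] \text{ is a linear function of } B^\top X, \\
\text{Constant variance condition:} \quad
  & \operatorname{Var}(X \mid B^\top X) \text{ is a nonrandom matrix}, \\
\text{Consequence of linearity:} \quad
  & \Sigma^{-1}\bigl\{E[X \mid Y] - E[X]\bigr\} \in \mathcal{S}_{Y \mid X}.
\end{align*}
```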

6.
7.
We consider a regression analysis of multivariate response on a vector of predictors. In this article, we develop a sliced inverse regression-based method for reducing the dimension of predictors without requiring a prespecified parametric model. Our proposed method preserves as much regression information as possible. We derive the asymptotic weighted chi-squared test for dimension. Simulation results are reported and comparisons are made with three methods: most predictable variates, k-means inverse regression and canonical correlation approach.

8.
In this paper we consider a semiparametric regression model involving a d-dimensional quantitative explanatory variable X and including a dimension reduction of X via an index βX. In this model, the main goal is to estimate the Euclidean parameter β and to predict the real response variable Y conditionally on X. Our approach is based on the sliced inverse regression (SIR) method and optimal quantization in Lp-norm. We obtain the convergence of the proposed estimators of β and of the conditional distribution. Simulation studies show the good numerical behavior of the proposed estimators for finite sample sizes.

9.
L. Ferré, A. F. Yao. Statistics, 2013, 47(6): 475-488
Most of the usual multivariate methods have been extended to the context of functional data analysis. Our contribution concerns the study of sliced inverse regression (SIR) when the response variable is real but the regressor is a function. In the first part, we show how the relevant properties of SIR remain essentially the same in the functional context under suitable conditions. Unfortunately, the estimation procedure used in the multivariate case cannot be directly transposed to the functional one. Then, we propose a solution that overcomes this difficulty and we show the consistency of the estimates of the parameters of the model.

10.
In this paper, we extend the modified lasso of Wang et al. (2007) to the linear regression model with autoregressive moving average (ARMA) errors. Such an extension is far from trivial because new devices are needed to establish the asymptotics in the presence of the moving average component. A shrinkage procedure is proposed to simultaneously estimate the parameters and select the informative variables in the regression, autoregressive, and moving average components. We show that the resulting estimator is consistent in both parameter estimation and variable selection, and enjoys the oracle properties. To overcome the numerical complexity caused by the moving average component, we propose a procedure based on a least squares approximation to implement estimation. The ordinary least squares formulation combined with the modified lasso makes the computation very efficient. Simulation studies are conducted to evaluate the finite sample performance of the procedure. An empirical example of ground-level ozone is also provided.

11.
K-means inverse regression was developed as an easy-to-use dimension reduction procedure for multivariate regression. This approach is similar to the original sliced inverse regression method, with the exception that the slices are explicitly produced by a K-means clustering of the response vectors. In this article, we propose K-medoids clustering as an alternative clustering approach for slicing and compare its performance to K-means in a simulation study. Although the two methods often produce comparable results, K-medoids tends to yield better performance in the presence of outliers. In addition to isolation of outliers, K-medoids clustering also has the advantage of accommodating a broader range of dissimilarity measures, which could prove useful in other graphical regression applications where slicing is required.
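
The slicing idea described in this abstract can be sketched as below. This is a minimal illustration of the K-means variant only (not the K-medoids proposal), written under stated assumptions: the function names `kmeans_labels` and `kmeans_inverse_regression`, the plain Lloyd's iteration, and the defaults `k=8` and `n_iter=50` are all hypothetical choices, not the authors' implementation.

```python
import numpy as np


def kmeans_labels(Y, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm (illustrative); returns one cluster label per row of Y."""
    rng = np.random.default_rng(seed)
    # Initialise the centres with k distinct rows of Y (fancy indexing copies).
    centers = Y[rng.choice(len(Y), size=k, replace=False)]
    labels = np.zeros(len(Y), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every response vector to every centre.
        d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = Y[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels


def kmeans_inverse_regression(X, Y, k=8, n_dirs=1):
    """SIR-style directions where the slices are K-means clusters of the responses."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    sigma = Xc.T @ Xc / n  # sample covariance of the predictors
    labels = kmeans_labels(Y, k)
    # Weighted covariance of the within-cluster means of the centred predictors.
    M = np.zeros((p, p))
    for j in range(k):
        idx = labels == j
        if idx.any():
            m = Xc[idx].mean(axis=0)
            M += idx.mean() * np.outer(m, m)
    # Solve M b = lambda * sigma * b through a Cholesky whitening step.
    L = np.linalg.cholesky(sigma)
    A = np.linalg.solve(L, np.linalg.solve(L, M).T).T  # L^{-1} M L^{-T}
    evals, V = np.linalg.eigh((A + A.T) / 2)  # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:n_dirs]
    B = np.linalg.solve(L.T, V[:, top])  # map back to the original predictor scale
    return B / np.linalg.norm(B, axis=0)
```

Calling `kmeans_inverse_regression(X, Y)` with an (n, p) predictor array and an (n, q) response array returns a (p, n_dirs) array of unit-length direction estimates. Swapping `kmeans_labels` for a K-medoids routine would give the alternative the abstract proposes.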

12.
Sliced inverse regression (SIR) is an effective method for dimensionality reduction in high-dimensional regression problems. However, the method has requirements on the distribution of the predictors that are hard to check since they depend on unobserved variables. It has been shown that, if the distribution of the predictors is elliptical, then these requirements are satisfied. In the case of mixture models, ellipticity is violated and, in addition, there is no assurance of a single underlying regression model among the different components. Our approach clusters the predictor space to force the condition to hold on each cluster and includes a merging technique to look for different underlying models in the data. A study on simulated data as well as two real applications are provided. It appears that SIR, unsurprisingly, is not capable of dealing with a mixture of Gaussians involving different underlying models, whereas our approach is able to correctly investigate the mixture.

13.
In this article, a new efficient iteration procedure based on quantile regression is developed for single-index varying-coefficient models. The proposed estimation scheme is an extension of the full iteration procedure proposed by Carroll et al., and it differs from the method adopted by Wu et al. for single-index models in that the latter uses a double-weighted summation. This distinction is not only the reason that undersmoothing should be a necessary condition in our proposed procedure, but may also reduce the computational burden, especially for large sample sizes. The resulting estimators are shown to be robust to outliers as well as varying errors. Moreover, to achieve sparsity when there exist irrelevant variables in the index parameters, a variable selection procedure combined with an adaptive LASSO penalty is developed to simultaneously select and estimate significant parameters. Theoretical properties of the obtained estimators are established under some regularity conditions, and some simulation studies with variously distributed errors are conducted to assess the finite sample performance of our proposed method.

14.
Sliced Inverse Regression (SIR) is an effective method for dimension reduction in high-dimensional regression problems. The original method, however, requires the inversion of the predictor covariance matrix. In the case of collinearity between the predictors or of small sample sizes compared to the dimension, the inversion is not possible and a regularization technique has to be used. Our approach is based on a Fisher Lecture given by R.D. Cook where it is shown that SIR axes can be interpreted as solutions of an inverse regression problem. We propose to introduce a Gaussian prior distribution on the unknown parameters of the inverse regression problem in order to regularize their estimation. We show that some existing SIR regularizations can enter our framework, which permits a global understanding of these methods. Three new priors are proposed, leading to new regularizations of the SIR method. A comparison on simulated data as well as an application to the estimation of Mars surface physical properties from hyperspectral images are provided.
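
To make the role of the covariance inversion concrete, here is a minimal sketch of basic SIR for a scalar response. It is an illustration under stated assumptions, not the method of this abstract: the function name `sir_directions`, the equal-count slicing, and the optional `ridge` term are hypothetical, and the ridge term is a generic Tikhonov-style fix standing in for the Gaussian-prior regularizations the authors propose.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_dirs=1, ridge=0.0):
    """Basic sliced inverse regression for a scalar response (illustrative sketch).

    ridge > 0 adds ridge * I to the sample covariance so that it stays
    invertible under collinearity; this is a generic stand-in, not the
    prior-based regularization described in the abstract.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    sigma = Xc.T @ Xc / n + ridge * np.eye(p)
    # Slice the sample into groups of (nearly) equal size along the order of y.
    slices = np.array_split(np.argsort(y), n_slices)
    # Weighted covariance of the slice means of the centred predictors.
    M = np.zeros((p, p))
    for idx in slices:
        m = Xc[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # Solve M b = lambda * sigma * b; the Cholesky factorisation fails
    # when sigma is singular, which is exactly where regularization is needed.
    L = np.linalg.cholesky(sigma)
    A = np.linalg.solve(L, np.linalg.solve(L, M).T).T  # L^{-1} M L^{-T}
    evals, V = np.linalg.eigh((A + A.T) / 2)  # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:n_dirs]
    B = np.linalg.solve(L.T, V[:, top])  # map back to the original predictor scale
    return B / np.linalg.norm(B, axis=0), evals[top]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((2000, 5))
    beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
    y = (X @ beta) ** 3 + 0.1 * rng.standard_normal(2000)
    B, lam = sir_directions(X, y)
    # Absolute cosine between the estimated and the true direction.
    print(abs(B[:, 0] @ beta))
```

With `ridge=0.0` the sketch raises `numpy.linalg.LinAlgError` when the sample covariance is singular, which is the failure mode the abstract describes for collinear predictors or small samples.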

15.
It is shown that the sliced inverse regression procedure proposed by Li corresponds to the maximum likelihood estimate where the observations in each slice are samples of multivariate normal distributions with means in an affine manifold.

16.
Modeling of count responses is widely performed via Poisson regression models. This paper covers the problem of variable selection in Poisson regression analysis. The main emphasis of this paper is to present the usefulness of information complexity-based criteria for Poisson regression. A particle swarm optimization (PSO) algorithm was adopted to minimize the information criteria. A real dataset example and two simulation studies were conducted for highly collinear and weakly correlated datasets. Results demonstrate the capability of information complexity-type criteria. According to the results, information complexity-type criteria can be effectively used instead of classical criteria in count data modeling via the PSO algorithm.

17.
18.
In this article, we present a new efficient iterative estimation approach based on local modal regression for single-index varying-coefficient models. The resulting estimators are shown to be robust to outliers and to the error distribution. The asymptotic properties of the estimators are established under some regularity conditions, and a practical modified EM algorithm is proposed for the new method. Moreover, to achieve a sparse estimator when there exist irrelevant variables in the index parameters, a variable selection procedure based on the SCAD penalty is developed to select significant parametric covariates, and the well-known oracle properties are also derived. Finally, some numerical examples with variously distributed errors and a real data analysis are conducted to illustrate the validity and feasibility of our proposed method.

19.
To seek the nonlinear structure hidden in high-dimensional data points, a transformation related to the projection pursuit method and a projection index were proposed by Li (1989, 1990). In this paper, we present a consistent estimator of the supremum of the projection index based on the sliced inverse regression technique. This estimator also suggests a method to obtain approximately the most interesting projection in the general case.

20.
In this article, a new composite quantile regression estimation approach is proposed for estimating the parametric part of the single-index model (SIM). We use local linear composite quantile regression (CQR) for estimating the nonparametric part of the SIM when the error distribution is symmetrical. A weighted local linear CQR is proposed for estimating the nonparametric part of the SIM when the error distribution is asymmetrical. Moreover, a new variable selection procedure is proposed for the SIM. Under some regularity conditions, we establish the large sample properties of the proposed estimators. Simulation studies and a real data analysis are presented to illustrate the behavior of the proposed estimators.
