Found 20 similar articles (search time: 15 ms)
1.
Sliced inverse regression (SIR) is an effective method for dimensionality reduction in high-dimensional regression problems. However, the method has requirements on the distribution of the predictors that are hard to check since they depend on unobserved variables. It has been shown that, if the distribution of the predictors is elliptical, then these requirements are satisfied. In the case of mixture models, ellipticity is violated and, in addition, there is no assurance of a single underlying regression model among the different components. Our approach clusters the predictor space to force the condition to hold on each cluster and includes a merging technique to look for different underlying models in the data. A study on simulated data and two real applications are provided. It appears that SIR, unsurprisingly, is not capable of dealing with a mixture of Gaussians involving different underlying models, whereas our approach is able to correctly investigate the mixture.
2.
Haileab Hilafu 《Communications in Statistics - Simulation and Computation》2017,46(5):3516-3526
Sliced Inverse Regression (SIR; 1991) is a dimension reduction method for reducing the dimension of the predictors without losing regression information. The implementation of SIR requires inverting the covariance matrix of the predictors, which has hindered its use in analyzing high-dimensional data where the number of predictors exceeds the sample size. We propose random sliced inverse regression (rSIR) by applying SIR to many bootstrap samples, each using a subset of randomly selected candidate predictors. The final rSIR estimate is obtained by aggregating these estimates. A simple variable selection procedure is also proposed using these bootstrap estimates. The performance of the proposed estimates is studied via extensive simulation. Application to a dataset concerning myocardial perfusion diagnosis from cardiac Single Photon Emission Computed Tomography (SPECT) images is presented.
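To make the covariance-inversion step mentioned in this abstract concrete, the following is a minimal sketch of classical SIR only (not the rSIR procedure, whose aggregation details are not given here). The function name `sir_directions`, its parameters, and the equal-size slicing on the sorted response are illustrative assumptions.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Sketch of classical SIR; returns estimated directions as unit-norm columns.

    Assumes n > p so that the sample covariance of X is invertible.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)

    # Inverse square root of the covariance: this is the step that fails
    # when the number of predictors exceeds the sample size.
    evals, evecs = np.linalg.eigh(sigma)
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ sigma_inv_sqrt  # standardized predictors

    # Slice the observations into groups of roughly equal size by sorted response.
    slices = np.array_split(np.argsort(y), n_slices)

    # Weighted covariance of the slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original predictor scale.
    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    B = sigma_inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

Calling `sir_directions(X, y)` with an `(n, p)` predictor array and a length-`n` response returns a `(p, 1)` array. The sign of each column is arbitrary, so comparisons with a known direction should use the absolute cosine.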
3.
Most of the usual multivariate methods have been extended to the context of functional data analysis. Our contribution concerns the study of sliced inverse regression (SIR) when the response variable is real but the regressor is a function. In the first part, we show how the relevant properties of SIR remain essentially the same in the functional context under suitable conditions. Unfortunately, the estimation procedure used in the multivariate case cannot be directly transposed to the functional one. Then, we propose a solution that overcomes this difficulty and we show the consistency of the estimates of the parameters of the model.
4.
Romain Azaïs, Anne Gégout-Petit, Jérôme Saracco 《Journal of statistical planning and inference》2012,142(2):481-492
In this paper we consider a semiparametric regression model involving a d-dimensional quantitative explanatory variable X and including a dimension reduction of X via an index β′X. In this model, the main goal is to estimate the Euclidean parameter β and to predict the real response variable Y conditionally on X. Our approach is based on the sliced inverse regression (SIR) method and optimal quantization in Lp-norm. We obtain the convergence of the proposed estimators of β and of the conditional distribution. Simulation studies show the good numerical behavior of the proposed estimators for finite sample sizes.
5.
Sliced inverse regression (SIR) was developed to find effective linear dimension-reduction directions for exploring the intrinsic structure of high-dimensional data. In this study, we present isometric SIR for nonlinear dimension reduction, which is a hybrid of the SIR method and the geodesic distance approximation. First, the proposed method computes the isometric distance between data points; the resulting distance matrix is then sliced according to K-means clustering results, and the classical SIR algorithm is applied. We show that isometric SIR (ISOSIR) can reveal the geometric structure of a nonlinear manifold dataset (e.g., the Swiss roll). We report and discuss this novel method in comparison to several existing dimension-reduction techniques for data visualization and classification problems. The results show that ISOSIR is a promising nonlinear feature extractor for classification applications.
6.
It is shown that the sliced inverse regression procedure proposed by Li corresponds to the maximum likelihood estimate where the observations in each slice are samples of multivariate normal distributions with means in an affine manifold.
7.
8.
9.
Many sufficient dimension reduction methods for univariate regression have been extended to multivariate regression. Sliced average variance estimation (SAVE) has the potential to recover more reductive information, and recent developments enable us to test the dimension and predictor effects with distributions commonly used in the literature. In this paper, we aim to extend the functionality of SAVE to multivariate regression. Toward this goal, we propose three new methods. Numerical studies and real data analysis demonstrate that the proposed methods perform well.
10.
In this paper we address the problem of estimating a vector of regression parameters in the Weibull censored regression model. Our main objective is to provide natural adaptive estimators that significantly improve upon the classical procedures in the situation where some of the predictors may or may not be associated with the response. In the context of two competing Weibull censored regression models (full model and candidate submodel), we consider an adaptive shrinkage estimation strategy that shrinks the full model maximum likelihood estimate in the direction of the submodel maximum likelihood estimate. We develop the properties of these estimators using the notion of asymptotic distributional risk. The shrinkage estimators are shown to have higher efficiency than the classical estimators for a wide class of models. Further, we consider a LASSO-type estimation strategy and compare its relative performance with that of the shrinkage estimators. Monte Carlo simulations reveal that when the true model is close to the candidate submodel, the shrinkage strategy performs better than the LASSO strategy when, and only when, there are many inactive predictors in the model. Shrinkage and LASSO strategies are applied to a real data set from the Veterans' Administration (VA) lung cancer study to illustrate the usefulness of the procedures in practice.
11.
For a moderate or large number of regression coefficients, shrinkage estimates towards an overall mean are obtained by Bayes and empirical Bayes methods. For a special case, the Bayes and empirical Bayes shrinking weights are shown to be asymptotically equivalent as the amount of shrinkage goes to zero. Based on comparisons between Bayes and empirical Bayes solutions, a modification of the empirical Bayes shrinking weights designed to guard against unreasonable overshrinking is suggested. A numerical example is given.
12.
Howard D. Bondell, Lexin Li 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(1):287-299
Summary. The family of inverse regression estimators that was recently proposed by Cook and Ni has proven effective in dimension reduction by transforming the high-dimensional predictor vector to its low-dimensional projections. We propose a general shrinkage estimation strategy for the entire inverse regression estimation family that is capable of simultaneous dimension reduction and variable selection. We demonstrate that the new estimators achieve consistency in variable selection without requiring any traditional model, while retaining the root-n estimation consistency of the dimension reduction basis. We also show the effectiveness of the new estimators through both simulation and real data analysis.
13.
A generalised regression estimation procedure is proposed that can lead to much improved estimation of population characteristics, such as quantiles, variances and coefficients of variation. The method involves conditioning on the discrepancy between an estimate of an auxiliary parameter and its known population value. The key distributional assumption is joint asymptotic normality of the estimates of the target and auxiliary parameters. This assumption implies that the relationship between the estimated target and the estimated auxiliary parameters is approximately linear with coefficients determined by their asymptotic covariance matrix. The main contribution of this paper is the use of the bootstrap to estimate these coefficients, which avoids the need for parametric distributional assumptions. First‐order correct conditional confidence intervals based on asymptotic normality can be improved upon using quantiles of a conditional double bootstrap approximation to the distribution of the studentised target parameter estimate.
14.
To seek the nonlinear structure hidden in high-dimensional data points, a transformation related to the projection pursuit method and a projection index were proposed by Li (1989, 1990). In this paper, we present a consistent estimator of the supremum of the projection index based on the sliced inverse regression technique. This estimator also suggests a method to obtain approximately the most interesting projection in the general case.
15.
《Journal of Statistical Computation and Simulation》2012,82(16):3335-3351
In this paper, we consider the shrinkage and penalty estimation procedures in the linear regression model with autoregressive errors of order p when it is conjectured that some of the regression parameters are inactive. We develop the statistical properties of the shrinkage estimation method including asymptotic distributional biases and risks. We show that the shrinkage estimators have a significantly higher relative efficiency than the classical estimator. Furthermore, we consider the two penalty estimators: least absolute shrinkage and selection operator (LASSO) and adaptive LASSO estimators, and numerically compare their relative performance with that of the shrinkage estimators. A Monte Carlo simulation experiment is conducted for different combinations of inactive predictors and the performance of each estimator is evaluated in terms of the simulated mean-squared error. This study shows that the shrinkage estimators are comparable to the penalty estimators when the number of inactive predictors in the model is relatively large. The shrinkage and penalty methods are applied to a real data set to illustrate the usefulness of the procedures in practice.
16.
Efstathia Bura & R. Dennis Cook 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(2):393-410
A new estimation method for the dimension of a regression at the outset of an analysis is proposed. A linear subspace spanned by projections of the regressor vector X, which contains part or all of the modelling information for the regression of a vector Y on X, and its dimension are estimated by means of parametric inverse regression. Smooth parametric curves are fitted to the p inverse regressions via a multivariate linear model. No restrictions are placed on the distribution of the regressors. The estimate of the dimension of the regression is based on optimal estimation procedures. A simulation study shows the method to be more powerful than sliced inverse regression in some situations.
17.
Heavy-tailed probability distributions are important in many scientific disciplines such as hydrology, geology, and physics and therefore feature heavily in statistical practice. Rather than specifying a family of heavy-tailed distributions for a given application, it is more common to use a nonparametric approach, where the distributions are classified according to the tail behavior. Through the use of the logarithm of Parzen's density-quantile function, this work proposes a consistent, flexible estimator of the tail exponent. The approach we develop is based on a Fourier series estimator and allows for separate estimates of the left and right tail exponents. The theoretical properties of the tail exponent estimator are determined, and we also provide some results of independent interest that may be used to establish weak convergence of stochastic processes. We assess the practical performance of the method by exploring its finite sample properties in simulation studies. The overall performance is competitive with classical tail index estimators and, in contrast with these, our method obtains somewhat better results in the case of lighter heavy-tailed distributions.
18.
Ling-Yau Chan 《Journal of applied statistics》2010,37(3):425-433
On the basis of a negative binomial sampling scheme, we consider a uniformly most accurate upper confidence limit for a small but unknown proportion, such as the proportion of defectives in a manufacturing process. The optimal stopping rule, with reference to the twin criteria of the expected length of the confidence interval and the expected sample size, is investigated. The proposed confidence interval has also been compared with several others that have received attention in the recent literature.
19.
K-means inverse regression was developed as an easy-to-use dimension reduction procedure for multivariate regression. This approach is similar to the original sliced inverse regression method, with the exception that the slices are explicitly produced by a K-means clustering of the response vectors. In this article, we propose K-medoids clustering as an alternative clustering approach for slicing and compare its performance to K-means in a simulation study. Although the two methods often produce comparable results, K-medoids tends to yield better performance in the presence of outliers. In addition to isolation of outliers, K-medoids clustering also has the advantage of accommodating a broader range of dissimilarity measures, which could prove useful in other graphical regression applications where slicing is required.
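As a rough illustration of the slicing idea described in this abstract, the sketch below forms slices by clustering multivariate response vectors with a plain Lloyd-style K-means and then applies the usual SIR eigen-step. It is not the authors' implementation: the helper names `kmeans_labels` and `kmeans_inverse_regression`, the random initialization, and the fixed iteration count are all assumptions. A K-medoids variant would replace only the clustering helper.

```python
import numpy as np


def kmeans_labels(Y, k, n_iter=50, seed=0):
    """Hypothetical helper: basic Lloyd iterations, returns a cluster label per row of Y."""
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=k, replace=False)].astype(float)
    labels = np.zeros(len(Y), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every response vector to every center.
        dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave the center of an empty cluster unchanged
                centers[j] = Y[labels == j].mean(axis=0)
    return labels


def kmeans_inverse_regression(X, Y, k=8, n_directions=1, seed=0):
    """Sketch: SIR with slices defined by K-means clusters of the responses Y."""
    n, p = X.shape
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T  # assumes n > p
    Z = (X - X.mean(axis=0)) @ sigma_inv_sqrt

    labels = kmeans_labels(Y, k, seed=seed)
    M = np.zeros((p, p))
    for j in range(k):
        members = labels == j
        if members.any():
            m = Z[members].mean(axis=0)
            M += members.mean() * np.outer(m, m)  # weight = cluster proportion

    _, vecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    B = sigma_inv_sqrt @ vecs[:, ::-1][:, :n_directions]
    return B / np.linalg.norm(B, axis=0)
```

Here `X` is an `(n, p)` predictor array and `Y` an `(n, q)` array of response vectors; the returned columns are unit-norm direction estimates whose signs are arbitrary.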
20.
We consider a regression analysis of a multivariate response on a vector of predictors. In this article, we develop a sliced inverse regression-based method for reducing the dimension of predictors without requiring a prespecified parametric model. Our proposed method preserves as much regression information as possible. We derive the asymptotic weighted chi-squared test for dimension. Simulation results are reported and comparisons are made with three methods: most predictable variates, k-means inverse regression, and the canonical correlation approach.