Similar Articles
20 similar articles found (search time: 984 ms)
1.
One important goal in multi-state modelling is to explore information about conditional transition-type-specific hazard rate functions by estimating the effects of explanatory variables. This may be performed using single transition-type-specific models if these covariate effects are assumed to differ across transition types. To investigate whether this assumption holds, or whether an effect is equal across several transition types (a cross-transition-type effect), a combined model has to be applied, for instance via a stratified partial likelihood formulation. Prior knowledge about the underlying covariate effect mechanisms is often sparse, especially about the ineffectiveness of transition-type-specific or cross-transition-type effects. Data-driven variable selection is therefore an important task: a large number of estimable effects has to be taken into account if all transition types are modelled jointly. A related, subsequent task is model choice: is an effect satisfactorily estimated under a linearity assumption, or does its true underlying form deviate strongly from linearity? This article introduces component-wise functional gradient descent boosting (boosting, for short) for multi-state models, an approach that performs unsupervised variable selection and model choice simultaneously within a single estimation run. We demonstrate that the features and advantages of boosting, as introduced and illustrated in classical regression settings, carry over to multi-state models. Boosting thus provides an effective means to answer questions about the ineffectiveness and non-linearity of single transition-type-specific or cross-transition-type effects.

2.
Many popular nonlinear time series models require an a priori choice of parametric functions assumed to be appropriate for the specific application. This approach is mainly used in financial applications, where sufficient knowledge is available about the nonlinear structure between the covariates and the response. One principal strategy for investigating a broader class of nonlinear time series is the Nonlinear Additive AutoRegressive (NAAR) model. The NAAR model estimates the lags of a time series as flexible functions in order to detect non-monotone relationships between current and past observations. We consider linear and additive models for identifying nonlinear relationships. A componentwise boosting algorithm is applied for simultaneous model fitting, variable selection, and model choice. Thus, by applying boosting to fit potentially nonlinear models, we address the major issues in time series modelling: lag selection and nonlinearity. By means of simulation we compare boosting to alternative nonparametric methods. Boosting shows a strong overall performance in terms of precise estimation of highly nonlinear lag functions. The forecasting potential of boosting is examined on German industrial production (IP); to improve the model's forecasting quality we include additional exogenous variables. This addresses the second major aspect of this paper, the issue of high dimensionality: allowing additional inputs extends the NAAR model to a broader class of models, namely the NAARX model. We show that boosting can cope with large models that have many covariates relative to the number of observations.
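Componentwise boosting of the kind used for lag selection can be sketched in a few lines: at each iteration every candidate lag is fitted to the current residuals by least squares, and only the best-fitting lag receives a small update. The toy example below (all data and parameter values are hypothetical, and a linear base learner stands in for the paper's flexible functions) recovers the two active lags of a simulated AR(2) series:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(2)-type series: only lags 1 and 2 matter.
n = 500
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] - 0.3 * y[t - 2] + rng.normal(scale=0.5)

p = 6  # candidate lags 1..p
X = np.column_stack([y[p - k:n - k] for k in range(1, p + 1)])
r = y[p:].copy() - y[p:].mean()     # current residuals (centred)
X = X - X.mean(axis=0)

coef = np.zeros(p)
nu = 0.1                            # step length (shrinkage)
for _ in range(200):                # boosting iterations
    # fit each candidate lag to the residuals by simple least squares
    b = X.T @ r / (X ** 2).sum(axis=0)
    rss = ((r[:, None] - X * b) ** 2).sum(axis=0)
    j = rss.argmin()                # best-fitting component
    coef[j] += nu * b[j]
    r = r - nu * b[j] * X[:, j]

# lags with non-negligible fitted coefficients
selected = np.flatnonzero(np.abs(coef) > 0.05) + 1
print("selected lags:", selected, "coefficients:", coef.round(2))
```

With a small step length, the coefficient path stops well short of the full least-squares fit for weak components, which is what makes the variable selection implicit.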

3.
In the calibration of a near-infrared (NIR) instrument, chemical compositions of interest are regressed as a function of their NIR spectra. This process presents two immediate challenges: first, the number of variables exceeds the number of observations and, second, the multicollinearity between variables is extremely high. To deal with these challenges, prediction models that produce sparse solutions have recently been proposed. The term 'sparse' means that some model parameters are estimated as exactly zero while the others are estimated away from zero; in effect, variable selection is embedded in the model, potentially yielding better prediction. Many studies have investigated sparse solutions for latent variable models, such as partial least squares and principal component regression, and for direct regression models such as ridge regression (RR); for the latter, this mainly involves adding an L1-norm penalty to the objective function, as in lasso regression. In this study, we investigate new sparse alternatives to RR within a random effects model framework, where we consider Cauchy and mixture-of-normals distributions on the random effects. The results indicate that the mixture-of-normals model produces a sparse solution with good prediction and better interpretation. We illustrate the methods using NIR spectra datasets from milk and corn specimens.
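As a rough illustration of the "more variables than observations" setting described above, ridge regression with p > n can be computed through its dual (kernel) form, which only requires inverting an n x n matrix. The simulated "spectra" below are purely hypothetical stand-ins for NIR data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "spectra": n specimens, p channels with p >> n and strongly
# correlated neighbouring channels (a low-rank factor plus noise).
n, p = 40, 500
base = rng.normal(size=(n, 20)) @ rng.normal(size=(20, p))
X = base + 0.01 * rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + 0.1 * rng.normal(size=n)

lam = 1.0
# Primal ridge needs a p x p inverse; the dual form, which gives the
# identical estimator, only inverts an n x n matrix:
alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
beta_ridge = X.T @ alpha

pred = X @ beta_ridge
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print("training R^2:", round(r2, 3))
```

The identity beta = X^T (X X^T + lam I)^{-1} y = (X^T X + lam I)^{-1} X^T y holds for any lam > 0, so the choice between primal and dual is purely a matter of which Gram matrix is smaller.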

4.
We propose Twin Boosting, which has much better feature selection behavior than boosting, particularly with respect to reducing the number of false positives (falsely selected features). In addition, in settings with a few important effective features and many noise features, Twin Boosting also substantially improves the predictive accuracy of boosting. Twin Boosting is as general and generic as (gradient-based) boosting: it can be used with general weak learners and in a wide variety of situations, including generalized regression, classification, and survival modeling. Furthermore, it is computationally feasible for large problems with potentially many more features than observed samples. Finally, for the special case of orthonormal linear models, we prove the equivalence of Twin Boosting to the adaptive Lasso, which provides some theoretical insight into feature selection with Twin Boosting.
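The orthonormal-design setting in which the adaptive Lasso equivalence is proved can be made concrete: with X^T X = I, the Lasso soft-thresholds the OLS estimates with one global threshold, while the adaptive Lasso uses coefficient-specific thresholds that penalize large estimates less. A minimal sketch (the numbers are illustrative, not from the paper):

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# OLS estimates under an orthonormal design (X^T X = I): two strong
# signals and three near-noise coefficients (illustrative values).
z = np.array([2.0, -1.5, 0.1, -0.05, 0.2])

lam, gamma = 0.3, 1.0
lasso = soft(z, lam)                              # one global threshold
# Adaptive Lasso: thresholds lam / |z_j|^gamma, so large OLS estimates
# are shrunk less and small ones are shrunk more aggressively.
adaptive = soft(z, lam / np.abs(z) ** gamma)

print("lasso:   ", lasso)
print("adaptive:", adaptive)
```

Both estimators zero out the three weak coefficients here, but the adaptive version shrinks the two strong ones less (1.85 and -1.3 versus 1.7 and -1.2), which is the behaviour that reduces bias on true signals.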

5.
The varying coefficient model (VCM) is an important generalization of the linear regression model, and many existing estimation procedures for the VCM are built on the L2 loss, which is popular for its mathematical elegance but is not robust to non-normal errors and outliers. In this paper, we address both the robustness and the efficiency of estimation and variable selection by basing the procedure on a convex combination of the L1 and L2 losses rather than the quadratic loss alone. Using local linear modeling, the asymptotic normality of the estimator is derived, and a practical method is proposed for selecting the weight of the composite L1 and L2 losses. A variable selection procedure is then obtained by combining local kernel smoothing with the adaptive group LASSO. With tuning parameters appropriately selected by the Bayesian information criterion (BIC), the theoretical properties of the new procedure, including consistency in variable selection and the oracle property in estimation, are established. The finite-sample performance of the new method is investigated through simulation studies and an analysis of body fat data. Numerical studies show that the new method performs at least as well as the least-squares-based method in terms of both robustness and efficiency for variable selection.

6.
L2Boosting is an effective method for model construction. For the high-dimensional setting, Bühlmann and Yu (2003) proposed componentwise L2Boosting, but componentwise L2Boosting can only fit a special, limited class of models. In this paper, by combining boosting with a sufficient dimension reduction method, namely sliced inverse regression (SIR), we propose a new regression method called dimension reduction boosting (DRBoosting). Compared with L2Boosting, DRBoosting is computationally less intensive and predicts better, especially for high-dimensional data. Simulations confirm the advantages of the new method.
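Sliced inverse regression, the dimension reduction ingredient above, can be sketched as follows: standardize the predictors, slice the observations by the response, and extract the leading eigenvectors of the weighted covariance of slice means. This is a generic textbook sketch on simulated single-index data, not the authors' DRBoosting code:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy single-index model: y depends on X only through b^T x.
n, p = 1000, 8
X = rng.normal(size=(n, p))
b = np.zeros(p)
b[0] = b[1] = 1.0
y = (X @ b) ** 3 + rng.normal(scale=0.5, size=n)

H = 10                                       # number of slices
L = np.linalg.cholesky(np.cov(X.T))          # Sigma = L L^T
Z = (X - X.mean(axis=0)) @ np.linalg.inv(L.T)  # whitened predictors
slices = np.array_split(np.argsort(y), H)    # slice by ordered response
M = np.zeros((p, p))
for idx in slices:
    m = Z[idx].mean(axis=0)
    M += len(idx) / n * np.outer(m, m)       # weighted cov of slice means
vals, vecs = np.linalg.eigh(M)
direction = np.linalg.inv(L.T) @ vecs[:, -1] # back to the original scale
direction /= np.linalg.norm(direction)

# The estimate should align with b up to sign.
cos = abs(direction @ b) / np.linalg.norm(b)
print("cosine with true direction:", round(cos, 3))
```

Because the link (a cubic) is monotone and odd, the inverse regression curve E[x | y] stays within the span of b, which is exactly the condition SIR exploits.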

7.
When employing model selection methods with oracle properties, such as the smoothly clipped absolute deviation (SCAD) and the adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example with m = 10. In problems where the true regression function is sparse and the signals are large, such cross-validation typically works well. However, in regression modeling of genomic studies involving single nucleotide polymorphisms (SNPs), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems the number of variables selected by SCAD and the adaptive Lasso with 10-fold cross-validation is a random variable with considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, not just SCAD and the adaptive Lasso.
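This kind of variability is easy to reproduce in a small simulation: a Lasso solved by plain proximal gradient descent (ISTA), with lambda chosen by 10-fold cross-validation, can select a different number of variables depending only on how the folds are shuffled. Everything below (data, solver, grid) is a hypothetical sketch, not the authors' setup:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sparse truth with deliberately *weak* signals, SNP-style.
n, p, s = 100, 50, 5
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:s] = 0.3
y = X @ beta + rng.normal(size=n)

def lasso_ista(X, y, lam, iters=300):
    """Proximal-gradient (ISTA) solver for 0.5||y-Xb||^2 + lam||b||_1."""
    L = np.linalg.eigvalsh(X.T @ X).max()     # gradient Lipschitz constant
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        g = b + X.T @ (y - X @ b) / L
        b = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    return b

lams = np.geomspace(0.5, 50, 10)

def n_selected_by_cv(seed):
    """One run of 10-fold CV over lambda; returns the selected model size."""
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, 10)
    err = np.zeros(len(lams))
    for k, lam in enumerate(lams):
        for f in folds:
            train = np.setdiff1d(idx, f)
            b = lasso_ista(X[train], y[train], lam)
            err[k] += ((y[f] - X[f] @ b) ** 2).sum()
    lam_best = lams[err.argmin()]
    return int((np.abs(lasso_ista(X, y, lam_best)) > 1e-8).sum())

counts = [n_selected_by_cv(seed) for seed in range(6)]
print("model sizes over 6 CV runs (same data, different folds):", counts)
```

Only the fold assignment changes between runs; any spread in the printed model sizes is therefore attributable to cross-validation randomness alone.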

8.
L1-type regularization provides a useful tool for variable selection in high-dimensional regression modeling, and various algorithms have been proposed to solve the corresponding optimization problems. The coordinate descent algorithm, in particular, has been shown to be effective in sparse regression modeling. Although the algorithm performs remarkably well on L1-type regularization problems, it suffers from outliers, since the procedure is based on inner products of predictor variables and partial residuals obtained in a non-robust manner. To overcome this drawback, we propose a robust coordinate descent algorithm, focusing on high-dimensional regression modeling based on the principal components space. We show that the proposed robust algorithm converges to the minimum value of its objective function. Monte Carlo experiments and a real data analysis are conducted to examine the efficiency of the proposed algorithm. We observe that our robust coordinate descent algorithm performs effectively in high-dimensional regression modeling even in the presence of outliers.
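The non-robust inner product referred to above sits at the heart of the standard coordinate descent update, sketched below for the Lasso (a generic illustration on simulated data, not the paper's robust algorithm):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 80, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = (2.0, -1.5, 1.0)
y = X @ beta + rng.normal(size=n)

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, sweeps=100):
    """Cyclic coordinate descent for 0.5||y - Xb||^2 + lam||b||_1.
    Each update soft-thresholds the inner product of a predictor with
    the partial residual -- the quantity a robust variant would replace
    with a robust analogue."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.copy()                      # full residual
    norms = (X ** 2).sum(axis=0)
    for _ in range(sweeps):
        for j in range(p):
            r += X[:, j] * b[j]       # partial residual: add j back in
            b[j] = soft(X[:, j] @ r, lam) / norms[j]
            r -= X[:, j] * b[j]       # remove the updated contribution
    return b

b_hat = lasso_cd(X, y, lam=12.0)
print("nonzero coefficients at indices:", np.flatnonzero(b_hat))
```

A single gross outlier in y inflates every inner product X[:, j] @ r at once, which is why the update is fragile and why the paper replaces it.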

9.
There is currently much discussion about lasso-type regularized regression, a useful tool for simultaneous estimation and variable selection. Although lasso-type regularization has several advantages in regression modelling owing to its sparsity, it suffers from outliers because it relies on penalized least-squares methods. To overcome this issue, we propose a robust lasso-type estimation procedure that uses a robust criterion as the loss function while imposing an L1-type penalty, the elastic net. We also introduce efficient bootstrap information criteria for choosing the optimal regularization parameters and a constant used in outlier detection. Simulation studies and a real data analysis are given to examine the efficiency of the proposed robust sparse regression modelling. We observe that our modelling strategy performs well in the presence of outliers.

10.
We adopt boosting for classification and selection of high-dimensional binary variables, for which classical methods based on normality and a nonsingular sample dispersion are inapplicable. Boosting seems particularly well suited for binary variables. We present three methods, two of which combine boosting with the relatively classical variable selection methods developed in Wilbur et al. (2002). Our primary interest is variable selection in classification, with a small misclassification error serving as validation of the proposed variable selection method. Two of the new methods perform uniformly better than Wilbur et al. (2002) in one set of simulated and three real-life examples.

11.
Important progress has been made with model averaging methods over the past decades. For spatial data, however, the idea of model averaging has not been well developed. This article studies model averaging methods for the spatial geostatistical linear model. A spatial Mallows criterion is developed to choose the weights of the model averaging estimator. The resulting estimator achieves asymptotic optimality in terms of L2 loss. Simulation experiments reveal that the proposed estimator is superior to the model averaging estimator based on the Mallows criterion developed for ordinary linear models (Hansen, 2007) and to the model selection estimator using the corrected Akaike information criterion developed for geostatistical linear models (Hoeting et al., 2006). The Canadian Journal of Statistics 47: 336-351; 2019 © 2019 Statistical Society of Canada

12.
Boosting is a new, powerful method for classification. It is an iterative procedure which successively classifies a weighted version of the sample and then reweights the sample depending on how successful the classification was. In this paper we review some of the commonly used boosting methods and show how each iteration of the algorithm can be fit into a Bayesian setup. We demonstrate how this formulation gives rise to a new splitting criterion when using a domain-partitioning classifier such as a decision tree. Further, we can improve the predictive performance of simple decision trees, known as stumps, by classifying at each step of the algorithm with a posterior-weighted average of stumps rather than a single stump. The main advantage of this approach is to reduce the number of boosting iterations required to produce a good classifier, with only a minimal increase in the computational complexity of the algorithm.

13.
Qingguo Tang, Statistics, 2013, 47(5): 389-404
The varying coefficient model is a useful extension of linear models and has many advantages in practical use. To estimate the unknown functions in the model, kernel-type local linear least-squares (L2) estimation methods have been proposed by several authors. When the data contain outliers or come from a population with a heavy-tailed distribution, L1-estimation should yield better estimators. In this article, we present a local linear L1-estimation method and derive the asymptotic distributions of the L1-estimators. The simulation results for two examples, with outliers and a heavy-tailed distribution respectively, show that the L1-estimators outperform the L2-estimators.

14.
Boosting is one of the most important methods for fitting regression models and building prediction rules. A notable feature of boosting is that the technique can be modified such that it includes a built-in mechanism for shrinking coefficient estimates and variable selection. This regularization mechanism makes boosting a suitable method for analyzing data characterized by small sample sizes and large numbers of predictors. We extend the existing methodology by developing a boosting method for prediction functions with multiple components. Such multidimensional functions occur in many types of statistical models, for example in count data models and in models involving outcome variables with a mixture distribution. As will be demonstrated, the new algorithm is suitable for both the estimation of the prediction function and regularization of the estimates. In addition, nuisance parameters can be estimated simultaneously with the prediction function.

15.
When the data contain outliers or come from a population with a heavy-tailed distribution, as is often the case for spatiotemporal data, estimation methods based on least squares (L2) do not perform well and more robust estimation methods are required. In this article, we propose local linear estimation for spatiotemporal models based on least absolute deviation (L1) and derive the asymptotic distributions of the L1-estimators under mild conditions on the spatiotemporal process. The simulation results for two examples, with outliers and a heavy-tailed distribution respectively, show that the L1-estimators perform better than the L2-estimators.
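The robustness gain from replacing L2 by L1 can be illustrated with local constant (kernel-weighted) fits: under one-sided outliers, the kernel-weighted mean is pulled away from the true curve while the kernel-weighted median is barely affected. This is a deliberately simplified sketch (local constant rather than local linear, simulated non-spatial data):

```python
import numpy as np

rng = np.random.default_rng(5)

# m(x) = sin(2*pi*x) observed with a few gross positive outliers.
n = 300
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=n)
out = rng.choice(n, 15, replace=False)
y[out] += 8.0                              # heavy one-sided contamination

def weighted_median(v, w):
    o = np.argsort(v)
    cw = np.cumsum(w[o])
    return v[o][np.searchsorted(cw, 0.5 * cw[-1])]

def local_fit(x0, h=0.1):
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)  # Gaussian kernel weights
    l2 = (w * y).sum() / w.sum()            # local L2: weighted mean
    l1 = weighted_median(y, w)              # local L1: weighted median
    return l2, l1

x0 = 0.25
l2, l1 = local_fit(x0)
truth = np.sin(2 * np.pi * x0)
print(f"truth {truth:.2f}  L2 fit {l2:.2f}  L1 fit {l1:.2f}")
```

Because the weighted mean averages in the contaminated points, it is biased upward by roughly the outlier magnitude times the local contamination fraction, whereas the weighted median ignores them almost entirely.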

16.
In this paper we propose a general cure rate aging model. Our approach allows for different underlying activation mechanisms leading to the event of interest. The number of competing causes of the event of interest is assumed to follow a logarithmic distribution. The model is parameterized in terms of the cured fraction, which is then linked to covariates. We explore the use of Markov chain Monte Carlo methods to develop a Bayesian analysis for the proposed model. We also discuss model selection criteria for comparing fitted models, and develop case-deletion influence diagnostics for the joint posterior distribution based on the ψ-divergence, which includes several divergence measures as particular cases, such as the Kullback-Leibler (K-L), J-distance, L1-norm, and χ2 divergence measures. Simulation studies are performed and the methodology is illustrated on real malignant melanoma data.

17.
Gradient boosting (GB) was introduced as a powerful approach to both classification and regression problems. Boosting with the L2 loss has been studied intensively, both in theory and in practice. However, the L2 loss is not appropriate for learning distributional functionals beyond the conditional mean, such as conditional quantiles. There is a large literature on conditional quantile prediction using various methods, including machine learning techniques such as random forests and boosting. Simulation studies reveal that the weakness of random forests lies in predicting centre quantiles, while that of GB lies in predicting the extremes. Is there an algorithm that enjoys the advantages of both random forests and boosting, so that it performs well across all quantiles? In this article, we propose such a boosting algorithm, called random GB, which embraces the merits of both random forests and GB. Empirical results are presented to support the superiority of this algorithm in predicting conditional quantiles.
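Quantile-targeted boosting rests on swapping the L2 loss for the check (pinball) loss, whose minimizer over a constant is exactly the tau-th quantile; a GB step then uses its negative gradient, tau - 1{u < 0}. A quick numerical check of this minimizer property (illustrative only, not the random GB algorithm):

```python
import numpy as np

rng = np.random.default_rng(6)
y = rng.exponential(size=2000)     # a skewed sample

def pinball(u, tau):
    """Check (pinball) loss for the tau-th quantile; u = y - prediction."""
    return np.where(u >= 0, tau * u, (tau - 1) * u)

tau = 0.9
grid = np.linspace(0, 6, 2001)
risk = np.array([pinball(y - c, tau).mean() for c in grid])
c_star = grid[risk.argmin()]

print("pinball-risk minimizer:", round(c_star, 2),
      " empirical 0.9-quantile:", round(np.quantile(y, tau), 2))
```

The two printed numbers agree up to the grid resolution, confirming that minimizing mean pinball loss is equivalent to quantile estimation, which is what lets a boosting algorithm target any quantile simply by changing the loss.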

18.
In this article, we study the stepwise AIC method for variable selection and compare it with other stepwise methods, such as partial F, partial correlation, and semi-partial correlation, in linear regression modeling. We show mathematically that the stepwise AIC method and the other stepwise methods lead to the same method as partial F. Hence, there are more reasons to use the stepwise AIC method than the other stepwise methods, since the stepwise AIC method is a model selection method that can be easily managed and can be widely extended to more generalized models and applied to non-normally distributed data. We also treat problems that frequently arise in applications, namely the validation of selected variables and the problem of collinearity.
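Forward stepwise selection by AIC, the kind of procedure compared above, can be sketched as follows: at each step add the variable giving the lowest AIC, and stop when no addition improves it. The data are simulated and the implementation is a generic illustration, not the article's derivation:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 200, 8
X = rng.normal(size=(n, p))
y = 1.5 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=n)

def aic(active):
    """AIC of the OLS fit on an index set (Gaussian likelihood)."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in active])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    rss = (resid ** 2).sum()
    k = Z.shape[1] + 1               # coefficients + error variance
    return n * np.log(rss / n) + 2 * k

active, current = [], aic([])
while True:
    candidates = [(aic(active + [j]), j) for j in range(p) if j not in active]
    best, j = min(candidates)
    if best >= current:              # no candidate lowers the AIC: stop
        break
    active, current = active + [j], best

print("variables entered (in order):", active)
```

Because each step adds the variable with the largest drop in residual sum of squares, this forward AIC search enters the strongest predictor (index 3 here) first, mirroring the partial-F ordering discussed in the article.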

19.
We propose a robust regression method called regression with outlier shrinkage (ROS) for the traditional n > p case. It improves on other robust regression methods, such as least trimmed squares (LTS), in that it achieves the maximum breakdown value and full asymptotic efficiency simultaneously. Moreover, its computational complexity is no greater than that of LTS. We also propose a sparse estimator, called sparse regression with outlier shrinkage (SROS), for robust variable selection and estimation. It is proven that SROS not only gives consistent selection but also estimates the nonzero coefficients with full asymptotic efficiency under the normal model. In addition, we introduce the concept of a nearly regression equivariant estimator for understanding the breakdown properties of sparse estimators, and prove that SROS achieves the maximum breakdown value among nearly regression equivariant estimators. Numerical examples are presented to illustrate our methods.

20.
There are several procedures for fitting generalized additive models, i.e. regression models for an exponential family response where the influence of each single covariate is assumed to have an unknown, potentially non-linear shape. Simulated data are used to compare a smoothing parameter optimization approach for the selection of smoothness and of covariates, a stepwise approach, a mixed model approach, and a procedure based on boosting techniques. In particular, it is investigated how the performance of the procedures is linked to the amount of information, the type of response, the total number of covariates, the number of influential covariates, and the extent of non-linearity. The measures for comparison are prediction performance, identification of influential covariates, and smoothness of the fitted functions. One result is that the mixed model approach returns sparse fits with frequently over-smoothed functions, while for the boosting approach the functions are less smooth and variable selection is less strict. The other approaches lie in between with respect to these measures. The boosting procedure performs very well when little information is available and/or when a large number of covariates is to be investigated. It is somewhat surprising that in scenarios with little information, fitting a linear model, even with stepwise variable selection, has little advantage over fitting an additive model when the true underlying structure is linear. In cases with more information, the prediction performance of all procedures is very similar. So, in difficult data situations the boosting approach can be recommended; otherwise, the procedure can be chosen according to the aim of the analysis.

