Similar articles
 Found 20 similar articles; search took 15 ms
1.
In this article, we propose a semiparametric mixture of additive regression models, in which the regression functions are additive and nonparametric while the mixing proportions and variances are constant. Compared with the mixture of linear regression models, the proposed methodology is more flexible in modeling the nonlinear relationship between the response and the covariates. A two-step procedure based on the spline-backfitted kernel method is derived for computation. Moreover, we establish the asymptotic normality of the resulting estimators and examine their performance through a numerical example.

2.
In this paper, we investigate the commonality of nonparametric component functions among different quantile levels in additive regression models. We propose two fused adaptive group Least Absolute Shrinkage and Selection Operator penalties to shrink the difference of functions between neighbouring quantile levels. The proposed methodology is able to simultaneously estimate the nonparametric functions and identify the quantile regions where functions are unvarying, and thus is expected to perform better than standard additive quantile regression when there exists a region of quantile levels on which the functions are unvarying. Under some regularity conditions, the proposed penalised estimators can theoretically achieve the optimal rate of convergence and identify the true varying/unvarying regions consistently. Simulation studies and a real data application show that the proposed methods yield good numerical results.

3.
For multivariate regressors, the Nadaraya–Watson regression estimator suffers from the well-known curse of dimensionality. Additive models overcome this drawback. To estimate the additive components, it is usually assumed that all the data are observed. However, in many applied statistical analyses, missing data occur. In this paper, we study the effect of missing responses on the estimation of the additive components. The estimators are based on marginal integration adapted to the missing-data situation. The proposed estimators turn out to be consistent under mild assumptions. A simulation study allows us to compare the behavior of our procedures under different scenarios.

4.
In high-dimensional settings, componentwise L2Boosting has been used to construct sparse models that perform well, but it tends to select many ineffective variables. Several sparse boosting methods, such as SparseL2Boosting and Twin Boosting, have been proposed to improve the variable selection of the L2Boosting algorithm. In this article, we propose a new general sparse boosting method (GSBoosting). Relations are established between GSBoosting and other well-known regularized variable selection methods in the orthogonal linear model, such as the adaptive Lasso and hard thresholding. Simulation results show that GSBoosting performs well in both prediction and variable selection.
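
For orientation, the baseline that this abstract starts from, componentwise L2Boosting, can be sketched as follows. This is a minimal illustration and not the GSBoosting method itself. The function name `l2boost`, the step size `nu`, and the fixed number of steps are assumptions made for the example, and the columns of `X` and the response `y` are assumed to be centred so that no intercept is needed.

```python
from typing import List


def l2boost(X: List[List[float]], y: List[float],
            steps: int = 100, nu: float = 0.1) -> List[float]:
    """Componentwise L2Boosting for a linear model (illustrative sketch).

    At every step, each single column is fitted to the current residuals by
    least squares; the column that reduces the residual sum of squares most
    is selected and its coefficient is moved by a shrunken amount nu * coef.
    """
    p = len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    beta = [0.0] * p
    resid = list(y)
    for _ in range(steps):
        best_j, best_coef, best_gain = 0, 0.0, -1.0
        for j, col in enumerate(cols):
            sxx = sum(v * v for v in col)
            if sxx == 0.0:
                continue  # skip constant-zero columns
            sxr = sum(v * r for v, r in zip(col, resid))
            gain = sxr * sxr / sxx  # RSS reduction from fitting column j alone
            if gain > best_gain:
                best_j, best_coef, best_gain = j, sxr / sxx, gain
        beta[best_j] += nu * best_coef
        resid = [r - nu * best_coef * v for r, v in zip(resid, cols[best_j])]
    return beta
```

Because only one coefficient changes per step, variables that are never selected keep a coefficient of exactly zero, which is the source of the sparsity discussed in the abstract.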

5.
This paper develops inference for the significance of features such as peaks and valleys observed in additive modeling through an extension of the SiZer-type methodology of Chaudhuri and Marron (1999) and Godtliebsen et al. (2002, 2004) to the case where the outcome is discrete. We consider the problem of determining the significance of features such as peaks or valleys in observed covariate effects, both for additive modeling where the main predictor of interest is univariate and for the problem of studying the significance of features such as peaks, inclines, ridges and valleys when the main predictor of interest is geographical location. We work with low-rank radial spline smoothers to allow the handling of sparse designs and large sample sizes. Reducing the problem to a Generalised Linear Mixed Model (GLMM) framework enables derivation of simulation-based critical value approximations and guards against the problem of multiple inferences over a range of predictor values. Such a reduction also allows for easy adjustment for confounders, including those which have an unknown or complex effect on the outcome. A simulation study indicates that our method has satisfactory power. Finally, we illustrate our methodology on several data sets.

6.
Nonparametric additive models are powerful techniques for multivariate data analysis. Although many procedures have been developed for estimating additive components in both mean regression and quantile regression, the problem of selecting relevant components has received little attention, especially in quantile regression. We present a doubly-penalized estimation procedure for component selection in additive quantile regression models that combines basis function approximation with a ridge-type penalty and a variant of the smoothly clipped absolute deviation penalty. We show that the proposed estimator identifies relevant and irrelevant components consistently and achieves the nonparametric optimal rate of convergence for the relevant components. We also provide an accurate and efficient computation algorithm to implement the estimator and demonstrate its performance through simulation studies. Finally, we illustrate our method via a real data example that identifies important body measurements for predicting an individual's percentage of body fat.

7.
Sparsity-inducing penalties are useful tools for variable selection and are also effective for regression problems where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in multiclass logistic regression models for functional data, using sparse regularization. The parameters of the functional logistic regression model are estimated in the framework of the penalized likelihood method with the sparse group lasso-type penalty, and then tuning parameters for the model are selected using the model selection criterion. The effectiveness of the proposed method is investigated through simulation studies and the analysis of a gene expression data set.

8.
We provide an optimization interpretation of both back-fitting and integration estimators for additive nonparametric regression. We find that the integration estimator is a projection with respect to a product measure. We also provide further understanding of the back-fitting method.
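
As a reminder of what the back-fitting method does, here is a minimal sketch of the classical algorithm: each component is updated in turn by smoothing the partial residuals against its own covariate. The Gaussian-kernel Nadaraya–Watson smoother, the bandwidth `h`, the fixed iteration count, and the function names are all assumptions chosen for illustration; the abstract does not specify them.

```python
import math
from typing import List, Tuple


def nw_smooth(x: List[float], r: List[float], h: float) -> List[float]:
    """Nadaraya-Watson smooth of r against x, evaluated at the sample points."""
    out = []
    for x0 in x:
        weights = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x]
        out.append(sum(w * ri for w, ri in zip(weights, r)) / sum(weights))
    return out


def backfit(X: List[List[float]], y: List[float],
            h: float = 0.3, n_iter: int = 20) -> Tuple[float, List[List[float]]]:
    """Fit y ~ alpha + sum_j f_j(x_j); returns alpha and each f_j at the data."""
    n, d = len(y), len(X[0])
    alpha = sum(y) / n
    f = [[0.0] * n for _ in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            # Partial residuals: remove the intercept and all other components.
            partial = [
                y[i] - alpha - sum(f[k][i] for k in range(d) if k != j)
                for i in range(n)
            ]
            fj = nw_smooth([row[j] for row in X], partial, h)
            mean_fj = sum(fj) / n
            f[j] = [v - mean_fj for v in fj]  # centre for identifiability
    return alpha, f
```

Centring each component after every update keeps the intercept and the components separately identifiable.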

9.
The group Lasso is a penalized regression method, used in regression problems where the covariates are partitioned into groups, to promote sparsity at the group level [27: M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, J. R. Stat. Soc. Ser. B 68 (2006), pp. 49–67. doi: 10.1111/j.1467-9868.2005.00532.x]. The quantile group Lasso, a natural extension of the quantile Lasso [25: Y. Wu and Y. Liu, Variable selection in quantile regression, Statist. Sinica 19 (2009), pp. 801–817], is a good alternative when the data have group information and contain many outliers and/or heavy tails. Discovering important features that are correlated with the outcomes of interest and robust to outliers has received much attention. In many applications, however, we may also want to keep the flexibility of selecting variables within a group. In this paper, we develop a sparse group variable selection method based on quantile regression that selects important covariates both at the group level and within groups, by penalizing the empirical check loss function with the sum of square-root group-wise L1-norm penalties. The oracle properties are established for the case where the number of parameters diverges. We also apply our new method to a varying coefficient model with categorical effect modifiers. Simulations and a real data example show that the newly proposed method has robust and superior performance.
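
The objective described in this abstract can be written out as a short sketch: the empirical check loss plus a tuning parameter times the sum, over groups, of the square root of each group's L1 norm. The function names, the unweighted sum over observations, and the single tuning parameter `lam` are assumptions made for illustration; the paper may scale or weight the terms differently.

```python
import math
from typing import List, Sequence


def check_loss(u: float, tau: float) -> float:
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def objective(beta: Sequence[float], X: Sequence[Sequence[float]],
              y: Sequence[float], groups: Sequence[Sequence[int]],
              tau: float, lam: float) -> float:
    """Check loss plus lam * sum over groups of sqrt(L1 norm of the group)."""
    residuals = [yi - sum(b * x for b, x in zip(beta, row))
                 for row, yi in zip(X, y)]
    loss = sum(check_loss(r, tau) for r in residuals)
    # Each entry of `groups` lists the coefficient indices in that group.
    penalty = sum(math.sqrt(sum(abs(beta[j]) for j in g)) for g in groups)
    return loss + lam * penalty
```

The square root is applied to the whole group norm, so a group drops out only when all of its coefficients are zero, while the inner L1 norm still allows individual coefficients within an active group to be zero.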

10.
The central topic of this article is the estimation of parameters of the generalized partially linear single-index model (GPLSIM). Two numerical optimization procedures are presented and an S-plus program based on these procedures is compared to a program by Wand in a simulation setting. The results from these simulations indicate that the estimates from the new procedures are as good as, if not better than, Wand's. Also, this program is much more flexible than Wand's since it can handle more general models. Other simulations are also conducted. The first compares the effects of using linear interpolation versus spline interpolation in an optimization procedure. The results indicate that spline interpolation gives more stable estimates at the cost of increased computational time. A second simulation was conducted to assess the performance of a method for estimating the variance of alpha. A third set of simulations is carried out to determine the best criterion for testing that one of the elements of alpha is equal to zero. The GPLSIM is applied to a water quality data set and the results indicate an interesting relationship between gastrointestinal illness and turbidity (cloudiness) of drinking water.

11.
We consider efficient estimation in the semiparametric additive isotonic regression model, where each additive nonparametric component is assumed to be a monotone function. We show that the least-squares estimator of the finite-dimensional regression coefficient is root-n consistent and asymptotically normal. Moreover, the isotonic estimator of each additive functional component is proved to have the oracle property, which means the additive component can be estimated with the highest asymptotic accuracy as if the other components were known. A fast algorithm is developed by iterating between a cyclic pool adjacent violators procedure and solving a standard ordinary least squares problem. Simulations are used to illustrate the performance of the proposed procedure and verify the oracle property.
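
The building block named in this abstract, the pool adjacent violators algorithm (PAVA), is sketched below for a single monotone component. This is only the basic one-dimensional step, assuming the responses are already sorted by the covariate; the cyclic version and the alternation with ordinary least squares described in the abstract are not reproduced here, and the function name `pava` is chosen for the example.

```python
from typing import List, Optional


def pava(y: List[float], w: Optional[List[float]] = None) -> List[float]:
    """Weighted least-squares fit of a nondecreasing sequence to y."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks: List[List[float]] = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Pool backwards while adjacent blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    fit: List[float] = []
    for mean, _, count in blocks:
        fit.extend([mean] * int(count))
    return fit
```

For example, `pava([1, 3, 2, 4])` pools the violating pair (3, 2) into their mean and returns `[1, 2.5, 2.5, 4]`.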

12.
Following the extension from linear mixed models to additive mixed models, an extension from generalized linear mixed models to generalized additive mixed models is made. Algorithms are developed to compute the MLEs of the nonlinear effects and the covariance structures based on the penalized marginal likelihood. Convergence of the algorithms and selection of the smoothing parameters are discussed.

13.
Many semi-parametric and nonparametric models are used to fit nonlinear time series data. They include partially linear time series models, nonparametric additive models, and semi-parametric single index models. In this article, we focus on fitting time series data with a partially linear additive model. Combining the orthogonal series approximation and the adaptive sparse group LASSO regularization, we select the important variables between and within the groups simultaneously. Specifically, we propose a two-step algorithm to obtain the grouped sparse estimators. Numerical studies show that the proposed method outperforms the LASSO method in both fitting and forecasting. An empirical analysis is used to illustrate the methodology.

14.
Geoadditive models   Cited by: 7 (self-citations: 0; citations by others: 7)
Summary. A study into geographical variability of reproductive health outcomes (e.g. birth weight) in Upper Cape Cod, Massachusetts, USA, benefits from geostatistical mapping or kriging. However, also observed are some continuous covariates (e.g. maternal age) that exhibit pronounced non-linear relationships with the response variable. To account for such effects properly we merge kriging with additive models to obtain what we call geoadditive models. The merging becomes effortless by expressing both as linear mixed models. The resulting mixed model representation for the geoadditive model allows for fitting and diagnosis using standard methodology and software.

15.
Based on B-spline basis functions and the smoothly clipped absolute deviation (SCAD) penalty, we present a new estimation and variable selection procedure based on modal regression for partially linear additive models. The outstanding merit of the new method is that it is robust against outliers and heavy-tailed error distributions and performs no worse than least-squares-based estimation in the normal error case. The main difference is that the standard quadratic loss is replaced by a kernel function depending on a bandwidth that can be automatically selected based on the observed data. With appropriate selection of the regularization parameters, the new method achieves consistency in variable selection and the oracle property in estimation. Finally, both a simulation study and a real data analysis are performed to examine the performance of our approach.
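
For reference, the SCAD penalty mentioned in this abstract has a simple closed form, sketched below in the standard Fan and Li parameterisation. The function name is chosen for the example, and the default `a = 3.7` is the value commonly recommended in the SCAD literature rather than something stated in the abstract.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty p_lam(|theta|) with tuning parameters lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear part near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic part that smoothly reduces the penalisation rate.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant part: large coefficients are not penalised further.
    return lam * lam * (a + 1) / 2
```

The penalty is flat beyond `a * lam`, so large coefficients are not shrunk further, which is what makes the oracle property possible.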

16.
Summary.  Motivated by the problem of testing for genetic effects on complex traits in the presence of gene–environment interaction, we develop score tests in general semiparametric regression problems that involve a Tukey-style 1 degree-of-freedom form of interaction between parametrically and non-parametrically modelled covariates. We find that the score test in this type of model, as recently developed by Chatterjee and co-workers in the fully parametric setting, is biased and requires undersmoothing to be valid in the presence of non-parametric components. Moreover, in the presence of repeated outcomes, the asymptotic distribution of the score test depends on the estimation of functions which are defined as solutions of integral equations, making implementation difficult and computationally taxing. We develop profiled score statistics which are unbiased and asymptotically efficient and can be performed by using standard bandwidth selection methods. In addition, to overcome the difficulty of solving functional equations, we give easy interpretations of the target functions, which in turn allow us to develop estimation procedures that can be easily implemented by using standard computational methods. We present simulation studies to evaluate the type I error and power of the proposed method compared with a naive test that does not consider interaction. Finally, we illustrate our methodology by analysing data from a case–control study of colorectal adenoma that was designed to investigate the association between colorectal adenoma and the candidate gene NAT2 in relation to smoking history.

17.
18.
Summary.  Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data.

19.
Suppose the observations (t_i, y_i), i = 1, …, n, follow an additive model in which the g_j are unknown functions. The additive components can be estimated by approximating each g_j with a function made up of the sum of a linear fit and a truncated Fourier series of cosines, and minimizing a penalized least-squares loss function over the coefficients. This finite-dimensional basis approximation, when fitting an additive model with r predictors, has the advantage of reducing the computations drastically, since it does not require the use of the backfitting algorithm. The cross-validation (CV) [or generalized cross-validation (GCV)] score for the additive fit is calculated in a further O(n) operations. A search path in the r-dimensional space of degrees of freedom is proposed along which the CV (GCV) continuously decreases. The path ends when an increase in the degrees of freedom of any of the predictors yields an increase in CV (GCV). This procedure is illustrated on a meteorological data set.
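
The basis approximation described in this abstract can be illustrated for a single predictor on [0, 1]: the fitted function is an intercept, a linear term, and a truncated cosine series, with coefficients obtained from penalized least squares. Everything specific in this sketch is an assumption made for illustration: the function names, the number of cosine terms `k`, the tuning parameter `lam`, and the penalty weights `m**4` (chosen because they are proportional to the integrated squared second derivative of cos(πmt)). With r predictors the basis columns would simply be concatenated.

```python
import math
from typing import List, Sequence


def basis_row(t: float, k: int) -> List[float]:
    """Intercept, linear term, and k cosine terms for t in [0, 1]."""
    return [1.0, t] + [math.cos(math.pi * m * t) for m in range(1, k + 1)]


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for r in range(c + 1, n):
            factor = aug[r][c] / aug[c][c]
            for col in range(c, n + 1):
                aug[r][col] -= factor * aug[c][col]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][col] * x[col] for col in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_cosine_series(t: Sequence[float], y: Sequence[float],
                      k: int = 5, lam: float = 1e-3) -> List[float]:
    """Penalized least squares over the linear-plus-cosine basis."""
    rows = [basis_row(ti, k) for ti in t]
    p = k + 2
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for m in range(1, k + 1):
        gram[m + 1][m + 1] += lam * m ** 4  # penalise only the cosine terms
    rhs = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(gram, rhs)
```

Since the fit reduces to one linear system in the basis coefficients, no iteration over components is needed, which is the computational advantage the abstract points to.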

20.
This paper searches for A-optimal designs for Kronecker product and additive regression models when the errors are heteroscedastic. Sufficient conditions are given so that A-optimal designs for the multifactor models can be built from A-optimal designs for their sub-models with a single factor. The results of an efficiency study carried out to check the adequacy of the products of optimal designs for uni-factor marginal models when these are used to estimate different multi-factor models are also reported.
