Similar articles
Found 20 similar articles; search took 15 ms
1.
Abstract

In this article, we focus on variable selection for the semiparametric varying-coefficient partially linear model with responses missing at random. A variable selection procedure based on modal regression is proposed, in which the nonparametric functions are approximated by a B-spline basis. The proposed procedure uses the SCAD penalty to select variables in the parametric and nonparametric components simultaneously. Furthermore, we establish the consistency, sparsity and asymptotic normality of the resulting estimators. The penalized parameter estimates of the proposed method are computed with the EM algorithm. Simulation studies are carried out to assess the finite-sample performance of the proposed variable selection procedure.
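
For readers unfamiliar with the SCAD penalty mentioned above, the following is a minimal sketch of its standard piecewise form. It is not code from the paper; the function name `scad_penalty` and the default `a = 3.7` (the value commonly used in the literature) are assumptions made for illustration.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for one coefficient (illustrative sketch).

    theta: coefficient value; lam: tuning parameter (> 0);
    a: shape parameter (> 2), 3.7 is a conventional default.
    """
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same as the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Middle range: quadratic piece joining the two regimes smoothly.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no extra shrinkage.
    return lam * lam * (a + 1) / 2
```

The penalty is flat beyond `a * lam`, which is why large coefficients are left almost unshrunk while small ones can be set to zero.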

2.
G. Aneiros, F. Ferraty, P. Vieu. Statistics, 2015, 49(6): 1322-1347
The problem of variable selection is considered in high-dimensional partial linear regression under a model that allows for a possibly functional variable. The procedure studied is nonconcave-penalized least squares. The existence of a √n/sn-consistent estimator for the vector of pn linear parameters in the model is shown, even when pn tends to ∞ as the sample size n increases (sn denotes the number of influential variables). An oracle property is also obtained for the variable selection method, and the nonparametric rate of convergence is stated for the estimator of the nonlinear functional component of the model. Finally, a simulation study illustrates the finite-sample performance of our procedure.

3.
Generalized linear models (GLMs) are widely studied as a way to deal with complex response variables. For the analysis of categorical dependent variables with more than two response categories, multivariate GLMs are used to model the relationship between the polytomous response and a set of regressors. Traditional variable selection approaches have been proposed in the literature for the multivariate GLM with a canonical link function when the number of parameters is fixed. However, in many model selection problems, the number of parameters may be large and grow with the sample size. In this paper, we present a new selection criterion for the model with a diverging number of parameters. Under suitable conditions, the criterion is shown to be model selection consistent. A simulation study and a real data analysis are conducted to support the theoretical findings.

4.
In this paper, we propose a class of general partially linear varying-coefficient transformation models for ranking data. In the models, the functional coefficients are viewed as nuisance parameters and approximated by B-spline smoothing. The B-spline coefficients and regression parameters are estimated by a rank-based maximum marginal likelihood method. A three-stage Markov chain Monte Carlo stochastic approximation algorithm based on ranking data is used to compute the estimates and the corresponding variances for all the B-spline coefficients and regression parameters. Through three simulation studies and an application to Hong Kong horse racing data, the proposed procedure is shown to be accurate, stable and practical.

5.
The authors propose a two‐stage estimation procedure for the partially linear model Y = fo(T) + X'βo + ψ. They show how to estimate consistently the location of the nonzero components of βo. Their approach turns out to be compatible with minimax adaptive estimation of fo over Besov balls in the case of penalized least squares. Their proofs are based on a new type of oracle inequality.

6.
In this paper, we consider the estimation of partially linear additive quantile regression models where the conditional quantile function comprises a linear parametric component and a nonparametric additive component. We propose a two-step estimation approach: in the first step, we approximate the conditional quantile function using a series estimation method. In the second step, the nonparametric additive component is recovered using either a local polynomial estimator or a weighted Nadaraya–Watson estimator. Both consistency and asymptotic normality of the proposed estimators are established. In particular, we show that the first-stage estimator for the finite-dimensional parameters attains the semiparametric efficiency bound under homoskedasticity, and that the second-stage estimators for the nonparametric additive component have an oracle efficiency property. Monte Carlo experiments are conducted to assess the finite-sample performance of the proposed estimators. An application to a real data set is also presented.
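
As background for the second step, here is a minimal sketch of the plain Nadaraya–Watson estimator for a conditional mean. It is not the weighted variant used in the paper, and the Gaussian kernel, the function name `nadaraya_watson` and the bandwidth argument `h` are assumptions chosen for illustration.

```python
import math


def nadaraya_watson(x0, xs, ys, h):
    """Kernel-weighted average of ys at the point x0 (illustrative sketch).

    Uses a Gaussian kernel with bandwidth h > 0. Observations whose
    x-value is close to x0 receive larger weights.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: x0 is too far from the data for this h.
        raise ValueError("no observation has positive weight at x0")
    return sum(w * y for w, y in zip(weights, ys)) / total
```

A smaller `h` makes the fit more local and more variable; a larger `h` averages over more of the sample.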

7.
In the context of longitudinal data analysis, a random function typically represents a subject that is often observed at a small number of time points. To relax this restriction on the number of observations per subject, we consider semiparametric partially linear regression models with mean function x′β + g(z), where x and z are functional data. Estimators of β and g(z) are presented and some asymptotic results are given. It is shown that the estimator of the parametric component is asymptotically normal. The convergence rate of the estimator of the nonparametric component is also obtained. Here, the number of observations per subject is completely flexible. A simulation study is conducted to investigate the finite-sample performance of the proposed estimators.

8.
This paper is concerned with the selection of explanatory variables in generalized linear models (GLMs). The class of GLMs is quite large and contains, e.g., ordinary linear regression, binary logistic regression, the probit model and Poisson regression with a linear or log-linear parameter structure. We show that, through an approximation of the log likelihood and a certain data transformation, the variable selection problem in a GLM can be converted into variable selection in an ordinary (unweighted) linear regression model. As a consequence, no specific computer software for variable selection in GLMs is needed. Instead, a suitable variable selection program for linear regression can be used. We also present a simulation study which shows that the log likelihood approximation is very good in many practical situations. Finally, we briefly mention possible extensions to regression models outside the class of GLMs.

9.
Jing Yang, Fang Lu, Hu Yang. Statistics, 2017, 51(6): 1179-1199
In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and to lose almost no efficiency for normal errors. Asymptotic normality of the proposed estimators for both the parametric and nonparametric parts is established. To achieve sparsity when there are irrelevant variables in the model, two variable selection procedures based on an adaptive penalty are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both variable selection procedures are shown to enjoy the oracle property under some regularity conditions. Monte Carlo simulations are conducted to assess the finite-sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.
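
Quantile regression replaces the squared-error loss with the check (pinball) loss. The sketch below is not the authors' estimator; it only illustrates that loss for a constant fit, and the names `check_loss` and `quantile_objective` are hypothetical.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1{u < 0}) for 0 < tau < 1."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_objective(c, ys, tau):
    """Total check loss when every observation is fitted by the constant c."""
    return sum(check_loss(y - c, tau) for y in ys)


# With tau = 0.5 the loss is half the absolute error, so the best constant
# is the sample median, which an extreme observation barely affects.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
best = min(ys, key=lambda c: quantile_objective(c, ys, 0.5))
```

Because the loss grows only linearly in the residual, a single extreme value such as 100 does not pull the fit the way it would under least squares.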

10.
A nonconcave penalized estimation method is proposed for partially linear models with longitudinal data when the number of parameters diverges with the sample size. The proposed procedure can simultaneously estimate the parameters and select the important variables. Under some regularity conditions, the rate of convergence and asymptotic normality of the resulting estimators are established. In addition, an iterative algorithm is proposed to implement the proposed estimators. To improve efficiency for regression coefficients, the estimation of the covariance function is integrated in the iterative algorithm. Simulation studies are carried out to demonstrate that the proposed method performs well, and a real data example is analysed to illustrate the proposed procedure.

11.
Based on B-spline basis functions and the smoothly clipped absolute deviation (SCAD) penalty, we present a new estimation and variable selection procedure based on modal regression for partially linear additive models. The outstanding merit of the new method is that it is robust against outliers and heavy-tailed error distributions and performs no worse than least-squares-based estimation in the normal error case. The main difference is that the standard quadratic loss is replaced by a kernel function depending on a bandwidth that can be selected automatically from the observed data. With appropriate selection of the regularization parameters, the new method possesses consistency in variable selection and the oracle property in estimation. Finally, both a simulation study and a real data analysis are performed to examine the performance of our approach.
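
To make the contrast with quadratic loss concrete, the following sketch evaluates a kernel-based modal regression objective on a set of residuals. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the names `gaussian_kernel` and `modal_objective`, and the fixed bandwidth `h` are all choices made here, and the penalty and B-spline parts are omitted.

```python
import math


def gaussian_kernel(t, h):
    """Scaled Gaussian kernel phi(t / h) / h with bandwidth h > 0."""
    return math.exp(-0.5 * (t / h) ** 2) / (h * math.sqrt(2 * math.pi))


def modal_objective(residuals, h):
    """Average kernel value of the residuals; modal regression maximizes this.

    Each residual contributes at most gaussian_kernel(0, h), so a gross
    outlier adds almost nothing instead of dominating as under squared loss.
    """
    return sum(gaussian_kernel(r, h) for r in residuals) / len(residuals)
```

For example, moving one of two zero residuals out to 1000 halves the objective, whereas its squared-error contribution would grow to one million.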

12.
The authors propose a block empirical likelihood procedure to accommodate the within‐group correlation in longitudinal partially linear regression models. This leads them to prove a nonparametric version of the Wilks theorem. In comparison with normal approximations, their method does not require a consistent estimator for the asymptotic covariance matrix, which makes it easier to conduct inference on the parametric component of the model. An application to a longitudinal study on fluctuations of progesterone level in a menstrual cycle is used to illustrate the procedure developed here.

13.
In this paper, we focus on the variable selection for the semiparametric regression model with longitudinal data when some covariates are measured with errors. A new bias-corrected variable selection procedure is proposed based on the combination of the quadratic inference functions and shrinkage estimations. With appropriate selection of the tuning parameters, we establish the consistency and asymptotic normality of the resulting estimators. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed variable selection procedure. We further illustrate the proposed procedure with an application.

14.
Stepwise variable selection procedures are computationally inexpensive methods for constructing useful regression models for a single dependent variable. At each step a variable is entered into or deleted from the current model, based on the criterion of minimizing the error sum of squares (SSE). When there is more than one dependent variable, the situation is more complex. In this article we propose variable selection criteria for multivariate regression which generalize the univariate SSE criterion. Specifically, we suggest minimizing some function of the estimated error covariance matrix: the trace, the determinant, or the largest eigenvalue. The computations associated with these criteria may be burdensome. We develop a computational framework based on the use of the SWEEP operator which greatly reduces these calculations for stepwise variable selection in multivariate regression.

15.
Cox's seminal 1972 paper on regression methods for possibly censored failure time data popularized the use of time to an event as a primary response in prospective studies. But one key assumption of this and other regression methods is that observations are independent of one another. In many problems, failure times are clustered into small groups where outcomes within a group are correlated. Examples include failure times for two eyes from one person or for members of the same family. This paper presents a survey of models for multivariate failure time data. Two distinct classes of models are considered: frailty and marginal models. In a frailty model, the correlation is assumed to derive from latent variables (frailties) common to observations from the same cluster. Regression models are formulated for the conditional failure time distribution given the frailties. Alternatively, marginal models describe the marginal failure time distribution of each response while separately modelling the association among responses from the same cluster. We focus on recent extensions of the proportional hazards model for multivariate failure time data. Model formulation, parameter interpretation and estimation procedures are considered.

16.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that, in the presence of multicollinearity, the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods based on LS estimates, such as Mallows' Cp, lead to the selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
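
The subset selection algorithm itself is not described in this abstract, so the sketch below covers only the ridge estimator it builds on, computed as the solution of (X'X + kI)β = X'y. The function name `ridge_estimate`, the list-of-rows data layout and the pure-Python solver are assumptions made for illustration.

```python
def ridge_estimate(X, y, k):
    """Ridge estimate solving (X'X + k I) beta = X'y (illustrative sketch).

    X: list of n rows with p predictors each; y: list of n responses;
    k: ridge constant (k = 0 gives least squares when X'X is invertible).
    """
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X + k I | X'y], one row per predictor.
    A = [
        [sum(X[i][r] * X[i][c] for i in range(n)) + (k if r == c else 0.0)
         for c in range(p)]
        + [sum(X[i][r] * y[i] for i in range(n))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]
```

With two identical predictor columns, X'X is singular and least squares has no unique solution, while any `k > 0` makes the system solvable; this is the sense in which the ridge estimator copes with multicollinearity.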

17.
In this paper we study a class of multivariate partially linear regression models. Various estimators for the parametric component and the nonparametric component are constructed and their asymptotic normality is established. In particular, we propose an estimator of the contemporaneous correlation among the multiple responses and develop a test for detecting the existence of such contemporaneous correlation without using any nonparametric estimation. The performance of the proposed estimators and test is evaluated through simulation studies, and an analysis of a real data set is used to illustrate the developed methodology. The Canadian Journal of Statistics 41: 1–22; 2013 © 2013 Statistical Society of Canada

18.
Generalized partially linear varying-coefficient models (GPLVCM) are frequently used in statistical modeling. However, the statistical inference of the GPLVCM, such as confidence region/interval construction, has not been very well developed. In this article, empirical likelihood-based inference for the parametric components in the GPLVCM is investigated. Based on the local linear estimators of the GPLVCM, an estimated empirical likelihood-based statistic is proposed. We show that the resulting statistic is asymptotically non-standard chi-squared. By the proposed empirical likelihood method, the confidence regions for the parametric components are constructed. In addition, when some components of the parameter are of particular interest, the construction of their confidence intervals is also considered. A simulation study is undertaken to compare the empirical likelihood and the other existing methods in terms of coverage accuracies and average lengths. The proposed method is applied to a real example.

19.
20.
This paper discusses the problem of statistical inference in multivariate linear regression models when the errors are non-normally distributed. We consider the multivariate t-distribution, a fat-tailed distribution, for the errors as an alternative to the normal distribution. Such non-normality is commonly observed in many data sets, e.g., financial data, which usually exhibit excess kurtosis. This distribution also has a number of applications in many other areas of research. We use a modified maximum likelihood estimation method that provides the estimator, called the modified maximum likelihood estimator (MMLE), in closed form. These estimators are shown to be unbiased, efficient, and robust compared with the widely used least squares estimators (LSEs). Also, the tests based on MMLEs are found to be more powerful than the corresponding tests based on LSEs.
