Similar literature
 20 similar documents found
1.
In this paper, we consider a semiparametric regression model under long-range dependent errors. By approximating the nonparametric component by a finite series sum, we construct consistent estimators for both parametric and nonparametric components. Meanwhile, convergence rates for the consistent estimators are also investigated. Additionally, an optimal truncation parameter selection procedure is proposed.

2.
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.

3.
Jing Yang  Fang Lu  Hu Yang 《Statistics》2017,51(6):1179-1199
In this paper, we develop a new estimation procedure based on quantile regression for semiparametric partially linear varying-coefficient models. The proposed estimation approach is empirically shown to be much more efficient than the popular least squares estimation method for non-normal error distributions, and to lose almost no efficiency for normal errors. Asymptotic normality of the proposed estimators for both the parametric and nonparametric parts is established. To achieve sparsity when there exist irrelevant variables in the model, two variable selection procedures based on an adaptive penalty are developed to select important parametric covariates as well as significant nonparametric functions. Moreover, both variable selection procedures are demonstrated to enjoy the oracle property under some regularity conditions. Some Monte Carlo simulations are conducted to assess the finite sample performance of the proposed estimators, and a real-data example is used to illustrate the application of the proposed methods.

4.
Regularized variable selection is a powerful tool for identifying the true regression model from a large number of candidates by applying penalties to the objective functions. The penalty functions typically involve a tuning parameter that controls the complexity of the selected model. The ability of the regularized variable selection methods to identify the true model critically depends on the correct choice of the tuning parameter. In this study, we develop a consistent tuning parameter selection method for regularized Cox's proportional hazards model with a diverging number of parameters. The tuning parameter is selected by minimizing the generalized information criterion. We prove that, for any penalty that possesses the oracle property, the proposed tuning parameter selection method identifies the true model with probability approaching one as sample size increases. Its finite sample performance is evaluated by simulations. Its practical use is demonstrated on The Cancer Genome Atlas breast cancer data.

5.
Kai B  Li R  Zou H 《Annals of statistics》2011,39(1):305-332
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of the proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.
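As an illustration of the composite quantile regression idea mentioned in the abstract above, the following is a minimal sketch of the quantile check loss and a composite loss summed over several quantile levels. It is not the authors' procedure: the function names `check_loss` and `composite_quantile_loss` are hypothetical, the equally spaced levels tau_k = k/(K+1) are an assumed (though common) choice, and the residuals are taken as given rather than computed from a varying-coefficient fit.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def composite_quantile_loss(residuals, intercepts):
    """Sum the check loss over K quantile levels (hypothetical helper).

    residuals  -- values y_i minus the fitted regression part, shared by all levels
    intercepts -- one intercept b_k per quantile level; K = len(intercepts)
    Assumption: equally spaced levels tau_k = k / (K + 1).
    """
    num_levels = len(intercepts)
    total = 0.0
    for k, b_k in enumerate(intercepts, start=1):
        tau_k = k / (num_levels + 1)
        total += sum(check_loss(r - b_k, tau_k) for r in residuals)
    return total
```

With a single level (K = 1) the composite loss reduces to the median-regression loss, which is half the sum of absolute residuals.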

6.
The use of nonparametric regression techniques for binary regression is a promising alternative to parametric methods. As in other nonparametric smoothing problems, the choice of smoothing parameter is critical to the performance of the estimator and the appearance of the resulting estimate. In this paper, we discuss the use of selection criteria based on estimates of squared prediction risk and show consistency and asymptotic normality of the selected bandwidths. The usefulness of the methods is explored on a data set and in a small simulation study.

7.
Applying nonparametric variable selection criteria in nonlinear regression models generally requires a substantial computational effort if the data set is large. In this paper we present a selection technique that is computationally much less demanding and performs well in comparison with methods currently available. It is based on a polynomial approximation of the nonlinear model. Performing the selection only requires repeated least squares estimation of models that are linear in parameters. The main limitation of the method is that the number of variables among which to select cannot be very large if the sample is small and, at the same time, the order of an adequate polynomial is high. Large samples can be handled without problems.

8.
Summary.  The importance of variable selection in regression has grown in recent years as computing power has encouraged the modelling of data sets of ever-increasing size. Data mining applications in finance, marketing and bioinformatics are obvious examples. A limitation of nearly all existing variable selection methods is the need to specify the correct model before selection. When the number of predictors is large, model formulation and validation can be difficult or even infeasible. On the basis of the theory of sufficient dimension reduction, we propose a new class of model-free variable selection approaches. The methods proposed assume no model of any form, require no nonparametric smoothing and allow for general predictor effects. The efficacy of the methods proposed is demonstrated via simulation, and an empirical example is given.

9.
In this article, a new composite quantile regression estimation approach is proposed for estimating the parametric part of the single-index model (SIM). We use local linear composite quantile regression (CQR) for estimating the nonparametric part of the SIM when the error distribution is symmetrical. The weighted local linear CQR is proposed for estimating the nonparametric part of the SIM when the error distribution is asymmetrical. Moreover, a new variable selection procedure is proposed for the SIM. Under some regularity conditions, we establish the large sample properties of the proposed estimators. Simulation studies and a real data analysis are presented to illustrate the behavior of the proposed estimators.

10.
This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression problems. Unlike most smoothing parameter selection studies determining an optimum smoothing parameter, our focus here is on the prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step is to construct a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step is to average bootstrap smoothing spline estimates of different smoothness to form a final improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one-out cross-validation procedure. A simulation study has been performed by using a program written in R. The simulation study provides a comparison of the well-known cross-validation (CV) and generalized cross-validation (GCV) methods with the proposed method. This new method is straightforward to implement, and gives reliable performance in simulations.

11.
Model selection methods are important for identifying the best approximating model. To identify the best meaningful model, the purpose of the model should be clearly stated in advance. The focus of this paper is model selection when the modelling purpose is classification. We propose a new model selection approach designed for logistic regression model selection where the main modelling purpose is classification. The method is based on the distance between the two clustering trees. We also question and evaluate the performance of conventional model selection methods based on information theory concepts in determining the best logistic regression classifier. An extensive simulation study is used to assess the finite sample performance of the cluster tree based and the information theoretic model selection methods. Simulations are adjusted for whether the true model is in the candidate set or not. Results show that the new approach is highly promising. Finally, the methods are applied to a real data set to select a binary model as a means of classifying the subjects with respect to their risk of breast cancer.

12.
Hea-Jung Kim  Taeyoung Roh 《Statistics》2013,47(5):1082-1111
In regression analysis, a sample selection scheme often applies to the response variable, which results in missing-not-at-random observations on the variable. In this case, a regression analysis using only the selected cases would lead to biased results. This paper proposes a Bayesian methodology to correct this bias based on a semiparametric Bernstein polynomial regression model that incorporates the sample selection scheme into a stochastic monotone trend constraint, variable selection, and robustness against departures from the normality assumption. We present the basic theoretical properties of the proposed model that include its stochastic representation, sample selection bias quantification, and hierarchical model specification to deal with the stochastic monotone trend constraint in the nonparametric component, simple bias corrected estimation, and variable selection for the linear components. We then develop computationally feasible Markov chain Monte Carlo methods for semiparametric Bernstein polynomial functions with stochastically constrained parameter estimation and variable selection procedures. We demonstrate the finite-sample performance of the proposed model compared to existing methods using simulation studies and illustrate its use based on two real data applications.

13.
Abstract. Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model selection criterion is proposed to select the best one among this preselected set. The approach leads to a fast and efficient procedure for variable selection, especially in high‐dimensional settings. Model selection consistency of the suggested criterion is proven when the number of covariates d is fixed. Simulation studies suggest that the criterion still enjoys model selection consistency when d is much larger than the sample size. The simulations also show that our approach for variable selection works surprisingly well in comparison with existing competitors. The method is also applied to a real data set.

14.
Our goal is to find a regression technique that can be used in a small-sample situation with possible model misspecification. The development of a new bandwidth selector allows nonparametric regression (in conjunction with least squares) to be used in this small-sample problem, where nonparametric procedures have previously proven to be inadequate. Considered here are two new semiparametric (model-robust) regression techniques that combine parametric and nonparametric techniques when there is partial information present about the underlying model. A general overview is given of how typical concerns for bandwidth selection in nonparametric regression extend to the model-robust procedures. A new penalized PRESS criterion (with a graphical selection strategy for applications) is developed that overcomes these concerns and is able to maintain the beneficial mean squared error properties of the new model-robust methods. It is shown that this new selector outperforms standard and recently improved bandwidth selectors. Comparisons of the selectors are made via numerous generated data examples and a small simulation study.

15.
Summary.  The paper introduces a new local polynomial estimator and develops supporting asymptotic theory for nonparametric regression in the presence of covariate measurement error. We address the measurement error with Cook and Stefanski's simulation–extrapolation (SIMEX) algorithm. Our method improves on previous local polynomial estimators for this problem by using a bandwidth selection procedure that addresses SIMEX's particular estimation method and considers higher degree local polynomial estimators. We illustrate the accuracy of our asymptotic expressions with a Monte Carlo study, compare our method with other estimators with a second set of Monte Carlo simulations and apply our method to a data set from nutritional epidemiology. SIMEX was originally developed for parametric models. Although SIMEX is, in principle, applicable to nonparametric models, a serious problem arises with SIMEX in nonparametric situations. The problem is that smoothing parameter selectors that are developed for data without measurement error are no longer appropriate and can result in considerable undersmoothing. We believe that this is the first paper to address this difficulty.

16.
In this article, we consider the problem of variable selection in linear regression when multicollinearity is present in the data. It is well known that in the presence of multicollinearity, the performance of the least squares (LS) estimator of the regression parameters is not satisfactory. Consequently, subset selection methods, such as Mallows' Cp, which are based on LS estimates lead to selection of inadequate subsets. To overcome the problem of multicollinearity in subset selection, a new subset selection algorithm based on the ridge estimator is proposed. It is shown that the new algorithm is a better alternative to Mallows' Cp when the data exhibit multicollinearity.
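For orientation, the two standard ingredients named in the abstract above can be written out as follows. This is a sketch of the textbook definitions only, not the article's new ridge-based criterion; the symbols (n observations, a candidate subset with p parameters, residual sum of squares SSE_p, full-model error variance estimate, ridge constant k) are notation assumed here rather than taken from the article.

```latex
% Classical Mallows' C_p for a candidate subset with p parameters,
% where \hat{\sigma}^2 is estimated from the full LS fit:
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p

% LS estimator and ridge estimator with ridge constant k > 0:
\hat{\beta}_{\mathrm{LS}} = (X^\top X)^{-1} X^\top y,
\qquad
\hat{\beta}(k) = (X^\top X + kI)^{-1} X^\top y
```

When the columns of X are nearly collinear, X^T X is close to singular and the LS estimator has large variance; adding kI stabilizes the inverse, which is the usual motivation for basing a selection criterion on the ridge fit instead.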

17.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied, and a Monte Carlo EM (MCEM) algorithm with a penalized-spline (P-spline) technique is developed to estimate the regression coefficients and the nonparametric function. As classical model selection procedures such as Akaike's information criterion (AIC) become invalid for the considered models with incomplete data, new model selection criteria for GPLMs with missing covariates are proposed under two different missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive feature of our method is that it is rather general and can be extended to various situations with missing observations based on the EM algorithm. In particular, when no missing data are involved, the new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.

18.
The problem of predicting a future value of a time series is considered in this article. If the series follows a stationary Markov process, this can be done by nonparametric estimation of the autoregression function. Two forecasting algorithms are introduced. They only differ in the nonparametric kernel-type estimator used: the Nadaraya-Watson estimator and the local linear estimator. There are three major issues in the implementation of these algorithms: selection of the autoregressor variables, smoothing parameter selection, and computing prediction intervals. These have been tackled using recent techniques borrowed from the nonparametric regression estimation literature under dependence. The performance of these nonparametric algorithms has been studied by applying them to a collection of 43 well-known time series. Their results have been compared to those obtained using classical Box-Jenkins methods. Finally, the practical behavior of the methods is also illustrated by a detailed analysis of two data sets.
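The Nadaraya-Watson forecast described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the article's algorithm: it assumes a first-order Markov series (one lag as the only autoregressor), a Gaussian kernel, and a bandwidth supplied by the caller, whereas the article selects the autoregressor variables and the smoothing parameter from the data. The function name `nw_forecast` is hypothetical.

```python
import math


def nw_forecast(series, bandwidth):
    """One-step-ahead Nadaraya-Watson forecast (hypothetical helper).

    Estimates m(x) = E[X_{t+1} | X_t = x] at x = last observed value,
    using a Gaussian kernel over the pairs (X_t, X_{t+1}).
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    lagged = series[:-1]   # X_t
    following = series[1:]  # X_{t+1}
    x0 = series[-1]         # point at which the autoregression function is evaluated

    # Gaussian kernel weights; the normalizing constant cancels in the ratio.
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in lagged]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, following)) / total
```

The forecast is a weighted average of the values that followed past observations close to the current one, so a small bandwidth tracks local patterns and a large bandwidth approaches the overall mean of the successor values.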

19.
This article is concerned with one discrete nonparametric kernel approach and two parametric regression approaches for describing the evolution law of pavement deterioration. The first parametric approach is a survival data analysis method, and the second is a nonlinear mixed-effects model. The nonparametric approach consists of a regression estimator using discrete associated kernels. Some asymptotic properties of the discrete nonparametric kernel estimator are shown, in particular its almost sure consistency. Moreover, two data-driven bandwidth selection methods are given, with a new explicit theoretical expression of the optimal bandwidth provided for this nonparametric estimator. A comparative simulation study is carried out, with bootstrap methods applied to a measure of statistical accuracy.

20.
A Bayesian approach is proposed for coefficient estimation in the Tobit quantile regression model. The proposed approach is based on placing a g-prior distribution that depends on the quantile level on the regression coefficients. The prior is generalized by introducing a ridge parameter to address important challenges that may arise with censored data, such as multicollinearity and overfitting problems. Then, a stochastic search variable selection approach is proposed for the Tobit quantile regression model based on the g-prior. An expression for the hyperparameter g is proposed to calibrate the modified g-prior with a ridge parameter to the corresponding g-prior. Some possible extensions of the proposed approach are discussed, including the continuous and binary responses in quantile regression. The methods are illustrated using several simulation studies and a microarray study. The simulation studies and the microarray study indicate that the proposed approach performs well.
