Similar Documents

20 similar documents found.
1.

This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression. Unlike most smoothing parameter selection studies, which seek a single optimum smoothing parameter, our focus here is on prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step constructs a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step averages these bootstrap smoothing spline estimates of different smoothness to form a final, improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one-out cross-validation procedure. A simulation study, implemented in R, compares the proposed method with the well-known cross-validation (CV) and generalized cross-validation (GCV) criteria. The new method is straightforward to implement and performs reliably in simulations.
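The two-step recipe above can be sketched in miniature: fit candidate smoothers of different smoothness, then weight them to minimize delete-one-out prediction error. This is a hedged sketch, not the authors' bootstrap procedure: the moving-average smoothers (standing in for smoothing splines), the restriction to two candidates, and the closed-form weight are all illustrative assumptions.

```python
def moving_average(y, half_width):
    """Simple linear smoother: average over a symmetric window."""
    n = len(y)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        out.append(sum(y[lo:hi]) / (hi - lo))
    return out

def loo_residuals(y, half_width):
    """Delete-one-out residuals: refit the smoother at i without point i."""
    n = len(y)
    res = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        window = [y[j] for j in range(lo, hi) if j != i]
        res.append(y[i] - sum(window) / len(window))
    return res

def best_weight(y, hw1, hw2):
    """Closed-form convex weight w on smoother 1 minimizing
    sum_i (w*r1_i + (1-w)*r2_i)^2, clipped to [0, 1]."""
    r1, r2 = loo_residuals(y, hw1), loo_residuals(y, hw2)
    num = sum(b * (b - a) for a, b in zip(r1, r2))
    den = sum((a - b) ** 2 for a, b in zip(r1, r2))
    w = num / den if den > 0 else 0.5
    return min(1.0, max(0.0, w))
```

With more than two candidates, the weights would come from a constrained least-squares problem over the simplex rather than this closed form.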

2.
Smoothing Splines and Shape Restrictions
Constrained smoothing splines are discussed under order restrictions on the shape of the function m. We consider shape constraints of the type m^(r) ≥ 0, i.e. positivity, monotonicity, convexity, and so on (here, for an integer r ≥ 0, m^(r) denotes the r-th derivative of m). The paper contains three results: (1) constrained smoothing splines achieve optimal rates in shape-restricted Sobolev classes; (2) they are equivalent to two-step procedures of the following type: (a) in a first step, the unconstrained smoothing spline is calculated; (b) in a second step, the unconstrained smoothing spline is "projected" onto the constrained set, with the projection calculated with respect to a Sobolev-type norm. This result serves two purposes: it may motivate new algorithmic approaches, and it helps to explain the form of the estimator and its asymptotic properties; (3) the infinite number of constraints can be replaced by a finite number with only a small loss of accuracy; this is discussed for the estimation of a convex function.
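Result (2) casts the constrained estimator as a projection of an unconstrained fit onto the constrained set. A minimal sketch of such a projection, assuming the simplest setting (monotonicity, r = 1, and the ordinary Euclidean norm instead of the paper's Sobolev-type norm), is the pool-adjacent-violators algorithm:

```python
def pava(y):
    """Least-squares projection of the sequence y onto the set of
    nondecreasing sequences (pool-adjacent-violators algorithm)."""
    sums, cnts = [], []          # blocks represented by (sum, count)
    for v in y:
        sums.append(float(v))
        cnts.append(1)
        # merge adjacent blocks while their means violate monotonicity
        while len(sums) > 1 and sums[-2] / cnts[-2] > sums[-1] / cnts[-1]:
            s, c = sums.pop(), cnts.pop()
            sums[-1] += s
            cnts[-1] += c
    out = []
    for s, c in zip(sums, cnts):
        out.extend([s / c] * c)  # each block is replaced by its mean
    return out
```

Projecting under a Sobolev-type norm, as in the paper, changes the geometry of the projection but not the two-step structure: smooth first, then project.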

3.
A permutation testing approach for multivariate mixed models is presented. The proposed solutions allow testing of between-unit effects; they are exact under some assumptions and approximate in the more general case. The classes of models covered by this approach include generalized linear models, vector generalized additive models and other nonparametric models based on smoothing. Moreover, the approach does not assume that observations from different units follow the same distribution. Extensions to a multivariate framework are presented and discussed. The proposed multivariate tests exploit the dependence among variables, increasing power relative to standard solutions (e.g. the Bonferroni correction) that combine many univariate tests into an overall one. Two applications to real data from psychological and ecological studies are given as examples; a simulation study provides some insight into the unbiasedness of the tests and their power. The methods are implemented in the R package flip, freely available on CRAN.
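The univariate building block can be sketched in a few lines. This is an illustrative two-sample comparison of means, not the multivariate mixed-model tests of the paper or the API of the flip package:

```python
import random

def perm_test(x, y, n_perm=2000, seed=0):
    """Two-sample permutation test for a difference in means.
    Returns an approximate two-sided p-value."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    n, hits = len(x), 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        d = abs(sum(pooled[:n]) / n - sum(pooled[n:]) / len(y))
        if d >= observed:
            hits += 1
    # +1 correction keeps the p-value strictly positive
    return (hits + 1) / (n_perm + 1)
```

The absolute difference makes the test two-sided; exactness (under exchangeability) would require enumerating all permutations rather than sampling them.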

4.
Multivariate temporal disaggregation deals with the historical reconstruction and nowcasting of economic variables subject to temporal and contemporaneous aggregation constraints. The problem involves a system of time series that are related not only by a dynamic model but also by accounting constraints. The paper introduces two fundamental (and realistic) models that implement the multivariate best linear unbiased estimation approach, with potential application to the temporal disaggregation of national accounts series. The multivariate regression model with random walk disturbances is most suitable for chain-linked volumes (as the nature of the national accounts time series suggests); in this case, however, the accounting constraints are not binding, and the discrepancy has to be modeled by either a trend-stationary or an integrated process. The size of the discrepancy, tiny compared with the other driving disturbances, prevents maximum likelihood estimation from being carried out, and the parameters have to be estimated separately. The multivariate disaggregation model with integrated random walk disturbances is suitable for national accounts aggregates expressed at current prices, in which case the accounting constraints are binding.

5.
Generalized additive models represented using low rank penalized regression splines, estimated by penalized likelihood maximisation and with smoothness selected by generalized cross validation or similar criteria, provide a computationally efficient general framework for practical smooth modelling. Various authors have proposed approximate Bayesian interval estimates for such models, based on extensions of the work of Wahba, G. (1983) [Bayesian confidence intervals for the cross validated smoothing spline. J. R. Statist. Soc. B 45, 133–150] and Silverman, B.W. (1985) [Some aspects of the spline smoothing approach to nonparametric regression curve fitting. J. R. Statist. Soc. B 47, 1–52] on smoothing spline models of Gaussian data, but testing of such intervals has been rather limited and there is little supporting theory for the approximations used in the generalized case. This paper aims to improve this situation by providing simulation tests and obtaining asymptotic results supporting the approximations employed for the generalized case. The simulation results suggest that while across-the-model performance is good, component-wise coverage probabilities are not as reliable. Since this is likely to result from the neglect of smoothing parameter variability, a simple and efficient simulation method is proposed to account for smoothing parameter uncertainty: this is demonstrated to substantially improve the performance of component-wise intervals.

6.
Nonparametric model specification for stationary time series involves selecting the smoothing parameter (bandwidth), the lag structure and the functional form (linear vs. nonlinear). In real-life problems, none of these factors is known, and the choices are interdependent. In this article, we recommend accomplishing all of these choices in one step via the model selection approach. Two procedures are considered: one based on an information criterion and the other based on least squares cross-validation. Monte Carlo simulation results show that both procedures have good finite-sample performance and are easy to implement compared with existing two-step probabilistic testing procedures.

7.
This paper considers inference about the individual-level relationship between two dichotomous variables based on aggregated data. It is known that such analyses suffer from 'ecological bias', caused by the lack of homogeneity of this relationship across the groups over which the aggregation occurs. Two new methods for overcoming this bias, one based on local smoothing and the other a simple semiparametric approach, are developed and evaluated. The local smoothing approach performs best when used with a covariate that accounts for some of the variation in the relationships across groups. The semiparametric approach also performed well in our evaluation, even without such auxiliary information.

8.
There is a growing literature on Bayesian computational methods for problems with intractable likelihoods. One approach is the set of algorithms known as Approximate Bayesian Computation (ABC) methods. A drawback of these algorithms is that their performance depends on the appropriate choice of summary statistics, distance measure and tolerance level. To circumvent this problem, an alternative method based on the empirical likelihood has been introduced. This method can be easily implemented when a set of constraints, related to the moments of the distribution, is specified; however, the choice of the constraints is sometimes challenging. To overcome this difficulty, we propose an alternative method based on a bootstrap likelihood approach. The method is easy to implement and in some cases is actually faster than the other approaches considered. We illustrate the performance of our algorithm with examples from population genetics, time series and stochastic differential equations, and we also test the method on a real dataset.
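For context, the baseline ABC rejection sampler whose tuning burden (summaries, distance, tolerance) motivates the alternatives can be sketched generically. The normal-mean demonstration implied by the docstring, with its prior, tolerance and sample sizes, is an illustrative assumption, not an example from the paper:

```python
import random

def abc_rejection(observed, prior_sample, simulate, distance, tol,
                  n_draws=5000, seed=1):
    """Basic ABC rejection: draw theta from the prior, simulate data,
    and accept theta when the simulated summary lies within `tol` of
    the observed summary under `distance`. Returns accepted draws,
    an approximate sample from the posterior."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        sim = simulate(theta, rng)
        if distance(sim, observed) <= tol:
            accepted.append(theta)
    return accepted
```

Smaller tolerances trade acceptance rate for accuracy; choosing the summary statistic and distance well is exactly the difficulty the paper's bootstrap likelihood approach is designed to sidestep.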

9.
Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AICC, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AICC can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AICC avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AICC-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.
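For a linear smoother with hat matrix H (fitted values ŷ = Hy), the improved criterion of Hurvich, Simonoff and Tsai has the closed form AICC = log(σ̂²) + 1 + 2(tr H + 1)/(n − tr H − 2). A sketch, assuming a Gaussian-kernel Nadaraya-Watson smoother rather than the local quadratic or spline smoothers discussed in the paper:

```python
import math

def nw_hat_row(xs, x0, h):
    """Gaussian-kernel Nadaraya-Watson weights for the fit at x0."""
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    s = sum(w)
    return [wi / s for wi in w]

def aicc(xs, ys, h):
    """AICC for a linear smoother: log(sigma^2) + 1
    + 2*(tr(H) + 1) / (n - tr(H) - 2), where tr(H) is the
    effective number of parameters of the fit."""
    n = len(xs)
    fits, tr = [], 0.0
    for i, x0 in enumerate(xs):
        row = nw_hat_row(xs, x0, h)
        fits.append(sum(wi * yi for wi, yi in zip(row, ys)))
        tr += row[i]                       # diagonal entry of H
    sigma2 = sum((y - f) ** 2 for y, f in zip(ys, fits)) / n
    return math.log(sigma2) + 1 + 2 * (tr + 1) / (n - tr - 2)
```

Minimizing `aicc` over a bandwidth grid then selects the smoothing parameter; the formula applies unchanged to any linear smoother once tr H is available.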

10.
When spatial data are correlated, currently available data-driven smoothing parameter selection methods for nonparametric regression will often fail to provide useful results. The authors propose a method that adjusts the generalized cross-validation criterion for the effect of spatial correlation in the case of bivariate local polynomial regression. Their approach uses a pilot fit to the data and the estimation of a parametric covariance model. The method is easy to implement and leads to improved smoothing parameter selection, even when the covariance model is misspecified. The methodology is illustrated using water chemistry data collected in a survey of lakes in the Northeastern United States.

11.
A method is proposed for shape-constrained density estimation under a variety of constraints, including but not limited to unimodality, monotonicity, symmetry, and constraints on the number of inflection points of the density or its derivative. The method involves computing an adjustment curve that is used to bring a pre-existing pilot estimate into conformance with the specified shape restrictions. The pilot estimate may be obtained using any preferred estimator, and the optimal adjustment can be computed using fast, readily available quadratic programming routines. This makes the proposed procedure generic and easy to implement.

12.
Generalised linear models are frequently used to model the relationship between a response variable from the general exponential family and a set of predictor variables, where a linear combination of the predictors is linked to the mean of the response. We propose penalised spline (P-spline) estimation for generalised partially linear single-index models, which extend generalised linear models to include nonlinear effects for some predictors. The proposed models allow flexible dependence on some predictors while overcoming the “curse of dimensionality”. We investigate P-spline profile likelihood estimation using the readily available R package mgcv, leading to straightforward computation. Simulation studies are considered under various link functions, and we examine different choices of smoothing parameters. Simulation results and real data applications show the effectiveness of the proposed approach. Finally, some large sample properties are established.

13.
We propose a Bayesian nonparametric instrumental variable approach under additive separability that allows us to correct for endogeneity bias in regression models where the covariate effects enter with unknown functional form. Bias correction relies on a simultaneous equations specification with flexible modeling of the joint error distribution implemented via a Dirichlet process mixture prior. Both the structural and the instrumental variable equation are specified in terms of additive predictors comprising penalized splines for nonlinear effects of continuous covariates. Inference is fully Bayesian, employing efficient Markov chain Monte Carlo simulation techniques. The resulting posterior samples not only provide point estimates but also allow us to construct simultaneous credible bands for the nonparametric effects, including data-driven smoothing parameter selection. In addition, improved robustness properties are achieved due to the flexible error distribution specification. Both features are challenging in the classical framework, making the Bayesian approach advantageous. Simulations investigate the small-sample properties, and an investigation of the effect of class size on student performance in Israel illustrates the proposed approach, which is implemented in the R package bayesIV. Supplementary materials for this article are available online.

14.

Regression spline smoothing is a popular approach to nonparametric regression. An important issue associated with it is the choice of a "theoretically best" set of knots. Different statistical model selection methods, such as Akaike's information criterion and generalized cross-validation, have been applied to derive different "theoretically best" sets of knots. Typically, these best knot sets are defined implicitly as the optimizers of some objective function. Hence another equally important issue in regression spline smoothing is how to optimize such objective functions. In this article, different numerical algorithms designed for this optimization problem are compared by means of a simulation study; both univariate and bivariate smoothing settings are considered. Based on the simulation results, recommendations for choosing a suitable optimization algorithm under various settings are provided.

15.
In this paper we present a unified discussion of different approaches to the identification of smoothing spline analysis of variance (ANOVA) models: (i) the “classical” approach (in the line of Wahba in Spline Models for Observational Data, 1990; Gu in Smoothing Spline ANOVA Models, 2002; Storlie et al. in Stat. Sin., 2011) and (ii) the State-Dependent Regression (SDR) approach of Young in Nonlinear Dynamics and Statistics (2001). The latter is a nonparametric approach which is very similar to smoothing splines and kernel regression methods, but based on recursive filtering and smoothing estimation (the Kalman filter combined with fixed interval smoothing). We will show that SDR can be effectively combined with the “classical” approach to obtain a more accurate and efficient estimation of smoothing spline ANOVA models to be applied for emulation purposes. We will also show that such an approach can compare favorably with kriging.

16.
This paper deals with a general class of transformation models that contains many important semiparametric regression models as special cases. It develops a self-induced smoothing for the maximum rank correlation estimator, resulting in simultaneous point and variance estimation. The self-induced smoothing does not require bandwidth selection, yet provides the right amount of smoothness so that the estimator is asymptotically normal with mean zero (unbiased) and variance–covariance matrix consistently estimated by the usual sandwich-type estimator. An iterative algorithm is given for the variance estimation and shown to numerically converge to a consistent limiting variance estimator. The approach is applied to a data set involving survival times of primary biliary cirrhosis patients. Simulation results are reported, showing that the new method performs well under a variety of scenarios.
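The objects involved can be sketched as follows: the raw maximum rank correlation objective counts pairs whose ordering by the linear index agrees with the response ordering, and smoothing replaces the pairwise indicator with a normal CDF. The fixed bandwidth h below is a placeholder assumption; in self-induced smoothing the bandwidth is derived from the estimator's own sampling variability, which this sketch omits:

```python
import math

def mrc_objective(y, score):
    """Maximum rank correlation objective: fraction of ordered pairs
    (y_i > y_j) for which the linear index agrees (score_i > score_j)."""
    n, agree, pairs = len(y), 0, 0
    for i in range(n):
        for j in range(n):
            if i != j and y[i] > y[j]:
                pairs += 1
                if score[i] > score[j]:
                    agree += 1
    return agree / pairs

def smoothed_mrc(y, score, h=0.1):
    """Smoothed objective: the indicator 1{score_i > score_j} is
    replaced by Phi((score_i - score_j)/h), the normal CDF."""
    n, total, pairs = len(y), 0.0, 0
    for i in range(n):
        for j in range(n):
            if i != j and y[i] > y[j]:
                pairs += 1
                z = (score[i] - score[j]) / h
                total += 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return total / pairs
```

The smoothed surface is differentiable in the index coefficients, which is what makes the sandwich-type variance estimation in the paper tractable.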

17.
We study the coverage properties of Bayesian confidence intervals for the smooth component functions of generalized additive models (GAMs) represented using any penalized regression spline approach. The intervals are the usual generalization of the intervals first proposed by Wahba and Silverman in 1983 and 1985, respectively, to the GAM component context. We present simulation evidence showing these intervals have close to nominal 'across-the-function' frequentist coverage probabilities, except when the truth is close to a straight line/plane function. We extend the argument introduced by Nychka in 1988 for univariate smoothing splines to explain these results. The theoretical argument suggests that close to nominal coverage probabilities can be achieved, provided that heavy oversmoothing is avoided, so that the bias is not too large a proportion of the sampling variability. The theoretical results allow us to derive alternative intervals from a purely frequentist point of view, and to explain the impact that the neglect of smoothing parameter variability has on confidence interval performance. They also suggest switching the target of inference for component-wise intervals away from smooth components in the space of the GAM identifiability constraints.

18.
Cubic spline smoothing of hazard rate functions is evaluated through a simulation study. The smoothing algorithm requires unsmoothed time-point estimates of a hazard rate, variances of the estimators, and a smoothing parameter. Two unsmoothed estimators were compared (Kaplan-Meier-based and Nelson-based), as well as variations in the number of time-point estimates input to the algorithm. A cross-validated likelihood approach automated the selection of the smoothing parameter and the number of time-point estimates. The results indicated that, for a simple hazard shape, a wide range of smoothing parameter values and numbers of time points yields mean squared errors not much larger than those of parametric maximum likelihood estimators. For peaked hazards, however, it seems advisable to use the cross-validated likelihood approach in order to avoid oversmoothing.
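The unsmoothed input to such an algorithm can be sketched with the Nelson-Aalen estimator of the cumulative hazard; its increments at event times are the raw hazard estimates a spline smoother would then smooth. A minimal version, assuming distinct event times (no ties):

```python
def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard from right-censored data
    (event=1 means an observed failure, event=0 means censored).
    Assumes distinct observation times. Returns (time, H(time))
    pairs in time order."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    H, path = 0.0, []
    for i in order:
        if events[i]:
            H += 1.0 / at_risk   # hazard increment d_i / n_i
        path.append((times[i], H))
        at_risk -= 1             # this subject leaves the risk set
    return path
```

The corresponding Kaplan-Meier-based input would use increments of −log of the product-limit survival estimate; for small increments the two nearly coincide.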

19.
Longitudinal data frequently arise in various fields of applied science, where individuals are measured according to some ordered variable, e.g. time. A common approach to modeling such data is based on mixed models for repeated measures, which provide an eminently flexible framework for modeling a wide range of mean and covariance structures. However, such models are forced into a rigidly defined class of mathematical formulas, which may not be well supported by the data over the whole sequence of observations. A possible nonparametric alternative is the cubic smoothing spline, which is highly flexible and has useful smoothing properties. It can be shown that, under a normality assumption, the solution of the penalized log-likelihood equation is the cubic smoothing spline, and this solution can further be expressed as the solution of a linear mixed model. It is shown here how cubic smoothing splines can easily be used in the analysis of complete and balanced data. The analysis can be greatly simplified by using the unweighted estimator studied in the paper: it is shown that if the covariance structure of the random errors belongs to a certain class of matrices, the unweighted estimator is the solution to the penalized log-likelihood function. This result is new in the smoothing spline context and is not confined to growth curve settings. The connection to mixed models is used to develop a rough test of group profiles. Numerical examples illustrate the proposed techniques.

20.
The best linear unbiased predictor (BLUP) of the random parameter in a linear mixed model satisfies a linear constraint, which has been previously termed a built-in restriction. In other literature, constraints on the random parameter itself have been introduced into the modeling framework. The present article has two goals. First, it explores the idea of imposing the built-in restrictions on the BLUP as constraints on the random parameter. Second, it investigates the built-in restrictions satisfied by certain smoothing spline analysis of variance (SSANOVA) estimators, and compares these restrictions to arguably more natural side conditions on the ANOVA decomposition.
