Similar articles
20 similar articles found.
1.
The performance of nine different nonparametric regression estimates is empirically compared on ten different real datasets. The number of data points in the real datasets varies between 7,900 and 18,000, and each dataset contains between 5 and 20 variables. The nonparametric regression estimates include kernel, partitioning, nearest neighbor, additive spline, neural network, penalized smoothing spline, local linear kernel, regression tree, and random forest estimates. The main result is a table containing the empirical $L_2$ risks of all nine nonparametric regression estimates on the evaluation part of the different datasets. The neural networks and random forests are the two best-performing estimates. The datasets are publicly available, so any new regression estimate can easily be compared with all nine estimates considered in this article by applying it to the publicly available data and computing its empirical $L_2$ risks on the evaluation part of the datasets.

2.
Nonparametric regression techniques such as spline smoothing and local fitting depend implicitly on a parametric model. For instance, the cubic smoothing spline estimate of a regression function $\mu$ based on observations $(t_i, Y_i)$ is the minimizer of $\sum_i \{Y_i - \mu(t_i)\}^2 + \lambda \int (\mu'')^2$. Since $\int (\mu'')^2$ is zero when $\mu$ is a line, the cubic smoothing spline estimate favors the parametric model $\mu(t) = \alpha_0 + \alpha_1 t$. Here the authors consider replacing $\int (\mu'')^2$ with the more general expression $\int (L\mu)^2$, where $L$ is a linear differential operator with possibly nonconstant coefficients. The resulting estimate of $\mu$ performs well, particularly if $L\mu$ is small. They present an $O(n)$ algorithm for the computation of $\mu$. This algorithm is applicable to a wide class of $L$'s. They also suggest a method for the estimation of $L$. They study their estimates via simulation and apply them to several data sets.
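For reference, the two criteria contrasted in this abstract can be written side by side. This is only a sketch in standard notation: the symbol $\hat{\mu}_L$ for the generalized estimate is an assumption introduced here, and the interval of integration is left implicit, as in the abstract.

```latex
\hat{\mu} = \arg\min_{\mu} \sum_{i=1}^{n} \{Y_i - \mu(t_i)\}^2 + \lambda \int (\mu'')^2,
\qquad
\hat{\mu}_L = \arg\min_{\mu} \sum_{i=1}^{n} \{Y_i - \mu(t_i)\}^2 + \lambda \int (L\mu)^2 .
```

Taking $L = D^2$ (the second-derivative operator) recovers the cubic smoothing spline. More generally, the parametric model that the penalty leaves unpenalized is the null space $\{\mu : L\mu = 0\}$, which for $L = D^2$ is the set of straight lines.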

3.
This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression. Unlike most smoothing parameter selection studies, which determine a single optimal smoothing parameter, our focus here is on the prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step is to construct a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step is to average bootstrap smoothing spline estimates of different smoothness to form a final improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one-out cross-validation procedure. A simulation study, carried out with a program written in R, compares the well-known cross-validation (CV) and generalized cross-validation (GCV) methods with the proposed method. The new method is straightforward to implement and performs reliably in simulations.

4.
An important problem in fitting local linear regression is the choice of the smoothing parameter. As the smoothing parameter becomes large, the estimator tends to a straight line, which is the least squares fit in the ordinary linear regression setting. This property may be used to assess the adequacy of a simple linear model. Motivated by Silverman's (1981) work in kernel density estimation, a suitable test statistic is the critical smoothing parameter at which the estimate changes from nonlinear to linear, although deciding between linearity and nonlinearity requires a more precise judgment. We define the critical smoothing parameter through the approximate F-tests of Hastie and Tibshirani (1990). To assess the significance, the “wild bootstrap” procedure is used to replicate the data, and the proportion of bootstrap samples that give a nonlinear estimate when using the critical bandwidth is taken as the p-value. Simulation results show that the critical smoothing test is useful in detecting a wide range of alternatives.
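The limiting behavior described in this abstract (a local linear fit approaching the ordinary least squares line as the bandwidth grows) can be illustrated with a minimal sketch. The function name `local_linear`, the Gaussian kernel, and the simulated data are assumptions made for illustration; the abstract does not specify them, and this is not the authors' test procedure.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear estimate at each point of x0 (Gaussian kernel, bandwidth h)."""
    x0 = np.atleast_1d(x0).astype(float)
    est = np.empty(len(x0))
    for k, c in enumerate(x0):
        w = np.exp(-0.5 * ((x - c) / h) ** 2)           # kernel weights around c
        A = np.column_stack([np.ones_like(x), x - c])   # local design: intercept, slope
        WA = A * w[:, None]
        beta = np.linalg.solve(A.T @ WA, WA.T @ y)      # weighted least squares
        est[k] = beta[0]                                # intercept = fitted value at c
    return est

# Hypothetical data: a sine curve plus noise.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(200)

fit_small = local_linear(x, y, x, h=0.05)    # small bandwidth: follows the curve
fit_large = local_linear(x, y, x, h=1e4)     # huge bandwidth: nearly a straight line
ols_line = np.polyval(np.polyfit(x, y, 1), x)
```

With a very large bandwidth the kernel weights are nearly constant, so the weighted fit coincides with the global least squares line up to numerical error; a small bandwidth lets the estimate track the nonlinear signal.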

5.
Spatially-adaptive Penalties for Spline Fitting
The paper studies spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. The estimates are $p$th degree piecewise polynomials with $p - 1$ continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty on the jumps of the $p$th derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline but with relatively few knots and with values at the knots chosen to minimize the generalized cross validation (GCV) criterion. This locally-adaptive spline estimator is compared with other spline estimators in the literature such as cubic smoothing splines and knot-selection techniques for least squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions, empirical Bayes confidence intervals using this prior achieve better pointwise coverage probabilities than confidence intervals based on a global-penalty parameter. The method is developed first for univariate models and then extended to additive models.

6.
Local polynomial regression is a useful non-parametric regression tool for exploring fine data structures and has been widely used in practice. We propose a new non-parametric regression technique called local composite quantile regression smoothing to improve local polynomial regression further. Sampling properties of the proposed estimation procedure are studied. We derive the asymptotic bias, variance and normality of the proposed estimate. The asymptotic relative efficiency of the estimate with respect to local polynomial regression is investigated. It is shown that the estimate can be much more efficient than the local polynomial regression estimate for various non-normal errors, while being almost as efficient as the local polynomial regression estimate for normal errors. Simulation is conducted to examine the performance of the proposed estimates. The simulation results are consistent with our theoretical findings. A real data example is used to illustrate the proposed method.

7.
Pricing of American options in discrete time is considered, where the option is allowed to be based on several underlying stocks. It is assumed that the price processes of the underlying stocks are given by Markov processes. We use the Monte Carlo approach to generate artificial sample paths of these price processes, and then we use nonparametric regression estimates to estimate from these data the so-called continuation values, which are defined as mean values of the American option for given values of the underlying stocks at time t subject to the constraint that the option is not exercised at time t. As nonparametric regression estimates we use least squares estimates with complexity penalties, which include as special cases least squares spline estimates, least squares neural networks, smoothing splines and orthogonal series estimates. General results concerning the rate of convergence are presented and applied to derive results for the special cases mentioned above. Furthermore, the pricing of American options is illustrated using simulated data.

8.
The P-splines of Eilers and Marx (Stat Sci 11:89–121, 1996) combine a B-spline basis with a discrete quadratic penalty on the basis coefficients, to produce a reduced rank spline-like smoother. P-splines have three properties that make them very popular as reduced rank smoothers: (i) the basis and the penalty are sparse, enabling efficient computation, especially for Bayesian stochastic simulation; (ii) it is possible to flexibly ‘mix-and-match’ the order of B-spline basis and penalty, rather than the order of penalty controlling the order of the basis as in spline smoothing; (iii) it is very easy to set up the B-spline basis functions and penalties. The discrete penalties are somewhat less interpretable in terms of function shape than the traditional derivative based spline penalties, but tend towards penalties proportional to traditional spline penalties in the limit of large basis size. However, part of the point of P-splines is not to use a large basis size. In addition, the spline basis functions arise from solving functional optimization problems involving derivative based penalties, so moving to discrete penalties for smoothing may not always be desirable. The purpose of this note is to point out that the three properties of basis-penalty sparsity, mix-and-match penalization and ease of setup are readily obtainable with B-splines subject to derivative based penalization. The penalty setup typically requires a few lines of code, rather than the two lines typically required for P-splines, but this one-off disadvantage seems to be the only one associated with using derivative based penalties. As an example application, it is shown how basis-penalty sparsity enables efficient computation with tensor product smoothers of scattered data.
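To make the P-spline construction in the first sentence of this abstract concrete, the following is a minimal sketch of a B-spline basis on equally spaced knots combined with a discrete difference penalty on the coefficients. It illustrates the discrete-penalty baseline only, not the derivative based penalty the note advocates. The function names, default settings, and simulated data are assumptions made for illustration, and dense matrices are used for brevity even though the basis and penalty are banded.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    # Degree 0: indicator of each half-open knot interval [t_j, t_{j+1}).
    B = np.array([(x >= knots[j]) & (x < knots[j + 1])
                  for j in range(len(knots) - 1)], dtype=float).T
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), len(knots) - d - 1))
        for j in range(len(knots) - d - 1):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - x) / right * B[:, j + 1]
        B = B_new
    return B

def pspline_fit(x, y, lo, hi, n_segments=20, degree=3, penalty_order=2, lam=1.0):
    """Penalized least squares fit: minimize ||y - B a||^2 + lam * ||D a||^2."""
    h = (hi - lo) / n_segments
    # Equally spaced knots, extended by `degree` intervals on each side.
    knots = lo + h * np.arange(-degree, n_segments + degree + 1)
    B = bspline_basis(x, knots, degree)
    # The "two lines" of penalty setup: a difference matrix of the chosen order.
    D = np.diff(np.eye(B.shape[1]), n=penalty_order, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ coef, coef

# Hypothetical data strictly inside the knot range [0, 1].
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.01, 0.99, 200))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(200)
fitted, coef = pspline_fit(x, y, 0.0, 1.0, lam=1.0)
```

The basis degree and the penalty order are separate arguments, which is the ‘mix-and-match’ property mentioned in the abstract. With a second-order penalty and a very large `lam`, the fit approaches the least squares straight line.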

9.
Time series smoothers estimate the level of a time series at time t as its conditional expectation given present, past and future observations, with the smoothed value depending on the estimated time series model. Alternatively, local polynomial regressions on time can be used to estimate the level, with the implied smoothed value depending on the weight function and the bandwidth in the local linear least squares fit. In this article we compare the two smoothing approaches and describe their similarities. Through simulations, we assess the increase in the mean square error that results when approximating the estimated optimal time series smoother with the local regression estimate of the level.

10.
The least-absolute-deviation estimate of a monotone regression function on an interval has been studied in the literature. If the observation points become dense in the interval, the almost sure rate of convergence has been shown to be $O(n^{1/4})$. Applying the techniques used by Brunk (1970, Nonparametric Techniques in Statistical Inference, Cambridge Univ. Press), the asymptotic distribution of the $l_1$ estimator at a point is obtained. If the underlying regression function has positive slope at the point, the rate of convergence is seen to be $O(n^{1/3})$. Monotone percentile regression estimates are also considered.

11.
In this paper, under a nonparametric regression model, we introduce two families of robust procedures to estimate the regression function when missing data occur in the response. The first proposal is based on a local MM-functional applied to the conditional distribution function estimate adapted to the presence of missing data. The second proposal imputes the missing responses using the local MM-smoother based on the observed sample and then estimates the regression function with the completed sample. We show that the robust procedures considered are consistent and asymptotically normally distributed. A robust procedure to select the smoothing parameter is also discussed.

12.
In order to study developmental variables, for example, neuromotor development of children and adolescents, monotone fitting is typically needed. Most methods for estimating a monotone regression function non-parametrically, however, are not straightforward to implement, a difficult issue being the choice of smoothing parameters. In this paper, a convenient implementation of the monotone B-spline estimates of Ramsay [Monotone regression splines in action (with discussion), Stat. Sci. 3 (1988), pp. 425–461] and Kelly and Rice [Monotone smoothing with application to dose-response curves and the assessment of synergism, Biometrics 46 (1990), pp. 1071–1085] is proposed and applied to neuromotor data. Knots are selected adaptively using ideas found in Friedman and Silverman [Flexible parsimonious smoothing and additive modelling (with discussion), Technometrics 31 (1989), pp. 3–39], yielding a flexible algorithm to automatically and accurately estimate a monotone regression function. Using splines also allows other aspects to be included in the estimation problem simultaneously, such as modeling a constant difference between two groups or a known jump in the regression function. Finally, an estimate which is not only monotone but also has a ‘levelling-off’ (i.e. becomes constant after some point) is derived. This is useful when the developmental variable is known to attain a maximum/minimum within the interval of observation.

13.
Heteroscedasticity is common when a linear regression model is applied to real-world problems. Accurately estimating the variance function of the error term in a heteroscedastic linear regression model is therefore of great importance for obtaining efficient estimates of the regression parameters and making valid statistical inferences. A method for estimating the variance function of heteroscedastic linear regression models is proposed in this article based on the variance-reduced local linear smoothing technique. Simulations and comparisons with other methods are conducted to assess the performance of the proposed method. The results demonstrate that the proposed method can accurately estimate the variance function and therefore produce more efficient estimates of the regression parameters.

14.
Suppose we have $\{(x_i, y_i)\}$, $i = 1, 2, \ldots, n$, a sequence of independent observations. We wish to find approximate $1 - \alpha$ simultaneous confidence bands for the regression curve. Many previous confidence bands in the literature have practical difficulties. In this article, the local linear smoother is used to estimate the regression curve. The bias of the estimator is considered. Different methods of constructing confidence bands are discussed. Finally, a possible method incorporating logistic regression in an innovative way is proposed to construct the bands for random designs. Simulations are used to study the performance and properties of the methods. The procedure for constructing confidence bands is entirely data-driven. The advantage of the proposed method is that it is simple to use and can be applied to random designs. It can be considered a practically useful and efficient method.

15.
Suppose one estimates the coefficient $\beta_2$ in $E[Y] = \beta_0 + \beta_1 X_1 + \beta_2 X_2$ by stagewise regression. That is, first the model $E[Y] \cong \beta_0 + \beta_1 X_1$ is fit using simple linear regression, followed by a simple linear regression of the residuals from this model on $X_2$ to yield the estimator $\hat{\beta}_2$. The ratio of the squared t statistic for the estimate $b_2$ from multiple regression to the squared t statistic for $\hat{\beta}_2$ is greater than or equal to 1.0 and is shown to be a convenient function of correlation coefficients among $Y$, $X_1$, and $X_2$. Examination of stagewise regression can provide useful insights when introducing concepts of multiple regression.
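The comparison in this abstract can be reproduced numerically with a short sketch. The helper `ols`, the simulated data, and the degrees-of-freedom convention for the stagewise t statistic (n − 2 here, from the second simple regression) are assumptions made for illustration; the abstract does not state which convention the authors use.

```python
import numpy as np

def ols(X, y):
    """Least squares with intercept; returns coefficients, t statistics, residuals."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = resid @ resid / (len(y) - A.shape[1])   # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return coef, coef / np.sqrt(np.diag(cov)), resid

# Hypothetical data with correlated predictors.
rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
x2 = 0.6 * x1 + rng.standard_normal(n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.standard_normal(n)

# Multiple regression of Y on X1 and X2.
coef_m, t_m, _ = ols(np.column_stack([x1, x2]), y)
b2, t2 = coef_m[2], t_m[2]

# Stagewise: regress Y on X1, then regress the residuals on X2.
_, _, resid1 = ols(x1, y)
coef_s, t_s, _ = ols(x2, resid1)
b2_stage, t2_stage = coef_s[1], t_s[1]

r12 = np.corrcoef(x1, x2)[0, 1]
ratio = (t2 / t2_stage) ** 2   # squared-t ratio discussed in the abstract
```

In this setup the stagewise slope equals the multiple regression slope shrunk by the factor $1 - r_{12}^2$, where $r_{12}$ is the sample correlation between $X_1$ and $X_2$, so the two estimates agree only when the predictors are uncorrelated.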

16.
Girma Taye, Statistics, 2013, 47(3): 275-289
Fertility trends within blocks and local variation are the major obstacles to estimating cultivar contrasts in agricultural field trials. This paper examines methods of smoothing fertility trends in field trials using the P-spline. We begin by smoothing the trend within each block, and proceed to demonstrate how this can be extended to smooth trends in trials with a two-dimensional setting. We propose simultaneous modelling of trends and local variation. We use the Papadakis [J.S. Papadakis, Comparaison de différentes méthodes d'expérimentation phytotechnique, Rev. Argen. Agronom. 7 (1940), pp. 297–362.] and kriged covariates to model local variation. We emphasize the benefit of using the P-spline as a compromise between parametric and non-parametric approaches. Data sets from wheat and barley trials, designed as randomized complete block and row-column designs, are analyzed. We set out a simple strategy for choosing between the additive model and the two-dimensional setting. We explore different estimation methods and offer some generalizations. The results show the importance of the P-spline in modelling trend and the need to choose between additive and two-dimensional settings.

17.
We present methods for modeling and estimation of a concurrent functional regression when the predictors and responses are two-dimensional functional datasets. The implementations use spline basis functions, and model fitting is based on smoothing penalties and mixed model estimation. The proposed methods are implemented in available statistical software, allow the construction of confidence intervals for the bivariate model parameters, and can be applied to completely or sparsely sampled responses. The methods are tested on simulated data and show favorable results in practice. The usefulness of the methods is illustrated in an application to environmental data.

18.
The smoothing spline method is used to fit a curve to a noisy data set, where selection of the smoothing parameter is essential. An adaptive $C_p$ criterion (Chen and Huang 2011; C. S. Chen and H. C. Huang, An improved Cp criterion for spline smoothing, Journal of Statistical Planning and Inference 141:44552) based on Stein's unbiased risk estimate has been proposed to select the smoothing parameter; it not only considers the usual effective degrees of freedom but also takes into account the selection variability. The resulting fitted curve has been shown to be superior to, and more stable than, those obtained with commonly used selection criteria, and it possesses the same asymptotic optimality as $C_p$. In this paper, we further discuss some characteristics of the selection of the smoothing parameter, especially the selection variability.

19.
Many different methods have been proposed to construct nonparametric estimates of a smooth regression function, including local polynomial, (convolution) kernel and smoothing spline estimators. Each of these estimators uses a smoothing parameter to control the amount of smoothing performed on a given data set. In this paper an improved version of a criterion based on the Akaike information criterion (AIC), termed AICC, is derived and examined as a way to choose the smoothing parameter. Unlike plug-in methods, AICC can be used to choose smoothing parameters for any linear smoother, including local quadratic and smoothing spline estimators. The use of AICC avoids the large variability and tendency to undersmooth (compared with the actual minimizer of average squared error) seen when other 'classical' approaches (such as generalized cross-validation (GCV) or the AIC) are used to choose the smoothing parameter. Monte Carlo simulations demonstrate that the AICC-based smoothing parameter is competitive with a plug-in method (assuming that one exists) when the plug-in method works well but also performs well when the plug-in approach fails or is unavailable.
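Because the criterion applies to any linear smoother with fitted values $Hy$, bandwidth selection by AICC can be sketched in a few lines. The formula used below, $\log \hat{\sigma}^2 + 1 + 2\{\mathrm{tr}(H) + 1\}/\{n - \mathrm{tr}(H) - 2\}$ with $\hat{\sigma}^2$ the mean squared residual, is the commonly quoted form of the corrected criterion and is stated here from general knowledge rather than from the abstract. The Gaussian-kernel Nadaraya-Watson smoother, the function names, and the data are assumptions made for illustration.

```python
import numpy as np

def nw_smoother_matrix(x, h):
    """Hat matrix H of a Gaussian-kernel Nadaraya-Watson smoother (fitted = H @ y)."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def aicc(y, H):
    """Corrected AIC for a linear smoother with hat matrix H (assumed form, see text)."""
    n = len(y)
    tr = np.trace(H)
    if n - tr - 2.0 <= 0:           # criterion undefined when tr(H) is too close to n
        return np.inf
    resid = y - H @ y
    sigma2 = resid @ resid / n      # mean squared residual
    return np.log(sigma2) + 1.0 + 2.0 * (tr + 1.0) / (n - tr - 2.0)

def select_bandwidth(x, y, grid):
    """Return the bandwidth in `grid` minimizing AICC, plus all scores."""
    scores = [aicc(y, nw_smoother_matrix(x, h)) for h in grid]
    return grid[int(np.argmin(scores))], scores

# Hypothetical data and bandwidth grid.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 1.0, 150))
y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(150)
grid = np.linspace(0.01, 0.5, 30)
h_best, scores = select_bandwidth(x, y, grid)
```

The penalty term grows sharply as $\mathrm{tr}(H)$ approaches $n$, which is what discourages very small bandwidths; the guard returns infinity where the denominator is not positive.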

20.
Straightforward intermediate rank tensor product smoothing in mixed models
Tensor product smooths provide the natural way of representing smooth interaction terms in regression models because they are invariant to the units in which the covariates are measured, hence avoiding the need for arbitrary decisions about relative scaling of variables. They would also be the natural way to represent smooth interactions in mixed regression models, but for the fact that the tensor product constructions proposed to date are difficult or impossible to estimate using most standard mixed modelling software. This paper proposes a new approach to the construction of tensor product smooths, which allows the smooth to be written as the sum of some fixed effects and some sets of i.i.d. Gaussian random effects: no previously published construction achieves this. Because of the simplicity of this random effects structure, our construction is useable with almost any flexible mixed modelling software, allowing smooth interaction terms to be readily incorporated into any Generalized Linear Mixed Model. To achieve the computationally convenient separation of smoothing penalties, the construction differs from previous tensor product approaches in the penalties used to control smoothness, but the penalties have the advantage over several alternative approaches of being explicitly interpretable in terms of function shape. Like all tensor product smoothing methods, our approach builds up smooth functions of several variables from marginal smooths of lower dimension, but unlike much of the previous literature we treat the general case in which the marginal smooths can be any quadratically penalized basis expansion, and there can be any number of them. We also point out that the imposition of identifiability constraints on smoothers requires more care in the mixed model setting than it would in a simple additive model setting, and show how to deal with the issue. An interesting side effect of our construction is that an ANOVA-decomposition of the smooth can be read off from the estimates, although this is not our primary focus. We were motivated to undertake this work by applied problems in the analysis of abundance survey data, and two examples of this are presented.
