Similar Articles
1.
This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression. Unlike most smoothing parameter selection studies, which seek a single optimum smoothing parameter, our focus here is on prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step is to construct a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step is to average the bootstrap smoothing spline estimates of different smoothness to form a final, improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one-out cross-validation procedure. A simulation study, performed with a program written in R, compares the well-known cross-validation (CV) and generalized cross-validation (GCV) criteria with the proposed method. The new method is straightforward to implement and performs reliably in simulations.
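The two-step procedure can be sketched as follows. This is a minimal Python illustration, not the authors' R implementation: a Nadaraya–Watson kernel smoother stands in for smoothing splines, a fixed bandwidth grid stands in for per-sample smoothing parameter selection, and the bootstrap fits are averaged with equal weights rather than cross-validated weights.

```python
import math
import random

def nw_smooth(xs, ys, bandwidth):
    """Return a Nadaraya-Watson kernel smoother fitted to (xs, ys)."""
    def predict(x0):
        w = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
        return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    return predict

def bootstrap_averaged_smoother(xs, ys, bandwidths, B=50, seed=0):
    """Step 1: fit a smoother to each nonparametric bootstrap sample,
    cycling through a bandwidth grid (a stand-in for choosing an
    appropriate smoothing parameter per sample).
    Step 2: average the B bootstrap fits into the final estimate
    (equal weights here; the paper estimates weights by
    delete-one-out cross-validation)."""
    rng = random.Random(seed)
    n = len(xs)
    fits = []
    for b in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [xs[i] for i in idx]
        yb = [ys[i] for i in idx]
        fits.append(nw_smooth(xb, yb, bandwidths[b % len(bandwidths)]))
    return lambda x0: sum(f(x0) for f in fits) / len(fits)
```

Averaging fits of different smoothness trades a little bias for a reduction in the variance that any single bandwidth choice would incur.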

2.
In a single index Poisson regression model with unknown link function, the index parameter can be root-n consistently estimated by the method of pseudo maximum likelihood. In this paper, we study, by simulation arguments, the practical validity of the asymptotic behaviour of the pseudo maximum likelihood index estimator and of some associated cross-validation bandwidths. A robust practical rule for implementing the pseudo maximum likelihood estimation method is suggested, which uses the bootstrap for estimating the variance of the index estimator and a variant of bagging for numerically stabilizing that variance. Our method gives reasonable results even for moderately sized samples; thus, it can be used for statistical inference in practical situations. The procedure is illustrated through a real data example.

3.
We propose a new criterion for model selection in prediction problems. The covariance inflation criterion adjusts the training error by the average covariance of the predictions and responses, when the prediction rule is applied to permuted versions of the data set. This criterion can be applied to general prediction problems (e.g. regression or classification) and to general prediction rules (e.g. stepwise regression, tree-based models and neural nets). As a by-product we obtain a measure of the effective number of parameters used by an adaptive procedure. We relate the covariance inflation criterion to other model selection procedures and illustrate its use in some regression and classification problems. We also revisit the conditional bootstrap approach to model selection.
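A covariance-inflation-style adjustment can be sketched for the simplest prediction rule, a least squares line. This is an illustrative Python sketch of the idea (training error plus a covariance penalty estimated on permuted data), not the paper's exact criterion or constants.

```python
import random

def fit_ls(xs, ys):
    """Simple least squares line, the prediction rule being evaluated."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def covariance_inflation(xs, ys, fit, n_perm=200, seed=0):
    """Adjust the training error by (twice) the average covariance of
    predictions and responses when the rule is refit to permuted data,
    in the spirit of the covariance inflation criterion."""
    rng = random.Random(seed)
    n = len(xs)
    f = fit(xs, ys)
    train_err = sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / n
    covs = []
    for _ in range(n_perm):
        yp = ys[:]
        rng.shuffle(yp)                 # break the x-y association
        fp = fit(xs, yp)                # refit the rule to permuted data
        preds = [fp(x) for x in xs]
        mp, mY = sum(preds) / n, sum(yp) / n
        covs.append(sum((p - mp) * (y - mY) for p, y in zip(preds, yp)) / n)
    return train_err + 2 * sum(covs) / len(covs)
```

The covariance term measures how much the adaptive rule chases noise; for least squares it is non-negative, so the criterion inflates the optimistic training error.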

4.
An important problem in fitting local linear regression is the choice of the smoothing parameter. As the smoothing parameter becomes large, the estimator tends to a straight line, which is the least squares fit in the ordinary linear regression setting. This property may be used to assess the adequacy of a simple linear model. Motivated by Silverman's (1981) work in kernel density estimation, a suitable test statistic is the critical smoothing parameter at which the estimate changes from nonlinear to linear, although judging linearity versus nonlinearity calls for a precise criterion. We define the critical smoothing parameter through the approximate F-tests of Hastie and Tibshirani (1990). To assess significance, the “wild bootstrap” procedure is used to replicate the data, and the proportion of bootstrap samples that give a nonlinear estimate at the critical bandwidth is taken as the p-value. Simulation results show that the critical smoothing test is useful in detecting a wide range of alternatives.

5.
Calibration techniques in survey sampling, such as generalized regression estimation (GREG), were formalized in the 1990s to produce efficient estimators of linear combinations of study variables, such as totals or means. They implicitly rely on the assumption of a linear regression model between the variable of interest and some auxiliary variables, yielding estimates with lower variance if the model is true while remaining approximately design-unbiased even if the model does not hold. We propose a new class of model-assisted estimators obtained by releasing a few calibration constraints and replacing them with a penalty term, which is added to the distance criterion to be minimized. By introducing the concept of penalized calibration, combining usual calibration and this ‘relaxed’ calibration, we are able to adjust the weight given to the available auxiliary information. We obtain a more flexible estimation procedure that gives better estimates, particularly when the auxiliary information is overly abundant or not fully appropriate for complete use. Such an approach can also be seen as a design-based alternative to estimation procedures based on the more general class of mixed models, opening new prospects in areas of application such as inference on small domains.

6.
Although bootstrapping has become widely used in statistical analysis, little has been reported concerning bootstrapped Bayesian analyses, especially when there is proper prior information concerning the parameter of interest. In this paper, we first propose an operationally implementable definition of a Bayesian bootstrap. Thereafter, in simulated studies of the estimation of means and variances, this Bayesian bootstrap is compared to various parametric procedures. It turns out that little information is lost in using the Bayesian bootstrap even when the sampling distribution is known. On the other hand, the parametric procedures are at times very sensitive to incorrectly specified sampling distributions, implying that the Bayesian bootstrap is a very robust procedure for determining the posterior distribution of the parameter.
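For concreteness, the best-known form of the Bayesian bootstrap, Rubin's (1981) version, can be sketched as follows: each posterior draw reweights the observations with flat Dirichlet(1, …, 1) weights (obtained by normalizing unit exponentials). Note this is a generic sketch; the paper proposes its own definition accommodating proper prior information, which differs.

```python
import random

def bayesian_bootstrap_mean(data, n_draws=2000, seed=0):
    """Posterior draws of the mean under Rubin-style Bayesian bootstrap:
    each draw assigns the observations Dirichlet(1, ..., 1) weights
    (normalized unit-rate exponentials) and records the weighted mean."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        g = [rng.expovariate(1.0) for _ in data]   # Gamma(1) variates
        s = sum(g)
        draws.append(sum(gi * xi for gi, xi in zip(g, data)) / s)
    return draws
```

The empirical distribution of `draws` approximates the posterior of the mean without any parametric assumption on the sampling distribution, which is the robustness property the abstract highlights.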

7.
Conditional probability distributions have been commonly used in modeling Markov chains. In this paper we consider an alternative approach based on copulas to investigate Markov-type dependence structures. Based on the realization of a single Markov chain, we estimate the parameters using one- and two-stage estimation procedures. We derive asymptotic properties of the marginal and copula parameter estimators and compare the performance of the estimation procedures based on Monte Carlo simulations. At low and moderate dependence levels, the two-stage estimation performs comparably to maximum likelihood estimation. In addition, we propose a parametric pseudo-likelihood ratio test for copula model selection under the two-stage procedure. We apply the proposed methods to an environmental data set.

8.
Alternative methods of estimating properties of unknown distributions include the bootstrap and the smoothed bootstrap. In the standard bootstrap setting, Johns (1988) introduced an importance resampling procedure that results in more accurate approximation to the bootstrap estimate of a distribution function or a quantile. With a suitable “exponential tilting” similar to that used by Johns, we derive a smoothed version of importance resampling in the framework of the smoothed bootstrap. Smoothed importance resampling procedures are developed for estimating the distribution functions of the Studentized mean, the Studentized variance, and the correlation coefficient. Implementation of these procedures is presented via simulation results, which concentrate on estimating the distribution functions of the Studentized mean and Studentized variance for different sample sizes and various pre-specified smoothing bandwidths for normal data. Additional simulations were conducted for estimating quantiles of the distribution of the Studentized mean under an optimal smoothing bandwidth when the original data were simulated from three different parent populations: lognormal, t(3) and t(10). These results suggest that, in cases where it is advantageous to use the smoothed bootstrap rather than the standard bootstrap, the amount of resampling necessary can be substantially reduced by importance resampling, with efficiency gains that depend on the bandwidth used in the kernel density estimation.
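The underlying idea of exponentially tilted importance resampling, in the standard (unsmoothed) bootstrap setting, can be sketched as follows. The tilt parameter `lam` and the tail-probability target are illustrative choices for this sketch, not quantities from the paper.

```python
import math
import random

def tilted_tail_prob(data, threshold, lam, B=2000, seed=0):
    """Importance-resampling estimate of the bootstrap probability
    P*(resampled mean > threshold).  Observations are drawn with
    exponentially tilted probabilities p_i proportional to
    exp(lam * x_i); each replicate is reweighted by the likelihood
    ratio (1/n)^n / prod(p_i).  Setting lam = 0 recovers ordinary
    bootstrap resampling."""
    rng = random.Random(seed)
    n = len(data)
    w = [math.exp(lam * x) for x in data]
    s = sum(w)
    p = [wi / s for wi in w]
    total = 0.0
    for _ in range(B):
        idx = rng.choices(range(n), weights=p, k=n)
        if sum(data[i] for i in idx) / n > threshold:
            total += math.prod((1.0 / n) / p[i] for i in idx)
    return total / B
```

Tilting toward the tail of interest makes the rare event common under resampling, so far fewer replicates are needed for the same accuracy; the likelihood-ratio weights keep the estimate unbiased for the untilted bootstrap probability.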

9.
Simple nonparametric estimates of the conditional distribution of a response variable given a covariate are often useful for data exploration purposes or to help with the specification or validation of a parametric or semi-parametric regression model. In this paper we propose such an estimator in the case where the response variable is interval-censored and the covariate is continuous. Our approach consists in adding weights that depend on the covariate value in the self-consistency equation proposed by Turnbull (J R Stat Soc Ser B 38:290–295, 1976), which results in an estimator that is no more difficult to implement than Turnbull’s estimator itself. We show the convergence of our algorithm and that our estimator reduces to the generalized Kaplan–Meier estimator (Beran, Nonparametric regression with randomly censored survival data, 1981) when the data are either complete or right-censored. We demonstrate by simulation that the estimator, bootstrap variance estimation and bandwidth selection (by rule of thumb or cross-validation) all perform well in finite samples. We illustrate the method by applying it to a dataset from a study on the incidence of HIV in a group of female sex workers from Kinshasa.

10.
This paper considers nonlinear regression models in which neither the response variable nor the covariates can be directly observed, but are measured with both multiplicative and additive distortion measurement errors. We propose conditional variance and conditional mean calibration estimation methods for the unobserved variables; a nonlinear least squares estimator is then proposed. For hypothesis testing of the parameters, a restricted estimator under the null hypothesis and a test statistic are proposed. The asymptotic properties of the estimator and test statistic are established. Lastly, a residual-based empirical process test statistic marked by proper functions of the regressors is proposed for the model checking problem. We further suggest a bootstrap procedure to calculate critical values. Simulation studies demonstrate the performance of the proposed procedure, and a real example is analysed to illustrate its practical usage.

11.
Introducing model uncertainty by moving blocks bootstrap
It is common in the parametric bootstrap to select the model from the data and then treat it as if it were the true model. Chatfield (1993, 1996) has shown that ignoring model uncertainty may seriously undermine the coverage accuracy of prediction intervals. In this paper, we propose a method based on the moving block bootstrap for introducing the model selection step into the resampling algorithm. We present a Monte Carlo study comparing the finite sample properties of the proposed method with those of alternative methods in the case of prediction intervals.
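A single moving-block bootstrap replicate can be generated as follows. This is a minimal sketch of the resampling step only; the paper's procedure additionally re-runs the model selection step on each replicate.

```python
import random

def moving_block_bootstrap(series, block_len, seed=0):
    """One moving-block bootstrap replicate: overlapping blocks of
    length block_len are drawn with replacement and concatenated,
    preserving short-range dependence within each block."""
    rng = random.Random(seed)
    n = len(series)
    blocks = [series[i:i + block_len] for i in range(n - block_len + 1)]
    out = []
    while len(out) < n:
        out.extend(rng.choice(blocks))
    return out[:n]
```

Unlike i.i.d. resampling, drawing whole contiguous blocks keeps the serial dependence of the original series inside each block, which is what makes the scheme valid for time series.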

12.
A multivariate “errors in variables” regression model is proposed which generalizes a model previously considered by Gleser and Watson (1973). Maximum likelihood estimators (MLEs) for the parameters of this model are obtained, and the consistency properties of these estimators are investigated. The distribution of the MLE of the “error” variance is obtained in a simple case, while the mean and the variance of the estimator are obtained in this case without appealing to the exact distribution.

13.
This paper is concerned with the problem of constructing a good predictive distribution relative to the Kullback–Leibler information in a linear regression model. The problem is equivalent to the simultaneous estimation of regression coefficients and error variance in terms of a complicated risk, which yields a new challenging issue in a decision-theoretic framework. An estimator of the variance is incorporated here into a loss for estimating the regression coefficients. Several estimators of the variance and of the regression coefficients are proposed and shown to improve on usual benchmark estimators both analytically and numerically. Finally, the prediction problem of a distribution is noted to be related to an information criterion for model selection like the Akaike information criterion (AIC). Thus, several AIC variants are obtained based on proposed and improved estimators and are compared numerically with AIC as model selection procedures.

14.
Conventional approaches for inference about efficiency in parametric stochastic frontier (PSF) models are based on percentiles of the estimated distribution of the one-sided error term, conditional on the composite error. When used as prediction intervals, coverage is poor when the signal-to-noise ratio is low, but improves slowly as sample size increases. We show that prediction intervals estimated by bagging yield much better coverages than the conventional approach, even with low signal-to-noise ratios. We also present a bootstrap method that gives confidence interval estimates for (conditional) expectations of efficiency, and which have good coverage properties that improve with sample size. In addition, researchers who estimate PSF models typically reject models, samples, or both when residuals have skewness in the “wrong” direction, i.e., in a direction that would seem to indicate absence of inefficiency. We show that correctly specified models can generate samples with “wrongly” skewed residuals, even when the variance of the inefficiency process is nonzero. Both our bagging and bootstrap methods provide useful information about inefficiency and model parameters irrespective of whether residuals have skewness in the desired direction.
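As a generic illustration of the bootstrap/bagging percentile-interval machinery these methods build on (not the PSF-specific construction in the paper, which conditions on the composite error):

```python
import random

def bagged_interval(data, stat, alpha=0.10, B=1000, seed=0):
    """Percentile interval from B bootstrap (bagging) replicates of a
    statistic: resample the data with replacement, recompute the
    statistic each time, and read off the alpha/2 and 1-alpha/2
    empirical quantiles of the replicates."""
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)
    )
    lower = reps[int(B * alpha / 2)]
    upper = reps[int(B * (1 - alpha / 2)) - 1]
    return lower, upper
```

Because the interval is built from the empirical distribution of the bagged replicates rather than from an asymptotic approximation, its coverage degrades more gracefully in low signal-to-noise settings, which is the behaviour the abstract reports for the PSF case.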

15.
In this article we investigate the problem of ascertaining A- and D-optimal designs in a cubic regression model with random coefficients. Our interest lies in estimation of all the parameters or in only those except the intercept term. Assuming the variance ratios to be known, we tabulate D-optimal designs for various combinations of the variance ratios. A-optimality does not pose any new problem in the random coefficients situation.

16.
We consider the simple linear calibration problem where only the response y of the regression line y = β0 + β1t is observed with errors. The experimental conditions t are observed without error. For the errors of the observations y we assume that there may be some gross errors producing outlying observations. This situation can be modeled by a conditionally contaminated regression model. In this model the classical calibration estimator based on the least squares estimator has an unbounded asymptotic bias. Therefore we introduce calibration estimators based on robust one-step M-estimators, which have a bounded asymptotic bias. For this class of estimators we discuss two problems: the optimal estimators and their corresponding optimal designs. We derive the locally optimal solutions and show that the maximin efficient designs for non-robust estimation and robust estimation coincide.

17.
The authors consider doubly robust estimation of a regression parameter defined by an estimating equation in a surrogate outcome set‐up. Under a correct specification of the propensity score, the proposed estimator has the smallest trace of the asymptotic covariance matrix whether the “working outcome regression model” involved is specified correctly or not, and it is particularly useful when that model is misspecified. Simulations are conducted to examine the finite sample performance of the proposed procedure. Data on obesity and high blood pressure are analyzed for illustration. The Canadian Journal of Statistics 38: 633–646; 2010 © 2010 Statistical Society of Canada

18.
Scientific experiments commonly result in clustered discrete and continuous data. Existing methods for analyzing such data include the use of quasi-likelihood procedures and generalized estimating equations to estimate marginal mean response parameters. In applications such as developmental toxicity studies, where discrete and continuous measurements are recorded on each fetus, or clinical ophthalmologic trials, where different types of observations are made on each eye, the assumption that data within a cluster are exchangeable is often very reasonable. We use this assumption to formulate fully parametric regression models for clusters of bivariate data with binary and continuous components. The proposed regression models have marginal interpretations and reproducible model structures. Tractable expressions for the likelihood equations are derived, and iterative schemes are given for computing efficient estimates (MLEs) of the marginal mean, correlations, variances and higher moments. We demonstrate the use of the ‘exchangeable’ procedure with an application to a developmental toxicity study involving fetal weight and malformation data.

19.
Qunfang Xu, Statistics, 2017, 51(6): 1280–1303
In this paper, semiparametric modelling for longitudinal data with an unstructured error process is considered. We propose a partially linear additive regression model for longitudinal data in which within-subject variances and covariances of the error process are described by unknown univariate and bivariate functions, respectively. We provide an estimating approach in which polynomial splines are used to approximate the additive nonparametric components and the within-subject variance and covariance functions are estimated nonparametrically. Both the asymptotic normality of the resulting parametric component estimators and optimal convergence rate of the resulting nonparametric component estimators are established. In addition, we develop a variable selection procedure to identify significant parametric and nonparametric components simultaneously. We show that the proposed SCAD penalty-based estimators of non-zero components have an oracle property. Some simulation studies are conducted to examine the finite-sample performance of the proposed estimation and variable selection procedures. A real data set is also analysed to demonstrate the usefulness of the proposed method.

20.
The class of joint mean‐covariance models uses the modified Cholesky decomposition of the within-subject covariance matrix in order to arrive at an unconstrained, statistically meaningful reparameterisation. The new parameterisation of the covariance matrix has two sets of parameters that separately describe the variances and correlations. Thus, together with the mean or regression parameters, these models have three distinct sets of parameters. To alleviate the inefficient estimation and downward bias in the variance estimates inherent in the maximum likelihood estimation procedure, the usual REML estimation procedure adjusts for the degrees of freedom lost due to the estimation of the mean parameters. Because of the parameterisation of the joint mean-covariance models, it is possible to adapt the usual REML procedure to estimate the variance (correlation) parameters by taking into account the degrees of freedom lost in estimating both the mean and correlation (variance) parameters. To this end, we propose adjustments to the estimation procedures based on the modified and adjusted profile likelihoods. The methods are illustrated by an application to a real data set and by simulation studies. The Canadian Journal of Statistics 40: 225–242; 2012 © 2012 Statistical Society of Canada
