Similar literature
 20 similar documents found (search time: 16 ms)
1.
Donor imputation is frequently used in surveys. However, very few variance estimation methods that take into account donor imputation have been developed in the literature. This is particularly true for surveys with high sampling fractions using nearest donor imputation, often called nearest‐neighbour imputation. In this paper, the authors develop a variance estimator for donor imputation based on the assumption that the imputed estimator of a domain total is approximately unbiased under an imputation model; that is, a model for the variable requiring imputation. Their variance estimator is valid, irrespective of the magnitude of the sampling fractions and the complexity of the donor imputation method, provided that the imputation model mean and variance are accurately estimated. They evaluate its performance in a simulation study and show that nonparametric estimation of the model mean and variance via smoothing splines brings robustness with respect to imputation model misspecifications. They also apply their variance estimator to real survey data when nearest‐neighbour imputation has been used to fill in the missing values. The Canadian Journal of Statistics 37: 400–416; 2009 © 2009 Statistical Society of Canada
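As a companion to the abstract above, a minimal sketch of the nearest-donor (nearest-neighbour) imputation step itself; the variance estimator developed in the paper is much more involved and is not reproduced here. The data and the function name `nn_donor_impute` are illustrative assumptions.

```python
import numpy as np

def nn_donor_impute(x, y):
    """Fill missing values of y by nearest-neighbour donor imputation:
    each nonrespondent receives the observed y of the respondent (donor)
    whose auxiliary value x is closest to its own."""
    y = np.asarray(y, dtype=float).copy()
    x = np.asarray(x, dtype=float)
    resp = ~np.isnan(y)                          # respondents act as donors
    donors_x, donors_y = x[resp], y[resp]
    for i in np.where(~resp)[0]:
        j = np.argmin(np.abs(donors_x - x[i]))   # nearest donor in x
        y[i] = donors_y[j]
    return y

x = np.array([1.0, 2.6, 3.0, 4.1, 5.0])
y = np.array([2.1, np.nan, 6.2, np.nan, 9.8])
print(nn_donor_impute(x, y))   # missing entries copied from nearest donors
```

Note that donors are fixed before imputing, so an imputed value is never reused as a donor.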

2.
In this article, we extend smoothing splines to model the regression mean structure when data are sampled through a complex survey. Smoothing splines are evaluated both with and without sample weights and are compared with the local linear estimator. Simulation studies find that the nonparametric estimators perform better when sample weights are incorporated, rather than the data being treated as if they were i.i.d. They also find that smoothing splines outperform the local linear estimator under completely data-driven bandwidth selection methods.

3.
We consider the use of smoothing splines for the adaptive modelling of dose–response relationships. A smoothing spline is a nonparametric estimator of a function that is a compromise between the fit to the data and the degree of smoothness and thus provides a flexible way of modelling dose–response data. In conjunction with decision rules for which doses to continue with after an interim analysis, it can be used to give an adaptive way of modelling the relationship between dose and response. We fit smoothing splines using the generalized cross‐validation criterion for deciding on the degree of smoothness and we use estimated bootstrap percentiles of the predicted values for each dose to decide upon which doses to continue with after an interim analysis. We compare this approach with a corresponding adaptive analysis of variance approach based upon new simulations of the scenarios previously used by the PhRMA Working Group on Adaptive Dose‐Ranging Studies. The results obtained for the adaptive modelling of dose–response data using smoothing splines are mostly comparable with those previously obtained by the PhRMA Working Group for the Bayesian Normal Dynamic Linear model (GADA) procedure. These methods may be useful for carrying out adaptations, detecting dose–response relationships and identifying clinically relevant doses. Copyright © 2009 John Wiley & Sons, Ltd.
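The use of generalized cross-validation to pick the degree of smoothness, as in the abstract above, can be sketched with SciPy's `make_smoothing_spline` (SciPy ≥ 1.10), which selects the penalty by GCV when `lam=None`; the dose-response data below are invented for illustration, not the PhRMA scenarios.

```python
import numpy as np
from scipy.interpolate import make_smoothing_spline

# Hypothetical dose-response data: response rises then plateaus with dose.
rng = np.random.default_rng(0)
dose = np.linspace(0.0, 8.0, 60)              # strictly increasing abscissas
truth = dose / (1.0 + dose)                   # Emax-type curve (illustrative)
resp = truth + rng.normal(scale=0.05, size=dose.size)

# lam=None selects the smoothing parameter by generalized cross-validation,
# mirroring the GCV criterion used in the abstract.
spl = make_smoothing_spline(dose, resp, lam=None)
fitted = spl(dose)
print(float(np.mean((fitted - truth) ** 2)))  # fit error vs. the true curve
```

In an adaptive design one would then compare predicted values (or bootstrap percentiles of them) at each candidate dose before deciding which doses to carry forward.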

4.
The author considers the use of auxiliary information available at population level to improve the estimation of finite population totals. She introduces a new type of model‐assisted estimator based on nonparametric regression splines. The estimator is a weighted linear combination of the study variable, with weights calibrated to the known population totals of the B‐spline basis functions. The author shows that the estimator is asymptotically design‐unbiased and consistent under conditions which do not require the superpopulation model to be correct. She proposes a design‐based variance approximation and shows that the anticipated variance is asymptotically equivalent to the Godambe‐Joshi lower bound. She also shows through simulations that the estimator has good properties.

5.
Summary.  The objective is to estimate the period and the light curve (or periodic function) of a variable star. Previously, several methods have been proposed to estimate the period of a variable star, but they are inaccurate especially when a data set contains outliers. We use a smoothing spline regression to estimate the light curve given a period and then find the period which minimizes the generalized cross-validation (GCV). The GCV method works well, matching an intensive visual examination of a few hundred stars, but the GCV score is still sensitive to outliers. Handling outliers in an automatic way is important when this method is applied in a 'data mining' context to a very large star survey. Therefore, we suggest a robust method which minimizes a robust cross-validation criterion induced by a robust smoothing spline regression. Once the period has been determined, a nonparametric method is used to estimate the light curve. A real example and a simulation study suggest that the robust cross-validation and GCV methods are superior to existing methods.
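A toy version of the period search: the paper minimizes a (robust) spline cross-validation score over candidate periods, whereas the sketch below substitutes a simpler string-length-style dispersion score on the folded light curve. The simulated star, grid, and score are all illustrative stand-ins for the paper's method.

```python
import numpy as np

def fold_dispersion(t, y, period):
    """String-length style score: fold the observations at a trial period
    and sum squared jumps between phase-adjacent points; the score is small
    when the folded data line up into a single smooth curve."""
    phase = (t % period) / period
    order = np.argsort(phase)
    return float(np.sum(np.diff(y[order]) ** 2))

rng = np.random.default_rng(1)
true_period = 1.7
t = np.sort(rng.uniform(0.0, 30.0, 300))                     # observation times
y = np.sin(2 * np.pi * t / true_period) + rng.normal(scale=0.05, size=t.size)

periods = np.linspace(0.5, 3.0, 2501)                        # candidate grid
scores = [fold_dispersion(t, y, p) for p in periods]
best = periods[int(np.argmin(scores))]
print(best)   # near the true period
```

The paper's approach replaces the dispersion score with the GCV (or robust CV) score of a smoothing spline fitted to the folded data, which is what makes it resistant to outliers.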

6.
Smoothing splines are known to exhibit a type of boundary bias that can reduce their estimation efficiency. In this paper, a boundary corrected cubic smoothing spline is developed in a way that produces a uniformly fourth order estimator. The resulting estimator can be calculated efficiently using an O(n) algorithm that is designed for the computation of fitted values and associated smoothing parameter selection criteria. A simulation study shows that use of the boundary corrected estimator can improve estimation efficiency in finite samples. Applications to the construction of asymptotically valid pointwise confidence intervals are also investigated.

7.
A fast and accurate method of confidence interval construction for the smoothing parameter in penalised spline and partially linear models is proposed. The method is akin to a parametric percentile bootstrap where Monte Carlo simulation is replaced by saddlepoint approximation, and can therefore be viewed as an approximate bootstrap. It is applicable in a quite general setting, requiring only that the underlying estimator be the root of an estimating equation that is a quadratic form in normal random variables. This is the case under a variety of optimality criteria such as those commonly denoted by maximum likelihood (ML), restricted ML (REML), generalized cross validation (GCV) and Akaike's information criterion (AIC). Simulation studies reveal that under the ML and REML criteria, the method delivers a near‐exact performance with computational speeds that are an order of magnitude faster than existing exact methods, and two orders of magnitude faster than a classical bootstrap. Perhaps most importantly, the proposed method also offers a computationally feasible alternative when no known exact or asymptotic methods exist, e.g. GCV and AIC. The methodology is illustrated with an application to the well‐known fossil data. Giving a range of plausible smoothed values in this instance can help answer questions about the statistical significance of apparent features in the data.

8.
Longitudinal data frequently arise in various fields of applied sciences where individuals are measured according to some ordered variable, e.g. time. A common approach used to model such data is based on mixed models for repeated measures. This model provides an eminently flexible approach to modelling a wide range of mean and covariance structures. However, such models are forced into a rigidly defined class of mathematical formulas which may not be well supported by the data within the whole sequence of observations. A possible non-parametric alternative is a cubic smoothing spline, which is highly flexible and has useful smoothing properties. It can be shown that under the normality assumption, the solution of the penalized log-likelihood equation is the cubic smoothing spline, and this solution can be further expressed as a solution of the linear mixed model. It is shown here how cubic smoothing splines can be easily used in the analysis of complete and balanced data. Analysis can be greatly simplified by using the unweighted estimator studied in the paper. It is shown that if the covariance structure of the random errors belongs to a certain class of matrices, the unweighted estimator is the solution to the penalized log-likelihood function. This result is new in the smoothing spline context and is not confined to growth curve settings. The connection to mixed models is used in developing a rough test of group profiles. Numerical examples are presented to illustrate the techniques proposed.

9.
Spatially-adaptive Penalties for Spline Fitting
The paper studies spline fitting with a roughness penalty that adapts to spatial heterogeneity in the regression function. The estimates are pth-degree piecewise polynomials with p − 1 continuous derivatives. A large and fixed number of knots is used and smoothing is achieved by putting a quadratic penalty on the jumps of the pth derivative at the knots. To be spatially adaptive, the logarithm of the penalty is itself a linear spline but with relatively few knots and with values at the knots chosen to minimize the generalized cross validation (GCV) criterion. This locally-adaptive spline estimator is compared with other spline estimators in the literature such as cubic smoothing splines and knot-selection techniques for least squares regression. Our estimator can be interpreted as an empirical Bayes estimate for a prior allowing spatial heterogeneity. In cases of spatially heterogeneous regression functions, empirical Bayes confidence intervals using this prior achieve better pointwise coverage probabilities than confidence intervals based on a global-penalty parameter. The method is developed first for univariate models and then extended to additive models.
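The building block that the spatially-adaptive penalty generalizes — a penalized spline with many fixed knots, one global quadratic penalty on the knot coefficients, and GCV selection of that penalty — can be sketched as follows. The truncated-power basis, knot count, and lambda grid are illustrative choices, not the paper's.

```python
import numpy as np

def pspline_gcv(x, y, num_knots=20, degree=3, lambdas=np.logspace(-4, 4, 81)):
    """Penalized spline with a truncated-power basis: many fixed knots, a
    quadratic (ridge) penalty on the knot coefficients only, and a single
    global penalty chosen by generalized cross-validation."""
    knots = np.quantile(x, np.linspace(0, 1, num_knots + 2)[1:-1])
    X = np.column_stack([x ** d for d in range(degree + 1)] +
                        [np.clip(x - k, 0, None) ** degree for k in knots])
    n = len(y)
    D = np.diag([0.0] * (degree + 1) + [1.0] * num_knots)  # penalize knot jumps
    best = (np.inf, None)
    for lam in lambdas:
        H = X @ np.linalg.solve(X.T @ X + lam * D, X.T)    # hat matrix A(lambda)
        rss = float(np.sum((y - H @ y) ** 2))
        gcv = n * rss / (n - np.trace(H)) ** 2             # GCV(lambda)
        if gcv < best[0]:
            best = (gcv, H @ y)
    return best[1]

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 150))
truth = np.sin(4 * np.pi * x)
y = truth + rng.normal(scale=0.2, size=x.size)
fit = pspline_gcv(x, y)
```

The paper's contribution is to let log(lambda) itself vary over x as a low-dimensional linear spline, rather than holding this single global value.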

10.
Abstract. Kernel methods for non‐parametric mixed‐effect models use a non‐parametric mean function to model covariate effects, with additive random effects accounting for overdispersion and correlation. Although generalized cross‐validation (GCV) has frequently been applied to select the bandwidth in this setting, the optimality of the GCV has not yet been explored. In this article, we construct a kernel estimator of the non‐parametric mean function. An equivalence between the kernel estimator and a weighted least square type estimator is provided, and the optimality of the GCV‐based bandwidth is investigated. The theoretical derivations also show that kernel‐based and spline‐based GCV give very similar asymptotic results. This provides us with a solid base to use kernel estimation for mixed‐effect models. Simulation studies are undertaken to investigate the empirical performance of the GCV. A real data example is analysed for illustration.

11.
These Fortran-77 subroutines provide building blocks for Generalized Cross-Validation (GCV) (Craven and Wahba, 1979) calculations in data analysis and data smoothing, including ridge regression (Golub, Heath, and Wahba, 1979), thin plate smoothing splines (Wahba and Wendelberger, 1980), deconvolution (Wahba, 1982d), smoothing of generalized linear models (O'Sullivan, Yandell and Raynor 1986; Green 1984; Green and Yandell 1985), and ill-posed problems (Nychka et al., 1984; O'Sullivan and Wahba, 1985). We present some of the types of problems for which GCV is a useful method of choosing a smoothing or regularization parameter and we describe the structure of the subroutines. Ridge Regression: A familiar example of a smoothing parameter is the ridge parameter in the ridge regression problem.
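The GCV choice of the ridge parameter mentioned above can be sketched in a few lines of NumPy rather than Fortran: one SVD of the design matrix diagonalizes the hat matrix for every candidate lambda, so GCV(lambda) = n·RSS(lambda)/(n − tr A(lambda))² is cheap to evaluate. The data and grid are simulated for illustration.

```python
import numpy as np

def ridge_gcv(X, y, lambdas):
    """Choose the ridge parameter by generalized cross-validation.
    The hat matrix A(lam) = X (X'X + lam I)^{-1} X' shares the left singular
    vectors of X, with eigenvalues s_i^2 / (s_i^2 + lam)."""
    n = X.shape[0]
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    scores = []
    for lam in lambdas:
        shrink = s**2 / (s**2 + lam)               # eigenvalues of A(lam)
        resid = y - U @ (shrink * Uty)             # y - A(lam) y
        scores.append(n * np.sum(resid**2) / (n - np.sum(shrink)) ** 2)
    return float(lambdas[int(np.argmin(scores))])

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
beta = np.zeros(10); beta[:3] = [2.0, -1.0, 0.5]   # sparse true coefficients
y = X @ beta + rng.normal(scale=1.0, size=100)
lam = ridge_gcv(X, y, np.logspace(-3, 3, 61))
print(lam)
```

Evaluating all 61 candidate values reuses the single SVD, which is the same computational idea the GCVPACK subroutines exploit.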

12.
Summary.  We construct approximate confidence intervals for a nonparametric regression function, using polynomial splines with free-knot locations. The number of knots is determined by generalized cross-validation. The estimates of knot locations and coefficients are obtained through a non-linear least squares solution that corresponds to the maximum likelihood estimate. Confidence intervals are then constructed based on the asymptotic distribution of the maximum likelihood estimator. Average coverage probabilities and the accuracy of the estimate are examined via simulation. This includes comparisons between our method and some existing methods such as smoothing splines and variable-knot selection, as well as a Bayesian version of the variable-knot method. Simulation results indicate that our method works well for smooth underlying functions and also reasonably well for discontinuous functions. It also performs well for fairly small sample sizes.

13.
For the longitudinal-data semiparametric model E(y|x,t) = X^T β + f(t), a penalized quadratic inference function approach is used to estimate the regression parameter β and the unknown smooth function f(t) simultaneously. The unknown smooth function is first approximated by a basis expansion in truncated power functions; then, in the spirit of penalized splines, a penalized quadratic inference function in the regression parameters and the basis coefficients is constructed, and minimizing it yields the penalized quadratic inference function estimators of both. Theoretical results show that the estimators are consistent and asymptotically normal, and numerical simulations also give good results.

14.
15.
Let x be a random variable having the normal distribution with mean μ and variance c2μ2, where c is a known constant. Maximum likelihood estimation of μ is considered when the lowest r1 and the highest r2 sample values are censored, and the asymptotic variance of the maximum likelihood estimator is obtained.

16.
In this paper we consider the problem of constructing confidence intervals for nonparametric quantile regression with an emphasis on smoothing splines. The mean‐based approaches for smoothing splines of Wahba (1983) and Nychka (1988) may not be efficient for constructing confidence intervals for the underlying function when the observed data are non‐Gaussian distributed, for instance if they are skewed or heavy‐tailed. This paper proposes a method of constructing confidence intervals for the unknown τth quantile function (0<τ<1) based on smoothing splines. We investigate the extent to which the proposed estimator provides the desired coverage probability. In addition, an improvement based on a local smoothing parameter that provides more uniform pointwise coverage is developed. The results from numerical studies, including a simulation study and real data analysis, demonstrate the promising empirical properties of the proposed approach.
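The objective underlying the τth quantile function is the check loss ρ_τ(u) = u(τ − 1{u<0}). As a sanity check of that criterion, the sketch below recovers an ordinary sample quantile by minimizing the check loss over a grid; the skewed simulated data echo the abstract's motivation, and everything here is illustrative rather than the paper's spline estimator.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(4)
y = rng.exponential(scale=2.0, size=5000)     # skewed, heavy right tail
tau = 0.9

# Minimizing total check loss over a grid of candidate values q recovers
# the empirical tau-th quantile.
grid = np.linspace(0, 12, 4001)
losses = [np.sum(check_loss(y - q, tau)) for q in grid]
q_hat = grid[int(np.argmin(losses))]
print(q_hat, np.quantile(y, tau))             # the two nearly agree
```

The paper's estimator replaces the single scalar q with a smooth function of a covariate, adding a roughness penalty to the same check-loss criterion.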

17.
We derived two methods to estimate the logistic regression coefficients in a meta-analysis when only the 'aggregate' data (mean values) from each study are available. The estimators we proposed are the discriminant function estimator and the reverse Taylor series approximation. These two methods of estimation gave similar estimators using an example of individual data. However, when aggregate data were used, the discriminant function estimators were quite different from the other two estimators. A simulation study was then performed to evaluate the performance of these two estimators as well as the estimator obtained from the model that simply uses the aggregate data in a logistic regression model. The simulation study showed that all three estimators are biased. The bias increases as the variance of the covariate increases. The distribution type of the covariates also affects the bias. In general, the estimator from the logistic regression using the aggregate data has less bias and better coverage probabilities than the other two estimators. We concluded that analysts should be cautious in using aggregate data to estimate the parameters of the logistic regression model for the underlying individual data.
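The discriminant function estimator named above can be sketched directly: under normal covariates with a common covariance across outcome groups, the logistic slope equals Σ⁻¹(μ₁ − μ₀), estimated by plugging in the pooled sample covariance and group means. The simulated data and function name are illustrative; the intercept is ignored.

```python
import numpy as np

def discriminant_logistic_slope(X, ycls):
    """Discriminant-function estimate of the logistic slope:
    beta = Sigma^{-1} (mu1 - mu0), using the pooled covariance."""
    X1, X0 = X[ycls == 1], X[ycls == 0]
    C1 = (X1 - X1.mean(0)).T @ (X1 - X1.mean(0))
    C0 = (X0 - X0.mean(0)).T @ (X0 - X0.mean(0))
    S = (C1 + C0) / (len(X) - 2)                    # pooled covariance
    return np.linalg.solve(S, X1.mean(0) - X0.mean(0))

# Simulated check: with Sigma = I, the slope should be mu1 - mu0.
rng = np.random.default_rng(5)
mu0, mu1 = np.array([0.0, 0.0]), np.array([1.0, -0.5])
n = 4000
X = np.vstack([rng.normal(mu0, 1.0, size=(n, 2)),
               rng.normal(mu1, 1.0, size=(n, 2))])
ycls = np.repeat([0, 1], n)
print(discriminant_logistic_slope(X, ycls))   # approx (1.0, -0.5)
```

The normality-with-common-covariance assumption is exactly what makes this estimator biased when, as the abstract reports, the covariate distribution departs from that model.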

18.
Sampling designs that depend on sample moments of auxiliary variables are well known. Lahiri (Bull Int Stat Inst 33:133–140, 1951) considered a sampling design proportionate to a sample mean of an auxiliary variable. Singh and Srivastava (Biometrika 67(1):205–209, 1980) proposed a sampling design proportionate to a sample variance, while Wywiał (J Indian Stat Assoc 37:73–87, 1999) proposed one proportionate to a sample generalized variance of auxiliary variables. Some other sampling designs dependent on moments of an auxiliary variable were considered, e.g., in Wywiał (Some contributions to multivariate methods in survey sampling. Katowice University of Economics, Katowice, 2003a; Stat Transit 4(5):779–798, 2000), where the accuracy of some sampling strategies was also compared. These sampling designs are not useful when some observations of the auxiliary variable are censored; moreover, they can be much too sensitive to outlying observations. In such cases a sampling design proportionate to an order statistic of the auxiliary variable can be more useful, and that is why such an unequal-probability sampling design is proposed here. Its particular cases, as well as its conditional version, are considered too. A sampling scheme implementing this sampling design is proposed, and the inclusion probabilities of the first and second orders are evaluated. The well-known Horvitz–Thompson estimator is taken into account. A ratio estimator dependent on an order statistic is constructed; it is similar to the well-known ratio estimator based on the population and sample means. Moreover, it is an unbiased estimator of the population mean when the sample is drawn according to the proposed sampling design dependent on the appropriate order statistic.
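The Horvitz–Thompson estimator mentioned above weights each sampled value by the inverse of its first-order inclusion probability. The sketch below checks its design unbiasedness by Monte Carlo under simple Poisson sampling, used here only for ease of simulation, not the order-statistic-dependent design of the paper; the population values are invented.

```python
import numpy as np

def horvitz_thompson_total(y_sample, pi_sample):
    """Horvitz-Thompson estimator of a population total: sum of sampled
    values weighted by inverse first-order inclusion probabilities."""
    return float(np.sum(y_sample / pi_sample))

rng = np.random.default_rng(6)
y = rng.uniform(10, 20, size=200)       # population values
pi = np.full(200, 0.3)                  # inclusion probabilities

# Under Poisson sampling each unit enters independently with probability pi_i,
# so the Monte Carlo mean of the estimator should match the true total.
totals = []
for _ in range(2000):
    s = rng.random(200) < pi
    totals.append(horvitz_thompson_total(y[s], pi[s]))
print(np.mean(totals), y.sum())
```

Under the paper's design the same estimator applies; only the inclusion probabilities, derived from the order statistic of the auxiliary variable, change.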

19.
Two new stochastic search methods are proposed for optimizing the knot locations and/or smoothing parameters for least-squares or penalized splines. One of the methods is a golden-section-augmented blind search, while the other is a continuous genetic algorithm. Monte Carlo experiments indicate that the algorithms are very successful at producing knot locations and/or smoothing parameters that are near optimal in a squared error sense. Both algorithms are amenable to parallelization and have been implemented in OpenMP and MPI. An adjusted GCV criterion is also considered for selecting both the number and location of knots. The method performed well relative to MARS in a small empirical comparison.

20.
For estimating the survey population total of a variable y when values of an auxiliary variable x are available, a popular procedure is to employ the ratio estimator based on a simple random sample drawn without replacement (SRSWOR), especially when the sample size is large. To set up a confidence interval for the total, various variance estimators are available to pair with the ratio estimator. We add a few more variance estimators equipped with asymptotic design-cum-model properties. The ratio estimator is traditionally known to be appropriate when the regression of y on x is linear through the origin and the conditional variance of y given x is proportional to x. Through a simulation-based numerical exercise, however, we find the confidence intervals to fare better when the regression line deviates from the origin or the conditional variance is disproportionate to x. Also, comparing the confidence intervals using alternative variance estimators, we find our newly proposed variance estimators to yield favourably competitive results.
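The ratio estimator and one standard residual-based variance estimator can be sketched as follows; the population is simulated under the classical "regression through the origin" model, and this is the textbook estimator pair, not the new variance estimators proposed in the abstract.

```python
import numpy as np

def ratio_estimator(y_s, x_s, X_total, N):
    """Classical ratio estimator of the population total of y under SRSWOR,
    with a standard variance estimator based on residuals e_i = y_i - R x_i."""
    n = len(y_s)
    R = y_s.mean() / x_s.mean()                   # sample ratio
    t_hat = R * X_total                           # estimated total of y
    e = y_s - R * x_s
    v_hat = N**2 * (1 - n / N) / n * e.var(ddof=1)
    return t_hat, v_hat

rng = np.random.default_rng(7)
N = 1000
x = rng.uniform(1, 5, N)
y = 2.0 * x + rng.normal(scale=0.3, size=N)       # line through the origin
idx = rng.choice(N, size=100, replace=False)      # SRSWOR sample of n = 100
t_hat, v_hat = ratio_estimator(y[idx], x[idx], x.sum(), N)
print(t_hat, y.sum())                             # estimate near the true total
```

A normal-theory confidence interval for the total is then t_hat ± z·sqrt(v_hat), which is the construction the abstract's simulations compare across variance estimators.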


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号