Similar literature
 Found 20 similar articles (search time: 33 ms)
1.
When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute value of the parameter estimates of selected variables, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard error of the adjusted estimators. Simulation results show that substantial biases could be present in uncorrected stepwise estimators and, for binary covariates, could exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets in primary biliary cirrhosis and in multiple myeloma from the Eastern Cooperative Oncology Group.

2.
Some asymptotic behaviour of the bootstrap estimates on a finite sample   (Cited by: 1; self-citations: 1; citations by others: 0)
Bootstrapping the mean, variance, standard error of the mean, regression coefficient and its standard error is considered. It is shown that at a fixed sample size bootstrap estimates converge to classical sample estimates as the number of bootstrap replications tends to infinity. For the mean, variance and regression coefficient, convergence almost everywhere is proven; for the standard error of the mean and standard error of the regression coefficient, weak convergence is proven. The speed of convergence is illustrated by simulation results.
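
The convergence described in this abstract can be illustrated with a minimal sketch, which is not the paper's own code. It assumes the simplest case, the standard error of the mean, where the limiting ("ideal") bootstrap value is known to be the plug-in estimate that divides by n rather than n - 1. The function names `bootstrap_se_mean` and `plugin_se_mean` are hypothetical.

```python
import random
import statistics


def bootstrap_se_mean(sample, n_boot, seed=0):
    """Monte Carlo bootstrap estimate of the standard error of the mean."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    # Spread of the replicated means approximates the standard error.
    return statistics.pstdev(means)


def plugin_se_mean(sample):
    """Plug-in standard error of the mean (population SD divided by sqrt(n)).

    This is the value the bootstrap estimate approaches as n_boot grows,
    with the sample held fixed.
    """
    return statistics.pstdev(sample) / len(sample) ** 0.5
```

With the sample fixed, increasing `n_boot` should bring `bootstrap_se_mean` steadily closer to `plugin_se_mean`; the remaining gap is pure Monte Carlo error.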

3.
Asymptotic variance plays an important role in inference based on interval estimates of attributable risk. This paper compares the asymptotic variances of the attributable risk estimate obtained using the delta method and the Fisher information matrix for a 2×2 case–control study, a setting chosen for its practical relevance. The expressions of these two asymptotic variance estimates are shown to be equivalent. Because asymptotic variance usually underestimates the standard error, the bootstrap standard error has also been utilized in constructing the interval estimates of attributable risk and compared with those using asymptotic estimates. A simulation study shows that the bootstrap interval estimate performs well in terms of coverage probability and interval length. An exact test procedure for testing independence between the risk factor and the disease outcome using attributable risk is proposed and is justified with real-life examples for small-sample situations where inference using asymptotic variance may not be valid.

4.
The bootstrap is a methodology for estimating standard errors. The idea is to use a Monte Carlo simulation experiment based on a nonparametric estimate of the error distribution. The main objective of this article is to demonstrate the use of the bootstrap to attach standard errors to coefficient estimates in a second-order autoregressive model fitted by least squares and maximum likelihood estimation. Additionally, a comparison of the bootstrap and the conventional methodology is made. As it turns out, the conventional asymptotic formulae (for both the least squares and maximum likelihood estimates) for estimating standard errors appear to overestimate the true standard errors. But there are two problems: (i) the first two observations y1 and y2 have been fixed, and (ii) the residuals have not been inflated. After these two factors are considered in the trial and bootstrap experiment, both the conventional maximum likelihood and bootstrap estimates of the standard errors appear to be performing quite well.
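
A residual bootstrap of the kind this abstract describes might be sketched as follows. This is an illustration under stated assumptions, not the article's procedure: the model has no intercept, the residuals are centred and inflated by the factor sqrt(m / (m - 2)) (an assumed choice of inflation), and the first two observations are kept fixed, which the abstract itself names as a limitation. The function names are hypothetical.

```python
import random
import statistics


def fit_ar2(y):
    """Least-squares fit of y[t] = a*y[t-1] + b*y[t-2] + e[t] (no intercept)."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(2, len(y)):
        x1, x2 = y[t - 1], y[t - 2]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        r1 += x1 * y[t]
        r2 += x2 * y[t]
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    a = (r1 * s22 - r2 * s12) / det
    b = (r2 * s11 - r1 * s12) / det
    return a, b


def bootstrap_ar2_se(y, n_boot=500, seed=0):
    """Residual-bootstrap standard errors for the two AR(2) coefficients."""
    rng = random.Random(seed)
    a, b = fit_ar2(y)
    res = [y[t] - a * y[t - 1] - b * y[t - 2] for t in range(2, len(y))]
    m = len(res)
    mean_res = sum(res) / m
    # Assumption: centre the residuals and inflate them for the two
    # fitted parameters before resampling.
    scale = (m / (m - 2)) ** 0.5
    pool = [(e - mean_res) * scale for e in res]
    a_reps, b_reps = [], []
    for _ in range(n_boot):
        # Assumption: the first two observations are held fixed.
        y_star = [y[0], y[1]]
        for t in range(2, len(y)):
            y_star.append(a * y_star[t - 1] + b * y_star[t - 2] + rng.choice(pool))
        a_s, b_s = fit_ar2(y_star)
        a_reps.append(a_s)
        b_reps.append(b_s)
    return statistics.stdev(a_reps), statistics.stdev(b_reps)
```

The returned pair can be compared with the conventional asymptotic standard errors for the same series, which is the comparison the abstract reports.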

5.
In incident cohort studies, survival data often include subjects who have experienced an initiating event but have not experienced a subsequent event at the calendar time of recruitment. During the follow-up periods, subjects may undergo a series of successive events. Since the second/third duration process becomes observable only if the first/second event has occurred, the data are subject to left-truncation and dependent censoring. In this article, using the inverse-probability-weighted (IPW) approach, we propose nonparametric estimators for the estimation of the joint survival function of three successive duration times. The asymptotic properties of the proposed estimators are established. Simple bootstrap methods are used to estimate standard deviations and construct interval estimators. A simulation study is conducted to investigate the finite sample properties of the proposed estimators.

6.
For estimating the distribution of a standardized statistic, the bootstrap estimate is known to be locally asymptotically minimax. Various computational techniques have been developed to improve on the simulation efficiency of uniform resampling, the standard Monte Carlo approach to approximating the bootstrap estimate. Two new approaches are proposed which give accurate yet simple approximations to the bootstrap estimate. The second of the approaches even improves the convergence rate of the simulation error. A simulation study examines the performance of these two approaches in comparison with other modified bootstrap estimates.

7.
Several methods have been proposed to estimate the misclassification probabilities when a linear discriminant function is used to classify an observation into one of several populations. We describe the application of bootstrap sampling to this problem. The proposed method has the advantage of furnishing not only estimates of the misclassification probabilities but also an estimate of their standard error. The method is illustrated by a small simulation experiment. It is then applied to three published, readily accessible data sets, which are typical of large, medium and small data sets encountered in practice.

8.
In incident cohort studies, survival data often include subjects who have had an initiating event at recruitment and may potentially experience two successive events (first and second) during the follow-up period. Since the second duration process becomes observable only if the first event has occurred, left truncation and dependent censoring arise if the two duration times are correlated. To confront the two potential sampling biases, we propose two inverse-probability-weighted (IPW) estimators for the estimation of the joint survival function of two successive duration times. One of them is similar to the estimator proposed by Chang and Tzeng [Nonparametric estimation of sojourn time distributions for truncated serial event data – a weight adjusted approach, Lifetime Data Anal. 12 (2006), pp. 53–67]. The other is the extension of the nonparametric estimator proposed by Wang and Wells [Nonparametric estimation of successive duration times under dependent censoring, Biometrika 85 (1998), pp. 561–572]. The weak convergence of both estimators is established. Furthermore, the delete-one jackknife and simple bootstrap methods are used to estimate standard deviations and construct interval estimators. A simulation study is conducted to compare the two IPW approaches.

9.
Several existing methods for the choice of the ridge parameter are reviewed, and a bootstrap method is proposed. The bootstrap provides independent measures of prediction errors based on multiple predictions along with an estimate of the standard error of prediction. The bootstrap and selected competitors are compared through Monte Carlo simulations for various degrees of design matrix collinearity and varying levels of signal-to-noise ratio. The procedure is also illustrated by application to two published data sets. In one case, the bootstrap choice of the ridge parameter leads to a smaller mean squared error of prediction than the ridge trace method. In the second case, an optimal choice of no perturbation is confirmed. Benefits of the bootstrap choice include its less subjective nature, ease of implementation, and robustness.

10.
The bootstrap, like the jackknife, is a technique for estimating standard errors. The idea is to use Monte Carlo simulation, based on a nonparametric estimate of the underlying error distribution. The bootstrap will be applied to an econometric model describing the demand for capital, labor, energy, and materials. The model is fitted by three-stage least squares. In sharp contrast with previous results, the coefficient estimates and the estimated standard errors perform very well. However, the model's forecasts show serious bias and large random errors, significantly understated by the conventional standard error of forecast.

11.
This article considers nonparametric regression problems and develops a model-averaging procedure for smoothing spline regression problems. Unlike most smoothing parameter selection studies, which determine a single optimum smoothing parameter, our focus here is on the prediction accuracy for the true conditional mean of Y given a predictor X. Our method consists of two steps. The first step is to construct a class of smoothing spline regression models based on nonparametric bootstrap samples, each with an appropriate smoothing parameter. The second step is to average bootstrap smoothing spline estimates of different smoothness to form a final improved estimate. To minimize the prediction error, we estimate the model weights using a delete-one cross-validation procedure. A simulation study has been performed using a program written in R. The simulation study compares the well-known cross-validation (CV) and generalized cross-validation (GCV) methods with the proposed method. This new method is straightforward to implement and performs reliably in simulations.

12.
The commonly used method of small area estimation (SAE) under a linear mixed model may not be efficient if the data contain a substantially larger proportion of zeros than would be expected under standard model assumptions (hereafter zero-inflated data). The authors discuss SAE for zero-inflated data under a two-part random effects model that accounts for excess zeros in the data. Empirical results show that the proposed method for SAE works well and produces an efficient set of small area estimates. An application to real survey data from the National Sample Survey Office of India demonstrates the satisfactory performance of the method. The authors describe a parametric bootstrap method to estimate the mean squared error (MSE) of the proposed small area estimator. The bootstrap estimates of the MSE are compared to the true MSE in a simulation study.

13.
This paper is concerned with the problem of estimating the standard errors of the empirical Bayes estimators in linear regression models. The problem of deriving an exact expression for the standard error of this estimator is generally intractable. We suggest a procedure based on Efron’s bootstrap method as a way of estimating the standard error. It is shown, through simulations, that the bootstrap method provides a more accurate estimate of the standard error of the empirical Bayes estimator than the traditional large sample method.

14.
The weighted log‐rank estimating function has become a standard estimation method for the censored linear regression model, or the accelerated failure time model. Well established statistically, the estimator defined as a consistent root has, however, rather poor computational properties because the estimating function is neither continuous nor, in general, monotone. We propose a computationally efficient estimator through an asymptotics‐guided Newton algorithm, in which censored quantile regression methods are tailored to yield an initial consistent estimate and a consistent derivative estimate of the limiting estimating function. We also develop fast interval estimation with a new proposal for sandwich variance estimation. The proposed estimator is asymptotically equivalent to the consistent root estimator and barely distinguishable in samples of practical size. However, computation time is typically reduced by two to three orders of magnitude for point estimation alone. Illustrations with clinical applications are provided.

15.
The area under the Receiver Operating Characteristic (ROC) curve (AUC) and related summary indices are widely used for assessment of accuracy of an individual and comparison of performances of several diagnostic systems in many areas including studies of human perception, decision making, and the regulatory approval process for new diagnostic technologies. Many investigators have suggested implementing the bootstrap approach to estimate variability of AUC-based indices. Corresponding bootstrap quantities are typically estimated by sampling a bootstrap distribution. Such a process, frequently termed Monte Carlo bootstrap, is often computationally burdensome and imposes an additional sampling error on the resulting estimates. In this article, we demonstrate that the exact or ideal (sampling error free) bootstrap variances of the nonparametric estimator of AUC can be computed directly, i.e., avoiding resampling of the original data, and we develop easy-to-use formulas to compute them. We derive the formulas for the variances of the AUC corresponding to a single given or random reader, and to the average over several given or randomly selected readers. The derived formulas provide an algorithm for computing the ideal bootstrap variances exactly and hence improve many bootstrap methods proposed earlier for analyzing AUCs by eliminating the sampling error and sometimes burdensome computations associated with a Monte Carlo (MC) approximation. In addition, the availability of closed-form solutions provides the potential for an analytical assessment of the properties of bootstrap variance estimators. Applications of the proposed method are shown on two experimentally ascertained datasets that illustrate settings commonly encountered in diagnostic imaging. In the context of the two examples we also demonstrate the magnitude of the effect of the sampling error of the MC estimators on the resulting inferences.
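
For orientation, the Monte Carlo bootstrap that this abstract contrasts with its exact formulas can be sketched as below. The sketch does not implement the article's closed-form variances, which are not given here. It assumes a single reader, the Mann-Whitney form of the nonparametric AUC estimator (ties counted as one half), and separate resampling of non-diseased and diseased scores; the function names are hypothetical.

```python
import random
import statistics


def auc(neg, pos):
    """Nonparametric (Mann-Whitney) AUC: P(pos > neg) + 0.5 * P(pos == neg)."""
    total = 0.0
    for x in neg:
        for y in pos:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(neg) * len(pos))


def mc_bootstrap_var_auc(neg, pos, n_boot=2000, seed=0):
    """Monte Carlo bootstrap variance of the AUC estimate.

    Assumption: the two groups are resampled independently, each with
    replacement and at its original size.
    """
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        neg_star = [rng.choice(neg) for _ in neg]
        pos_star = [rng.choice(pos) for _ in pos]
        reps.append(auc(neg_star, pos_star))
    return statistics.pvariance(reps)
```

Running `mc_bootstrap_var_auc` with different `seed` values on the same data gives slightly different answers; that spread is the Monte Carlo sampling error the exact formulas are meant to eliminate.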

16.
Observational studies are increasingly being used in medicine to estimate the effects of treatments or exposures on outcomes. To minimize the potential for confounding when estimating treatment effects, propensity score methods are frequently implemented. Often outcomes are the time to event. While it is common to report the treatment effect as a relative effect, such as the hazard ratio, reporting the effect using an absolute measure of effect is also important. One commonly used absolute measure of effect is the risk difference or difference in probability of the occurrence of an event within a specified duration of follow-up between a treatment and comparison group. We first describe methods for point and variance estimation of the risk difference when using weighting or matching based on the propensity score when outcomes are time-to-event. Next, we conducted Monte Carlo simulations to compare the relative performance of these methods with respect to bias of the point estimate, accuracy of variance estimates, and coverage of estimated confidence intervals. The results of the simulation generally support the use of weighting methods (untrimmed ATT weights and IPTW) or caliper matching when the prevalence of treatment is low for point estimation. For standard error estimation the simulation results support the use of weighted robust standard errors, bootstrap methods, or matching with a naïve standard error (i.e., Greenwood method). The methods considered in the article are illustrated using a real-world example in which we estimate the effect of discharge prescribing of statins on patients hospitalized for acute myocardial infarction.

17.
In this article, we use a latent class model (LCM) with prevalence modeled as a function of covariates to assess diagnostic test accuracy in situations where the true disease status is not observed, but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo expectation–maximization (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate parameters of interest; namely, sensitivity, specificity, and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction of estimated parameters, the missing information principle is applied to adjust information matrix estimates. We compare the adjusted information matrix-based standard error estimates with the bootstrap standard error estimates, both obtained using the fast MCEM algorithm, through an extensive Monte Carlo study. Simulation demonstrates that the adjusted information matrix approach estimates the standard error similarly to the bootstrap methods under certain scenarios. The bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group study of significant cervical lesion diagnosis in women with atypical glandular cells of undetermined significance to compare the diagnostic accuracy of a histology-based evaluation, a carbonic anhydrase-IX biomarker-based test and a human papillomavirus DNA test.

18.
Existing research on mixtures of regression models is limited to directly observed predictors. The estimation of mixtures of regression for measurement error data poses challenges for statisticians. For linear regression models with measurement error data, the naive ordinary least squares method, which directly substitutes the observed surrogates for the unobserved error-prone variables, yields an inconsistent estimate for the regression coefficients. The same inconsistency also affects the naive mixtures of regression estimate, which is based on the traditional maximum likelihood estimator and simply ignores the measurement error. To solve this inconsistency, we propose to use the deconvolution method to estimate the mixture likelihood of the observed surrogates. Then our proposed estimate is found by maximizing the estimated mixture likelihood. In addition, a generalized EM algorithm is also developed to find the estimate. The simulation results demonstrate that the proposed estimation procedures work well and perform much better than the naive estimates.

19.
This paper discusses the bootstrap risk of the linear empirical Bayes estimate of the form θ=Ǎ+B̌x, where x is the current observation, and Ǎ and B̌ are generally functions of the estimates of the prior parameters. The standard error of this risk is developed and ‘computations’ of both the bootstrap risk and its standard error are made.

20.
This paper investigates quantile residual life regression based on semi-competing risk data. Because the terminal event time dependently censors the non-terminal event time, inference on the non-terminal event time is not available without extra assumptions. Therefore, we assume that the non-terminal event time and the terminal event time follow an Archimedean copula. Then, we apply the inverse probability weight technique to construct an estimating equation for the quantile residual life regression coefficients. However, the estimating equation may not be continuous in the coefficients. Thus, we apply the generalized solution approach to overcome this problem. Since the variance of the proposed estimator is difficult to obtain, we use the bootstrap resampling method to estimate it. Simulations show that the proposed method performs well. Finally, we analyze the Bone Marrow Transplant data for illustration.
