首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
We propose a profile conditional likelihood approach to handle missing covariates in the general semiparametric transformation regression model. The method estimates the marginal survival function by the Kaplan-Meier estimator, and then estimates the parameters of the survival model and the covariate distribution from a conditional likelihood, substituting the Kaplan-Meier estimator for the marginal survival function in the conditional likelihood. This method is simpler than full maximum likelihood approaches, and yields consistent and asymptotically normally distributed estimator of the regression parameter when censoring is independent of the covariates. The estimator demonstrates very high relative efficiency in simulations. When compared with complete-case analysis, the proposed estimator can be more efficient when the missing data are missing completely at random and can correct bias when the missing data are missing at random. The potential application of the proposed method to the generalized probit model with missing continuous covariates is also outlined.  相似文献   

2.
We consider parametric regression problems with some covariates missing at random. It is shown that the regression parameter remains identifiable under natural conditions. When the always observed covariates are discrete, we propose a semiparametric maximum likelihood method, which does not require parametric specification of the missing data mechanism or the covariate distribution. The global maximum likelihood estimator (MLE), which maximizes the likelihood over the whole parameter set, is shown to exist under simple conditions. For ease of computation, we also consider a restricted MLE which maximizes the likelihood over covariate distributions supported by the observed values. Under regularity conditions, the two MLEs are asymptotically equivalent and strongly consistent for a class of topologies on the parameter set.  相似文献   

3.
We present three multiple imputation estimates for the Cox model with missing covariates. Two of the suggested estimates are asymptotically equivalent to estimates in the literature when the number of multiple imputations approaches infinity. The third estimate can be implemented using standard software that could handle time-varying covariates. This revised version was published online in July 2006 with corrections to the Cover Date.  相似文献   

4.
Missing covariates data with censored outcomes put a challenge in the analysis of clinical data especially in small sample settings. Multiple imputation (MI) techniques are popularly used to impute missing covariates and the data are then analyzed through methods that can handle censoring. However, techniques based on MI are available to impute censored data also but they are not much in practice. In the present study, we applied a method based on multiple imputation by chained equations to impute missing values of covariates and also to impute censored outcomes using restricted survival time in small sample settings. The complete data were then analyzed using linear regression models. Simulation studies and a real example of CHD data show that the present method produced better estimates and lower standard errors when applied on the data having missing covariate values and censored outcomes than the analysis of the data having censored outcome but excluding cases with missing covariates or the analysis when cases with missing covariate values and censored outcomes were excluded from the data (complete case analysis).  相似文献   

5.
Consider estimation of a population mean of a response variable when the observations are missing at random with respect to the covariate. Two common approaches to imputing the missing values are the nonparametric regression weighting method and the Horvitz-Thompson (HT) inverse weighting approach. The regression approach includes the kernel regression imputation and the nearest neighbor imputation. The HT approach, employing inverse kernel-estimated weights, includes the basic estimator, the ratio estimator and the estimator using inverse kernel-weighted residuals. Asymptotic normality of the nearest neighbor imputation estimators is derived and compared to kernel regression imputation estimator under standard regularity conditions of the regression function and the missing pattern function. A comprehensive simulation study shows that the basic HT estimator is most sensitive to discontinuity in the missing data patterns, and the nearest neighbors estimators can be insensitive to missing data patterns unbalanced with respect to the distribution of the covariate. Empirical studies show that the nearest neighbor imputation method is most effective among these imputation methods for estimating a finite population mean and for classifying the species of the iris flower data.  相似文献   

6.
Biao Zhang 《Statistics》2016,50(5):1173-1194
Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Am Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.  相似文献   

7.
Missing covariate data are common in biomedical studies. In this article, by using the non parametric kernel regression technique, a new imputation approach is developed for the Cox-proportional hazard regression model with missing covariates. This method achieves the same efficiency as the fully augmented weighted estimators (Qi et al. 2005. Journal of the American Statistical Association, 100:1250) and has a simpler form. The asymptotic properties of the proposed estimator are derived and analyzed. The comparisons between the proposed imputation method and several other existing methods are conducted via a number of simulation studies and a mouse leukemia data.  相似文献   

8.
Abstract.  In the analysis of clustered and/or longitudinal data, it is usually desirable to ignore covariate information for other cluster members as well as future covariate information when predicting outcome for a given subject at a given time. This can be accomplished through con-ditional mean models which merely condition on the considered subject's covariate history at each time. Pepe & Anderson (Commun. Stat. Simul. Comput. 23, 1994 , 939) have shown that ordinary generalized estimating equations may yield biased estimates for the parameters in such models, but that valid inferences can be guaranteed by using a diagonal working covariance matrix in the equations. In this paper, we provide insight into the nature of this problem by uncovering substantive data-generating mechanisms under which such biases will result. We then propose a class of asymptotically unbiased estimators for the parameters indexing the suggested conditional mean models. In addition, we provide a representation for the efficient estimator in our class, which attains the semi-parametric efficiency bound under the model, along with an efficient algorithm for calculating it. This algorithm is easy to apply and may realize major efficiency improvements as demonstrated through simulation studies. The results suggest ways to improve the efficiency of inverse-probability-of-treatment estimators which adjust for time-varying confounding, and are used to estimate the effect of discontinuing highly active anti-retroviral therapy (HAART) on viral load in HIV-infected patients.  相似文献   

9.
Ibrahim (1990) used the EM-algorithm to obtain maximum likelihood estimates of the regression parameters in generalized linear models with partially missing covariates. The technique was termed EM by the method of weights. In this paper, we generalize this technique to Cox regression analysis with missing values in the covariates. We specify a full model letting the unobserved covariate values be random and then maximize the observed likelihood. The asymptotic covariance matrix is estimated by the inverse information matrix. The missing data are allowed to be missing at random but also the non-ignorable non-response situation may in principle be considered. Simulation studies indicate that the proposed method is more efficient than the method suggested by Paik & Tsai (1997). We apply the procedure to a clinical trials example with six covariates with three of them having missing values.  相似文献   

10.
In many clinical studies where time to failure is of primary interest, patients may fail or die from one of many causes where failure time can be right censored. In some circumstances, it might also be the case that patients are known to die but the cause of death information is not available for some patients. Under the assumption that cause of death is missing at random, we compare the Goetghebeur and Ryan (1995, Biometrika, 82, 821–833) partial likelihood approach with the Dewanji (1992, Biometrika, 79, 855–857)partial likelihood approach. We show that the estimator for the regression coefficients based on the Dewanji partial likelihood is not only consistent and asymptotically normal, but also semiparametric efficient. While the Goetghebeur and Ryan estimator is more robust than the Dewanji partial likelihood estimator against misspecification of proportional baseline hazards, the Dewanji partial likelihood estimator allows the probability of missing cause of failure to depend on covariate information without the need to model the missingness mechanism. Tests for proportional baseline hazards are also suggested and a robust variance estimator is derived.  相似文献   

11.
In the parametric regression model, the covariate missing problem under missing at random is considered. It is often desirable to use flexible parametric or semiparametric models for the covariate distribution, which can reduce a potential misspecification problem. Recently, a completely nonparametric approach was developed by [H.Y. Chen, Nonparametric and semiparametric models for missing covariates in parameter regression, J. Amer. Statist. Assoc. 99 (2004), pp. 1176–1189; Z. Zhang and H.E. Rockette, On maximum likelihood estimation in parametric regression with missing covariates, J. Statist. Plann. Inference 47 (2005), pp. 206–223]. Although it does not require a model for the covariate distribution or the missing data mechanism, the proposed method assumes that the covariate distribution is supported only by observed values. Consequently, their estimator is a restricted maximum likelihood estimator (MLE) rather than the global MLE. In this article, we show the restricted semiparametric MLE could be very misleading in some cases. We discuss why this problem occurs and suggest an algorithm to obtain the global MLE. Then, we assess the performance of the proposed method via some simulation experiments.  相似文献   

12.
In longitudinal studies, nonlinear mixed-effects models have been widely applied to describe the intra- and the inter-subject variations in data. The inter-subject variation usually receives great attention and it may be partially explained by time-dependent covariates. However, some covariates may be measured with substantial errors and may contain missing values. We proposed a multiple imputation method, implemented by a Markov Chain Monte-Carlo method along with Gibbs sampler, to address the covariate measurement errors and missing data in nonlinear mixed-effects models. The multiple imputation method is illustrated in a real data example. Simulation studies show that the multiple imputation method outperforms the commonly used naive methods.  相似文献   

13.
Murrayand Tsiatis (1996) described a weighted survival estimate thatincorporates prognostic time-dependent covariate informationto increase the efficiency of estimation. We propose a test statisticbased on the statistic of Pepe and Fleming (1989, 1991) thatincorporates these weighted survival estimates. As in Pepe andFleming, the test is an integrated weighted difference of twoestimated survival curves. This test has been shown to be effectiveat detecting survival differences in crossing hazards settingswhere the logrank test performs poorly. This method uses stratifiedlongitudinal covariate information to get more precise estimatesof the underlying survival curves when there is censored informationand this leads to more powerful tests. Another important featureof the test is that it remains valid when informative censoringis captured by the incorporated covariate. In this case, thePepe-Fleming statistic is known to be biased and should not beused. These methods could be useful in clinical trials with heavycensoring that include collection over time of covariates, suchas laboratory measurements, that are prognostic of subsequentsurvival or capture information related to censoring.  相似文献   

14.
Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution.  相似文献   

15.
By employing all the observed information and the optimal augmentation term, we propose an augmented inverse probability weighted fractional imputation method (AFI) to handle covariates missing at random in quantile regression. Compared with the existing completely case analysis, inverse probability weighting, multiple imputation and fractional imputation based on quantile regression model with missing covarites, we carry out simulation study to investigate its performance in estimation accuracy and efficiency, computational efficiency and estimation robustness. We also talk about the influence of imputation replicates in our AFI. Finally, we apply our methodology to part of the National Health and Nutrition Examination Survey data.  相似文献   

16.
This article is concerned with partially non linear models when the response variables are missing at random. We examine the empirical likelihood (EL) ratio statistics for unknown parameter in non linear function based on complete-case data, semiparametric regression imputation, and bias-corrected imputation. All the proposed statistics are proven to be asymptotically chi-square distribution under some suitable conditions. Simulation experiments are conducted to compare the finite sample behaviors of the proposed approaches in terms of confidence intervals. It showed that the EL method has advantage compared to the conventional method, and moreover, the imputation technique performs better than the complete-case data.  相似文献   

17.
Covariate measurement error problems have been extensively studied in the context of right-censored data but less so for interval-censored data. Motivated by the AIDS Clinical Trial Group 175 study, where the occurrence time of AIDS was examined only at intermittent clinic visits and the baseline covariate CD4 count was measured with error, we describe a semiparametric maximum likelihood method for analyzing mixed case interval-censored data with mismeasured covariates under the proportional hazards model. We show that the estimator of the regression coefficient is asymptotically normal and efficient and provide a very stable and efficient algorithm for computing the estimators. We evaluate the method through simulation studies and illustrate it with AIDS data.  相似文献   

18.
This article develops estimators for unconditional quantile treatment effects when the treatment selection is endogenous. We use an instrumental variable (IV) to solve for the endogeneity of the binary treatment variable. Identification is based on a monotonicity assumption in the treatment choice equation and is achieved without any functional form restriction. We propose a weighting estimator that is extremely simple to implement. This estimator is root n consistent, asymptotically normally distributed, and its variance attains the semiparametric efficiency bound. We also show that including covariates in the estimation is not only necessary for consistency when the IV is itself confounded but also for efficiency when the instrument is valid unconditionally. An application of the suggested methods to the effects of fertility on the family income distribution illustrates their usefulness. Supplementary materials for this article are available online.  相似文献   

19.
We consider statistical inference of unknown parameters in estimating equations (EEs) when some covariates have nonignorably missing values, which is quite common in practice but has rarely been discussed in the literature. When an instrument, a fully observed covariate vector that helps identifying parameters under nonignorable missingness, is available, the conditional distribution of the missing covariates given other covariates can be estimated by the pseudolikelihood method of Zhao and Shao [(2015), ‘Semiparametric pseudo likelihoods in generalised linear models with nonignorable missing data’, Journal of the American Statistical Association, 110, 1577–1590)] and be used to construct unbiased EEs. These modified EEs then constitute a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of EEs used in practice. It is semiparametric since no parametric model for the propensity of missing covariate data is assumed. Asymptotic properties of the proposed estimator and the empirical likelihood ratio test statistic are derived. Some simulation results and a real data analysis are presented for illustration.  相似文献   

20.
Covariate measurement error problems have been extensively studied in the context of right‐censored data but less so for current status data. Motivated by the zebrafish basal cell carcinoma (BCC) study, where the occurrence time of BCC was only known to lie before or after a sacrifice time and where the covariate (Sonic hedgehog expression) was measured with error, the authors describe a semiparametric maximum likelihood method for analyzing current status data with mismeasured covariates under the proportional hazards model. They show that the estimator of the regression coefficient is asymptotically normal and efficient and that the profile likelihood ratio test is asymptotically Chi‐squared. They also provide an easily implemented algorithm for computing the estimators. They evaluate their method through simulation studies, and illustrate it with a real data example. The Canadian Journal of Statistics 39: 73–88; 2011 © 2011 Statistical Society of Canada  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号