Similar articles
 20 similar articles found
1.
Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult and/or restrictive. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence‐related issues, and simulation results are presented in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence. Imputing missing data for models with latent variables under latent‐dependent missingness without specifying a full joint model.

2.
In the parametric regression model, the covariate missing problem under missing at random is considered. It is often desirable to use flexible parametric or semiparametric models for the covariate distribution, which can reduce a potential misspecification problem. Recently, a completely nonparametric approach was developed by [H.Y. Chen, Nonparametric and semiparametric models for missing covariates in parameter regression, J. Amer. Statist. Assoc. 99 (2004), pp. 1176–1189; Z. Zhang and H.E. Rockette, On maximum likelihood estimation in parametric regression with missing covariates, J. Statist. Plann. Inference 47 (2005), pp. 206–223]. Although it does not require a model for the covariate distribution or the missing data mechanism, the proposed method assumes that the covariate distribution is supported only by observed values. Consequently, their estimator is a restricted maximum likelihood estimator (MLE) rather than the global MLE. In this article, we show the restricted semiparametric MLE could be very misleading in some cases. We discuss why this problem occurs and suggest an algorithm to obtain the global MLE. Then, we assess the performance of the proposed method via some simulation experiments.

3.
In many clinical studies where time to failure is of primary interest, patients may fail or die from one of many causes where failure time can be right censored. In some circumstances, it might also be the case that patients are known to die but the cause of death information is not available for some patients. Under the assumption that cause of death is missing at random, we compare the Goetghebeur and Ryan (1995, Biometrika, 82, 821–833) partial likelihood approach with the Dewanji (1992, Biometrika, 79, 855–857) partial likelihood approach. We show that the estimator for the regression coefficients based on the Dewanji partial likelihood is not only consistent and asymptotically normal, but also semiparametric efficient. While the Goetghebeur and Ryan estimator is more robust than the Dewanji partial likelihood estimator against misspecification of proportional baseline hazards, the Dewanji partial likelihood estimator allows the probability of missing cause of failure to depend on covariate information without the need to model the missingness mechanism. Tests for proportional baseline hazards are also suggested and a robust variance estimator is derived.

4.
Handling data with a nonignorable missingness mechanism is still a challenging problem in statistics. In this paper, we develop a fully Bayesian adaptive Lasso approach for quantile regression models with nonignorably missing response data, where the nonignorable missingness mechanism is specified by a logistic regression model. The proposed method extends the Bayesian Lasso by allowing different penalization parameters for different regression coefficients. Furthermore, a hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is implemented to simulate the parameters from their posterior distributions, mainly the regression coefficients, the shrinkage coefficients, and the parameters of the nonignorable missingness model. Finally, some simulation studies and a real example are used to illustrate the proposed methodology.

5.
In nonignorable missing response problems, we study a semiparametric model with an unspecified missingness mechanism model and an exponential family model for the response conditional density. Even though existing methods are available to estimate the parameters in the exponential family, nonparametric estimation or testing of the missingness mechanism model remains an open problem. By defining a “synthesis” density involving the unknown missingness mechanism model and the known baseline “carrier” density in the exponential family model, we treat this “synthesis” density as a legitimate one with a biased sampling version. We develop maximum pseudo likelihood estimation procedures, and the resultant estimators are consistent and asymptotically normal. Since the “synthesis” cumulative distribution is a functional of the missingness mechanism model and the known carrier density, the proposed method can be used to test the correctness of the missingness mechanism model nonparametrically and indirectly. Simulation studies and a real example demonstrate that the proposed methods perform very well.

6.
In this article, a finite mixture model of hurdle Poisson distributions with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data are assumed to be missing not at random (MNAR)/nonignorable nonresponse (NINR), and the corresponding missingness mechanism is modeled through probit regression. To improve the algorithm efficiency, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on the Bayesian information criterion (BIC) is also proposed to compare models with different numbers of components in the presence of missing values. The considered model is a general framework that captures the important characteristics of count data analysis, such as zero inflation/deflation, heterogeneity, and missingness, providing more insight into the data features and allowing dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.

7.
The generalized method of moments (GMM) and empirical likelihood (EL) are popular methods for combining sample and auxiliary information. These methods are used in very diverse fields of research, where competing theories often suggest variables satisfying different moment conditions. Results in the literature have shown that the efficient‐GMM (GMME) and maximum empirical likelihood (MEL) estimators have the same asymptotic distribution to order $n^{-1/2}$ and that both estimators are asymptotically semiparametric efficient. In this paper, we demonstrate that when data are missing at random from the sample, the utilization of some well‐known missing‐data handling approaches proposed in the literature can yield GMME and MEL estimators with nonidentical properties; in particular, it is shown that the GMME estimator is semiparametric efficient under all the missing‐data handling approaches considered but that the MEL estimator is not always efficient. A thorough examination of the reason for the nonequivalence of the two estimators is presented. A particularly strong feature of our analysis is that we do not assume smoothness in the underlying moment conditions. Our results are thus relevant to situations involving nonsmooth estimating functions, including quantile and rank regressions, robust estimation, the estimation of receiver operating characteristic (ROC) curves, and so on.

8.
In this paper, we propose an empirical likelihood-based weighted estimator of the regression parameter in a quantile regression model with nonignorable missing covariates. The proposed estimator is computationally simple and achieves semiparametric efficiency if the probability of missingness on the fully observed variables is correctly specified. The efficiency gain of the proposed estimator over the complete-case-analysis estimator is quantified theoretically and illustrated via simulation and a real data application.

9.
The authors propose two tests, one parametric and the other semiparametric, for testing bias of estimating equations in weighted regression with partially missing covariates when the primary regression model is correctly specified. More generally, the proposed tests may be thought of as a diagnostic tool for the combined package of the primary regression model and the missingness assumptions. The asymptotic null distributions of the two test statistics are derived under the assumption of missingness at random for the partially missing covariates. A small scale simulation study completes the work.

10.
It is quite a challenge to develop model‐free feature screening approaches for missing response problems because the existing standard missing data analysis methods cannot be applied directly to the high‐dimensional case. This paper develops some novel methods by borrowing information from missingness indicators such that any feature screening procedure for ultrahigh‐dimensional covariates with full data can be applied to the missing response case. The first method is the so‐called missing indicator imputation screening, which is developed by proving that the set of the active predictors of interest for the response is a subset of the active predictors for the product of the response and missingness indicator under some mild conditions. As an alternative, another method called the Venn diagram‐based approach is also developed. The sure screening property is proven for both methods. It is shown that the complete case analysis can also keep the sure screening property of any feature screening approach with sure screening property.
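
The subset result stated above suggests screening on the fully observed product of the response and the missingness indicator. A minimal sketch of that idea follows, assuming plain marginal Pearson correlation as the full-data screening rule; the abstract does not say which screener the authors use, and the names `pearson`, `miis_rank`, and `simulate`, as well as the simulated data, are hypothetical.

```python
import math
import random


def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)


def miis_rank(X, y, delta, top_k):
    """Rank predictors by |corr(X_j, y * delta)| and return the top_k indices.

    y must hold a numeric placeholder (e.g. 0.0) where delta == 0; the
    product y * delta is zero there whatever the placeholder is.
    """
    surrogate = [yi * di for yi, di in zip(y, delta)]
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        scores.append((abs(pearson(column, surrogate)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:top_k]]


def simulate(n=500, p=20, seed=0):
    """Toy data (an assumption): y depends on X_0, missingness on X_1."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] + rng.gauss(0, 1) for row in X]
    delta = [1 if rng.random() < 1 / (1 + math.exp(-row[1])) else 0
             for row in X]
    # Replace unobserved responses with a placeholder.
    y_obs = [yi if di else 0.0 for yi, di in zip(y, delta)]
    return X, y_obs, delta


X, y_obs, delta = simulate()
top = miis_rank(X, y_obs, delta, top_k=3)
print(top)
```

In this toy setting the active predictor for the response (index 0) should appear among the retained indices even though roughly half of the responses are never observed.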

11.
In this article, the authors consider a semiparametric additive hazards regression model for right‐censored data that allows some censoring indicators to be missing at random. They develop a class of estimating equations and use an inverse probability weighted approach to estimate the regression parameters. Nonparametric smoothing techniques are employed to estimate the probability of non‐missingness and the conditional probability of an uncensored observation. The asymptotic properties of the resulting estimators are derived. Simulation studies show that the proposed estimators perform well. They motivate and illustrate their methods with data from a brain cancer clinical trial. The Canadian Journal of Statistics 38: 333–351; 2010 © 2010 Statistical Society of Canada
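
The inverse probability weighting idea mentioned above can be illustrated in its simplest form, estimating a mean rather than hazard regression parameters. This is only a sketch under strong assumptions: the observation probabilities are taken as known, whereas the abstract describes estimating them by nonparametric smoothing, and the function names and toy data are hypothetical.

```python
def ipw_mean(values, observed, prob_observed):
    """Normalised (Hajek-type) inverse probability weighted mean.

    Each observed value is weighted by 1 / P(observed), so units that
    resemble the missing ones count more.
    """
    num = 0.0
    den = 0.0
    for v, r, p in zip(values, observed, prob_observed):
        if r:
            w = 1.0 / p
            num += w * v
            den += w
    return num / den


def complete_case_mean(values, observed):
    """Unweighted mean of the observed values only."""
    kept = [v for v, r in zip(values, observed) if r]
    return sum(kept) / len(kept)


# Toy data: the full-data mean is (4 * 1 + 4 * 5) / 8 = 3.0, but half of
# the large values are missing, each observed with probability 0.5.
values = [1, 1, 1, 1, 5, 5, None, None]
observed = [1, 1, 1, 1, 1, 1, 0, 0]
prob_observed = [1.0, 1.0, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5]

print(ipw_mean(values, observed, prob_observed))
print(complete_case_mean(values, observed))
```

Here the weighted estimate recovers the full-data mean, while the complete-case mean is pulled toward the group that is always observed.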

12.
We consider statistical inference of unknown parameters in estimating equations (EEs) when some covariates have nonignorably missing values, which is quite common in practice but has rarely been discussed in the literature. When an instrument, a fully observed covariate vector that helps identify parameters under nonignorable missingness, is available, the conditional distribution of the missing covariates given other covariates can be estimated by the pseudolikelihood method of Zhao and Shao [(2015), ‘Semiparametric pseudo likelihoods in generalised linear models with nonignorable missing data’, Journal of the American Statistical Association, 110, 1577–1590] and be used to construct unbiased EEs. These modified EEs then constitute a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of EEs used in practice. It is semiparametric since no parametric model for the propensity of missing covariate data is assumed. Asymptotic properties of the proposed estimator and the empirical likelihood ratio test statistic are derived. Some simulation results and a real data analysis are presented for illustration.

13.
Quantile regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are subject to missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrate our proposed methodologies.

14.
Missing covariate data are a common issue in generalized linear models (GLMs). A model-based procedure arising from properly specifying joint models for both the partially observed covariates and the corresponding missing indicator variables represents a sound and flexible methodology, which lends itself to maximum likelihood estimation as the likelihood function is available in computable form. In this paper, a novel model-based methodology is proposed for the regression analysis of GLMs when the partially observed covariates are categorical. Pair-copula constructions are used as graphical tools in order to facilitate the specification of the high-dimensional probability distributions of the underlying missingness components. The model parameters are estimated by maximizing the weighted log-likelihood function by using an EM algorithm. In order to compare the performance of the proposed methodology with other well-established approaches, which include complete-case analysis and multiple imputation, several simulation experiments of Binomial, Poisson and Normal regressions are carried out under both missing at random and missing not at random scenarios. The methods are illustrated by modeling data from a stage III melanoma clinical trial. The results show that the methodology is rather robust and flexible, representing a competitive alternative to traditional techniques.

15.
Tianqing Liu, Statistics, 2016, 50(1): 89–113
This paper proposes an empirical likelihood-based weighted (ELW) quantile regression approach for estimating the conditional quantiles when some covariates are missing at random. The proposed ELW estimator is computationally simple and achieves semiparametric efficiency if the probability of missingness is correctly specified. The limiting covariance matrix of the ELW estimator can be estimated by a resampling technique, which does not involve nonparametric density estimation or numerical derivatives. Simulation results show that the ELW method works remarkably well in finite samples. A real data example is used to illustrate the proposed ELW method.

16.
Nonresponse is a very common phenomenon in survey sampling. Nonignorable nonresponse, that is, a response mechanism that depends on the values of the variable having nonresponse, is the most difficult type of nonresponse to handle. This article develops a robust estimation approach to estimating equations (EEs) by incorporating the modelling of nonignorably missing data, the generalized method of moments (GMM), and the imputation of EEs via the observed data rather than the imputed missing values when some responses are subject to nonignorable missingness. Based on a particular semiparametric logistic model for nonignorable missing response, this paper proposes modified EEs to calculate the conditional expectation under nonignorably missing data. We can then apply the GMM to infer the parameters. The advantage of our method is that it replaces nonparametric kernel smoothing with a parametric sampling importance resampling (SIR) procedure, which avoids the problems kernel smoothing encounters with high-dimensional covariates. Simulations show the proposed method to be more robust than some current approaches.

17.
The paper considers canonical link generalized linear models with stratum-specific nuisance intercepts and missing covariate data. This family includes the conditional logistic regression model. Existing methods for this problem, each of which uses a conditioning argument to eliminate the nuisance intercept, model either the missing covariate data or the missingness process. The paper compares these methods under a common likelihood framework. The semiparametric efficient estimator is identified, and a new estimator, which reduces dependence on the model for the missing covariate, is proposed. A simulation study compares the methods with respect to efficiency and robustness to model misspecification.

18.
Missing data are present in almost all statistical analyses. In simple paired design tests, when a subject has one of the involved variables missing (the so-called partially overlapping samples scheme), that subject is usually discarded from the analysis. The lack of consistency between the information reported in the univariate and multivariate analyses is, perhaps, the main consequence. Although randomness of the missing mechanism (missingness completely at random) is a usual and needed assumption for this particular situation, the presence of missing data could lead to serious inconsistencies in the reported conclusions. In this paper, the authors develop a simple and direct procedure which allows using the whole available information in order to perform paired tests. In particular, the proposed methodology is applied to check the equality among the means from two paired samples. In addition, the use of two different resampling techniques is also explored. Finally, real-world data are analysed.

19.
Matched case–control designs are commonly used in epidemiological studies for estimating the effect of exposure variables on the risk of a disease by controlling the effect of confounding variables. Due to the retrospective nature of the study, information on a covariate could be missing for some subjects. A straightforward application of the conditional logistic likelihood for analyzing matched case–control data with the partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator are derived when the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method has been applied to analyze two matched case–control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada

20.
For estimation with missing data, a crucial step is to determine whether the data are missing completely at random (MCAR), in which case a complete‐case analysis would suffice. Most existing tests for MCAR do not provide a method for subsequent estimation once MCAR is rejected. In the setting of estimating means, we propose a unified approach for testing MCAR and the subsequent estimation. Upon rejecting MCAR, the same set of weights used for testing can then be used for estimation. The resulting estimators are consistent if the missingness of each response variable depends only on a set of fully observed auxiliary variables and the true outcome regression model is among the user‐specified functions for deriving the weights. The proposed method is based on the calibration idea from the survey sampling literature and empirical likelihood theory.
