首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Modern statistical methods using incomplete data have been increasingly applied in a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used in evaluating diagnostic tests or biomarkers in medical research, has also been increasingly popular problem in both its development and application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. Particularly, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms.  相似文献   

2.
3.
A popular choice when analyzing ordinal data is to consider the cumulative proportional odds model to relate the marginal probabilities of the ordinal outcome to a set of covariates. However, application of this model relies on the condition of identical cumulative odds ratios across the cut-offs of the ordinal outcome; the well-known proportional odds assumption. This paper focuses on the assessment of this assumption while accounting for repeated and missing data. In this respect, we develop a statistical method built on multiple imputation (MI) based on generalized estimating equations that allows to test the proportionality assumption under the missing at random setting. The performance of the proposed method is evaluated for two MI algorithms for incomplete longitudinal ordinal data. The impact of both MI methods is compared with respect to the type I error rate and the power for situations covering various numbers of categories of the ordinal outcome, sample sizes, rates of missingness, well-balanced and skewed data. The comparison of both MI methods with the complete-case analysis is also provided. We illustrate the use of the proposed methods on a quality of life data from a cancer clinical trial.  相似文献   

4.
Linear increments (LI) are used to analyse repeated outcome data with missing values. Previously, two LI methods have been proposed, one allowing non‐monotone missingness but not independent measurement error and one allowing independent measurement error but only monotone missingness. In both, it was suggested that the expected increment could depend on current outcome. We show that LI can allow non‐monotone missingness and either independent measurement error of unknown variance or dependence of expected increment on current outcome but not both. A popular alternative to LI is a multivariate normal model ignoring the missingness pattern. This gives consistent estimation when data are normally distributed and missing at random (MAR). We clarify the relation between MAR and the assumptions of LI and show that for continuous outcomes multivariate normal estimators are also consistent under (non‐MAR and non‐normal) assumptions not much stronger than those of LI. Moreover, when missingness is non‐monotone, they are typically more efficient.  相似文献   

5.
In this paper we propose a latent class based multiple imputation approach for analyzing missing categorical covariate data in a highly stratified data model. In this approach, we impute the missing data assuming a latent class imputation model and we use likelihood methods to analyze the imputed data. Via extensive simulations, we study its statistical properties and make comparisons with complete case analysis, multiple imputation, saturated log-linear multiple imputation and the Expectation–Maximization approach under seven missing data mechanisms (including missing completely at random, missing at random and not missing at random). These methods are compared with respect to bias, asymptotic standard error, type I error, and 95% coverage probabilities of parameter estimates. Simulations show that, under many missingness scenarios, latent class multiple imputation performs favorably when jointly considering these criteria. A data example from a matched case–control study of the association between multiple myeloma and polymorphisms of the Inter-Leukin 6 genes is considered.  相似文献   

6.
Inverse probability weighting (IPW) can deal with confounding in non randomized studies. The inverse weights are probabilities of treatment assignment (propensity scores), estimated by regressing assignment on predictors. Problems arise if predictors can be missing. Solutions previously proposed include assuming assignment depends only on observed predictors and multiple imputation (MI) of missing predictors. For the MI approach, it was recommended that missingness indicators be used with the other predictors. We determine when the two MI approaches, (with/without missingness indicators) yield consistent estimators and compare their efficiencies.We find that, although including indicators can reduce bias when predictors are missing not at random, it can induce bias when they are missing at random. We propose a consistent variance estimator and investigate performance of the simpler Rubin’s Rules variance estimator. In simulations we find both estimators perform well. IPW is also used to correct bias when an analysis model is fitted to incomplete data by restricting to complete cases. Here, weights are inverse probabilities of being a complete case. We explain how the same MI methods can be used in this situation to deal with missing predictors in the weight model, and illustrate this approach using data from the National Child Development Survey.  相似文献   

7.
The analysis of incomplete contingency tables is a practical and an interesting problem. In this paper, we provide characterizations for the various missing mechanisms of a variable in terms of response and non-response odds for two and three dimensional incomplete tables. Log-linear parametrization and some distinctive properties of the missing data models for the above tables are discussed. All possible cases in which data on one, two or all variables may be missing are considered. We study the missingness of each variable in a model, which is more insightful for analyzing cross-classified data than the missingness of the outcome vector. For sensitivity analysis of the incomplete tables, we propose easily verifiable procedures to evaluate the missing at random (MAR), missing completely at random (MCAR) and not missing at random (NMAR) assumptions of the missing data models. These methods depend only on joint and marginal odds computed from fully and partially observed counts in the tables, respectively. Finally, some real-life datasets are analyzed to illustrate our results, which are confirmed based on simulation studies.  相似文献   

8.
Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult and/or restrictive. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence‐related issues, and simulation results are presented in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence. Imputing missing data for models with latent variables under latent‐dependent missingness without specifying a full joint model.  相似文献   

9.
Patients often discontinue from a clinical trial because their health condition is not improving or they cannot tolerate the assigned treatment. Consequently, the observed clinical outcomes in the trial are likely better on average than if every patient had completed the trial. If these differences between trial completers and non-completers cannot be explained by the observed data, then the study outcomes are missing not at random (MNAR). One way to overcome this problem—the trimmed means approach for missing data due to study discontinuation—sets missing values as the worst observed outcome and then trims away a fraction of the distribution from each treatment arm before calculating differences in treatment efficacy (Permutt T, Li F. Trimmed means for symptom trials with dropouts. Pharm Stat. 2017;16(1):20–28). In this paper, we derive sufficient and necessary conditions for when this approach can identify the average population treatment effect. Simulation studies show the trimmed means approach's ability to effectively estimate treatment efficacy when data are MNAR and missingness due to study discontinuation is strongly associated with an unfavorable outcome, but trimmed means fail when data are missing at random. If the reasons for study discontinuation in a clinical trial are known, analysts can improve estimates with a combination of multiple imputation and the trimmed means approach when the assumptions of each hold. We compare the methodology to existing approaches using data from a clinical trial for chronic pain. An R package trim implements the method. When the assumptions are justifiable, using trimmed means can help identify treatment effects notwithstanding MNAR data.  相似文献   

10.
In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementation of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluation of existing implementations have frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual – an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect.  相似文献   

11.
Missing data are a prevalent and widespread data analytic issue and previous studies have performed simulations to compare the performance of missing data methods in various contexts and for various models; however, one such context that has yet to receive much attention in the literature is the handling of missing data with small samples, particularly when the missingness is arbitrary. Prior studies have either compared methods for small samples with monotone missingness commonly found in longitudinal studies or have investigated the performance of a single method to handle arbitrary missingness with small samples but studies have yet to compare the relative performance of commonly implemented missing data methods for small samples with arbitrary missingness. This study conducts a simulation study to compare and assess the small sample performance of maximum likelihood, listwise deletion, joint multiple imputation, and fully conditional specification multiple imputation for a single-level regression model with a continuous outcome. Results showed that, provided assumptions are met, joint multiple imputation unanimously performed best of the methods examined in the conditions under study.  相似文献   

12.
When modeling multilevel data, it is important to accurately represent the interdependence of observations within clusters. Ignoring data clustering may result in parameter misestimation. However, it is not well established to what degree parameter estimates are affected by model misspecification when applying missing data techniques (MDTs) to incomplete multilevel data. We compare the performance of three MDTs with incomplete hierarchical data. We consider the impact of imputation model misspecification on the quality of parameter estimates by employing multiple imputation under assumptions of a normal model (MI/NM) with two-level cross-sectional data when values are missing at random on the dependent variable at rates of 10%, 30%, and 50%. Five criteria are used to compare estimates from MI/NM to estimates from MI assuming a linear mixed model (MI/LMM) and maximum likelihood estimation to the same incomplete data sets. With 10% missing data (MD), techniques performed similarly for fixed-effects estimates, but variance components were biased with MI/NM. Effects of model misspecification worsened at higher rates of MD, with the hierarchical structure of the data markedly underrepresented by biased variance component estimates. MI/LMM and maximum likelihood provided generally accurate and unbiased parameter estimates but performance was negatively affected by increased rates of MD.  相似文献   

13.
This paper develops a test for comparing treatment effects when observations are missing at random for repeated measures data on independent subjects. It is assumed that missingness at any occasion follows a Bernoulli distribution. It is shown that the distribution of the vector of linear rank statistics depends on the unknown parameters of the probability law that governs missingness, which is absent in the existing conditional methods employing rank statistics. This dependence is through the variance–covariance matrix of the vector of linear ranks. The test statistic is a quadratic form in the linear rank statistics when the variance–covariance matrix is estimated. The limiting distribution of the test statistic is derived under the null hypothesis. Several methods of estimating the unknown components of the variance–covariance matrix are considered. The estimate that produces stable empirical Type I error rate while maintaining the highest power among the competing tests is recommended for implementation in practice. Simulation studies are also presented to show the advantage of the proposed test over other rank-based tests that do not account for the randomness in the missing data pattern. Our method is shown to have the highest power while also maintaining near-nominal Type I error rates. Our results clearly illustrate that even for an ignorable missingness mechanism, the randomness in the pattern of missingness cannot be ignored. A real data example is presented to highlight the effectiveness of the proposed method.  相似文献   

14.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied and a Monte Carlo EM (MCEM) algorithm with penalized-spline (P-spline) technique is developed to estimate the regression coefficients and nonparametric function, respectively. As classical model selection procedures such as Akaike's information criterion become invalid for our considered models with incomplete data, some new model selection criterions for GPLMs with missing covariates are proposed under two different missingness mechanism, say, missing at random (MAR) and missing not at random (MNAR). The most attractive point of our method is that it is rather general and can be extended to various situations with missing observations based on EM algorithm, especially when no missing data involved, our new model selection criterions are reduced to classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also can compare missing data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criterions are investigated. A simulation study and a real example are used to illustrate the proposed methodology.  相似文献   

15.
Summary.  In longitudinal studies, missingness of data is often an unavoidable problem. Estimators from the linear mixed effects model assume that missing data are missing at random. However, estimators are biased when this assumption is not met. In the paper, theoretical results for the asymptotic bias are established under non-ignorable drop-out, drop-in and other missing data patterns. The asymptotic bias is large when the drop-out subjects have only one or no observation, especially for slope-related parameters of the linear mixed effects model. In the drop-in case, intercept-related parameter estimators show substantial asymptotic bias when subjects enter late in the study. Eight other missing data patterns are considered and these produce asymptotic biases of a variety of magnitudes.  相似文献   

16.
Summary.  In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.  相似文献   

17.
This work provides a set of macros performed with SAS (Statistical Analysis System) for Windows, which can be used to fit conditional models under intermittent missingness in longitudinal data. A formalized transition model, including random effects for individuals and measurement error, is presented. Model fitting is based on the missing completely at random or missing at random assumptions, and the separability condition. The problem translates to maximization of the marginal observed data density only, which for Gaussian data is again Gaussian, meaning that the likelihood can be expressed in terms of the mean and covariance matrix of the observed data vector. A simulation study is presented and misspecification issues are considered. A practical application is also given, where conditional models are fitted to the data from a clinical trial that assessed the effect of a Cuban medicine on a disease of the respiratory system.  相似文献   

18.
In this paper, a simulation study is conducted to systematically investigate the impact of dichotomizing longitudinal continuous outcome variables under various types of missing data mechanisms. Generalized linear models (GLM) with standard generalized estimating equations (GEE) are widely used for longitudinal outcome analysis, but these semi‐parametric approaches are only valid under missing data completely at random (MCAR). Alternatively, weighted GEE (WGEE) and multiple imputation GEE (MI‐GEE) were developed to ensure validity under missing at random (MAR). Using a simulation study, the performance of standard GEE, WGEE and MI‐GEE on incomplete longitudinal dichotomized outcome analysis is evaluated. For comparisons, likelihood‐based linear mixed effects models (LMM) are used for incomplete longitudinal original continuous outcome analysis. Focusing on dichotomized outcome analysis, MI‐GEE with original continuous missing data imputation procedure provides well controlled test sizes and more stable power estimates compared with any other GEE‐based approaches. It is also shown that dichotomizing longitudinal continuous outcome will result in substantial loss of power compared with LMM. Copyright © 2009 John Wiley & Sons, Ltd.  相似文献   

19.
Existence of missing values is an inseparable part of longitudinal studies in epidemiology, medical and clinical studies. Usually researchers, for simplicity, ignore the missingness mechanism while, ignoring a not at random mechanism may lead to misleading results. In this paper, we use a Bayesian paradigm for fitting selection model of Heckman, which allows the non-ignorable missingness for longitudinal data. Also, we use reversible-jump Markov chain Monte Carlo to allow the model to choose between non-ignorable and ignorable structures for missingness mechanism, and show how the selection can be incorporated. Some simulation studies are performed for illustration of the proposed approach. The approach is also used for analyzing two real data sets.  相似文献   

20.
Summary.  In longitudinal studies missing data are the rule not the exception. We consider the analysis of longitudinal binary data with non-monotone missingness that is thought to be non-ignorable. In this setting a full likelihood approach is complicated algebraically and can be computationally prohibitive when there are many measurement occasions. We propose a 'protective' estimator that assumes that the probability that a response is missing at any occasion depends, in a completely unspecified way, on the value of that variable alone. Relying on this 'protectiveness' assumption, we describe a pseudolikelihood estimator of the regression parameters under non-ignorable missingness, without having to model the missing data mechanism directly. The method proposed is applied to CD4 cell count data from two longitudinal clinical trials of patients infected with the human immunodeficiency virus.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号