首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Summary.  In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.  相似文献   

2.
Summary. Missing observations are a common problem that complicate the analysis of clustered data. In the Connecticut child surveys of childhood psychopathology, it was possible to identify reasons why outcomes were not observed. Of note, some of these causes of missingness may be assumed to be ignorable , whereas others may be non-ignorable . We consider logistic regression models for incomplete bivariate binary outcomes and propose mixture models that permit estimation assuming that there are two distinct types of missingness mechanisms: one that is ignorable; the other non-ignorable. A feature of the mixture modelling approach is that additional analyses to assess the sensitivity to assumptions about the missingness are relatively straightforward to incorporate. The methods were developed for analysing data from the Connecticut child surveys, where there are missing informant reports of child psychopathology and different reasons for missingness can be distinguished.  相似文献   

3.
Summary.  The paper studies the non-response process in a long-term study of neurotic dis-order by comparing the analysis based on the responses that were collected by the established practice of interviewing the subjects, at dates arranged in advance (appointments), with the analysis of the nearly complete set of responses that were collected by an extensive effort that involved attempts to interview without seeking a prior agreement. The method of multiple imputation is applied, and its properties are explored in a setting that is not perfectly suited for its application: a relatively small sample size, ordinal score outcomes and the likelihood that the outcomes are missing not at random.  相似文献   

4.
5.
It is quite a challenge to develop model‐free feature screening approaches for missing response problems because the existing standard missing data analysis methods cannot be applied directly to high dimensional case. This paper develops some novel methods by borrowing information of missingness indicators such that any feature screening procedures for ultrahigh‐dimensional covariates with full data can be applied to missing response case. The first method is the so‐called missing indicator imputation screening, which is developed by proving that the set of the active predictors of interest for the response is a subset of the active predictors for the product of the response and missingness indicator under some mild conditions. As an alternative, another method called Venn diagram‐based approach is also developed. The sure screening property is proven for both methods. It is shown that the complete case analysis can also keep the sure screening property of any feature screening approach with sure screening property.  相似文献   

6.
Linear increments (LI) are used to analyse repeated outcome data with missing values. Previously, two LI methods have been proposed, one allowing non‐monotone missingness but not independent measurement error and one allowing independent measurement error but only monotone missingness. In both, it was suggested that the expected increment could depend on current outcome. We show that LI can allow non‐monotone missingness and either independent measurement error of unknown variance or dependence of expected increment on current outcome but not both. A popular alternative to LI is a multivariate normal model ignoring the missingness pattern. This gives consistent estimation when data are normally distributed and missing at random (MAR). We clarify the relation between MAR and the assumptions of LI and show that for continuous outcomes multivariate normal estimators are also consistent under (non‐MAR and non‐normal) assumptions not much stronger than those of LI. Moreover, when missingness is non‐monotone, they are typically more efficient.  相似文献   

7.
Comparison of groups in longitudinal studies is often conducted using the area under the outcome versus time curve. However, outcomes may be subject to censoring due to a limit of detection and specific methods that take informative missingness into account need to be applied. In this article, we present a unified model‐based method that accounts for both the within‐subject variability in the estimation of the area under the curve as well as the missingness mechanism in the event of censoring. Simulation results demonstrate that our proposed method has a significant advantage over traditionally implemented methods with regards to its inferential properties. A working example from an AIDS study is presented to demonstrate the applicability of our approach. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

8.
We study bias arising from rounding categorical variables following multivariate normal (MVN) imputation. This task has been well studied for binary variables, but not for more general categorical variables. Three methods that assign imputed values to categories based on fixed reference points are compared using 25 specific scenarios covering variables with k=3, …, 7 categories, and five distributional shapes, and for each k=3, …, 7, we examine the distribution of bias arising over 100,000 distributions drawn from a symmetric Dirichlet distribution. We observed, on both empirical and theoretical grounds, that one method (projected-distance-based rounding) is superior to the other two methods, and that the risk of invalid inference with the best method may be too high at sample sizes n≥150 at 50% missingness, n≥250 at 30% missingness and n≥1500 at 10% missingness. Therefore, these methods are generally unsatisfactory for rounding categorical variables (with up to seven categories) following MVN imputation.  相似文献   

9.
ABSTRACT

In this article, a finite mixture model of hurdle Poisson distribution with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data is assumed to be missing not at random (MNAR)/non ignorable missing (NINR) and the corresponding missingness mechanism is modeled through probit regression. To improve the algorithm efficiency, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on Bayesian information criterion (BIC) is also proposed to compare models with different number of components with missing values. The considered model is a general model framework and it captures the important characteristics of count data analysis such as zero inflation/deflation, heterogeneity as well as missingness, providing us with more insight into the data feature and allowing for dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.  相似文献   

10.
Missing data pose a serious challenge to the integrity of randomized clinical trials, especially of treatments for prolonged illnesses such as schizophrenia, in which long‐term impact assessment is of great importance, but the follow‐up rates are often no more than 50%. Sensitivity analysis using Bayesian modeling for missing data offers a systematic approach to assessing the sensitivity of the inferences made on the basis of observed data. This paper uses data from an 18‐month study of veterans with schizophrenia to demonstrate this approach. Data were obtained from a randomized clinical trial involving 369 patients diagnosed with schizophrenia that compared long‐acting injectable risperidone with a psychiatrist's choice of oral treatment. Bayesian analysis utilizing a pattern‐mixture modeling approach was used to validate the reported results by detecting bias due to non‐random patterns of missing data. The analysis was applied to several outcomes including standard measures of schizophrenia symptoms, quality of life, alcohol use, and global mental status. The original study results for several measures were confirmed against a wide range of patterns of non‐random missingness. Robustness of the conclusions was assessed using sensitivity parameters. The missing data in the trial did not likely threaten the validity of previously reported results. Copyright © 2014 John Wiley & Sons, Ltd.  相似文献   

11.
Non ignorable missing data is a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes; one class in which the outcomes are deterministic and a second one in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by the method of maximum likelihood estimation based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves in the simulations and the application to the smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missing procedures. The application results show that our proposed method is better than the shared parameter model and the weighted GEE model.  相似文献   

12.
Regression analysis is one of the most used statistical methods for data analysis. There are, however, many situations in which one cannot base inference solely on f ( y ∣ x ; β), the conditional probability (density) function for the response variable Y , given x , the covariates. Examples include missing data where the missingness is non-ignorable, sampling surveys in which subjects are selected on the basis of the Y -values and meta-analysis where published studies are subject to 'selection bias'. The conventional approaches require the correct specification of the missingness mechanism, sampling probability and probability for being published respectively. In this paper, we propose an alternative estimating procedure for β based on an idea originated by Kalbfleisch. The novelty of this method is that no assumption on the missingness probability mechanisms etc. mentioned above is required to be specified. Asymptotic efficiency calculations and simulation studies were conducted to compare the method proposed with the two existing methods: the conditional likelihood and the weighted estimating function approaches.  相似文献   

13.
A popular choice when analyzing ordinal data is to consider the cumulative proportional odds model to relate the marginal probabilities of the ordinal outcome to a set of covariates. However, application of this model relies on the condition of identical cumulative odds ratios across the cut-offs of the ordinal outcome; the well-known proportional odds assumption. This paper focuses on the assessment of this assumption while accounting for repeated and missing data. In this respect, we develop a statistical method built on multiple imputation (MI) based on generalized estimating equations that allows to test the proportionality assumption under the missing at random setting. The performance of the proposed method is evaluated for two MI algorithms for incomplete longitudinal ordinal data. The impact of both MI methods is compared with respect to the type I error rate and the power for situations covering various numbers of categories of the ordinal outcome, sample sizes, rates of missingness, well-balanced and skewed data. The comparison of both MI methods with the complete-case analysis is also provided. We illustrate the use of the proposed methods on a quality of life data from a cancer clinical trial.  相似文献   

14.
In nonignorable missing response problems, we study a semiparametric model with unspecified missingness mechanism model and a exponential family model for response conditional density. Even though existing methods are available to estimate the parameters in exponential family, estimation or testing of the missingness mechanism model nonparametrically remains to be an open problem. By defining a “synthesis" density involving the unknown missingness mechanism model and the known baseline “carrier" density in the exponential family model, we treat this “synthesis" density as a legitimate one with biased sampling version. We develop maximum pseudo likelihood estimation procedures and the resultant estimators are consistent and asymptotically normal. Since the “synthesis" cumulative distribution is a functional of the missingness mechanism model and the known carrier density, proposed method can be used to test the correctness of the missingness mechanism model nonparametrically andindirectly. Simulation studies and real example demonstrate the proposed methods perform very well.  相似文献   

15.
Modern statistical methods using incomplete data have been increasingly applied in a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used in evaluating diagnostic tests or biomarkers in medical research, has also been increasingly popular problem in both its development and application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. Particularly, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms.  相似文献   

16.
Summary.  Social data often contain missing information. The problem is inevitably severe when analysing historical data. Conventionally, researchers analyse complete records only. Listwise deletion not only reduces the effective sample size but also may result in biased estimation, depending on the missingness mechanism. We analyse household types by using population registers from ancient China (618–907 AD) by comparing a simple classification, a latent class model of the complete data and a latent class model of the complete and partially missing data assuming four types of ignorable and non-ignorable missingness mechanisms. The findings show that either a frequency classification or a latent class analysis using the complete records only yielded biased estimates and incorrect conclusions in the presence of partially missing data of a non-ignorable mechanism. Although simply assuming ignorable or non-ignorable missing data produced consistently similarly higher estimates of the proportion of complex households, a specification of the relationship between the latent variable and the degree of missingness by a row effect uniform association model helped to capture the missingness mechanism better and improved the model fit.  相似文献   

17.
We show how register data combined at person-level with survey data can be used to conduct a novel type of nonresponse analysis in a panel survey. The availability of register data provides a unique opportunity to directly test the type of the missingness mechanism as well as estimate the size of bias due to initial nonresponse and attrition. We are also able to study in-depth the determinants of initial nonresponse and attrition. We use the Finnish subset of the European Community Household Panel (FI ECHP) data combined with register panel data and unemployment spells as outcome variables of interest. Our results show that initial nonresponse and attrition are clearly different processes driven by different background variables. Both the initial nonresponse and attrition mechanisms are nonignorable with respect to analysis of unemployment spells. Finally, our results suggest that initial nonresponse may play a role at least as important as attrition in causing bias. This result challenges the common view of attrition being the main threat to the value of panel data.  相似文献   

18.
Abstract.  This paper examines and applies methods for modelling longitudinal binary data subject to both intermittent missingness and dropout. The paper is based around the analysis of data from a study into the health impact of a sanitation programme carried out in Salvador, Brazil. Our objective was to investigate risk factors associated with incidence and prevalence of diarrhoea in children aged up to 3 years old. In total, 926 children were followed up at home twice a week from October 2000 to January 2002 and for each child daily occurrence of diarrhoea was recorded. A challenging factor in analysing these data is the presence of between-subject heterogeneity not explained by known risk factors, combined with significant loss of observed data through either intermittent missingness (average of 78 days per child) or dropout (21% of children). We discuss modelling strategies and show the advantages of taking an event history approach with an additive discrete time regression model.  相似文献   

19.
Non-response (or missing data) is often encountered in large-scale surveys. To enable the behavioural analysis of these data sets, statistical treatments are commonly applied to complete or remove these data. However, the correctness of such procedures critically depends on the nature of the underlying missingness generation process. Clearly, the efficacy of applying either case deletion or imputation procedures rests on the unknown missingness generation mechanism. The contribution of this paper is twofold. The study is the first to propose a simple sequential method to attempt to identify the form of missingness. Second, the effectiveness of the tests is assessed by generating (experimentally) nine missing data sets by imposed MCAR, MAR and NMAR processes, with data removed.  相似文献   

20.
Summary.  The main statistical problem in many epidemiological studies which involve repeated measurements of surrogate markers is the frequent occurrence of missing data. Standard likelihood-based approaches like the linear random-effects model fail to give unbiased estimates when data are non-ignorably missing. In human immunodeficiency virus (HIV) type 1 infection, two markers which have been widely used to track progression of the disease are CD4 cell counts and HIV–ribonucleic acid (RNA) viral load levels. Repeated measurements of these markers tend to be informatively censored, which is a special case of non-ignorable missingness. In such cases, we need to apply methods that jointly model the observed data and the missingness process. Despite their high correlation, longitudinal data of these markers have been analysed independently by using mainly random-effects models. Touloumi and co-workers have proposed a model termed the joint multivariate random-effects model which combines a linear random-effects model for the underlying pattern of the marker with a log-normal survival model for the drop-out process. We extend the joint multivariate random-effects model to model simultaneously the CD4 cell and viral load data while adjusting for informative drop-outs due to disease progression or death. Estimates of all the model's parameters are obtained by using the restricted iterative generalized least squares method or a modified version of it using the EM algorithm as a nested algorithm in the case of censored survival data taking also into account non-linearity in the HIV–RNA trend. The method proposed is evaluated and compared with simpler approaches in a simulation study. Finally the method is applied to a subset of the data from the 'Concerted action on seroconversion to AIDS and death in Europe' study.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号