20 similar articles found (search time: 359 ms)
We consider a new approach to deal with non-ignorable non-response on an outcome variable in a causal inference framework. Assuming that a binary instrumental variable for non-response is available, we provide a likelihood-based approach to identify and estimate heterogeneous causal effects of a binary treatment on specific latent subgroups of units, named principal strata, defined by the non-response behavior under each level of the treatment and of the instrument. We show that, within each stratum, non-response is ignorable and respondents can be properly compared by treatment status. To assess our method and its robustness when the usually invoked assumptions are relaxed or misspecified, we simulate data to resemble a real experiment conducted on a panel survey that compares different methods of reducing panel attrition.

Even in randomized experiments, the identification of causal effects is often threatened by the presence of missing outcome values, with missingness possibly being non-ignorable. We provide sufficient conditions under which the availability of a binary instrument for non-response allows us to nonparametrically point-identify average causal effects in some latent subgroups of units, named principal strata, defined by their non-response behavior in all possible combinations of treatment and instrument. Examples are provided as possible scenarios where our assumptions may be plausible.

Summary.  Social data often contain missing information. The problem is inevitably severe when analysing historical data. Conventionally, researchers analyse complete records only. Listwise deletion not only reduces the effective sample size but also may result in biased estimation, depending on the missingness mechanism. We analyse household types by using population registers from ancient China (618–907 AD), comparing a simple classification, a latent class model of the complete data and a latent class model of the complete and partially missing data assuming four types of ignorable and non-ignorable missingness mechanisms. The findings show that either a frequency classification or a latent class analysis using the complete records only yielded biased estimates and incorrect conclusions in the presence of partially missing data of a non-ignorable mechanism. Although simply assuming ignorable or non-ignorable missing data consistently produced similarly higher estimates of the proportion of complex households, a specification of the relationship between the latent variable and the degree of missingness by a row effect uniform association model helped to capture the missingness mechanism better and improved the model fit.

Incomplete data subject to non‐ignorable non‐response are often encountered in practice and have a non‐identifiability problem. A follow‐up sample is randomly selected from the set of non‐respondents to avoid the non‐identifiability problem and get complete responses. Glynn, Laird, & Rubin analyzed non‐ignorable missing data with a follow‐up sample under a pattern mixture model. In this article, maximum likelihood estimation of parameters of the categorical missing data is considered with a follow‐up sample under a selection model. To estimate the parameters with non‐ignorable missing data, the EM algorithm with weighting, proposed by Ibrahim, is used. That is, in the E‐step, the weighted mean is calculated using the fractional weights for imputed data. Variances are estimated using the approximated jackknife method. Simulation results are presented to compare the proposed method with previously presented methods.

Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult and/or restrictive. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence‐related issues, and simulation results are presented in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence. Imputing missing data for models with latent variables under latent‐dependent missingness without specifying a full joint model.

We consider non-response models for a single categorical response with categorical covariates whose values are always observed. We present Bayesian methods for ignorable models and a particular non-ignorable model, and we argue that standard methods of model comparison are inappropriate for comparing ignorable and non-ignorable models. Uncertainty about ignorability of non-response is incorporated by introducing parameters describing the extent of non-ignorability into a pattern mixture specification and integrating over the prior uncertainty associated with these parameters. Our approach is illustrated using polling data from the 1992 British general election panel survey. We suggest sample size adjustments for surveys when non-ignorable non-response is expected.

In the National Survey of Sexual Attitudes and Lifestyles (NATSSAL), it is recognized that non-response is unlikely to be ignorable. In some surveys, in addition to the response variables of interest, there may also be an 'enthusiasm-to-respond' variable which is expected to be related to the probabilities of item and unit response. Inference techniques to deal with non-ignorable non-response, based on a propensity-to-respond score, can be developed when there are both item and unit non-responders. For the NATSSAL data, an interviewer-measured interviewee embarrassment variable is combined with demographics to produce a score for the propensity to respond. The necessary likelihood development is outlined and alternative approaches to interval estimation are compared. The methodology is illustrated through an estimation of virginity from NATSSAL data.

Summary.  Latent class analysis has been used to model measurement error, to identify flawed survey questions and to estimate mode effects. Using data from a survey of University of Maryland alumni together with alumni records, we evaluate this technique to determine its usefulness for detecting bad questions in the survey context. Two sets of latent class analysis models are applied in this evaluation: latent class models with three indicators and latent class models with two indicators under different assumptions about prevalence and error rates. Our results indicated that the latent class analysis approach produced good qualitative results for the latent class models: the item that the model deemed the worst was the worst according to the true scores. However, the approach yielded weaker quantitative estimates of the error rates for a given item.

Summary.  We propose a model of transitions into and out of low paid employment that accounts for non-ignorable panel dropout, employment retention and base year low pay status ('initial conditions'). The model is fitted to data for men from the British Household Panel Survey. Initial conditions and employment retention are found to be non-ignorable selection processes. Whether panel dropout is found to be ignorable depends on how item non-response on pay is treated. Notwithstanding these results, we also find that models incorporating a simpler approach to accounting for non-ignorable selections provide estimates of covariate effects that differ very little from the estimates from the general model.

We show how to make inference about a finite population proportion using data from a possibly biased sample. In the absence of any selection bias or survey weights, a simple ignorable selection model, which assumes that the binary responses are independent and identically distributed Bernoulli random variables, is not unreasonable. However, this ignorable selection model is inappropriate when there is a selection bias in the sample. We assume that the survey weights (or their reciprocals, which we call ‘selection’ probabilities) are available, but there is no simple relation between the binary responses and the selection probabilities. To capture the selection bias, we assume that there is some correlation between the binary responses and the selection probabilities (e.g., there may be a somewhat higher/lower proportion of positive responses among the sampled units than among the nonsampled units). We use a Bayesian nonignorable selection model to accommodate the selection mechanism and fit it with Markov chain Monte Carlo methods. We illustrate our method using numerical examples obtained from NHIS 1995 data.

This paper proposes a new approach to the treatment of item non-response in attitude scales. It combines the ideas of latent variable identification with the issues of non-response adjustment in sample surveys. The latent variable approach allows missing values to be included in the analysis and, equally importantly, allows information about attitude to be inferred from non-response. We present a symmetric pattern methodology for handling item non-response in attitude scales. The methodology is symmetric in that all the variables are given equivalent status in the analysis (none is designated a 'dependent' variable) and is pattern based in that the pattern of responses and non-responses across individuals is a key element in the analysis. Our approach to the problem is through a latent variable model with two latent dimensions: one to summarize response propensity and the other to summarize attitude, ability or belief. The methodology presented here can handle binary, metric and mixed (binary and metric) manifest items with missing values. Examples using both artificial data sets and two real data sets are used to illustrate the mechanism and the advantages of the methodology proposed.

Very often, in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing this type of data. In this model, the selected groups of respondents are of specific interest, so that group-specific population distributions need to be defined. The usual assumption for parameter estimation in this model, which is that the latent traits are random variables following different symmetric normal distributions, has been questioned in many works found in the IRT literature. Furthermore, when this assumption does not hold, misleading inference can result. In this paper, we consider that the latent traits for each group follow different skew-normal distributions, under the centered parameterization. We call this the skew multiple group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11] and Bock and Zimowski [12] (concerning the latent trait distribution). Our approach ensures that the model is identifiable. We propose two Markov chain Monte Carlo (MCMC) algorithms for parameter estimation and compare them with respect to convergence. A simulation study was performed to evaluate parameter recovery for the proposed model and the algorithm selected on the basis of that comparison. Results reveal that the proposed algorithm properly recovers all model parameters. Furthermore, we analyzed a real data set which presents asymmetry in the latent trait distribution. The results obtained by using our approach confirmed the presence of negative asymmetry for some latent trait distributions.

Non-ignorable missing data are a common problem in longitudinal studies. Latent class models are attractive for simplifying the modeling of missing data when the data are subject to either a monotone or intermittent missing data pattern. In our study, we propose a new two-latent-class model for categorical data with informative dropouts, dividing the observed data into two latent classes: one class in which the outcomes are deterministic and a second one in which the outcomes can be modeled using logistic regression. In the model, the latent classes connect the longitudinal responses and the missingness process under the assumption of conditional independence. Parameters are estimated by maximum likelihood based on the above assumptions and the tetrachoric correlation between responses within the same subject. We compare the proposed method with the shared parameter model and the weighted GEE model using the areas under the ROC curves in the simulations and the application to the smoking cessation data set. The simulation results indicate that the proposed two-latent-class model performs well under different missingness mechanisms. The application results show that our proposed method is better than the shared parameter model and the weighted GEE model.

This article presents a Bayesian latent variable model used to analyze ordinal response survey data by taking into account the characteristics of respondents. The ordinal response data are viewed as multivariate responses arising from continuous latent variables with known cut-points. Each respondent is characterized by two parameters that have a Dirichlet process as their joint prior distribution. The proposed mechanism adjusts for classes of personalities. The model is applied to student survey data in course evaluations. Goodness-of-fit (GoF) procedures are developed for assessing the validity of the model. The proposed GoF procedures are simple, intuitive, and do not seem to be a part of current Bayesian practice.

We discuss the use of latent variable models with observed covariates for computing response propensities for sample respondents. A response propensity score is often used to weight item and unit responders to account for item and unit non-response and to obtain adjusted means and proportions. In the context of attitude scaling, we discuss computing response propensity scores by using latent variable models for binary or nominal polytomous manifest items with covariates. Our models allow the response propensity scores to be found for several different items without refitting. They allow any pattern of missing responses for the items. If one prefers, it is possible to estimate population proportions directly from the latent variable models, so avoiding the use of propensity scores. Artificial data sets and a real data set extracted from the 1996 British Social Attitudes Survey are used to compare the various methods proposed.


In this article, a finite mixture model of hurdle Poisson distributions with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data are assumed to be missing not at random (MNAR)/non-ignorable missing (NINR), and the corresponding missingness mechanism is modeled through probit regression. To improve the algorithm efficiency, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on the Bayesian information criterion (BIC) is also proposed to compare models with different numbers of components in the presence of missing values. The considered model is a general framework that captures important characteristics of count data such as zero inflation/deflation, heterogeneity and missingness, providing more insight into the features of the data and allowing dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.

We propose a class of multidimensional Item Response Theory models for polytomously-scored items with ordinal response categories. This class extends an existing class of multidimensional models for dichotomously-scored items in which the latent abilities are represented by a random vector assumed to have a discrete distribution, with support points corresponding to different latent classes in the population. In the proposed approach, we allow for different parameterizations for the conditional distribution of the response variables given the latent traits, which depend on the type of link function and the constraints imposed on the item parameters. Moreover, we suggest a strategy for model selection that is based on a series of steps consisting of selecting specific features, such as the dimension of the model (number of latent traits), the number of latent classes, and the specific parameterization. In order to illustrate the proposed approach, we analyze a dataset from a study on anxiety and depression on a sample of oncological patients.

The missing response problem is ubiquitous in survey sampling and in medical, social science and epidemiological studies. It is well known that non-ignorable missingness, where the missingness of a response depends on its own value, is the most difficult missing data problem. In the statistical literature, unlike for the ignorable missing data problem, few papers on non-ignorable missing data are available apart from those taking a fully parametric model-based approach. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing the empirical likelihood method of Owen (1988), we obtain constrained maximum empirical likelihood estimators of the parameters in the missing probability and of the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of data from a real AIDS trial shows that the missingness of CD4 counts at around two years is non-ignorable and that the sample mean based on observed data only is biased.
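
As a rough illustration of why a mean based on observed data only can be biased when missingness depends on the response itself, the following sketch simulates a standard normal response whose probability of being observed increases with its own value. The logistic response model, the sample size and all names are assumptions chosen for illustration; they are not taken from the paper, and the code does not implement its empirical likelihood estimator.

```python
import math
import random


def simulate_observed_mean(n=20000, seed=0):
    """Return (full_mean, observed_mean) for a response y ~ N(0, 1).

    Each y is observed with probability 1 / (1 + exp(-y)), so the
    missingness depends on the (possibly unobserved) value itself.
    This response model is an illustrative assumption.
    """
    rng = random.Random(seed)
    full = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [y for y in full if rng.random() < 1.0 / (1.0 + math.exp(-y))]
    return sum(full) / len(full), sum(observed) / len(observed)


full_mean, observed_mean = simulate_observed_mean()
print(f"mean of all responses:      {full_mean:.3f}")
print(f"mean of observed responses: {observed_mean:.3f}")
```

Under this assumed mechanism the mean of the observed responses lies well above the mean of all responses, because larger values are more likely to be observed.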

Medical research frequently focuses on the relationship between quality of life (QoL) and survival time of subjects. QoL may be one of the most important factors that could be used to predict survival, making it worth identifying factors that jointly affect survival and QoL. We propose a semiparametric joint model that consists of item response and survival components, where these two components are linked through latent variables. Several popular ordinal models are considered and compared in the item response component, while the Cox proportional hazards model is used in the survival component. We estimate the baseline hazard function and model parameters simultaneously, through a profile likelihood approach. We illustrate the method using an example from a clinical study.

Bayesian hierarchical formulations are utilized by the U.S. Bureau of Labor Statistics (BLS) with respondent‐level data for missing item imputation because these formulations are readily parameterized to capture correlation structures. BLS collects survey data under informative sampling designs that assign probabilities of inclusion to be correlated with the response on which sampling‐weighted pseudo posterior distributions are estimated for asymptotically unbiased inference about population model parameters. Computation is expensive and does not support BLS production schedules. We propose a new method to scale the computation that divides the data into smaller subsets, estimates a sampling‐weighted pseudo posterior distribution, in parallel, for every subset and combines the pseudo posterior parameter samples from all the subsets through their mean in the Wasserstein space of order 2. We construct conditions on a class of sampling designs where posterior consistency of the proposed method is achieved. We demonstrate on both synthetic data and in application to the Current Employment Statistics survey that our method produces results of accuracy similar to that of the usual approach while offering substantially faster computation.
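
The combination step can be pictured in one dimension, where the mean (barycenter) in the Wasserstein space of order 2 of several empirical distributions with the same number of equally weighted draws is obtained by averaging their sorted draws, that is, their quantiles. The sketch below assumes scalar parameters and equal numbers of draws per subset; the function name and toy numbers are hypothetical, and it is not the authors' implementation, which also has to handle sampling weights and multivariate parameters.

```python
def wasserstein2_barycenter_1d(subset_draws):
    """Average the order statistics of equally sized sets of scalar draws.

    For one-dimensional empirical distributions with the same number of
    equally weighted atoms, this quantile average is their barycenter in
    the Wasserstein space of order 2 (with equal weights on the subsets).
    """
    sizes = {len(draws) for draws in subset_draws}
    if len(sizes) != 1:
        raise ValueError("all subsets must contain the same number of draws")
    sorted_draws = [sorted(draws) for draws in subset_draws]
    k = len(sorted_draws)
    # zip(*...) pairs the i-th smallest draw of every subset.
    return [sum(column) / k for column in zip(*sorted_draws)]


# Toy posterior draws from two hypothetical data subsets.
combined = wasserstein2_barycenter_1d([[2.0, 0.0, 1.0], [10.0, 12.0, 11.0]])
print(combined)  # [5.0, 6.0, 7.0]
```

Averaging quantiles rather than pooling the draws keeps the combined sample as concentrated as a single subset posterior, instead of producing a mixture spread across all subsets.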
