Similar literature
Found 20 similar articles (search time: 31 ms)
1.
Motivated by a longitudinal oral health study, the Signal-Tandmobiel® study, a Bayesian approach has been developed to model misclassified ordinal response data. Two regression models have been considered to incorporate misclassification in the categorical response. Specifically, probit and logit models have been developed. The computational difficulties have been avoided by using data augmentation. This idea is exploited to derive efficient Markov chain Monte Carlo methods. Although the method is proposed for ordered categories, it can also be implemented for unordered ones in a simple way. The model performance is shown through a simulation-based example and the analysis of the motivating study.
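
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general data augmentation idea, assuming the simpler textbook case of a binary probit model rather than the authors' misclassified ordinal model. The function name draw_latent and its arguments are hypothetical. Given a current linear predictor, a latent normal variable is drawn from the half-line that agrees with the observed response; conditional on these latent draws, the regression coefficients have a standard normal-linear update (not shown).

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for its cdf and inverse cdf


def draw_latent(y, mean, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y == 1, else to z <= 0.

    Uses inverse-cdf sampling; intended for moderate values of `mean`
    (the clamp below is only a guard against u hitting 0 or 1 exactly).
    """
    p0 = _STD.cdf(-mean)  # P(z <= 0) when z ~ N(mean, 1)
    if y == 1:
        u = rng.uniform(p0, 1.0)   # upper tail: z > 0
    else:
        u = rng.uniform(0.0, p0)   # lower tail: z <= 0
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + _STD.inv_cdf(u)


rng = random.Random(0)
z_pos = draw_latent(1, 0.3, rng)  # latent value consistent with y = 1
z_neg = draw_latent(0, 0.3, rng)  # latent value consistent with y = 0
```

In a full sampler this draw would be repeated for every observation at each iteration, alternating with the coefficient update.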

2.
We propose the misclassified Ising Model: a framework for analyzing dependent binary data where the binary state is susceptible to error. We extend previous theoretical results of a model selection method based on applying the LASSO to logistic regression at each node and show that the method will still correctly identify edges in the underlying graphical model under suitable misclassification settings. With knowledge of the misclassification process, an expectation maximization algorithm is developed that accounts for misclassification during model selection. We illustrate the improved performance of the proposed expectation maximization algorithm using simulated data and data from a functional magnetic resonance imaging analysis.

3.
4.
We study the correlation structure for a mixture of ordinal and continuous repeated measures using a Bayesian approach. We assume a multivariate probit model for the ordinal variables and a normal linear regression for the continuous variables, where latent normal variables underlying the ordinal data are correlated with continuous variables in the model. Due to the probit model assumption, we are required to sample a covariance matrix with some of the diagonal elements equal to one. The key computational idea is to use parameter-extended data augmentation, which involves applying the Metropolis-Hastings algorithm to get a sample from the posterior distribution of the covariance matrix incorporating the relevant restrictions. The methodology is illustrated through a simulated example and through an application to data from the UCLA Brain Injury Research Center.

5.
In panel studies, binary outcome measures together with time-stationary and time-varying explanatory variables are collected over time on the same individual. Therefore, a regression analysis for this type of data must allow for the correlation among the outcomes of an individual. The multivariate probit model of Ashford and Sowden (1970) was the first regression model for multivariate binary responses. However, a likelihood analysis of the multivariate probit model with general correlation structure for higher dimensions is intractable due to the maximization over high-dimensional integrals, thus severely restricting its applicability so far. Czado (1996) developed a Markov Chain Monte Carlo (MCMC) algorithm to overcome this difficulty. In this paper we present an application of this algorithm to unemployment data from the Panel Study of Income Dynamics involving 11 waves of the panel study. In addition, we adapt Bayesian model checking techniques based on the posterior predictive distribution (see for example Gelman et al. (1996)) for the multivariate probit model. These help to identify mean and correlation specifications that fit the data well. C. Czado was supported by research grant OGP0089858 of the Natural Sciences and Engineering Research Council of Canada.

6.
Misclassifications in binary responses have long been a common problem in medical and health surveys. One way to handle misclassifications in clustered or longitudinal data is to incorporate the misclassification model through the generalized estimating equation (GEE) approach. However, existing methods are developed under a non-survey setting and cannot be used directly for complex survey data. We propose a pseudo-GEE method for the analysis of binary survey responses with misclassifications. We focus on cluster sampling and develop analysis strategies for analyzing binary survey responses with different forms of additional information for the misclassification process. The proposed methodology has several attractive features, including simultaneous inferences for both the response model and the association parameters. Finite sample performance of the proposed estimators is evaluated through simulation studies and an application using a real dataset from the Canadian Longitudinal Study on Aging.

7.
When a binary dependent variable is misclassified, that is, recorded in the category other than where it really belongs, probit and logit estimates are biased and inconsistent. In some cases, the probability of misclassification may vary systematically with covariates, and thus be endogenous. In this paper, we develop an estimation approach that corrects for endogenous misclassification, validate our approach using a simulation study, and apply it to the analysis of a treatment program designed to improve family dynamics. Our results show that endogenous misclassification could lead to potentially incorrect conclusions unless corrected using an appropriate technique.
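
To make the source of the bias concrete, here is a small sketch under a simplifying assumption that the abstract does not make: the misclassification rates are constants rather than functions of covariates. The names observed_prob, a0 (false-positive rate), and a1 (false-negative rate) are illustrative. The recorded response then follows a0 + (1 - a0 - a1) * F(x'beta) instead of F(x'beta), so a logit fitted to the recorded data targets the wrong curve.

```python
import math


def logistic(t):
    """Standard logistic cdf."""
    return 1.0 / (1.0 + math.exp(-t))


def observed_prob(x, beta0, beta1, a0, a1):
    """P(recorded y = 1 | x) when the true y follows a logit model.

    a0: P(recorded 1 | true 0), a1: P(recorded 0 | true 1); both are
    assumed constant here, unlike the endogenous case in the abstract.
    """
    p_true = logistic(beta0 + beta1 * x)
    return p_true * (1.0 - a1) + (1.0 - p_true) * a0


# The contrast between two covariate values shrinks by the factor 1 - a0 - a1.
true_gap = logistic(1.0) - logistic(-1.0)
seen_gap = observed_prob(1.0, 0.0, 1.0, 0.1, 0.2) - observed_prob(-1.0, 0.0, 1.0, 0.1, 0.2)
```

Because the recorded probabilities are compressed into the interval from a0 to 1 - a1, the apparent covariate effect is attenuated.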

8.
Here we consider a multinomial probit regression model where the number of variables substantially exceeds the sample size and only a subset of the available variables is associated with the response. Thus selecting a small number of relevant variables for classification has received a great deal of attention. Generally when the number of variables is substantial, sparsity-enforcing priors for the regression coefficients are called for on grounds of predictive generalization and computational ease. In this paper, we propose a sparse Bayesian variable selection method in the multinomial probit regression model for multi-class classification. The performance of our proposed method is demonstrated with one simulated data set and three well-known gene expression profiling data sets: breast cancer data, leukemia data, and small round blue-cell tumors. The results show that compared with other methods, our method is able to select the relevant variables and can obtain competitive classification accuracy with a small subset of relevant genes.

9.
Random error in a continuous outcome variable does not affect its regression on a predictor. However, when a continuous outcome variable is dichotomised, random measurement error results in a flatter exposure-response relationship with a higher intercept. Although this consequence is similar to the effect of misclassification in a binary outcome variable, it cannot be corrected using techniques appropriate for binary data. Conditional distributions of the measurements of the continuous outcome variable can be corrected if the reliability coefficient of the measurements can be estimated. An unbiased estimate of the exposure-response relationship is then easily calculated. This procedure is demonstrated using data on the relationship between smoking and the development of airway obstruction.
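
The flattening can be illustrated with a short sketch under assumptions that are not stated in the abstract: the outcome is Y = a + b*x + e with normal e, the measurement adds independent normal error, and the measured value is dichotomised at a fixed cutoff. All function and parameter names are illustrative. Under these assumptions the probability of exceeding the cutoff is a probit curve whose slope is b divided by the total standard deviation, so added measurement error lowers the slope.

```python
from statistics import NormalDist


def prob_above_cutoff(x, a, b, cutoff, sd_outcome, sd_error):
    """P(measured outcome > cutoff | x) under the normal model described above."""
    total_sd = (sd_outcome ** 2 + sd_error ** 2) ** 0.5
    return NormalDist().cdf((a + b * x - cutoff) / total_sd)


def probit_slope(b, sd_outcome, sd_error):
    """Slope of the exposure-response curve on the probit scale."""
    return b / (sd_outcome ** 2 + sd_error ** 2) ** 0.5


# Same underlying regression, with and without measurement error.
slope_clean = probit_slope(1.0, 1.0, 0.0)
slope_noisy = probit_slope(1.0, 1.0, 1.0)
```

On the same scale the intercept is (a - cutoff) divided by the total standard deviation, so when the dichotomised outcome is uncommon (a below the cutoff) the error also moves the intercept upward, in line with the abstract's description.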

10.
The autologistic model, first introduced by Besag, is a popular tool for analyzing binary data in spatial lattices. However, no investigation was found to consider modeling of binary data clustered in uncorrelated lattices. Owing to spatial dependency of responses, the exact likelihood estimation of parameters is not possible. For circumventing this difficulty, many studies have been designed to approximate the likelihood and the related partition function of the model. So, the traditional and Bayesian estimation methods based on the likelihood function are often time-consuming and require heavy computations and recursive techniques. Some investigators have introduced and implemented data augmentation and latent variable models to reduce computational complications in parameter estimation. In this work, the spatially correlated binary data distributed in uncorrelated lattices were modeled using autologistic regression, Bayesian inference was developed with the aid of data augmentation, and the proposed models were applied to caries experiences of deciduous teeth.

11.
We consider an extension of the recursive bivariate probit model for estimating the effect of a binary variable on a binary outcome in the presence of unobserved confounders, nonlinear covariate effects and overdispersion. Specifically, the model consists of a system of two binary outcomes with a binary endogenous regressor which includes smooth functions of covariates, hence allowing for flexible functional dependence of the responses on the continuous regressors, and arbitrary random intercepts to deal with overdispersion arising from correlated observations on clusters or from the omission of non‐confounding covariates. We fit the model by maximizing a penalized likelihood using an Expectation‐Maximisation algorithm. The issues of automatic multiple smoothing parameter selection and inference are also addressed. The empirical properties of the proposed algorithm are examined in a simulation study. The method is then illustrated using data from a survey on health, aging and wealth.

12.
We consider data with a nominal grouping variable and a binary response variable. The grouping variable is measured without error, but the response variable is measured using a fallible device subject to misclassification. To achieve model identifiability, we use the double-sampling scheme which requires obtaining a subsample of the original data or another independent sample. This sample is then classified by both the fallible device and another infallible device regarding the response variable. We propose two Wald tests for testing the association between the two variables and illustrate the tests using traffic data. The Type-I error rate and power of the tests are examined using simulations and a modified Wald test is recommended.
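
The sketch below is not the Wald test from the abstract. It only illustrates, with hypothetical function names and made-up counts, why a doubly classified subsample restores identifiability: the subsample yields estimates of the fallible device's sensitivity and specificity, and these allow an observed proportion to be corrected with the standard moment-based adjustment.

```python
def error_rates(n11, n10, n01, n00):
    """Estimate sensitivity and specificity from the doubly classified subsample.

    n_tf is the count with infallible (true) class t and fallible class f.
    """
    sensitivity = n11 / (n11 + n10)
    specificity = n00 / (n01 + n00)
    return sensitivity, specificity


def corrected_proportion(p_observed, sensitivity, specificity):
    """Moment-based correction of a proportion measured by the fallible device."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("fallible device is no better than chance")
    p = (p_observed + specificity - 1.0) / denom
    return min(max(p, 0.0), 1.0)  # keep the estimate inside [0, 1]


# Illustrative counts only, not data from the paper.
sens, spec = error_rates(n11=90, n10=10, n01=20, n00=80)
p_hat = corrected_proportion(0.41, sens, spec)
```

Applying the same correction within each level of the grouping variable would give group-specific proportions that can then be compared.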

13.
Estimated associations between an outcome variable and misclassified covariates tend to be biased when estimation methods that ignore the classification error are applied. Available methods to account for misclassification often require the use of a validation sample (i.e. a gold standard). In practice, however, such a gold standard may be unavailable or impractical. We propose a Bayesian approach to adjust for misclassification in a binary covariate in the random effect logistic model when a gold standard is not available. This Markov Chain Monte Carlo (MCMC) approach uses two imperfect measures of a dichotomous exposure under the assumptions of conditional independence and non-differential misclassification. A simulated numerical example and a real clinical example are given to illustrate the proposed approach. Our results suggest that the estimated log odds of inpatient care and the corresponding standard deviation are much larger in our proposed method compared with the models ignoring misclassification. Ignoring misclassification produces downwardly biased estimates and underestimates uncertainty.

14.
In this article, we introduce minimum divergence estimators of parameters of a binary response model when data are subject to false-positive misclassification and obtained using a double-sampling plan. Under this setup, the problem of goodness-of-fit is considered and divergence-based confidence intervals (CIs) for a population proportion parameter are derived. A simulation experiment is carried out to compare the coverage probabilities of the new CIs. An application to real data is also given.

15.
Monte Carlo experiments are conducted to compare the Bayesian and sample theory model selection criteria in choosing the univariate probit and logit models. We use five criteria: the deviance information criterion (DIC), predictive deviance information criterion (PDIC), Akaike information criterion (AIC), weighted, and unweighted sums of squared errors. The first two criteria are Bayesian while the others are sample theory criteria. The results show that if data are balanced none of the model selection criteria considered in this article can distinguish the probit and logit models. If data are unbalanced and the sample size is large the DIC and AIC choose the correct models better than the other criteria. We show that if unbalanced binary data are generated by a leptokurtic distribution the logit model is preferred over the probit model. The probit model is preferred if unbalanced data are generated by a platykurtic distribution. We apply the model selection criteria to the probit and logit models that link the ups and downs of the returns on S&P500 to the crude oil price.

16.
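韩本三 (Han Bensan), 曹征 (Cao Zheng), 黎实 (Li Shi). 统计研究 (Statistical Research), 2012, 29(7): 81-85
This paper extends the RESET test to the specification of binary choice panel data models and examines specification tests for the fixed-effects probit and logit models, including tests for heteroskedasticity, omitted variables, and distributional misspecification. Simulation results show that the RESET specification test for the logit model has good size and power, whereas the RESET test for the probit model may have poor power in some respects, possibly because of the choice of estimation method. Overall, however, the RESET test remains a good choice for specification testing of binary choice panel data models.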

17.
This article examines several goodness-of-fit measures in the binary probit regression model. Existing pseudo-R² measures are reviewed, and two modified measures and one new pseudo-R² measure are proposed. For the probit regression model, empirical comparisons are made for different goodness-of-fit measures with the squared sample correlation coefficient of the observed response and the predicted probabilities. As an illustration, the goodness-of-fit measures are applied to a “paid labor force” data set.

18.
Bayesian propensity score regression analysis with misclassified binary responses is proposed to analyse clustered observational data. This approach utilizes multilevel models and corrects for misclassification in the responses. Using the deviance information criterion (DIC), the performance of the approach is compared with approaches without correcting for misclassification, multilevel structure specification, or both in the study of the impact of female employment on the likelihood of physical violence. The smallest DIC confirms that our proposed model best fits the data. We conclude that female employment has an insignificant impact on the likelihood of physical spousal violence towards women. In addition, a simulation study confirms that the proposed approach performed best in terms of bias and coverage rate. Ignoring misclassification in response or multilevel structure of data would yield biased estimation of the exposure effect.

19.
In this article, a finite mixture model of hurdle Poisson distribution with missing outcomes is proposed, and a stochastic EM algorithm is developed for obtaining the maximum likelihood estimates of model parameters and mixing proportions. Specifically, missing data are assumed to be missing not at random (MNAR)/non ignorable missing (NINR) and the corresponding missingness mechanism is modeled through probit regression. To improve the algorithm efficiency, a stochastic step is incorporated into the E-step based on data augmentation, whereas the M-step is solved by the method of conditional maximization. A variation on Bayesian information criterion (BIC) is also proposed to compare models with different numbers of components with missing values. The considered model is a general model framework and it captures the important characteristics of count data analysis such as zero inflation/deflation, heterogeneity as well as missingness, providing us with more insight into the data feature and allowing for dispersion to be investigated more fully and correctly. Since the stochastic step only involves simulating samples from some standard distributions, the computational burden is alleviated. Once missing responses and latent variables are imputed to replace the conditional expectation, our approach works as part of a multiple imputation procedure. A simulation study and a real example illustrate the usefulness and effectiveness of our methodology.

20.
Joint modeling of associated mixed biomarkers in longitudinal studies leads to a better clinical decision by improving the efficiency of parameter estimates. In many clinical studies, the observed time for two biomarkers may not be equivalent and one of the longitudinal responses may have been recorded over a longer time than the other one. In addition, the response variables may have different missing patterns. In this paper, we propose a new joint model of associated continuous and binary responses by accounting for different missing patterns for two longitudinal outcomes. A conditional model for joint modeling of the two responses is used and two shared random effects models are considered for intermittent missingness of two responses. A Bayesian approach using Markov Chain Monte Carlo (MCMC) is adopted for parameter estimation and model implementation. The validation and performance of the proposed model are investigated using some simulation studies. The proposed model is also applied for analyzing a real data set of bariatric surgery.
