Similar articles
 Found 20 similar articles; search time 265 ms
1.
Bayesian propensity score regression analysis with misclassified binary responses is proposed to analyse clustered observational data. This approach utilizes multilevel models and corrects for misclassification in the responses. Using the deviance information criterion (DIC), the performance of the approach is compared with approaches that omit the correction for misclassification, the multilevel structure specification, or both, in a study of the impact of female employment on the likelihood of physical violence. The smallest DIC confirms that our proposed model best fits the data. We conclude that female employment has an insignificant impact on the likelihood of physical spousal violence towards women. In addition, a simulation study confirms that the proposed approach performs best in terms of bias and coverage rate. Ignoring misclassification in the response or the multilevel structure of the data would yield biased estimates of the exposure effect.

2.
Generalized linear models are considered for describing the dependence of data on explanatory variables when the binary outcome is subject to misclassification. Both probit and t-link regressions for misclassified binary data under Bayesian methodology are proposed. The computational difficulties are avoided by using data augmentation. The idea of using a data augmentation framework (with two types of latent variables) is exploited to derive efficient Gibbs sampling and expectation–maximization algorithms. In addition, this formulation allows the probit model to be obtained as a particular case of the t-link model. Simulation examples are presented to illustrate the model performance in comparison with standard methods that do not consider misclassification. To show the potential of the proposed approaches, a real data problem arising in the study of hearing loss caused by exposure to occupational noise is analysed.
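
As a rough illustration of why ignoring misclassification distorts a probit fit, the sketch below computes the probability of observing a positive label when the true response follows a probit model and the recorded label is flipped with fixed error rates. The sensitivity/specificity parameterisation and the function names are assumptions made for this example; the abstract does not state the exact form used in the paper.

```python
import math

def probit(eta):
    """Standard normal CDF evaluated at the linear predictor eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def observed_positive_prob(eta, sens, spec):
    """P(recorded label = 1) under an assumed misclassification model.

    sens = P(recorded 1 | true 1), spec = P(recorded 0 | true 0).
    Both names are hypothetical, introduced only for this sketch.
    """
    p_true = probit(eta)
    return sens * p_true + (1.0 - spec) * (1.0 - p_true)

if __name__ == "__main__":
    # With imperfect labels the observed probability is pulled away from
    # the true one, so the observed curve is flatter than the probit curve.
    for eta in (-2.0, 0.0, 2.0):
        print(eta, probit(eta), observed_positive_prob(eta, sens=0.9, spec=0.8))
```

With `sens = spec = 1` the function reduces to the ordinary probit probability, which is the case a standard analysis implicitly assumes.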

3.
Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data, including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.

4.
We consider an extension of the recursive bivariate probit model for estimating the effect of a binary variable on a binary outcome in the presence of unobserved confounders, nonlinear covariate effects and overdispersion. Specifically, the model consists of a system of two binary outcomes with a binary endogenous regressor which includes smooth functions of covariates, hence allowing for flexible functional dependence of the responses on the continuous regressors, and arbitrary random intercepts to deal with overdispersion arising from correlated observations on clusters or from the omission of non‐confounding covariates. We fit the model by maximizing a penalized likelihood using an Expectation‐Maximisation algorithm. The issues of automatic multiple smoothing parameter selection and inference are also addressed. The empirical properties of the proposed algorithm are examined in a simulation study. The method is then illustrated using data from a survey on health, aging and wealth.

5.
Longitudinal categorical data are commonly encountered in a variety of fields and are frequently analyzed by the generalized estimating equation (GEE) method. Prior to making further inference based on the GEE model, the assessment of model fit is crucial. Graphical techniques have long been in widespread use for assessing model adequacy. We develop alternative graphical approaches utilizing plots of marginal model-checking condition and local mean deviance to assess the GEE model with logit link for longitudinal binary responses. The applications of the proposed procedures are illustrated through two longitudinal binary datasets.

6.
We show how to make inference about a finite population proportion using data from a possibly biased sample. In the absence of any selection bias or survey weights, a simple ignorable selection model, which assumes that the binary responses are independent and identically distributed Bernoulli random variables, is not unreasonable. However, this ignorable selection model is inappropriate when there is a selection bias in the sample. We assume that the survey weights (or their reciprocals which we call ‘selection’ probabilities) are available, but there is no simple relation between the binary responses and the selection probabilities. To capture the selection bias, we assume that there is some correlation between the binary responses and the selection probabilities (e.g., there may be a somewhat higher/lower proportion of positive responses among the sampled units than among the nonsampled units). We use a Bayesian nonignorable selection model to accommodate the selection mechanism. We use Markov chain Monte Carlo methods to fit the nonignorable selection model. We illustrate our method using numerical examples obtained from NHIS 1995 data.

7.
The problem of classification into two univariate normal populations with a common mean is considered. Several classification rules are proposed based on efficient estimators of the common mean. Detailed numerical comparisons of the probabilities of misclassification using these rules have been carried out. It is shown that the classification rule based on the Graybill-Deal estimator of the common mean performs the best. Classification rules are also proposed for the case when variances are assumed to be ordered. Comparison of these rules with the rule based on the Graybill-Deal estimator has been done with respect to individual probabilities of misclassification.
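
For orientation, the Graybill-Deal estimator combines the two sample means with weights proportional to each sample size divided by its sample variance. The sketch below computes it and then applies a simple plug-in likelihood rule that assigns a new observation to the population whose fitted normal density is larger. The plug-in rule and the function names are assumptions for illustration only; the abstract does not spell out the exact rules compared in the paper.

```python
import math
from statistics import mean, variance

def graybill_deal(x1, x2):
    """Graybill-Deal estimate of a common mean from two samples."""
    w1 = len(x1) / variance(x1)  # weight n1 / s1^2
    w2 = len(x2) / variance(x2)  # weight n2 / s2^2
    return (w1 * mean(x1) + w2 * mean(x2)) / (w1 + w2)

def normal_logpdf(z, mu, var):
    """Log density of N(mu, var) at z."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (z - mu) ** 2 / var)

def classify(z, x1, x2):
    """Hypothetical plug-in rule: return 1 or 2 for the likelier population."""
    mu = graybill_deal(x1, x2)
    l1 = normal_logpdf(z, mu, variance(x1))
    l2 = normal_logpdf(z, mu, variance(x2))
    return 1 if l1 >= l2 else 2

if __name__ == "__main__":
    tight = [1.0, 2.0, 3.0]   # small variance
    wide = [0.0, 6.0, 12.0]   # large variance
    print(graybill_deal(tight, wide))
    print(classify(2.1, tight, wide), classify(20.0, tight, wide))
```

Because the two populations share a mean, the rule separates them only through their variances: observations near the estimated mean go to the low-variance population and observations far from it go to the high-variance one.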

8.
Estimated associations between an outcome variable and misclassified covariates tend to be biased when estimation methods that ignore the classification error are applied. Available methods to account for misclassification often require the use of a validation sample (i.e. a gold standard). In practice, however, such a gold standard may be unavailable or impractical. We propose a Bayesian approach to adjust for misclassification in a binary covariate in the random effect logistic model when a gold standard is not available. This Markov Chain Monte Carlo (MCMC) approach uses two imperfect measures of a dichotomous exposure under the assumptions of conditional independence and non-differential misclassification. A simulated numerical example and a real clinical example are given to illustrate the proposed approach. Our results suggest that the estimated log odds of inpatient care and the corresponding standard deviation are much larger in our proposed method compared with the models ignoring misclassification. Ignoring misclassification produces downwardly biased estimates and underestimates uncertainty.

9.
We consider a Bayesian nonignorable model to accommodate a nonignorable selection mechanism for predicting small area proportions. Our main objective is to extend a model on selection bias in a previously published paper, coauthored by four authors, to accommodate small areas. These authors assume that the survey weights (or their reciprocals that we also call selection probabilities) are available, but there is no simple relation between the binary responses and the selection probabilities. To capture the nonignorable selection bias within each area, they assume that the binary responses and the selection probabilities are correlated. To accommodate the small areas, we extend their model to a hierarchical Bayesian nonignorable model and we use Markov chain Monte Carlo methods to fit it. We illustrate our methodology using a numerical example obtained from data on activity limitation in the U.S. National Health Interview Survey. We also perform a simulation study to assess the effect of the correlation between the binary responses and the selection probabilities.

10.
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. Longitudinal data are often analyzed through the generalized estimating equations (GEE) approach. The vast majority of the existing literature on the GEE method, however, is developed under non‐survey settings and is inappropriate for data collected through complex sampling designs. In this paper the authors develop a pseudo‐GEE approach for the analysis of survey data. They show that survey weights must and can be appropriately accounted for in the GEE method under a joint randomization framework. The consistency of the resulting pseudo‐GEE estimators is established under the proposed framework. Linearization variance estimators are developed for the pseudo‐GEE estimators when the finite population sampling fractions are small or negligible, a scenario that often holds for large‐scale surveys. Finite sample performances of the proposed estimators are investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth. The results show that the pseudo‐GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses. The Canadian Journal of Statistics 38: 540–554; 2010 © 2010 Statistical Society of Canada

11.
The importance of discrete spatial models cannot be overemphasized, especially when measuring living standards. The battery of measurements is generally categorical, with nearer geo-referenced observations featuring stronger dependencies. This study presents a Clipped Gaussian Geo-Classification (CGG-C) model for spatially-dependent ordered data, and compares its performance with existing methods to classify household poverty using Ghana living standards survey (GLSS 6) data. Bayesian inference was performed on data sampled by MCMC. Model evaluation was based on measures of classification and prediction accuracy. Spatial associations, given some household features, were quantified, and a poverty classification map for Ghana was developed. Overall, the results of estimation showed that many of the statistically significant covariates were generally strongly related to the ordered response variable. Households at specific locations tended to uniformly experience specific levels of poverty, thus providing an empirical spatial character of poverty in Ghana. A comparative analysis of validation results showed that the CGG-C model (with a 14.2% misclassification rate) outperformed the Cumulative Probit (CP) model, which had a misclassification rate of 17.4%. This approach to poverty analysis is relevant for policy design and the implementation of cost-effective programmes to reduce category and site-specific poverty incidence, and monitor changes in both category and geographical trends thereof. KEYWORDS: Ordered responses, spatial correlation, Bayesian estimation via MCMC, Gaussian random fields, poverty classification

12.
In this article we focus on logistic regression models for binary responses. An existing result shows that the log-odds can be modelled in terms of the log of the ratio between the conditional densities of the predictors given the response variable. This suggests that relevant statistical information could be extracted by investigating the inverse problem. Thus, we present different methods for studying the log-density ratio through graphs, which allow us to select which predictors are needed, and how they should be included in a logistic regression model. We also discuss data analysis examples based on real datasets available in the literature in order to provide further insights into the methodology proposed.

13.
This paper proposes a new approach to the treatment of item non-response in attitude scales. It combines the ideas of latent variable identification with the issues of non-response adjustment in sample surveys. The latent variable approach allows missing values to be included in the analysis and, equally importantly, allows information about attitude to be inferred from non-response. We present a symmetric pattern methodology for handling item non-response in attitude scales. The methodology is symmetric in that all the variables are given equivalent status in the analysis (none is designated a 'dependent' variable) and is pattern based in that the pattern of responses and non-responses across individuals is a key element in the analysis. Our approach to the problem is through a latent variable model with two latent dimensions: one to summarize response propensity and the other to summarize attitude, ability or belief. The methodology presented here can handle binary, metric and mixed (binary and metric) manifest items with missing values. Examples using both artificial data sets and two real data sets are used to illustrate the mechanism and the advantages of the methodology proposed.

14.
Survival data analysis aims at collecting data on durations spent in a state by a sample of units, in order to analyse the process of transition to a different state. Survival analysis applied to social and economic phenomena typically relies upon data on transitions collected, for a sample of units, in one or more follow-up surveys. We explore the effect of misclassification of the transition indicator on parameter estimates in an appropriate statistical model for the duration spent in an origin state. Some empirical investigations about the bias induced when ignoring misclassification are reported, extending the model to include the possibility that the rate of misclassification can vary across units according to the value of some covariates. Finally it is shown how a Bayesian approach can lead to parameter estimates.

15.
Despite tremendous effort on different designs with cross-sectional data, little research has been conducted on sample size calculation and power analysis under repeated measures designs. In addition to the time-averaged difference, changes in mean response over time (CIMROT) are of primary interest in repeated measures analysis. We generalized sample size calculation and power analysis equations for CIMROT to allow unequal sample sizes between groups for both continuous and binary measures, evaluated the performance of the proposed methods through simulation, and compared our approach to that of a two-stage model formulation. We also created a software procedure to implement the proposed methods.

16.
This article considers Bayesian estimation methods for categorical data with misclassifications. To adjust for misclassification, double sampling schemes are utilized. Observations are represented in a contingency table categorized by error-free categorical variables and error-prone categorical variables. Posterior means of probabilities in cells are considered as estimates. In some cases, the posterior means can be calculated exactly. However, in other cases, the exact calculation may be too difficult to perform, but we can easily use the expectation-maximization (EM) algorithm to obtain approximate posterior means.

17.
In this paper, we study the maximum likelihood estimation of a model with mixed binary responses and censored observations. The model is very general and includes the Tobit model and the binary choice model as special cases. We show that, by using additional binary choice observations, our method is more efficient than the traditional Tobit model. Two iterative procedures are proposed to compute the maximum likelihood estimator (MLE) for the model based on the EM algorithm (Dempster et al., 1977) and the Newton-Raphson method. The uniqueness of the MLE is proved. The simulation results show that the inconsistency and inefficiency can be significant when the Tobit method is applied to the present mixed model. The experimental results also suggest that the EM algorithm is much faster than the Newton-Raphson method for the present mixed model. The method also allows one to combine two data sets, the smaller data set with more detailed observations and the larger data set with less detailed binary choice observations, in order to improve the efficiency of estimation. This may entail substantial savings when one conducts surveys.

18.
In this paper, we study the identification of Bayesian regression models when an ordinal covariate is subject to unidirectional misclassification. Xia and Gustafson [Bayesian regression models adjusting for unidirectional covariate misclassification. Can J Stat. 2016;44(2):198–218] obtained model identifiability for non-binary regression models when there is a binary covariate subject to unidirectional misclassification. In the current paper, we establish the moment identifiability of regression models for misclassified ordinal covariates with more than two categories, based on forms of observable moments. Computational studies are conducted that confirm the theoretical results. We apply the method to two datasets, one from the Medical Expenditure Panel Survey (MEPS), and the other from Translational Research Investigating Underlying Disparities in Acute Myocardial Infarction Patients' Health Status (TRIUMPH).

19.
In recent years, spatial lattice data have been a motivating topic for researchers. Modeling of binary variables observed at locations on a spatial lattice has been well investigated, and the autologistic model is a popular tool for analyzing these data. However, there are many situations where binary responses are clustered in several uncorrelated lattices, and only a few studies were found that investigate the modeling of binary data distributed in such a spatial structure. Moreover, because of spatial dependency in the data, exact likelihood analysis is not possible. Bayesian inference for the autologistic model often has limitations and difficulties because of the intractability of its normalizing constant. In this study, spatially correlated binary data clustered in uncorrelated lattices are modeled via autologistic regression, and the IBF (inverse Bayes formulas) sampler, with the help of latent variables, is extended for posterior analysis and parameter estimation. The proposed methodology is illustrated using simulated and real observations.

20.
We consider data with a nominal grouping variable and a binary response variable. The grouping variable is measured without error, but the response variable is measured using a fallible device subject to misclassification. To achieve model identifiability, we use the double-sampling scheme, which requires obtaining a subsample of the original data or another independent sample. This sample is then classified by both the fallible device and another infallible device regarding the response variable. We propose two Wald tests for testing the association between the two variables and illustrate the tests using traffic data. The Type-I error rate and power of the tests are examined using simulations, and a modified Wald test is recommended.
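
To make the double-sampling idea concrete, the sketch below estimates the true proportion of positive responses within one group by combining a validated subsample (classified by both devices) with the remaining units (classified by the fallible device only). It uses a simple moment-style estimator: the share of each fallible label across all units, multiplied by the validated rate of true positives given that label. This estimator, the data layout, and the function name are assumptions for illustration; they are not the Wald tests proposed in the paper.

```python
def double_sampling_proportion(validated, fallible_only):
    """Estimate P(true response = 1) under a double-sampling design.

    validated: list of (true_label, fallible_label) pairs, labels in {0, 1}.
    fallible_only: list of fallible labels for units without validation.
    Both argument names are hypothetical, chosen for this sketch.
    """
    all_fallible = [f for _, f in validated] + list(fallible_only)
    n = len(all_fallible)
    estimate = 0.0
    for label in (0, 1):
        # Share of units the fallible device assigns to this label.
        share = sum(1 for f in all_fallible if f == label) / n
        # True labels of validated units that received this fallible label.
        truths = [t for t, f in validated if f == label]
        if truths:
            estimate += share * sum(truths) / len(truths)
    return estimate

if __name__ == "__main__":
    validated = [(1, 1), (1, 1), (0, 1), (0, 0), (0, 0), (1, 0)]
    fallible_only = [1, 1, 0, 0, 0, 0]
    print(double_sampling_proportion(validated, fallible_only))
```

Applying the function separately to each level of the grouping variable gives corrected group proportions, which is the kind of quantity an association test would then compare.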
