
1.

Bayesian adjustment for unidirectional misclassification in ordinal covariates

Liangrui Sun Michelle Xia Yuanyuan Tang Philip G. Jones 《Journal of Statistical Computation and Simulation》2017,87(18):3440-3468

In this paper, we study the identification of Bayesian regression models, when an ordinal covariate is subject to unidirectional misclassification. Xia and Gustafson [Bayesian regression models adjusting for unidirectional covariate misclassification. Can J Stat. 2016;44(2):198–218] obtained model identifiability for non-binary regression models, when there is a binary covariate subject to unidirectional misclassification. In the current paper, we establish the moment identifiability of regression models for misclassified ordinal covariates with more than two categories, based on forms of observable moments. Computational studies are conducted that confirm the theoretical results. We apply the method to two datasets, one from the Medical Expenditure Panel Survey (MEPS), and the other from Translational Research Investigating Underlying Disparities in Acute Myocardial Infarction Patients' Health Status (TRIUMPH).

2.

Weibull multi-state models with misclassification

Charles D. G. Keown-Stoneman Julie Horrocks Gerarda A. Darlington 《Communications in Statistics - Simulation and Computation》2019,48(1):39-57

It is often important to allow multi-state models (MSMs) to accommodate misclassification of states. We introduce Bayesian parametric MSMs with unknown misclassification of states and Weibull distributed waiting times between states. This allows transitions between states to depend on the time spent in the current state, a feature lacking in the commonly used exponential waiting time models. To fit the proposed model, an MCMC algorithm was employed. An example on the progression of bipolar disorder is presented along with simulation results. There was evidence that Weibull waiting times are an improvement over exponential waiting times in the study of bipolar disorder.
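The contrast between Weibull and exponential waiting times drawn in the abstract above can be illustrated with the standard Weibull hazard, h(t) = (k/λ)(t/λ)^(k−1). The following is a minimal sketch, not the authors' model: the function name and the shape and scale values are hypothetical and chosen only for illustration. With shape k = 1 the hazard is constant (the exponential case); with k ≠ 1 it depends on the time already spent in the state.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape and scale must all be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


# Illustrative (assumed) sojourn times in the current state.
times = [0.5, 1.0, 2.0, 4.0]

# shape = 1 reduces to the exponential model: the hazard ignores elapsed time.
exponential = [weibull_hazard(t, shape=1.0, scale=2.0) for t in times]

# shape > 1: the chance of leaving the state grows with time spent in it.
increasing = [weibull_hazard(t, shape=1.5, scale=2.0) for t in times]

for t, h_exp, h_wei in zip(times, exponential, increasing):
    print(f"t={t:.1f}  exponential hazard={h_exp:.3f}  Weibull hazard={h_wei:.3f}")
```

A shape parameter below 1 would give a decreasing hazard instead, which is the other time-dependent pattern the exponential model cannot represent.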

3.

Bayesian estimation of logistic regression with misclassified covariates and response

Brandi N. Falley James D. Stamey A. Alexander Beaujean 《Journal of applied statistics》2018,45(10):1756-1769

Measurement error is a commonly addressed problem in psychometrics and the behavioral sciences, particularly where gold standard data either do not exist or are too expensive. The Bayesian approach can be utilized to adjust for the bias that results from measurement error in tests. Bayesian methods offer other practical advantages for the analysis of epidemiological data, including the possibility of incorporating relevant prior scientific information and the ability to make inferences that do not rely on large sample assumptions. In this paper we consider a logistic regression model where both the response and a binary covariate are subject to misclassification. We assume both a continuous measure and a binary diagnostic test are available for the response variable but no gold standard test is assumed available. We consider a fully Bayesian analysis that affords such adjustments, accounting for the sources of error and correcting estimates of the regression parameters. Based on the results from our example and simulations, the models that account for misclassification produce more statistically significant results than the models that ignore misclassification. A real data example on math disorders is considered.

4.

Maximum-likelihood and closed-form estimators of epidemiologic measures under misclassification

Sander Greenland 《Journal of statistical planning and inference》2008

There is a large literature on estimation under misclassification. The present paper reviews epidemiologic inference under misclassification in the multiway contingency-table setting, and addresses a few controversial issues. In the 1990s, claims of inefficiency of early closed-form estimators of odds ratios under misclassification arose from misapplication of the estimators to studies with internal validation. In reality, these estimators are maximum likelihood (ML) and hence efficient under the external-validation assumptions used for their derivation. For the internal-validation case, a new closed-form estimator is derived that incorporates the nondifferentiality constraint into the predictive-value (“direct” or “inverse-matrix”) estimator. Results are presented in a general framework that applies to misclassification in models for multiway tables, and that allows the target parameter to be any measure of association or effect.

5.

A Bayesian Adjustment for Covariate Misclassification with Correlated Binary Outcome Data

Dianxu Ren Roslyn A. Stone 《Journal of applied statistics》2007,34(9):1019-1034

Estimated associations between an outcome variable and misclassified covariates tend to be biased when estimation methods that ignore the classification error are applied. Available methods to account for misclassification often require the use of a validation sample (i.e. a gold standard). In practice, however, such a gold standard may be unavailable or impractical. We propose a Bayesian approach to adjust for misclassification in a binary covariate in the random effect logistic model when a gold standard is not available. This Markov Chain Monte Carlo (MCMC) approach uses two imperfect measures of a dichotomous exposure under the assumptions of conditional independence and non-differential misclassification. A simulated numerical example and a real clinical example are given to illustrate the proposed approach. Our results suggest that the estimated log odds of inpatient care and the corresponding standard deviation are much larger under our proposed method than under the models ignoring misclassification. Ignoring misclassification produces downwardly biased estimates and underestimates uncertainty.

6.

Analysis of multivariate categorical data with misclassification errors by triple sampling schemes

T. Timothy Chen Yosef Hochberg Aaron Tenenbein 《Journal of statistical planning and inference》1984,9(2):177-184

Previous work has been carried out on the use of double-sampling schemes for inference from categorical data subject to misclassification. The double-sampling schemes utilize a sample of n units classified by both a fallible and a true device and another sample of n₂ units classified only by a fallible device. In actual applications, one often has available a third sample of n₁ units, which is classified only by the true device. In this article we develop techniques of fitting log-linear models under various misclassification structures for a general triple-sampling scheme. The estimation is by maximum likelihood and the fitted models are hierarchical. The methodology is illustrated by applying it to data in traffic safety research from a study on the effectiveness of belts in reducing injuries.

7.

Model selection using discriminant analysis

Timothy J. Novotny Lyman L. McDonald 《Journal of applied statistics》1986,13(2):159-165

A researcher is often confronted with the difficult and subjective task of determining which of m models best fits a set of observed data. A general robust statistical procedure for model selection is examined which uses discriminant analysis on significance levels resulting from various tests of hypotheses concerning the models. The use of Monte Carlo simulation to obtain the significance levels associated with the tests is presented. The technique is illustrated by application to four band recovery models useful in wildlife studies. Error rates due to misclassification are also reported.

8.

Addressing misclassification for binary data: probit and t-link regressions

《Journal of Statistical Computation and Simulation》2012,82(10):2187-2213

Generalized linear models are considered for describing the dependence of data on explanatory variables when the binary outcome is subject to misclassification. Both probit and t-link regressions for misclassified binary data under Bayesian methodology are proposed. The computational difficulties have been avoided by using data augmentation. The idea of using a data augmentation framework (with two types of latent variables) is exploited to derive efficient Gibbs sampling and expectation–maximization algorithms. In addition, this formulation allows the probit model to be obtained as a particular case of the t-link model. Simulation examples are presented to illustrate the model performance in comparison with standard methods that do not consider misclassification. In order to show the potential of the proposed approaches, a real data problem arising in the study of hearing loss caused by exposure to occupational noise is analysed.

9.

Does the supplemental nutrition assistance program really increase obesity? The importance of accounting for misclassification errors

Achilleas Vassilopoulos Andreas C. Drichoutis Rodolfo M. Nayga Jr. Panagiotis Lazaridis 《Journal of applied statistics》2018,45(12):2269-2278

The prevalence of obesity among US citizens has grown rapidly over the last few decades, especially among low-income individuals. This has led to questions about the effectiveness of nutritional assistance programs such as the Supplemental Nutrition Assistance Program (SNAP). Previous results on the effect of SNAP participation on obesity are mixed. These findings are, however, based on the assumption that participation status can be accurately observed, despite significant misclassification errors reported in the literature. Using propensity score matching, we conclude that there seems to be a positive effect of SNAP participation on obesity rates for female participants and no such effect for males, a result that is consistent with several previous studies. However, an extensive sensitivity analysis reveals that the positive effect for females is sensitive to misclassification errors and to the conditional independence assumption. Thus analogous findings should also be used with caution unless examined under the prism of classification errors and of other assumptions used for the identification of causal parameters.

10.

Inference for Bivariate Survival Data by Copula Models Adjusted for the Boundary Effect

Aidong Adam Ding Weijing Wang 《Communications in Statistics - Theory and Methods》2013,42(16):2927-2936

Copula models describe the dependence structure of two random variables separately from their marginal distributions and hence are particularly useful in studying the association for bivariate survival data. Semiparametric inference for bivariate survival data based on copula models has been studied for various types of data, including complete data, right-censored data, and current status data. This article discusses the boundary effect on these inference procedures, a problem that has been neglected in the previous literature. Specifically, the asymptotic distribution of the association estimator on the boundary of the parameter space is derived for one-dimensional copula models. The boundary properties are applied to test independence and to study the estimation efficiency. A simulation study is conducted for bivariate right-censored data and current status data.

11.

Dynamic latent trait models with mixed hidden Markov structure for mixed longitudinal outcomes

Yue Zhang Kiros Berhane 《Journal of applied statistics》2016,43(4):704-720

We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family while taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight into the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.

12.

The estimation of gross flows in the presence of measurement error using auxiliary variables

Danny Pfeffermann Chris Skinner & Keith Humphreys 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》1998,161(1):13-32

Classification error can lead to substantial biases in the estimation of gross flows from longitudinal data. We propose a method to adjust flow estimates for bias, based on fitting separate multinomial logistic models to the classification error probabilities and the true state transition probabilities using values of auxiliary variables. Our approach has the advantages that it does not require external information on misclassification rates, it permits the identification of factors that are related to misclassification and true transitions and it does not assume independence between classification errors at successive points in time. Constraining the prediction of the stocks to agree with the observed stocks protects against model misspecification. We apply the approach to data on women from the Panel Study of Income Dynamics with three categories of labour force status. The model fitted is shown to have interpretable coefficient estimates and to provide a good fit. Simulation results indicate good performance of the model in predicting the true flows and robustness against departures from the model postulated.

13.

Bayesian misclassification and propensity score methods for clustered observational studies

Qi Zhou Yoo-Mi Chin James D. Stamey 《Journal of applied statistics》2018,45(9):1547-1560

Bayesian propensity score regression analysis with misclassified binary responses is proposed to analyse clustered observational data. This approach utilizes multilevel models and corrects for misclassification in the responses. Using the deviance information criterion (DIC), the performance of the approach is compared with approaches without correcting for misclassification, multilevel structure specification, or both in the study of the impact of female employment on the likelihood of physical violence. The smallest DIC confirms that our proposed model best fits the data. We conclude that female employment has an insignificant impact on the likelihood of physical spousal violence towards women. In addition, a simulation study confirms that the proposed approach performed best in terms of bias and coverage rate. Ignoring misclassification in response or multilevel structure of data would yield biased estimation of the exposure effect.

14.

The effect of unequal priors and unequal misclassification costs on MDA

Patricia M. Rudolph Marvin Karson 《Journal of applied statistics》1988,15(1):69-83

Multiple discriminant analysis (MDA) is a frequently used statistical technique. Although the dependence of this technique on the underlying assumptions concerning population priors and misclassification costs is well known, the assumption most often made by researchers is that both population priors and misclassification costs are equal. The purpose of this paper is to demonstrate the magnitude of the effect of these assumptions on statistical results. In the savings and loan case used here, the population priors are known; however, the relative misclassification costs are not. To test the sensitivity of the results to the unknown misclassification costs, several different misclassification cost assumptions are used.

15.

Errors of misclassification in discrimination with data from truncated t populations

Apostolos Batsidis 《Statistical Papers》2012,53(2):281-298

This paper derives the distribution of the probabilities of misclassification that result from the use of the linear discriminant function. The statistical background is two independent doubly truncated t populations with distinct location parameters and common scale parameter and degrees of freedom. The behavior of the linear discriminant function is studied by comparing the distribution function of the errors of misclassification under the truncated t and truncated normal models.

16.

A credit scoring model based on an improved AdaBoost algorithm

杨海江 魏秋萍 张景肖 《统计与信息论坛》 (Statistics & Information Forum) 2011,26(2):27-31

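The AdaBoost ensemble algorithm is applied to the classification problem in credit scoring models, and the algorithm is improved to address some of its shortcomings in handling imbalanced classification problems. Using this improved AdaBoost algorithm, a new credit scoring model is built and an empirical analysis is carried out. The empirical results show that the credit scoring model based on the improved AdaBoost algorithm can effectively reduce the losses caused by model misclassification.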

17.

Models with Errors due to Misreported Measurements

Brent Henderson Richard Jarrett 《Australian & New Zealand Journal of Statistics》2003,45(4):431-444

Measurement error and misclassification models feature prominently in the literature. This paper describes misreporting error, which can be considered to fall somewhere between these two broad types of model. Misreporting is concerned with situations where a continuous random variable X is measured with error and only reported as the discrete random variable Z. Data grouping or rounding are the simplest examples of this, but more generally X may be reported as a value z of Z which refers to a different interval from the one in which X lies. The paper discusses a method for handling misreported data and draws links with measurement error and misclassification models. A motivating example is considered from prenatal Down's syndrome screening, where the gestational age at which mothers present for screening is a true continuous variable but is misreported because it is only ever observed as a discrete whole number of weeks, which may itself be in error. The implications this misreporting might have for the screening are investigated.

18.

Phylogenetic tree selection by the adjusted k-means approach

Hsiuying Wang Shan-Lin Hung 《Journal of applied statistics》2012,39(3):643-655

The reconstruction of phylogenetic trees is one of the most important and interesting problems in evolutionary studies. Many methods have been proposed in the literature for constructing phylogenetic trees. Each approach is based on different criteria and evolutionary models. However, the topologies of trees constructed by different methods may be quite different. The topological errors may be due to unsuitable criteria or evolutionary models. Since there are many tree construction approaches, we are interested in selecting a better tree to fit the true model. In this study, we propose an adjusted k-means approach and a misclassification error score criterion to solve the problem. The simulation study shows that this method can select better trees among the potential candidates, which provides a useful tool for phylogenetic tree selection.

19.

Why do we observe misclassification errors smaller than the Bayes error?

《Journal of Statistical Computation and Simulation》2012,82(5):717-722

In simulation studies for discriminant analysis, misclassification errors are often computed using the Monte Carlo method, by testing a classifier on large samples generated from known populations. Although large samples are expected to behave closely to the underlying distributions, they may not do so in a small interval or region, and thus may lead to unexpected results. We demonstrate with an example that the LDA misclassification error computed via the Monte Carlo method may often be smaller than the Bayes error. We give a rigorous explanation and recommend a method to properly compute misclassification errors.
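The phenomenon described in the abstract above can be reproduced in a simplified setting. The sketch below is an illustration under stated assumptions, not the paper's example: it assumes two univariate normal populations N(0, 1) and N(delta, 1) with equal priors, for which the optimal rule classifies at the midpoint delta / 2 and the Bayes error is Φ(−delta / 2). It applies that known optimal rule (rather than a fitted LDA) to simulated test samples; all function names, the seed, and the sample sizes are hypothetical.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bayes_error(delta: float) -> float:
    """Bayes error for N(0, 1) vs N(delta, 1) with equal priors."""
    return normal_cdf(-delta / 2.0)


def monte_carlo_error(delta: float, n: int, rng: random.Random) -> float:
    """Empirical error of the optimal midpoint rule on n simulated test points."""
    errors = 0
    for _ in range(n):
        label = rng.random() < 0.5              # True -> population with mean delta
        x = rng.gauss(delta if label else 0.0, 1.0)
        predicted = x > delta / 2.0             # optimal threshold for equal priors
        if predicted != label:
            errors += 1
    return errors / n


rng = random.Random(0)                          # fixed seed for reproducibility
delta = 2.0
exact = bayes_error(delta)

large_sample = monte_carlo_error(delta, 20_000, rng)
small_samples = [monte_carlo_error(delta, 200, rng) for _ in range(50)]
below = sum(e < exact for e in small_samples)

print(f"Bayes error:               {exact:.4f}")
print(f"Monte Carlo (n = 20000):   {large_sample:.4f}")
print(f"Runs below the Bayes error: {below} of {len(small_samples)} (n = 200 each)")
```

Because each Monte Carlo estimate is a sample proportion that fluctuates around the true error, individual runs can fall below the Bayes error even though no classifier can beat it in expectation.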

20.

Bayesian sample size determination for estimating binomial parameters from data subject to misclassification

E. Rahme L. Joseph & T. W. Gyorkos 《Journal of the Royal Statistical Society. Series C, Applied statistics》2000,49(1):119-128

We investigate the sample size problem when a binomial parameter is to be estimated, but some degree of misclassification is possible. The problem is especially challenging when the degree to which misclassification occurs is not exactly known. Motivated by a Canadian survey of the prevalence of toxoplasmosis infection in pregnant women, we examine the situation where it is desired that a marginal posterior credible interval for the prevalence of width w has coverage 1−α, using a Bayesian sample size criterion. The degree to which the misclassification probabilities are known a priori can have a very large effect on sample size requirements, and in some cases achieving a coverage of 1−α is impossible, even with an infinite sample size. Therefore, investigators must carefully evaluate the degree to which misclassification can occur when estimating sample size requirements.
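To see why misclassification matters for estimating a binomial parameter, it may help to look at the basic relation between the true prevalence and the proportion that is actually observed. The sketch below is not the Bayesian sample size criterion of the paper: it assumes the sensitivity and specificity of the test are known exactly (the paper treats them as uncertain), and the function names and numeric values are hypothetical. The inversion used here is the standard correction commonly called the Rogan-Gladen estimator.

```python
def apparent_prevalence(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability of a positive test: true positives plus false positives."""
    return prevalence * sensitivity + (1.0 - prevalence) * (1.0 - specificity)


def corrected_prevalence(apparent: float, sensitivity: float, specificity: float) -> float:
    """Invert the relation above, truncating the result to [0, 1]."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        raise ValueError("the test must be informative: sensitivity + specificity > 1")
    estimate = (apparent + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))


# Illustrative (assumed) values: 20% true prevalence, imperfect test.
observed = apparent_prevalence(prevalence=0.20, sensitivity=0.90, specificity=0.95)
recovered = corrected_prevalence(observed, sensitivity=0.90, specificity=0.95)

print(f"apparent prevalence:  {observed:.3f}")
print(f"corrected prevalence: {recovered:.3f}")
```

When sensitivity and specificity are themselves uncertain, the denominator is no longer fixed, which gives some intuition for the abstract's remark that the target coverage can be unattainable even with an infinite sample.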