排序方式: 共有76条查询结果,搜索用时 312 毫秒
1.
Estimated associations between an outcome variable and misclassified covariates tend to be biased when the methods of estimation that ignore the classification error are applied. Available methods to account for misclassification often require the use of a validation sample (i.e. a gold standard). In practice, however, such a gold standard may be unavailable or impractical. We propose a Bayesian approach to adjust for misclassification in a binary covariate in the random effect logistic model when a gold standard is not available. This Markov Chain Monte Carlo (MCMC) approach uses two imperfect measures of a dichotomous exposure under the assumptions of conditional independence and non-differential misclassification. A simulated numerical example and a real clinical example are given to illustrate the proposed approach. Our results suggest that the estimated log odds of inpatient care and the corresponding standard deviation are much larger in our proposed method compared with the models ignoring misclassification. Ignoring misclassification produces downwardly biased estimates and underestimate uncertainty. 相似文献
2.
Liangrui Sun Michelle Xia Yuanyuan Tang Philip G. Jones 《Journal of Statistical Computation and Simulation》2017,87(18):3440-3468
In this paper, we study the identification of Bayesian regression models, when an ordinal covariate is subject to unidirectional misclassification. Xia and Gustafson [Bayesian regression models adjusting for unidirectional covariate misclassification. Can J Stat. 2016;44(2):198–218] obtained model identifiability for non-binary regression models, when there is a binary covariate subject to unidirectional misclassification. In the current paper, we establish the moment identifiability of regression models for misclassified ordinal covariates with more than two categories, based on forms of observable moments. Computational studies are conducted that confirm the theoretical results. We apply the method to two datasets, one from the Medical Expenditure Panel Survey (MEPS), and the other from Translational Research Investigating Underlying Disparities in Acute Myocardial infarction Patients Health Status (TRIUMPH). 相似文献
3.
4.
《Journal of Statistical Computation and Simulation》2012,82(5):717-722
In simulation studies for discriminant analysis, misclassification errors are often computed using the Monte Carlo method, by testing a classifier on large samples generated from known populations. Although large samples are expected to behave closely to the underlying distributions, they may not do so in a small interval or region, and thus may lead to unexpected results. We demonstrate with an example that the LDA misclassification error computed via the Monte Carlo method may often be smaller than the Bayes error. We give a rigorous explanation and recommend a method to properly compute misclassification errors. 相似文献
5.
《Journal of Statistical Computation and Simulation》2012,82(6):1187-1199
Methods are proposed to combine several individual classifiers in order to develop more accurate classification rules. The proposed algorithm uses Rademacher–Walsh polynomials to combine M (≥2) individual classifiers in a nonlinear way. The resulting classifier is optimal in the sense that its misclassification error rate is always less than, or equal to, that of each constituent classifier. A number of numerical examples (based on both real and simulated data) are also given. These examples demonstrate some new, and far-reaching, benefits of working with combined classifiers. 相似文献
6.
Raffaella Calabrese 《Journal of applied statistics》2014,41(8):1678-1693
This paper develops a method for handling two-class classification problems with highly unbalanced class sizes and misclassification costs. When the class sizes are highly unbalanced and the minority class represents a rare event, conventional classification methods tend to strongly favour the majority class, resulting in very low detection of the minority class. A method is proposed to determine the optimal cut-off for asymmetric misclassification costs and for unbalanced class sizes. Monte Carlo simulations show that this proposal performs better than the method based on the notion of classification accuracy. Finally, the proposed method is applied to empirical data on Italian small and medium enterprises to classify them into default and non-default groups. 相似文献
7.
8.
《Journal of Statistical Computation and Simulation》2012,82(9):1145-1156
We investigate three interval estimators for binomial misclassification rates in a complementary Poisson model where the data are possibly misclassified: a Wald-based interval, a score-based interval, and an interval based on the profile log-likelihood statistic. We investigate the coverage and average width properties of these intervals via a simulation study. For small Poisson counts and small misclassification rates, the intervals can perform poorly in terms of coverage. The profile log-likelihood confidence interval (CI) is often proved to outperform the other intervals with good coverage and width properties. Lastly, we apply the CIs to a real data set involving traffic accident data that contain misclassified counts. 相似文献
9.
C.Y. Leung 《统计学通讯:理论与方法》2013,42(5):1659-1667
The performance of Anderson's classification statistic based on a post-stratified random sample is examined. It is assumed that the training sample is a random sample from a stratified population consisting of two strata with unknown stratum weights. The sample is first segregated into the two strata by post-stratification. The unknown parameters for each of the two populations are then estimated and used in the construction of the plug-in discriminant. Under this procedure, it is shown that additional estimation of the stratum weight will not seriously affect the performance of Anderson's classification statistic. Furthermore, our discriminant enjoys a much higher efficiency than the procedure based on an unclassified sample from a mixture of normals investigated by Ganesalingam and McLachlan (1978). 相似文献
10.
在数据挖掘领域中,通常以分类精度作为分类算法效果的评估标准。这一标准是建立在假设任意一实例被误分类为任意类时都具备同样代价的基础上的。当此假设不成立时,直接使用传统分类方法就无法取得良好的分类和预测效果。针对这一问题,通过改进编解码方法以及在适应度函数中集成样本的不同误分类代价,提出了一种基于基因表达式程序设计的代价敏感分类算法(CSC-GEP),并在三个UCI数据集上对该算法进行了测试,实验结果表明CSC-GEP是一种有效的代价敏感分类算法。 相似文献