
1.

Comparing the methods of measuring multi-rater agreement on an ordinal rating scale: a simulation study with an application to real data

Y. Sertdemir, H. R. Burgut, Z. N. Alparslan, I. Unal, S. Gunasti 《Journal of applied statistics》2013,40(7):1506-1519

Agreement among raters is an important issue in medicine, as well as in education and psychology. The agreement between two raters on a nominal or ordinal rating scale has been investigated in many articles. The multi-rater case with normally distributed ratings has also been explored at length. However, there is a lack of research on multiple raters using an ordinal rating scale. In this simulation study, several methods of analyzing rater agreement were compared. The special case that was focused on was the multi-rater case using a bounded ordinal rating scale. The proposed methods for agreement were compared within different settings. Three main ordinal data simulation settings were used (normal, skewed and shifted data). In addition, the proposed methods were applied to a real data set from dermatology. The simulation results showed that Kendall's W and mean gamma highly overestimated the agreement in data sets with shifts in data. ICC₄ for bounded data should be avoided in agreement studies with rating scales < 5, where this method highly overestimated the simulated agreement. The difference in bias for all methods under study, except the mean gamma and Kendall's W, decreased as the rating scale increased. The bias of ICC₃ was consistent and small for nearly all simulation settings except the low agreement setting in the shifted data set. Researchers should be careful in selecting agreement methods, especially if shifts in ratings between raters exist, and may wish to apply more than one method before drawing any conclusions.
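
As a rough illustration of why a rank-based coefficient can overstate agreement when one rater is shifted relative to the others, the following Python sketch computes Kendall's W (without a tie correction) for a small set of hypothetical ratings. The data, the six-point scale and the function names are illustrative assumptions and are not taken from the study.

```python
def average_ranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for m raters (rows) and n subjects, no tie correction."""
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def exact_agreement(a, b):
    """Proportion of subjects on which two raters give the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical ratings: rater B scores every subject one point higher than A.
ratings = [
    [1, 2, 3, 4, 5],  # rater A
    [2, 3, 4, 5, 6],  # rater B (shifted by +1)
    [1, 2, 3, 4, 5],  # rater C
]

print("Kendall's W:", kendalls_w(ratings))
print("Exact agreement A vs B:", exact_agreement(ratings[0], ratings[1]))
```

In this constructed example the raters order the subjects identically, so W reaches its maximum even though raters A and B never give the same score. This mirrors, in a very simplified way, the shifted-data setting described in the abstract.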

2.

Beyond kappa: A review of interrater agreement measures

Mousumi Banerjee, Michelle Capozzoli, Laura McSweeney, Debajyoti Sinha 《Revue canadienne de statistique》1999,27(1):3-23


3.

Estimation of symmetric disagreement using a uniform association model for ordinal agreement data

Serpil Aktaş, Tülay Saraçbaşı 《AStA Advances in Statistical Analysis》2009,93(3):335-343

The Cohen kappa is probably the most widely used measure of agreement. Interest usually lies in measuring the degree of agreement or disagreement between two raters in square contingency tables. Modeling the agreement provides more information on the pattern of agreement than summarizing it by the kappa coefficient. Additionally, the disagreement models in the literature are proposed for nominal scales. In this paper, disagreement and uniform association models are combined into a new model for ordinal-scale agreement data: a symmetric disagreement plus uniform association model that aims to separate the association from the disagreement. The proposed model is applied to real uterine cancer data.

4.

Joint models for mixed categorical outcomes: a study of HIV risk perception and disease status in Mozambique

Osvaldo Loquiha, Niel Hens, Emilia Martins-Fonteyn, Herman Meulemans, Edwin Wouters, Marleen Temmerman 《Journal of applied statistics》2018,45(10):1781-1798

Two types of bivariate models for categorical response variables are introduced to deal with special categories such as ‘unsure’ or ‘unknown’ in combination with other ordinal categories, while taking additional hierarchical data structures into account. The latter is achieved by the use of different covariance structures for a trivariate random effect. The models are applied to data from the INSIDA survey, where interest lies in the effect of covariates on the association between HIV risk perception (quadrinomial with an ‘unknown risk’ category) and HIV infection status (binary). The final model combines continuation-ratio with cumulative link logits for the risk perception, together with partly correlated and partly shared trivariate random effects for the household level. The results indicate that only age has a significant effect on the association between HIV risk perception and infection status. The proposed models may be useful in various fields of application such as social and biomedical sciences, epidemiology and public health.

5.

Cohen’s quadratically weighted kappa is higher than linearly weighted kappa for tridiagonal agreement tables

《Statistical Methodology》2012,9(3):440-444


6.

Weighted kappa as a function of unweighted kappas

N. Moradzadeh, M. Ganjali 《Communications in Statistics - Simulation and Computation》2017,46(5):3769-3780

The kappa coefficient is a widely used measure for assessing agreement on a nominal scale. Weighted kappa is an extension of Cohen's kappa that is commonly used for measuring agreement on an ordinal scale. In this article, it is shown that weighted kappa can be computed as a function of unweighted kappas. The latter coefficients are kappa coefficients that correspond to smaller contingency tables that are obtained by merging categories.
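
One well-known special case of such a relation can be checked numerically: with linear weights, weighted kappa equals a weighted average of the unweighted kappas of the 2×2 tables obtained by dichotomizing the scale at each cut point, with the chance-expected disagreement of each 2×2 table as its weight. The Python sketch below verifies this on a hypothetical 3×3 table. The table and function names are assumptions made for illustration, and the construction is not necessarily the one used in the article.

```python
def margins(table):
    """Return the total count and the row/column proportions of a square table."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    return n, row, col


def kappa_and_chance_disagreement(table):
    """Unweighted Cohen's kappa and the chance-expected disagreement 1 - pe."""
    k = len(table)
    n, row, col = margins(table)
    po = sum(table[i][i] for i in range(k)) / n
    pe = sum(row[i] * col[i] for i in range(k))
    return (po - pe) / (1 - pe), 1 - pe


def linear_weighted_kappa(table):
    """Weighted kappa with disagreement weights proportional to |i - j|."""
    k = len(table)
    n, row, col = margins(table)
    observed = sum(abs(i - j) * table[i][j] / n for i in range(k) for j in range(k))
    expected = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1 - observed / expected


def collapse(table, cut):
    """Merge categories into a 2x2 table: categories below `cut` vs. the rest."""
    k = len(table)
    low, high = range(cut), range(cut, k)

    def block(rows, cols):
        return sum(table[i][j] for i in rows for j in cols)

    return [[block(low, low), block(low, high)],
            [block(high, low), block(high, high)]]


def kappa_from_dichotomies(table):
    """Average the 2x2 kappas, weighting each by its chance-expected disagreement."""
    numerator = denominator = 0.0
    for cut in range(1, len(table)):
        kappa_c, weight_c = kappa_and_chance_disagreement(collapse(table, cut))
        numerator += weight_c * kappa_c
        denominator += weight_c
    return numerator / denominator


# Hypothetical 3x3 agreement table (rows: rater 1, columns: rater 2).
table = [[20, 5, 1],
         [4, 15, 6],
         [2, 3, 14]]

print(linear_weighted_kappa(table), kappa_from_dichotomies(table))
```

The two printed values agree up to floating-point error, because the absolute distance |i - j| is the number of cut points that separate categories i and j.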

7.

The disagreeable behaviour of the kappa statistic

Laura Flight, Steven A. Julious 《Pharmaceutical statistics》2015,14(1):74-78

It is often of interest to measure the agreement between a number of raters when an outcome is nominal or ordinal. The kappa statistic is used as a measure of agreement. The statistic is highly sensitive to the distribution of the marginal totals and can produce unreliable results. Other statistics such as the proportion of concordance, maximum attainable kappa and prevalence and bias adjusted kappa should be considered to indicate how well the kappa statistic represents agreement in the data. Each kappa should be considered and interpreted based on the context of the data being analysed. Copyright © 2014 John Wiley & Sons, Ltd.
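
To make the sensitivity to marginal totals concrete, the Python sketch below computes the proportion of concordance, Cohen's kappa, the maximum attainable kappa given the observed margins, and the prevalence and bias adjusted kappa (PABAK, taken here as 2·po − 1 for a 2×2 table) for two raters. The table is hypothetical and the function name is an assumption made for illustration. The abstract discusses the general setting, whereas this sketch covers only the two-rater binary case.

```python
def agreement_summary(table):
    """Summary agreement statistics for a 2x2 table of two raters' counts."""
    assert len(table) == 2 and all(len(row) == 2 for row in table)
    n = sum(sum(row) for row in table)
    row = [sum(table[i]) / n for i in range(2)]
    col = [(table[0][j] + table[1][j]) / n for j in range(2)]

    po = (table[0][0] + table[1][1]) / n           # proportion of concordance
    pe = sum(row[i] * col[i] for i in range(2))    # chance-expected agreement
    kappa = (po - pe) / (1 - pe)

    # Largest agreement attainable with these margins held fixed.
    po_max = sum(min(row[i], col[i]) for i in range(2))
    kappa_max = (po_max - pe) / (1 - pe)

    pabak = 2 * po - 1  # prevalence and bias adjusted kappa for 2x2 tables
    return {"po": po, "kappa": kappa, "kappa_max": kappa_max, "pabak": pabak}


# Hypothetical table with a very unbalanced prevalence:
# both raters say "negative" for 90 of 100 subjects.
table = [[90, 5],
         [4, 1]]

summary = agreement_summary(table)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this constructed example the raters agree on 91% of subjects, yet kappa is low because nearly all ratings fall in one category. Reporting the other three quantities alongside kappa shows how much of that low value is driven by the margins.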

8.

A bootstrap method for comparing correlated kappa coefficients

《Journal of Statistical Computation and Simulation》2012,82(11):1009-1015

Cohen's kappa coefficient is traditionally used to quantify the degree of agreement between two raters on a nominal scale. Correlated kappas occur in many settings (e.g., repeated agreement by raters on the same individuals, concordance between diagnostic tests and a gold standard) and often need to be compared. While different techniques are now available to model correlated κ coefficients, they are generally not easy to implement in practice. The present paper describes a simple alternative method based on the bootstrap for comparing correlated kappa coefficients. The method is illustrated by examples and its type I error studied using simulations. The method is also compared with the generalized estimating equations of the second order and the weighted least-squares methods.
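
The general idea of a bootstrap comparison can be sketched as follows: resample subjects with replacement, recompute both kappas on each resample so that their correlation is preserved, and summarize the distribution of their difference. The Python sketch below uses a simple percentile interval on hypothetical data for two diagnostic tests rated against a gold standard. The data, function names, number of replicates and the percentile interval are all assumptions made for illustration, and this is not claimed to be the exact procedure of the paper.

```python
import random


def cohen_kappa(pairs):
    """Cohen's kappa for a list of (rating_1, rating_2) pairs.

    Returns None when chance agreement is 1 (kappa is undefined).
    """
    n = len(pairs)
    categories = {c for pair in pairs for c in pair}
    po = sum(a == b for a, b in pairs) / n
    pe = sum(
        (sum(a == c for a, _ in pairs) / n) * (sum(b == c for _, b in pairs) / n)
        for c in categories
    )
    if pe >= 1.0:
        return None
    return (po - pe) / (1 - pe)


def bootstrap_kappa_difference(data, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for kappa(test A, gold) - kappa(test B, gold).

    `data` holds one (gold, test_a, test_b) triple per subject; whole
    subjects are resampled so both kappas come from the same resample.
    """
    rng = random.Random(seed)
    n = len(data)
    diffs = []
    for _ in range(n_boot):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        kappa_a = cohen_kappa([(g, a) for g, a, _ in sample])
        kappa_b = cohen_kappa([(g, b) for g, _, b in sample])
        if kappa_a is None or kappa_b is None:
            continue  # skip degenerate resamples
        diffs.append(kappa_a - kappa_b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * len(diffs))]
    upper = diffs[min(len(diffs) - 1, int((1 - alpha / 2) * len(diffs)))]
    return lower, upper


# Hypothetical data: (gold standard, test A, test B) for 60 subjects.
data = (
    [(1, 1, 1)] * 20 + [(0, 0, 0)] * 20
    + [(1, 1, 0)] * 8 + [(0, 0, 1)] * 6
    + [(1, 0, 1)] * 3 + [(0, 1, 0)] * 3
)

lower, upper = bootstrap_kappa_difference(data)
print(f"95% percentile interval for the kappa difference: [{lower:.3f}, {upper:.3f}]")
```

An interval that excludes zero would suggest that the two tests differ in their agreement with the gold standard. The fixed seed only makes the sketch reproducible.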

9.

A Bayesian analysis for inter-rater agreement

《Communications in Statistics - Simulation and Computation》2013,42(3):437-446

An analysis of inter-rater agreement is presented. We study the problem with several raters using a Bayesian model based on the Dirichlet distribution. Inter-rater agreement, including global and partial agreement, is studied by determining the joint posterior distribution of the raters. Posterior distributions are computed with a direct resampling technique. Our method is illustrated with an example involving four residents, who are diagnosing 12 psychiatric patients suspected of having a thought disorder. Initially employing analytical and resampling methods, total agreement between the four is examined with a Bayesian testing technique. Later, partial agreement is examined by determining the posterior probability of certain orderings among the rater means. The power of resampling is revealed by its ability to compute complex multiple integrals that represent various posterior probabilities of agreement and disagreement between several raters.

10.

A generalization of the uniform association model for assessing rater agreement in ordinal scales

Alireza Akbarzadeh Bagheban, Farid Zayeri 《Journal of applied statistics》2010,37(8):1265-1273


11.

Cohen’s kappa is a weighted average (total citations: 1; self-citations: 0; citations by others: 1)

Matthijs J. Warrens 《Statistical Methodology》2011,8(6):473-484


12.

Category Distinguishability and Observer Agreement (total citations: 1; self-citations: 0; citations by others: 1)

J. N. Darroch, P. I. McCloud 《Australian & New Zealand Journal of Statistics》1986,28(3):371-388

It is common in the medical, biological, and social sciences for the categories into which an object is classified not to have a fully objective definition. Theoretically speaking, the categories are therefore not completely distinguishable. The practical extent of their distinguishability can be measured when two expert observers classify the same sample of objects. It is shown, under reasonable assumptions, that the matrix of joint classification probabilities is quasi-symmetric, and that the symmetric matrix component is non-negative definite. The degree of distinguishability between two categories is defined and is used to give a measure of overall category distinguishability. It is argued that the kappa measure of observer agreement is unsatisfactory as a measure of overall category distinguishability.

13.

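Analysis and application of spacing differences in multi-category ordinal variables in data mining (total citations: 1; self-citations: 0; citations by others: 1)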

Chen Minken, Zhu Jianping 《Statistics & Information Forum》2007,22(1):27-31

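Building on the cumulative logistic regression model, this article addresses the problem of unequal spacing between the categories of a multi-category ordinal variable. It proposes a statistical test for such spacing differences and introduces instrumental dummy variables to improve the logistic model. Applied to real data, the approach gave good results.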

14.

A simple method for estimating a regression model for κ between a pair of raters

Stuart R. Lipsitz, John Williamson, Neil Klar, Joseph Ibrahim & Michael Parzen 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2001,164(3):449-465

Agreement studies commonly occur in medical research, for example, in the review of X-rays by radiologists, blood tests by a panel of pathologists and the evaluation of psychopathology by a panel of raters. In these studies, often two observers rate the same subject for some characteristic with a discrete number of levels. The κ-coefficient is a popular measure of agreement between the two raters. The κ-coefficient may depend on covariates, i.e. characteristics of the raters and/or the subjects being rated. Our research was motivated by two agreement problems. The first is a study of agreement between a pastor and a co-ordinator of Christian education on whether they feel that the congregation puts enough emphasis on encouraging members to work for social justice (yes versus no). We wish to model the κ-coefficient as a function of covariates such as political orientation (liberal versus conservative) of the pastor and co-ordinator. The second example is a spousal education study, in which we wish to model the κ-coefficient as a function of covariates such as the highest degree of the father of the wife and the father of the husband. We propose a simple method to estimate the regression model for the κ-coefficient, which consists of two logistic (or multinomial logistic) regressions and one linear regression for binary data. The estimates can be easily obtained in any generalized linear model software program.

15.

Model-based clustering of multivariate ordinal data relying on a stochastic binary search algorithm

Christophe Biernacki, Julien Jacques 《Statistics and Computing》2016,26(5):929-943

We design a probability distribution for ordinal data by modeling the process generating data, which is assumed to rely only on order comparisons between categories. Contrariwise, most competitors often either forget the order information or add a non-existent distance information. The data generating process is assumed, from optimality arguments, to be a stochastic binary search algorithm in a sorted table. The resulting distribution is natively governed by two meaningful parameters (position and precision) and has very appealing properties: decrease around the mode, shape tuning from uniformity to a Dirac, identifiability. Moreover, it is easily estimated by an EM algorithm since the path in the stochastic binary search algorithm can be considered as missing values. Then, using the classical latent class assumption, the previous univariate ordinal model is straightforwardly extended to model-based clustering for multivariate ordinal data. Parameters of this mixture model are estimated by an AECM algorithm. Both simulated and real data sets illustrate the great potential of this model by its ability to parsimoniously identify particularly relevant clusters which were unsuspected by some traditional competitors.

16.

Ordinal ridge regression with categorical predictors

Faisal M. Zahid, Shahla Ramzan 《Journal of applied statistics》2012,39(1):161-171

In multi-category response models, categories are often ordered. In the case of ordinal response models, the usual likelihood approach becomes unstable with ill-conditioned predictor space or when the number of parameters to be estimated is large relative to the sample size. The likelihood estimates do not exist when the number of observations is less than the number of parameters. The same problem arises if the constraint on the order of intercept values is not met during the iterative procedure. Proportional odds models (POMs) are most commonly used for ordinal responses. In this paper, penalized likelihood with quadratic penalty is used to address these issues with a special focus on POMs. To avoid large differences between two parameter values corresponding to the consecutive categories of an ordinal predictor, the differences between the parameters of two adjacent categories should be penalized. The considered penalized-likelihood function penalizes the parameter estimates or differences between the parameter estimates according to the type of predictors. Mean-squared error for parameter estimates, deviance of fitted probabilities and prediction error for ridge regression are compared with usual likelihood estimates in a simulation study and an application.

17.

Calculating power for the comparison of dependent κ-coefficients

Hung-Mo Lin, John M. Williamson, Stuart R. Lipsitz 《Journal of the Royal Statistical Society. Series C, Applied statistics》2003,52(4):391-404

Summary. In the psychosocial and medical sciences, some studies are designed to assess the agreement between different raters and/or different instruments. Often the same sample will be used to compare the agreement between two or more assessment methods for simplicity and to take advantage of the positive correlation of the ratings. Although sample size calculations have become an important element in the design of research projects, such methods for agreement studies are scarce. We adapt the generalized estimating equations approach for modelling dependent κ-statistics to estimate the sample size that is required for dependent agreement studies. We calculate the power based on a Wald test for the equality of two dependent κ-statistics. The Wald test statistic has a non-central χ²-distribution with non-centrality parameter that can be estimated with minimal assumptions. The method proposed is useful for agreement studies with two raters and two instruments, and is easily extendable to multiple raters and multiple instruments. Furthermore, the method proposed allows for rater bias. Power calculations for binary ratings under various scenarios are presented. Analyses of two biomedical studies are used for illustration.

18.

Flexible uncertainty in mixture models for ordinal responses

Gerhard Tutz, Micha Schneider 《Journal of applied statistics》2019,46(9):1582-1601

In classical mixture models for ordinal data with an uncertainty component, the Uniform distribution is used to model indecision. In the approach proposed here, the discrete Uniform distribution is replaced by a more flexible distribution, which is centered in the middle of the response categories. The resulting model allows one to distinguish between a tendency to middle categories and a tendency to extreme categories. By linking these preferences to explanatory variables, one can investigate which persons show a tendency to these response styles. It is demonstrated that severe bias might occur if inadvertently the Uniform distribution is used to model uncertainty. An application to attitudes on the performance of health services illustrates the advantages of the more flexible model.

19.

Analysis of a longitudinal ordinal response clinical trial using dynamic models

P. J. Lindsey, J. Kaufmann 《Journal of the Royal Statistical Society. Series C, Applied statistics》2004,53(3):523-537

Summary. In many areas of pharmaceutical research, there has been increasing use of categorical data and more specifically ordinal responses. In many cases, complex models are required to account for different types of dependences among the responses. The clinical trial that is considered here involved patients who were required to remain in a particular state to enable the doctors to examine their heart. The aim of this trial was to study the relationship between the dose of the drug administered and the time that was spent by the patient in the state permitting examination. The patient's state was measured every second by a continuous Doppler signal which was categorized by the doctors into one of four ordered categories. Hence, the response consisted of repeated ordinal series. These series were of different lengths because the drug effect wore off faster (or slower) on certain patients depending on the drug dose administered and the infusion rate, and therefore the length of drug administration. A general method for generating new ordinal distributions is presented which is sufficiently flexible to handle unbalanced ordinal repeated measurements. It consists of obtaining a cumulative mixture distribution from a Laplace transform and introducing into it the integrated intensity of a binary logistic, continuation ratio or proportional odds model. Then, a multivariate distribution is constructed by a procedure that is similar to the updating process of the Kalman filter. Several types of history dependences are proposed.

20.

A distance-based rounding strategy for post-imputation ordinal data

Hakan Demirtas 《Journal of applied statistics》2010,37(3):489-500

Multiple imputation has emerged as a widely used model-based approach in dealing with incomplete data in many application areas. Gaussian and log-linear imputation models are fairly straightforward to implement for continuous and discrete data, respectively. However, in missing data settings which include a mix of continuous and discrete variables, correct specification of the imputation model could be a daunting task owing to the lack of flexible models for the joint distribution of variables of different nature. This complication, along with accessibility to software packages that are capable of carrying out multiple imputation under the assumption of joint multivariate normality, appears to encourage applied researchers to pragmatically treat the discrete variables as continuous for imputation purposes, and subsequently round the imputed values to the nearest observed category. In this article, I introduce a distance-based rounding approach for ordinal variables in the presence of continuous ones. The first step of the proposed rounding process is predicated upon creating indicator variables that correspond to the ordinal levels, followed by jointly imputing all variables under the assumption of multivariate normality. The imputed values are then converted to the ordinal scale based on their Euclidean distances to a set of indicators, with minimal distance corresponding to the closest match. I compare the performance of this technique to crude rounding via commonly accepted accuracy and precision measures with simulated data sets.
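
The final conversion step described above can be sketched in a few lines of Python: each ordinal level is represented by a vector of indicator variables, and an imputed (continuous) indicator vector is mapped to the level whose indicator vector is closest in Euclidean distance. The dummy coding used here (k − 1 indicators, with the first level as the all-zero reference), the function names and the example values are assumptions made for illustration. The abstract does not specify these details, and the imputation step itself is not shown.

```python
def indicator_vectors(levels):
    """Map each ordinal level to a (k-1)-dimensional dummy vector.

    Assumed coding: the first level is the all-zero reference, and level
    number j (j >= 1, 0-based) has a single 1 in position j - 1.
    """
    k = len(levels)
    vectors = {}
    for index, level in enumerate(levels):
        vector = [0.0] * (k - 1)
        if index > 0:
            vector[index - 1] = 1.0
        vectors[level] = tuple(vector)
    return vectors


def round_to_level(imputed, levels):
    """Return the level whose indicator vector is nearest to `imputed`."""
    vectors = indicator_vectors(levels)

    def squared_distance(level):
        return sum((x - y) ** 2 for x, y in zip(imputed, vectors[level]))

    # Squared distance gives the same ordering as Euclidean distance.
    return min(levels, key=squared_distance)


levels = [1, 2, 3]  # hypothetical three-point ordinal scale

# Hypothetical imputed values for the two indicator variables.
print(round_to_level((0.7, 0.2), levels))  # closest to level 2's vector (1, 0)
print(round_to_level((0.1, 0.2), levels))  # closest to level 1's vector (0, 0)
```

Ties in distance are resolved here in favour of the lower level, which is simply a consequence of how `min` scans the list, not a rule taken from the article.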