
1.

Comparing the methods of measuring multi-rater agreement on an ordinal rating scale: a simulation study with an application to real data

Y. Sertdemir H. R. Burgut Z. N. Alparslan I. Unal S. Gunasti 《Journal of applied statistics》2013,40(7):1506-1519

Agreement among raters is an important issue in medicine, as well as in education and psychology. The agreement between two raters on a nominal or ordinal rating scale has been investigated in many articles. The multi-rater case with normally distributed ratings has also been explored at length. However, there is a lack of research on multiple raters using an ordinal rating scale. In this simulation study, several methods for analyzing rater agreement were compared. The special case that was focused on was the multi-rater case using a bounded ordinal rating scale. The proposed methods for agreement were compared within different settings. Three main ordinal data simulation settings were used (normal, skewed and shifted data). In addition, the proposed methods were applied to a real data set from dermatology. The simulation results showed that Kendall's W and mean gamma highly overestimated the agreement in data sets with shifts in data. ICC₄ for bounded data should be avoided in agreement studies with rating scales < 5, where this method highly overestimated the simulated agreement. The difference in bias for all methods under study, except the mean gamma and Kendall's W, decreased as the rating scale increased. The bias of ICC₃ was consistent and small for nearly all simulation settings except the low agreement setting in the shifted data set. Researchers should be careful in selecting agreement methods, especially if shifts in ratings between raters exist, and may wish to apply more than one method before drawing any conclusions.
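One plausible reason a rank-based measure such as Kendall's W overstates agreement under shifts is that it only looks at how each rater orders the subjects. The sketch below is not taken from the paper: the function names are hypothetical, the formula is the textbook W without tie correction, and the toy ratings are invented for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Kendall's W for ratings[r][s] = score of rater r on subject s.

    Textbook formula without tie correction (an assumption of this sketch).
    """
    m = len(ratings)       # number of raters
    n = len(ratings[0])    # number of subjects
    rank_rows = [average_ranks(row) for row in ratings]
    totals = [sum(row[s] for row in rank_rows) for s in range(n)]
    mean_total = m * (n + 1) / 2
    s_stat = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s_stat / (m ** 2 * (n ** 3 - n))


# Invented example: three raters who never give the same score to a subject,
# because each rater is shifted by one scale point relative to the previous one.
shifted = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
]
print(kendalls_w(shifted))
```

In this toy case the raters order the subjects identically, so W reports perfect concordance even though no two raters ever assign the same score.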

2.

Calculating power for the comparison of dependent κ-coefficients

Hung-Mo Lin John M. Williamson Stuart R. Lipsitz 《Journal of the Royal Statistical Society. Series C, Applied statistics》2003,52(4):391-404

Summary. In the psychosocial and medical sciences, some studies are designed to assess the agreement between different raters and/or different instruments. Often the same sample will be used to compare the agreement between two or more assessment methods for simplicity and to take advantage of the positive correlation of the ratings. Although sample size calculations have become an important element in the design of research projects, such methods for agreement studies are scarce. We adapt the generalized estimating equations approach for modelling dependent κ-statistics to estimate the sample size that is required for dependent agreement studies. We calculate the power based on a Wald test for the equality of two dependent κ-statistics. The Wald test statistic has a non-central χ²-distribution with non-centrality parameter that can be estimated with minimal assumptions. The method proposed is useful for agreement studies with two raters and two instruments, and is easily extendable to multiple raters and multiple instruments. Furthermore, the method proposed allows for rater bias. Power calculations for binary ratings under various scenarios are presented. Analyses of two biomedical studies are used for illustration.

3.

A bootstrap method for comparing correlated kappa coefficients

《Journal of Statistical Computation and Simulation》2012,82(11):1009-1015

Cohen's kappa coefficient is traditionally used to quantify the degree of agreement between two raters on a nominal scale. Correlated kappas occur in many settings (e.g., repeated agreement by raters on the same individuals, concordance between diagnostic tests and a gold standard) and often need to be compared. While different techniques are now available to model correlated κ coefficients, they are generally not easy to implement in practice. The present paper describes a simple alternative method based on the bootstrap for comparing correlated kappa coefficients. The method is illustrated by examples and its type I error studied using simulations. The method is also compared with the generalized estimating equations of the second order and the weighted least-squares methods.
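The general idea of a bootstrap comparison of two correlated kappas can be sketched as follows. This is a generic percentile bootstrap over subjects, not necessarily the procedure of the paper; the function names, the choice of a 95% percentile interval, and the toy data are all assumptions made for illustration.

```python
import random


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long lists of nominal ratings.

    Returns None when chance agreement equals 1 and kappa is undefined.
    """
    n = len(a)
    categories = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return None
    return (observed - expected) / (1 - expected)


def bootstrap_kappa_difference(test1, test2, gold, n_boot=2000, seed=0):
    """Percentile 95% interval for kappa(test1, gold) - kappa(test2, gold).

    Subjects are resampled with replacement, keeping each subject's three
    ratings together so the correlation between the kappas is preserved.
    """
    rng = random.Random(seed)
    n = len(gold)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        g = [gold[i] for i in idx]
        k1 = cohen_kappa([test1[i] for i in idx], g)
        k2 = cohen_kappa([test2[i] for i in idx], g)
        if k1 is not None and k2 is not None:
            diffs.append(k1 - k2)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[int(0.975 * len(diffs)) - 1]
    return lower, upper


# Invented toy data: two diagnostic tests rated against the same gold standard.
gold  = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
test1 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
test2 = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0]
print(bootstrap_kappa_difference(test1, test2, gold))
```

If the resulting interval excludes zero, the two kappas would be judged different at roughly the 5% level under this simple scheme.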

4.

A comparison of analysis procedures for correlated binary data in dedicated multi‐rater imaging trials


Michael Kunz 《Pharmaceutical statistics》2015,14(1):34-43

In this paper, three analysis procedures for repeated correlated binary data with no a priori ordering of the measurements are described and subsequently investigated. Examples for correlated binary data could be the binary assessments of subjects obtained by several raters in the framework of a clinical trial. This topic is especially of relevance when success criteria have to be defined for dedicated imaging trials involving several raters conducted for regulatory purposes. First, an analytical result on the expectation of the ‘Majority rater’ is presented when only the marginal distributions of the single raters are given. The paper provides a simulation study where all three analysis procedures are compared for a particular setting. It turns out that in many cases, ‘Average rater’ is associated with a gain in power. Settings were identified where ‘Majority significant’ has favorable properties. ‘Majority rater’ is in many cases difficult to interpret. Copyright © 2014 John Wiley & Sons, Ltd.

5.

Modeling participation duration, with application to the North American Breeding Bird Survey

William A. Link John R. Sauer 《Communications in Statistics - Theory and Methods》2013,42(21):6311-6320


6.

Run statistics in a sequence of arbitrarily dependent binary trials

Sevcan Demir Serkan Eryılmaz 《Statistical Papers》2010,51(4):959-973

Let {Z_i}_{i≥1} be an arbitrary sequence of trials with two possible outcomes, either success (1) or failure (0). General expressions for the exact distributions of runs, both success and failure, in Z₁, …, Z_n are presented. Our method is based on the use of the joint distribution of success and failure run lengths and unifies the results on the distribution of runs. As a special case of our results we obtain the distributions of runs for various binary sequences. As illustrated in the paper, the results enable us to derive the distribution of runs for binary trials arising in urn models.
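To make the objects in this abstract concrete, the sketch below only tabulates success and failure run lengths for one observed binary sequence; it does not implement the exact distribution theory of the paper. The function name and the sample sequence are hypothetical.

```python
from itertools import groupby


def run_lengths(trials):
    """Return (success_runs, failure_runs) as lists of run lengths.

    A run is a maximal block of consecutive equal outcomes; 1 is a success
    and 0 is a failure.
    """
    success, failure = [], []
    for value, group in groupby(trials):
        length = sum(1 for _ in group)
        (success if value == 1 else failure).append(length)
    return success, failure


# Invented sequence Z_1, ..., Z_8.
z = [1, 1, 0, 1, 0, 0, 0, 1]
success_runs, failure_runs = run_lengths(z)
print(len(success_runs), max(success_runs))  # number and longest success run
print(len(failure_runs), max(failure_runs))  # number and longest failure run
```

Statistics such as the number of runs or the longest run are simple functions of these two lists, which is why a joint treatment of success and failure run lengths covers many run statistics at once.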

7.

k-Sample test based on the common area of kernel density estimators

P. Martínez-Camblor J. De Uña-Álvarez N. Corral 《Journal of statistical planning and inference》2008


8.

ON THE EQUIVALENCE OF SOME INDICES OF SIMILARITY: IMPLICATION FOR BINARY PRESENCE/ABSENCE DATA

Magdalena Niewiadomska‐Bugaj 《Australian & New Zealand Journal of Statistics》2012,54(2):189-198

Cohen’s kappa, a special case of the weighted kappa, is a chance‐corrected index used extensively to quantify inter‐rater agreement in validation and reliability studies. In this paper, it is shown that in inter‐rater agreement for 2 × 2 tables, for two raters having the same number of opposite ratings, the weighted kappa, Cohen’s kappa, Peirce, Yule, Maxwell and Pilliner and Fleiss indices are identical. This implies that the weights in the weighted kappa are less important under such assumptions. Equivalently, it is shown that for two partitions of the same data set, resulting from two clustering algorithms having the same number of clusters with equal cluster sizes, these similarity indices are identical. Hence, an important characterisation is formulated relating equal numbers of clusters with the same cluster sizes to the presence/absence of a trait in a reliability study. Two numerical examples that exemplify the implication of this relationship are presented.

9.

A Tandem Queue with Server Slow-Down and Blocking

《Stochastic Models》2013,29(2-3):695-724

Abstract

We consider two variants of a two-station tandem network with blocking. In both variants the first server ceases to work when the queue length at the second station hits a ‘blocking threshold.’ In addition, in variant 2 the first server decreases its service rate when the second queue exceeds a ‘slow-down threshold,’ which is smaller than the blocking level. In both variants the arrival process is Poisson and the service times at both stations are exponentially distributed. Note, however, that in case of slow-downs, server 1 works at a high rate, a slow rate, or not at all, depending on whether the second queue is below or above the slow-down threshold or at the blocking threshold, respectively. For variant 1, i.e., only blocking, we concentrate on the geometric decay rate of the number of jobs in the first buffer and prove that for increasing blocking thresholds the sequence of decay rates decreases monotonically and at least geometrically fast to max{ρ₁, ρ₂}, where ρ_i is the load at server i. The methods used in the proof also allow us to clarify the asymptotic queue length distribution at the second station. Then we generalize the analysis to variant 2, i.e., slow-down and blocking, and establish analogous results.

10.

Cohen’s quadratically weighted kappa is higher than linearly weighted kappa for tridiagonal agreement tables

《Statistical Methodology》2012,9(3):440-444


11.

A study of generalized skew-normal distribution

Wen-Jang Huang Arjun K. Gupta 《Statistics》2013,47(5):942-953

Following the paper by Genton and Loperfido [Generalized skew-elliptical distributions and their quadratic forms, Ann. Inst. Statist. Math. 57 (2005), pp. 389–401], we say that Z has a generalized skew-normal distribution if its probability density function (p.d.f.) is given by f(z) = 2φ_p(z; ξ, Ω)π(z−ξ), z∈ℝ^p, where φ_p(·; ξ, Ω) is the p-dimensional normal p.d.f. with location vector ξ and scale matrix Ω, ξ∈ℝ^p, Ω>0, and π is a skewing function from ℝ^p to ℝ, that is, 0≤π(z)≤1 and π(−z) = 1−π(z), ∀ z∈ℝ^p. First, the distributions of linear transformations of Z are studied, and some moments of Z and its quadratic forms are derived. Next, we obtain the joint moment-generating functions (m.g.f.’s) of linear and quadratic forms of Z and then investigate conditions for their independence. Finally, explicit forms for the above distributions, m.g.f.’s and moments are derived when π(z) = κ(α′z), where α∈ℝ^p and κ is the normal, Laplace, logistic or uniform distribution function.

12.

A simple method for estimating a regression model for κ between a pair of raters

Stuart R. Lipsitz John Williamson Neil Klar Joseph Ibrahim & Michael Parzen 《Journal of the Royal Statistical Society. Series A, (Statistics in Society)》2001,164(3):449-465

Agreement studies commonly occur in medical research, for example, in the review of X-rays by radiologists, blood tests by a panel of pathologists and the evaluation of psychopathology by a panel of raters. In these studies, often two observers rate the same subject for some characteristic with a discrete number of levels. The κ-coefficient is a popular measure of agreement between the two raters. The κ-coefficient may depend on covariates, i.e. characteristics of the raters and/or the subjects being rated. Our research was motivated by two agreement problems. The first is a study of agreement between a pastor and a co-ordinator of Christian education on whether they feel that the congregation puts enough emphasis on encouraging members to work for social justice (yes versus no). We wish to model the κ-coefficient as a function of covariates such as political orientation (liberal versus conservative) of the pastor and co-ordinator. The second example is a spousal education study, in which we wish to model the κ-coefficient as a function of covariates such as the highest degree of the father of the wife and the father of the husband. We propose a simple method to estimate the regression model for the κ-coefficient, which consists of two logistic (or multinomial logistic) regressions and one linear regression for binary data. The estimates can be easily obtained in any generalized linear model software program.

13.

An asymptotic distribution of Wright’s process capability index sensitive to skewness

《Journal of Statistical Computation and Simulation》2012,82(1-2):147-158

In a recent paper (J. Statist. Comput. Simul., 1995, Vol. 53, pp. 195–203) P. A. Wright proposed a new process capability index C_s which generalizes the Pearn-Kotz-Johnson index C_pmk by taking into account the skewness (in addition to the deviation of the mean from the target already incorporated in C_pmk). The purpose of this article is to study the consistency and asymptotics of an estimate Ĉ_s of C_s. The asymptotic distribution provides an insight into some desirable properties of the estimate which are not apparent from its original definition.

14.

A model-based concordance-type index for evaluating the added predictive ability of novel risk factors and markers in the logistic regression models

M. Shafiqur Rahman Afrin Sadia Rumana 《Journal of applied statistics》2019,46(12):2145-2163

ABSTRACT

The concordance statistic (C-statistic) is commonly used to assess the predictive performance (discriminatory ability) of a logistic regression model. Although there are several approaches to the C-statistic, their performance in quantifying the improvement in predictive accuracy due to the inclusion of novel risk factors or biomarkers in the model has been heavily criticized in the literature. This paper proposed a model-based concordance-type index, C_K, for use with the logistic regression model. C_K and its asymptotic sampling distribution are derived following Gonen and Heller's approach for the Cox PH model for survival data, with the modifications necessary for use with binary data. Unlike the existing C-statistics for the logistic model, it quantifies the concordance probability by taking the difference in the predicted risks between two subjects in a pair rather than ranking them, and hence is able to quantify the equivalent incremental value from the new risk factor or marker. The simulation study revealed that C_K performed well when the model parameters are correctly estimated for large samples, and showed greater improvement in quantifying the additional predictive value from the new risk factor or marker than the existing C-statistics. Furthermore, the illustration using three datasets supports the findings from the simulation study.

15.

A mixed double sampling plan based on Cpk

Saminathan Balamurali Liaquat Ahmad Chi-Hyuck Jun 《Communications in Statistics - Theory and Methods》2020,49(8):1840-1857

Abstract

Acceptance sampling plans are quality tools for the manufacturer and the customer. Reducing the number of nonconforming items will ultimately increase the profit of the manufacturer and enhance the satisfaction of the consumer. In this article, a mixed double sampling plan is proposed in which the attribute double sampling inspection is used in the first stage and a variables sampling plan based on the process capability index C_pk is used in the second stage. The optimal parameters are determined so that the producer’s and the consumer’s risks are satisfied with the minimum average sample number. The optimal parameters of the proposed plan are estimated under different plan settings using the two-points-on-the-operating-characteristic-curve approach. In designing the proposed mixed double sampling plan, we consider the symmetric and asymmetric nonconforming cases under variables inspection. The efficiency of the proposed plan is discussed and compared with the existing sampling plans. Tables are constructed for easy selection of the optimal plan parameters, and an industrial example is also included for implementation of the proposed plan.

16.

Sample size calculation for count outcomes in cluster randomization trials with varying cluster sizes

《Communications in Statistics - Theory and Methods》2012,41(1):116-124

Abstract

In many cluster randomization studies, cluster sizes are not fixed and may be highly variable. For those studies, sample size estimation assuming a constant cluster size may lead to under-powered studies. Sample size formulas have been developed to incorporate the variability in cluster size for clinical trials with continuous and binary outcomes. Count outcomes frequently occur in cluster randomized studies. In this paper, we derive a closed-form sample size formula for count outcomes accounting for the variability in cluster size. We compare the performance of the proposed method with the average cluster size method through simulation. The simulation study shows that the proposed method has a better performance with empirical powers and type I errors closer to the nominal levels.

17.

On population‐based measures of agreement for binary classifications

Kerrie P. Nelson Don Edwards 《Revue canadienne de statistique》2008,36(3):411-426

The authors describe a model‐based kappa statistic for binary classifications which is interpretable in the same manner as Scott's pi and Cohen's kappa, yet does not suffer from the same flaws. They compare this statistic with the data‐driven and population‐based forms of Scott's pi in a population‐based setting where many raters and subjects are involved, and inference regarding the underlying diagnostic procedure is of interest. The authors show that Cohen's kappa and Scott's pi seriously underestimate agreement between experts classifying subjects for a rare disease; in contrast, the new statistic is robust to changes in prevalence. The performance of the three statistics is illustrated with simulations and prostate cancer data.

18.

Non binary partially neighbor balanced designs for circular blocks

Naqvi Hamad Muhammad Hanif 《Communications in Statistics - Theory and Methods》2013,42(20):5961-5965

ABSTRACT

Neighbor designs are recommended for cases where the performance of a treatment is affected by the neighboring treatments, as in biometrics and agriculture. In this paper we have constructed two new series of non binary partially neighbor balanced designs for v = 2n and v = 2n+1 treatments, respectively. The blocks in the design are non binary and circular, but no treatment is ever a neighbor to itself. The designs proposed here are partially balanced in terms of nearest neighbors. No such series are known in the literature.

19.

A family of multi-rater kappas that can always be increased and decreased by combining categories

《Statistical Methodology》2012,9(3):330-340


20.

Two graphical displays for the detection of potentially influential subsets in regression

Ali S. Hadi 《Journal of applied statistics》1990,17(3):313-327

In the context of the general linear model Y = Xβ + ε, the matrix P_Z = Z(Z^T Z)^{-1} Z^T, where Z = (X : Y), plays an important role in determining least squares results. In this article we propose two graphical displays for the off-diagonal as well as the diagonal elements of P_Z. The two graphs are based on simple ideas and are useful in the detection of potentially influential subsets of observations in regression. Since P_Z is invariant with respect to permutations of the columns of Z, an added advantage of these graphs is that they can be used to detect outliers in multivariate data where the rows of Z are usually regarded as a random sample from a multivariate population. We also suggest two calibration points, one for the diagonal elements of P_Z and the other for the off-diagonal elements. The advantage of these calibration points is that they take into consideration the variability of the off-diagonal as well as the diagonal elements of P_Z. They also do not suffer from masking.