Similar articles
 20 similar articles found (search time: 125 ms)
1.
The L statistic is extended to allow incomplete or partial rankings within groups. As a special case of these extensions, tests for ordered alternatives analogous to the Page test are provided for other than complete rankings. A two-group illustration is provided by an alternative analysis of the Bradley and Terry taste-testing experiment on pork roasts.
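
For orientation, the classical Page L statistic that this abstract extends can be sketched as follows. This is a minimal illustration for complete, untied within-block rankings only; it does not implement the incomplete-ranking extension described above. The function names and the input layout (one list of ranks per block, with columns in the hypothesized treatment order) are assumptions made for the example.

```python
def page_l_statistic(rank_rows):
    """Page's L for complete within-block rankings (illustrative sketch).

    rank_rows: one list per block; entry j is the rank (1..t) given to the
    treatment that is j-th in the hypothesized ordering (assumed layout).
    """
    if not rank_rows:
        raise ValueError("need at least one block")
    t = len(rank_rows[0])
    expected = list(range(1, t + 1))
    for row in rank_rows:
        if sorted(row) != expected:
            raise ValueError("each block must be a complete, untied ranking of 1..t")
    # R_j: sum over blocks of the ranks given to treatment j.
    column_sums = [sum(row[j] for row in rank_rows) for j in range(t)]
    # L = sum_j j * R_j, with j running over the hypothesized order 1..t.
    return sum((j + 1) * r for j, r in enumerate(column_sums))


def page_l_null_mean(b, t):
    """Mean of L when every block's ranking is a uniformly random permutation."""
    return b * t * (t + 1) ** 2 / 4
```

Large values of L relative to its null mean point toward the hypothesized ordering.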

2.
The Bayes classification rule offers the optimal classifier, minimizing the classification error rate, whereas the Neyman–Pearson lemma offers the optimal family of classifiers to maximize the detection rate for any given false alarm rate. These motivate studies on comparing classifiers based on similarities between the classifiers and the optimal. In this article, we define partial order relations on classifiers and families of classifiers, based on rankings of rate function values and rankings of test function values, respectively. Each partial order relation provides a sufficient condition, which yields better classification error rates or better performance on the receiver operating characteristic analysis. Various examples and applications of the partial order theorems are discussed to provide comparisons of classifiers and families of classifiers, including the comparison of cross-validation methods, training data that contains outliers, and labelling errors in training data. The Canadian Journal of Statistics 48: 152–166; 2020 © 2019 Statistical Society of Canada

3.
In several research areas such as psychology, social science, and medicine, studies are conducted in which objects are ranked by different judges/raters and the concordance of the different rankings is then analyzed. In such studies, it is also frequently of interest to compare the rankings between different groups of judges, e.g. female vs. male judges or judges from different professions. In the two-group case, the two-group concordance test of Schucany & Frawley can be employed for such a comparison. In this article, we propose an extension of this test enabling the comparison of rankings from more than two groups of judges. This test aims to detect disagreement in the average rankings of the objects between k groups with at least moderate intra-group concordance. We evaluate this test in an extensive simulation study and in an application to data from an aesthetics study. The simulation study shows that the proposed test is able to detect differences between average rankings and performs well even in situations in which the disagreement is comparatively small or the intra-group concordance is inhomogeneous.

4.
We develop an omnibus two-sample test for ranked-set sampling (RSS) data. The test statistic is the conditional probability of seeing the observed sequence of ranks in the combined sample, given the observed sequences within the separate samples. We compare the test to existing tests under perfect rankings, finding that it can outperform existing tests in terms of power, particularly when the set size is large. The test does not maintain its level under imperfect rankings. However, one can create a permutation version of the test that is comparable in power to the basic test under perfect rankings and also maintains its level under imperfect rankings. Both tests extend naturally to judgment post-stratification, unbalanced RSS, and even RSS with multiple set sizes. Interestingly, the tests have no simple random sampling analog.

5.
Kendall's tau is a coefficient of concordance between two rankings of n objects. Its definition and large-sample normal approximation are easily extended to the case where one of the rankings contains ties. In this paper, the definition and normal approximation are extended further to the case where both rankings contain ties. The results are applied to give a fully distribution-free test for two-way contingency tables with ordered categories.
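
To make the idea of a tie-adjusted tau concrete, here is a small sketch using the widely used tau-b convention. That convention is an assumption for the example and may differ from the definition adopted in the paper. The function name `kendall_tau_b` is likewise illustrative.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ranks or scores.

    Pairs tied in x or in y count as neither concordant nor discordant;
    the denominator is shrunk to account for them (tau-b convention).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: excluded from both factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_in_x = concordant + discordant + tied_y_only
    untied_in_y = concordant + discordant + tied_x_only
    if untied_in_x == 0 or untied_in_y == 0:
        raise ValueError("tau-b is undefined when a ranking is entirely tied")
    return (concordant - discordant) / sqrt(untied_in_x * untied_in_y)
```

With no ties the value reduces to the ordinary Kendall's tau, so identical rankings give 1 and fully reversed rankings give -1.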

6.
Although experimentation is a crucial stage in the research and development of industrial products, no satisfactory procedure is available to deal with the common and important industrial problem of defining a preference ranking among all the studied product prototypes on the basis of their performance. In this paper we propose a two-stage non-parametric procedure in which we first perform a set of C-sample testing procedures, followed by multiple comparisons, in this way evaluating a set of partial preference rankings, and then synthesise the partial rankings by combining them into a global ranking that provides a general product preference rule. The proposed method is particularly useful in the context of industrial experimentation and offers several advantages such as effectiveness, high flexibility and practical adherence to real problems where preference ranking is a natural goal.

7.
The authors consider the situation of incomplete rankings in which n judges independently rank k_i ∈ {2, …, t} objects. They wish to test the null hypothesis that each judge picks the ranking at random from the space of k_i! permutations of the integers 1, …, k_i. The statistic considered is a generalization of the Friedman test in which the ranks assigned by each judge are replaced by real-valued functions a(j, k_i), 1 ≤ j ≤ k_i ≤ t, of the ranks. The authors define a measure of pairwise similarity between complete rankings based on such functions, and use averages of such similarities to construct measures of the level of concordance of the judges' rankings. In the complete ranking case, the resulting statistics coincide with those defined by Hájek & Šidák (1967, p. 118), and Sen (1968). These measures of similarity are extended to the situation of incomplete rankings. A statistic is derived in this more general situation and its properties are investigated.

8.
Both the method of ranking after alignment and the Tukey-Quade method of weighted rankings for the analysis of complete blocks are generalized so as to give rise to classes of tests containing a conditionally distribution-free test and strictly distribution-free tests that are asymptotically optimal in the sense that, when the number of blocks tends to infinity, their asymptotic local power attains that of the asymptotically minimax test based on block-location-free statistics.

9.
From an analysis of the track records of U.S. economic forecasters, Stekler (1987) concluded that “all forecasters are not equal” (p. 158). This article shows that his result is based on an incorrectly defined test statistic. When a more appropriate test is conducted, the figures suggest that accuracy rankings are not significantly different from those that might be expected as a result of sampling error in a population of equally accurate forecasters.

10.
Testing for ordered alternatives in randomized block designs has been a problem of interest for almost three decades (Jonckheere (1954)). Three classes of rank tests have evolved: tests based on “within-blocks” rankings (W-tests), tests based on “ranking after alignment” within blocks (RAA-tests), and tests based on “among-blocks” rankings (A-tests). This paper focuses on the latter. A simplified version of the Skillings-Wolfe generalized Purl test (1977) is suggested, and two very useful A-tests, a generalized Johnson-Mehrotra “optimal contrast” procedure and a generalized Tryon-Hettmansperger rank test, are developed. These procedures are compared and contrasted with other recent competitors presented by Skillings and Wolfe (1978) and by Salama and Quade (1981).

11.
Ranked-set sampling (RSS) and judgment post-stratification (JPS) use ranking information to obtain more efficient inference than is possible using simple random sampling. Both methods were developed with subjective, judgment-based rankings in mind, but the idea of ranking using a covariate has received a lot of attention. We provide evidence here that when rankings are done using a covariate, the standard RSS and JPS mean estimators no longer make efficient use of the available information. We first show that when rankings are done using a covariate, the standard nonparametric mean estimators in JPS and unbalanced RSS are inadmissible under squared error loss. We then show that when rankings are done using a covariate, nonparametric regression techniques yield mean estimators that tend to be significantly more efficient than the standard RSS and JPS mean estimators. We conclude that the standard estimators are best reserved for settings where only subjective, judgment-based rankings are available.

12.
A method is proposed for calculating the small sample powers of rank tests which are based on the method of n rankings. A class of normal shift alternative hypotheses is considered, and Hodges–Lehmann efficiencies are calculated for the Friedman test.
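
As background for the Friedman test mentioned above, the usual form of its statistic for complete, untied rankings can be sketched as below. This shows only the textbook statistic, not the small-sample power calculation proposed in the paper. The function name and the input layout (one list of ranks 1..t per block) are assumptions for the example.

```python
def friedman_statistic(rank_rows):
    """Friedman statistic for b complete, untied rankings of t treatments.

    rank_rows: one list per block (judge), each a permutation of 1..t.
    Under the null hypothesis of random rankings the statistic is
    approximately chi-square with t - 1 degrees of freedom for large b.
    """
    if not rank_rows:
        raise ValueError("need at least one block")
    b = len(rank_rows)
    t = len(rank_rows[0])
    expected = list(range(1, t + 1))
    for row in rank_rows:
        if sorted(row) != expected:
            raise ValueError("each block must be a complete, untied ranking of 1..t")
    # R_j: sum over blocks of the ranks given to treatment j.
    column_sums = [sum(row[j] for row in rank_rows) for j in range(t)]
    return 12.0 / (b * t * (t + 1)) * sum(r * r for r in column_sums) - 3.0 * b * (t + 1)
```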

13.
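Using the key indicators of financial institutions' risk tolerance (capital, profitability, financial flexibility, and bank franchise value), together with the embedded leverage implicit in off-balance-sheet assets and financial derivatives that Basel III focuses on, as risk-adjusted inputs and outputs, this study constructs an efficiency evaluation system for banks belonging to Taiwan's financial holding groups. The results show the following. First, bank efficiency rankings exhibit a tendency for the strong to stay strong: once a stable and consistent risk tolerance has been set, a bank not only develops its own risk culture and business character but can also maintain stable performance, which significantly raises its market brand value. Second, risk-adjusted efficiency is fairly highly correlated with regulatory indicators, indicating that operations that take risk into account more easily satisfy regulators' control requirements. Third, the risk-management advantage of low liquidity risk (a low loan-to-deposit ratio) brings higher risk-adjusted efficiency; loans, however, are generally illiquid, so banks should strengthen the liquidity management of high-quality eligible deposits and assets and avoid relying excessively on loans (a high loan-to-deposit ratio), which lowers their operating efficiency. Finally, interest rate liberalization in Taiwan has gradually narrowed interest spreads, and faster loan growth has failed to raise risk-adjusted efficiency; income from intermediary businesses such as asset management and financial management will be the key focus of the structural transformation of Taiwan's banks.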

14.
Overall journal rankings, which are generated with sample articles in different research fields, are commonly used to measure the research productivity of academic economists. In this article, we investigate a growing concern in the profession that the use of the overall journal rankings to evaluate scholars’ relative research productivity may exhibit a downward bias toward researchers in some specialty fields if their respective field journals are under-ranked in the overall journal rankings. To address this concern, we constructed new journal rankings based on the intellectual influence of research in 8 specialty fields using a sample consisting of 26,401 articles published across 60 economics journals from 1998 to 2007. We made various comparisons between the newly constructed journal rankings in specialty fields and the traditional overall journal ranking. Our results show that the overall journal ranking provides a considerably good mapping for the article quality in specialty fields. Supplementary materials for this article are available online.

15.
A new rank correlation index, which can be used to measure the extent of concordance or discordance between two rankings, is proposed. This index is based on Gini’s mean difference computed on the total ranks corresponding to each unit and it turns out to be a special case of a more general measure of the agreement of m rankings. The proposed index can be used in a test for the independence of two criteria used to rank the units of a sample, against their concordance/discordance. It can then be regarded as a competitor of other classical methods, such as Kendall’s tau. The exact distribution of the proposed test statistic under the null hypothesis of independence is studied and its expectation and variance are determined; moreover, the asymptotic distribution of the test statistic is derived. Finally, the implementation of the proposed test and its performance are discussed. Both authors contributed equally to this work; however, the actual writing of the paper was as follows: Sects. 2 and 3 are due to C. G. Borroni, Sects. 1 and 4 are due to M. Zenga.

16.
A rank test based on the number of ‘near-matches’ among within-block rankings is proposed for stochastically ordered alternatives in a randomized block design with t treatments and b blocks. The asymptotic relative efficiency of this test with respect to the Page test is computed as the number of blocks increases to infinity. A sequential analog of the above test procedure is also considered. A repeated significance test procedure is developed and the average sample number is computed asymptotically under the null hypothesis as well as under a sequence of contiguous alternatives.

17.
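The evaluation indicators of the major Chinese and foreign university rankings are classified and summarized according to the evaluation content they reflect. The weights assigned to each class of indicators are used to compare the different emphases of the indicator systems, and the causes of the resulting differences between Chinese and foreign university evaluation indicators are analyzed, providing a reference, from a combined qualitative and quantitative perspective, for designing the indicator system of a comprehensive ranking of Chinese universities. In addition, based on a comparative analysis of the data sources of the indicators used in Chinese and foreign university rankings, the paper points out that the main factor affecting the soundness of indicator design is the limited availability of data, and offers corresponding suggestions.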

18.
The use of weighted rankings to analyze complete block designs (Quade, 1979) is a practical way of recovering between-block information. A family of old and new test statistics can be generated by this procedure. Selection among these statistics and comparison with parametric and nonparametric competitors are based on expected significance level (ESL) in small designs (3 to 6 blocks, 3 to 5 treatments).

19.
The presence of a maverick judge, one whose rankings differ greatly from the other members of a panel, can result in incorrect rankings and a sense of unfairness among contestants. We develop and explore the properties of a likelihood ratio test, assuming a Mallows type distribution, for the presence of a maverick judge when each judge selects his or her best k out of n objects, k ≤ n. Detection of a maverick judge, who may be viewed as a multivariate outlier, turns out to be very difficult unless the judges are very consistent and there are repeat observations on the panel.

20.
A general rank test procedure based on an underlying multinomial distribution is suggested for randomized block experiments with multifactor treatment combinations within each block. The Wald statistic for the multinomial is used to test hypotheses about the within-block rankings. This statistic is shown to be related to the one-sample Hotelling's T² statistic, suggesting a method for computing the test statistic using standard statistical computer packages.
