Similar Literature
20 similar articles retrieved (search time: 296 ms)
1.
A new rank correlation index, which can be used to measure the extent of concordance or discordance between two rankings, is proposed. This index is based on Gini's mean difference computed on the total ranks corresponding to each unit, and it turns out to be a special case of a more general measure of the agreement of m rankings. The proposed index can be used in a test for the independence of two criteria used to rank the units of a sample, against their concordance/discordance. It can thus be regarded as a competitor of other classical methods, such as Kendall's tau. The exact distribution of the proposed test statistic under the null hypothesis of independence is studied and its expectation and variance are determined; moreover, the asymptotic distribution of the test statistic is derived. Finally, the implementation of the proposed test and its performance are discussed. Both authors contributed equally to this work; however, the actual writing of the paper was as follows: Sects. 2 and 3 are due to C. G. Borroni, Sects. 1 and 4 are due to M. Zenga.
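As a rough, hypothetical illustration of the ingredients involved (not the authors' exact index), the sketch below computes Gini's mean difference of the total ranks of each unit, alongside Kendall's tau as the classical competitor; the example rankings are invented:

```python
from itertools import combinations
from scipy.stats import kendalltau

def gini_mean_difference(values):
    """Mean absolute difference over all unordered pairs of values."""
    n = len(values)
    return sum(abs(a - b) for a, b in combinations(values, 2)) / (n * (n - 1) / 2)

# Two rankings of the same five units (rank 1 = best); illustrative data only.
r1 = [1, 2, 3, 4, 5]
r2 = [2, 1, 3, 5, 4]

# Total rank of each unit across the two rankings.
totals = [a + b for a, b in zip(r1, r2)]

gmd = gini_mean_difference(totals)  # spreads out when the rankings agree
tau, _ = kendalltau(r1, r2)         # classical rank correlation
print(gmd, tau)
```

The normalization of such a dispersion measure into an index on [-1, 1], and its null distribution, are the substance of the paper and are not reproduced here.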

2.
In this paper, two measures of agreement among several sets of ranks, Kendall's concordance coefficient and the top-down concordance coefficient, are reviewed. To illustrate the utility of these measures, two examples, from the fields of health and sports, are presented. A Monte Carlo simulation study was carried out to compare the performance of Kendall's and the top-down concordance coefficients in detecting several types and magnitudes of agreement. The data generation scheme was designed to induce agreement of different intensities among m (m>2) sets of ranks in non-directional and directional rank agreement scenarios. The performance of each coefficient was estimated by the proportion of rejected null hypotheses, assessed at the 5% significance level, when testing whether the underlying population concordance coefficient is sufficiently greater than zero. In the directional rank agreement scenario, the top-down concordance coefficient achieved a higher percentage of significant concordances than Kendall's concordance coefficient. Especially when the degree of agreement was small, the results of the simulation study pointed to the advantage of using a weighted rank concordance measure, namely the top-down concordance coefficient, alongside Kendall's concordance coefficient, enabling the detection of agreement (in a top-down sense) in situations not detected by Kendall's concordance coefficient.
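A minimal sketch of Kendall's concordance coefficient W for m untied rankings of n objects follows; the top-down coefficient would replace the ranks with Savage scores, which is omitted here, and the example judges are invented:

```python
import numpy as np

def kendalls_w(ranks):
    """Kendall's coefficient of concordance for an (m x n) array of ranks:
    m judges each rank the same n objects, no ties assumed."""
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)
    # Sum of squared deviations of the rank sums from their common mean.
    s = ((rank_sums - m * (n + 1) / 2) ** 2).sum()
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Three judges in perfect agreement give W = 1.
judges = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(kendalls_w(judges))  # 1.0
```

W ranges from 0 (no agreement) to 1 (perfect agreement); the tests of W > 0 used in the simulation study are not reproduced here.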

3.
We propose two simple moment-based procedures, one with (GCCC1) and one without (GCCC2) normality assumptions, to generalize the inference of the concordance correlation coefficient for the evaluation of agreement among multiple observers for measurements on a continuous scale. A modified Fisher's Z-transformation was adapted to further improve the inference. We compared the proposed methods with a U-statistic-based inference approach. Simulation analysis showed desirable statistical properties of the simplified approach GCCC1, in terms of coverage probabilities and coverage balance, especially for small samples. GCCC2, which is distribution-free, behaved comparably with the U-statistic-based procedure, but had a more intuitive and explicit variance estimator. The utility of these approaches was illustrated using two clinical data examples.
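For reference, Lin's classical two-observer concordance correlation coefficient, on which such multi-observer generalizations build, can be sketched as follows (this is the standard two-observer formula, not the GCCC1/GCCC2 procedures themselves):

```python
import numpy as np

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between two observers'
    measurements: 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.cov(x, y, bias=True)[0, 1]
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

x = [1.0, 2.0, 3.0, 4.0]
print(lin_ccc(x, x))  # perfect agreement -> 1.0
```

Unlike the Pearson correlation, the coefficient is penalized by both location and scale shifts between the two observers.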

4.
The problem of whether the rankings of some objects given by a set of criteria (or judges) show any agreement or are more or less independent is addressed. The most familiar measure of concordance is Kendall's W coefficient. Classical tests for concordance are the Friedman test and the F test. Legendre [Species associations: the Kendall coefficient of concordance revisited. J. Agric. Biol. Environ. Stat. 2005;10(2):226–245] compared via simulation the Friedman test and its permutation version. Unfortunately, Legendre's simulation study was very limited because it considered neither the copula aspect nor the F test. Kendall's W is a rank-based correlation measure, and therefore it is not affected by the marginal distributions of the underlying variables, but only by the copula of the multivariate distribution. In this article, Legendre's simulation study is substantially extended by considering the copula aspect as well as the F test. It is shown that the Friedman test is too conservative and less powerful than both the F test and the permutation test for concordance, which always have a correct size and behave alike. The F test should be preferred because it is computationally much easier. Surprisingly, the power function of the tests is not much affected by the type of copula.
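The quantities compared in such simulations can be sketched as follows; the F approximation and its degrees of freedom df1 = n - 1 - 2/m follow the standard textbook form, which may differ in detail from the versions used in the article, and the example ranks are invented:

```python
import numpy as np
from scipy.stats import f

def concordance_f_test(ranks):
    """Kendall's W plus the classical F approximation for testing W > 0.
    Uses the usual approximation df1 = n - 1 - 2/m, df2 = (m - 1)*df1."""
    ranks = np.asarray(ranks, float)
    m, n = ranks.shape
    rank_sums = ranks.sum(axis=0)
    s = ((rank_sums - m * (n + 1) / 2) ** 2).sum()
    w = 12 * s / (m ** 2 * (n ** 3 - n))
    f_stat = (m - 1) * w / (1 - w)
    df1 = n - 1 - 2 / m
    df2 = (m - 1) * df1
    return w, f_stat, f.sf(f_stat, df1, df2)

# Three judges ranking five objects, in strong (not perfect) agreement.
ranks = [[1, 2, 3, 4, 5],
         [2, 1, 3, 4, 5],
         [1, 3, 2, 4, 5]]
w, f_stat, p = concordance_f_test(ranks)
print(w, f_stat, p)
```

The permutation test would instead recompute W under random within-judge permutations of the ranks.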

5.
Sample size calculation is an important component in designing an experiment or a survey. In a wide variety of fields, including management science, insurance, and the biological and medical sciences, truncated normal distributions are encountered in many applications. However, the sample size required for the left-truncated normal distribution has not been investigated, because the distribution of the sample mean from the left-truncated normal distribution is complex and difficult to obtain. This paper compares an ad hoc approach to two newly proposed methods, based on the central limit theorem and on a high-degree saddlepoint approximation, for calculating the required sample size with prespecified power. As shown by simulations and an example of health insurance cost in China, the ad hoc approach underestimates the sample size required to achieve the prespecified power. The method based on the high-degree saddlepoint approximation provides valid sample size and power calculations, and it performs better than the central limit theorem. When the sample size is not too small, the central limit theorem also provides a valid, but relatively simple, tool to approximate that sample size.
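A CLT-based sample size sketch for a left-truncated normal mean is below; all numerical settings (mu, sigma, truncation point, hypothesized means, error rates) are illustrative assumptions, and the paper's saddlepoint method is not reproduced:

```python
from math import ceil
from scipy.stats import truncnorm, norm

# X ~ N(mu, sigma^2) truncated below at `left`; no upper truncation.
mu, sigma, left = 0.0, 1.0, -1.0
a = (left - mu) / sigma  # standardized truncation point
dist = truncnorm(a, float("inf"), loc=mu, scale=sigma)
m, v = dist.mean(), dist.var()

# CLT-based sample size for a one-sided z-test of H0: mean = m0 against
# the truncated-normal mean m, at level alpha with power 1 - beta.
m0, alpha, beta = 0.5, 0.05, 0.2
delta = abs(m - m0)
n = ceil(((norm.ppf(1 - alpha) + norm.ppf(1 - beta)) * v ** 0.5 / delta) ** 2)
print(n)
```

Treating the sample mean as normal is exactly the approximation the paper shows can be improved upon by the saddlepoint method when n is small.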

6.
In this work, we developed a robust permutation test for the concordance correlation coefficient (ρc) for testing the general hypothesis H0: ρc = ρc(0). The proposed test is based on an appropriately studentized statistic. Theoretically, the test is proven to be asymptotically valid in the general setting when two paired variables are uncorrelated but dependent. This desired property was demonstrated across a range of distributional assumptions and sample sizes in simulation studies, where the test exhibited robust type I error control in all settings tested, even when the sample size is small. We demonstrate the application of this test in two real-world examples involving cardiac output measurements and echocardiographic imaging.
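A naive, unstudentized permutation test for H0: ρc = 0 can be sketched as follows; the article's test studentizes the statistic and covers the more general H0: ρc = ρc(0), neither of which is done here, and the example data are simulated:

```python
import numpy as np

def lin_ccc(x, y):
    """Two-observer concordance correlation coefficient."""
    sxy = np.cov(x, y, bias=True)[0, 1]
    return 2 * sxy / (np.var(x) + np.var(y) + (np.mean(x) - np.mean(y)) ** 2)

def perm_test_ccc(x, y, n_perm=2000, seed=0):
    """Permutation p-value for H0: rho_c = 0, obtained by shuffling y."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    observed = lin_ccc(x, y)
    hits = sum(abs(lin_ccc(x, rng.permutation(y))) >= abs(observed)
               for _ in range(n_perm))
    return observed, (hits + 1) / (n_perm + 1)

# Strongly concordant simulated measurements.
x = np.arange(20, dtype=float)
y = x + np.random.default_rng(1).normal(0, 0.5, 20)
obs, p = perm_test_ccc(x, y)
print(obs, p)
```

Shuffling y destroys the pairing, so the permutation distribution approximates the null of no concordance; studentization is what makes the article's version valid for dependent-but-uncorrelated pairs.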

7.
In cases with three ordinal diagnostic groups, the important measures of diagnostic accuracy are the volume under surface (VUS) and the partial volume under surface (PVUS), which are extended forms of the area under curve (AUC) and the partial area under curve (PAUC). This article addresses confidence interval estimation of the difference in paired VUSs and the difference in paired PVUSs. To focus especially on studies with small to moderate sample sizes, we propose an approach based on the concepts of generalized inference. A Monte Carlo study demonstrates that the proposed approach generally provides confidence intervals with reasonable coverage probabilities even at small sample sizes. The proposed approach is compared to a parametric bootstrap approach and a large-sample approach through simulation. Finally, the proposed approach is illustrated via an application to a data set of blood test results of anemia patients.

8.
Two new statistics are proposed for testing the identity of a high-dimensional covariance matrix. Applying large-dimensional random matrix theory, we study the asymptotic distributions of the proposed statistics in the situation where the dimension p and the sample size n tend to infinity proportionally. The proposed tests can accommodate situations where the data dimension is much larger than the sample size and where the population distribution is non-Gaussian. Numerical studies demonstrate that the proposed tests have good empirical power over a wide range of dimensions and sample sizes.

9.
The concordance statistic (C-statistic) is commonly used to assess the predictive performance (discriminatory ability) of a logistic regression model. Although there are several approaches to the C-statistic, their performance in quantifying the improvement in predictive accuracy due to the inclusion of novel risk factors or biomarkers in the model has been heavily criticized in the literature. This paper proposes a model-based concordance-type index, CK, for use with the logistic regression model. CK and its asymptotic sampling distribution are derived following Gonen and Heller's approach for the Cox PH model for survival data, with the necessary modifications for use with binary data. Unlike the existing C-statistics for the logistic model, it quantifies the concordance probability by taking the difference in the predicted risks between two subjects in a pair rather than ranking them, and hence is able to quantify the equivalent incremental value from a new risk factor or marker. The simulation study revealed that CK performs well when the model parameters are correctly estimated for large samples, and shows greater improvement in quantifying the additional predictive value from a new risk factor or marker than the existing C-statistics. Furthermore, an illustration using three datasets supports the findings of the simulation study.

10.
In this paper we consider the problem of unbiased estimation of the distribution function of an exponential population using order statistics based on a random sample. We present a (unique) unbiased estimator based on a single, say the ith, order statistic and study some properties of the estimator for i = 2. We also indicate how this estimator can be utilized to obtain unbiased estimators when a few selected order statistics are available, as well as when the sample is selected following an alternative sampling procedure known as ranked set sampling. It is further proved that, for a ranked set sample of size two, the proposed estimator is uniformly better than the conventional nonparametric unbiased estimator; moreover, for a general sample size, a modified ranked set sampling procedure provides an unbiased estimator uniformly better than the conventional nonparametric unbiased estimator based on the usual ranked set sampling procedure.

11.
Under proper conditions, two independent tests of the null hypothesis of homogeneity of means are provided by a set of sample averages. One test, with tail probability P1, relates to the variation between the sample averages, while the other, with tail probability P2, relates to the concordance of the rankings of the sample averages with the anticipated rankings under an alternative hypothesis. The quantity G = P1P2 is considered as the combined test statistic and, except for the discreteness in the null distribution of P2, would correspond to the Fisher statistic for combining probabilities. Illustration is made, for the case of four means, of how to obtain critical values of G, or critical values of P1 for each possible value of P2, taking discreteness into account. Alternative measures of concordance considered are Spearman's ρ and Kendall's τ. In the case of two averages, the approach amounts to assigning two-thirds of the test size to the concordant tail and one-third to the discordant tail.
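For continuous (non-discrete) P2, the null distribution of G = P1P2 is available in closed form and agrees with Fisher's chi-square combination; the following sketch verifies this for one hypothetical pair of p-values:

```python
from math import log
from scipy.stats import chi2

# Two independent p-values (illustrative numbers only).
p1, p2 = 0.04, 0.10
g = p1 * p2

# Fisher's method: -2*(ln P1 + ln P2) ~ chi-square with 4 df under H0.
fisher_stat = -2 * (log(p1) + log(p2))
combined_p = chi2.sf(fisher_stat, df=4)

# For independent U(0,1) p-values, P(P1*P2 <= g) = g*(1 - ln g),
# which coincides with the Fisher combined p-value.
exact = g * (1 - log(g))
print(g, combined_p, exact)
```

The article's contribution is handling the discreteness of P2, where this continuous closed form no longer applies exactly.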

12.
The proportional odds model (POM) is commonly used in regression analysis to predict the outcome for an ordinal response variable. The maximum likelihood estimation (MLE) approach is typically used to obtain the parameter estimates. The likelihood estimates do not exist when the number of parameters, p, is greater than the number of observations, n. The MLE also does not exist if there are no overlapping observations in the data. In a situation where the number of parameters is less than the sample size but p approaches n, the likelihood estimates may not exist, and if they do exist they may have quite large standard errors. An estimation method is proposed to address the last two issues, i.e. complete separation and the case when p approaches n, but not the case when p>n. The proposed method does not use any penalty term but uses pseudo-observations to regularize the observed responses by downgrading their effect so that they become close to the underlying probabilities. The estimates can be computed easily with all commonly used statistical packages that support fitting POMs with weights. The estimates are compared with the MLE in a simulation study and an application to real data.

13.
DETERMINATION OF DOMAINS OF ATTRACTION BASED ON A SEQUENCE OF MAXIMA
Suppose that the maximum of a random sample from a distribution F(x) may be obtained in each of k equally spaced observation periods. This paper proposes a test to determine the domain of attraction of F(x) and investigates its properties when the sample size is very large and perhaps unknown and k is fixed and small. The test statistic is a function of the spacings between the order statistics based on the sequence of maxima, and is suggested by reference to one studied previously when inference was based on the largest k observations of a random sample. A Monte Carlo study shows that the proposed test is more powerful than its main competitor. The test is illustrated by two examples.

14.
A bioequivalence test compares bioavailability parameters, such as the maximum observed concentration (Cmax) or the area under the concentration-time curve (AUC), for a test drug and a reference drug. Planning a bioequivalence test requires an assumption about the variance of Cmax or AUC in order to estimate the sample size. Since the variance is unknown, current 2-stage designs use the variance estimated from stage 1 data to determine the sample size for stage 2. However, the estimation of variance with the stage 1 data is unstable and may result in too large or too small a sample size for stage 2. This problem is magnified in bioequivalence tests with a serial sampling schedule, by which only one sample is collected from each individual, so that a correct variance assumption becomes even more difficult. To solve this problem, we propose 3-stage designs. Our designs increase sample sizes over stages gradually, so that extremely large sample sizes will not occur. With one more stage of data, the power is increased. Moreover, the variance estimated using data from both stages 1 and 2 is more stable than that using data from stage 1 only in a 2-stage design. These features of the proposed designs are demonstrated by simulations. Significance levels are adjusted to control the overall type I error at the same level for all the multistage designs.

15.
Five estimation approaches have been developed to compute the confidence interval (CI) for the ratio of two lognormal means: (1) T, the CI based on the t-test procedure; (2) ML, a traditional maximum likelihood-based approach; (3) BT, a bootstrap approach; (4) R, the signed log-likelihood ratio statistic; and (5) R*, the modified signed log-likelihood ratio statistic. The purpose of this study was to assess the performance of these five approaches when applied to distributions other than the lognormal distribution, for which they were derived. Performance was assessed in terms of the average length and coverage probability of the CIs for each estimation approach (i.e., T, ML, BT, R, and R*) when the data followed a Weibull or gamma distribution. Four models were discussed in this study. In Model 1, the sample sizes and variances were equal within the two groups. In Model 2, the sample sizes were equal but the variances differed. In Model 3, the variances differed and the larger variance was paired with the larger sample size. In Model 4, the variances differed and the larger variance was paired with the smaller sample size. The results showed that when the variances of the two groups were equal, the t-test performed well, no matter what the underlying distribution was or how large the variances of the two groups were. The BT approach performed better than the others when the underlying distribution was not lognormal, although it was inaccurate when the variances were large. The R* test did not perform well when the data were Weibull or gamma distributed, but it performed best when the data followed a lognormal distribution.
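The bootstrap ("BT") idea can be sketched in simplified percentile form; the simulated lognormal data, parameter values, and number of resamples are illustrative only, and the study's other four approaches are not reproduced:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two simulated lognormal samples (parameters are illustrative).
x = rng.lognormal(mean=0.0, sigma=0.5, size=50)
y = rng.lognormal(mean=0.2, sigma=0.5, size=50)

# Percentile bootstrap CI for the ratio of the two means:
# resample each group with replacement and recompute the ratio.
ratios = [rng.choice(x, x.size).mean() / rng.choice(y, y.size).mean()
          for _ in range(4000)]
lo, hi = np.percentile(ratios, [2.5, 97.5])
print(lo, hi)
```

The study's comparison concerns how intervals like this behave when the generating distribution is Weibull or gamma rather than lognormal.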

16.
In this article, we consider the problem of testing the mean vector in the multivariate normal distribution, where the dimension p is greater than the sample size N. We propose a new test, TBlock, and obtain its asymptotic distribution. We also compare the proposed test with two other tests. The simulation results suggest that the performance of the new test is comparable to that of the two existing tests, and under some circumstances it may have higher power. Therefore, the new statistic can be employed in practice as an alternative choice.

17.
Given two independent samples of sizes n and m drawn from univariate distributions with unknown densities f and g, respectively, we are interested in identifying subintervals where the two empirical densities deviate significantly from each other. The solution is built by turning the nonparametric density comparison problem into a comparison of two regression curves. Each regression curve is created by binning the original observations into many small bins, followed by a suitable form of root transformation of the binned data counts. Once recast as a regression comparison problem, several nonparametric regression procedures for the detection of sparse signals can be applied. Both multiple testing and model selection methods are explored. Furthermore, an approach for estimating larger connected regions where the two empirical densities differ significantly is also derived, based on a scale-space representation. The proposed methods are applied to simulated examples as well as real-life data from biology.

18.
A dual-record system (DRS) model (equivalently, a two-sample capture–recapture experiment), with time and behavioural response variation, has attracted much attention, specifically in the domains of official statistics and epidemiology, as the assumption of list independence often fails. The model suffers from a parameter identifiability problem, and suitable Bayesian methodologies can be helpful. In this article, we formulate population size estimation in a DRS as a missing data problem, and two empirical Bayes approaches are proposed along with a discussion of an existing Bayes treatment. Some features of these methods and their associated posterior convergence are discussed. An extensive simulation study finds that our proposed approaches compare favourably with the existing Bayes approach for this complex model, depending on whether the underlying behavioural response effect is directional. A real-data example is given to illustrate these methods.

19.
The central limit theorem says that, provided an estimator fulfills certain weak conditions, the sampling distribution of the estimator converges to normality for reasonably large sample sizes. We propose a procedure to find out what a “reasonably large sample size” is. The procedure is based on the properties of Gini's mean difference decomposition. We show the results of implementations of the procedure on simulated datasets and on data from the German Socio-economic Panel.

20.
Random samples are assumed for the univariate two-sample problem. Sometimes this assumption may be violated in that an observation in one “sample”, of size m, is from a population different from that yielding the remaining m − 1 observations (which are a random sample). Then the interest is in whether this random sample of size m − 1 is from the same population as the other random sample. If such a violation occurs and can be recognized, and the non-conforming observation can also be identified (without imposing conditional effects), then that observation could be removed and a two-sample test applied to the remaining samples. Unfortunately, satisfactory procedures for such a removal do not seem to exist. An alternative approach is to use two-sample tests whose significance levels remain the same when a non-conforming observation occurs, and is removed, as in the case where both samples are truly random. The equal-tail median test is shown to have this property when the two “samples” are of the same size (and ties do not occur).
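scipy's Mood's median test (a chi-square form, not the equal-tail exact version discussed here) can serve as a rough stand-in for the two-sample median test; the samples below are illustrative:

```python
from scipy.stats import median_test

# Two equal-size samples; each is split at the grand median of the
# combined data, and the resulting 2x2 table is tested.
x = [1, 3, 5, 7, 9, 11]
y = [2, 4, 6, 8, 10, 12]
stat, p, grand_median, table = median_test(x, y, ties="ignore")
print(grand_median, p)
```

Here the counts above and below the grand median are perfectly balanced across the two samples, so the test finds no evidence against a common population.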


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号