Similar Articles
20 similar articles found (search time: 46 ms)
1.
The maximum vertical distance between a receiver operating characteristic (ROC) curve and its chance diagonal is a common measure of effectiveness of the classifier that gives rise to this curve. This measure is known to be equivalent to a two-sample Kolmogorov–Smirnov statistic; so the absolute difference D between two such statistics is often used informally as a measure of difference between the corresponding classifiers. A significance test of D is of great practical interest, but the available Kolmogorov–Smirnov distribution theory precludes easy analytical construction of such a significance test. We, therefore, propose a Monte Carlo procedure for conducting the test, using the binormal model for the underlying ROC curves. We provide Splus/R routines for the computation, tabulate the results for a number of illustrative cases, apply the methods to some practical examples and discuss some implications.
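As a rough illustration of the Monte Carlo procedure, here is a Python sketch (not the authors' Splus/R routines; the binormal parameterization used here, negatives ~ N(0, 1) and positives ~ N(a, 1/b) for both classifiers, and all function names are assumptions):

```python
import bisect
import random

def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum vertical
    distance between the two empirical CDFs (equivalently, between the
    ROC curve and its chance diagonal)."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d

def mc_test_D(neg1, pos1, neg2, pos2, a, b, n_sim=2000, rng=random):
    """Monte Carlo p-value for D = |KS1 - KS2| under the null that both
    classifiers follow one common binormal model (parameters a, b)."""
    d_obs = abs(ks_two_sample(neg1, pos1) - ks_two_sample(neg2, pos2))
    count = 0
    for _ in range(n_sim):
        sims = []
        for neg, pos in ((neg1, pos1), (neg2, pos2)):
            n0 = [rng.gauss(0.0, 1.0) for _ in neg]      # simulated negatives
            n1 = [rng.gauss(a, 1.0 / b) for _ in pos]    # simulated positives
            sims.append(ks_two_sample(n0, n1))
        if abs(sims[0] - sims[1]) >= d_obs:
            count += 1
    return count / n_sim
```

The observed D is then judged against the simulated distribution of |KS1 − KS2| under the fitted common binormal model.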

2.
We aimed to determine the most appropriate change measure among the simple difference, percent change, and symmetrized percent change in simple paired designs. For this purpose, we devised a computer simulation program. Because the distributions of percent and symmetrized percent change values are skewed and bimodal, the paired t-test performed poorly with respect to Type I error and test power. To be able to use percent change or symmetrized percent change as a change measure, either the distribution of the test statistic should be transformed to a known theoretical distribution by transformation methods, or a new test statistic for these values should be developed.
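The three change measures being compared can be written down directly; a minimal sketch (the symmetrized percent change is taken here as the difference divided by the pair mean, a common definition that the abstract does not spell out):

```python
def change_measures(pre, post):
    """Three change measures for one pair of paired observations."""
    diff = post - pre                                  # simple difference
    pct = 100.0 * (post - pre) / pre                   # percent change
    spc = 100.0 * (post - pre) / ((post + pre) / 2.0)  # symmetrized percent change
    return diff, pct, spc
```

For example, a pre value of 10 and post value of 15 give a difference of 5, a percent change of 50, and a symmetrized percent change of 40.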

3.
This study investigates self-citation rates of 222 Chinese journals within seven groups including 76 journals of agronomy (34.2 percent), 57 of biology (25.7 percent), 28 of environmental science and technology (12.6 percent), 15 of forestry (6.8 percent), 24 of academic journals of agricultural university (10.8 percent), 9 of aquatic sciences (4.1 percent), and 13 of animal husbandry and veterinary medicine (5.9 percent). The average self-citation rates range from 2 percent to 67 percent in 2006, 1 percent to 68 percent in 2007 and 0 percent to 67 percent in 2008. There is a significant difference in self-citation rate between most groups of journals. The self-citation rate is positively and significantly correlated with the self-citation rate in 2006 for all 222 journals (N = 222, R2 = 0.194, P = 0.004) (P < 0.05). However, the self-citation rate is not significantly correlated with the journal's impact factor in 2007 (N = 222, R2 = 0.114, P = 0.091) and 2008 (N = 222, R2 = 0.112, P = 0.096) (P < 0.05) for the 222 journals. The relationship between self-citation rate and journal impact factor is discussed.

4.
ABSTRACT

In a test of significance, it is common practice to report the p-value as one way of summarizing the incompatibility between a set of data and a proposed model for the data, constructed under a set of assumptions together with a null hypothesis. However, the p-value has some flaws: one concerns its general definition for two-sided tests, and a related, more serious logical one is its incoherence when interpreted as a statistical measure of evidence for its respective null hypothesis. We address these two issues in this article.

5.
Abstract

It is widely recognized by statisticians, though not as widely by other researchers, that the p-value cannot be interpreted in isolation, but rather must be considered in the context of certain features of the design and substantive application, such as sample size and meaningful effect size. I consider the setting of the normal mean and highlight the information contained in the p-value in conjunction with the sample size and meaningful effect size. The p-value and sample size jointly yield 95% confidence bounds for the effect of interest, which can be compared to the predetermined meaningful effect size to make inferences about the true effect. I provide simple examples to demonstrate that although the p-value is calculated under the null hypothesis, and thus seemingly may be divorced from the features of the study from which it arises, its interpretation as a measure of evidence requires its contextualization within the study. This implies that any proposal for improved use of the p-value as a measure of the strength of evidence cannot simply be a change to the threshold for significance.
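A small sketch of this recovery for the normal mean with known sigma: from a two-sided p-value and the sample size, one can back out the implied point estimate and 95% confidence bounds (assuming a positive observed effect; the function name and defaults are illustrative):

```python
import math
from statistics import NormalDist

def ci_from_p(p_two_sided, n, sigma=1.0, level=0.95):
    """Confidence bounds for a normal mean implied jointly by a
    two-sided p-value and the sample size (known sigma, positive effect)."""
    nd = NormalDist()
    z_obs = nd.inv_cdf(1.0 - p_two_sided / 2.0)    # |observed z|
    xbar = z_obs * sigma / math.sqrt(n)            # implied point estimate
    z_crit = nd.inv_cdf(1.0 - (1.0 - level) / 2.0)
    half = z_crit * sigma / math.sqrt(n)
    return xbar - half, xbar + half
```

The resulting interval can then be compared with the predetermined meaningful effect size; for example, p = 0.05 with n = 100 gives a lower bound of essentially zero.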

6.
7.
Expected shortfall (ES) is a well-known measure of extreme loss associated with a risky asset or portfolio. For any 0 < p < 1, the 100(1 − p) percent ES is defined as the mean of the conditional loss distribution, given the event that the loss exceeds the (1 − p)th quantile of the marginal loss distribution. Estimation of ES based on asset return data is an important problem in finance. Several nonparametric estimators of the expected shortfall are available in the literature. Using Monte Carlo simulations, we compare the accuracy of these estimators under the condition that p → 0 as n → ∞ for several asset return time series models, where n is the sample size. Not much seems to be known regarding the properties of the ES estimators under this condition. For p close to zero, the ES measures an extreme loss in the right tail of the loss distribution of the asset or portfolio. Our simulations and real-data analysis provide insight into the effect of varying p with n on the performance of nonparametric ES estimators.
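A minimal nonparametric ES estimator of the kind being compared might look as follows (this simple tail-average version is an assumption; the abstract does not specify which estimators are included in the comparison):

```python
import math

def expected_shortfall(losses, p):
    """Nonparametric ES estimate at level p: the average of the
    ceil(n*p) largest observed losses."""
    n = len(losses)
    k = max(1, math.ceil(n * p))   # number of tail observations
    tail = sorted(losses)[-k:]     # the k largest losses
    return sum(tail) / k
```

As p → 0 with n fixed, the estimate is driven by very few extreme observations, which is exactly the regime the simulations above investigate.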

8.
9.
ABSTRACT

Various approaches can be used to construct a model from a null distribution and a test statistic. I prove that one such approach, originating with D. R. Cox, has the property that the p-value is never greater than the Generalized Likelihood Ratio (GLR). When combined with the general result that the GLR is never greater than any Bayes factor, we conclude that, under Cox’s model, the p-value is never greater than any Bayes factor. I also provide a generalization, illustrations for the canonical Normal model, and an alternative approach based on sufficiency. This result is relevant for the ongoing discussion about the evidential value of small p-values, and the movement among statisticians to “redefine statistical significance.”
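For the canonical normal model mentioned above, the inequality can be checked numerically: the two-sided p-value never exceeds the generalized likelihood ratio exp(−z²/2) for a point null about a normal mean with known variance (an illustrative check, not Cox's construction itself):

```python
import math
from statistics import NormalDist

def pvalue_and_glr(z):
    """Two-sided p-value and generalized likelihood ratio exp(-z^2/2)
    for a point null about a normal mean with known variance."""
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    glr = math.exp(-z * z / 2.0)
    return p, glr
```

For instance, z = 1.96 gives p ≈ 0.05 but a GLR of about 0.146, so the p-value understates the likelihood ratio by roughly a factor of three.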

10.
Partitioning objects into closely related groups that have different states makes it possible to understand the underlying structure of the data set treated. Different kinds of similarity measures, combined with clustering algorithms, are commonly used to find an optimal clustering, or one closely akin to the original clustering. Using shrinkage-based and rank-based correlation coefficients, which are known to be robust, the recovery level of six chosen clustering algorithms is evaluated using Rand’s C values. The recovery levels obtained using the weighted likelihood estimate of the correlation coefficient are compared with the results of using those correlation coefficients in agglomerative clustering algorithms. This work was supported by RIC(R) grants from Traditional and Bio-Medical Research Center, Daejeon University (RRC04713, 2005) by ITEP in Republic of Korea.
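Rand's C, used above to score recovery, is the proportion of object pairs on which two partitions agree (both together or both apart); a minimal sketch (the function name is illustrative):

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Rand's C: fraction of object pairs that two partitions treat
    the same way (same cluster in both, or different in both)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

Identical partitions score 1.0; lower values indicate poorer recovery of the original clustering.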

11.
The paper proposes a new test for detecting the umbrella pattern under a general non‐parametric scheme. The alternative asserts that the umbrella ordering holds, while the hypothesis is its complement. The main focus is on controlling the power function of the test outside the alternative. As a result, the asymptotic error of the first kind of the constructed solution is smaller than or equal to the fixed significance level α on the whole set where the umbrella ordering does not hold. Under finite sample sizes, this error is also controlled to a satisfactory extent. A simulation study shows, among other things, that the new test improves upon the solution widely recommended in the literature on the subject. A routine, written in R, is attached as the Supporting Information file.

12.
In statistical models where jumps of a d-dimensional stable process (S_t)_{t≥0} are observed in windows with certain asymptotic properties, and where parameters appearing in the Lévy measure of S are to be estimated, we have asymptotically efficient estimators. If a Poisson random measure μ on (0, ∞) × (R^d \ {0}) with intensity dt Λ(dx) replaces the jump measure of S, where Λ is a σ-finite measure on R^d \ {0} admitting tail parameters in a suitable sense, we specify a notion of neighbourhood which allows one to treat efficiency in statistical experiments of the second type by switching to accompanying sequences of the stable-process type considered first.

13.
It sometimes occurs that one or more components of the data exert a disproportionate influence on the model estimation. We need a reliable tool for identifying such troublesome cases, in order to decide whether to eliminate them from the sample, when the data collection was badly carried out, or otherwise to take care in using the model, because the results could be affected by such components. Since a measure for detecting influential cases in the linear regression setting was proposed by Cook [Detection of influential observations in linear regression, Technometrics 19 (1977), pp. 15–18], and apart from versions of the same measure for other models, several new measures have been suggested as single-case diagnostics. For most of them, cutoff values have been recommended (see [D.A. Belsley, E. Kuh, and R.E. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, 2nd ed., John Wiley & Sons, New York, 2004], for instance); however, the lack of a quantile-type cutoff for Cook's statistic has led analysts to rely only on index plots as diagnostic tools. Focusing on logistic regression, the aim of this paper is to provide the asymptotic distribution of Cook's distance in order to obtain a meaningful cutoff point for detecting influential and leverage observations.
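For reference, the original linear-regression version of Cook's measure (Cook, 1977) can be sketched as follows for simple linear regression; this is only background, not the logistic-regression asymptotics the paper derives:

```python
def cooks_distance(x, y):
    """Cook's distance for each point in simple linear regression:
    D_i = e_i^2 * h_ii / (p * s^2 * (1 - h_ii)^2), with p = 2 parameters."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx  # slope
    b0 = sum(y) / n - b1 * xbar                               # intercept
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)                  # residual variance
    dists = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx                  # leverage
        dists.append(e * e * h / (2.0 * s2 * (1.0 - h) ** 2))
    return dists
```

Lacking a quantile-type cutoff, such values are conventionally inspected via an index plot, which is precisely the practice the paper aims to improve on.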

14.
The likelihood ratio test for equality of ordered means is known to have power characteristics that are generally superior to those of competing procedures. Difficulties in implementing this test have led to the development of alternative approaches, most of which are based on contrasts. While orthogonal contrasts can be chosen to simplify the distribution theory, we propose a class of tests that is easy to implement even if the contrasts used are not orthogonal. An overall measure of significance may be obtained by using Fisher's combination statistic to combine the dependent p-values arising from these contrasts. This method can be easily implemented for testing problems involving unequal sample sizes and any partial order, and has power properties that compare well with those of the likelihood ratio test and other contrast-based tests.
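Fisher's combination statistic used above is simply −2 Σ log p_i; a sketch (with k independent p-values it would be chi-squared with 2k degrees of freedom, though the contrast p-values here are dependent, so that reference distribution does not apply directly):

```python
import math

def fisher_statistic(p_values):
    """Fisher's combination statistic -2 * sum(log p_i) for a
    collection of p-values."""
    return -2.0 * sum(math.log(p) for p in p_values)
```

For example, combining p-values 0.01, 0.04, and 0.10 gives a statistic of about 20.25.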

15.
The t-statistic used in the existing literature for testing the significance of linear multiple regression coefficients has only limited use in testing the marginal significance of explanatory variables, though it is also used in testing partial significance. This article identifies the t-statistic appropriate for testing partial significance.

16.
17.
18.
This paper deals with the estimation of the reliability R = P(Y < X) when X is the random strength of a component subjected to a random stress Y, and (X, Y) follows a bivariate Rayleigh distribution. The maximum likelihood estimator of R and its asymptotic distribution are obtained, and an asymptotic confidence interval for R is constructed using the asymptotic distribution. Two further confidence intervals are proposed, based on the bootstrap method and on a computational approach. Testing of the reliability based on the asymptotic distribution of R is discussed. A simulation study investigating the performance of the confidence intervals and tests has been carried out, and a numerical example illustrates the proposed approaches.
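Alongside the parametric approaches in the abstract, the basic empirical estimate of R = P(Y < X) and a percentile bootstrap interval can be sketched as follows (a generic nonparametric sketch, not the bivariate-Rayleigh MLE the paper studies):

```python
import random

def reliability_estimate(x, y):
    """Empirical estimate of R = P(Y < X) from paired
    (strength, stress) observations."""
    return sum(yi < xi for xi, yi in zip(x, y)) / len(x)

def bootstrap_ci(x, y, n_boot=2000, level=0.95, rng=random):
    """Percentile bootstrap confidence interval for R, resampling
    the (strength, stress) pairs with replacement."""
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(reliability_estimate([x[i] for i in idx],
                                          [y[i] for i in idx]))
    stats.sort()
    alpha = 1.0 - level
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

The parametric intervals in the paper would replace the empirical estimate with the bivariate-Rayleigh MLE of R inside the same resampling loop.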

19.
ABSTRACT

P values linked to null hypothesis significance testing (NHST) are the most widely (mis)used method of statistical inference. Empirical data suggest that across the biomedical literature (1990–2015), when abstracts use P values, 96% of them report P values of 0.05 or less; the same percentage (96%) applies to full-text articles. Among 100 articles in PubMed, 55 report P values, while only 4 present confidence intervals for all the reported effect sizes, none use Bayesian methods, and none use the false-discovery rate. Over 25 years (1990–2015), use of P values in abstracts has doubled for all of PubMed and tripled for meta-analyses, while for some types of designs, such as randomized trials, the majority of abstracts report P values. There is major selective reporting of P values: abstracts tend to highlight the most favorable P values, and inferences use even further spin to reach exaggerated, unreliable conclusions. The availability of large-scale data on P values from many papers has allowed the development and application of methods that try to detect and model selection biases, for example p-hacking, that cause patterns of excess significance. Inferences need to be cautious, as they depend on the assumptions made by these models and can be affected by the presence of other biases (e.g., confounding in observational studies). While much of the unreliability of past and present research is driven by small, underpowered studies, NHST with P values may also be particularly problematic in the era of overpowered big data. NHST and P values are optimal only in a minority of current research. Using a more stringent threshold, as in the recently proposed shift from P < 0.05 to P < 0.005, is a temporizing measure to contain the flood and death-by-significance. NHST and P values may be replaced in many fields by other, more fit-for-purpose inferential methods. However, curtailing selection biases requires additional measures beyond changes in inferential methods, in particular reproducible research practices.

20.
For square contingency tables with ordered categories, this paper proposes a measure to represent the degree of departure from the marginal homogeneity model. It is expressed as the weighted sum of the power-divergence or Patil–Taillie diversity index, and is a function of marginal log odds ratios. The measure represents the degree of departure from the equality of the log odds that the row variable is i or below instead of i+1 or above and the log odds that the column variable is i or below instead of i+1 or above for every i. The measure is also extended to multi-way tables. Examples are given.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号