Similar literature
1.
The Fisher exact test has been unjustly dismissed by some as ‘only conditional,’ whereas it is unconditionally the uniformly most powerful test among all unbiased tests, that is, tests of size α whose power is nowhere below the nominal significance level α. The problem with this truly optimal test is that it requires randomization at the critical value(s) to be of size α. Obviously, in practice, one does not want to conclude that ‘with probability x we have a statistically significant result.’ Usually, the hypothesis is rejected only if the test statistic's outcome is more extreme than the critical value, which reduces the actual size considerably.

The randomized unconditional Fisher exact test is constructed (using Neyman-structure arguments) by deriving a conditional randomized test that randomizes at critical values c(t) with probabilities γ(t), both of which depend on the conditioned-upon total number of successes T (the complete sufficient statistic for the nuisance parameter, the common success probability).

In this paper, the Fisher exact test is approximated by deriving nonrandomized conditional tests whose critical region includes the critical value only if γ(t) > γ0, for a fixed threshold value γ0, such that the size of the unconditional modified test is, for all values of the nuisance parameter (the common success probability), smaller than but as close as possible to α. It will be seen that this greatly improves the size of the test as compared with the conservative nonrandomized Fisher exact test.

Size, power, and p value comparisons with the (virtual) randomized Fisher exact test, the conservative nonrandomized Fisher exact test, Pearson's chi-square test, the more competitive mid-p value, McDonald's modification, and Boschloo's modification are performed under the assumption of two binomial samples.
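The conservativeness of the nonrandomized Fisher exact test described in this abstract can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not the paper's procedure: it takes two independent binomial samples, uses the one-sided conditional p-value P(X1 ≥ x1 | T = t), and rejects when that p-value is at most α. All function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(x, n1, n2, t):
    """P(X1 = x | X1 + X2 = t) when both samples share one success probability."""
    return comb(n1, x) * comb(n2, t - x) / comb(n1 + n2, t)


def fisher_one_sided_p(x, n1, n2, t):
    """Conditional one-sided p-value P(X1 >= x | T = t)."""
    upper = min(n1, t)
    return sum(hypergeom_pmf(k, n1, n2, t) for k in range(x, upper + 1))


def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def actual_size(n1, n2, p, alpha=0.05):
    """Unconditional rejection probability of the nonrandomized test
    when both samples have the common success probability p."""
    size = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            # Reject only when the conditional p-value does not exceed alpha.
            if fisher_one_sided_p(x1, n1, n2, x1 + x2) <= alpha:
                size += binom_pmf(x1, n1, p) * binom_pmf(x2, n2, p)
    return size
```

Evaluating `actual_size(10, 10, p)` over a range of p shows how far the actual size stays below the nominal α, which is the gap the modified test in the abstract aims to narrow.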

2.
Several unconditional exact tests, which are constructed to control the Type I error rate at the nominal level, for comparing two independent Poisson rates are proposed and compared to the conditional exact test using a binomial distribution. The unconditional exact tests using the binomial p-value, likelihood ratio, or efficient score as the test statistic improve the power in general, and are therefore recommended. Unconditional exact tests using Wald statistics, whether on the original or square-root scale, may be substantially less powerful than the conditional exact test, and are not recommended. An example is provided from a cardiovascular trial.

3.
In teaching the development of uniformly most powerful unbiased (UMPU) tests, one rarely discusses the performance of alternative biased tests. It is shown, through the comparison of two independent Bernoulli proportions, that a biased test (the Z test) can be more powerful than the UMPU test (the randomized Fisher's exact test) in a large region of the alternative parameter space. A more general example is also given.

4.
The asymptotic chi-square test for testing the Hardy–Weinberg law is unreliable in either small or unbalanced samples. As an alternative, either the unconditional or conditional exact test might be used. It is known that the unconditional exact test has greater power than the conditional exact test in small samples. In this article, we show that the conditional exact test is more powerful than the unconditional exact test in large samples. This result is useful in extremely unbalanced cases with large sample sizes which are often obtained when a rare allele exists.
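For orientation, the asymptotic chi-square statistic that this abstract calls unreliable in small or unbalanced samples can be sketched as follows. This is a minimal illustration assuming a biallelic locus with genotype counts for AA, Aa, and aa; the function name and argument names are hypothetical, and the exact tests studied in the article are not implemented here.

```python
def hwe_chi_square(n_aa_major, n_het, n_aa_minor):
    """Pearson chi-square statistic for Hardy-Weinberg equilibrium
    at a biallelic locus (1 degree of freedom asymptotically)."""
    n = n_aa_major + n_het + n_aa_minor
    # Estimated frequency of the major allele from the genotype counts.
    p = (2 * n_aa_major + n_het) / (2 * n)
    q = 1 - p
    observed = (n_aa_major, n_het, n_aa_minor)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Cells with zero expected count are skipped to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

With a rare allele, the expected count of the rare homozygote becomes very small, which is one reason the chi-square approximation degrades in the unbalanced cases the abstract mentions.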

5.
This article considers the problem of testing equality of the parameters of two exponential distributions having a common known coefficient of variation, under both unconditional and conditional setups. Unconditional tests based on BLUEs and the LRT are considered. Using the Conditionality Principle of Fisher, a UMP conditional test for the one-sided alternative is derived by conditioning on an ancillary. This test is seen to be uniformly more powerful than the unconditional tests in certain given ranges of the ancillary. Simulation studies on the power functions of the tests are done for this purpose.

6.
The paper is concerned with structural properties of the acceptance regions of uniformly most powerful unbiased tests (UMPU-tests) for one- and two-sided hypotheses for 2×2 tables as, for instance, the comparison of two proportions or testing for association. These tests can be considered as randomized versions of Fisher's exact tests. A series of monotonicity and unimodality properties will be proved. These properties are equivalent to a symmetry and convexity condition often required for powerful unconditional tests. Knowledge of such properties allows a fast and in some sense recursive calculation of the critical values of the UMPU-tests, which is important if a repeated calculation of all critical values for different sample sizes or different levels is required. This is, for example, the case if the unconditional power has to be controlled over a certain subset of the alternative, or if one is interested in powerful unconditional non-randomized tests generated by a UMPU-test. Our results also imply some useful properties of the two-dimensional unconditional power function. On the other hand, we also found some less desirable properties of the UMPU-tests.

7.
The classical unconditional exact p-value test can be used to compare two multinomial distributions with small samples. This general hypothesis requires parameter estimation under the null, which makes the test severely conservative. A similar property has been observed for Fisher's exact test, with Barnard and Boschloo providing distinct adjustments that produce more powerful testing approaches. In this study, we develop a novel adjustment for the conservativeness of the unconditional multinomial exact p-value test that produces a nominal type I error rate and increased power in comparison to all alternative approaches. We used a large simulation study to empirically estimate the 5th percentiles of the distributions of the p-values of the exact test over a range of scenarios and implemented a regression model to predict the values for two-sample multinomial settings. Our results show that the new test is uniformly more powerful than Fisher's, Barnard's, and Boschloo's tests, with gains in power as large as several hundred percent in certain scenarios. Lastly, we provide a real-life data example where the unadjusted unconditional exact test wrongly fails to reject the null hypothesis and the corrected unconditional exact test rejects the null appropriately.

8.
Exact unconditional tests for comparing two binomial probabilities are generally more powerful than conditional tests like Fisher's exact test. Their power can be further increased by the Berger and Boos confidence interval method, where a p-value is found by restricting the common binomial probability under H0 to a 1 − γ confidence interval. We studied the average test power for the exact unconditional z-pooled test for a wide range of cases with balanced and unbalanced sample sizes, and significance levels 0.05 and 0.01. The detailed results are available online. Among the values 10⁻³, 10⁻⁴, …, 10⁻¹⁰, the value γ = 10⁻⁴ gave the highest power, or close to the highest power, in all the cases we looked at, and can be given as a general recommendation as an optimal γ.
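The idea of maximizing the p-value over a restricted range of the nuisance parameter can be sketched as below. This is a rough illustration, not the authors' implementation: the maximization uses a simple grid (an approximation), the test is one-sided, and the endpoints `p_lo` and `p_hi` of the 1 − γ confidence interval are assumed to be supplied by the caller (for example from a Clopper-Pearson interval, which is not computed here). All names are hypothetical.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """Binomial probability mass function."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def z_pooled(x1, n1, x2, n2):
    """Z statistic with pooled variance; defined as 0 when the pooled rate is 0 or 1."""
    pooled = (x1 + x2) / (n1 + n2)
    if pooled in (0.0, 1.0):
        return 0.0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


def unconditional_p_value(x1, n1, x2, n2, p_lo=0.0, p_hi=1.0, gamma=0.0, grid=200):
    """One-sided exact unconditional p-value, maximized over a grid on [p_lo, p_hi].

    With the defaults this is an unrestricted maximization; passing the
    endpoints of a 1 - gamma confidence interval together with gamma gives
    a Berger-Boos style p-value (maximum over the interval, plus gamma).
    """
    z_obs = z_pooled(x1, n1, x2, n2)
    # Outcomes at least as extreme as the observed one (small tolerance for ties).
    region = [
        (a, b)
        for a in range(n1 + 1)
        for b in range(n2 + 1)
        if z_pooled(a, n1, b, n2) >= z_obs - 1e-12
    ]
    worst = 0.0
    for i in range(grid + 1):
        p = p_lo + (p_hi - p_lo) * i / grid
        tail = sum(binom_pmf(a, n1, p) * binom_pmf(b, n2, p) for a, b in region)
        worst = max(worst, tail)
    return min(1.0, worst + gamma)
```

Adding γ to the restricted maximum is what keeps the test valid even though the nuisance parameter may fall outside the interval with probability at most γ.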

9.
The approximate chi-square statistic, X²_Q, which is calculated as the difference between the usual chi-square statistic for heterogeneity and the Cochran-Armitage trend test statistic, has been widely applied to test the linearity assumption for dose-response data. This statistic can be shown to be asymptotically distributed as chi-square with K − 2 degrees of freedom. However, this asymptotic property could be quite questionable if the sample size is small, or if there is a high degree of sparseness or imbalance in the data. In this article, we consider how exact tests based on this X²_Q statistic can be performed. Both the exact conditional and unconditional versions will be studied. Interesting findings include: (i) the exact conditional test is extremely sensitive to a small change in dosages, which may eventually produce a degenerate exact conditional distribution; and (ii) the exact unconditional test avoids the problem of degenerate distribution and is shown to be less sensitive to the change in dosages. A real example involving an animal carcinogenesis experiment as well as a fictitious data set will be used for illustration purposes.

10.
Pearson’s chi-square (Pe), likelihood ratio (LR), and Fisher (Fi)–Freeman–Halton test statistics are commonly used to test the association of an unordered r×c contingency table. Asymptotically, these test statistics follow a chi-square distribution. For small sample cases, the asymptotic chi-square approximations are unreliable. Therefore, the exact p-value is frequently computed conditional on the row- and column-sums. One drawback of the exact p-value is that it is conservative. Different adjustments have been suggested, such as Lancaster’s mid-p version and randomized tests. In this paper, we have considered 3×2, 2×3, and 3×3 tables and compared the exact power and significance level of the standard, mid-p, and randomized versions of these tests. The mid-p and randomized test versions have approximately the same power, which is higher than that of the standard test versions. The mid-p type-I error probability seldom exceeds the nominal level. For a given set of parameters, the power of Pe, LR, and Fi differs in approximately the same way for the standard, mid-p, and randomized test versions. Although there is no general ranking of these tests, in some situations, especially when averaged over the parameter space, Pe and Fi have the same power and slightly higher power than LR. When the sample sizes (i.e., the row sums) are equal, the differences are small; otherwise the observed differences can be 10% or more. In some cases, perhaps characterized by poorly balanced designs, LR has the highest power.

11.
The best-known non-asymptotic method for comparing two independent proportions is Fisher's exact test. The usual critical region (CR) tables for this test contain one or more of the following defects: they distinguish between rows and columns; they distinguish between the alternatives H: p1 < p2 and H: p1 > p2; they assume that the error for the two-tailed test is twice that of the one-tailed test; they do not use the optimal version of the test; they do not give both CRs for one and two tails at the same time. All this results in the unnecessary duplication of the space required for the tables, the construction of tables of low-powered methods, or the need to manipulate two different tables (one for the one-tailed test, the other for the two-tailed test). This paper presents CR tables which have been obtained from the most powerful version of Fisher's exact test and which occupy the minimum space possible. The tables, which are valid for one- or two-tailed tests, have levels of significance of 10%, 5% and 1% and values for N (the total size of both samples) of less than or equal to 40. This article shows how to calculate the P value in a specific problem, using the tables as a means of partial checking and as a preliminary step to determining the exact P value.

12.
Sequential order statistics with conditional proportional hazard rates form a regular exponential family in the model parameters. This finding is used to establish uniformly most powerful unbiased (UMPU) tests for a variety of hypotheses.

13.
There are two common methods for statistical inference on 2 × 2 contingency tables. One is the widely taught Pearson chi-square test, which uses the well-known χ² statistic. The chi-square test is appropriate for large sample inference, and it is equivalent to the Z-test that uses the difference between the two sample proportions for the 2 × 2 case. Another method is Fisher’s exact test, which evaluates the likelihood of each table with the same marginal totals. This article mathematically justifies that these two methods for determining extremeness do not completely agree with each other. Our analysis obtains one-sided and two-sided conditions under which a disagreement in determining extremeness between the two tests could occur. We also address the question of whether their discrepancy in determining extremeness would make them draw different conclusions when testing homogeneity or independence. Our examination of the two tests casts light on which test should be trusted when the two tests draw different conclusions.

14.
We consider seven exact unconditional testing procedures for comparing adjusted incidence rates between two groups from a Poisson process. Exact tests are always preferable due to the guarantee of test size in small to medium sample settings. Han [Comparing two independent incidence rates using conditional and unconditional exact tests. Pharm Stat. 2008;7(3):195–201] compared the performance of partial maximization p-values based on the Wald test statistic, the likelihood ratio test statistic, the score test statistic, and the conditional p-value. These four testing procedures do not perform consistently, as the results depend on the choice of test statistics for general alternatives. We consider the approach based on estimation and partial maximization, and compare these to the ones studied by Han (2008) for testing superiority. The procedures are compared with regard to the actual type I error rate and power under various conditions. An example from a biomedical research study is provided to illustrate the testing procedures. The approach based on partial maximization using the score test is recommended due to the comparable performance and computational advantage in large sample settings. Additionally, the approach based on estimation and partial maximization performs consistently for all the three test statistics, and is also recommended for use in practice.

15.
We consider small sample equivalence tests for exponentiality. Statistical inference in this setting is particularly challenging since equivalence testing procedures typically require much larger sample sizes, in comparison with classical “difference tests,” to perform well. We make use of Butler's marginal likelihood for the shape parameter of a gamma distribution in our development of small sample equivalence tests for exponentiality. We consider two procedures using the principle of confidence interval inclusion, four Bayesian methods, and the uniformly most powerful unbiased (UMPU) test where a saddlepoint approximation to the intractable distribution of a canonical sufficient statistic is used. We perform small sample simulation studies to assess the bias of our various tests and show that all of the Bayes posteriors we consider are integrable. Our simulation studies show that the saddlepoint-approximated UMPU method performs remarkably well for small sample sizes and is the only method that consistently exhibits an empirical significance level close to the nominal 5% level.

16.
We review the 40-year-old controversy over the correct analysis of 2×2 tables. It is argued that conditioning on all marginal totals is appropriate, first by examining the likelihood based on these margins and second by comparing conditional and unconditional evaluations of certain specific outcomes. Several foundational issues which naturally arise are also discussed.

17.
In 1945, George Alfred Barnard presented an unconditional exact test to compare two independent proportions. By construction, the critical regions for this test have the very useful property of being Barnard convex sets. In addition, there are empirical findings suggesting that Barnard’s test is the most generally powerful. For Barnard’s test, calculating critical regions is complicated because they are constructed iteratively until a test size is obtained that is as close as possible to the nominal significance level without exceeding it. In this article we present an extension of this leading test to non-inferiority. The extension was constructed for any dissimilarity measure, and tables were constructed for the difference between proportions. We also calculate the critical regions of the extended test for sample sizes less than or equal to 30, nominal significance levels 0.01, 0.025, 0.05, and 0.10, and non-inferiority margins 0.05, 0.10, 0.15, and 0.20. Additionally, we computed test sizes for the mentioned configurations. To carry out these calculations, we wrote a program in the R environment.

18.
For testing a one-sided hypothesis in a one-parameter family of distributions, it is shown that the generalized likelihood ratio (GLR) test coincides with the uniformly most powerful (UMP) test, assuming certain monotonicity properties for the likelihood function. In particular, the equivalence of GLR tests and UMP tests holds for one-parameter exponential families. In addition, the relationship between GLR and UMPU (UMP unbiased) tests is considered when testing two-sided hypotheses.

19.
In this paper, we show that widely used stationarity tests such as the Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test have power close to size in the presence of time-varying unconditional variance. We propose a new test as a complement to the existing tests. Monte Carlo experiments show that the proposed test possesses the following characteristics: (i) in the presence of a unit root or a structural change in the mean, the proposed test is as powerful as the KPSS and other tests; (ii) in the presence of a changing variance, the traditional tests perform badly whereas the proposed test has high power compared to the existing tests; (iii) the proposed test has the same size as traditional stationarity tests under the null hypothesis of stationarity. An application to daily observations of the return on the U.S. Dollar/Euro exchange rate reveals the existence of instability in the unconditional variance when the entire sample is considered, but stability is found in subsamples.

