Similar articles
1.
In 1935, R.A. Fisher published his well-known “exact” test for 2x2 contingency tables. This test is based on the conditional distribution of a cell entry when the row and column marginal totals are held fixed. Tocher (1950) and Lehmann (1959) showed that Fisher's test, when supplemented by randomization, is uniformly most powerful among all the unbiased tests (UMPU). However, since all the practical tests for 2x2 tables are nonrandomized, and therefore biased, the UMPU test is not necessarily more powerful than other tests of the same or lower size. In this work, the two-sided Fisher exact test and the UMPU test are compared with six nonrandomized unconditional exact tests with respect to their power. In both the two-binomial and double dichotomy models, the UMPU test is often less powerful than some of the unconditional tests of the same (or even lower) size. Thus, the assertion that the Tocher-Lehmann modification of Fisher's conditional test is the optimal test for 2x2 tables is unjustified.
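
As a rough illustration of the conditional test described in this abstract, the sketch below computes a two-sided Fisher exact p-value from the hypergeometric distribution of the top-left cell with both margins fixed. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumed convention; the abstract does not say which two-sided definition the authors used, and neither the randomized UMPU version nor the unconditional tests are reproduced.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the probabilities of every table (with the
    same margins) that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance so that ties with p_obs are counted.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

Because the reference distribution is discrete, the attainable p-values form a coarse grid in small samples, which is why the nonrandomized test is usually conservative.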

2.
This article proposes a modified p-value for the two-sided test of the location of the normal distribution when the parameter space is restricted. A commonly used test for this two-sided problem is the uniformly most powerful unbiased (UMPU) test, which is also the likelihood ratio test. The p-value of the test is used as evidence against the null hypothesis. Note that the usual p-value does not depend on the parameter space but only on the observation and the assumption of the null hypothesis. When the parameter space is known to be restricted, the usual p-value cannot sufficiently utilize this information to make a more accurate decision. In this paper, a modified p-value (also called the rp-value) dependent on the parameter space is proposed, and the test derived from the modified p-value is also shown to be the UMPU test.

3.
For testing a one-sided hypothesis in a one-parameter family of distributions, it is shown that the generalized likelihood ratio (GLR) test coincides with the uniformly most powerful (UMP) test, assuming certain monotonicity properties for the likelihood function. In particular, the equivalence of GLR tests and UMP tests holds for one-parameter exponential families. In addition, the relationship between GLR and UMPU (UMP unbiased) tests is considered when testing two-sided hypotheses.

4.
We consider small sample equivalence tests for exponentiality. Statistical inference in this setting is particularly challenging since equivalence testing procedures typically require much larger sample sizes, in comparison with classical “difference tests,” to perform well. We make use of Butler's marginal likelihood for the shape parameter of a gamma distribution in our development of small sample equivalence tests for exponentiality. We consider two procedures using the principle of confidence interval inclusion, four Bayesian methods, and the uniformly most powerful unbiased (UMPU) test where a saddlepoint approximation to the intractable distribution of a canonical sufficient statistic is used. We perform small sample simulation studies to assess the bias of our various tests and show that all of the Bayes posteriors we consider are integrable. Our simulation studies show that the saddlepoint-approximated UMPU method performs remarkably well for small sample sizes and is the only method that consistently exhibits an empirical significance level close to the nominal 5% level.

5.
The objective of this article is to propose and study frequentist tests that have maximum average power, averaging with respect to some specified weight function. First, some relationships between these tests, called maximum average-power (MAP) tests, and most powerful or uniformly most powerful tests are presented. Second, the existence of a maximum average-power test for any hypothesis testing problem is shown. Third, an MAP test for any hypothesis testing problem with a simple null hypothesis is constructed, including some interesting classical examples. Fourth, an MAP test for a hypothesis testing problem with a composite null hypothesis is discussed. From any one-parameter exponential family, a commonly used UMPU test is shown to be also an MAP test with respect to a rich class of weight functions. Finally, some remarks are given to conclude the article.

6.
Sequential order statistics with conditional proportional hazard rates form a regular exponential family in the model parameters. This finding is used to establish uniformly most powerful unbiased (UMPU) tests for a variety of hypotheses.

7.
Nonparametric regression models are often used to check or suggest a parametric model. Several methods have been proposed to test the hypothesis of a parametric regression function against an alternative smoothing spline model. Some tests such as the locally most powerful (LMP) test by Cox et al. (Cox, D., Koh, E., Wahba, G. and Yandell, B. (1988). Testing the (parametric) null model hypothesis in (semiparametric) partial and generalized spline models. Ann. Stat., 16, 113–119.), the generalized maximum likelihood (GML) ratio test and the generalized cross validation (GCV) test by Wahba (Wahba, G. (1990). Spline models for observational data. CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM.) were developed from the corresponding Bayesian models. Their frequentist properties have not been studied. We conduct simulations to evaluate and compare finite sample performances. Simulation results show that the performances of these tests depend on the shape of the true function. The LMP and GML tests are more powerful for low frequency functions while the GCV test is more powerful for high frequency functions. For all test statistics, distributions under the null hypothesis are complicated. Computationally intensive Monte Carlo methods can be used to calculate null distributions. We also propose approximations to these null distributions and evaluate their performances by simulations.

8.
A simple test based on Gini's mean difference is proposed to test the hypothesis of equality of population variances. Using 2000 replicated samples and empirical distributions, we show that the test compares favourably with Bartlett's and Levene's tests for the normal population. Also, it is more powerful than Bartlett's and Levene's tests for some alternative hypotheses for some non-normal distributions and more robust than the other two tests for large sample sizes under some alternative hypotheses. We also give an approximate distribution for the test statistic to enable one to calculate the nominal levels and P-values.
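
For readers unfamiliar with the dispersion measure this abstract builds on, the following minimal sketch computes Gini's mean difference, the average absolute difference over all pairs of distinct observations. The function name `gini_mean_difference` is hypothetical. The abstract does not state the form of the proposed test statistic or its approximate distribution, so only the underlying measure is shown.

```python
from itertools import combinations


def gini_mean_difference(xs):
    """Average of |x_i - x_j| over all pairs i != j."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    # Each unordered pair appears twice among the n * (n - 1) ordered pairs.
    total = sum(abs(a - b) for a, b in combinations(xs, 2))
    return 2.0 * total / (n * (n - 1))


if __name__ == "__main__":
    sample_a = [1.0, 2.0, 3.0, 4.0]
    sample_b = [1.0, 4.0, 7.0, 10.0]
    print(gini_mean_difference(sample_a), gini_mean_difference(sample_b))
```

Unlike the sample variance, this measure uses absolute rather than squared deviations, so single extreme observations influence it less.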

9.
We consider multiple comparison test procedures among treatment effects in a randomized block design. We propose closed testing procedures based on maximum values of some two-sample t test statistics and based on F test statistics. It is shown that the proposed procedures are more powerful than single-step procedures and the REGW (Ryan/Einot–Gabriel/Welsch)-type tests. Next, we consider the randomized block design under simple ordered restrictions of treatment effects. We propose closed testing procedures based on maximum values of two-sample one-sided t test statistics and based on Bartholomew’s statistics for all pairwise comparisons of treatment effects. Although single-step multiple comparison procedures are utilized in general, the power of these procedures is low for a large number of groups. The closed testing procedures stated in the present article are more powerful than the single-step procedures. Simulation studies are performed under the null hypothesis and some alternative hypotheses. In these studies, the proposed procedures perform well.

10.
In the analysis of clinical trials of combination therapies, the min test is often used to demonstrate a combination therapy's superiority to its components. Although uniformly most powerful within a class of monotone tests, this test is excessively conservative with low power at certain alternatives. This paper demonstrates that more powerful tests may be found outside of this class. Some such alternative tests are suggested and compared with the min test on the basis of their actual significance levels and powers. The proposed tests are observed to be less conservative and uniformly more powerful than the min test.

11.
Testing the equality of variances of two linear models with common β-parameter is considered. A test based on least squares residuals (ASR test) is proposed, and it is shown that this test is invariant under the group of scale and translation changes. For some special cases, it is also proved that this test has a monotone power function. Finding the exact critical values of this test is not easy; an approximation is given to facilitate the computation of these. The powers of the BLUS test, the F-test and the new test are computed for various alternatives and compared in a particular case. The proposed test seems to be locally more powerful than the alternative tests.

12.
In the two-sample location-shift problem, Student's t test or Wilcoxon's rank-sum test is commonly applied. The latter test can be more powerful for non-normal data. Here, we propose to combine the two tests within a maximum test. We show that the constructed maximum test controls the type I error rate and has good power characteristics for a variety of distributions; its power is close to that of the more powerful of the two tests. Thus, irrespective of the distribution, the maximum test stabilizes the power. Carrying out the maximum test is a more powerful strategy than selecting one of the single tests. The proposed test is applied to data from a clinical trial.
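
One way to picture such a maximum test is sketched below: take the larger of the absolute pooled-variance t statistic and the absolute standardized rank-sum statistic, then judge it against a Monte Carlo permutation distribution of that same maximum. This is an assumption-laden sketch, not the authors' procedure: the abstract does not say how the reference distribution is obtained, and the names `t_stat`, `midranks`, and `max_test`, the default of 2000 permutations, and the two-sided form are all hypothetical choices. It also assumes the pooled observations are not all equal.

```python
import random
from math import copysign, inf, sqrt
from statistics import mean, variance


def t_stat(x, y):
    """Pooled-variance two-sample t statistic."""
    nx, ny = len(x), len(y)
    diff = mean(x) - mean(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    if sp2 == 0:
        # Degenerate case: no within-group spread.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / sqrt(sp2 * (1 / nx + 1 / ny))


def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def max_test(x, y, n_perm=2000, seed=0):
    """Return (observed maximum statistic, permutation p-value)."""
    pooled = list(x) + list(y)
    nx, n = len(x), len(x) + len(y)
    ranks = midranks(pooled)
    rbar = (n + 1) / 2
    # Exact permutation variance of the rank sum (valid with ties).
    var_w = nx * (n - nx) / (n * (n - 1)) * sum((r - rbar) ** 2 for r in ranks)

    def stat(idx):
        xs = [pooled[i] for i in idx[:nx]]
        ys = [pooled[i] for i in idx[nx:]]
        w = sum(ranks[i] for i in idx[:nx])
        z = (w - nx * rbar) / sqrt(var_w)
        return max(abs(t_stat(xs, ys)), abs(z))

    idx = list(range(n))
    observed = stat(idx)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if stat(idx) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    print(max_test([10.1, 11.3, 12.2, 13.8, 14.5, 15.9],
                   [1.2, 2.4, 3.1, 4.7, 5.5, 6.3]))
```

Because both component statistics are recomputed on every permutation, the p-value accounts for having taken the maximum, so no separate multiplicity adjustment is needed in this sketch.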

13.
The use of general linear modeling (GLM) procedures based on log-rank scores is proposed for the analysis of survival data and compared to standard survival analysis procedures. For the comparison of two groups, this approach performed similarly to the traditional log-rank test. In the case of more complicated designs without ties in the survival times, the approach was only marginally less powerful than tests from proportional hazards models, and clearly less powerful than a likelihood ratio test for a fully parametric model; however, with ties in the survival times, the approach proved more powerful than tests from Cox's semi-parametric proportional hazards procedure. The method appears to provide a reasonably powerful alternative for the analysis of survival data, is easily used in complicated study designs, avoids (semi-)parametric assumptions, and is computationally easy and inexpensive to employ.

14.
Uniformly most powerful Bayesian tests (UMPBTs) are a new class of Bayesian tests in which null hypotheses are rejected if their Bayes factor exceeds a specified threshold. The alternative hypotheses in UMPBTs are defined to maximize the probability that the null hypothesis is rejected. Here, we generalize the notion of UMPBTs by restricting the class of alternative hypotheses over which this maximization is performed, resulting in restricted most powerful Bayesian tests (RMPBTs). We then derive RMPBTs for linear models by restricting alternative hypotheses to g priors. For linear models, the rejection regions of RMPBTs coincide with those of usual frequentist F‐tests, provided that the evidence thresholds for the RMPBTs are appropriately matched to the size of the classical tests. This correspondence supplies default Bayes factors for many common tests of linear hypotheses. We illustrate the use of RMPBTs for ANOVA tests and t‐tests and compare their performance in numerical studies.

15.
A class of tests due to Shoemaker (Commun Stat Simul Comput 28: 189–205, 1999) for differences in scale, which is valid for a variety of both skewed and symmetric distributions when location is known or unknown, is considered. The class is based on the interquantile range and requires that the population variances are finite. In this paper, we first propose a permutation version of it that does not require the condition of finite variances and is remarkably more powerful than the original one. Second, we address the question of which quantile to choose by proposing a combined interquantile test based on our permutation version of the Shoemaker tests. Shoemaker showed that the more extreme interquantile range tests are more powerful than the less extreme ones, unless the underlying distributions are very highly skewed. The question arises because, in practice, one may not know whether the underlying distributions are very highly skewed. The combined interquantile test resolves this question and is robust and more powerful than the stand-alone tests. Third, we conduct a much more detailed simulation study than that of Shoemaker (1999), which compared his tests to the F and the squared rank tests and showed that his tests are better. Since the F and the squared rank tests are not good for differences in scale, his results suffer from this drawback. For this reason, instead of the squared rank test we consider, following the suggestions of several authors, tests due to Brown–Forsythe (J Am Stat Assoc 69:364–367, 1974), Pan (J Stat Comput Simul 63:59–71, 1999), O’Brien (J Am Stat Assoc 74:877–880, 1979) and Conover et al. (Technometrics 23:351–361, 1981).

16.
A modification to Tiku's (1981) test, which may be seriously biased, is proposed. The modified test is only marginally biased if at all and is substantially more powerful. A ratio test based on Tiku’s (1967) modified likelihood function is also proposed, and shown to have power comparable to the power of the ratio test based on the likelihood function. The proposed ratio test is, however, much easier from a computational viewpoint.

17.
Nonhomogeneous Poisson processes (NHPP) provide many models for hardware and software reliability analysis. In order to get an appropriate NHPP model, goodness-of-fit (GOF for short) tests have to be carried out. For the power-law processes, many GOF tests have been developed. For other NHPP models, only the Conditional Probability Integral Transformation (CPIT) test has been proposed. However, the CPIT test is less powerful and cannot be applied to some NHPP models. This article proposes a general GOF test based on the Laplace statistic for a large class of NHPP models with intensity functions of the form αλ(t, β). The simulation results show that this test is more powerful than the CPIT test.

18.
Lehmann & Stein (1948) proved the existence of non-similar tests which can be more powerful than best similar tests. They used Student's problem of testing for a non-zero mean given a random sample from the normal distribution with unknown variance as an example. This raises the question: should we use a non-similar test instead of Student's t test? Questions like this can be answered by comparing the power of the test with the power envelope. This paper discusses the difficulties involved in computing power envelopes. It reports an empirical comparison of the power of the t test and the power envelope and finds that the two are almost identical especially for sample sizes greater than 20. These findings suggest that, as well as being uniformly most powerful (UMP) within the class of similar tests, Student's t test is approximately UMP within the class of all tests. For practical purposes it might also be regarded as UMP when moderate or large sample sizes are involved.

19.
Two-treatment multicentre clinical trials are very common in practice. In cases where a non-parametric analysis is appropriate, a rank-sum test for grouped data called the van Elteren test can be applied. As an alternative approach, one may apply a combination test such as Fisher's combination test or the inverse normal combination test (also called Liptak's method) in order to combine centre-specific P-values. If there are no ties and no differences between centres with regard to the groups’ sample sizes, the inverse normal combination test using centre-specific Wilcoxon rank-sum tests is equivalent to the van Elteren test. In this paper, the van Elteren test is compared with Fisher's combination test based on Wilcoxon rank-sum tests. Data from two multicentre trials as well as simulated data indicate that Fisher's combination of P-values is more powerful than the van Elteren test in realistic scenarios, i.e. when there are large differences between the centres’ P-values, some quantitative interaction between treatment and centre, and/or heterogeneity in variability. The combination approach opens the possibility of using statistics other than the rank sum, and it is also a suitable method for more complicated designs, e.g. when covariates such as age or gender are included in the analysis.
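
The two combination rules named in this abstract can be sketched in a few lines. The code below assumes independent, one-sided centre-specific P-values strictly between 0 and 1; the function names and the optional `weights` argument are hypothetical, and how the authors obtained the centre-level P-values or chose any weights is not stated in the abstract.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def fisher_combination(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under H0."""
    k = len(p_values)
    half = -sum(log(p) for p in p_values)  # half of the chi-square statistic
    # Chi-square survival function for even df 2k has the closed form
    # exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return exp(-half) * total


def inverse_normal_combination(p_values, weights=None):
    """Inverse normal (Liptak) method with optional centre weights."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(p_values)
    z = sum(w * nd.inv_cdf(1.0 - p) for w, p in zip(weights, p_values))
    z /= sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z)


if __name__ == "__main__":
    centre_p = [0.01, 0.40, 0.03]
    print(fisher_combination(centre_p), inverse_normal_combination(centre_p))
```

Fisher's rule responds strongly to a single very small P-value, whereas the inverse normal rule averages evidence across centres, which is consistent with the abstract's remark that Fisher's combination does well when centre P-values differ greatly.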

20.
The classical unconditional exact p-value test can be used to compare two multinomial distributions with small samples. This general hypothesis requires parameter estimation under the null, which makes the test severely conservative. A similar property has been observed for Fisher's exact test, with Barnard and Boschloo providing distinct adjustments that produce more powerful testing approaches. In this study, we develop a novel adjustment for the conservativeness of the unconditional multinomial exact p-value test that produces a nominal type I error rate and increased power in comparison to all alternative approaches. We used a large simulation study to empirically estimate the 5th percentiles of the distributions of the p-values of the exact test over a range of scenarios and implemented a regression model to predict the values for two-sample multinomial settings. Our results show that the new test is uniformly more powerful than Fisher's, Barnard's, and Boschloo's tests, with gains in power as large as several hundred percent in certain scenarios. Lastly, we provide a real-life data example where the unadjusted unconditional exact test wrongly fails to reject the null hypothesis and the corrected unconditional exact test rejects the null appropriately.
