Similar Articles
20 similar articles found.
1.
Interest in confirmatory adaptive combined phase II/III studies with treatment selection has increased in the past few years. These studies start by comparing several treatments with a control. One (or more) treatment(s) is then selected after the first stage based on the information available at an interim analysis, including interim data from the ongoing trial, external information and expert knowledge. Recruitment continues, but now only for the selected treatment(s) and the control, possibly in combination with a sample size reassessment. The final analysis of the selected treatment(s) includes the patients from both stages and is performed such that the overall Type I error rate is strictly controlled, thus providing confirmatory evidence of efficacy at the final analysis. In this paper we describe two approaches to control the Type I error rate in adaptive designs with sample size reassessment and/or treatment selection. The first method adjusts the critical value using a simulation-based approach, which incorporates the number of patients at an interim analysis, the true response rates, the treatment selection rule, etc. We discuss the underlying assumptions of simulation-based procedures and give several examples where the Type I error rate is not controlled if some of the assumptions are violated. The second method is an adaptive Bonferroni-Holm test procedure based on conditional error rates of the individual treatment-control comparisons. We show that this procedure controls the Type I error rate, even if a deviation from the pre-planned adaptation rule or the time point of such a decision is necessary.
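As a rough illustration of the simulation-based adjustment, the sketch below estimates the null distribution of the final test statistic under a select-the-best rule and reads off an adjusted critical value. The sample sizes, the normal outcome model and the "pick the larger interim mean" rule are hypothetical choices for illustration, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(42)

def final_z(n1, n2, rng):
    """One null trial: select the better of two arms at interim,
    then test selected arm vs control on pooled stage-1+2 data."""
    # Stage 1: two experimental arms and a control, all with true mean 0.
    stage1 = rng.standard_normal((3, n1))          # rows: control, arm 1, arm 2
    sel = 1 + np.argmax(stage1[1:].mean(axis=1))   # pick the apparently better arm
    # Stage 2: recruit only the control and the selected arm.
    ctrl2 = rng.standard_normal(n2)
    trt2 = rng.standard_normal(n2)
    n = n1 + n2
    diff = (np.concatenate([stage1[sel], trt2]).mean()
            - np.concatenate([stage1[0], ctrl2]).mean())
    return diff / np.sqrt(2.0 / n)                 # known unit variance

sims = np.array([final_z(50, 50, rng) for _ in range(20000)])
alpha = 0.025
c_adj = np.quantile(sims, 1 - alpha)               # adjusted critical value
print(f"adjusted critical value: {c_adj:.3f} (unadjusted z: 1.960)")
print(f"Type I error with unadjusted z: {(sims > 1.96).mean():.4f}")
```

As the abstract warns, the control delivered by such a simulation is only as good as the assumed selection rule and design parameters.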

2.
In clinical trials involving multiple comparisons of interest, the importance of controlling the trial Type I error is well understood and well documented. Moreover, when these comparisons are themselves correlated, methodologies exist for accounting for the correlation in the trial design when calculating the trial significance levels. However, less well documented is the fact that in some circumstances multiple comparisons affect the Type II error rather than the Type I error, and failure to account for this can result in a reduction in the overall trial power. In this paper, we describe sample size calculations for clinical trials involving multiple correlated comparisons, where all the comparisons must be statistically significant for the trial to provide evidence of effect, and show how such calculations have to account for multiplicity in the Type II error. We begin with a simple case of two comparisons assuming a bivariate Normal distribution, show how to factor in the correlation between comparisons, and then generalise our findings to situations with two or more comparisons using inflation factors to increase the sample size relative to the case of a single outcome. These methods are easy to apply, and we demonstrate how accounting for the multiplicity in the Type II error leads, at most, to modest increases in the sample size.
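A hedged sketch of the two-comparison calculation, assuming both one-sided tests must succeed and that the two test statistics are jointly bivariate Normal; the effect sizes and correlation below are made-up inputs, not values from the paper.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def power_two_coprimary(n, delta1, delta2, rho, alpha=0.025):
    """P(both one-sided tests significant) in a two-arm trial with n per arm,
    standardized effects delta1, delta2 and test-statistic correlation rho."""
    z_crit = norm.ppf(1 - alpha)
    mu = np.array([delta1, delta2]) * np.sqrt(n / 2.0)
    cov = [[1.0, rho], [rho, 1.0]]
    # P(Z1 > c, Z2 > c) = P(-Z1 <= -c, -Z2 <= -c), a bivariate Normal CDF
    return multivariate_normal(mean=-mu, cov=cov).cdf([-z_crit, -z_crit])

def n_required(delta1, delta2, rho, alpha=0.025, power=0.90):
    """Smallest per-arm n at which BOTH comparisons are significant with the
    target probability (the multiplicity acts on the Type II error)."""
    n = 2
    while power_two_coprimary(n, delta1, delta2, rho, alpha) < power:
        n += 1
    return n

def n_single(delta, alpha=0.025, power=0.90):
    """Per-arm n for a single comparison, for reference."""
    z = norm.ppf(1 - alpha) + norm.ppf(power)
    return int(np.ceil(2 * (z / delta) ** 2))

print(n_single(0.3))                # single comparison
print(n_required(0.3, 0.3, 0.5))    # two correlated comparisons: modestly larger
print(n_required(0.3, 0.3, 0.0))    # independent comparisons: the largest inflation
```

Consistent with the abstract, the required increase over the single-comparison case is modest and shrinks as the correlation between comparisons grows.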

3.
The treatment sum of squares in the one-way analysis of variance can be expressed in two different ways: as a sum of comparisons between each treatment and the remaining treatments combined, or as a sum of comparisons between the treatments two at a time. When comparisons between treatments are made with the Wilcoxon rank sum statistic, these two expressions lead to two different tests; namely, that of Kruskal and Wallis and one which is essentially the same as that proposed by Crouse (1961, 1966). The latter statistic is known to be asymptotically distributed as a chi-squared variable when the numbers of replicates are large. Here it is shown to be asymptotically normal when the replicates are few but the number of treatments is large. For all combinations of numbers of replicates and treatments, its empirical distribution is well approximated by a beta distribution.

4.
When thousands of tests are performed simultaneously to detect differentially expressed genes in microarray analysis, the number of Type I errors can be immense if a multiplicity adjustment is not made. However, due to the large scale, traditional adjustment methods require very stringent significance levels for individual tests, which yield low power for detecting alterations. In this work, we describe how two omnibus tests can be used in conjunction with a gene filtration process to circumvent difficulties due to the large scale of testing. These two omnibus tests, the D-test and the modified likelihood ratio test (MLRT), can be used to investigate whether a collection of P-values has arisen from the Uniform(0,1) distribution or whether the Uniform(0,1) distribution contaminated by another Beta distribution is more appropriate. In the former case, attention can be directed to a smaller part of the genome; in the latter event, parameter estimates for the contamination model provide a frame of reference for multiple comparisons. Unlike the likelihood ratio test (LRT), both the D-test and MLRT enjoy simple limiting distributions under the null hypothesis of no contamination, so critical values can be obtained from standard tables. Simulation studies demonstrate that the D-test and MLRT are superior to the AIC, BIC, and Kolmogorov-Smirnov test. A case study illustrates omnibus testing and filtration.
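The contamination model referred to above can be fitted directly by maximum likelihood. This sketch assumes the common beta-uniform mixture density f(p) = λ + (1 − λ)·a·p^(a−1), i.e. a Beta(a, 1) alternative component mixed with Uniform(0,1) nulls; the simulated p-values and starting values are illustrative only, and this is the contamination fit, not the D-test or MLRT themselves.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# Simulated p-values: 80% Uniform(0,1) nulls, 20% Beta(0.3, 1) alternatives.
p = np.concatenate([rng.uniform(size=8000), rng.beta(0.3, 1.0, size=2000)])

def neg_loglik(theta):
    lam, a = theta
    dens = lam + (1 - lam) * a * p ** (a - 1)   # mixture density on (0, 1)
    return -np.sum(np.log(dens))

res = minimize(neg_loglik, x0=[0.5, 0.5],
               bounds=[(1e-6, 1 - 1e-6), (1e-6, 1 - 1e-6)],
               method="L-BFGS-B")
lam_hat, a_hat = res.x
print(f"uniform mixing weight ~ {lam_hat:.3f}, beta shape ~ {a_hat:.3f}")
```

The fitted parameters give exactly the kind of frame of reference for subsequent multiple comparisons that the abstract describes.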

5.
Confidence intervals for location parameters are expanded (in either direction) to some “crucial” points and the resulting increase in the confidence coefficient investigated. Particular crucial points are chosen to illuminate some hypothesis testing problems. Special results are derived for the normal distribution with estimated variance and, in particular, for the problem of classifying treatments as better or worse than a control. For this problem the usual two-sided Dunnett procedure is seen to be inefficient. Suggestions are made for the use of already published tables for this problem. Mention is made of the use of expanded confidence intervals for all pairwise comparisons of treatments using an “honest ordering difference” rather than Tukey's “honest significant difference”.
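For the treatments-versus-control setting the abstract analyzes, recent SciPy releases (1.11+) ship a Dunnett procedure; the snippet below assumes that API is available and uses made-up data, purely to show the classical two-sided comparison the paper argues is inefficient.

```python
import numpy as np
from scipy.stats import dunnett

rng = np.random.default_rng(1)
control = rng.normal(0.0, 1.0, size=30)
treat_a = rng.normal(0.8, 1.0, size=30)   # truly better than control
treat_b = rng.normal(0.0, 1.0, size=30)   # no real difference

res = dunnett(treat_a, treat_b, control=control, alternative="two-sided")
ci = res.confidence_interval(confidence_level=0.95)
for name, p, lo, hi in zip(["A", "B"], res.pvalue, ci.low, ci.high):
    print(f"treatment {name}: p = {p:.4f}, 95% CI for mean diff [{lo:.2f}, {hi:.2f}]")
```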

6.
In this article, we consider the three-factor unbalanced nested design model without the assumption of equal error variance. For the problem of testing “main effects” of the three factors, we propose a parametric bootstrap (PB) approach and compare it with the existing generalized F (GF) test. The Type I error rates of the tests are evaluated using Monte Carlo simulation. Our studies show that the PB test performs better than the GF test. The PB test performs very satisfactorily even for small samples, while the GF test exhibits poor Type I error properties when the number of factorial combinations or treatments goes up. It is also noted that the same tests can be used to test the significance of the random effect variance component in a three-factor mixed effects nested model under unequal error variances.
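The parametric bootstrap recipe is easiest to see in the simpler one-way heteroscedastic setting; the hypothetical sketch below illustrates the PB idea (recompute a standardized statistic on parametric resamples drawn under the null with the estimated variances), not the paper's three-factor nested test.

```python
import numpy as np

rng = np.random.default_rng(2)

def pb_oneway(groups, B=10000, rng=rng):
    """Parametric bootstrap test of equal means under unequal variances."""
    n = np.array([len(g) for g in groups])
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])

    def stat(means, variances):
        w = n / variances                        # precision weights
        grand = np.sum(w * means) / np.sum(w)
        return np.sum(w * (means - grand) ** 2)

    t_obs = stat(m, v)
    count = 0
    for _ in range(B):
        m_b = rng.normal(0.0, np.sqrt(v / n))            # null means, estimated variances
        v_b = v * rng.chisquare(n - 1) / (n - 1)          # resampled variances
        if stat(m_b, v_b) > t_obs:
            count += 1
    return count / B                             # PB p-value

groups = [rng.normal(0, s, size=k) for s, k in [(1, 10), (2, 15), (4, 8)]]
print("PB p-value:", pb_oneway(groups))
```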

7.
In this article we consider the two-way ANOVA model without interaction under heteroscedasticity. For the problem of testing equal effects of factors, we propose a parametric bootstrap (PB) approach and compare it with the existing generalized F (GF) test. The Type I error rates and powers of the tests are evaluated using Monte Carlo simulation. Our studies show that the PB test performs better than the GF test. The PB test performs very satisfactorily even for small samples, while the GF test exhibits poor Type I error properties when the number of factorial combinations or treatments goes up. It is also noted that the same tests can be used to test the significance of the random effect variance component in a two-way mixed-effects model under unequal error variances.

8.
Although many methods are available for performing multiple comparisons based on some measure of location, most can be unsatisfactory in at least some situations; in simulations with small sample sizes, say less than or equal to twenty, the actual Type I error probability can substantially exceed the nominal level, and for some methods it can be well below the nominal level, suggesting that power might be relatively poor. In addition, all methods based on means can have relatively low power under arbitrarily small departures from normality. Currently, a method based on 20% trimmed means and a percentile bootstrap performs relatively well (Wilcox, in press). However, symmetric trimming was used, even when sampling from a highly skewed distribution, and a rigid adherence to 20% trimming can result in low efficiency when a distribution is sufficiently heavy-tailed. Robust M-estimators are more flexible, but they can be unsatisfactory in terms of Type I errors when sample sizes are small. This paper describes an alternative approach based on a modified one-step M-estimator that introduces more flexibility than a trimmed mean but provides better control over Type I error probabilities than a one-step M-estimator.
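As context for the building blocks named above, here is a minimal percentile-bootstrap comparison of 20% trimmed means for two groups on simulated skewed data. This is the trimmed-mean procedure the abstract starts from, not the modified one-step M-estimator method itself, which would replace trim_mean with an MOM estimate.

```python
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(3)

def pct_boot_trimmed_diff(x, y, trim=0.2, B=5000, alpha=0.05, rng=rng):
    """Percentile-bootstrap CI for the difference of 20% trimmed means."""
    diffs = np.empty(B)
    for b in range(B):
        xb = rng.choice(x, size=len(x), replace=True)
        yb = rng.choice(y, size=len(y), replace=True)
        diffs[b] = trim_mean(xb, trim) - trim_mean(yb, trim)
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

x = rng.lognormal(0.0, 1.0, size=20)   # skewed, small samples
y = rng.lognormal(0.3, 1.0, size=20)
print(pct_boot_trimmed_diff(x, y))     # reject equality if 0 falls outside the CI
```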

9.
Halperin et al. (1988) suggested an approach which allows for k Type I errors while using Scheffe's method of multiple comparisons for linear combinations of p means. In this paper we apply the same type of error control to Tukey's method of multiple pairwise comparisons. In fact, the variant of the Tukey (1953) approach discussed here defines the error control objective as assuring with a specified probability that at most one out of the p(p-1)/2 comparisons between all pairs of the treatment means is significant in two-sided tests when an overall null hypothesis (all p means are equal) is true or, from a confidence interval point of view, that at most one of a set of simultaneous confidence intervals for all of the pairwise differences of the treatment means is incorrect. The formulae which yield the critical values needed to carry out this new procedure are derived and the critical values are tabulated. A Monte Carlo study was conducted and several tables are presented to demonstrate the experimentwise Type I error rates and the gains in power furnished by the proposed procedure.
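A Monte Carlo approximation of this relaxed error-control objective is straightforward in the known-variance case. The sketch below is hypothetical (the paper derives and tabulates studentized critical values), but it shows how tolerating one false significance shrinks the critical value relative to demanding none.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)
p_trt, n, reps, alpha = 5, 10, 20000, 0.05

# Simulated treatment means under the global null (known unit variance).
means = rng.normal(0.0, 1.0 / np.sqrt(n), size=(reps, p_trt))
se = np.sqrt(2.0 / n)
pairs = list(combinations(range(p_trt), 2))
z = np.abs(np.stack([means[:, i] - means[:, j] for i, j in pairs], axis=1)) / se

def error_rate(c):
    """P(two or more of the p(p-1)/2 pairwise tests are significant)."""
    return np.mean((z > c).sum(axis=1) >= 2)

lo, hi = 1.0, 5.0                       # bisect for the relaxed critical value
for _ in range(40):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if error_rate(mid) > alpha else (lo, mid)
print(f"critical value tolerating one false significance: {0.5 * (lo + hi):.3f}")
```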

10.
In this article, we consider the two-factor unbalanced nested design model without the assumption of equal error variance. For the problem of testing ‘main effects’ of both factors, we propose a parametric bootstrap (PB) approach and compare it with the existing generalized F (GF) test. The Type I error rates of the tests are evaluated using Monte Carlo simulation. Our studies show that the PB test performs better than the GF test. The PB test performs very satisfactorily even for small samples, while the GF test exhibits poor Type I error properties when the number of factorial combinations or treatments goes up. It is also noted that the same tests can be used to test the significance of the random effect variance component in a two-factor mixed effects nested model under unequal error variances.

11.
In early clinical development of new medicines, a single‐arm study with a limited number of patients is often used to provide a preliminary assessment of a response rate. A multi‐stage design may be indicated, especially when the first stage should only include very few patients so as to enable rapid identification of an ineffective drug. We used decision rules based on several types of nominal confidence intervals to evaluate a three‐stage design for a study that includes at most 30 patients. For each decision rule, we used exact binomial calculations to determine the probability of continuing to further stages as well as to evaluate Type I and Type II error rates. Examples are provided to illustrate the methods for evaluating alternative decision rules and to provide guidance on how to extend the methods to situations with modifications to the number of stages or number of patients per stage in the study design. Copyright © 2004 John Wiley & Sons, Ltd.
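A sketch of such exact binomial calculations, for a hypothetical 10+10+10-patient design with made-up thresholds (continue if at least 2 responses after stage 1 and at least 5 after stage 2; declare activity if at least 10 of 30 respond); the decision rules here are not the paper's confidence-interval-based rules.

```python
import numpy as np
from scipy.stats import binom

def stage_probs(p, n_stages=(10, 10, 10), go=(2, 5), success=10):
    """Exact operating characteristics of a three-stage single-arm design.
    Continue past interim k iff cumulative responses >= go[k];
    declare success iff final cumulative responses >= success."""
    # dist[s] = P(cumulative responses = s and trial still ongoing)
    dist = binom.pmf(np.arange(n_stages[0] + 1), n_stages[0], p)
    reach = []
    for k, n_k in enumerate(n_stages[1:]):
        dist[:go[k]] = 0.0                       # stop early for futility
        reach.append(dist.sum())                 # P(continue to next stage)
        new = np.zeros(len(dist) + n_k)
        for s, pr in enumerate(dist):            # convolve with next stage
            if pr > 0:
                new[s:s + n_k + 1] += pr * binom.pmf(np.arange(n_k + 1), n_k, p)
        dist = new
    return reach, dist[success:].sum()           # P(declare drug active)

for p in (0.1, 0.3):                             # p0 = inactive, p1 = active
    reach, reject = stage_probs(p)
    print(f"p={p}: P(reach stage 2)={reach[0]:.3f}, "
          f"P(reach stage 3)={reach[1]:.3f}, P(success)={reject:.4f}")
```

Evaluating P(success) at p0 and p1 gives the design's exact Type I error rate and power, and scanning alternative thresholds reproduces the kind of rule comparison the abstract describes.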

12.
13.
Traditionally, the grain yield of rice is assessed by splitting it into yield components, including the number of panicles per plant, the number of spikelets per panicle, the 1000-grain weight and the filled-spikelet percentage, so that performance can be evaluated through each component individually, and the products of the yield components are used for grain yield comparisons. However, for standard statistical methods such as the two-sample t-test and analysis of variance, the assumptions of normality and variance homogeneity cannot be fully justified when comparing grain yields, so the empirical sizes cannot be adequately controlled. In this study, based on the concepts of generalized test variables and generalized p-values, a novel statistical testing procedure is developed for grain yield comparisons of rice. The proposed method is assessed by a series of numerical simulations. According to the simulation results, the proposed method performs reasonably well in Type I error control and empirical power. In addition, a real-life field experiment is analyzed with the proposed method; some productive rice varieties are screened out and suggested for follow-up investigation.
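The generalized test variable machinery is easiest to illustrate in the simpler Behrens-Fisher problem of comparing two means under unequal variances; a fiducial Monte Carlo sketch with made-up data, not the paper's yield-component statistic:

```python
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(5)

def generalized_p(x, y, B=100000, rng=rng):
    """Generalized p-value for H0: mu_x <= mu_y under unequal variances,
    using fiducial draws mu = xbar - T * s / sqrt(n) with T ~ t_{n-1}."""
    nx, ny = len(x), len(y)
    mx, my = np.mean(x), np.mean(y)
    sx, sy = np.std(x, ddof=1), np.std(y, ddof=1)
    mu_x = mx - t_dist.rvs(nx - 1, size=B, random_state=rng) * sx / np.sqrt(nx)
    mu_y = my - t_dist.rvs(ny - 1, size=B, random_state=rng) * sy / np.sqrt(ny)
    return np.mean(mu_x - mu_y <= 0)   # small value -> evidence that mu_x > mu_y

x = rng.normal(5.0, 2.0, size=12)   # e.g. yields of variety A
y = rng.normal(3.5, 0.5, size=12)   # variety B, much smaller variance
print("generalized p-value:", generalized_p(x, y))
```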

14.
Heterogeneity of variances of treatment groups influences the validity and power of significance tests of location in two distinct ways. First, if sample sizes are unequal, the Type I error rate and power are depressed if a larger variance is associated with a larger sample size, and elevated if a larger variance is associated with a smaller sample size. This well-established effect, which occurs in t and F tests, and to a lesser degree in nonparametric rank tests, results from unequal contributions of pooled estimates of error variance in the computation of test statistics. It is observed in samples from normal distributions, as well as non-normal distributions of various shapes. Second, transformation of scores from skewed distributions with unequal variances to ranks produces differences in the means of the ranks assigned to the respective groups, even if the means of the initial groups are equal, and a subsequent inflation of Type I error rates and power. This effect occurs for all sample sizes, equal and unequal. For the t test, the discrepancy diminishes, and for the Wilcoxon–Mann–Whitney test, it becomes larger, as sample size increases. The Welch separate-variance t test overcomes the first effect but not the second. Because of interaction of these separate effects, the validity and power of both parametric and nonparametric tests performed on samples of any size from unknown distributions with possibly unequal variances can be distorted in unpredictable ways.
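The first effect is easy to reproduce; a minimal simulation at nominal level 0.05, with the larger variance deliberately paired with the smaller sample (all settings are illustrative choices):

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(6)
reps, alpha = 20000, 0.05
n1, n2, sd1, sd2 = 10, 40, 3.0, 1.0   # larger variance on the SMALLER sample

rej = {"pooled t": 0, "Welch t": 0, "Wilcoxon-MW": 0}
for _ in range(reps):
    x = rng.normal(0, sd1, n1)        # equal means: every rejection is a Type I error
    y = rng.normal(0, sd2, n2)
    rej["pooled t"] += ttest_ind(x, y).pvalue < alpha
    rej["Welch t"] += ttest_ind(x, y, equal_var=False).pvalue < alpha
    rej["Wilcoxon-MW"] += mannwhitneyu(x, y).pvalue < alpha

for name, count in rej.items():
    print(f"{name}: empirical Type I error = {count / reps:.3f}")
```

In this configuration the pooled t test's empirical level is inflated well above 0.05, while the Welch test stays near nominal, consistent with the discussion above.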

15.
Five univariate divisive clustering methods for grouping means in analysis of variance are considered. Unlike pairwise multiple comparison procedures, cluster analysis has the advantage of producing non-overlapping groups of the treatment means. Comparisonwise Type I error rates and average numbers of clusters per experiment are examined for a heterogeneous set of 20 true treatment means with 11 embedded homogeneous sub-groups of one or more treatments. The results of a simulation study clearly show that the observed comparisonwise error rate and number of clusters are determined to a far greater extent by the precision of the experiment (as determined by the magnitude of the standard deviation) than by either the stated significance level or the clustering method used.

16.
In terms of the risk of making a Type I error in evaluating a null hypothesis of equality, requiring two independent confirmatory trials with two-sided p-values less than 0.05 is equivalent to requiring one confirmatory trial with two-sided p-value less than 0.00125. Furthermore, the use of a single confirmatory trial is gaining acceptability, with discussion in both ICH E9 and a CPMP Points to Consider document. Given the growing acceptance of this approach, this note provides a formula for the sample size savings that are obtained with the single clinical trial approach depending on the levels of Type I and Type II errors chosen. For two replicate trials each powered at 90%, which corresponds to a single larger trial powered at 81%, an approximate 19% reduction in total sample size is achieved with the single trial approach. Alternatively, a single trial with the same sample size as the total sample size from two smaller trials will have much greater power. For example, in the case where two trials are each powered at 90% for two-sided α=0.05 yielding an overall power of 81%, a single trial using two-sided α=0.00125 would have 91% power. Copyright © 2004 John Wiley & Sons, Ltd.
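The savings follow from the usual normal-approximation sample size formula, n ∝ (z_{1−α/2} + z_{power})²; a quick check of the figures quoted above:

```python
import numpy as np
from scipy.stats import norm

def n_factor(alpha_two_sided, power):
    """Sample size is proportional to (z_{1-alpha/2} + z_{power})^2."""
    return (norm.ppf(1 - alpha_two_sided / 2) + norm.ppf(power)) ** 2

two_trials = 2 * n_factor(0.05, 0.90)     # two replicates, each 90% power
one_trial = n_factor(0.00125, 0.81)       # same evidence standard, 81% overall power
print(f"single-trial total n relative to two trials: {one_trial / two_trials:.2f}")

# Keep the two-trial total sample size and bank the power gain instead:
z_pow = np.sqrt(two_trials) - norm.ppf(1 - 0.00125 / 2)
print(f"power of one trial with the combined n: {norm.cdf(z_pow):.2f}")
```

The ratio comes out near 0.8, matching the approximate 19% reduction, and the second calculation reproduces the 91% power figure quoted in the abstract.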

17.
A new method, theoretically justified, is proposed to overcome difficulties in analysing reliability data coming from Type I censored samples. The article shows that, by means of Monte Carlo simulations, it is possible to obtain quasi-exact likelihood estimator properties and conservative confidence intervals for log-location-scale distributions. In the case of the exponential distribution, comparisons with the exact estimator properties show that the Monte Carlo approach allows the properties to be calculated with very good accuracy. Finally, for the exponential distribution it is demonstrated that, if the number of failures is constrained to be different from zero, confidence intervals based on the asymptotic properties of the likelihood estimators may give statistically meaningless results in the case of small sample sizes (3–10) and low probabilities of failure (0.05–0.20).
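A minimal version of such a Monte Carlo study for the exponential case (the failure rate, censoring time and sample size are made-up inputs); note the explicit conditioning on a nonzero number of failures, the situation flagged at the end of the abstract.

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_mle(rate=1.0, n=5, tau=1.5, reps=20000, rng=rng):
    """Monte Carlo properties of the exponential-rate MLE under Type I censoring.
    MLE: lambda_hat = (number of failures) / (total time on test)."""
    est = []
    for _ in range(reps):
        t = rng.exponential(1.0 / rate, size=n)
        obs = np.minimum(t, tau)                 # censor at tau
        d = np.sum(t <= tau)                     # observed failures
        if d > 0:                                # MLE undefined with zero failures
            est.append(d / obs.sum())
    est = np.array(est)
    return est.mean(), est.std(), 1 - len(est) / reps

mean, sd, p_zero = simulate_mle()
print(f"E[lambda_hat]={mean:.3f} (true 1.0), sd={sd:.3f}, "
      f"P(no failures, MLE undefined)={p_zero:.4f}")
```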

18.
In this paper, we present several nonparametric multiple comparison (MC) procedures for unbalanced one-way factorial designs. The nonparametric hypotheses are formulated by using normalized distribution functions, and the comparisons are carried out on the basis of the relative treatment effects. The proposed test statistics take the form of linear pseudo rank statistics, and the asymptotic joint distribution of the pseudo rank statistics for testing treatments versus control satisfies the multivariate totally positive of order two condition irrespective of the correlations among the rank statistics. Therefore, in the context of MCs of treatments versus control, the nonparametric Simes test is validated for the global testing of the intersection hypothesis. For simultaneous testing of individual hypotheses, the nonparametric Hochberg step-up procedure strongly controls the familywise Type I error rate asymptotically. With regard to all pairwise comparisons, we generalize various single-step and stagewise procedures to perform comparisons on the relative treatment effects. To further compare with normal theory counterparts, the asymptotic relative efficiencies of the nonparametric MC procedures with respect to the parametric MC procedures are derived under a sequence of Pitman alternatives in a nonparametric location shift model for unbalanced one-way layouts. Monte Carlo simulations are conducted to demonstrate the validity and power of the proposed nonparametric MC procedures.
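For reference, the Hochberg step-up rule itself is only a few lines; a generic sketch on made-up p-values (the paper's contribution is proving its asymptotic FWER control for the pseudo rank statistics, not the rule below).

```python
import numpy as np

def hochberg(pvals, alpha=0.05):
    """Hochberg step-up: reject the hypotheses with the k smallest p-values,
    where k is the largest index with p_(k) <= alpha / (m - k + 1)."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    reject = np.zeros(m, dtype=bool)
    for k in range(m, 0, -1):                    # step up from the largest p-value
        if p[order[k - 1]] <= alpha / (m - k + 1):
            reject[order[:k]] = True
            break
    return reject

print(hochberg([0.001, 0.013, 0.021, 0.04, 0.3]))
# -> only the smallest p-value survives at alpha = 0.05
```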

19.
A Monte Carlo study was used to compare the Type I error rates and power of two nonparametric tests against the F test for the single-factor repeated measures model. The performance of the nonparametric Friedman and Conover tests was investigated for different distributions, numbers of blocks and numbers of repeated measures. The results indicated that the type of distribution has little effect on the ability of the Friedman and Conover tests to control Type I error rates. For power, the Friedman and Conover tests tended to agree in rejecting the same false hypothesis when the design consisted of three repeated measures. However, the Conover test was more powerful than the Friedman test when the number of repeated measures was 4 or 5. Still, the F test is recommended for the single-factor repeated measures model because of its robustness to non-normality and its good power across a range of conditions.
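A minimal version of such a study for the Friedman test, using SciPy's implementation and two of the distribution shapes one might vary; the block counts and settings are hypothetical.

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(8)

def type1_rate(n_blocks=15, k=4, dist=rng.standard_normal, reps=5000, alpha=0.05):
    """Empirical Type I error of the Friedman test under a true null."""
    rej = 0
    for _ in range(reps):
        data = dist((n_blocks, k))               # no treatment effect
        rej += friedmanchisquare(*data.T).pvalue < alpha
    return rej / reps

print("normal errors:     ", type1_rate())
print("exponential errors:", type1_rate(dist=lambda s: rng.exponential(size=s)))
```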

20.
In one-way ANOVA, most pairwise multiple comparison procedures depend on the normality assumption for the errors. In practice, errors very frequently have non-normal distributions. It is therefore important to develop robust estimators of location and the associated variance under non-normality. In this paper, we consider the estimation of one-way ANOVA model parameters for making pairwise multiple comparisons under a short-tailed symmetric (STS) distribution. The classical least squares method is neither efficient nor robust, and the maximum likelihood estimation technique is problematic in this situation. The modified maximum likelihood (MML) estimation technique gives the opportunity to estimate model parameters in closed form under non-normal distributions. Hence, the use of MML estimators in the test statistic is proposed for pairwise multiple comparisons under the STS distribution. Efficiency and power comparisons of the test statistics based on the sample mean, trimmed mean, wave and MML estimators are given, and the robustness of the tests obtained using these estimators is examined under plausible alternatives and an inlier model. It is demonstrated that the test statistic based on MML estimators is efficient and robust, and that the corresponding test is more powerful and has the smallest Type I error.
