Similar documents
 20 similar documents found (search time: 31 ms)
1.
Simultaneously testing a family of n null hypotheses arises in many applications. A common problem in multiple hypothesis testing is controlling the Type I error. The probability of at least one false rejection, referred to as the familywise error rate (FWER), is one of the earliest error rate measures. Many FWER-controlling procedures have been proposed. The ability to control the FWER and achieve higher power is often used to evaluate the performance of a controlling procedure. However, when testing multiple hypotheses, FWER and power are not sufficient for evaluating a controlling procedure's performance. Furthermore, the performance of a controlling procedure is also governed by experimental parameters such as the number of hypotheses, the sample size, the number of true null hypotheses and the data structure. This paper evaluates, under various experimental settings, the performance of some FWER-controlling procedures in terms of five indices: the FWER, the false discovery rate, the false non-discovery rate, the sensitivity and the specificity. The results can provide guidance on how to select an appropriate FWER-controlling procedure to meet a study's objective.
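
As a concrete illustration of FWER control, the sketch below implements two classical procedures, the single-step Bonferroni correction and Holm's step-down method. It is not taken from the paper; the function names and the example p-values are hypothetical and chosen only for illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Single-step Bonferroni: reject H_i when p_i <= alpha / n."""
    n = len(pvalues)
    return [p <= alpha / n for p in pvalues]


def holm(pvalues, alpha=0.05):
    """Holm step-down: compare the j-th smallest p-value (j = 0, 1, ...)
    with alpha / (n - j) and stop at the first p-value that fails."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    reject = [False] * n
    for j, i in enumerate(order):
        if pvalues[i] > alpha / (n - j):
            break  # all larger p-values are retained as well
        reject[i] = True
    return reject


# Hypothetical p-values, for illustration only.
pvals = [0.001, 0.013, 0.02, 0.30]
print(bonferroni(pvals))  # [True, False, False, False]
print(holm(pvals))        # [True, True, True, False]
```

Both procedures control the FWER at level alpha under arbitrary dependence; Holm rejects everything Bonferroni rejects and possibly more, which is one reason power alone is not enough to rank such procedures.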

2.
High-throughput data analyses are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. False discovery rate (FDR) has been considered a proper type I error rate to control for discovery-based high-throughput data analysis. Various multiple testing procedures have been proposed to control the FDR. The power and stability properties of some commonly used multiple testing procedures have not been extensively investigated yet, however. Simulation studies were conducted to compare power and stability properties of five widely used multiple testing procedures at different proportions of true discoveries for various sample sizes for both independent and dependent test statistics. Storey's two linear step-up procedures showed the best performance among all tested procedures considering FDR control, power, and variance of true discoveries. Leukaemia and ovarian cancer microarray studies were used to illustrate the power and stability characteristics of these five multiple testing procedures with FDR control.

3.
Summary. Estimation of the number or proportion of true null hypotheses in multiple-testing problems has become an interesting area of research. The first important work in this field was performed by Schweder and Spjøtvoll. Among other things, they proposed using plug-in estimates for the proportion of true null hypotheses in multiple-test procedures to improve the power. We investigate the problem of controlling the familywise error rate (FWER) when such estimators are used as plug-in estimators in single-step or step-down multiple-test procedures. First we investigate the case of independent p-values under the null hypotheses and show that a suitable choice of plug-in estimates leads to control of FWER in single-step procedures. We also investigate the power and study the asymptotic behaviour of the number of false rejections. Although step-down procedures are more difficult to handle, we briefly consider a possible solution to this problem. Nevertheless, plug-in step-down procedures are not recommended here. For dependent p-values we derive a condition for asymptotic control of FWER and provide some simulations with respect to FWER and power for various models and hypotheses.

4.
The idea of modifying, and potentially improving, classical multiple testing methods controlling the familywise error rate (FWER) via an estimate of the unknown number of true null hypotheses has been around for a long time without a formal answer to the question whether or not such adaptive methods ultimately maintain the strong control of FWER, until Finner and Gontscharuk (2009) and Guo (2009) offered some answers. A class of adaptive Bonferroni and Šidák methods larger than considered in those papers is introduced, with the FWER control now proved under a weaker distributional setup. Numerical results show that there are versions of adaptive Bonferroni and Šidák methods that can perform better under certain positive dependence situations than those previously considered. A different adaptive Holm method and its step-up analog, referred to as an adaptive Hochberg method, are also introduced, and their FWER control is proved asymptotically, as in those papers. These adaptive Holm and Hochberg methods are numerically seen to often outperform the previously considered adaptive Holm method.
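
The adaptive idea can be sketched as follows: estimate the number of true null hypotheses, then use that estimate in place of n in the Bonferroni threshold. The sketch assumes a Storey-type estimator with a tuning parameter lam; this is an illustrative choice, not necessarily the estimator class studied in the paper, and the function name and example p-values are hypothetical.

```python
def adaptive_bonferroni(pvalues, alpha=0.05, lam=0.5):
    """Adaptive Bonferroni sketch.

    Assumption: the number of true nulls is estimated by the Storey-type
    plug-in estimate (#{p_i > lam} + 1) / (1 - lam). Each hypothesis is
    then rejected when p_i <= alpha / n0_hat.
    """
    n0_hat = (sum(p > lam for p in pvalues) + 1) / (1 - lam)
    threshold = alpha / n0_hat
    return n0_hat, [p <= threshold for p in pvalues]


# Hypothetical p-values, for illustration only.
pvals = [0.001, 0.004, 0.012, 0.02, 0.03, 0.6, 0.011, 0.002]
n0_hat, reject = adaptive_bonferroni(pvals)
print(n0_hat)  # 4.0
print(reject)  # [True, True, True, False, False, False, True, True]
```

With these eight p-values the plain Bonferroni threshold is 0.05 / 8 = 0.00625, whereas the adaptive threshold is 0.05 / 4 = 0.0125, so the adaptive version rejects two additional hypotheses. Whether the FWER remains controlled depends on the estimator and the dependence structure, which is the question the abstract addresses.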

5.
In this paper, we translate variable selection for linear regression into multiple testing, and select significant variables according to the testing results. New variable selection procedures are proposed based on the optimal discovery procedure (ODP) in multiple testing. Owing to the ODP's optimality, for a guaranteed number of significant variables included, it includes fewer non-significant variables than marginal p-value based methods. Consistency of our procedures is established in theory and confirmed by simulation. Simulation results suggest that procedures based on multiple testing improve on procedures based on selection criteria, and that our new procedures perform better than marginal p-value based procedures.

6.
Consider testing multiple hypotheses using tests that can only be evaluated by simulation, such as permutation tests or bootstrap tests. This article introduces MMCTest, a sequential algorithm that gives, with arbitrarily high probability, the same classification as a specific multiple testing procedure applied to ideal p‐values. The method can be used with a class of multiple testing procedures that include the Benjamini and Hochberg false discovery rate procedure and the Bonferroni correction controlling the familywise error rate. One of the key features of the algorithm is that it stops sampling for all the hypotheses that can already be decided as being rejected or non‐rejected. MMCTest can be interrupted at any stage and then returns three sets of hypotheses: the rejected, the non‐rejected and the undecided hypotheses. A simulation study motivated by actual biological data shows that MMCTest is usable in practice and that, despite the additional guarantee, it can be computationally more efficient than other methods.
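
To make the setting concrete, the sketch below estimates a single permutation-test p-value by Monte Carlo simulation. It is not MMCTest: it uses a fixed number of permutations and gives no guarantee of matching the decision based on the ideal p-value. The function name, the test statistic (absolute difference in means) and the data are hypothetical.

```python
import random


def mc_permutation_pvalue(x, y, n_perm=999, seed=0):
    """Monte Carlo estimate of a two-sample permutation p-value.

    Statistic: absolute difference of sample means. The estimate
    (1 + exceedances) / (1 + n_perm) is never exactly zero.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(x, y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        if stat(pooled[:n_x], pooled[n_x:]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)


# Hypothetical data: two clearly separated groups.
p = mc_permutation_pvalue([10, 11, 12, 13, 14], [1, 2, 3, 4, 5])
print(p)
```

Because the estimate is random, a hypothesis whose ideal p-value lies near the rejection threshold may be classified differently from run to run; this is the problem that sequential algorithms such as the one described above are designed to address.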

7.
The presence of outliers inevitably leads to distorted analysis and inappropriate prediction, especially for multiple outliers in high-dimensional regression, where the high dimensionality of the data might amplify the chance of an observation or multiple observations being outlying. Noting that the detection of outliers is not only necessary but also important in high-dimensional regression analysis, we propose in this paper a feasible outlier detection approach for the sparse high-dimensional linear regression model. First, we search for a clean subset using the sure independence screening method and the least trimmed squares regression estimates. Then, we define a high-dimensional outlier detection measure and propose a multiple outlier detection approach through multiple testing procedures. In addition, to enhance efficiency, we refine the outlier detection rule after obtaining a relatively reliable non-outlier subset based on the initial detection approach. Comparison studies based on Monte Carlo simulation show that the proposed method performs well for detecting multiple outliers in the sparse high-dimensional linear regression model. We further illustrate the application of the proposed method by empirical analysis of real-life protein and gene expression data.

8.
We consider the problem of testing which of two normally distributed treatments has the largest mean, when the tested populations incorporate a covariate. From the class of procedures using the invariant sequential probability ratio test we derive an optimal allocation that minimizes, in a continuous time setting, the expected sampling costs. Simulations show that this procedure reduces the number of observations from the costlier treatment and categories while maintaining an overall sample size closer to the “pairwise” procedure. A randomized trial example is given.

9.
A study design with two or more doses of a test drug and placebo is frequently used in clinical drug development. Multiplicity issues arise when there are multiple comparisons between doses of test drug and placebo, and also when there are comparisons of doses with one another. An appropriate analysis strategy needs to be specified in advance to avoid spurious results through insufficient control of Type I error, as well as to avoid the loss of power due to excessively conservative adjustments for multiplicity. For evaluation of alternative strategies with possibly complex management of multiplicity, we compare the performance of several testing procedures through the simulated data that represent various patterns of treatment differences. The purpose is to identify which methods perform better or more robustly than the others and under what conditions. Copyright © 2005 John Wiley & Sons, Ltd.

10.
Endpoints in clinical trials are often highly correlated. However, the commonly used multiple testing procedures in clinical trials either do not take into consideration the correlations among test statistics or can only exploit known correlations. Westfall and Young constructed a resampling-based stepdown method that implicitly utilizes the correlation structure of test statistics in situations with unknown correlations. However, their method requires a “subset pivotality” assumption. Romano and Wolf proposed a more general stepdown method, which does not require such an assumption. There is at present little experience with the application of such methods in analyzing clinical trial data. We advocate the application of resampling-based multiple testing procedures to clinical trial data when appropriate. We conjecture that the resampling-based stepdown methods can be extended to a stepup procedure under appropriate assumptions, and we examine the performance of both stepdown and stepup methods under a variety of correlation structures and distribution types. Results from our simulation studies support the use of the resampling-based methods under various scenarios, including binary data and small samples, with strong control of the familywise type I error rate (FWER). Under positive dependence, and for binary data even under independence, the resampling-based methods are more powerful than the Holm and Hochberg methods. Last, we illustrate the advantage of the resampling-based stepwise methods with two clinical trial data examples: a cardiovascular outcome trial and an oncology trial.

11.
We are concerned with a situation in which we would like to test multiple hypotheses with tests whose p‐values cannot be computed explicitly but can be approximated using Monte Carlo simulation. This scenario occurs widely in practice. We are interested in obtaining the same rejections and non‐rejections as the ones obtained if the p‐values for all hypotheses had been available. The present article introduces a framework for this scenario by providing a generic algorithm for a general multiple testing procedure. We establish conditions that guarantee that the rejections and non‐rejections obtained through Monte Carlo simulations are identical to the ones obtained with the p‐values. Our framework is applicable to a general class of step‐up and step‐down procedures, which includes many established multiple testing corrections such as the ones of Bonferroni, Holm, Sidak, Hochberg or Benjamini–Hochberg. Moreover, we show how to use our framework to improve algorithms available in the literature in such a way as to yield theoretical guarantees on their results. These modifications can easily be implemented in practice and lead to a particular way of reporting multiple testing results as three sets together with an error bound on their correctness, demonstrated exemplarily using a real biological dataset.

12.
This paper discusses multiple testing procedures in dose-response clinical trials with primary and secondary endpoints. A general gatekeeping framework for constructing multiple tests is proposed, which extends the Dunnett test [Journal of the American Statistical Association 1955; 50: 1096-1121] and Bonferroni-based gatekeeping tests developed by Dmitrienko et al. [Statistics in Medicine 2003; 22:2387-2400]. The proposed procedure accounts for the hierarchical structure of the testing problem; for example, it restricts testing of secondary endpoints to the doses for which the primary endpoint is significant. The multiple testing approach is illustrated using a dose-response clinical trial in patients with diabetes. Monte-Carlo simulations demonstrate that the proposed procedure provides a power advantage over the Bonferroni gatekeeping procedure. The power gain generally increases with increasing correlation among the endpoints, especially when all primary dose-control comparisons are significant.

13.
The use of surrogate variables has been proposed as a means to capture, for a given observed set of data, sources driving the dependency structure among high-dimensional sets of features and remove the effects of those sources and their potential negative impact on simultaneous inference. In this article we illustrate the potential effects of latent variables on testing dependence and the resulting impact on multiple inference, we briefly review the method of surrogate variable analysis proposed by Leek and Storey (PNAS 2008; 105:18718-18723), and assess that method via simulations intended to mimic the complexity of feature dependence observed in real-world microarray data. The method is also assessed via application to a recent Merck microarray data set. Both simulation and case study results indicate that surrogate variable analysis can offer a viable strategy for tackling the multiple testing dependence problem when the features follow a potentially complex correlation structure, yielding improvements in the variability of false positive rates and increases in power.

14.
Wald and Wolfowitz (1948) have shown that the Sequential Probability Ratio Test (SPRT) for deciding between two simple hypotheses is, under very restrictive conditions, optimal in three attractive senses. First, it can be a Bayes-optimal rule. Second, of all level α tests having the same power, the test with the smallest joint-expected number of observations is the SPRT, where this expectation is taken jointly with respect to both data and prior over the two hypotheses. Third, the level α test needing the fewest conditional-expected number of observations is the SPRT, where this expectation is now taken with respect to the data conditional on either hypothesis being true. Principal among the strong restrictions is that sampling can proceed only in a one-at-a-time manner. In this paper, we relax some of the conditions and show that there are sequential procedures that strictly dominate the SPRT in all three senses. We conclude that the third type of optimality occurs rarely and that decision-makers are better served by looking for sequential procedures that possess the first two types of optimality. By relaxing the one-at-a-time sampling restriction, we obtain optimal (in the first two senses) variable-sample-size sequential probability ratio tests.
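
For reference, the classical one-at-a-time SPRT that the paper takes as its baseline can be sketched for Bernoulli observations as below. The sketch uses Wald's approximate stopping boundaries; the function name and the parameter values are hypothetical, and the variable-sample-size procedures proposed in the paper are not implemented here.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.05):
    """One-at-a-time SPRT for H0: p = p0 versus H1: p = p1 (p1 > p0).

    Uses Wald's approximate boundaries on the log likelihood ratio:
    stop for H1 at log((1 - beta) / alpha), for H0 at log(beta / (1 - alpha)).
    Returns the decision and the number of observations used.
    """
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    llr = 0.0
    for n, x in enumerate(observations, start=1):
        # Log likelihood ratio contribution of one Bernoulli observation.
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", len(observations)


print(sprt_bernoulli([1] * 20, p0=0.5, p1=0.8))  # ('accept H1', 7)
print(sprt_bernoulli([0] * 20, p0=0.5, p1=0.8))  # ('accept H0', 4)
```

The sample size is random: the test stops as soon as the accumulated evidence crosses either boundary, which is the source of the expected-sample-size optimality discussed in the abstract.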

15.
Multiple testing procedures defined by directed, weighted graphs have recently been proposed as an intuitive visual tool for constructing multiple testing strategies that reflect the often complex contextual relations between hypotheses in clinical trials. Many well‐known sequentially rejective tests, such as (parallel) gatekeeping tests or hierarchical testing procedures, are special cases of the graph‐based tests. We generalize these graph‐based multiple testing procedures to adaptive trial designs with an interim analysis. These designs permit mid‐trial design modifications based on unblinded interim data as well as external information, while providing strong familywise error rate control. To maintain the familywise error rate, it is not required to prespecify the adaptation rule in detail. Because the adaptive test does not require knowledge of the multivariate distribution of test statistics, it is applicable in a wide range of scenarios including trials with multiple treatment comparisons, endpoints or subgroups, or combinations thereof. Examples of adaptations are dropping of treatment arms, selection of subpopulations, and sample size reassessment. If, in the interim analysis, it is decided to continue the trial as planned, the adaptive test reduces to the originally planned multiple testing procedure. An adjusted test needs to be applied only if adaptations are actually implemented. The procedure is illustrated with a case study and its operating characteristics are investigated by simulations. Copyright © 2014 John Wiley & Sons, Ltd.

16.
Many multiple testing procedures (MTPs) are available today, and their number is growing. Also available are many type I error rates: the family-wise error rate (FWER), the false discovery rate, the proportion of false positives, and others. Most MTPs are designed to control a specific type I error rate, and it is hard to compare different procedures. We approach the problem by studying the exact level at which threshold step-down (TSD) procedures (an important class of MTPs exemplified by the classic Holm procedure) control the generalized FWER, defined as the probability of k or more false rejections. We find that level explicitly for any TSD procedure and any k. No assumptions are made about the dependency structure of the p-values of the individual tests. We derive from our formula a criterion for unimprovability of a procedure in the class of TSD procedures controlling the generalized FWER at a given level. In turn, this criterion implies that for each k the number of such unimprovable procedures is finite and is greater than one if k > 1. Consequently, in this case the most rejective procedure in the above class does not exist.
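
The two definitions used above can be written out as follows. The symbols are assumptions introduced here for illustration: V denotes the number of false rejections, n the number of hypotheses, and p_(1) ≤ … ≤ p_(n) the ordered p-values.

```latex
% Generalized FWER: probability of k or more false rejections.
\[
  k\text{-FWER} = \Pr(V \ge k), \qquad k = 1, 2, \dots,
  \qquad \text{so that } \text{FWER} = \Pr(V \ge 1).
\]
% Holm procedure as a threshold step-down procedure: reject
% H_{(1)}, \dots, H_{(r)}, where r is the largest index such that
\[
  p_{(i)} \le \frac{\alpha}{n - i + 1} \quad \text{for all } i \le r .
\]
```

A general TSD procedure replaces the Holm thresholds α/(n − i + 1) with another nondecreasing sequence of thresholds; the paper's result gives the exact generalized FWER level attained by any such sequence.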

17.
In a two-sample testing problem, observations from one of the samples are sometimes more difficult and/or costlier to collect than those from the other. It may also be the case that observations from one of the populations have been collected previously and, for operational advantages, we do not wish to collect any more observations from the second population than are necessary for reaching a decision. The partially sequential technique is found to be very useful in such situations. The technique gained its popularity in the statistics literature because it capitalizes on the best aspects of both fixed and sequential procedures. The literature is enriched with various types of partially sequential techniques usable under different types of data set-up. Nonetheless, there is no mention of a multivariate data framework in this context, although such data are very common in practice. The present paper aims at developing a class of partially sequential nonparametric test procedures for two-sample multivariate continuous data. For this we suggest a suitable stopping rule adopting an inverse sampling technique and propose a class of test statistics based on the samples drawn using the suggested sampling scheme. Various asymptotic properties of the proposed tests are explored. An extensive simulation study is also performed to study the asymptotic performance of the tests. Finally, the benefit of the proposed test procedure is demonstrated with an application to real-life data on liver disease.

18.
The concept of a partially sequential hypothesis test was introduced by Wolfe (1977a), and associated procedures were developed for both parametric and nonparametric assumptions. In this paper we consider distribution-free extensions of those indicator tests, based on the placements of the sequentially obtained observations among the previously collected fixed size sample. Exact and asymptotic (as the fixed sample size increases to infinity) properties of these sequential placements procedures are obtained, including statements about the power and expected number of sequentially obtained observations. The results of a Monte Carlo study are used to differentiate between various placement scoring schemes.

19.
In a breakthrough paper, Benjamini and Hochberg (J Roy Stat Soc Ser B 57:289–300, 1995) proposed a new error measure for multiple testing, the FDR; and developed a distribution-free procedure to control it under independence among the test statistics. In this paper we argue by extensive simulation and theoretical considerations that the assumption of independence is not needed. Along the lines of (Ann Stat 32:1035–1061, 2004b), we moreover provide a more powerful method, that exploits an estimator of the number of false nulls among the tests. We propose a whole family of iterative estimators that prove robust under dependence and independence between the test statistics. These estimators can be used to improve also classical multiple testing procedures, and in general to estimate the weight of a known component in a mixture distribution. Innovations are illustrated by simulations.
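
The Benjamini and Hochberg procedure referred to above is a linear step-up rule, sketched below. The function name and the example p-values are hypothetical; the more powerful estimator-based method proposed in the paper is not implemented.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q.

    Finds the largest rank k with p_(k) <= k * q / m and rejects the
    hypotheses with the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank  # keep the largest rank that passes
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


# Hypothetical p-values, for illustration only.
print(benjamini_hochberg([0.005, 0.03, 0.035, 0.9]))
# [True, True, True, False]
```

In this example the second smallest p-value (0.03) exceeds its own threshold (0.025) but is still rejected, because the third smallest (0.035) passes its threshold (0.0375). This is what distinguishes a step-up rule from a step-down rule such as Holm's, which would stop at the first failure.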

20.
A sequentially rejective (SR) testing procedure introduced by Holm (1979) and modified (MSR) by Shaffer (1986) is considered for testing all pairwise mean comparisons. For such comparisons, both the SR and MSR methods require that the observed test statistics be ordered and compared, each in turn, to appropriate percentiles of Student's t distribution. For the MSR method these percentiles are based on the maximum number of true null hypotheses remaining at each stage of the sequential procedure, given prior significance at previous stages. A function is developed for determining this number from the number of means being tested and the stage of the test. For a test of all pairwise comparisons, the logical implications which follow the rejection of a null hypothesis render the MSR procedure uniformly more powerful than the SR procedure. Tables of percentiles for comparing K means, 3 < K < 6, using the MSR method are presented. These tables use Sidak's (1967) multiplicative inequality and simplify the use of the MSR procedure. Several modifications to the MSR are suggested as a means of further increasing the power for testing the pairwise comparisons. General use of the MSR and the corresponding function for testing other parameters besides the mean is discussed.
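
The maximum number of true null hypotheses at each stage can be computed with a short recursion. The sketch below follows the standard partition argument for all pairwise comparisons of K means (a group of j equal means contributes j(j − 1)/2 true hypotheses); it is an assumption that this matches the function developed in the paper, and the function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def possible_true_counts(k):
    """All possible numbers of simultaneously true pairwise nulls among k means."""
    if k <= 1:
        return frozenset({0})
    counts = set()
    for j in range(1, k + 1):
        # One group of j equal means gives comb(j, 2) true nulls;
        # the remaining k - j means are handled recursively.
        for rest in possible_true_counts(k - j):
            counts.add(comb(j, 2) + rest)
    return frozenset(counts)


def max_true_nulls_by_stage(k):
    """Maximum number of true nulls at stage i, given i - 1 prior rejections."""
    m = comb(k, 2)
    counts = possible_true_counts(k)
    return [max(s for s in counts if s <= m - i + 1) for i in range(1, m + 1)]


print(max_true_nulls_by_stage(4))  # [6, 3, 3, 3, 2, 1]
```

For K = 4 the unmodified SR (Holm) procedure would use the divisors 6, 5, 4, 3, 2, 1; the smaller values at stages 2 and 3 are the source of the MSR procedure's power gain.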
