
1.

Evaluation of false discovery rate and power via sample size in microarray studies

Jie Song, Herman W. Raadsma, Peter C. Thomson 《Journal of applied statistics》2012,39(3):489-500

Microarray experiments are now common in human, agricultural plant and animal studies. False discovery rate (FDR) is widely used in the analysis of large-scale microarray data to account for problems associated with multiple testing. A well-designed microarray study should have adequate statistical power to detect the differentially expressed (DE) genes, while keeping the FDR acceptably low. In this paper, we used a mixture model of expression responses involving DE genes and non-DE genes to analyse theoretical FDR and power for simple scenarios where it is assumed that each gene has equal error variance and the gene effects are independent. A simulation study was used to evaluate the empirical FDR and power for more complex scenarios with unequal error variance and gene dependence. Based on this approach, we present a general guide for sample size requirements at the experimental design stage of prospective microarray studies. This paper presents an approach that explicitly connects the sample size with FDR and power. While the methods have been developed in the context of one-sample microarray studies, they are readily applicable to two-sample studies and could be adapted to multiple-sample studies.
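The link between sample size, power and FDR in the simple scenario (equal error variance, independent genes) can be illustrated with a small sketch. It is not the authors' model: it assumes a two-sided one-sample z-test per gene with known standard deviation `sigma`, a common effect size `delta` for every DE gene, a fixed per-gene significance level `alpha`, and it approximates the FDR by the ratio of expected false discoveries to expected discoveries. All function and parameter names are hypothetical.

```python
from statistics import NormalDist


def power_and_fdr(n, delta, sigma=1.0, pi0=0.9, alpha=0.001):
    """Return (power, approximate FDR) for n replicates per gene.

    Assumes a two-sided one-sample z-test with known sigma, a proportion
    pi0 of non-DE genes, and the same effect size delta for all DE genes.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)          # two-sided critical value
    shift = delta * n ** 0.5 / sigma         # mean of the z-statistic for DE genes
    power = z.cdf(-crit + shift) + z.cdf(-crit - shift)
    # Expected false discoveries over expected total discoveries (per gene).
    fdr = pi0 * alpha / (pi0 * alpha + (1 - pi0) * power)
    return power, fdr


def smallest_n(delta, target_power=0.8, max_fdr=0.05, **kwargs):
    """Smallest n (up to 1000) meeting both the power and the FDR targets."""
    for n in range(2, 1001):
        power, fdr = power_and_fdr(n, delta, **kwargs)
        if power >= target_power and fdr <= max_fdr:
            return n
    return None
```

Calling `smallest_n(1.0)` searches for the smallest number of replicates that reaches 80% power with an approximate FDR of at most 5% under these assumptions; unequal variances or dependent genes would call for simulation instead, as the abstract notes.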

2.

Evaluations of FDR-controlling procedures in multiple hypothesis testing

Yi-Ting Hwang, Shih-Kai Chu, Shyh-Tyan Ou 《Statistics and Computing》2011,21(4):569-583

Many exploratory experiments such as DNA microarray or brain imaging require simultaneous comparisons of hundreds or thousands of hypotheses. Under such a setting, using the false discovery rate (FDR) as an overall Type I error is recommended (Benjamini and Hochberg in J. R. Stat. Soc. B 57:289–300, 1995). Many FDR-controlling procedures have been proposed. However, when evaluating the performance of FDR-controlling procedures, researchers often focus only on the ability of procedures to control the FDR and to achieve high power. Meanwhile, when multiple hypotheses are tested, it is also possible to commit a false non-discovery or to fail to claim a true non-significance. In addition, various experimental parameters such as the number of hypotheses, the proportion of true null hypotheses among all hypotheses, the sample size and the correlation structure may affect the performance of FDR-controlling procedures. The purpose of this paper is to illustrate the performance of some existing FDR-controlling procedures in terms of four indices, i.e., the FDR, the false non-discovery rate, the sensitivity and the specificity. Analytical results of these indices for the FDR-controlling procedures are derived. Simulations are also performed to evaluate the performance of controlling procedures in terms of these indices under various experimental parameters. The results can serve as guidance for practitioners in choosing an appropriate FDR-controlling procedure.
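The four indices named in the abstract can be made concrete with a short sketch. It computes the realized proportions for a single set of decisions; the FDR and the false non-discovery rate are expectations, so in a simulation study these proportions would be averaged over replicates. The function name and the convention that an empty denominator gives 0 are assumptions made for illustration.

```python
def error_indices(is_true_null, rejected):
    """Realized error proportions for one set of multiple-testing decisions.

    is_true_null[i] is True when hypothesis i is a true null;
    rejected[i] is True when the procedure rejected hypothesis i.
    """
    pairs = list(zip(is_true_null, rejected))
    v = sum(1 for null, rej in pairs if null and rej)          # false discoveries
    s = sum(1 for null, rej in pairs if not null and rej)      # true discoveries
    u = sum(1 for null, rej in pairs if null and not rej)      # true non-discoveries
    t = sum(1 for null, rej in pairs if not null and not rej)  # false non-discoveries

    def ratio(a, b):
        return a / b if b else 0.0  # assumed convention: 0 when nothing to divide by

    return {
        "FDP": ratio(v, v + s),          # false discovery proportion
        "FNP": ratio(t, u + t),          # false non-discovery proportion
        "sensitivity": ratio(s, s + t),
        "specificity": ratio(u, u + v),
    }
```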

3.

False discovery rates for large-scale model checking under certain dependence

Lu Deng, Xuemin Zi 《Communications in Statistics - Theory and Methods》2018,47(1):64-79

In many scientific fields, it is interesting and important to determine whether an observed data stream comes from a prespecified model or not, particularly when the number of data streams is large and multiple hypothesis testing is therefore necessary. In this article, we consider large-scale model checking under certain dependence among different data streams observed at the same time. We propose a false discovery rate (FDR) control procedure to check those unusual data streams. Specifically, we derive an approximation of false discovery and construct a point estimate of FDR. Theoretical results show that, under some mild assumptions, our proposed estimate of FDR is simultaneously conservatively consistent with the true FDR, and hence it is an asymptotically strong control procedure. Simulation comparisons with some competing procedures show that our proposed FDR procedure behaves better in general settings. Application of our proposed FDR procedure is illustrated by the StarPlus fMRI data.

4.

Evaluations of FWER-controlling methods in multiple hypothesis testing

Yi-Ting Hwang, Jia-Jung Lai, Shyh-Tyan Ou 《Journal of applied statistics》2010,37(10):1681-1694

Simultaneously testing a family of n null hypotheses can arise in many applications. A common problem in multiple hypothesis testing is to control Type-I error. The probability of at least one false rejection, referred to as the familywise error rate (FWER), is one of the earliest error rate measures. Many FWER-controlling procedures have been proposed. The ability to control the FWER and achieve higher power is often used to evaluate the performance of a controlling procedure. However, when testing multiple hypotheses, FWER and power are not sufficient for evaluating a controlling procedure's performance. Furthermore, the performance of a controlling procedure is also governed by experimental parameters such as the number of hypotheses, sample size, the number of true null hypotheses and data structure. This paper evaluates, under various experimental settings, the performance of some FWER-controlling procedures in terms of five indices: the FWER, the false discovery rate, the false non-discovery rate, the sensitivity and the specificity. The results can provide guidance on how to select an appropriate FWER-controlling procedure to meet a study's objective.
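The abstract does not say which FWER-controlling procedures were compared. As an illustration only, the sketch below implements two standard ones, the Bonferroni correction and Holm's step-down procedure; the function names are hypothetical.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def holm(pvalues, alpha=0.05):
    """Holm's step-down procedure.

    Walk through the p-values from smallest to largest, comparing the
    k-th smallest with alpha / (m - k + 1), and stop at the first failure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = [False] * m
    for rank, i in enumerate(order):  # rank is 0-based
        if pvalues[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break
    return rejected
```

Holm's procedure rejects every hypothesis that Bonferroni rejects and sometimes more, which is one reason power alone is usually reported alongside the FWER when such procedures are compared.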

5.

Large-scale multiple testing under dependence

Wenguang Sun, T. Tony Cai 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(2):393-424

Summary. The paper considers the problem of multiple testing under dependence in a compound decision theoretic framework. The observed data are assumed to be generated from an underlying two-state hidden Markov model. We propose oracle and asymptotically optimal data-driven procedures that aim to minimize the false non-discovery rate (FNR) subject to a constraint on the false discovery rate (FDR). It is shown that the performance of a multiple-testing procedure can be substantially improved by adaptively exploiting the dependence structure among hypotheses, and hence conventional FDR procedures that ignore this structural information are inefficient. Both theoretical properties and numerical performances of the proposed procedures are investigated. It is shown that the proposed procedures control FDR at the desired level, enjoy certain optimality properties and are especially powerful in identifying clustered non-null cases. The new procedure is applied to an influenza-like illness surveillance study for detecting the timing of epidemic periods.

6.

Modified Simes’ critical values under positive dependence

《Journal of statistical planning and inference》2006,136(12):4129-4146

A modification of the critical values of Simes’ test is suggested in this article when the underlying test statistics are multivariate normal with a common non-negative correlation, yielding a more powerful test than the original Simes’ test. A step-up multiple testing procedure with these modified critical values, which is shown to control false discovery rate (FDR), is presented as a modification of the traditional Benjamini–Hochberg (BH) procedure. Simulations were carried out to compare this modified BH procedure with the BH and other modified BH procedures in terms of false non-discovery rate (FNR), 1–FDR–FNR and average power. The present modified BH procedure is observed to perform well compared to others when the test statistics are highly correlated and most of the hypotheses are true.
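The modified critical values are not given in the abstract, so the sketch below shows only the baseline they modify: a generic step-up procedure that accepts any nondecreasing list of critical values, together with the original Simes critical values i·α/m. With those values the step-up procedure is the BH procedure, and rejecting the global null whenever it rejects anything is Simes' test. Function names are hypothetical.

```python
def step_up(pvalues, critical_values):
    """Generic step-up procedure.

    Finds the largest k with p_(k) <= c_k and rejects the hypotheses
    with the k smallest p-values (none when no such k exists).
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= critical_values[rank - 1]:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def simes_critical_values(m, alpha=0.05):
    """Original Simes / BH critical values i * alpha / m, for i = 1..m."""
    return [i * alpha / m for i in range(1, m + 1)]


def simes_test(pvalues, alpha=0.05):
    """Simes' global test: reject the intersection null if any p_(i) <= i*alpha/m."""
    return any(step_up(pvalues, simes_critical_values(len(pvalues), alpha)))
```

A modified procedure of the kind described would be obtained by passing a different list of critical values to `step_up`.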

7.

A clarifying comparison of methods for controlling the false discovery rate

Yaling Yin, Christine E. Soteros, Miķelis G. Bickis 《Journal of statistical planning and inference》2009

Traditional multiple hypothesis testing procedures fix an error rate and determine the corresponding rejection region. In 2002 Storey proposed a fixed rejection region procedure and showed numerically that it can gain more power than the fixed error rate procedure of Benjamini and Hochberg while controlling the same false discovery rate (FDR). In this paper it is proved that when the number of alternatives is small compared to the total number of hypotheses, Storey's method can be less powerful than that of Benjamini and Hochberg. Moreover, the two procedures are compared by setting them to produce the same FDR. The difference in power between Storey's procedure and that of Benjamini and Hochberg is near zero when the distance between the null and alternative distributions is large, but Benjamini and Hochberg's procedure becomes more powerful as the distance decreases. It is shown that modifying the Benjamini and Hochberg procedure to incorporate an estimate of the proportion of true null hypotheses as proposed by Black gives a procedure with superior power.

8.

Comparisons of estimators of the number of true null hypotheses and adaptive FDR procedures in multiplicity testing

《Journal of Statistical Computation and Simulation》2012,82(2):207-220

Many exploratory studies such as microarray experiments require the simultaneous comparison of hundreds or thousands of genes. In many microarray experiments, most genes are not expected to be differentially expressed. Under such a setting, a procedure that is designed to control the false discovery rate (FDR) is aimed at identifying as many potential differentially expressed genes as possible. The usual FDR-controlling procedure is constructed based on the number of hypotheses. However, it can become very conservative when some of the alternative hypotheses are expected to be true. The power of a controlling procedure can be improved if the number of true null hypotheses (m₀) instead of the number of hypotheses is incorporated in the procedure [Y. Benjamini and Y. Hochberg, On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Edu. Behav. Statist. 25 (2000), pp. 60–83]. Nevertheless, m₀ is unknown and has to be estimated. The objective of this article is to evaluate some existing estimators of m₀ and to discuss the feasibility of incorporating these estimators into FDR-controlling procedures under various experimental settings. The results of simulations can help the investigator to choose an appropriate procedure to meet the requirements of the study.
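How an estimate of m₀ enters an FDR-controlling procedure can be sketched as follows. The abstract does not list the estimators it evaluates; the one used here, the count of p-values above a tuning point `lam` divided by `1 - lam`, is a common choice picked for illustration and is not necessarily among them. The adaptive step simply replaces the number of hypotheses m by the estimate in the BH critical values. Names are hypothetical.

```python
def estimate_m0(pvalues, lam=0.5):
    """Illustrative estimate of the number of true nulls.

    Under a true null the p-value is uniform, so about m0 * (1 - lam)
    p-values are expected to exceed lam. Clipped to the range [1, m].
    """
    m = len(pvalues)
    m0 = sum(p > lam for p in pvalues) / (1 - lam)
    return min(float(m), max(1.0, m0))


def bh(pvalues, alpha=0.05, m0=None):
    """Step-up BH procedure; pass m0 to obtain the adaptive version."""
    m = len(pvalues)
    denom = m if m0 is None else m0
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / denom:
            k = rank  # keep the largest rank that passes
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected
```

For example, `bh(p, m0=estimate_m0(p))` uses larger critical values than `bh(p)` whenever the estimate is below m, which is the source of the power gain described above.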

9.

Multiple testing via FDR for large scale imaging data

Zhang C, Fan J, Yu T 《Annals of statistics》2011,39(1):613-642

The multiple testing procedure plays an important role in detecting the presence of spatial signals for large scale imaging data. Typically, the spatial signals are sparse but clustered. This paper provides empirical evidence that for a range of commonly used control levels, the conventional FDR procedure can lack the ability to detect statistical significance, even if the p-values under the true null hypotheses are independent and uniformly distributed; more generally, ignoring the neighboring information of spatially structured data will tend to diminish the detection effectiveness of the FDR procedure. This paper first introduces a scalar quantity to characterize the extent to which the "lack of identification phenomenon" (LIP) of the FDR procedure occurs. Second, we propose a new multiple comparison procedure, called FDR(L), to accommodate the spatial information of neighboring p-values, via a local aggregation of p-values. Theoretical properties of the FDR(L) procedure are investigated under weak dependence of p-values. It is shown that the FDR(L) procedure alleviates the LIP of the FDR procedure, thus substantially facilitating the selection of more stringent control levels. Simulation evaluations indicate that the FDR(L) procedure improves the detection sensitivity of the FDR procedure with little loss in detection specificity. The computational simplicity and detection effectiveness of the FDR(L) procedure are illustrated through a real brain fMRI dataset.

10.

The false discovery rate: a variable selection perspective

《Journal of statistical planning and inference》2006,136(8):2668-2684


11.

Some Results on the Control of the False Discovery Rate under Dependence

Alessio Farcomeni 《Scandinavian Journal of Statistics》2007,34(2):275-297

Abstract. Controlling the false discovery rate (FDR) is a powerful approach to multiple testing, with procedures developed for applications in many areas. Dependence among the test statistics is a common problem, and many attempts have been made to extend the procedures. In this paper, we show that a certain degree of dependence is allowed among the test statistics, when the number of tests is large, with no need for any correction. We then suggest a way to conservatively estimate the proportion of false nulls, both under dependence and independence, and discuss the advantages of using such estimators when controlling the FDR.

12.

Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach

John D. Storey, Jonathan E. Taylor, David Siegmund 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2004,66(1):187-205

Summary. The false discovery rate (FDR) is a multiple hypothesis testing quantity that describes the expected proportion of false positive results among all rejected null hypotheses. Benjamini and Hochberg introduced this quantity and proved that a particular step-up p-value method controls the FDR. Storey introduced a point estimate of the FDR for fixed significance regions. The former approach conservatively controls the FDR at a fixed predetermined level, and the latter provides a conservatively biased estimate of the FDR for a fixed predetermined significance region. In this work, we show in both finite sample and asymptotic settings that the goals of the two approaches are essentially equivalent. In particular, the FDR point estimates can be used to define valid FDR controlling procedures. In the asymptotic setting, we also show that the point estimates can be used to estimate the FDR conservatively over all significance regions simultaneously, which is equivalent to controlling the FDR at all levels simultaneously. The main tool that we use is to translate existing FDR methods into procedures involving empirical processes. This simplifies finite sample proofs, provides a framework for asymptotic results and proves that these procedures are valid even under certain forms of dependence.
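The two viewpoints in the abstract, estimating the FDR of a fixed region and turning that estimate into a thresholding rule, can be sketched as below. This is a simplified form, assuming the significance region is "p ≤ t" and a proportion-of-nulls estimate based on the p-values above a tuning point `lam`; the exact estimator and the conditions under which control is proved are in the paper and are not reproduced here. Names are hypothetical.

```python
def storey_fdr_estimate(pvalues, t, lam=0.5):
    """Point estimate of the FDR when rejecting every p-value <= t."""
    m = len(pvalues)
    # Estimated proportion of true nulls from the p-values above lam.
    pi0 = sum(p > lam for p in pvalues) / ((1 - lam) * m)
    # Number of rejections at threshold t (at least 1 to avoid dividing by 0).
    r = max(sum(p <= t for p in pvalues), 1)
    return min(1.0, pi0 * m * t / r)


def threshold_from_estimate(pvalues, alpha=0.05, lam=0.5):
    """Largest observed p-value whose estimated FDR is at most alpha.

    Rejecting every p-value up to this threshold turns the point
    estimate into a thresholding rule; returns None if none qualifies.
    """
    ok = [p for p in sorted(pvalues)
          if storey_fdr_estimate(pvalues, p, lam) <= alpha]
    return ok[-1] if ok else None
```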

13.

Controlling Bayes directional false discovery rate in random effects model

Sanat K. Sarkar, Tianhui Zhou 《Journal of statistical planning and inference》2008

Starting with a decision theoretic formulation of simultaneous testing of null hypotheses against two-sided alternatives, a procedure controlling the Bayesian directional false discovery rate (BDFDR) is developed through controlling the posterior directional false discovery rate (PDFDR). This is an alternative to Lewis and Thayer [2004. A loss function related to the FDR for random effects multiple comparison. J. Statist. Plann. Inference 125, 49–58.] with a better control of the BDFDR. Moreover, it is optimum in the sense of being the non-randomized part of the procedure maximizing the posterior expectation of the directional per-comparison power rate given the data, while controlling the PDFDR. A corresponding empirical Bayes method is proposed in the context of the one-way random effects model. A simulation study shows that the proposed Bayes and empirical Bayes methods perform much better from a Bayesian perspective than the procedures available in the literature.

14.

Efficient Stratified Testing Procedure for a False Discovery Rate

Seungbong Han, Adin-Cristian Andrei, Kam-Wah Tsui 《Communications in Statistics - Simulation and Computation》2015,44(5):1117-1125

The false discovery rate (FDR) has become a popular error measure in large-scale simultaneous testing. When data are collected from heterogeneous sources and form grouped hypothesis tests, it may be beneficial to use the distinct features of the groups when conducting the multiple hypothesis tests. We propose a stratified testing procedure that uses different FDR levels according to the stratification features based on p-values. Our proposed method is easy to implement in practice. Simulation studies show that the proposed method produces more efficient testing results. The stratified testing procedure minimizes the overall false negative rate (FNR) level, while controlling the overall FDR. An example from a type II diabetes mice study further illustrates the practical advantages of this new approach.

15.

A direct approach to false discovery rates

John D. Storey 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(3):479-498

Summary. Multiple-hypothesis testing involves guarding against much more complicated errors than single-hypothesis testing. Whereas we typically control the type I error rate for a single-hypothesis test, a compound error rate is controlled for multiple-hypothesis tests. For example, controlling the false discovery rate (FDR) traditionally involves intricate sequential p-value rejection methods based on the observed data. Whereas a sequential p-value method fixes the error rate and estimates its corresponding rejection region, we propose the opposite approach: we fix the rejection region and then estimate its corresponding error rate. This new approach offers increased applicability, accuracy and power. We apply the methodology to both the positive false discovery rate (pFDR) and the FDR, and provide evidence for its benefits. It is shown that pFDR is probably the quantity of interest over FDR. Also discussed is the calculation of the q-value, the pFDR analogue of the p-value, which eliminates the need to set the error rate beforehand as is traditionally done. Some simple numerical examples are presented that show that this new approach can yield an increase of over eight times in power compared with the Benjamini–Hochberg FDR method.
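The q-value idea, reporting for each test the smallest estimated error rate at which it would be rejected, can be sketched as follows. This is a simplified version that plugs in an FDR-type estimate for regions of the form "p ≤ t" and a proportion-of-nulls estimate with tuning point `lam`; the pFDR-based definition in the paper includes further details that are omitted here. Names are hypothetical.

```python
def q_values(pvalues, lam=0.5):
    """Simplified q-values: minimum estimated FDR over all thresholds
    t >= p_i, evaluated at the observed p-values."""
    m = len(pvalues)
    # Estimated proportion of true nulls, capped at 1.
    pi0 = min(1.0, sum(p > lam for p in pvalues) / ((1 - lam) * m))
    # Walk from the largest p-value down, carrying the running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    q = [0.0] * m
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = m - pos  # rank of p_i among the sorted p-values (ties ignored)
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q
```

Rejecting every hypothesis with a q-value at or below a chosen level then replaces the step of fixing the error rate before looking at the data.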

16.

A method for modifying multiple testing procedures

Joshua D. Habiger 《Journal of statistical planning and inference》2012

Many recent multiple testing papers have provided more efficient and/or robust methodology for control of a particular error rate. However, different multiple testing scenarios call for the control of different error rates. Hence, the procedure possessing the desired optimality and/or robustness properties may not be applicable to the problem at hand. This paper provides a general method for extending any multiple testing procedure to control any error rate, thereby allowing for the procedure possessing the desired properties to be used to control the most relevant error rate. As an example, two popular procedures that were originally designed to control the marginal and positive False Discovery Rate are extended to control the False Discovery Rate and Family-wise Error Rate. It is shown that optimality and/or robustness properties of the original procedure are retained when it is modified using the proposed method.

17.

Controlling type I error rates in multi-arm clinical trials: A case for the false discovery rate

James M. S. Wason, David S. Robertson 《Pharmaceutical statistics》2021,20(1):109-116

Multi-arm trials are an efficient way of simultaneously testing several experimental treatments against a shared control group. As well as reducing the sample size required compared to running each trial separately, they have important administrative and logistical advantages. There has been debate over whether multi-arm trials should correct for the fact that multiple null hypotheses are tested within the same experiment. Previous opinions have ranged from no correction being required to a stringent correction (controlling the probability of making at least one type I error) being needed, with regulators arguing the latter for confirmatory settings. In this article, we propose that controlling the false-discovery rate (FDR) is a suitable compromise, with an appealing interpretation in multi-arm clinical trials. We investigate the properties of the different correction methods in terms of the positive and negative predictive value (respectively how confident we are that a recommended treatment is effective and that a non-recommended treatment is ineffective). The number of arms and the proportion of treatments that are truly effective are varied. Controlling the FDR provides good properties. It retains the high positive predictive value of FWER correction in situations where a low proportion of treatments is effective. It also has a good negative predictive value in situations where a high proportion of treatments is effective. In a multi-arm trial testing distinct treatment arms, we recommend that sponsors and trialists consider use of the FDR.

18.

Estimating the number of true null hypotheses in multiple hypothesis testing

Yi-Ting Hwang, Hsun-Chih Kuo, Chun-Chao Wang, Meng Feng Lee 《Statistics and Computing》2014,24(3):399-416

The overall Type I error computed by traditional means may be inflated if many hypotheses are compared simultaneously. The family-wise error rate (FWER) and false discovery rate (FDR) are among the commonly used error rates for measuring Type I error in the multiple hypothesis setting. Many FWER- and FDR-controlling procedures have been proposed and are able to control the desired FWER/FDR under certain scenarios. Nevertheless, these controlling procedures become too conservative when only some hypotheses are from the null. Benjamini and Hochberg (J. Educ. Behav. Stat. 25:60–83, 2000) proposed an adaptive FDR-controlling procedure that incorporates information on the number of true null hypotheses (m₀) to overcome this problem. Since m₀ is unknown, estimators of m₀ are needed. Benjamini and Hochberg (J. Educ. Behav. Stat. 25:60–83, 2000) suggested a graphical approach to construct an estimator of m₀, which is shown to overestimate m₀ (see Hwang in J. Stat. Comput. Simul. 81:207–220, 2011). Following a similar construction, this paper proposes new estimators of m₀. Monte Carlo simulations are used to evaluate the accuracy and precision of the new estimators, and the feasibility of the resulting adaptive procedures is evaluated under various simulation settings.

19.

QuickMMCTest: quick multiple Monte Carlo testing

Axel Gandy, Georg Hahn 《Statistics and Computing》2017,27(3):823-832

Multiple hypothesis testing is widely used to evaluate scientific studies involving statistical tests. However, for many of these tests, p values are not available and are thus often approximated using Monte Carlo tests such as permutation tests or bootstrap tests. This article presents a simple algorithm based on Thompson Sampling to test multiple hypotheses. It works with arbitrary multiple testing procedures, in particular with step-up and step-down procedures. Its main feature is to sequentially allocate Monte Carlo effort, generating more Monte Carlo samples for tests whose decisions are so far less certain. A simulation study demonstrates that for a low computational effort, the new approach yields a higher power and a higher degree of reproducibility of its results than previously suggested methods.
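For context, the kind of Monte Carlo p-value the abstract refers to can be sketched with a plain two-sample permutation test. This is not the QuickMMCTest algorithm and involves no Thompson Sampling: it spends a fixed number of permutations on a single hypothesis, which is the naive baseline that sequential allocation schemes aim to improve on. The test statistic (absolute difference in means) and all names are assumptions made for illustration.

```python
import random


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Monte Carlo p-value for a two-sample difference in means.

    Uses the estimate (1 + number of exceedances) / (1 + n_perm),
    which can never be exactly zero.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        xs, ys = pooled[:len(x)], pooled[len(x):]
        if abs(sum(xs) / len(xs) - sum(ys) / len(ys)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

With m hypotheses this baseline costs `m * n_perm` resampling steps regardless of how clear-cut each decision is, which motivates allocating more samples to the hypotheses whose decisions remain uncertain.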

20.

Closure properties of classes of multiple testing procedures

Georg Hahn 《AStA Advances in Statistical Analysis》2018,102(2):167-178

Statistical discoveries are often obtained through multiple hypothesis testing. A variety of procedures exists to evaluate multiple hypotheses, for instance those of Benjamini–Hochberg, Bonferroni, Holm or Sidak. We are particularly interested in multiple testing procedures with two desired properties: (solely) monotonic and well-behaved procedures. This article investigates to what extent the classes of (monotonic or well-behaved) multiple testing procedures, in particular the subclasses of so-called step-up and step-down procedures, are closed under basic set operations, specifically the union, intersection, difference and the complement of sets of rejected or non-rejected hypotheses. The present article proves two main results: First, taking the union or intersection of arbitrary (monotonic or well-behaved) multiple testing procedures results in new procedures which are monotonic but not well-behaved, whereas the complement or difference generally preserves neither property. Second, the two classes of (solely monotonic or well-behaved) step-up and step-down procedures are closed under taking the union or intersection, but not the complement or difference.