Similar literature
20 similar documents found.
1.
Traditional multiple hypothesis testing procedures fix an error rate and determine the corresponding rejection region. In 2002 Storey proposed a fixed rejection region procedure and showed numerically that it can gain more power than the fixed error rate procedure of Benjamini and Hochberg while controlling the same false discovery rate (FDR). In this paper it is proved that when the number of alternatives is small compared to the total number of hypotheses, Storey's method can be less powerful than that of Benjamini and Hochberg. Moreover, the two procedures are compared by setting them to produce the same FDR. The difference in power between Storey's procedure and that of Benjamini and Hochberg is near zero when the distance between the null and alternative distributions is large, but Benjamini and Hochberg's procedure becomes more powerful as the distance decreases. It is shown that modifying the Benjamini and Hochberg procedure to incorporate an estimate of the proportion of true null hypotheses as proposed by Black gives a procedure with superior power.

2.
Summary. The false discovery rate (FDR) is a multiple hypothesis testing quantity that describes the expected proportion of false positive results among all rejected null hypotheses. Benjamini and Hochberg introduced this quantity and proved that a particular step-up p-value method controls the FDR. Storey introduced a point estimate of the FDR for fixed significance regions. The former approach conservatively controls the FDR at a fixed predetermined level, and the latter provides a conservatively biased estimate of the FDR for a fixed predetermined significance region. In this work, we show in both finite sample and asymptotic settings that the goals of the two approaches are essentially equivalent. In particular, the FDR point estimates can be used to define valid FDR controlling procedures. In the asymptotic setting, we also show that the point estimates can be used to estimate the FDR conservatively over all significance regions simultaneously, which is equivalent to controlling the FDR at all levels simultaneously. The main tool that we use is to translate existing FDR methods into procedures involving empirical processes. This simplifies finite sample proofs, provides a framework for asymptotic results and proves that these procedures are valid even under certain forms of dependence.
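
The two approaches contrasted in this abstract can be sketched side by side. The block below is a minimal illustration, not the authors' code: `bh_reject` implements the standard Benjamini-Hochberg step-up rule at a fixed level, and `storey_fdr_estimate` computes a Storey-type point estimate of the FDR for a fixed region {p <= t}. The function names, the tuning value `lam=0.5`, and the cap at 1 are assumptions made for the example.

```python
import numpy as np


def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule: boolean rejection mask at level alpha."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with alpha * k / m for k = 1..m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank that meets its threshold
        reject[order[: k + 1]] = True   # reject everything up to that rank
    return reject


def storey_fdr_estimate(pvals, t, lam=0.5):
    """Storey-type point estimate of the FDR for the fixed region {p <= t}.

    lam is a tuning parameter (assumed 0.5 here): p-values above lam are
    treated as coming mostly from true nulls when estimating pi0.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    pi0_hat = np.sum(p > lam) / (m * (1.0 - lam))  # estimated null proportion
    r = max(int(np.sum(p <= t)), 1)                # rejections, guarded against 0
    return min(pi0_hat * m * t / r, 1.0)
```

With the four p-values 0.001, 0.008, 0.6 and 0.9, `bh_reject` at level 0.05 rejects the two smallest, and `storey_fdr_estimate` at `t=0.01` returns 0.02.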

3.
In this note, we focus on estimating the false discovery rate (FDR) of a multiple testing method with a common, non-random rejection threshold under a mixture model. We develop a new class of estimates of the FDR and prove that it is less conservatively biased than what is traditionally used. Numerical evidence is presented to show that the mean squared error (MSE) is also often smaller for the present class of estimates, especially in small-scale multiple testing. A similar class of estimates of the positive false discovery rate (pFDR), less conservatively biased than what is usually used, is then proposed. When modified using our estimate of the pFDR and applied to gene-expression data, Storey's q-value method identifies a few more significant genes than his original q-value method at certain thresholds. The BH-like method developed by thresholding our estimate of the FDR is shown to control the FDR in situations where the p-values have the same dependence structure as required by the BH method and, for lack of information about the proportion π0 of true null hypotheses, it is reasonable to assume that π0 is uniformly distributed over (0,1).

4.
Summary. The use of a fixed rejection region for multiple hypothesis testing has been shown to outperform standard fixed error rate approaches when applied to control of the false discovery rate. In this work it is demonstrated that, if the original step-up procedure of Benjamini and Hochberg is modified to exercise adaptive control of the false discovery rate, its performance is virtually identical to that of the fixed rejection region approach. In addition, the dependence of both methods on the proportion of true null hypotheses is explored, with a focus on the difficulties that are involved in the estimation of this quantity.

5.
Microarray studies are now common in human, agricultural plant and animal research. False discovery rate (FDR) is widely used in the analysis of large-scale microarray data to account for problems associated with multiple testing. A well-designed microarray study should have adequate statistical power to detect the differentially expressed (DE) genes, while keeping the FDR acceptably low. In this paper, we use a mixture model of expression responses involving DE genes and non-DE genes to analyse theoretical FDR and power for simple scenarios where it is assumed that each gene has equal error variance and the gene effects are independent. A simulation study is used to evaluate the empirical FDR and power for more complex scenarios with unequal error variance and gene dependence. Based on this approach, we present a general guide for sample size requirements at the experimental design stage for prospective microarray studies. This paper presents an approach to explicitly connect the sample size with FDR and power. While the methods have been developed in the context of one-sample microarray studies, they are readily applicable to two-sample studies, and could be adapted to multiple-sample studies.

6.
Selecting predictors to optimize the outcome prediction is an important statistical method. However, it usually ignores the false positives in the selected predictors. In this article, we advocate a conventional stepwise forward variable selection method based on the predicted residual sum of squares, and develop a positive false discovery rate (pFDR) estimate for the selected predictor subset, and a local pFDR estimate to prioritize the selected predictors. This pFDR estimate takes account of the existence of non null predictors, and is proved to be asymptotically conservative. In addition, we propose two views of a variable selection process: an overall and an individual test. An interesting feature of the overall test is that its power of selecting non null predictors increases with the proportion of non null predictors among all candidate predictors. Data analysis is illustrated with an example, in which genetic and clinical predictors were selected to predict the cholesterol level change after four months of tamoxifen treatment, and pFDR was estimated. Our method's performance is evaluated through statistical simulations.

7.
Multiple hypothesis testing literature has recently experienced a growing development with particular attention to the control of the false discovery rate (FDR) based on p-values. While these are not the only methods to deal with multiplicity, inference with small samples and large sets of hypotheses depends on the specific choice of the p-value used to control the FDR in the presence of nuisance parameters. In this paper we propose to use the partial posterior predictive p-value [Bayarri, M.J., Berger, J.O., 2000. p-values for composite null models. J. Amer. Statist. Assoc. 95, 1127–1142] that overcomes this difficulty. This choice is motivated by theoretical considerations and examples. Finally, an application to a controlled microarray experiment is presented.

8.
Summary. We investigate the operating characteristics of the Benjamini–Hochberg false discovery rate procedure for multiple testing. This is a distribution-free method that controls the expected fraction of falsely rejected null hypotheses among those rejected. The paper provides a framework for understanding more about this procedure. We first study the asymptotic properties of the 'deciding point' D that determines the critical p-value. From this, we obtain explicit asymptotic expressions for a particular risk function. We introduce the dual notion of false non-rejections and we consider a risk function that combines the false discovery rate and false non-rejections. We also consider the optimal procedure with respect to a measure of conditional risk.

9.
High-throughput data analyses are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. False discovery rate (FDR) has been considered a proper type I error rate to control for discovery-based high-throughput data analysis. Various multiple testing procedures have been proposed to control the FDR. The power and stability properties of some commonly used multiple testing procedures have not been extensively investigated yet, however. Simulation studies were conducted to compare power and stability properties of five widely used multiple testing procedures at different proportions of true discoveries for various sample sizes for both independent and dependent test statistics. Storey's two linear step-up procedures showed the best performance among all tested procedures considering FDR control, power, and variance of true discoveries. Leukaemia and ovarian cancer microarray studies were used to illustrate the power and stability characteristics of these five multiple testing procedures with FDR control.

10.
Abstract. This paper is concerned with exact control of the false discovery rate (FDR) for step-up-down (SUD) tests related to the asymptotically optimal rejection curve (AORC). Since the system of equations and/or constraints for critical values and FDRs is numerically extremely sensitive, existence and computation of valid solutions is a challenging problem. We derive explicit formulas for upper bounds of the FDR and show that under a well-known monotonicity condition, control of the FDR by a step-up procedure results in control of the FDR by a corresponding SUD procedure. Various methods for adjusting the AORC to achieve finite FDR control are investigated. Moreover, we introduce alternative FDR bounding curves and study their connection to rejection curves as well as the existence of critical values for exact FDR control with respect to the underlying FDR bounding curve. Finally, we propose an iterative method for the computation of critical values.

11.
In many scientific fields, it is interesting and important to determine whether an observed data stream comes from a prespecified model or not, particularly when the number of data streams is of large scale, where multiple hypotheses testing is necessary. In this article, we consider large-scale model checking under certain dependence among different data streams observed at the same time. We propose a false discovery rate (FDR) control procedure to check those unusual data streams. Specifically, we derive an approximation of false discovery and construct a point estimate of FDR. Theoretical results show that, under some mild assumptions, our proposed estimate of FDR is simultaneously conservatively consistent with the true FDR, and hence it is an asymptotically strong control procedure. Simulation comparisons with some competing procedures show that our proposed FDR procedure behaves better in general settings. Application of our proposed FDR procedure is illustrated by the StarPlus fMRI data.

12.
Abstract. Controlling the false discovery rate (FDR) is a powerful approach to multiple testing, with procedures developed with applications in many areas. Dependence among the test statistics is a common problem, and many attempts have been made to extend the procedures. In this paper, we show that a certain degree of dependence is allowed among the test statistics, when the number of tests is large, with no need for any correction. We then suggest a way to conservatively estimate the proportion of false nulls, both under dependence and independence, and discuss the advantages of using such estimators when controlling the FDR.

13.
Simultaneously testing a family of n null hypotheses can arise in many applications. A common problem in multiple hypothesis testing is to control Type-I error. The probability of at least one false rejection, referred to as the familywise error rate (FWER), is one of the earliest error rate measures. Many FWER-controlling procedures have been proposed. The ability to control the FWER and achieve higher power is often used to evaluate the performance of a controlling procedure. However, when testing multiple hypotheses, FWER and power are not sufficient for evaluating a controlling procedure's performance. Furthermore, the performance of a controlling procedure is also governed by experimental parameters such as the number of hypotheses, sample size, the number of true null hypotheses and data structure. This paper evaluates, under various experimental settings, the performance of some FWER-controlling procedures in terms of five indices: the FWER, the false discovery rate, the false non-discovery rate, the sensitivity and the specificity. The results can provide guidance on how to select an appropriate FWER-controlling procedure to meet a study's objective.
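
The five indices named in this abstract can be computed for a single simulated data set once the true status of each hypothesis is known. The sketch below is an illustration under that assumption, not the paper's code; the function name `error_indices` and the dictionary keys are hypothetical. Averaging the per-replicate values over many simulated data sets would give empirical estimates of the FWER, FDR and false non-discovery rate.

```python
import numpy as np


def error_indices(is_true_null, rejected):
    """Per-replicate error indices from known truth and a rejection mask."""
    h0 = np.asarray(is_true_null, dtype=bool)
    rej = np.asarray(rejected, dtype=bool)
    v = int(np.sum(h0 & rej))     # false rejections (true nulls rejected)
    s = int(np.sum(~h0 & rej))    # true rejections (false nulls rejected)
    t = int(np.sum(~h0 & ~rej))   # false non-rejections
    u = int(np.sum(h0 & ~rej))    # true non-rejections
    return {
        # Mean over replicates estimates the FWER.
        "any_false_rejection": v > 0,
        # False discovery proportion; its mean estimates the FDR.
        "fdp": v / max(v + s, 1),
        # False non-discovery proportion; its mean estimates the FNR.
        "fnp": t / max(t + u, 1),
        "sensitivity": s / max(s + t, 1),
        "specificity": u / max(u + v, 1),
    }
```

The `max(..., 1)` guards set a proportion to zero when its denominator is empty, which matches the usual convention that the false discovery proportion is 0 when nothing is rejected.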

14.
Recently, the field of multiple hypothesis testing has experienced a great expansion, mainly because of the new methods developed in the field of genomics. These new methods allow scientists to simultaneously process thousands of hypothesis tests. The frequentist approach to this problem uses different testing error measures that allow the Type I error rate to be controlled at a certain desired level. Alternatively, in this article, a Bayesian hierarchical model based on mixture distributions and an empirical Bayes approach are proposed in order to produce a list of rejected hypotheses that will be declared significant and interesting for a more detailed posterior analysis. In particular, we develop a straightforward implementation of a Gibbs sampling scheme where all the conditional posterior distributions are explicit. The results are compared with the frequentist False Discovery Rate (FDR) methodology. Simulation examples show that our model improves on the FDR procedure in the sense that it reduces the percentage of false negatives while keeping an acceptable percentage of false positives.

15.
Most current false discovery rate (FDR) procedures for microarray experiments assume restrictive dependence structures, which makes them less reliable. An FDR controlling procedure under suitable dependence structures, based on a Poisson distributional approximation, is presented. Unlike other procedures, the distribution of false null hypotheses is estimated by using kernel density estimation, allowing for dependence structures among the genes. Furthermore, we develop an FDR framework that minimizes the false nondiscovery rate (FNR) with a constraint on the controlled level of the FDR. The performance of the proposed FDR procedure is compared with that of other existing FDR controlling procedures, with an application to the microarray study of simulated data.

16.
Abstract. We propose a confidence envelope for false discovery control when testing multiple hypotheses of association simultaneously. The method is valid under arbitrary and unknown dependence between the test statistics and allows for an exploratory approach when choosing suitable rejection regions while still retaining strong control over the proportion of false discoveries.

17.
Case-control studies of genetic polymorphisms and gene-environment interactions are reporting large numbers of statistically significant associations, many of which are likely to be spurious. This problem reflects the low prior probability that any one null hypothesis is false, and the large number of test results reported for a given study. In a Bayesian approach to the low prior probabilities, Wacholder et al. (2004) suggest supplementing the p-value for a hypothesis with its posterior probability given the study data. In a frequentist approach to the test multiplicity problem, Benjamini & Hochberg (1995) propose a hypothesis-rejection rule that provides greater statistical power by controlling the false discovery rate rather than the family-wise error rate controlled by the Bonferroni correction. This paper defines a Bayes false discovery rate and proposes a Bayes-based rejection rule for controlling it. The method, which combines the Bayesian approach of Wacholder et al. with the frequentist approach of Benjamini & Hochberg, is used to evaluate the associations reported in a case-control study of breast cancer risk and genetic polymorphisms of genes involved in the repair of double-strand DNA breaks.

18.
The false discovery rate (FDR) has become a popular error measure in large-scale simultaneous testing. When data are collected from heterogeneous sources and the hypotheses form groups, it may be beneficial to use the distinct features of the groups when conducting the multiple hypothesis tests. We propose a stratified testing procedure that uses different FDR levels according to the stratification features based on p-values. Our proposed method is easy to implement in practice. Simulation studies show that the proposed method produces more efficient testing results. The stratified testing procedure minimizes the overall false negative rate (FNR) level, while controlling the overall FDR. An example from a type II diabetes mice study further illustrates the practical advantages of this new approach.

19.
Many exploratory studies such as microarray experiments require the simultaneous comparison of hundreds or thousands of genes. In many microarray experiments, most genes are not expected to be differentially expressed. Under such a setting, a procedure that is designed to control the false discovery rate (FDR) is aimed at identifying as many potential differentially expressed genes as possible. The usual FDR controlling procedure is constructed based on the number of hypotheses. However, it can become very conservative when some of the alternative hypotheses are expected to be true. The power of a controlling procedure can be improved if the number of true null hypotheses (m0) instead of the number of hypotheses is incorporated in the procedure [Y. Benjamini and Y. Hochberg, On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Edu. Behav. Statist. 25 (2000), pp. 60–83]. Nevertheless, m0 is unknown, and has to be estimated. The objective of this article is to evaluate some existing estimators of m0 and discuss the feasibility of incorporating these estimators into FDR controlling procedures under various experimental settings. The results of simulations can help the investigator to choose an appropriate procedure to meet the requirement of the study.
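
The idea of replacing the number of hypotheses m by an estimate of m0 can be sketched as follows. This is an illustration only: the estimator in `estimate_m0` is a simple threshold-based (Storey-type) choice with an assumed tuning value `lam=0.5`, picked for brevity and not necessarily one of the estimators evaluated in the article, and both function names are hypothetical.

```python
import numpy as np


def estimate_m0(pvals, lam=0.5):
    """Threshold-based estimate of the number of true nulls (illustrative).

    Counts p-values above lam and rescales by 1 / (1 - lam), then clips the
    result to the range [1, m] so it can be used safely as a divisor.
    """
    p = np.asarray(pvals, dtype=float)
    m0_hat = np.sum(p > lam) / (1.0 - lam)
    return float(min(max(m0_hat, 1.0), p.size))


def adaptive_bh(pvals, alpha=0.05, lam=0.5):
    """Step-up rule with thresholds alpha * k / m0_hat instead of alpha * k / m."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    m0_hat = estimate_m0(p, lam)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m0_hat
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank that meets its threshold
        reject[order[: k + 1]] = True
    return reject
```

For the eight p-values 0.001, 0.008, 0.02, 0.03, 0.2, 0.3, 0.6 and 0.9, the estimate of m0 is 4, and the adaptive rule at level 0.05 rejects four hypotheses, whereas thresholds based on m = 8 would reject only the two smallest.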

20.
Summary. The paper considers the problem of multiple testing under dependence in a compound decision theoretic framework. The observed data are assumed to be generated from an underlying two-state hidden Markov model. We propose oracle and asymptotically optimal data-driven procedures that aim to minimize the false non-discovery rate (FNR) subject to a constraint on the false discovery rate (FDR). It is shown that the performance of a multiple-testing procedure can be substantially improved by adaptively exploiting the dependence structure among hypotheses, and hence conventional FDR procedures that ignore this structural information are inefficient. Both theoretical properties and numerical performances of the procedures proposed are investigated. It is shown that the procedures proposed control FDR at the desired level, enjoy certain optimality properties and are especially powerful in identifying clustered non-null cases. The new procedure is applied to an influenza-like illness surveillance study for detecting the timing of epidemic periods.
