Similar Articles
 20 similar articles retrieved (search time: 546 ms)
1.
In many scientific fields it is important to determine whether an observed data stream comes from a prespecified model, particularly when the number of data streams is large, so that multiple hypothesis testing is necessary. In this article, we consider large-scale model checking under certain dependence among data streams observed at the same time. We propose a false discovery rate (FDR) control procedure for flagging unusual data streams. Specifically, we derive an approximation for the number of false discoveries and construct a point estimate of the FDR. Theoretical results show that, under mild assumptions, the proposed FDR estimate is simultaneously conservatively consistent with the true FDR, and hence the procedure provides asymptotically strong control. Simulation comparisons with competing procedures show that the proposed FDR procedure behaves better in general settings. The procedure is illustrated with the StarPlus fMRI data.

2.
Summary. The paper considers the problem of multiple testing under dependence in a compound decision theoretic framework. The observed data are assumed to be generated from an underlying two-state hidden Markov model. We propose oracle and asymptotically optimal data-driven procedures that aim to minimize the false non-discovery rate (FNR) subject to a constraint on the false discovery rate (FDR). It is shown that the performance of a multiple-testing procedure can be substantially improved by adaptively exploiting the dependence structure among hypotheses, and hence conventional FDR procedures that ignore this structural information are inefficient. Both the theoretical properties and the numerical performance of the proposed procedures are investigated. It is shown that the proposed procedures control the FDR at the desired level, enjoy certain optimality properties and are especially powerful in identifying clustered non-null cases. The new procedure is applied to an influenza-like illness surveillance study for detecting the timing of epidemic periods.

3.
The false discovery rate (FDR) has become a popular error measure in large-scale simultaneous testing. When data are collected from heterogeneous sources and the hypotheses form natural groups, it may be beneficial to use the distinct features of the groups when conducting the multiple tests. We propose a stratified testing procedure, based on p-values, that uses different FDR levels according to the stratification features. The proposed method is easy to implement in practice. Simulation studies show that it produces more efficient testing results: the stratified testing procedure minimizes the overall false negative rate (FNR) while controlling the overall FDR. An example from a type II diabetes mice study further illustrates the practical advantages of this new approach.
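As a point of reference, here is a minimal sketch of the general stratified idea, not the paper's specific procedure: the standard Benjamini–Hochberg (BH) step-up rule is applied separately within each stratum, each stratum at its own FDR level. The inputs (`pvals`, `strata`, and the per-stratum levels in `alphas`) are hypothetical names chosen for illustration.

```python
import numpy as np

def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up at level alpha; returns a boolean rejection vector."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0  # largest i with p_(i) <= i*alpha/m
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

def stratified_bh(pvals, strata, alphas):
    """Run BH separately within each stratum, each at its own FDR level alphas[g]."""
    pvals, strata = np.asarray(pvals, dtype=float), np.asarray(strata)
    reject = np.zeros(len(pvals), dtype=bool)
    for g, alpha_g in alphas.items():
        idx = np.where(strata == g)[0]
        if idx.size:
            reject[idx] = bh_reject(pvals[idx], alpha_g)
    return reject

# Example: two strata tested at different FDR levels
# rej = stratified_bh(p, groups, alphas={"A": 0.01, "B": 0.10})
```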

4.
A two-stage step-up procedure is defined and an explicit formula for its FDR is derived under any distributional setting. Sets of critical values are determined that control the FDR of a two-stage step-up procedure under an i.i.d. mixture model. A class of two-stage FDR procedures is obtained that modifies the Benjamini–Hochberg (BH) procedure and contains the one given in Storey et al. [2004. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J. Roy. Statist. Soc. Ser. B 66, 187–205]. The FDR-controlling property of the Storey–Taylor–Siegmund procedure is proved only under independence, via an argument different from that presented by these authors. A single-stage step-up procedure controlling the FDR under any form of dependence, which is different from, and in some situations performs better than, the Benjamini–Yekutieli (BY) procedure, is given before discussing how to obtain two-stage versions of the BY and this new procedure. Simulations reveal that the procedures proposed in this article under the mixture model can perform quite well in terms of improving the FDR control of the BH procedure. However, the similar idea of improving the FDR control of a step-up procedure under any form of dependence does not seem to work.

5.
Multiple comparisons of the effects of several treatments with a control (MCC) is a central problem in medicine and other areas. Nearly all existing papers are devoted to comparing the means of the effects. To study medical problems more deeply, one needs more information than the mean relationships in the given data; more useful and deeper conclusions can be expected by comparing the probability distributions, i.e., by comparison under stochastic orders. This paper presents a likelihood ratio testing procedure for comparing effects under stochastic order in MCC problems while controlling the false discovery rate (FDR). Constructing a test that controls the FDR under stochastic order raises several nontrivial problems, which are analyzed and solved in this paper. To make the test easier to apply, asymptotic p-values are used and their distributions are derived. It is shown that FDR control for this comparison procedure is guaranteed. A real data example illustrates how to apply the testing procedure and what the test can tell. Simulation results show that the procedure works quite well, better than some other tests.

6.
The overall Type I error computed in the traditional way may be inflated when many hypotheses are compared simultaneously. The family-wise error rate (FWER) and the false discovery rate (FDR) are commonly used error rates for measuring Type I error in the multiple-hypothesis setting. Many FWER- and FDR-controlling procedures have been proposed and are able to control the desired FWER/FDR under certain scenarios. Nevertheless, these procedures become too conservative when only some of the hypotheses are true nulls. Benjamini and Hochberg (J. Educ. Behav. Stat. 25:60–83, 2000) proposed an adaptive FDR-controlling procedure that incorporates information about the number of true null hypotheses (m0) to overcome this problem. Since m0 is unknown, estimators of m0 are needed. Benjamini and Hochberg (2000) suggested a graphical approach to constructing an estimator of m0, which is shown to overestimate m0 (see Hwang in J. Stat. Comput. Simul. 81:207–220, 2011). Following a similar construction, this paper proposes new estimators of m0. Monte Carlo simulations are used to evaluate the accuracy and precision of the new estimators, and the feasibility of the resulting adaptive procedures is evaluated under various simulation settings.
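For illustration, here is a minimal sketch of the adaptive idea, using Storey's λ-based estimator of m0 rather than the graphical estimators discussed above (those are not reproduced here), followed by a BH step-up run with m0 replacing m in the critical values; the function names are hypothetical.

```python
import numpy as np

def storey_m0(pvals, lam=0.5):
    """Estimate the number of true nulls m0: under the null, p-values are Uniform(0,1),
    so the count of p-values above lam is roughly m0 * (1 - lam)."""
    pvals = np.asarray(pvals, dtype=float)
    m0_hat = np.sum(pvals > lam) / (1.0 - lam)
    return float(np.clip(m0_hat, 1.0, len(pvals)))

def adaptive_bh(pvals, alpha=0.05, lam=0.5):
    """Adaptive BH: replace m by the estimate m0_hat in the BH critical values i*alpha/m."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    m0_hat = storey_m0(pvals, lam)
    order = np.argsort(pvals)
    below = pvals[order] <= alpha * np.arange(1, m + 1) / m0_hat
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject, m0_hat
```

The smaller the estimate m0_hat relative to m, the larger the effective critical values and hence the larger the power gain over the ordinary BH procedure.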

7.
Summary. To help design vaccines for acquired immune deficiency syndrome that protect broadly against many genetic variants of the human immunodeficiency virus, the mutation rates at 118 positions in HIV amino-acid sequences of subtype C were compared with those of subtype B. The false discovery rate (FDR) multiple-comparisons procedure can be used to determine statistical significance. When the test statistics have discrete distributions, the FDR procedure can be made more powerful by a simple modification. The paper develops a modified FDR procedure for discrete data and applies it to the human immunodeficiency virus data. The new procedure detects 15 positions with significantly different mutation rates, compared with 11 detected by the original FDR method. Simulations delineate the conditions under which the modified FDR procedure confers large gains in power over the original technique. In general, FDR adjustment methods can be improved for discrete data by incorporating the proposed modification.

8.
Many exploratory experiments, such as DNA microarray or brain imaging studies, require simultaneous comparisons of hundreds or thousands of hypotheses. Under such a setting, using the false discovery rate (FDR) as the overall Type I error measure is recommended (Benjamini and Hochberg in J. R. Stat. Soc. B 57:289–300, 1995). Many FDR-controlling procedures have been proposed. However, when evaluating the performance of FDR-controlling procedures, researchers often focus on the ability of a procedure to control the FDR and to achieve high power. Meanwhile, under multiple hypotheses one may also commit a false non-discovery or fail to identify a true non-significance. In addition, various experimental parameters, such as the number of hypotheses, the proportion of true null hypotheses among all hypotheses, the sample size and the correlation structure, may affect the performance of FDR-controlling procedures. The purpose of this paper is to illustrate the performance of some existing FDR-controlling procedures in terms of four indices: the FDR, the false non-discovery rate, the sensitivity and the specificity. Analytical results for these indices are derived for the FDR-controlling procedures. Simulations are also performed to evaluate the performance of the procedures in terms of these indices under various experimental parameters. The results can serve as guidance for practitioners in choosing an appropriate FDR-controlling procedure.
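A minimal sketch of how the four indices can be computed for a single simulation replicate, assuming a hypothetical setup in which the rejection decisions and the true status of each hypothesis are available as boolean arrays (the paper's analytical derivations are not reproduced here):

```python
import numpy as np

def error_indices(reject, is_null):
    """Realized values of the four indices for one replicate.

    reject  : boolean array, True where the hypothesis was rejected
    is_null : boolean array, True where the null hypothesis is actually true
    The FDR and FNR are the expectations of the first two quantities over replicates.
    """
    reject, is_null = np.asarray(reject, bool), np.asarray(is_null, bool)
    V = np.sum(reject & is_null)          # false discoveries
    R = np.sum(reject)                    # total rejections
    T = np.sum(~reject & ~is_null)        # false non-discoveries
    W = np.sum(~reject)                   # total non-rejections
    fdp = V / max(R, 1)                   # false discovery proportion
    fnp = T / max(W, 1)                   # false non-discovery proportion
    sensitivity = np.sum(reject & ~is_null) / max(np.sum(~is_null), 1)
    specificity = np.sum(~reject & is_null) / max(np.sum(is_null), 1)
    return fdp, fnp, sensitivity, specificity
```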

9.
In this note, we focus on estimating the false discovery rate (FDR) of a multiple testing method with a common, non-random rejection threshold under a mixture model. We develop a new class of estimates of the FDR and prove that it is less conservatively biased than what is traditionally used. Numerical evidence is presented to show that the mean squared error (MSE) is also often smaller for the present class of estimates, especially in small-scale multiple testing. A similar class of estimates of the positive false discovery rate (pFDR), less conservatively biased than what is usually used, is then proposed. When modified using our estimate of the pFDR and applied to gene-expression data, Storey's q-value method identifies a few more significant genes than the original q-value method at certain thresholds. A BH-like method obtained by thresholding our estimate of the FDR is shown to control the FDR in situations where the p-values have the same dependence structure as required by the BH method and, for lack of information about the proportion π0 of true null hypotheses, it is reasonable to assume that π0 is uniformly distributed over (0,1).
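For context, a minimal sketch of the conventional fixed-threshold FDR point estimate and the resulting q-values that such proposals are typically compared against; the less conservatively biased class proposed in the note is not reproduced, and π0 is assumed to be supplied by the user or estimated separately.

```python
import numpy as np

def fdr_estimate(pvals, t, pi0=1.0):
    """Conventional point estimate of the FDR for the fixed rejection region {p <= t}:
    pi0 * m * t / max(#{p_i <= t}, 1)."""
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    return min(1.0, pi0 * m * t / max(int(np.sum(pvals <= t)), 1))

def q_values(pvals, pi0=1.0):
    """q-values: for each p-value, the minimum estimated FDR over all rejection
    thresholds at least as large as that p-value."""
    pvals = np.asarray(pvals, dtype=float)
    order = np.argsort(pvals)
    fdr_at_sorted = np.array([fdr_estimate(pvals, t, pi0) for t in pvals[order]])
    q_sorted = np.minimum.accumulate(fdr_at_sorted[::-1])[::-1]
    q = np.empty_like(q_sorted)
    q[order] = q_sorted
    return q
```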

10.
Summary. The false discovery rate (FDR) is a multiple hypothesis testing quantity that describes the expected proportion of false positive results among all rejected null hypotheses. Benjamini and Hochberg introduced this quantity and proved that a particular step-up p-value method controls the FDR. Storey introduced a point estimate of the FDR for fixed significance regions. The former approach conservatively controls the FDR at a fixed, predetermined level, and the latter provides a conservatively biased estimate of the FDR for a fixed, predetermined significance region. In this work, we show, in both finite-sample and asymptotic settings, that the goals of the two approaches are essentially equivalent. In particular, the FDR point estimates can be used to define valid FDR-controlling procedures. In the asymptotic setting, we also show that the point estimates can be used to estimate the FDR conservatively over all significance regions simultaneously, which is equivalent to controlling the FDR at all levels simultaneously. The main tool we use is to translate existing FDR methods into procedures involving empirical processes. This simplifies finite-sample proofs, provides a framework for asymptotic results, and proves that these procedures are valid even under certain forms of dependence.

11.
Most current false discovery rate (FDR) procedures for microarray experiments assume restrictive dependence structures, which makes them less reliable. An FDR-controlling procedure that allows suitable dependence structures, based on a Poisson distributional approximation, is presented. Unlike other procedures, the distribution of the false null hypotheses is estimated by kernel density estimation, allowing for dependence among the genes. Furthermore, we develop an FDR framework that minimizes the false non-discovery rate (FNR) subject to a constraint on the controlled level of the FDR. The performance of the proposed FDR procedure is compared with that of other existing FDR-controlling procedures, with an application to a microarray study of simulated data.

12.
In large-scale genomics experiments involving thousands of statistical tests, such as association scans and microarray expression experiments, a key question is: which of the L tests represent true associations (TAs)? The traditional way to control false findings is via individual adjustments. In the presence of multiple TAs, p-value combination methods offer certain advantages. Both Fisher's and Lancaster's combination methods use an inverse gamma transformation. We identify the relation between the shape parameter of that distribution and an implicit threshold value; p-values below that threshold are favored by the inverse gamma method (GM). We exploit this feature to improve power over Fisher's method when L is large and the number of TAs is moderate. However, the improvement in power provided by combination methods comes at the expense of a weaker claim made upon rejection of the null hypothesis, namely that there are some TAs among the L tests. Thus, GM remains a global test. To allow a stronger claim about a subset of p-values smaller than L, we investigate two methods with an explicit truncation: the rank truncated product method (RTP), which combines the first K ordered p-values, and the truncated product method (TPM), which combines the p-values that are smaller than a specified threshold. We conclude that TPM allows claims to be made about subsets of p-values, while the claim of the RTP is, like that of GM, more appropriately about all L tests. GM gives somewhat higher power than the TPM, RTP, Fisher and Simes methods across a range of simulations.
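A minimal sketch of two of the combination methods named here, assuming independent p-values: Fisher's method via its chi-square null distribution, and the TPM with its null distribution approximated by Monte Carlo (the inverse gamma method and the RTP are not sketched).

```python
import numpy as np
from scipy import stats

def fisher_combination(pvals):
    """Fisher's method: -2 * sum(log p) is chi-square with 2L df under the global null."""
    pvals = np.asarray(pvals, dtype=float)
    stat = -2.0 * np.sum(np.log(pvals))
    return stats.chi2.sf(stat, df=2 * len(pvals))

def tpm_pvalue(pvals, tau=0.05, n_mc=20000, seed=0):
    """Truncated product method: combine only the p-values <= tau.

    The null distribution of the truncated product is approximated by Monte Carlo
    under the global null of independent Uniform(0,1) p-values."""
    rng = np.random.default_rng(seed)
    pvals = np.asarray(pvals, dtype=float)
    L = len(pvals)
    w_obs = np.prod(np.where(pvals <= tau, pvals, 1.0))       # observed truncated product
    U = rng.uniform(size=(n_mc, L))                            # Monte Carlo null p-values
    w_null = np.prod(np.where(U <= tau, U, 1.0), axis=1)
    return float(np.mean(w_null <= w_obs))
```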

13.
Weighted methods are an important feature of multiplicity control. The weights must usually be chosen a priori, on the basis of the experimental hypotheses. Under some conditions, however, they can be chosen using information from the data (hence a posteriori) while maintaining multiplicity control. In this paper we provide: (1) a review of weighted methods for familywise type I error rate (FWE) control, both parametric and nonparametric, and for false discovery rate (FDR) control; (2) a review of data-driven weighted methods for FWE control; (3) a new proposal for weighted FDR control with data-driven weights under independence among variables; (4) the same under any type of dependence; and (5) a simulation study assessing the performance of the procedure of point (4) under various conditions.
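As a rough illustration of how p-value weights enter an FDR procedure, here is a sketch of the generic weighted BH rule with a priori weights rescaled to mean one; the data-driven weight choices proposed in points (3) and (4) above are not reproduced here.

```python
import numpy as np

def weighted_bh(pvals, weights, alpha=0.05):
    """Weighted BH: apply the BH step-up rule to q_i = p_i / w_i, with weights
    rescaled to average one (the usual convention for weighted FDR control)."""
    pvals = np.asarray(pvals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    m = len(pvals)
    w = weights * m / weights.sum()            # rescale so that mean(w) = 1
    q = pvals / w                              # weighted p-values
    order = np.argsort(q)
    below = q[order] <= alpha * np.arange(1, m + 1) / m
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject
```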

14.
A modification of the critical values of Simes' test is suggested in this article for the case where the underlying test statistics are multivariate normal with a common non-negative correlation, yielding a more powerful test than the original Simes test. A step-up multiple testing procedure with these modified critical values, which is shown to control the false discovery rate (FDR), is presented as a modification of the traditional Benjamini–Hochberg (BH) procedure. Simulations were carried out to compare this modified BH procedure with the BH and other modified BH procedures in terms of the false non-discovery rate (FNR), 1 − FDR − FNR and average power. The present modified BH procedure is observed to perform well compared with the others when the test statistics are highly correlated and most of the hypotheses are true.

15.
Abstract. This paper is concerned with exact control of the false discovery rate (FDR) for step-up-down (SUD) tests related to the asymptotically optimal rejection curve (AORC). Since the system of equations and/or constraints for critical values and FDRs is numerically extremely sensitive, the existence and computation of valid solutions is a challenging problem. We derive explicit formulas for upper bounds on the FDR and show that, under a well-known monotonicity condition, control of the FDR by a step-up procedure implies control of the FDR by the corresponding SUD procedure. Various methods for adjusting the AORC to achieve finite-sample FDR control are investigated. Moreover, we introduce alternative FDR bounding curves and study their connection to rejection curves, as well as the existence of critical values for exact FDR control with respect to the underlying FDR bounding curve. Finally, we propose an iterative method for the computation of critical values.

16.
Feature extraction from observed noisy samples is a common and important problem in statistics and engineering. This paper presents a novel, general statistical approach to the region detection problem in long data sequences. The proposed technique is multiscale kernel regression in conjunction with statistical multiple testing for region detection, controlling the false discovery rate (FDR) and maximizing the signal-to-noise ratio via matched filtering. This is achieved by treating a one-dimensional region detection problem as its equivalent zero-dimensional peak detection problem. The detection method does not require a priori knowledge of the shape of the non-zero regions. However, if the shape of the non-zero regions is known a priori, e.g., a rectangular pulse, the signal regions can also be reconstructed from the detected peaks, viewed as their topological point representatives. Simulations show that the method can effectively perform signal detection and reconstruction in simulated data under high-noise conditions, while controlling the FDR of the detected regions and of their reconstructed length.

17.
Many exploratory studies, such as microarray experiments, require the simultaneous comparison of hundreds or thousands of genes. In many microarray experiments, most genes are not expected to be differentially expressed. Under such a setting, a procedure designed to control the false discovery rate (FDR) aims to identify as many potentially differentially expressed genes as possible. The usual FDR-controlling procedure is constructed based on the number of hypotheses. However, it can become very conservative when some of the alternative hypotheses are expected to be true. The power of a controlling procedure can be improved if the number of true null hypotheses (m0), rather than the number of hypotheses, is incorporated into the procedure [Y. Benjamini and Y. Hochberg, On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Educ. Behav. Statist. 25 (2000), pp. 60–83]. Nevertheless, m0 is unknown and has to be estimated. The objective of this article is to evaluate some existing estimators of m0 and to discuss the feasibility of incorporating these estimators into FDR-controlling procedures under various experimental settings. The simulation results can help the investigator choose an appropriate procedure to meet the requirements of the study.

18.
The Benjamini–Hochberg procedure is widely used for multiple comparisons. Previous power results for this procedure have been based on simulations. This article derives theoretical expressions for its expected power. To do so, we make assumptions about the number of hypotheses being tested, which null hypotheses are true and which are false, and the distributions of the test statistics under each null and alternative. We use these assumptions to derive bounds for the multidimensional rejection regions. With these bounds and a permanent-based representation of the joint density function of the largest p-values, we use the law of total probability to derive the distribution of the total number of rejections. We also derive the joint distribution of the total number of rejections and the number of rejections of true null hypotheses. We give an analytic expression for the expected power of a false discovery rate procedure that assumes the hypotheses are independent.

19.
ABSTRACT

Holm's step-down testing procedure starts with the smallest p-value and sequentially screens larger p-values, without providing any information on confidence intervals. This article changes the conventional step-down testing framework by presenting a nonparametric procedure that starts with the largest p-value and sequentially screens smaller p-values, step by step, to construct a set of simultaneous confidence sets. We use a partitioning approach to prove that the new procedure controls the simultaneous confidence level (and thus strongly controls the familywise error rate). Discernible features of the new stepwise procedure include consistency with individual inferences, coherence, and confidence estimation for follow-up investigations. In a simple simulation study, the proposed procedure, treated as a testing procedure, is more powerful than Holm's procedure when the correlation coefficient is large, and less powerful when it is small. In the analysis of data from a medical study, the new procedure is able to detect the efficacy of aspirin as a cardiovascular prophylaxis in a nonparametric setting.

20.
In a breakthrough paper, Benjamini and Hochberg (J Roy Stat Soc Ser B 57:289–300, 1995) proposed a new error measure for multiple testing, the FDR, and developed a distribution-free procedure to control it under independence among the test statistics. In this paper we argue, by extensive simulation and theoretical considerations, that the assumption of independence is not needed. Along the lines of Ann Stat 32:1035–1061 (2004), we moreover provide a more powerful method that exploits an estimator of the number of false nulls among the tests. We propose a whole family of iterative estimators that prove robust under both dependence and independence between the test statistics. These estimators can also be used to improve classical multiple testing procedures and, in general, to estimate the weight of a known component in a mixture distribution. The innovations are illustrated by simulations.
