首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
The Benjamini-Hochberg procedure is widely used in multiple comparisons. Previous power results for this procedure have been based on simulations. This article produces theoretical expressions for expected power. To derive them, we make assumptions about the number of hypotheses being tested, which null hypotheses are true, which are false, and the distributions of the test statistics under each null and alternative. We use these assumptions to derive bounds for multiple dimensional rejection regions. With these bounds and a permanent based representation of the joint density function of the largest p-values, we use the law of total probability to derive the distribution of the total number of rejections. We derive the joint distribution of the total number of rejections and the number of rejections when the null hypothesis is true. We give an analytic expression for the expected power for a false discovery rate procedure that assumes the hypotheses are independent.  相似文献   

2.
Consider the multiple hypotheses testing problem controlling the generalized familywise error rate k-FWER, the probability of at least k false rejections. We propose a plug-in procedure based on the estimation of the number of true null hypotheses. Under the independence assumption of the p-values corresponding to the true null hypotheses, we first introduce the least favorable configuration (LFC) of k-FWER for Bonferroni-type plug-in procedure, then we construct a plug-in k-FWER-controlled procedure based on LFC. For dependent p-values, we establish the asymptotic k-FWER control under some mild conditions. Simulation studies suggest great improvement over generalized Bonferroni test and generalized Holm test.  相似文献   

3.
This article considers multiple hypotheses testing with the generalized familywise error rate k-FWER control, which is the probability of at least k false rejections. We first assume the p-values corresponding to the true null hypotheses are independent, and propose adaptive generalized Bonferroni procedure with k-FWER control based on the estimation of the number of true null hypotheses. Then, we assume the p-values are dependent, satisfying block dependence, and propose adaptive procedure with k-FWER control. Extensive simulations compare the performance of the adaptive procedures with different estimators.  相似文献   

4.
We develop a finite-sample procedure to test the mean-variance efficiency and spanning hypotheses, without imposing any parametric assumptions on the distribution of model disturbances. In so doing, we provide an exact distribution-free method to test uniform linear restrictions in multivariate linear regression models. The framework allows for unknown forms of nonnormalities as well as time-varying conditional variances and covariances among the model disturbances. We derive exact bounds on the null distribution of joint F statistics to deal with the presence of nuisance parameters, and we show how to implement the resulting generalized nonparametric bounds tests with Monte Carlo resampling techniques. In sharp contrast to the usual tests that are not even computable when the number of test assets is too large, the power of the proposed test procedure potentially increases along both the time and cross-sectional dimensions.  相似文献   

5.
Summary.  Estimation of the number or proportion of true null hypotheses in multiple-testing problems has become an interesting area of research. The first important work in this field was performed by Schweder and Spjøtvoll. Among others, they proposed to use plug-in estimates for the proportion of true null hypotheses in multiple-test procedures to improve the power. We investigate the problem of controlling the familywise error rate FWER when such estimators are used as plug-in estimators in single-step or step-down multiple-test procedures. First we investigate the case of independent p -values under the null hypotheses and show that a suitable choice of plug-in estimates leads to control of FWER in single-step procedures. We also investigate the power and study the asymptotic behaviour of the number of false rejections. Although step-down procedures are more difficult to handle we briefly consider a possible solution to this problem. Anyhow, plug-in step-down procedures are not recommended here. For dependent p -values we derive a condition for asymptotic control of FWER and provide some simulations with respect to FWER and power for various models and hypotheses.  相似文献   

6.
The positive false discovery rate (pFDR) is the average proportion of false rejections given that the overall number of rejections is greater than zero. Assuming that the proportion of true null hypotheses, proportion of false positives, and proportion of true positives all converge pointwise, the pFDR converges to a continuous limit uniformly over all significance levels. We are showing that the uniform convergence still holds given a weaker assumption that the proportion of true positives converges in L 1.  相似文献   

7.
Many exploratory studies such as microarray experiments require the simultaneous comparison of hundreds or thousands of genes. It is common to see that most genes in many microarray experiments are not expected to be differentially expressed. Under such a setting, a procedure that is designed to control the false discovery rate (FDR) is aimed at identifying as many potential differentially expressed genes as possible. The usual FDR controlling procedure is constructed based on the number of hypotheses. However, it can become very conservative when some of the alternative hypotheses are expected to be true. The power of a controlling procedure can be improved if the number of true null hypotheses (m 0) instead of the number of hypotheses is incorporated in the procedure [Y. Benjamini and Y. Hochberg, On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Edu. Behav. Statist. 25(2000), pp. 60–83]. Nevertheless, m 0 is unknown, and has to be estimated. The objective of this article is to evaluate some existing estimators of m 0 and discuss the feasibility of these estimators in incorporating into FDR controlling procedures under various experimental settings. The results of simulations can help the investigator to choose an appropriate procedure to meet the requirement of the study.  相似文献   

8.
Simultaneously testing a family of n null hypotheses can arise in many applications. A common problem in multiple hypothesis testing is to control Type-I error. The probability of at least one false rejection referred to as the familywise error rate (FWER) is one of the earliest error rate measures. Many FWER-controlling procedures have been proposed. The ability to control the FWER and achieve higher power is often used to evaluate the performance of a controlling procedure. However, when testing multiple hypotheses, FWER and power are not sufficient for evaluating controlling procedure’s performance. Furthermore, the performance of a controlling procedure is also governed by experimental parameters such as the number of hypotheses, sample size, the number of true null hypotheses and data structure. This paper evaluates, under various experimental settings, the performance of some FWER-controlling procedures in terms of five indices, the FWER, the false discovery rate, the false non-discovery rate, the sensitivity and the specificity. The results can provide guidance on how to select an appropriate FWER-controlling procedure to meet a study’s objective.  相似文献   

9.
This article proposes a class of weighted differences of averages (WDA) statistics to test and estimate possible change-points in variance for time series with weakly dependent blocks and dependent panel data without specific distributional assumptions. We derive the asymptotic distributions of the test statistics for testing the existence of a single variance change-point under the null and local alternatives. We also study the consistency of the change-point estimator. Within the proposed class of the WDA test statistics, a standardized WDA test is shown to have the best consistency rate and is recommended for practical use. An iterative binary searching procedure is suggested for estimating the locations of possible multiple change-points in variance, whose consistency is also established. Simulation studies are conducted to compare detection power and number of wrong rejections of the proposed procedure to that of a cumulative sum (CUSUM) based test and a likelihood ratio-based test. Finally, we apply the proposed method to a stock index dataset and an unemployment rate dataset. Supplementary materials for this article are available online.  相似文献   

10.
Traditional multiple hypothesis testing procedures fix an error rate and determine the corresponding rejection region. In 2002 Storey proposed a fixed rejection region procedure and showed numerically that it can gain more power than the fixed error rate procedure of Benjamini and Hochberg while controlling the same false discovery rate (FDR). In this paper it is proved that when the number of alternatives is small compared to the total number of hypotheses, Storey's method can be less powerful than that of Benjamini and Hochberg. Moreover, the two procedures are compared by setting them to produce the same FDR. The difference in power between Storey's procedure and that of Benjamini and Hochberg is near zero when the distance between the null and alternative distributions is large, but Benjamini and Hochberg's procedure becomes more powerful as the distance decreases. It is shown that modifying the Benjamini and Hochberg procedure to incorporate an estimate of the proportion of true null hypotheses as proposed by Black gives a procedure with superior power.  相似文献   

11.
In many scientific fields, it is interesting and important to determine whether an observed data stream comes from a prespecified model or not, particularly when the number of data streams is of large scale, where multiple hypotheses testing is necessary. In this article, we consider large-scale model checking under certain dependence among different data streams observed at the same time. We propose a false discovery rate (FDR) control procedure to check those unusual data streams. Specifically, we derive an approximation of false discovery and construct a point estimate of FDR. Theoretical results show that, under some mild assumptions, our proposed estimate of FDR is simultaneously conservatively consistent with the true FDR, and hence it is an asymptotically strong control procedure. Simulation comparisons with some competing procedures show that our proposed FDR procedure behaves better in general settings. Application of our proposed FDR procedure is illustrated by the StarPlus fMRI data.  相似文献   

12.
It is important that the proportion of true null hypotheses be estimated accurately in a multiple hypothesis context. Current estimation methods, however, are not suitable for high-dimensional data such as microarray data. First, they do not consider the (strong) dependence between hypotheses (or genes), thereby resulting in inaccurate estimation. Second, the unknown distribution of false null hypotheses cannot be estimated properly by these methods. Third, the estimation is affected strongly by outliers. In this paper, we find out the optimal procedure for estimating the proportion of true null hypotheses under a (strong) dependence based on the Dirichlet process prior. In addition, by using the minimum Hellinger distance, the estimation should be robust to any model misspecification as well as to any outliers while maintaining efficiency. The results are confirmed by a simulation study, and the newly developed methodology is illustrated by a real microarray data.  相似文献   

13.
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated from the independence assumption of test statistics, which is often not true in reality. Simulations indicate that most existing estimators in the presence of the dependence among test statistics can be poor, mainly due to the increase of variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1?λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.  相似文献   

14.
Summary. We investigate the operating characteristics of the Benjamini–Hochberg false discovery rate procedure for multiple testing. This is a distribution-free method that controls the expected fraction of falsely rejected null hypotheses among those rejected. The paper provides a framework for understanding more about this procedure. We first study the asymptotic properties of the `deciding point' D that determines the critical p -value. From this, we obtain explicit asymptotic expressions for a particular risk function. We introduce the dual notion of false non-rejections and we consider a risk function that combines the false discovery rate and false non-rejections. We also consider the optimal procedure with respect to a measure of conditional risk.  相似文献   

15.
A common approach to analysing clinical trials with multiple outcomes is to control the probability for the trial as a whole of making at least one incorrect positive finding under any configuration of true and false null hypotheses. Popular approaches are to use Bonferroni corrections or structured approaches such as, for example, closed-test procedures. As is well known, such strategies, which control the family-wise error rate, typically reduce the type I error for some or all the tests of the various null hypotheses to below the nominal level. In consequence, there is generally a loss of power for individual tests. What is less well appreciated, perhaps, is that depending on approach and circumstances, the test-wise loss of power does not necessarily lead to a family wise loss of power. In fact, it may be possible to increase the overall power of a trial by carrying out tests on multiple outcomes without increasing the probability of making at least one type I error when all null hypotheses are true. We examine two types of problems to illustrate this. Unstructured testing problems arise typically (but not exclusively) when many outcomes are being measured. We consider the case of more than two hypotheses when a Bonferroni approach is being applied while for illustration we assume compound symmetry to hold for the correlation of all variables. Using the device of a latent variable it is easy to show that power is not reduced as the number of variables tested increases, provided that the common correlation coefficient is not too high (say less than 0.75). Afterwards, we will consider structured testing problems. Here, multiplicity problems arising from the comparison of more than two treatments, as opposed to more than one measurement, are typical. We conduct a numerical study and conclude again that power is not reduced as the number of tested variables increases.  相似文献   

16.
Many exploratory experiments such as DNA microarray or brain imaging require simultaneously comparisons of hundreds or thousands of hypotheses. Under such a setting, using the false discovery rate (FDR) as an overall Type I error is recommended (Benjamini and Hochberg in J. R. Stat. Soc. B 57:289–300, 1995). Many FDR controlling procedures have been proposed. However, when evaluating the performance of FDR-controlling procedures, researchers are often focused on the ability of procedures to control the FDR and to achieve high power. Meanwhile, under the multiple hypotheses, it may be also likely to commit a false non-discovery or fail to claim a true non-significance. In addition, various experimental parameters such as the number of hypotheses, the proportion of the number of true null hypotheses to the number of hypotheses, the samples size and the correlation structure may affect the performance of FDR controlling procedures. The purpose of this paper is to illustrate the performance of some existing FDR controlling procedures in terms of four indices, i.e., the FDR, the false non-discovery rate, the sensitivity and the specificity. Analytical results of these indices for the FDR controlling procedures are derived. Simulations are also performed to evaluate the performance of controlling procedures in terms of these indices under various experimental parameters. The result can be used to summarize as a guidance for practitioners to properly choose a FDR controlling procedure.  相似文献   

17.
Consider a set of order statistics that arise from sorting samples from two different populations, each with their own, possibly different distribution functions. The probability that these order statistics fall in disjoint, ordered intervals and that of the smallest statistics, a certain number come from the first populations is given in terms of the two distribution functions. The result is applied to computing the joint probability of the number of rejections and the number of false rejections for the Benjamini-Hochberg false discovery rate procedure.  相似文献   

18.
In a high-dimensional multiple testing framework, we present new confidence bounds on the false positives contained in subsets S of selected null hypotheses. These bounds are post hoc in the sense that the coverage probability holds simultaneously over all S, possibly chosen depending on the data. This article focuses on the common case of structured null hypotheses, for example, along a tree, a hierarchy, or geometrically (spatially or temporally). Following recent advances in post hoc inference, we build confidence bounds for some prespecified forest-structured subsets and deduce a bound for any subset S by interpolation. The proposed bounds are shown to improve substantially previous ones when the signal is locally structured. Our findings are supported both by theoretical results and numerical experiments. Moreover, our bounds can be obtained by an algorithm (with complexity bilinear in the sizes of the reference hierarchy and of the selected subset) that is implemented in the open-source R package sansSouci available from https://github.com/pneuvial/sanssouci , making our approach operational.  相似文献   

19.
In practical settings such as microarray data analysis, multiple hypotheses with dependence within but not between equal-sized blocks often need to be tested. We consider an adaptive BH procedure to test the hypotheses. Under the condition of positive regression dependence on a subset of the true null hypotheses, the proposed adaptive procedure is shown to control the false discovery rate. The proposed approach is compared to the existing methods in simulation under block dependence and totally uniform pairwise dependence. It is observed that the proposed method performs better than the existing methods in several situations.  相似文献   

20.
We are concerned with a situation in which we would like to test multiple hypotheses with tests whose p‐values cannot be computed explicitly but can be approximated using Monte Carlo simulation. This scenario occurs widely in practice. We are interested in obtaining the same rejections and non‐rejections as the ones obtained if the p‐values for all hypotheses had been available. The present article introduces a framework for this scenario by providing a generic algorithm for a general multiple testing procedure. We establish conditions that guarantee that the rejections and non‐rejections obtained through Monte Carlo simulations are identical to the ones obtained with the p‐values. Our framework is applicable to a general class of step‐up and step‐down procedures, which includes many established multiple testing corrections such as the ones of Bonferroni, Holm, Sidak, Hochberg or Benjamini–Hochberg. Moreover, we show how to use our framework to improve algorithms available in the literature in such a way as to yield theoretical guarantees on their results. These modifications can easily be implemented in practice and lead to a particular way of reporting multiple testing results as three sets together with an error bound on their correctness, demonstrated exemplarily using a real biological dataset.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号