Similar Articles
20 similar articles found.
1.
We are concerned with a situation in which we would like to test multiple hypotheses with tests whose p-values cannot be computed explicitly but can be approximated using Monte Carlo simulation. This scenario occurs widely in practice. We are interested in obtaining the same rejections and non-rejections as the ones obtained if the p-values for all hypotheses had been available. The present article introduces a framework for this scenario by providing a generic algorithm for a general multiple testing procedure. We establish conditions that guarantee that the rejections and non-rejections obtained through Monte Carlo simulation are identical to the ones obtained with the p-values. Our framework is applicable to a general class of step-up and step-down procedures, which includes many established multiple testing corrections such as those of Bonferroni, Holm, Šidák, Hochberg or Benjamini–Hochberg. Moreover, we show how to use our framework to improve algorithms available in the literature in such a way as to yield theoretical guarantees on their results. These modifications can easily be implemented in practice and lead to a particular way of reporting multiple testing results: as three sets together with an error bound on their correctness, which we demonstrate on a real biological dataset.
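The three-set reporting can be made concrete with a rough sketch. It assumes Clopper-Pearson bounds for each Monte Carlo p-value estimate and, purely for illustration, a single fixed rejection threshold rather than a full step-up/step-down procedure; the function names and the error bound gamma are ours, not the article's.

```python
from scipy.stats import beta

def mc_pvalue_interval(exceedances, n_samples, gamma=1e-3):
    """Clopper-Pearson interval for a p-value estimated by Monte Carlo:
    with probability at least 1 - gamma the true p-value lies inside."""
    k, n = exceedances, n_samples
    lower = beta.ppf(gamma / 2, k, n - k + 1) if k > 0 else 0.0
    upper = beta.ppf(1 - gamma / 2, k + 1, n - k) if k < n else 1.0
    return lower, upper

def three_sets(intervals, alpha):
    """Classify hypotheses into surely rejected, surely non-rejected,
    and undecided, relative to a single fixed threshold alpha.
    `intervals` is a list of (lower, upper) pairs, one per hypothesis."""
    rejected = {i for i, (lo, up) in enumerate(intervals) if up <= alpha}
    non_rejected = {i for i, (lo, up) in enumerate(intervals) if lo > alpha}
    undecided = set(range(len(intervals))) - rejected - non_rejected
    return rejected, non_rejected, undecided
```

Hypotheses in the undecided set are exactly those for which more Monte Carlo samples would be needed before the decision provably matches the one based on the exact p-value.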

2.
A generalization of step-up and step-down multiple test procedures is proposed. This step-up-down procedure is useful when the objective is to reject a specified minimum number, q, out of a family of k hypotheses. If this basic objective is met at the first step, the procedure continues in a step-down manner to see whether more than q hypotheses can be rejected; otherwise it proceeds in a step-up manner to see whether some number fewer than q can be rejected. The usual step-down procedure is the special case q = 1, and the usual step-up procedure is the special case q = k. Analytical and numerical comparisons between the powers of the step-up-down procedures with different choices of q are made to see how these powers depend on the actual number of false hypotheses. Examples of application include comparing the efficacy of a treatment to a control for multiple endpoints and testing the sensitivity of a clinical trial for comparing the efficacy of a new treatment with a set of standard treatments.
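The stepping logic of such a procedure might be sketched as follows. The critical constants c_1 <= ... <= c_k must be calibrated to control the familywise error rate for the chosen q, which is the substance of the paper and is simply taken as given here.

```python
def step_up_down(pvalues, critical, q):
    """Step-up-down procedure of order q (a sketch of the stepping logic).

    pvalues: raw p-values; critical: nondecreasing constants matched to
    the sorted p-values; q: minimum number of rejections targeted at the
    first step. Returns the set of indices of rejected hypotheses.
    """
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    p = [pvalues[i] for i in order]
    k = len(p)
    if p[q - 1] <= critical[q - 1]:
        # Basic objective met: step down to try to reject more than q.
        r = q
        while r < k and p[r] <= critical[r]:
            r += 1
    else:
        # Objective missed: step up to see how many fewer can be rejected.
        r = 0
        for i in range(q - 1, 0, -1):
            if p[i - 1] <= critical[i - 1]:
                r = i
                break
    return {order[i] for i in range(r)}
```

Setting q = 1 reproduces the usual step-down behaviour and q = k the usual step-up behaviour, matching the special cases noted in the abstract.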

3.
Multiple testing procedures defined by directed, weighted graphs have recently been proposed as an intuitive visual tool for constructing multiple testing strategies that reflect the often complex contextual relations between hypotheses in clinical trials. Many well-known sequentially rejective tests, such as (parallel) gatekeeping tests or hierarchical testing procedures, are special cases of the graph-based tests. We generalize these graph-based multiple testing procedures to adaptive trial designs with an interim analysis. These designs permit mid-trial design modifications based on unblinded interim data as well as external information, while providing strong familywise error rate control. To maintain the familywise error rate, the adaptation rule need not be prespecified in detail. Because the adaptive test does not require knowledge of the multivariate distribution of test statistics, it is applicable in a wide range of scenarios, including trials with multiple treatment comparisons, endpoints or subgroups, or combinations thereof. Examples of adaptations are dropping of treatment arms, selection of subpopulations, and sample size reassessment. If, at the interim analysis, it is decided to continue the trial as planned, the adaptive test reduces to the originally planned multiple testing procedure; an adjusted test needs to be applied only if adaptations are actually implemented. The procedure is illustrated with a case study and its operating characteristics are investigated by simulations.

4.
A method for controlling the familywise error rate that combines the Bonferroni adjustment and fixed-sequence testing procedures is proposed. This procedure allots Type I error like the Bonferroni adjustment, but allows the Type I error to accumulate whenever a null hypothesis is rejected. In this manner, power for hypotheses tested later in a prespecified order is increased. The order of the hypothesis tests needs to be prespecified, as in a fixed-sequence testing procedure, but unlike the fixed-sequence procedure all hypotheses can always be tested, providing an a priori method for concluding a difference on each of the endpoints. One application is in clinical trials in which mortality is a concern but the power to detect a difference in mortality is expected to be low. If the effect on mortality is larger than anticipated, this method allows a test with a prespecified way of controlling the Type I error rate.
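A minimal sketch of this carry-forward idea, assuming a prespecified testing order and per-hypothesis allocations summing to the overall level; the exact rule is our reading of the description above, not necessarily the paper's.

```python
def fallback_test(pvalues, alphas):
    """Bonferroni-style allocation with alpha carried forward after
    each rejection.

    pvalues: p-values in the prespecified testing order;
    alphas: nonnegative allocations summing to the overall level.
    Returns a list of reject/accept decisions, one per hypothesis.
    """
    carry = 0.0
    decisions = []
    for p, a in zip(pvalues, alphas):
        level = a + carry              # accumulate alpha from earlier rejections
        reject = p <= level
        decisions.append(reject)
        carry = level if reject else 0.0
    return decisions
```

For instance, with overall level 0.05 one might allocate 0.01 to mortality, tested first, and 0.04 to the main efficacy endpoint: `fallback_test([0.008, 0.045], [0.01, 0.04])` rejects both, because the rejected mortality hypothesis passes its alpha on, so efficacy is tested at the full 0.05. These particular allocations are illustrative only.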

5.
Some multiple comparison procedures are described for multiple-armed studies. The procedures are appropriate for testing all hypotheses comparing two endpoints across multiple test arms and a single control group, for example three different fixed doses compared with a placebo. The procedures assume that of the two endpoints, one is designated as primary, such that for a given treatment arm no hypothesis for the secondary endpoint can be rejected unless the hypothesis for the primary endpoint was rejected. The procedures described control the familywise error rate in the strong sense at a specified level α.

6.
We propose two new procedures based on multiple hypothesis testing for correct support estimation in high-dimensional sparse linear models. We prove that both procedures are powerful and do not require the sample size to be large. The first procedure tackles the atypical setting of ordered variable selection through an extension of a testing procedure previously developed in the context of a linear hypothesis. The second procedure is the main contribution of this paper: it enables data analysts to perform support estimation in the general high-dimensional framework of non-ordered variable selection. A thorough simulation study and applications to real datasets using the R package mht show that our non-ordered variable procedure produces excellent results in terms of correct support estimation, as well as in terms of mean squared error and false discovery rate, when compared to common methods such as the Lasso, the SCAD penalty, forward regression, and the false discovery rate (FDR) procedure.

7.
Uniformly most powerful Bayesian tests (UMPBTs) are a new class of Bayesian tests in which null hypotheses are rejected if their Bayes factor exceeds a specified threshold. The alternative hypotheses in UMPBTs are defined to maximize the probability that the null hypothesis is rejected. Here, we generalize the notion of UMPBTs by restricting the class of alternative hypotheses over which this maximization is performed, resulting in restricted most powerful Bayesian tests (RMPBTs). We then derive RMPBTs for linear models by restricting alternative hypotheses to g priors. For linear models, the rejection regions of RMPBTs coincide with those of usual frequentist F-tests, provided that the evidence thresholds for the RMPBTs are appropriately matched to the size of the classical tests. This correspondence supplies default Bayes factors for many common tests of linear hypotheses. We illustrate the use of RMPBTs for ANOVA tests and t-tests and compare their performance in numerical studies.

8.
Many statistical models arising in applications contain non- and weakly-identified parameters. Because of identifiability concerns, tests concerning the parameters of interest may not be able to rely on conventional theory, and it may not be clear how to assess statistical significance. This paper extends the literature by developing a testing procedure that can be used to evaluate hypotheses under non- and weakly-identifiable semiparametric models. The test statistic is constructed from a general estimating function of a finite-dimensional parameter representing the population characteristics of interest, while other characteristics, which may be described by infinite-dimensional parameters and viewed as nuisance, are left completely unspecified. We derive the limiting distribution of this statistic and propose theoretically justified resampling approaches to approximate its asymptotic distribution. The methodology's practical utility is illustrated in simulations and in an analysis of quality-of-life outcomes from a longitudinal study on breast cancer.

9.
A consistent approach to the problem of testing non-correlation between two univariate infinite-order autoregressive models was proposed by Hong (1996). His test is based on a weighted sum of squares of residual cross-correlations, with weights depending on a kernel function. In this paper, the author follows Hong's approach to test non-correlation between two cointegrated (or partially non-stationary) ARMA time series. The test of Pham, Roy & Cédras (2003) may be seen as a special case of his approach, corresponding to the choice of a truncated uniform kernel. The proposed procedure remains valid for testing non-correlation between two stationary invertible multivariate ARMA time series. The author derives the asymptotic distribution of his test statistic under the null hypothesis and proves that his procedures are consistent. He also studies the level and power of the proposed tests in finite samples through simulation. Finally, he presents an illustration based on real data.

10.
Simultaneously testing a family of n null hypotheses arises in many applications. A common problem in multiple hypothesis testing is controlling the Type I error. The probability of at least one false rejection, referred to as the familywise error rate (FWER), is one of the earliest error rate measures, and many FWER-controlling procedures have been proposed. The ability to control the FWER while achieving higher power is often used to evaluate the performance of a controlling procedure. However, when testing multiple hypotheses, FWER and power alone are not sufficient for evaluating a procedure's performance, which is also governed by experimental parameters such as the number of hypotheses, the sample size, the number of true null hypotheses, and the data structure. This paper evaluates, under various experimental settings, the performance of some FWER-controlling procedures in terms of five indices: the FWER, the false discovery rate, the false non-discovery rate, the sensitivity, and the specificity. The results can provide guidance on how to select an appropriate FWER-controlling procedure to meet a study's objective.
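For reference, these five indices are commonly defined in confusion-matrix notation (our notation; the paper's exact definitions may differ). With m hypotheses, m_0 true nulls, R rejections in total, V false rejections and T false non-rejections:

```latex
\mathrm{FWER} = \Pr(V \ge 1), \qquad
\mathrm{FDR}  = \mathbb{E}\!\left[\frac{V}{\max(R,1)}\right], \qquad
\mathrm{FNR}  = \mathbb{E}\!\left[\frac{T}{\max(m-R,1)}\right],
```
```latex
\text{sensitivity} = \mathbb{E}\!\left[\frac{R-V}{\max(m-m_0,1)}\right], \qquad
\text{specificity} = \mathbb{E}\!\left[\frac{m_0-V}{\max(m_0,1)}\right].
```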

11.
The false discovery rate (FDR) has become a popular error measure in large-scale simultaneous testing. When data are collected from heterogeneous sources and form grouped hypotheses, it may be beneficial to use the distinct features of the groups when conducting the multiple tests. We propose a stratified testing procedure that applies different FDR levels across strata defined by the stratification features, based on the p-values. The proposed method is easy to implement in practice. Simulation studies show that it produces more efficient testing results: the stratified procedure minimizes the overall false negative rate (FNR) while controlling the overall FDR. An example from a type II diabetes mice study further illustrates the practical advantages of this new approach.
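A minimal sketch of the stratified idea, assuming the standard Benjamini–Hochberg (BH) step-up procedure within each stratum and taking the per-stratum FDR levels as given; how to choose those levels so that the overall FDR is controlled while the overall FNR is minimized is the paper's contribution and is not reproduced here.

```python
import numpy as np

def benjamini_hochberg(pvalues, level):
    """Standard BH step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvalues)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= level * np.arange(1, m + 1) / m
    k = int(np.max(np.nonzero(below)[0])) + 1 if below.any() else 0
    mask = np.zeros(m, dtype=bool)
    mask[order[:k]] = True
    return mask

def stratified_bh(pvalues, strata, levels):
    """Apply BH separately within each stratum at its own FDR level.

    strata: stratum label per hypothesis; levels: dict mapping each
    label to the FDR level used in that stratum (these per-stratum
    levels are assumed, not derived, in this sketch).
    """
    pvalues, strata = np.asarray(pvalues), np.asarray(strata)
    mask = np.zeros(len(pvalues), dtype=bool)
    for label, level in levels.items():
        idx = np.nonzero(strata == label)[0]
        if idx.size:
            mask[idx] = benjamini_hochberg(pvalues[idx], level)
    return mask
```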

12.
We consider the problem of estimating the proportion θ of true null hypotheses in a multiple testing context. The setup is classically modelled through a semiparametric mixture with two components: a uniform distribution on the interval [0,1] with prior probability θ, and a non-parametric density f. We discuss asymptotic efficiency results and establish that two different cases occur according to whether f vanishes on a non-empty interval or not. In the first case, we exhibit estimators converging at a parametric rate, compute the optimal asymptotic variance, and conjecture that no estimator is asymptotically efficient (i.e. attains the optimal asymptotic variance). In the second case, we prove that the quadratic risk of any estimator does not converge at a parametric rate. We illustrate these results on simulated data.
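Written out, the mixture model for the marginal density g of the p-values is:

```latex
g(x) = \theta \cdot \mathbf{1}_{[0,1]}(x) + (1-\theta)\, f(x),
\qquad x \in [0,1],
```

with the two asymptotic regimes distinguished by whether the alternative density f vanishes on a non-empty subinterval of [0,1].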

13.
We investigate the operating characteristics of the Benjamini–Hochberg false discovery rate procedure for multiple testing. This is a distribution-free method that controls the expected fraction of falsely rejected null hypotheses among those rejected. The paper provides a framework for understanding more about this procedure. We first study the asymptotic properties of the 'deciding point' D that determines the critical p-value. From this, we obtain explicit asymptotic expressions for a particular risk function. We introduce the dual notion of false non-rejections and consider a risk function that combines the false discovery rate and false non-rejections. We also consider the optimal procedure with respect to a measure of conditional risk.
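In p-value terms, the deciding point is where the ordered p-values last fall below the Benjamini–Hochberg line; a short sketch (q denotes the target FDR level, and the function name is ours):

```python
import numpy as np

def deciding_point(pvalues, q):
    """Deciding point D of the BH procedure: the largest (1-based)
    index i at which the sorted p-values fall below the BH line
    i*q/m. The critical p-value is then p_(D), and hypotheses with
    p-values at or below it are rejected; D = 0 means no rejections."""
    p = np.sort(pvalues)
    m = len(p)
    below = np.nonzero(p <= q * np.arange(1, m + 1) / m)[0]
    return int(below[-1]) + 1 if below.size else 0
```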

14.
We consider the multiple comparison problem in which multiple outcomes are each compared among several different collections of groups in a multiple-group setting. In this case there are several different types of hypotheses, each specifying equality of the distributions of a single outcome over a different collection of groups, and each type requires a different permutational approach. We show that under a certain multivariate condition it is possible to use closure over all hypotheses, although in some cases intersection hypotheses are tested using Boole's inequality in conjunction with permutation distributions. Shortcut tests are then found so that the resulting testing procedure is easily performed. The error rate and power of the new method are compared to those of existing competitors through simulation of correlated data. An example consisting of multiple adverse events in a clinical trial is analyzed.

15.
Many exploratory studies, such as microarray experiments, require the simultaneous comparison of hundreds or thousands of genes, and in many microarray experiments most genes are not expected to be differentially expressed. Under such a setting, a procedure designed to control the false discovery rate (FDR) aims to identify as many potentially differentially expressed genes as possible. The usual FDR-controlling procedure is constructed based on the total number of hypotheses, but it can become very conservative when some of the alternative hypotheses are expected to be true. The power of a controlling procedure can be improved if the number of true null hypotheses (m0), rather than the number of hypotheses, is incorporated in the procedure [Y. Benjamini and Y. Hochberg, On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Educ. Behav. Statist. 25 (2000), pp. 60–83]. Nevertheless, m0 is unknown and has to be estimated. The objective of this article is to evaluate some existing estimators of m0 and to discuss the feasibility of incorporating these estimators into FDR-controlling procedures under various experimental settings. The results of the simulations can help the investigator choose an appropriate procedure to meet a study's requirements.
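One widely used estimator of m0 is a λ-threshold (Storey-type) estimator, which may or may not be among those the article evaluates; it exploits the fact that p-values above a cutoff λ come mostly from true nulls, which are uniform on [0,1].

```python
import numpy as np

def m0_storey(pvalues, lam=0.5):
    """Storey-type estimator of the number of true nulls m0:
    #{p > lam} / (1 - lam) estimates m0, since under the null the
    p-values are uniform. lam = 0.5 is a conventional default."""
    p = np.asarray(pvalues)
    return min(len(p), int(np.ceil((p > lam).sum() / (1.0 - lam))))
```

An adaptive procedure then replaces m by the estimate in the BH thresholds, i.e. it rejects using i·q/m̂0 rather than i·q/m, which increases power when many alternatives are true.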

16.
This article considers an approach to estimating and testing a new Kronecker product covariance structure for three-level multivariate data (multiple time points (p), multiple sites (u), and multiple response variables (q)). Testing such a covariance structure is potentially important for high-dimensional multi-level multivariate data. The hypothesis testing procedure developed here can test not only the hypothesis for three-level multivariate data, but also many other hypotheses, such as blocked compound symmetry for two-level multivariate data, as special cases. The tests are illustrated on two real data sets.

17.
Recently, the field of multiple hypothesis testing has experienced great expansion, driven largely by new methods developed in the field of genomics that allow scientists to process thousands of hypothesis tests simultaneously. The frequentist approach to this problem uses different testing error measures that allow the Type I error rate to be controlled at a desired level. Alternatively, in this article, a Bayesian hierarchical model based on mixture distributions and an empirical Bayes approach are proposed in order to produce a list of rejected hypotheses that will be declared significant and interesting for more detailed subsequent analysis. In particular, we develop a straightforward implementation of a Gibbs sampling scheme in which all the conditional posterior distributions are explicit. The results are compared with the frequentist false discovery rate (FDR) methodology. Simulation examples show that our model improves on the FDR procedure in the sense that it diminishes the percentage of false negatives while keeping an acceptable percentage of false positives.

18.
Clinical trials involving multiple time-to-event outcomes are increasingly common. In this paper, permutation tests for group differences in multivariate time-to-event data are proposed. Unlike other two-sample tests for multivariate survival data, the proposed tests attain the nominal Type I error rate. A simulation study shows that the proposed tests outperform their competitors when the proportion of censored observations is sufficiently high; when the degree of censoring is low, naive tests such as Hotelling's T² outperform tests tailored to survival data. Computational and practical aspects of the proposed tests are discussed, and their use is illustrated by analyses of three publicly available datasets. Implementations of the proposed tests are available in an accompanying R package.
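For context, the generic two-sample permutation machinery looks as follows; the article's actual contribution is the choice of test statistic for censored multivariate survival data, which this sketch does not attempt to reproduce.

```python
import numpy as np

def permutation_test(x, y, stat, n_perm=9999, rng=None):
    """Generic two-sample permutation test: permute the pooled sample's
    group labels and recompute the statistic. `stat` maps two samples
    to a scalar; larger values indicate a larger group difference."""
    rng = np.random.default_rng(rng)
    pooled = np.concatenate([x, y])
    n = len(x)
    observed = stat(x, y)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if stat(perm[:n], perm[n:]) >= observed:
            count += 1
    # Add-one Monte Carlo p-value, valid under exchangeability.
    return (count + 1) / (n_perm + 1)
```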

19.
When testing treatment effects in multi-arm clinical trials, the Bonferroni method or the method of Simes (1986) is used to adjust for the multiple comparisons. When control of the familywise error rate is required, these methods are combined with the closed testing principle of Marcus et al. (1976). Under weak assumptions, the resulting p-values all give rise to valid tests provided that the basic test used for each treatment is valid. However, standard tests can be far from valid, especially when the endpoint is binary and sample sizes are unbalanced, as is common in multi-arm clinical trials. This paper looks at the relationship between size deviations of the component test and size deviations of the multiple comparison test. The conclusion is that multiple comparison tests are as imperfect as the basic tests at nominal size α/m, where m is the number of treatments. This, admittedly not unexpected, conclusion implies that these methods should only be used when the component test is very accurate at small nominal sizes. For binary endpoints, this suggests use of the parametric bootstrap test. All these conclusions are supported by a detailed numerical study.

20.
Multiple hypothesis testing is widely used to evaluate scientific studies involving statistical tests. However, for many of these tests, p-values are not available in closed form and are often approximated using Monte Carlo tests such as permutation tests or bootstrap tests. This article presents a simple algorithm based on Thompson sampling to test multiple hypotheses. It works with arbitrary multiple testing procedures, in particular with step-up and step-down procedures. Its main feature is to allocate Monte Carlo effort sequentially, generating more Monte Carlo samples for tests whose decisions are so far less certain. A simulation study demonstrates that, for a low computational effort, the new approach yields higher power and a higher degree of reproducibility of its results than previously suggested methods.
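The allocation idea can be caricatured as follows; the priors, the batching, and the instability criterion are our assumptions for illustration, not the article's exact algorithm.

```python
import numpy as np

def thompson_allocate(successes, trials, mtp, n_draws=100, rng=None):
    """One allocation round: model each unknown p-value with a
    Beta(1 + s, 1 + n - s) posterior, repeatedly sample all p-values,
    apply the multiple testing procedure `mtp` (a function mapping a
    p-value vector to a 0/1 rejection vector), and direct the next
    Monte Carlo samples at the hypothesis whose decision is least
    stable across the posterior draws."""
    rng = np.random.default_rng(rng)
    s, n = np.asarray(successes), np.asarray(trials)
    votes = np.zeros(len(s))
    for _ in range(n_draws):
        p = rng.beta(1 + s, 1 + n - s)   # posterior draw of each p-value
        votes += mtp(p)
    # A hypothesis rejected in about half the draws is the most uncertain.
    instability = np.minimum(votes, n_draws - votes)
    return int(np.argmax(instability))   # index to give more MC samples
```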
