Similar Articles
A total of 20 similar articles were retrieved.
1.
A modification of the critical values of Simes’ test is suggested in this article when the underlying test statistics are multivariate normal with a common non-negative correlation, yielding a more powerful test than the original Simes’ test. A step-up multiple testing procedure with these modified critical values, which is shown to control false discovery rate (FDR), is presented as a modification of the traditional Benjamini–Hochberg (BH) procedure. Simulations were carried out to compare this modified BH procedure with the BH and other modified BH procedures in terms of false non-discovery rate (FNR), 1–FDR–FNR and average power. The present modified BH procedure is observed to perform well compared to others when the test statistics are highly correlated and most of the hypotheses are true.
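For context, the baseline being modified above is the usual Benjamini–Hochberg step-up rule: sort the p-values, find the largest k with p_(k) ≤ kq/m, and reject the k smallest. A minimal Python sketch of that standard rule (not of the modified critical values proposed in the article):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Standard BH step-up procedure; returns a boolean mask of rejected hypotheses."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)                          # indices that sort the p-values
    thresholds = q * np.arange(1, m + 1) / m       # k * q / m for k = 1, ..., m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])           # largest k whose sorted p-value meets its threshold
        reject[order[:k + 1]] = True               # reject the k smallest p-values
    return reject

# Toy example: 95 uniform (null) p-values plus 5 very small ones.
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(size=95), rng.uniform(0, 1e-3, size=5)])
print(benjamini_hochberg(pvals).sum(), "rejections at FDR level 0.05")
```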

2.
Most current false discovery rate (FDR) procedures in microarray experiments assume restrictive dependence structures, which makes them less reliable. An FDR-controlling procedure under suitable dependence structures, based on a Poisson distributional approximation, is shown. Unlike in other procedures, the distribution of false null hypotheses is estimated using kernel density estimation, allowing for dependence structures among the genes. Furthermore, we develop an FDR framework that minimizes the false non-discovery rate (FNR) subject to a constraint on the controlled level of the FDR. The performance of the proposed FDR procedure is compared with that of other existing FDR-controlling procedures, with an application to a microarray study of simulated data.

3.
The false discovery rate (FDR) has become a popular error measure in large-scale simultaneous testing. When data are collected from heterogeneous sources and form grouped hypotheses, it may be beneficial to use the distinct features of the groups to conduct the multiple hypothesis testing. We propose a stratified testing procedure that uses different FDR levels according to the stratification features based on p-values. Our proposed method is easy to implement in practice. Simulation studies show that the proposed method produces more efficient testing results. The stratified testing procedure minimizes the overall false negative rate (FNR) level while controlling the overall FDR. An example from a type II diabetes mice study further illustrates the practical advantages of this new approach.

4.
Zhang C, Fan J, Yu T. Annals of Statistics 2011, 39(1): 613–642.
The multiple testing procedure plays an important role in detecting the presence of spatial signals for large scale imaging data. Typically, the spatial signals are sparse but clustered. This paper provides empirical evidence that for a range of commonly used control levels, the conventional FDR procedure can lack the ability to detect statistical significance, even if the p-values under the true null hypotheses are independent and uniformly distributed; more generally, ignoring the neighboring information of spatially structured data will tend to diminish the detection effectiveness of the FDR procedure. This paper first introduces a scalar quantity to characterize the extent to which the "lack of identification phenomenon" (LIP) of the FDR procedure occurs. Second, we propose a new multiple comparison procedure, called FDR(L), to accommodate the spatial information of neighboring p-values, via a local aggregation of p-values. Theoretical properties of the FDR(L) procedure are investigated under weak dependence of p-values. It is shown that the FDR(L) procedure alleviates the LIP of the FDR procedure, thus substantially facilitating the selection of more stringent control levels. Simulation evaluations indicate that the FDR(L) procedure improves the detection sensitivity of the FDR procedure with little loss in detection specificity. The computational simplicity and detection effectiveness of the FDR(L) procedure are illustrated through a real brain fMRI dataset.
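The idea of locally aggregating neighboring p-values can be sketched in a few lines. Below, a simple moving median over a one-dimensional neighbourhood stands in for the local aggregation step; this is only an illustration of the idea, not the FDR(L) procedure itself, whose aggregation rule and threshold calibration are given in the paper:

```python
import numpy as np

def local_median_pvalues(pvals, half_window=1):
    """Replace each p-value by the median of its 1-D spatial neighbourhood (toy aggregation)."""
    p = np.asarray(pvals)
    n = len(p)
    aggregated = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        aggregated[i] = np.median(p[lo:hi])
    return aggregated

# A clustered signal: positions 40-49 carry small p-values, the rest are uniform noise.
rng = np.random.default_rng(1)
p = rng.uniform(size=100)
p[40:50] = rng.uniform(0, 0.02, size=10)
print("smallest aggregated p-values at positions:", np.argsort(local_median_pvalues(p))[:10])
```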

5.
A two-stage stepup procedure is defined and an explicit formula for the FDR of this procedure is derived under any distributional setting. Sets of critical values are determined that provide control of the FDR of a two-stage stepup procedure under an i.i.d. mixture model. A class of two-stage FDR procedures modifying the Benjamini–Hochberg (BH) procedure and containing the one given in Storey et al. [2004. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J. Roy. Statist. Soc. Ser. B 66, 187–205] is obtained. The FDR controlling property of the Storey–Taylor–Siegmund procedure is proved only under independence, which is different from the proof presented by these authors. A single-stage stepup procedure controlling the FDR under any form of dependence, which is different from and in some situations performs better than the Benjamini–Yekutieli (BY) procedure, is given before discussing how to obtain two-stage versions of the BY procedure and this new procedure. Simulations reveal that the procedures proposed in this article under the mixture model can perform quite well in terms of improving the FDR control of the BH procedure. However, the similar idea of improving the FDR control of a stepup procedure under any form of dependence does not seem to work.

6.
The multiple comparison of the effects of several treatments with a control (MCC) has been a central problem in medicine and other areas. Nearly all existing papers are devoted to comparing the means of the effects. To study medical problems more deeply, one needs more information from the given data than the mean relationship. More useful and deeper conclusions can be expected by comparing the probability distributions, i.e., by comparison under stochastic orders. This paper presents a likelihood ratio testing procedure to compare effects under stochastic order for MCC problems while controlling the false discovery rate (FDR). Constructing a test that controls the FDR under stochastic order raises several nontrivial problems, which are analyzed and solved in this paper. To make the test easier to apply, asymptotic p-values are used and their distributions are derived. It is shown that FDR control for this comparison procedure can be guaranteed. A real data example is used to illustrate how to apply this testing procedure and what the test can tell. Simulation results show that this testing procedure works quite well, better than some other tests.

7.
Summary.  In a large, prospective longitudinal study designed to monitor cardiac abnormalities in children born to women who are infected with the human immunodeficiency virus, instead of a single outcome variable, there are multiple binary outcomes (e.g. abnormal heart rate, abnormal blood pressure and abnormal heart wall thickness) considered as joint measures of heart function over time. In the presence of missing responses at some time points, longitudinal marginal models for these multiple outcomes can be estimated by using generalized estimating equations (GEEs), and consistent estimates can be obtained under the assumption of a missingness completely at random mechanism. When the missing data mechanism is missingness at random, i.e. the probability of missing a particular outcome at a time point depends on observed values of that outcome and the remaining outcomes at other time points, we propose joint estimation of the marginal models by using a single modified GEE based on an EM-type algorithm. The method proposed is motivated by the longitudinal study of cardiac abnormalities in children who were born to women infected with the human immunodeficiency virus, and analyses of these data are presented to illustrate the application of the method. Further, in an asymptotic study of bias, we show that, under a missingness at random mechanism in which missingness depends on all observed outcome variables, our joint estimation via the modified GEE produces almost unbiased estimates, provided that the correlation model has been correctly specified, whereas estimates from standard GEEs can lead to substantial bias.

8.
In many scientific fields, it is interesting and important to determine whether an observed data stream comes from a prespecified model or not, particularly when the number of data streams is large, so that multiple hypothesis testing is necessary. In this article, we consider large-scale model checking under certain dependence among different data streams observed at the same time. We propose a false discovery rate (FDR) control procedure to check those unusual data streams. Specifically, we derive an approximation of false discovery and construct a point estimate of the FDR. Theoretical results show that, under some mild assumptions, our proposed estimate of the FDR is simultaneously conservatively consistent with the true FDR, and hence it is an asymptotically strong control procedure. Simulation comparisons with some competing procedures show that our proposed FDR procedure behaves better in general settings. Application of our proposed FDR procedure is illustrated with the StarPlus fMRI data.

9.
Objectives in many longitudinal studies of individuals infected with the human immunodeficiency virus (HIV) include the estimation of population average trajectories of HIV ribonucleic acid (RNA) over time and tests for differences in trajectory across subgroups. Special features that are often inherent in the underlying data include a tendency for some HIV RNA levels to be below an assay detection limit, and for individuals with high initial levels or high rates of change to drop out of the study early because of illness or death. We develop a likelihood for the observed data that incorporates both of these features. Informative drop-outs are handled by means of an approach previously published by Schluchter. Using data from the HIV Epidemiology Research Study, we implement a maximum likelihood procedure to estimate initial HIV RNA levels and slopes within a population, compare these parameters across subgroups of HIV-infected women and illustrate the importance of appropriate treatment of left censoring and informative drop-outs. We also assess model assumptions and consider the prediction of random intercepts and slopes in this setting. The results suggest that marked bias in estimates of fixed effects, variance components and standard errors in the analysis of HIV RNA data might be avoided by the use of methods like those illustrated.

10.
Patients infected with the human immunodeficiency virus (HIV) generally experience a decline in their CD4 cell count (a count of certain white blood cells). We describe the use of quantile regression methods to analyse longitudinal data on CD4 cell counts from 1300 patients who participated in clinical trials that compared two therapeutic treatments: zidovudine and didanosine. It is of scientific interest to determine any treatment differences in the CD4 cell counts over a short treatment period. However, the analysis of the CD4 data is complicated by drop-outs: patients with lower CD4 cell counts at the base-line appear more likely to drop out at later measurement occasions. Motivated by this example, we describe the use of `weighted' estimating equations in quantile regression models for longitudinal data with drop-outs. In particular, the conventional estimating equations for the quantile regression parameters are weighted inversely proportionally to the probability of drop-out. This approach requires the process generating the missing data to be estimable but makes no assumptions about the distribution of the responses other than those imposed by the quantile regression model. This method yields consistent estimates of the quantile regression parameters provided that the model for drop-out has been correctly specified. The methodology proposed is applied to the CD4 cell count data and the results are compared with those obtained from an `unweighted' analysis. These results demonstrate how an analysis that fails to account for drop-outs can mislead.
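The weighting idea can be illustrated with a deliberately simplified sketch: the longitudinal structure is collapsed to a single follow-up occasion, the variable names are hypothetical, and the drop-out model is an ordinary logistic regression. Observed responses are weighted by the inverse of their estimated probability of being observed, and a weighted check (pinball) loss is minimised:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical data: baseline covariate x, follow-up response y, drop-out more
# likely for low x (missing at random given the observed baseline value).
n = 500
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.standard_t(df=3, size=n)
observed = rng.uniform(size=n) < 1 / (1 + np.exp(-(0.5 + 1.5 * x)))

# Step 1: model the drop-out process and form inverse-probability weights.
drop_model = LogisticRegression().fit(x.reshape(-1, 1), observed)
w = 1.0 / drop_model.predict_proba(x[observed].reshape(-1, 1))[:, 1]

# Step 2: weighted quantile regression via the check loss rho_tau(r) = r * (tau - 1{r < 0}).
def weighted_check_loss(beta, X, y, tau, w):
    r = y - X @ beta
    return np.sum(w * r * (tau - (r < 0)))

X_obs = np.column_stack([np.ones(observed.sum()), x[observed]])
fit = minimize(weighted_check_loss, x0=np.zeros(2),
               args=(X_obs, y[observed], 0.5, w), method="Nelder-Mead")
print("weighted median-regression coefficients:", np.round(fit.x, 2))
```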

11.
The objective of this paper is to describe methods for estimating current incidence rates for human immunodeficiency virus (HIV) that account for follow-up bias. Follow-up bias arises when the incidence rate among individuals in a cohort who return for follow-up is different from the incidence rate among those who do not return. The methods are based on the use of early markers of HIV infection such as p24 antigen. The first method, called the cross-sectional method, uses only data collected at an initial base-line visit. The method does not require follow-up data but does require a priori knowledge of the mean duration of the marker (μ). A confidence interval procedure is developed that accounts for uncertainty in μ. The second method combines the base-line data from all individuals together with follow-up data from those individuals who return for follow-up. This method has the distinct advantage of not requiring prior information about μ. Several confidence interval procedures for the incidence rate are compared by simulation. The methods are applied to a study in India to estimate current HIV incidence. These data suggest that the epidemic is growing rapidly in some subpopulations in India.
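A minimal numerical sketch of the cross-sectional idea, with made-up counts and an assumed marker duration: among individuals who are antibody negative at the base-line visit, the incidence rate is estimated as the prevalence of the early marker divided by its mean duration μ.

```python
# Illustrative numbers only; the assumed marker duration is hypothetical.
n_at_risk = 2000           # antibody-negative individuals screened at base-line
n_marker_positive = 8      # of these, positive on the early marker (e.g. p24 antigen)
mu_years = 22.5 / 365.25   # assumed mean duration of the marker phase, in years

incidence = (n_marker_positive / n_at_risk) / mu_years
print(f"estimated incidence: {incidence:.3f} infections per person-year")
```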

12.
Summary.  The main statistical problem in many epidemiological studies which involve repeated measurements of surrogate markers is the frequent occurrence of missing data. Standard likelihood-based approaches like the linear random-effects model fail to give unbiased estimates when data are non-ignorably missing. In human immunodeficiency virus (HIV) type 1 infection, two markers which have been widely used to track progression of the disease are CD4 cell counts and HIV–ribonucleic acid (RNA) viral load levels. Repeated measurements of these markers tend to be informatively censored, which is a special case of non-ignorable missingness. In such cases, we need to apply methods that jointly model the observed data and the missingness process. Despite their high correlation, longitudinal data of these markers have been analysed independently by using mainly random-effects models. Touloumi and co-workers have proposed a model termed the joint multivariate random-effects model which combines a linear random-effects model for the underlying pattern of the marker with a log-normal survival model for the drop-out process. We extend the joint multivariate random-effects model to model simultaneously the CD4 cell and viral load data while adjusting for informative drop-outs due to disease progression or death. Estimates of all the model's parameters are obtained by using the restricted iterative generalized least squares method, or a modified version of it that uses the EM algorithm as a nested algorithm in the case of censored survival data, also taking into account non-linearity in the HIV–RNA trend. The method proposed is evaluated and compared with simpler approaches in a simulation study. Finally the method is applied to a subset of the data from the 'Concerted action on seroconversion to AIDS and death in Europe' study.

13.
Traditional multiple hypothesis testing procedures fix an error rate and determine the corresponding rejection region. In 2002 Storey proposed a fixed rejection region procedure and showed numerically that it can gain more power than the fixed error rate procedure of Benjamini and Hochberg while controlling the same false discovery rate (FDR). In this paper it is proved that when the number of alternatives is small compared to the total number of hypotheses, Storey's method can be less powerful than that of Benjamini and Hochberg. Moreover, the two procedures are compared by setting them to produce the same FDR. The difference in power between Storey's procedure and that of Benjamini and Hochberg is near zero when the distance between the null and alternative distributions is large, but Benjamini and Hochberg's procedure becomes more powerful as the distance decreases. It is shown that modifying the Benjamini and Hochberg procedure to incorporate an estimate of the proportion of true null hypotheses as proposed by Black gives a procedure with superior power.
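Storey's fixed-rejection-region approach can be sketched as a point estimate of the FDR for a fixed p-value threshold t, using the usual λ-based estimate of the proportion of true nulls (the tuning constants below are illustrative; the paper's power comparison uses the full procedures, not this fragment):

```python
import numpy as np

def storey_fdr_estimate(pvals, t, lam=0.5):
    """Point estimate of the FDR of the fixed rejection region {p <= t} (Storey-type estimate)."""
    p = np.asarray(pvals)
    m = len(p)
    pi0_hat = np.sum(p > lam) / ((1.0 - lam) * m)    # estimated proportion of true nulls
    rejections = max(np.sum(p <= t), 1)              # number of rejections (avoid dividing by zero)
    return min(pi0_hat * m * t / rejections, 1.0)

rng = np.random.default_rng(3)
pvals = np.concatenate([rng.uniform(size=950), rng.beta(0.5, 20.0, size=50)])
print("estimated FDR at t = 0.01:", round(storey_fdr_estimate(pvals, t=0.01), 3))
```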

14.
Summary.  The paper considers the problem of multiple testing under dependence in a compound decision theoretic framework. The observed data are assumed to be generated from an underlying two-state hidden Markov model. We propose oracle and asymptotically optimal data-driven procedures that aim to minimize the false non-discovery rate FNR subject to a constraint on the false discovery rate FDR. It is shown that the performance of a multiple-testing procedure can be substantially improved by adaptively exploiting the dependence structure among hypotheses, and hence conventional FDR procedures that ignore this structural information are inefficient. Both theoretical properties and numerical performances of the procedures proposed are investigated. It is shown that the procedures proposed control FDR at the desired level, enjoy certain optimality properties and are especially powerful in identifying clustered non-null cases. The new procedure is applied to an influenza-like illness surveillance study for detecting the timing of epidemic periods.

15.
Summary.  In longitudinal studies missing data are the rule not the exception. We consider the analysis of longitudinal binary data with non-monotone missingness that is thought to be non-ignorable. In this setting a full likelihood approach is complicated algebraically and can be computationally prohibitive when there are many measurement occasions. We propose a 'protective' estimator that assumes that the probability that a response is missing at any occasion depends, in a completely unspecified way, on the value of that variable alone. Relying on this 'protectiveness' assumption, we describe a pseudolikelihood estimator of the regression parameters under non-ignorable missingness, without having to model the missing data mechanism directly. The method proposed is applied to CD4 cell count data from two longitudinal clinical trials of patients infected with the human immunodeficiency virus.

16.
Summary: One specific problem statistical offices and research institutes are faced with when releasing microdata is the preservation of confidentiality. Traditional methods to avoid disclosure often destroy the structure of the data, and information loss is potentially high. In this paper an alternative technique of creating scientific-use files is discussed, which reproduces the characteristics of the original data quite well. It is based on Fienberg (1997, 1994) who estimates and resamples from the empirical multivariate cumulative distribution function of the data in order to get synthetic data. The procedure creates data sets – the resample – which have the same characteristics as the original survey data. The paper includes some applications of this method with (a) simulated data and (b) innovation survey data, the Mannheim Innovation Panel (MIP), and a comparison between resampling and a common method of disclosure control (disturbance with multiplicative error) with regard to confidentiality on the one hand and the appropriateness of the disturbed data for different kinds of analyses on the other. The results show that univariate distributions can be better reproduced by unweighted resampling. Parameter estimates can be reproduced quite well if the resampling procedure implements the correlation structure of the original data as a scale or if the data is multiplicatively perturbed and a correction term is used. On average, anonymization of data with multiplicatively perturbed values protects better against re-identification than the various resampling methods used.
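The two anonymisation strategies compared in the paper can be caricatured in a few lines; the toy sketch below resamples each variable independently from its empirical marginal distribution (the paper's resampling works with the multivariate empirical distribution and can preserve the correlation structure) and contrasts it with a simple multiplicative perturbation:

```python
import numpy as np

rng = np.random.default_rng(4)
original = rng.multivariate_normal(mean=[10.0, 5.0], cov=[[4.0, 1.5], [1.5, 2.0]], size=1000)

# (a) Independent marginal resampling: univariate distributions are reproduced,
#     but the correlation between variables is destroyed.
resampled = np.column_stack([rng.choice(original[:, j], size=len(original)) for j in range(2)])

# (b) Multiplicative perturbation: every value is multiplied by random noise around 1.
perturbed = original * rng.normal(loc=1.0, scale=0.1, size=original.shape)

for name, data in [("original", original), ("resampled", resampled), ("perturbed", perturbed)]:
    print(f"{name:10s} means = {data.mean(axis=0).round(2)}, corr = {np.corrcoef(data.T)[0, 1]:.2f}")
```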

17.
Summary.  Estimates of the number of prevalent human immunodeficiency virus infections are used in England and Wales to monitor development of the human immunodeficiency virus–acquired immune deficiency syndrome epidemic and for planning purposes. The population is split into risk groups, and estimates of risk group size and of risk group prevalence and diagnosis rates are combined to derive estimates of the number of undiagnosed infections and of the overall number of infected individuals. In traditional approaches, each risk group size, prevalence or diagnosis rate parameter must be informed by just one summary statistic. Yet a rich array of surveillance and other data is available, providing information on parameters and on functions of parameters, and raising the possibility of inconsistency between sources of evidence in some parts of the parameter space. We develop a Bayesian framework for synthesis of surveillance and other information, implemented through Markov chain Monte Carlo methods. The sources of data are found to be inconsistent under their accepted interpretation, but the inconsistencies can be resolved by introducing additional 'bias adjustment' parameters. The best-fitting model incorporates a hierarchical structure to spread information more evenly over the parameter space. We suggest that multiparameter evidence synthesis opens new avenues in epidemiology based on the coherent summary of available data, assessment of consistency and bias modelling.

18.
In this paper we discuss graphical models for mixed types of continuous and discrete variables with incomplete data. We use a set of hyperedges to represent an observed data pattern. A hyperedge is a set of variables observed for a group of individuals. In a mixed graph with two types of vertices and two types of edges, dots and circles represent discrete and continuous variables respectively. A normal graph represents a graphical model and a hypergraph represents an observed data pattern. In terms of the mixed graph, we discuss decomposition of mixed graphical models with incomplete data, and we present a partial imputation method which can be used in the EM algorithm and the Gibbs sampler to speed their convergence. For a given mixed graphical model and an observed data pattern, we try to decompose a large graph into several small ones so that the original likelihood can be factored into a product of likelihoods with distinct parameters for small graphs. For the case that a graph cannot be decomposed due to its observed data pattern, we can impute missing data partially so that the graph can be decomposed.

19.
Several scientific questions are of interest in phase III trials of prophylactic human immunodeficiency virus (HIV) vaccines. In this paper we focus on some issues related to evaluating the direct protective effects of a vaccine in reducing susceptibility, VE_S, and its effect on reducing infectiousness, VE_I. Estimation of VE_I generally requires information on contacts between infective and susceptible individuals. By augmenting the primary participants of an HIV vaccine trial with their steady sexual partners, information can be collected that allows estimation of VE_I as well as VE_S. Exposure to infection information, however, may be expensive and difficult to collect. A vaccine trial design can include a small validation set with good exposure to infection data to correct bias in a larger, simpler main study with only coarse exposure data. The large main study increases the efficiency of the small validation set. More research into the combination of different levels of information in vaccine trial design will yield more efficient and less biased estimates of the efficacy measures of interest.
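A small numerical sketch of the two efficacy measures with made-up counts (in the partner-augmented design described above, VE_I would be estimated from transmission probabilities to the enrolled partners; the simple ratios below are purely illustrative):

```python
# Hypothetical attack rates among primary trial participants.
attack_vaccine = 20 / 1000      # infections per participant in the vaccine arm
attack_placebo = 50 / 1000      # infections per participant in the placebo arm
VE_S = 1 - attack_vaccine / attack_placebo     # efficacy in reducing susceptibility

# Hypothetical per-partnership transmission probabilities from infected participants
# to their steady sexual partners.
trans_from_vaccinated = 0.06
trans_from_placebo = 0.15
VE_I = 1 - trans_from_vaccinated / trans_from_placebo   # efficacy in reducing infectiousness

print(f"VE_S = {VE_S:.2f}, VE_I = {VE_I:.2f}")
```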

20.
There are still some open questions as to whether the existing step-up procedures for establishing superiority and equivalence of a new treatment compared with several standard treatments can strongly control the type I familywise error rate (FWE) at the designated level. In this paper we modify one of the three step-up procedures suggested by Dunnett and Tamhane (Statist. Med. 16 (1997), 2489–2506) and then prove that the modified procedure strongly controls the FWE. The method for evaluating the critical values of the modified procedure is also discussed. A simulation study reveals that the modified procedure is generally more powerful than the original procedure.
