Similar Literature
20 similar articles found.
1.
A stochastic multitype model for the spread of an infectious disease in a community of heterogeneous individuals is analysed. In particular, estimates of R0 (the basic reproduction number) and the critical vaccination coverage are derived, where estimation is based on final-size data of an outbreak in the community. It is shown that these key parameters cannot be estimated consistently from such data; only upper and lower bounds can be estimated. Confidence regions for the upper bounds are derived, thus giving conservative estimates of R0 and the fractions necessary to vaccinate.
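As a back-of-the-envelope illustration of how an upper bound on R0 translates into a conservative vaccination target (this is the classical homogeneous-mixing relation, not the paper's multitype estimator; the R0 values are hypothetical):

```python
def critical_vaccination_coverage(r0):
    """Fraction to vaccinate so the effective reproduction number drops to 1."""
    if r0 <= 1.0:
        return 0.0            # no large outbreak possible without vaccination
    return 1.0 - 1.0 / r0

# Hypothetical point estimate and estimated upper bound for R0.
r0_hat, r0_upper = 1.8, 2.4
print(critical_vaccination_coverage(r0_hat))    # ~0.44
print(critical_vaccination_coverage(r0_upper))  # ~0.58, the conservative target
```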

2.
The design of infectious disease studies has received little attention because they are generally viewed as observational studies: epidemic and endemic disease transmission happens and we observe it. We argue here that statistical design often provides useful guidance for such studies with regard to the type of data and the size of the data set to be collected. It is shown that data on disease transmission in part of the community enable the estimation of central parameters and make it possible to compute the sample size required for inferences with a desired precision. We illustrate this for data on disease transmission in a single community of uniformly mixing individuals and for data on outbreak sizes in households. Data on disease transmission are usually incomplete, which creates an identifiability problem for certain parameters of multitype epidemic models. We identify designs that can overcome this problem for the important objective of estimating parameters that help to assess the effectiveness of a vaccine. With disease transmission in animal groups there is greater scope for conducting planned experiments, and we explore some possibilities for such experiments. The topic is largely unexplored, and numerous open research problems in the statistical design of infectious disease studies are mentioned.
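As a hedged illustration of the sample-size idea, a generic normal-approximation precision calculation (not the epidemic-specific formulas the paper develops; the inputs are hypothetical):

```python
import math

def sample_size_for_precision(sigma, half_width, conf=0.95):
    """Smallest n giving a conf-level confidence interval for a mean with
    the requested half-width, via the normal approximation."""
    z = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}[conf]
    return math.ceil((z * sigma / half_width) ** 2)

# e.g. estimating a mean transmission-related quantity with sd around 2.0
# to within +/- 0.5 at 95% confidence:
print(sample_size_for_precision(sigma=2.0, half_width=0.5))   # 62
```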

3.
A model with nonrandom latent and infectious periods is suggested for epidemics in a large community. This permits a relatively complete statistical analysis of data from the spread of a single epidemic. An attractive feature of such models is the possibility of exploring how the rate of spread of the disease depends on the number of susceptibles and infectives. An application to smallpox data is included.

4.
孙怡帆 et al., 《统计研究》 (Statistical Research), 2019, 36(3): 124-128
Identifying disease genes from among a large number of genes is an important high-dimensional statistical problem in the era of big data. Because genes are organized in network structures, the identification of disease genes has expanded from identifying single genes to identifying gene modules. Mining gene modules from a gene network is the so-called community detection (or node clustering) problem. The vast majority of community detection methods use only network structure information and ignore the information carried by the nodes themselves. In 2016, Newman and Clauset proposed a statistically grounded community detection method (the NC method) that organically combines the two. Taking the NC method as a case study, this paper introduces the application of statistical methods to real gene networks and the results obtained, and proposes improvements from a statistical perspective. The analysis of the NC method shows that, for unstructured data typified by gene networks, statistical ideas and principles remain at the core of data analysis, while the corresponding statistical methods need to be adjusted and optimized for the characteristics of the data and the questions of interest.

5.
Much recent methodological progress in the analysis of infectious disease data has been due to Markov chain Monte Carlo (MCMC) methodology. In this paper, it is illustrated that rejection sampling can also be applied to a family of inference problems in the context of epidemic models, avoiding the issues of convergence associated with MCMC methods. Specifically, we consider models for epidemic data arising from a population divided into households. The models allow individuals to be potentially infected both from outside and from within the household. We develop methodology for selection between competing models via the computation of Bayes factors. We also demonstrate how an initial sample can be used to adjust the algorithm and improve efficiency. The data are assumed to consist of the final numbers ultimately infected within a sample of households in some community. The methods are applied to data taken from outbreaks of influenza.
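A minimal sketch of exact rejection sampling for household final-size data, under simplifying assumptions: a chain-binomial household model in which outside infection only seeds the household, Uniform(0,1) priors, and exact matching of the observed final-size counts. The household size, counts and priors are hypothetical:

```python
import random
from collections import Counter

def final_size(n, p_within, p_outside):
    """Final number infected in a household of size n under a simplified
    chain-binomial model in which outside infection only seeds the household."""
    infected = sum(random.random() < p_outside for _ in range(n))
    susceptible = n - infected
    new = infected
    while new > 0 and susceptible > 0:
        # each susceptible must escape every current infective to stay healthy
        fresh = sum(random.random() < 1 - (1 - p_within) ** new
                    for _ in range(susceptible))
        susceptible -= fresh
        infected += fresh
        new = fresh
    return infected

# Hypothetical final-size counts for 20 households of size 3.
observed = Counter({0: 8, 1: 6, 2: 4, 3: 2})

posterior = []
while len(posterior) < 100:
    p_w, p_o = random.random(), random.random()    # Uniform(0,1) priors
    sim = Counter(final_size(3, p_w, p_o) for _ in range(20))
    if sim == observed:          # exact match => an exact posterior draw
        posterior.append((p_w, p_o))
# Acceptance can be slow; the paper shows how an initial sample can be
# used to adjust the algorithm and improve efficiency.
```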

6.
The field of genetic epidemiology is growing rapidly with the realization that many important diseases are influenced by both genetic and environmental factors. For this reason, pedigree data are becoming increasingly valuable as a means of studying patterns of disease occurrence. Analysis of pedigree data is complicated by the lack of independence among family members and by the non-random sampling schemes used to ascertain families. An additional complicating factor is the variability in age at disease onset from one person to another. In developing statistical methods for analysing pedigree data, analytic results are often intractable, making simulation studies imperative for assessing the performance of proposed methods and estimators. In this paper, an algorithm is presented for simulating disease data in pedigrees, incorporating variable age at onset and genetic and environmental effects. Computational formulas are developed in the context of a proportional hazards model and assuming single ascertainment of families, but the methods can be easily generalized to alternative models. The algorithm is computationally efficient, making multi-dataset simulation studies feasible. Numerical examples are provided to demonstrate the methods.
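A minimal sketch of the core simulation step under assumed ingredients (a Weibull baseline hazard and hypothetical covariates and coefficients); the paper's algorithm additionally handles single ascertainment of families:

```python
import math
import random

def simulate_onset_age(x_genetic, x_env, beta_g=1.0, beta_e=0.5,
                       shape=3.0, scale=70.0):
    """Age at disease onset from a proportional hazards model with a Weibull
    baseline, drawn by inverting the survival function."""
    lin_pred = beta_g * x_genetic + beta_e * x_env
    u = random.random()
    # S(t) = exp(-(t/scale)**shape * exp(lin_pred)); solve S(T) = 1 - u for T
    return scale * (-math.log(1.0 - u) * math.exp(-lin_pred)) ** (1.0 / shape)

# A hypothetical sib pair sharing a risk genotype but differing in exposure.
ages = [simulate_onset_age(x_genetic=1, x_env=e) for e in (0, 1)]
print(ages)
```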

7.
The analysis of infectious disease data presents challenges arising from the dependence in the data and the fact that only part of the transmission process is observable. These difficulties are usually overcome by making simplifying assumptions. The paper explores the use of Markov chain Monte Carlo (MCMC) methods for the analysis of infectious disease data, with the hope that they will permit analyses to be made under more realistic assumptions. Two important kinds of data sets are considered, containing temporal and non-temporal information, from outbreaks of measles and influenza. Stochastic epidemic models are used to describe the processes that generate the data. MCMC methods are then employed to perform inference in a Bayesian context for the model parameters. The MCMC methods used include standard algorithms, such as the Metropolis–Hastings algorithm and the Gibbs sampler, as well as a new method that involves likelihood approximation. It is found that standard algorithms perform well in some situations but can exhibit serious convergence difficulties in others. The inferences that we obtain are in broad agreement with estimates obtained by other methods where they are available. However, we can also provide inferences for parameters which have not been reported in previous analyses.
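A minimal random-walk Metropolis–Hastings sketch with a toy one-parameter target (the prior, likelihood and data below are hypothetical, not the epidemic models analysed in the paper):

```python
import math
import random

def metropolis_hastings(log_post, init, n_iter=10_000, step=0.1):
    """Random-walk Metropolis-Hastings for a single positive parameter."""
    x, lp = init, log_post(init)
    chain = []
    for _ in range(n_iter):
        prop = x + random.gauss(0.0, step)             # symmetric proposal
        lp_prop = log_post(prop) if prop > 0 else -math.inf
        if random.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = prop, lp_prop                      # accept the move
        chain.append(x)
    return chain

# Toy target: transmission rate b with an Exp(1) prior and hypothetical
# exponentially distributed "escape time" observations.
data = [0.8, 1.3, 0.4, 2.1, 0.9]
log_post = lambda b: -b + len(data) * math.log(b) - b * sum(data)
chain = metropolis_hastings(log_post, init=1.0)
print(sum(chain[2000:]) / len(chain[2000:]))           # posterior mean after burn-in
```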

8.
The Course of Theoretical Research and Practice on the Quality of China's Statistical Data
金勇进, 陶然, 《统计研究》 (Statistical Research), 2010, 27(1): 62-67
By reviewing the theoretical research and practical achievements concerning the quality of China's statistical data since the reform and opening-up, this paper traces the threads of thirty years of theory and practice on statistical data quality. On the basis of summarizing the theoretical and practical achievements of these thirty years, it analyzes the problems that remain and the challenges ahead.

9.
10.
We present a novel methodology for a comprehensive statistical analysis of approximately periodic biosignal data. There are two main challenges in such analysis: (1) the automatic extraction (segmentation) of cycles from long, cyclostationary biosignals and (2) the subsequent statistical analysis, which in many cases involves the separation of temporal and amplitude variabilities. The proposed framework provides a principled approach for statistical analysis of such signals, which in turn allows for an efficient cycle segmentation algorithm. This is achieved using a convenient representation of functions called the square-root velocity function (SRVF). The segmented cycles, represented by SRVFs, are temporally aligned using the notion of the Karcher mean, which in turn allows for more efficient statistical summaries of signals. We show the strengths of this method through various disease classification experiments. In the case of myocardial infarction detection and localization, we show that our method compares favorably to methods described in the current literature.
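A minimal sketch of the SRVF representation named in the abstract (cycle segmentation and Karcher-mean alignment are omitted; the signals are synthetic):

```python
import numpy as np

def srvf(f, t):
    """Square-root velocity function: q(t) = f'(t) / sqrt(|f'(t)|)."""
    df = np.gradient(f, t)
    return np.sign(df) * np.sqrt(np.abs(df))

t = np.linspace(0.0, 1.0, 200)
f1 = np.sin(2 * np.pi * t)           # a reference cycle
f2 = np.sin(2 * np.pi * t ** 1.2)    # the same cycle under a time warp
q1, q2 = srvf(f1, t), srvf(f2, t)
# Warping acts on SRVFs by isometry, which is what makes alignment of
# segmented cycles to a Karcher mean well defined.
print(np.linalg.norm(q1 - q2))
```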

11.
Bayesian inference for partially observed stochastic epidemics
The analysis of infectious disease data is usually complicated by the fact that real life epidemics are only partially observed. In particular, data concerning the process of infection are seldom available. Consequently, standard statistical techniques can become too complicated to implement effectively. In this paper Markov chain Monte Carlo methods are used to make inferences about the missing data as well as the unknown parameters of interest in a Bayesian framework. The methods are applied to real life data from disease outbreaks.

12.
The paper is concerned with new methodology for statistical inference for final outcome infectious disease data using certain structured population stochastic epidemic models. A major obstacle to inference for such models is that the likelihood is both analytically and numerically intractable. The approach that is taken here is to impute missing information in the form of a random graph that describes the potential infectious contacts between individuals. This level of imputation overcomes various constraints of existing methodologies and yields more detailed information about the spread of disease. The methods are illustrated with both real and test data.
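The final-size/random-graph correspondence behind such imputation can be sketched on a toy homogeneous model (an Erdős–Rényi graph of potential infectious contacts, which is a simplification; the paper's imputed graph is informed by the epidemic model and the data):

```python
import random

def final_size_on_graph(n=200, p_contact=0.015, initial=0):
    """Final outcome of an epidemic read off a graph of potential infectious
    contacts: everyone in the connected component of the initial case."""
    adj = {i: [] for i in range(n)}
    for i in range(n):                    # toy Erdos-Renyi contact graph
        for j in range(i + 1, n):
            if random.random() < p_contact:
                adj[i].append(j)
                adj[j].append(i)
    seen, stack = {initial}, [initial]
    while stack:                          # traverse the initial case's component
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen)

print(final_size_on_graph())
```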

13.
Different longitudinal study designs require different statistical analysis methods and different methods of sample size determination. Statistical power analysis is a flexible approach to sample size determination for longitudinal studies, but each statistical test calls for its own power analysis because the underlying statistical methods differ. In this paper, simulation-based power calculations for F-tests with the Containment, Kenward-Roger or Satterthwaite approximation of degrees of freedom are examined for sample size determination in the context of a special case of linear mixed models (LMMs), which is frequently used in the analysis of longitudinal data. The effects of several factors on the statistical power of approximate F-tests in the LMM are examined together, which has not been done previously: the variance–covariance structure of the random effects [unstructured UN or factor-analytic FA0], the autocorrelation structure among errors over time [independent IND, first-order autoregressive AR1 or first-order moving average MA1], the parameter estimation method [maximum likelihood ML or restricted maximum likelihood REML] and the iterative algorithm [ridge-stabilized Newton-Raphson or quasi-Newton]. The variance–covariance structure of the random effects is found to be the factor with the greatest effect on statistical power. The simulation-based analysis in this study thus gives useful insight into the statistical power of approximate F-tests for fixed effects in LMMs for longitudinal data.
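A minimal sketch of the simulation-based power loop for a longitudinal LMM. Two hedges: statsmodels reports Wald tests for fixed effects rather than the Containment/Kenward-Roger/Satterthwaite F-tests studied in the paper, and all design values below are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

def simulate_trial(n_per_arm=30, n_times=4, effect=0.5, sd_id=1.0, sd_eps=1.0):
    """One longitudinal data set: random intercepts per subject, a common
    time trend, and a treatment-by-time interaction of size `effect`."""
    rows = []
    for arm in (0, 1):
        for i in range(n_per_arm):
            b = rng.normal(0.0, sd_id)                 # random intercept
            for t in range(n_times):
                y = b + 0.2 * t + effect * arm * t + rng.normal(0.0, sd_eps)
                rows.append((f"{arm}-{i}", arm, t, y))
    return pd.DataFrame(rows, columns=["id", "arm", "time", "y"])

def simulated_power(n_sim=200, alpha=0.05):
    """Fraction of simulated trials in which the interaction is detected."""
    hits = 0
    for _ in range(n_sim):
        fit = smf.mixedlm("y ~ time * arm", simulate_trial(), groups="id").fit()
        hits += fit.pvalues["time:arm"] < alpha        # Wald test
    return hits / n_sim

print(simulated_power())
```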

14.
This article consists of a review and some remarks on the scope, common models, and methods for the analysis of lifetime data, together with their limitations and implications. A new approach based upon data transformations, analogous to that of Box and Cox (1964), is also introduced. The basic methods and theory of the subject are most commonly encountered by the statistical community in the context of problems in reliability studies and survival analysis. However, they are also useful in areas of statistical application such as goodness-of-fit and approximations for sampling distributions, and they are applicable in such diverse fields of applied research as economics, finance, sociology, meteorology and hydrology. The discussion includes examples from the mainstream statistical, social sciences and business literature.
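A minimal illustration of the standard Box-Cox transformation that the article's new approach is analogous to (the lifetime data are synthetic; scipy chooses the transformation parameter by maximum likelihood):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lifetimes = rng.weibull(1.5, size=200) * 1000.0   # hypothetical skewed lifetimes

# Box-Cox: y(lam) = (y**lam - 1) / lam for lam != 0, and log(y) at lam = 0;
# scipy selects lam by maximum likelihood.
transformed, lam_hat = stats.boxcox(lifetimes)
print(lam_hat)
print(stats.shapiro(transformed).pvalue)          # much closer to normal
```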

15.
This article considers the problem of cardinality estimation in data stream applications. We present a statistical analysis of probabilistic counting algorithms, focusing on two techniques that use pseudo-random variates to form low-dimensional data sketches. We apply conventional statistical methods to compare probabilistic algorithms based on storing either selected order statistics, or random projections. We derive estimators of the cardinality in both cases, and show that the maximal-term estimator is recursively computable and has exponentially decreasing error bounds. Furthermore, we show that the estimators have comparable asymptotic efficiency, and explain this result by demonstrating an unexpected connection between the two approaches.
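A minimal sketch of the order-statistics flavour of probabilistic counting: a k-minimum-values sketch with the standard (k - 1) / x_(k) estimator. The hash construction and parameters are illustrative, not the paper's exact algorithms:

```python
import hashlib
import heapq

def uniform_hash(item):
    """Deterministic pseudo-random Uniform(0,1) variate for an item."""
    digest = hashlib.md5(item.encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def kmv_cardinality(stream, k=256):
    """k-minimum-values sketch: keep the k smallest distinct hash values and
    estimate the number of distinct items as (k - 1) / x_(k)."""
    kept, heap = set(), []            # heap holds negated values (max-heap)
    for item in stream:
        x = uniform_hash(item)
        if x in kept:
            continue                  # duplicate of a kept value
        if len(kept) < k:
            kept.add(x)
            heapq.heappush(heap, -x)
        elif x < -heap[0]:            # smaller than the current k-th smallest
            kept.remove(-heapq.heappop(heap))
            kept.add(x)
            heapq.heappush(heap, -x)
    if len(kept) < k:
        return len(kept)              # fewer than k distinct items: exact
    return (k - 1) / -heap[0]         # order-statistics estimator

stream = (f"user-{i % 10_000}" for i in range(100_000))   # 10,000 distinct
print(kmv_cardinality(stream))        # approximately 10,000
```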

16.
Functional data are observed frequently in many scientific fields, and most standard statistical methods are therefore being adapted to functional data. The multivariate analysis of variance (MANOVA) problem for functional data is considered; it is of similar practical interest to the one-way analysis of variance for such data. For the MANOVA problem for multivariate functional data, we propose permutation tests based on a basis function representation and tests based on random projections. Their performance is examined in comprehensive simulation studies, which give an idea of the size control and power of the tests and identify differences between them. The simulation experiments are based on artificial data and on real labeled multivariate time series data from the literature. The results suggest that the studied testing procedures can detect small differences between vectors of curves even with small sample sizes. Illustrative real data examples of the use of the proposed testing procedures in practice are also presented.
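A minimal sketch of the random-projection idea for two groups of univariate curves (one projection and a mean-difference statistic; the paper's procedures use multivariate functional data, several projections and MANOVA test statistics):

```python
import numpy as np

rng = np.random.default_rng(2)

def projection_permutation_test(curves_a, curves_b, n_perm=999):
    """Project curves onto a random direction, then permutation-test the
    difference in projected group means."""
    n_grid = curves_a.shape[1]
    u = rng.normal(size=n_grid)
    u /= np.linalg.norm(u)                           # random projection
    scores = np.concatenate([curves_a @ u, curves_b @ u])
    n_a = len(curves_a)
    observed = abs(scores[:n_a].mean() - scores[n_a:].mean())
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(scores)
        count += abs(perm[:n_a].mean() - perm[n_a:].mean()) >= observed
    return (count + 1) / (n_perm + 1)

# Hypothetical samples of curves on a common grid: group b is shifted up.
grid = np.linspace(0.0, 1.0, 50)
a = np.sin(2 * np.pi * grid) + rng.normal(0, 0.3, (20, 50))
b = np.sin(2 * np.pi * grid) + 0.4 + rng.normal(0, 0.3, (20, 50))
print(projection_permutation_test(a, b))             # small p-value expected
```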

17.
In pre-clinical oncology studies, tumor-bearing animals are treated and observed over a period of time in order to measure and compare the efficacy of one or more cancer-intervention therapies along with a placebo/standard-of-care group. A data analysis is typically carried out by modeling and comparing tumor volumes, functions of tumor volumes, or survival. Analysis of tumor volumes is complicated because animals under observation may be euthanized prior to the end of the study for one or more reasons, such as when an animal's tumor volume exceeds an upper threshold. In such a case, the tumor volume is missing not-at-random for the time remaining in the study. To work around the non-random missingness, several statistical methods have been proposed in the literature, including the rate of change in log tumor volume and the partial area under the curve. In this work, the test size and statistical power of these and other popular methods for the analysis of tumor volume data are examined and compared through realistic Monte Carlo computer simulations. The performance, advantages, and drawbacks of popular statistical methods for animal oncology studies are reported, and the recommended methods are applied to a real data set.
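A minimal sketch of one of the work-around methods named above, the rate of change in log tumor volume: each animal contributes the OLS slope of log volume on time computed from its observed (pre-euthanasia) measurements, and the slopes are compared across arms. All data below are hypothetical:

```python
import numpy as np
from scipy import stats

def log_growth_rate(days, volumes):
    """Per-animal rate of change in log tumor volume: the OLS slope of
    log(volume) on time, using whatever time points were observed."""
    return np.polyfit(days, np.log(volumes), 1)[0]

# Hypothetical animals; the second control animal hit the volume
# threshold early and was euthanized, so its series is shorter.
control = [([0, 3, 7, 10, 14], [100, 180, 390, 700, 1400]),
           ([0, 3, 7],         [120, 260, 640])]
treated = [([0, 3, 7, 10, 14], [110, 150, 220, 300, 420]),
           ([0, 3, 7, 10, 14], [ 90, 130, 170, 260, 330])]

rates_c = [log_growth_rate(d, v) for d, v in control]
rates_t = [log_growth_rate(d, v) for d, v in treated]
print(stats.ttest_ind(rates_c, rates_t, equal_var=False))
```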

18.
Considerable statistical research has been performed in recent years to develop sophisticated statistical methods for handling missing data and dropouts in the analysis of clinical trial data. However, if statisticians and other study team members proactively set out at the trial initiation stage to assess the impact of missing data and investigate ways to reduce dropouts, there is considerable potential to improve the clarity and quality of trial results and also increase efficiency. This paper presents a Human Immunodeficiency Virus (HIV) case study where statisticians led a project to reduce dropouts. The first step was to perform a pooled analysis of past HIV trials investigating which patient subgroups are more likely to drop out. The second step was to educate internal and external trial staff at all levels about the patient types more likely to drop out, and the impact this has on data quality and the sample sizes required. The final step was to work collaboratively with clinical trial teams to create proactive plans regarding focused retention efforts, identifying ways to increase retention particularly in patients most at risk. It is acknowledged that identifying the specific impact of new patient retention efforts/tools is difficult because patient retention can be influenced by overall study design, investigational product tolerability profile, current standard of care and treatment access for the disease under study, which may vary over time. However, the implementation of new retention strategies and efforts within clinical trial teams attests to the influence of the analyses described in this case study. Copyright © 2012 John Wiley & Sons, Ltd.

19.

Scientific research of all kinds should be guided by statistical thinking: in the design and conduct of the study, in the disciplined exploration and enlightened display of the data, and to avoid statistical pitfalls in the interpretation of the results. However, formal, probability-based statistical inference should play no role in most scientific research, which is inherently exploratory, requiring flexible methods of analysis that inherently risk overfitting. The nature of exploratory work is that data are used to help guide model choice, and under these circumstances, uncertainty cannot be precisely quantified, because of the inevitable model selection bias that results. To be valid, statistical inference should be restricted to situations where the study design and analysis plan are specified prior to data collection. Exploratory data analysis provides the flexibility needed for most other situations, including statistical methods that are regularized, robust, or nonparametric. Of course, no individual statistical analysis should be considered sufficient to establish scientific validity: research requires many sets of data along many lines of evidence, with a watchfulness for systematic error. Replicating and predicting findings in new data and new settings is a stronger way of validating claims than blessing results from an isolated study with statistical inferences.

20.
A simple approach for analyzing longitudinally measured biomarkers is to calculate a summary measure such as the area under the curve (AUC) for each individual and then compare the mean AUC between treatment groups using methods such as the t test. This two-step approach is difficult to implement when there are missing data, since the AUC cannot be calculated directly for individuals with missing measurements. Simple methods for dealing with missing data include complete-case analysis and imputation. A recent study showed that the estimated mean AUC difference between treatment groups based on a linear mixed model (LMM), rather than on individually calculated AUCs with simple imputation, has negligible bias under missing-at-random assumptions and only small bias when missingness is not at random. However, this model assumes the outcome to be normally distributed, which is often violated in biomarker data. In this paper, we propose to fit a LMM to log-transformed biomarkers, based on which statistical inference for the ratio, rather than the difference, of AUC between treatment groups is provided. The proposed method can not only handle potential baseline imbalance in a randomized trial but also circumvent the estimation of the nuisance variance parameters of the log-normal model. The proposed model is applied to a recently completed large randomized trial studying the effect of nicotine reduction on biomarker exposure of smokers.
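A minimal sketch of the log-scale LMM idea under a strong simplifying assumption: a constant treatment effect over time, so that the ratio of (geometric-mean) AUCs between arms reduces to the exponentiated arm coefficient. The paper's model is more general and also handles baseline imbalance; the data below are synthetic:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Hypothetical log-normal biomarker: 4 visits, 40 subjects per arm, and a
# treatment lowering exposure by ~30% at every visit (a log(0.7) shift).
rows = []
for arm in (0, 1):
    for i in range(40):
        b = rng.normal(0.0, 0.3)                 # subject-level random effect
        for t in range(4):
            log_y = 2.0 + 0.1 * t + np.log(0.7) * arm + b + rng.normal(0.0, 0.4)
            rows.append((f"{arm}-{i}", arm, t, log_y))
df = pd.DataFrame(rows, columns=["id", "arm", "time", "log_y"])

fit = smf.mixedlm("log_y ~ time + arm", df, groups="id").fit()
print(np.exp(fit.params["arm"]))   # ~0.7: estimated ratio of AUCs between arms
```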

