Similar literature
20 similar documents found (search time: 31 ms)
1.
Ordinary differential equations (ODEs) are popular tools for modeling complicated dynamic systems in many areas. When multiple replicates of measurements are available for the dynamic process, it is of great interest to estimate mixed effects in the ODE model for the process. We propose a semiparametric method to estimate mixed-effects ODE models. Rather than using the ODE numerical solution directly, which requires providing initial conditions, this method estimates a spline function to approximate the dynamic process using smoothing splines. A roughness penalty term is defined using the ODEs, which measures the fidelity of the spline function to the ODEs. The smoothing parameter, which controls the trade-off between fitting the data and maintaining fidelity to the ODEs, can be specified by users or selected objectively by generalized cross validation. The spline coefficients, the ODE random effects, and the ODE fixed effects are estimated in three nested levels of optimization. Two simulation studies show that the proposed method obtains good estimates for mixed-effects ODE models. The semiparametric method is demonstrated with an application of a pharmacokinetic model in a study of HIV combination therapy.
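The nested optimization described in this abstract can be illustrated with a much-reduced sketch. The assumptions here are not from the abstract: a single subject, a single fixed-effect parameter `theta`, the toy ODE x'(t) = -theta * x(t), and a curve represented by its values on an equally spaced grid rather than by a spline basis. The inner level fits the curve by penalized least squares with the ODE residual as the roughness penalty; the outer level chooses `theta` by the data misfit alone. All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def fit_curve(theta, t_grid, obs_idx, y, lam):
    """Inner level: estimate the curve on t_grid for a fixed ODE parameter.

    Minimizes sum_i (y_i - x(t_i))^2 + lam * integral (x'(t) + theta*x(t))^2 dt,
    with the integral approximated at the grid midpoints.
    """
    n = len(t_grid)
    h = t_grid[1] - t_grid[0]            # assumes an equally spaced grid
    upper, lower = np.eye(n, k=1)[:-1], np.eye(n)[:-1]
    D = (upper - lower) / h              # finite-difference derivative
    M = (upper + lower) / 2.0            # curve value at each midpoint
    R = D + theta * M                    # ODE residual operator
    A = np.eye(n)[obs_idx]               # selects the observed grid points
    return np.linalg.solve(A.T @ A + lam * h * (R.T @ R), A.T @ y)


def data_misfit(theta, t_grid, obs_idx, y, lam):
    """Outer criterion: squared error of the inner-level curve at the data."""
    x_hat = fit_curve(theta, t_grid, obs_idx, y, lam)
    return float(np.sum((y - x_hat[obs_idx]) ** 2))


def estimate_theta(t_grid, obs_idx, y, lam, bounds=(0.01, 3.0)):
    """Outer level: choose theta by minimizing the data misfit."""
    res = minimize_scalar(data_misfit, bounds=bounds, method="bounded",
                          args=(t_grid, obs_idx, y, lam))
    return res.x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t_grid = np.linspace(0.0, 5.0, 101)
    obs_idx = np.arange(0, 101, 4)       # observe every fourth grid point
    y = 2.0 * np.exp(-0.7 * t_grid[obs_idx]) + rng.normal(0.0, 0.02, obs_idx.size)
    print("estimated theta:", estimate_theta(t_grid, obs_idx, y, lam=1e3))
```

No initial condition is supplied anywhere: the curve's starting value is estimated along with the rest of the curve, which is the point the abstract makes about avoiding direct numerical solution. A larger `lam` pulls the curve toward an exact ODE solution, a smaller one toward the data.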

2.
Clustering gene expression time course data is an important problem in bioinformatics because understanding which genes behave similarly can lead to the discovery of important biological information. Statistically, the problem of clustering time course data is a special case of the more general problem of clustering longitudinal data. In this paper, a very general and flexible model-based technique is used to cluster longitudinal data. Mixtures of multivariate t-distributions are utilized, with a linear model for the mean and a modified Cholesky-decomposed covariance structure. Constraints are placed upon the covariance structure, leading to a novel family of mixture models, including parsimonious models. In addition to model-based clustering, these models are also used for model-based classification, i.e., semi-supervised clustering. Parameters, including the component degrees of freedom, are estimated using an expectation-maximization algorithm and two different approaches to model selection are considered. The models are applied to simulated data to illustrate their efficacy; this includes a comparison with their Gaussian analogues (the use of these Gaussian analogues with a linear model for the mean is novel in itself). Our family of multivariate t mixture models is then applied to two real gene expression time course data sets and the results are discussed. We conclude with a summary, suggestions for future work, and a discussion about constraining the degrees of freedom parameter.
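For readers unfamiliar with the modified Cholesky decomposition mentioned above, the following sketch computes it for a single covariance matrix. It factors a covariance matrix Sigma so that T Sigma T' = D, with T unit lower triangular and D diagonal. The function name and the example matrix are illustrative assumptions, and the sketch covers only the decomposition, not the mixture model or its constraints.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, D) with T unit lower triangular, D diagonal, T @ sigma @ T.T = D."""
    L = np.linalg.cholesky(sigma)   # sigma = L @ L.T, L lower triangular
    d_sqrt = np.diag(L)             # positive diagonal of L
    C = L / d_sqrt                  # scale column j by 1/d_sqrt[j]: unit diagonal
    T = np.linalg.inv(C)            # inverse of unit lower triangular is the same type
    D = np.diag(d_sqrt ** 2)
    return T, D


if __name__ == "__main__":
    # Hypothetical AR(1)-style covariance for four time points.
    times = np.arange(4)
    sigma = 0.6 ** np.abs(times[:, None] - times[None, :])
    T, D = modified_cholesky(sigma)
    print(np.round(T, 3))
    print(np.round(np.diag(D), 3))
```

For longitudinal data, the below-diagonal entries of T can be read as negated coefficients from regressing each time point on its predecessors, and the diagonal of D as the corresponding innovation variances, which is why constraints on T and D are a natural way to build parsimonious covariance structures.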

3.
In the field of molecular biology, it is often of interest to analyze microarray data for clustering genes based on similar profiles of gene expression to identify genes that are differentially expressed under multiple biological conditions. One of the notable characteristics of a gene expression profile is that it shows a cyclic curve over a course of time. To group sequences of similar molecular functions, we propose a Bayesian Dirichlet process mixture of linear regression models with a Fourier series for the regression coefficients, for each of which a spike and slab prior is assumed. A full Gibbs-sampling algorithm is developed for an efficient Markov chain Monte Carlo (MCMC) posterior computation. Due to the so-called “label-switching” problem and different numbers of clusters during the MCMC computation, the post-processing approach of Fritsch and Ickstadt (2009) is additionally applied to the MCMC samples to obtain an optimal single clustering estimate by maximizing the posterior expected adjusted Rand index with the posterior probabilities of two observations being clustered together. The proposed method is illustrated with two simulated data sets and one real data set on the physiological response of fibroblasts to serum from Iyer et al. (1999).
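The "posterior probabilities of two observations being clustered together" can be estimated directly from MCMC label draws, and they are unaffected by label switching because only co-membership matters. The sketch below shows that estimate only; it is not the Fritsch and Ickstadt optimization itself, and the function name and toy draws are assumptions.

```python
import numpy as np


def posterior_similarity(label_draws):
    """Estimate P(items i and j share a cluster) from MCMC label draws.

    label_draws: array of shape (n_draws, n_items) of cluster labels.
    """
    draws = np.asarray(label_draws)
    same = draws[:, :, None] == draws[:, None, :]   # (n_draws, n_items, n_items)
    return same.mean(axis=0)


if __name__ == "__main__":
    # Three hypothetical draws for three genes; the second draw uses
    # swapped labels relative to the first but encodes the same partition.
    draws = [[0, 0, 1],
             [1, 1, 0],
             [0, 1, 1]]
    print(posterior_similarity(draws))
```

A single clustering estimate is then chosen to score well against this matrix, for example by maximizing the posterior expected adjusted Rand index as the abstract describes.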

4.
HIV dynamic models, a set of ordinary differential equations (ODEs), have provided new understanding of the pathogenesis of HIV infection and the treatment effects of antiviral therapies. However, estimating parameters for ODEs is very challenging due to the complexity of this nonlinear system. In this article, we propose a comprehensive procedure to deal with this issue. In the proposed procedure, a series of cutting-edge statistical methods and techniques are employed, including nonparametric mixed-effects smoothing-based methods for ODE models and a stochastic approximation expectation–maximization (EM) approach for mixed-effects ODE models. A simulation study is performed to validate the proposed approach. An application example from a real HIV clinical trial study is used to illustrate the usefulness of the proposed method.

5.
The study of HIV dynamics is one of the most important developments in recent AIDS research. It has led to a new understanding of the pathogenesis of HIV infection. Although important findings in HIV dynamics have been published in prestigious scientific journals, the statistical methods for parameter estimation and model-fitting used in those papers appear surprisingly crude and have not been studied in more detail. For example, the unidentifiable parameters were simply imputed by mean estimates from previous studies, and important pharmacological/clinical factors were not considered in the modelling. In this paper, a viral dynamic model is developed to evaluate the effect of pharmacokinetic variation, drug resistance and adherence on antiviral responses. In the context of this model, we investigate a Bayesian modelling approach under a non-linear mixed-effects (NLME) model framework. In particular, our modelling strategy allows us to estimate time-varying antiviral efficacy of a regimen during the whole course of a treatment period by incorporating the information of drug exposure and drug susceptibility. Both simulated and real clinical data examples are given to illustrate the proposed approach. The Bayesian approach has great potential to be used in many aspects of viral dynamics modelling since it allows us to fit complex dynamic models and identify all the model parameters. Our results suggest that the Bayesian approach to estimating parameters in HIV dynamic models is flexible and powerful.

6.
7.
Gene regulatory networks are collections of genes that interact with one another and with other substances in the cell. By measuring gene expression over time using high-throughput technologies, it may be possible to reverse engineer, or infer, the structure of the gene network involved in a particular cellular process. These gene expression data typically have a high dimensionality and a limited number of biological replicates and time points. Due to these issues and the complexity of biological systems, the problem of reverse engineering networks from gene expression data demands a specialized suite of statistical tools and methodologies. We propose a non-standard adaptation of a simulation-based approach known as Approximate Bayesian Computing based on Markov chain Monte Carlo sampling. This approach is particularly well suited for the inference of gene regulatory networks from longitudinal data. The performance of this approach is investigated via simulations and using longitudinal expression data from a genetic repair system in Escherichia coli.
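As background for the approach named above, here is a minimal sketch of the standard ABC-MCMC idea, not the authors' non-standard adaptation. It assumes a toy model (normal data with unknown mean), a uniform prior, a symmetric random-walk proposal, and the sample mean as the summary statistic; none of these choices come from the abstract. With a uniform prior and a symmetric proposal, a proposed value is accepted exactly when it lies in the prior support and data simulated from it land within `eps` of the observed summary, so no likelihood is ever evaluated.

```python
import numpy as np


def abc_mcmc(obs_summary, simulate_summary, theta0, n_iter, eps, step,
             prior_bounds, rng):
    """Likelihood-free MCMC with a uniform prior and random-walk proposal."""
    lo, hi = prior_bounds
    chain = np.empty(n_iter)
    theta = theta0
    for i in range(n_iter):
        proposal = theta + step * rng.normal()
        if lo <= proposal <= hi:                      # inside prior support
            simulated = simulate_summary(proposal, rng)
            if abs(simulated - obs_summary) < eps:    # close enough to the data
                theta = proposal
        chain[i] = theta                              # repeat old value on rejection
    return chain


def simulate_mean(theta, rng, n=50):
    """Toy simulator: mean of n draws from Normal(theta, 1)."""
    return rng.normal(theta, 1.0, size=n).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    chain = abc_mcmc(obs_summary=2.0, simulate_summary=simulate_mean,
                     theta0=2.0, n_iter=4000, eps=0.2, step=0.3,
                     prior_bounds=(-10.0, 10.0), rng=rng)
    print("approximate posterior mean:", chain[1000:].mean())
```

For network inference the simulator would generate expression time courses from a candidate network and the distance would compare them with the observed data; the tolerance `eps` trades accuracy of the approximation against acceptance rate.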

8.
Traffic flow data are routinely collected for many networks worldwide. These invariably large data sets can be used as part of a traffic management system, for which good traffic flow forecasting models are crucial. The linear multiregression dynamic model (LMDM) has been shown to be promising for forecasting flows, accommodating multivariate flow time series, while being a computationally simple model to use. While statistical flow forecasting models usually base their forecasts on flow data alone, data for other traffic variables are also routinely collected. This paper shows how cubic splines can be used to incorporate extra variables into the LMDM in order to enhance flow forecasts. Cubic splines are also introduced into the LMDM to parsimoniously accommodate the daily cycle exhibited by traffic flows. The proposed methodology allows the LMDM to provide more accurate forecasts when forecasting flows in a real high‐dimensional traffic data set. The resulting extended LMDM can deal with some important traffic modelling issues not usually considered in flow forecasting models. Additionally, the model can be implemented in a real‐time environment, a crucial requirement for traffic management systems designed to support decisions and actions to alleviate congestion and keep traffic flowing.

9.
10.
We propose a hierarchical Bayesian model for analyzing gene expression data to identify pathways differentiating between two biological states (e.g., cancer vs. non-cancer and mutant vs. normal). Finding significant pathways can improve our understanding of biological processes. When the biological process of interest is related to a specific disease, eliciting a better understanding of the underlying pathways can lead to designing a more effective treatment. We apply our method to data obtained by interrogating the mutational status of p53 in 50 cancer cell lines (33 mutated and 17 normal). We identify several significant pathways with strong biological connections. We show that our approach provides a natural framework for incorporating prior biological information, and it has the best overall performance in terms of correctly identifying significant pathways compared to several alternative methods.

11.
We present a flexible branching process model for cell population dynamics in synchrony/time-series experiments used to study important cellular processes. Its formulation is constructive, based on an accounting of the unique cohorts in the population as they arise and evolve over time, allowing it to be written in closed form. The model can attribute effects to subsets of the population, providing flexibility not available using the models historically applied to these populations. It provides a tool for in silico synchronization of the population and can be used to deconvolve population-level experimental measurements, such as temporal expression profiles. It also allows for the direct comparison of assay measurements made from multiple experiments. The model can be fit either to budding index or DNA content measurements, or both, and is easily adaptable to new forms of data. The ability to use DNA content data makes the model applicable to almost any organism. We describe the model and illustrate its utility and flexibility in a study of cell cycle progression in the yeast Saccharomyces cerevisiae.

12.
13.
In high-throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease-free or overall survival. Because of the high dimensionality of gene expression data, standard survival analysis techniques cannot be directly applied. In addition, among the thousands of genes surveyed, only a subset are disease-associated. Gene selection is needed along with estimation. In this article, we model the relationship between gene expressions and survival using the accelerated failure time (AFT) models. We use the bridge penalization for regularized estimation and gene selection. An efficient iterative computational algorithm is proposed. Tuning parameters are selected using V-fold cross validation. We use a resampling method to evaluate the prediction performance of the bridge estimator and the relative stability of identified genes. We show that the proposed bridge estimator is selection consistent under appropriate conditions. Analysis of two lymphoma prognostic studies suggests that the bridge estimator can identify a small number of genes and can have better prediction performance than the Lasso.

14.
Computer models depending on unknown inputs are used in computer experiments in order to study the input-output relationship. Some computer models are computationally intensive and only a few computer runs can be made. A nonstationary statistical model that is used as a faster-running surrogate for computationally intensive numerical solvers of ordinary differential systems is proposed in this article. This statistical model reflects the dynamics of the system and, as a population dynamics example will show, it can be more accurate than a commonly used black box statistical model.

15.
Parametric incomplete data models defined by ordinary differential equations (ODEs) are widely used in biostatistics to describe biological processes accurately. Their parameters are estimated on approximate models, whose regression functions are evaluated by a numerical integration method. Accurate and efficient estimations of these parameters are critical issues. This paper proposes parameter estimation methods involving either a stochastic approximation EM algorithm (SAEM) in the maximum likelihood estimation, or a Gibbs sampler in the Bayesian approach. Both algorithms involve the simulation of non-observed data with conditional distributions using Hastings–Metropolis (H–M) algorithms. A modified H–M algorithm, including an original local linearization scheme to solve the ODEs, is proposed to reduce the computational time significantly. The convergence on the approximate model of all these algorithms is proved. The errors induced by the numerical solving method on the conditional distribution, the likelihood and the posterior distribution are bounded. The Bayesian and maximum likelihood estimation methods are illustrated on a simulated pharmacokinetic nonlinear mixed-effects model defined by an ODE. Simulation results illustrate the ability of these algorithms to provide accurate estimates.
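To give a sense of what a local linearization scheme does, the sketch below implements the textbook local linearization step for a scalar autonomous ODE x' = f(x). It is not the authors' original scheme, whose details the abstract does not give. At each step the right-hand side is replaced by its first-order expansion around the current state, and the resulting linear ODE is solved exactly over the step. The function names and the elimination example are assumptions.

```python
import numpy as np


def ll_step(f, dfdx, x, h):
    """One local linearization step for the scalar ODE x' = f(x)."""
    J = dfdx(x)                            # slope of f at the current state
    if abs(J) < 1e-12:
        return x + h * f(x)                # limit J -> 0 reduces to an Euler step
    return x + np.expm1(J * h) / J * f(x)  # exact solution of the linearized ODE


def ll_solve(f, dfdx, x0, t_end, n_steps):
    """Integrate from t = 0 to t_end with n_steps equal steps."""
    h = t_end / n_steps
    xs = [x0]
    for _ in range(n_steps):
        xs.append(ll_step(f, dfdx, xs[-1], h))
    return np.array(xs)


if __name__ == "__main__":
    # Hypothetical first-order elimination: x' = -k * x.
    k = 0.5
    path = ll_solve(lambda x: -k * x, lambda x: -k, x0=10.0, t_end=4.0, n_steps=8)
    print(path[-1], 10.0 * np.exp(-k * 4.0))
```

Because a linear ODE is its own linearization, the scheme reproduces the elimination curve exactly regardless of step size; for nonlinear models it is an approximation whose cost per step is low, which is what makes it attractive inside a sampler that must solve the ODE many times.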

16.
Using tests of time reversibility, this paper provides further statistical evidence on the long-standing conjecture in economics concerning the potentially asymmetric behaviour of output over the expansionary and contractionary phases of the business cycle. A particular advantage of this approach is that it provides a discriminating test that is instructive as to whether any asymmetries detected are due to asymmetric shocks to a linear model, or an underlying non-linear model with symmetric shocks, and in the latter case is informative as to the potential form of that non-linear model. Using a long span of international per capita output growth data, the asymmetry detected is overwhelmingly consistent with the long-standing perception that the output business cycle is characterized by steeper recessions and longer, more gentle expansions, but the evidence for this form of business cycle asymmetry is weaker in the data adjusted for the influence of outliers associated with wars and other extreme events. Statistically significant time irreversibility is reported for the output growth rates of almost all of the countries considered in the full sample data, and there is evidence that this time irreversibility is of a form implying an underlying non-linear model with symmetrically distributed innovations for 15 of the 22 countries considered. However, the time irreversibility test results for the outlier-trimmed full sample data reveal significant time irreversibility in output growth for around one half of the countries considered, predominantly in Northern Europe and North America, and of a form implying a non-linear underlying model in only a further half of those cases.

17.
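GenoCAD (www.genocad.com) is a free, web-based design tool for synthetic biology that can be used to design expression vectors and artificial gene networks. By successively clicking icons that represent standard synthetic biology "parts", the user designs according to a grammar and ends up with a complex plasmid vector composed of dozens of functional segments. In general, however, each category of standard parts in GenoCAD contains a large number of entries, and the number keeps growing as new parts are developed. At present, selecting suitable parts and assembling them into a functional plasmid vector is time-consuming, laborious, and error-prone. In the final stage of vector design, choosing the appropriate part from the many available is often difficult. To address this problem, a statistical language model from natural language processing is adopted, and a dynamic programming algorithm built on that model is applied to optimize plasmid vector design by finding the best choice among the many options. This approach can reduce redundant operations in biological experiments and thereby lower the cost of vector construction.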
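One way to picture the combination of a statistical language model with dynamic programming is the following sketch. It assumes a bigram model over parts and a fixed sequence of design slots, each with several candidate parts; the abstract does not specify the model order or the algorithm's details, and every part name and probability below is made up for illustration. The dynamic program keeps, for each candidate in the current slot, the best-scoring design that ends with it.

```python
import math


def best_design(slots, log_start, log_bigram, unseen=math.log(1e-6)):
    """Pick one part per slot to maximize the bigram log-probability.

    slots:      list of candidate-part lists, one per design position
    log_start:  dict part -> log P(part starts a design)
    log_bigram: dict (previous, part) -> log P(part | previous)
    unseen:     log-probability assigned to pairs missing from the model
    """
    # best[part] = (score, design) for the best design ending in `part`
    best = {p: (log_start.get(p, unseen), [p]) for p in slots[0]}
    for candidates in slots[1:]:
        new_best = {}
        for part in candidates:
            score, design = max(
                ((s + log_bigram.get((prev, part), unseen), d)
                 for prev, (s, d) in best.items()),
                key=lambda item: item[0],
            )
            new_best[part] = (score, design + [part])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


if __name__ == "__main__":
    # Hypothetical promoter / ribosome binding site / gene slots.
    slots = [["pLac", "pTet"], ["rbsA", "rbsB"], ["gfp"]]
    log_start = {"pLac": math.log(0.6), "pTet": math.log(0.4)}
    log_bigram = {
        ("pLac", "rbsA"): math.log(0.2), ("pLac", "rbsB"): math.log(0.8),
        ("pTet", "rbsA"): math.log(0.9), ("pTet", "rbsB"): math.log(0.1),
        ("rbsA", "gfp"): math.log(0.9), ("rbsB", "gfp"): math.log(0.3),
    }
    score, design = best_design(slots, log_start, log_bigram)
    print(design, math.exp(score))
```

In this toy model, choosing the locally most probable part at each slot gives pLac, rbsB, gfp with probability 0.144, while the dynamic program finds pTet, rbsA, gfp with probability 0.324, which shows why a global search over the sequence is preferable to a greedy choice.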

18.
19.
Summary. The study of human immunodeficiency virus dynamics is one of the most important areas in research into acquired immune deficiency syndrome in recent years. Non-linear mixed effects models have been proposed for modelling viral dynamic processes. A challenging problem in the modelling is to identify repeatedly measured (time-dependent), but possibly missing, immunologic or virologic markers (covariates) for viral dynamic parameters. For missing time-dependent covariates in non-linear mixed effects models, the commonly used complete-case, mean imputation and last value carried forward methods may give misleading results. We propose a three-step hierarchical multiple-imputation method, implemented by Gibbs sampling, which imputes the missing data at the individual level but can pool information across individuals. We compare various methods by Monte Carlo simulations and find that the multiple-imputation method proposed performs the best in terms of bias and mean-squared errors in the estimates of covariate coefficients. By applying the favoured multiple-imputation method to clinical data, we conclude that there is a negative correlation between the viral decay rate (a virological response parameter) and CD4 or CD8 cell counts during the treatment; this is counter-intuitive, but biologically interpretable on the basis of findings from other clinical studies. These results may have an important influence on decisions about treatment for acquired immune deficiency syndrome patients.

20.
When an appropriate parametric model and a prior distribution of its parameters are given to describe clinical time courses of a dynamic biological process, Bayesian approaches allow us to estimate the entire profiles from a few or even a single observation per subject. The goodness of the estimation depends on the measurement points at which the observations were made. The number of measurement points per subject is generally limited to one or two. The limited measurement points have to be selected carefully. This paper proposes an approach to the selection of the optimum measurement point for Bayesian estimations of clinical time courses. The selection is made among given candidates, based on the goodness of estimation evaluated by the Kullback-Leibler information. This information measures the discrepancy of an estimated time course from the true one specified by a given appropriate model. The proposed approach is applied to a pharmacokinetic analysis, which is a typical clinical example where the selection is required. The results of the present study strongly suggest that the proposed approach is applicable to pharmacokinetic data and has a wide range of clinical applications.
