Similar literature
 Found 20 similar articles (search time: 265 ms)
1.
Many late-onset complex diseases exhibit variable age of onset. Efficiently incorporating age of onset information into linkage analysis can potentially increase the power of dissecting complex diseases. In this paper, we treat age of onset as a genetic trait with censored observations. We use multiple markers to infer the inheritance vector at the disease susceptibility (DS) locus in order to extract information about the inheritance pattern of the disease allele in a pedigree. Given the inheritance distribution at the DS locus, we define the genetic frailty for each individual within a nuclear family as the sum of frailties due to a putative major disease gene and a polygenic effect due to any remaining DS loci. Conditioning on these frailties we use the proportional hazards model for the risk of developing disease. We show that a test of linkage can be formulated as a test of zero variance due to a specific locus of the additive gamma frailties. Maximum likelihood estimation, using the EM algorithm, and likelihood ratio tests are employed for parameter estimation and tests of linkage. A simulation study indicates that the proposed method is well behaved and can be more powerful than the currently available allele-sharing based linkage methods. A breast cancer data example is used for illustration.
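The model structure described in this abstract can be written compactly. The notation below is an assumption made for illustration (the abstract gives no symbols): individual j in family i has frailty Z_ij, split into a major-gene part and a polygenic part, and lambda_0 is a baseline hazard.

```latex
\lambda_{ij}(t \mid Z_{ij}) = Z_{ij}\,\lambda_0(t),
\qquad
Z_{ij} = Z^{(g)}_{ij} + Z^{(p)}_{ij},
\qquad
H_0 :\ \operatorname{Var}\bigl(Z^{(g)}_{ij}\bigr) = 0 .
```

Under this reading, the linkage test asks whether the major-gene frailty component contributes any variance at the locus under study.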

2.
In cancer studies that use transgenic or knockout mice, skin tumour counts are recorded over time to measure tumorigenicity. In these studies cancer biologists are interested in the effect of endogenous and/or exogenous factors on papilloma onset, multiplicity and regression. In this paper an analysis of data from a study conducted by the National Institute of Environmental Health Sciences on the effect of genetic factors on skin tumorigenesis is presented. Papilloma multiplicity and regression are modelled by using Bernoulli, Poisson and binomial latent variables, each of which can depend on covariates and previous outcomes. An EM algorithm is proposed for parameter estimation, and generalized estimating equations adjust for extra dependence between outcomes within individual animals. A Cox proportional hazards model is used to describe covariate effects on the onset of tumours.

3.
A new method of modeling coronary artery calcium (CAC) is needed in order to properly understand the probability of onset and growth of CAC. CAC remains a controversial indicator of cardiovascular disease (CVD) risk, but this may be due to ill-suited methods of specifying CAC in the analysis phase of studies in which CAC is the primary outcome. The modern method of two-part latent growth modeling may represent a strong alternative to the myriad of existing methods for modeling CAC. We provide a brief overview of existing methods of analysis used for CAC before introducing the general latent growth curve model, how it extends into a two-part (semicontinuous) growth model, and how the ubiquitous problem of missing data can be effectively handled. We then present an example of how to model CAC using this framework. We demonstrate that utilizing this type of modeling strategy can result in traditional predictors of CAC (e.g. age, gender, and high-density lipoprotein cholesterol) exerting a different impact on the two different, yet simultaneous, operationalizations of CAC. This method of analyzing CAC could inform future analyses of CAC and inform subsequent discussions about the nature of its potential to inform long-term CVD risk and heart events.

4.
Recent analyses seeking to explain variation in area health outcomes often consider the impact on them of latent measures (i.e. unobserved constructs) of population health risk. The latter are typically obtained by forms of multivariate analysis, with a small set of latent constructs derived from a collection of observed indicators, and a few recent area studies take such constructs to be spatially structured rather than independent over areas. A confirmatory approach is often applicable to the model linking indicators to constructs, based on substantive knowledge of relevant risks for particular diseases or outcomes. In this paper, population constructs relevant to a particular set of health outcomes are derived using an integrated model containing all the manifest variables, namely health outcome variables, as well as indicator variables underlying the latent constructs. A further feature of the approach is the use of variable selection techniques to select significant loadings and factors (especially in terms of effects of constructs on health outcomes), so ensuring parsimonious models are selected. A case study considers suicide mortality and self-harm contrasts in the East of England in relation to three latent constructs: deprivation, fragmentation and urbanicity.

5.
The field of genetic epidemiology is growing rapidly with the realization that many important diseases are influenced by both genetic and environmental factors. For this reason, pedigree data are becoming increasingly valuable as a means of studying patterns of disease occurrence. Analysis of pedigree data is complicated by the lack of independence among family members and by the non-random sampling schemes used to ascertain families. An additional complicating factor is the variability in age at disease onset from one person to another. In developing statistical methods for analysing pedigree data, analytic results are often intractable, making simulation studies imperative for assessing the performance of proposed methods and estimators. In this paper, an algorithm is presented for simulating disease data in pedigrees, incorporating variable age at onset and genetic and environmental effects. Computational formulas are developed in the context of a proportional hazards model and assuming single ascertainment of families, but the methods can be easily generalized to alternative models. The algorithm is computationally efficient, making multi-dataset simulation studies feasible. Numerical examples are provided to demonstrate the methods.
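As a rough illustration of simulating age at onset under a proportional hazards model, the sketch below uses inverse-transform sampling with a Weibull baseline. It is not the paper's algorithm: the Weibull form, the parameter values, the function names, and the independent draws of carrier and exposure status are all assumptions made for the example, and it ignores both ascertainment and the Mendelian dependence between relatives that the paper handles.

```python
import math
import random


def simulate_onset_age(linear_predictor, shape, scale, rng):
    """Draw one age at onset from a Weibull proportional hazards model.

    Assumed baseline cumulative hazard: H0(t) = (t / scale) ** shape.
    Survival is S(t) = exp(-H0(t) * exp(linear_predictor)), so solving
    S(T) = U with U ~ Uniform(0, 1) gives the draw returned below.
    """
    u = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    hazard_ratio = math.exp(linear_predictor)
    return scale * (-math.log(u) / hazard_ratio) ** (1.0 / shape)


def simulate_sibship(n_sibs, carrier_prob, log_hr_gene, log_hr_env,
                     current_age, rng, shape=4.0, scale=90.0):
    """Return (observed_age, affected) pairs for one hypothetical sibship.

    Carrier and exposure status are drawn independently per sibling,
    which is a simplification rather than a genetic model.
    """
    records = []
    for _ in range(n_sibs):
        carrier = 1 if rng.random() < carrier_prob else 0
        exposed = 1 if rng.random() < 0.5 else 0
        eta = log_hr_gene * carrier + log_hr_env * exposed
        onset = simulate_onset_age(eta, shape, scale, rng)
        affected = onset <= current_age
        # Unaffected siblings are censored at their current age.
        observed = onset if affected else current_age
        records.append((observed, affected))
    return records
```

Calling `simulate_sibship(4, 0.5, 1.0, 0.5, 60.0, random.Random(0))` yields four records whose observed ages never exceed 60; repeating the call over many families gives the kind of multi-dataset simulation the abstract refers to.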

6.
In studies of affective disorder, individuals are often observed to experience recurrent symptomatic exacerbations warranting hospitalization. Interest may lie in modeling the occurrence of such exacerbations over time and identifying associated risk factors. In some patients, recurrent exacerbations are temporally clustered following disease onset, but cease to occur after a period of time. We develop a dynamic Mover–Stayer model in which a canonical binary variable associated with each event indicates whether the underlying disease has resolved. An individual whose disease process has not resolved will experience events following a standard point process model governed by a latent intensity. When the disease process resolves, the complete data intensity becomes zero and no further event will occur. An expectation–maximization algorithm is described for parametric and semiparametric model fitting based on a discrete time dynamic Mover–Stayer model and a latent intensity-based model of the underlying point process.

7.
We propose a general Bayesian joint modeling approach to model mixed longitudinal outcomes from the exponential family, taking into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.

8.
Familial aggregation studies seek to identify diseases that cluster in families. These studies are often carried out as a first step in the search for hereditary factors affecting the risk of disease. It is necessary to account for age at disease onset to avoid potential misclassification of family members who are disease-free at the time of study participation or who die before developing disease. This is especially true for late-onset diseases, such as prostate cancer or Alzheimer's disease. We propose a discrete time model that accounts for the age at disease onset and allows the familial association to vary with age and to be modified by covariates, such as pedigree relationship. The parameters of the model have interpretations as conditional log-odds and log-odds ratios, which can be viewed as discrete time conditional cross hazard ratios. These interpretations are appealing for cancer risk assessment. Properties of this model are explored in simulation studies, and the method is applied to a large family study of cancer conducted by the National Cancer Institute-sponsored Cancer Genetics Network (CGN).

9.
Latent variable modeling is commonly used in behavioral, social, and medical science research. The models used in such analysis relate all observed variables to latent common factors. In many applications, the observations are highly non-normal or discrete, e.g., polytomous responses or counts. The existing approaches for non-normal observations can be considered lacking in several aspects, especially in multi-group sample situations. We propose a generalized linear model approach for multi-sample latent variable analysis that can handle a broad class of non-normal and discrete observations, and that furnishes meaningful interpretation and inference in multi-group studies through maximum likelihood analysis. A Monte Carlo EM algorithm is proposed for parameter estimation. Convergence assessment and standard error estimation are addressed. Simulation studies are reported to show the usefulness of our approach. An example from a substance abuse prevention study is also presented.

10.
Communications in Statistics: Theory and Methods, 2012, 41(16-17): 3079-3093
The paper presents an extension of a new class of multivariate latent growth models (Bianconcini and Cagnone, 2012) to allow for covariate effects on manifest, latent variables and random effects. The new class of models combines: (i) multivariate latent curves that describe the temporal behavior of the responses, and (ii) a factor model that specifies the relationship between manifest and latent variables. Based on the Generalized Linear and Latent Variable Model framework (Bartholomew and Knott, 1999), the response variables are assumed to follow different distributions of the exponential family, with item-specific linear predictors depending on both latent variables and measurement errors. A full maximum likelihood method is used to estimate all the model parameters simultaneously. Data coming from the Data WareHouse of the University of Bologna are used to illustrate the methodology.

11.
We extend the bivariate Wiener process considered by Whitmore and co-workers and model the joint process of a marker and health status. The health status process is assumed to be latent or unobservable. The time to reach the primary end point or failure (death, onset of disease, etc.) is the time when the latent health status process first crosses a failure threshold level. Inferences for the model are based on two kinds of data: censored survival data and marker measurements. Covariates, such as treatment variables, risk factors and base-line conditions, are related to the model parameters through generalized linear regression functions. The model offers a much richer potential for the study of treatment efficacy than do conventional models. Treatment effects can be assessed in terms of their influence on both the failure threshold and the health status process parameters. We derive an explicit formula for the prediction of residual failure times given the current marker level. We also discuss model validation. This model does not require the proportional hazards assumption and hence can be widely used. To demonstrate the usefulness of the model, we apply the methods in analysing data from protocol 116a of the AIDS Clinical Trials Group.
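The idea of failure as a threshold crossing can be illustrated with a small simulation. The sketch below covers only a single latent health process, not the bivariate marker-and-health model of the paper, and it uses a simple Euler discretisation; the function name, arguments and step size are assumptions made for the example.

```python
import math
import random


def first_hitting_time(start, drift, sigma, threshold, dt, max_time, rng):
    """Simulate a Wiener process with drift on a time grid.

    Returns the first grid time at which the process is at or below
    `threshold`, or None if that has not happened by `max_time`
    (which plays the role of a censored survival time).
    """
    level = start
    n_steps = int(round(max_time / dt))
    for step in range(1, n_steps + 1):
        # Euler step: increment ~ Normal(drift * dt, sigma ** 2 * dt)
        level += drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if level <= threshold:
            return step * dt
    return None
```

In continuous time, when the drift points toward the threshold the first-passage time follows an inverse Gaussian distribution with mean (start - threshold) / |drift|, so averaging many simulated times gives a rough check; the grid makes simulated times slightly late because crossings between grid points are missed.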

12.
Genome-wide association studies (GWAS) are effective in investigating the loci related to complex diseases. For most of these studies, the genetic inheritance model is not known in advance and therefore robust tests are preferred. The empirical likelihood (EL) method is well known for its flexibility and nonparametric properties, but is rarely investigated in GWAS. In this study, we develop EL-based test statistics to detect the association of a disease and genetic loci while the genetic model is unknown. The performance of the proposed tests is evaluated by simulations and compared with several existing methods. For illustration, we apply these tests to identify the single nucleotide polymorphisms associated with alkaline phosphatase level on mouse chromosome 6.
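To make "robust to the unknown genetic model" concrete, the sketch below computes the Cochran-Armitage trend statistic under recessive, additive and dominant scores and takes the largest absolute value (often called MAX3). This is a common baseline for a case-control setting, not the EL-based statistic the abstract proposes, and whether the paper compares against it is not stated; the function names and genotype counts are made up for the example.

```python
import math


def trend_test_z(cases, controls, scores):
    """Cochran-Armitage trend statistic for a 2 x 3 genotype table.

    cases, controls: counts for genotypes carrying 0, 1, 2 copies of an allele.
    scores: genotype scores, e.g. (0, 0, 1) recessive, (0, 0.5, 1) additive,
    (0, 1, 1) dominant. Assumes the table is not degenerate (variance > 0).
    """
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    col = [r + s for r, s in zip(cases, controls)]
    t = sum(x * (n_controls * r - n_cases * s)
            for x, r, s in zip(scores, cases, controls))
    sx = sum(x * n for x, n in zip(scores, col))
    sxx = sum(x * x * n for x, n in zip(scores, col))
    var = n_cases * n_controls * (total * sxx - sx * sx) / total
    return t / math.sqrt(var)


def max3(cases, controls):
    """Largest absolute trend statistic over three candidate genetic models."""
    models = {
        "recessive": (0, 0, 1),
        "additive": (0, 0.5, 1),
        "dominant": (0, 1, 1),
    }
    return max(abs(trend_test_z(cases, controls, x)) for x in models.values())
```

Because the three statistics are correlated, MAX3 does not follow a standard normal distribution under the null; its p-value is usually obtained by permutation or from the joint asymptotic distribution.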

13.
Genome-wide association studies are effective in investigating the loci related to complex diseases. Sometimes, the genotype is not exactly decoded and only the genotype probability is obtained. In this case, an F-test based on the imputed genotype is usually used for the association analysis. Simulations show that existing methods such as the dosage test perform poorly when the genetic model is misspecified. In this study, we develop a robust test to detect the association of a disease and genetic loci while the genotype is uncertain and the genetic model is unknown.

14.
In this article, a general approach to latent variable models based on an underlying generalized linear model (GLM) with a factor analysis observation process is introduced. We call these models Generalized Linear Factor Models (GLFM). The observations are produced from a general model framework that involves observed and latent variables that are assumed to be distributed in the exponential family. More specifically, we concentrate on situations where the observed variables are both discretely measured (e.g., binomial, Poisson) and continuously distributed (e.g., gamma). The common latent factors are assumed to be independent with a standard multivariate normal distribution. Practical details of training such models with a new local expectation-maximization (EM) algorithm, which can be considered as a generalized EM-type algorithm, are also discussed. In conjunction with an approximated version of the Fisher score algorithm (FSA), we show how to calculate maximum likelihood estimates of the model parameters, and how to draw inferences about the unobservable path of the common factors. The methodology is illustrated by an extensive Monte Carlo simulation study and the results show promising performance.

15.
Family-based case–control designs are commonly used in epidemiological studies for evaluating the role of genetic susceptibility and environmental exposure to risk factors in the etiology of rare diseases. Within this framework, it is often reasonable to assume that genetic susceptibility and environmental exposure are conditionally independent of each other within families in the source population. We focus on this setting to explore the situation of measurement error affecting the assessment of the environmental exposure. We correct for measurement error through a likelihood-based method. We exploit a conditional likelihood approach to relate the probability of disease to the genetic and the environmental risk factors. We show that this approach provides less biased and more efficient results than that based on logistic regression. Regression calibration, instead, provides severely biased estimators of the parameters. The comparison of the correction methods is performed through simulation, under common measurement error structures.

16.
The Net Ecosystem Exchange describes the net carbon dioxide flux between an ecosystem and the atmosphere and is a key quantity in climate change studies and in political negotiations. This paper provides a spatio-temporal statistical framework, which is able to infer the Net Ecosystem Exchange from remotely-sensed carbon dioxide ground concentrations together with data on the Normalized Difference Vegetation Index, the Gross Primary Production and the land cover classification. The model is based on spatial and temporal latent random effects, that act as space–time varying coefficients, which allows for a flexible modeling of the spatio-temporal auto- and cross-correlation structure. The intra- and inter-annual variations of the Net Ecosystem Exchange are evaluated and dynamic maps are provided on a nearly global grid and in intervals of 16 days.

17.
In several studies, investigators are interested in estimating the bivariate distribution of the onset ages of a generic disorder in successive generations. The empirical distribution is inappropriate for this purpose due to truncation: only parent–child pairs with onset ages prior to the ages at interview were included in the sample. In this paper, we propose a simple nonparametric estimator for the underlying bivariate distribution of the onset ages. Compared with the existing estimators, the proposed estimator has a closed form and smaller biases when estimating marginal distributions. A real example is used to illustrate this estimator.

18.
In the development of many diseases there are often associated variables which continuously measure the progress of an individual towards the final expression of the disease (failure). Such variables are stochastic processes, here called marker processes, and, at a given point in time, they may provide information about the current hazard and subsequently on the remaining time to failure. Here we consider a simple additive model for the relationship between the hazard function at time t and the history of the marker process up until time t. We develop some basic calculations based on this model. Interest is focused on statistical applications for markers related to estimation of the survival distribution of time to failure, including (i) the use of markers as surrogate responses for failure with censored data, and (ii) the use of markers as predictors of the time elapsed since onset of a survival process in prevalent individuals. Particular attention is directed to potential gains in efficiency incurred by using marker process information.

19.
We propose a mixture of latent variables model for the model-based clustering, classification, and discriminant analysis of data comprising variables of mixed type. This approach is a generalization of latent variable analysis, and model fitting is carried out within the expectation-maximization framework. Our approach is outlined and a simulation study is conducted to illustrate the effect of sample size and noise on the standard errors and the recovery probabilities for the number of groups. Our modelling methodology is then applied to two real data sets and their clustering and classification performance is discussed. We conclude with discussion and suggestions for future work.

20.
With the ready availability of spatial databases and geographical information system software, statisticians are increasingly encountering multivariate modelling settings featuring associations of more than one type: spatial associations between data locations and associations between the variables within the locations. Although flexible modelling of multivariate point-referenced data has recently been addressed by using a linear model of co-regionalization, existing methods for multivariate areal data typically suffer from unnecessary restrictions on the covariance structure or undesirable dependence on the conditioning order of the variables. We propose a class of Bayesian hierarchical models for multivariate areal data that avoids these restrictions, permitting flexible and order-free modelling of correlations both between variables and across areal units. Our framework encompasses a rich class of multivariate conditionally autoregressive models that are computationally feasible via modern Markov chain Monte Carlo methods. We illustrate the strengths of our approach over existing models by using simulation studies and also offer a real data application involving annual lung, larynx and oesophageal cancer death-rates in Minnesota counties between 1990 and 2000.
