Similar articles
 20 similar articles found
1.
We propose a class of state-space models for multivariate longitudinal data where the components of the response vector may have different distributions. The approach is based on the class of Tweedie exponential dispersion models, which accommodates a wide variety of discrete, continuous and mixed data. The latent process is assumed to be a Markov process, and the observations are conditionally independent given the latent process, over time as well as over the components of the response vector. This provides a fully parametric alternative to the quasilikelihood approach of Liang and Zeger. We estimate the regression parameters for time-varying covariates entering either via the observation model or via the latent process, based on an estimating equation derived from the Kalman smoother. We also consider analysis of residuals from both the observation model and the latent process.

2.
The continuous extension of a discrete random variable is amongst the computational methods used for estimation of multivariate normal copula-based models with discrete margins. Its advantage is that the likelihood can be derived conveniently under the theory for copula models with continuous margins, but there has not been a clear analysis of the adequacy of this method. We investigate the asymptotic and small-sample efficiency of two variants of the method for estimating the multivariate normal copula with univariate binary, Poisson, and negative binomial regressions, and show that they lead to biased estimates for the latent correlations, and the univariate marginal parameters that are not regression coefficients. We implement a maximum simulated likelihood method, which is based on evaluating the multidimensional integrals of the likelihood with randomized quasi-Monte Carlo methods. Asymptotic and small-sample efficiency calculations show that our method is nearly as efficient as maximum likelihood for fully specified multivariate normal copula-based models. An illustrative example is given to show the use of our simulated likelihood method.

3.
The Cramér-von Mises test methodology is applied to build a goodness-of-fit test for the mixed Rasch model. The mixed Rasch model is a probability model of a multivariate discrete random variable driven by an unknown latent continuous variable. The problem of estimation of the unknown fixed difficulty parameters and the latent density function is also considered. The theoretical results are illustrated through simulations and an application to real Quality of Life data.

4.
In this paper, a joint model for analyzing multivariate mixed ordinal and continuous responses, where continuous outcomes may be skewed, is presented. For modeling the discrete ordinal responses, a continuous latent variable approach is considered, and for describing continuous responses, a skew-normal mixed effects model is used. A Bayesian approach using Markov Chain Monte Carlo (MCMC) is adopted for parameter estimation. Some simulation studies are performed to illustrate the proposed approach. The results of the simulation studies show that the use of separate models, or of the normal distributional assumption for shared random effects and within-subject errors of continuous and ordinal variables, instead of joint modeling under a skew-normal distribution, leads to biased parameter estimates. The approach is used for analyzing a part of the British Household Panel Survey (BHPS) data set. Annual income and life satisfaction are considered as the continuous and the ordinal longitudinal responses, respectively. The annual income variable is severely skewed; therefore, the use of the normality assumption for the continuous response does not yield acceptable results. The results of data analysis show that gender, marital status, educational levels and the amount of money spent on leisure have a significant effect on annual income, while marital status has the highest impact on life satisfaction.

5.
Latent Variable Models for Mixed Discrete and Continuous Outcomes (cited by: 1; self-citations: 0; citations by others: 1)
We propose a latent variable model for mixed discrete and continuous outcomes. The model accommodates any mixture of outcomes from an exponential family and allows for arbitrary covariate effects, as well as direct modelling of covariates on the latent variable. An EM algorithm is proposed for parameter estimation and estimates of the latent variables are produced as a by-product of the analysis. A generalized likelihood ratio test can be used to test the significance of covariates affecting the latent outcomes. This method is applied to birth defects data, where the outcomes of interest are continuous measures of size and binary indicators of minor physical anomalies. Infants who were exposed in utero to anticonvulsant medications are compared with controls.

6.
Abstract. For certain classes of hierarchical models, it is easy to derive an expression for the joint moment‐generating function (MGF) of data, whereas the joint probability density has an intractable form which typically involves an integral. The most important example is the class of linear models with non‐Gaussian latent variables. Parameters in the model can be estimated by approximate maximum likelihood, using a saddlepoint‐type approximation to invert the MGF. We focus on modelling heavy‐tailed latent variables, and suggest a family of mixture distributions that behaves well under the saddlepoint approximation (SPA). It is shown that the well‐known normalization issue renders the ordinary SPA useless in the present context. As a solution we extend the non‐Gaussian leading term SPA to a multivariate setting, and introduce a general rule for choosing the leading term density. The approach is applied to mixed‐effects regression, time‐series models and stochastic networks and it is shown that the modified SPA is very accurate.

7.
A class of multivariate mixed survival models for continuous and discrete time with a complex covariance structure is introduced in the context of quantitative genetic applications. The methods introduced can be used in many applications in quantitative genetics, although the discussion presented concentrates on longevity studies. The framework presented allows models based on continuous time to be combined with models based on discrete time in a joint analysis. The continuous time models are approximations of the frailty model in which the baseline hazard function is assumed to be piecewise constant. The discrete time models used are multivariate variants of the discrete relative risk models. These models allow for regular parametric likelihood-based inference by exploiting a coincidence of their likelihood functions and the likelihood functions of suitably defined multivariate generalized linear mixed models. The models include a dispersion parameter, which is essential for obtaining a decomposition of the variance of the trait of interest as a sum of parcels representing the additive genetic effects, environmental effects and unspecified sources of variability, as required in quantitative genetic applications. The methods presented are implemented in such a way that large and complex quantitative genetic data can be analyzed. Some key model control techniques are discussed in supplementary online material.

8.
In many longitudinal studies, multiple characteristics of each individual, along with the time to occurrence of an event of interest, are collected. In such data sets, some of the correlated characteristics may be discrete and some may be continuous. In this paper, a joint model for analysing multivariate longitudinal data comprising mixed continuous and ordinal responses and a time to event variable is proposed. We model the association structure between longitudinal mixed data and time to event data using a multivariate zero-mean Gaussian process. For modeling discrete ordinal data we assume that a continuous latent variable follows the logistic distribution, and for continuous data a Gaussian mixed effects model is used. For the event time variable, an accelerated failure time model is considered under different distributional assumptions. For parameter estimation, a Bayesian approach using Markov Chain Monte Carlo is adopted. The performance of the proposed methods is illustrated using some simulation studies. A real data set is also analyzed, where different model structures are used. Model comparison is performed using a variety of statistical criteria.

9.
ABSTRACT

This article extends the literature on copulas with discrete or continuous marginals to the case where some of the marginals are a mixture of discrete and continuous components. We do so by carefully defining the likelihood as the density of the observations with respect to a mixed measure. The treatment is quite general, although we focus on mixtures of Gaussian and Archimedean copulas. The inference is Bayesian with the estimation carried out by Markov chain Monte Carlo. We illustrate the methodology and algorithms by applying them to estimate a multivariate income dynamics model. Supplementary materials for this article are available online.

10.
Abstract. Latent variable modelling has gradually become an integral part of mainstream statistics and is currently used for a multitude of applications in different subject areas. Examples of ‘traditional’ latent variable models include latent class models, item–response models, common factor models, structural equation models, mixed or random effects models and covariate measurement error models. Although latent variables have widely different interpretations in different settings, the models have a very similar mathematical structure. This has been the impetus for the formulation of general modelling frameworks which accommodate a wide range of models. Recent developments include multilevel structural equation models with both continuous and discrete latent variables, multiprocess models and nonlinear latent variable models.

11.
The construction of a joint model for mixed discrete and continuous random variables that accounts for their associations is an important statistical problem in many practical applications. In this paper, we use copulas to construct a class of joint distributions of mixed discrete and continuous random variables. In particular, we employ the Gaussian copula to generate joint distributions for mixed variables. Examples include the robit-normal and probit-normal-exponential distributions, the first for modelling the distribution of mixed binary-continuous data and the second for a mixture of continuous, binary and trichotomous variables. The new class of joint distributions is general enough to include many mixed-data models currently available. We study properties of the distributions and outline likelihood estimation; a small simulation study is used to investigate the finite-sample properties of estimates obtained by full and pairwise likelihood methods. Finally, we present an application to discriminant analysis of multiple correlated binary and continuous data from a study involving advanced breast cancer patients.
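
As a rough illustration of the Gaussian-copula idea in this abstract (not the authors' implementation), the sketch below draws one binary and one continuous variable that are linked through a pair of correlated standard normal variables. The function name `sample_mixed_pair`, the Bernoulli and normal margins, and all parameter names are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist


def sample_mixed_pair(rho, p, mu, sigma, rng):
    """Draw (binary, continuous) from a Gaussian copula with correlation rho.

    Assumed margins (illustrative only): Bernoulli(p) for the binary
    variable and Normal(mu, sigma) for the continuous one.
    """
    # Correlated standard normals: corr(z1, z2) = rho.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)

    # Continuous margin: a location-scale transform of z1.
    y = mu + sigma * z1

    # Binary margin: map z2 to a uniform, then apply the Bernoulli quantile.
    u2 = NormalDist().cdf(z2)
    b = 1 if u2 > 1.0 - p else 0
    return b, y
```

Calling `sample_mixed_pair(0.6, 0.3, 10.0, 2.0, random.Random(0))` repeatedly gives pairs in which the binary variable equals 1 about 30% of the time and tends to coincide with larger continuous values, because `rho` is positive.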

12.
The shared frailty models allow for unobserved heterogeneity or for statistical dependence between observed survival data. The most commonly used estimation procedure in frailty models is the EM algorithm, but this approach yields a discrete estimator of the distribution and consequently does not allow direct estimation of the hazard function. We show how maximum penalized likelihood estimation can be applied to nonparametric estimation of a continuous hazard function in a shared gamma-frailty model with right-censored and left-truncated data. We examine the problem of obtaining variance estimators for regression coefficients, the frailty parameter and baseline hazard functions. Some simulations for the proposed estimation procedure are presented. A prospective cohort (Paquid) with grouped survival data serves to illustrate the method which was used to analyze the relationship between environmental factors and the risk of dementia.

13.
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.

14.
A general framework is proposed for modelling clustered mixed outcomes. A mixture of generalized linear models is used to describe the joint distribution of a set of underlying variables, and an arbitrary function relates the underlying variables to the observed outcomes. The model accommodates multilevel data structures, general covariate effects and distinct link functions and error distributions for each underlying variable. Within the framework proposed, novel models are developed for clustered multiple binary, unordered categorical and joint discrete and continuous outcomes. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distributions of the parameters and latent variables. Because of the flexibility of the modelling framework and estimation procedure, extensions to ordered categorical outcomes and more complex data structures are straightforward. The methods are illustrated by using data from a reproductive toxicity study.

15.
Multiple imputation has emerged as a widely used model-based approach in dealing with incomplete data in many application areas. Gaussian and log-linear imputation models are fairly straightforward to implement for continuous and discrete data, respectively. However, in missing data settings which include a mix of continuous and discrete variables, correct specification of the imputation model could be a daunting task owing to the lack of flexible models for the joint distribution of variables of different nature. This complication, along with accessibility to software packages that are capable of carrying out multiple imputation under the assumption of joint multivariate normality, appears to encourage applied researchers to pragmatically treat the discrete variables as continuous for imputation purposes, and subsequently round the imputed values to the nearest observed category. In this article, I introduce a distance-based rounding approach for ordinal variables in the presence of continuous ones. The first step of the proposed rounding process is predicated upon creating indicator variables that correspond to the ordinal levels, followed by jointly imputing all variables under the assumption of multivariate normality. The imputed values are then converted to the ordinal scale based on their Euclidean distances to a set of indicators, with minimal distance corresponding to the closest match. I compare the performance of this technique to crude rounding via commonly accepted accuracy and precision measures with simulated data sets.
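
The rounding step described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's code: the K ordinal levels are assumed to be coded as K-1 dummy indicators with level 0 as the all-zero reference, and the function names `indicator_patterns`, `distance_round` and `crude_round` are hypothetical.

```python
import math


def indicator_patterns(n_levels):
    """Code K ordinal levels as K-1 indicators (assumed coding).

    Level 0 is the all-zero reference; level j > 0 has a 1 in slot j-1.
    """
    patterns = []
    for level in range(n_levels):
        row = [0.0] * (n_levels - 1)
        if level > 0:
            row[level - 1] = 1.0
        patterns.append(row)
    return patterns


def distance_round(imputed, n_levels):
    """Return the level whose indicator pattern is closest in Euclidean distance."""
    patterns = indicator_patterns(n_levels)
    distances = [math.dist(imputed, pattern) for pattern in patterns]
    return min(range(n_levels), key=distances.__getitem__)


def crude_round(value, n_levels):
    """Baseline: round a continuous imputed value to the nearest valid level."""
    return min(max(int(round(value)), 0), n_levels - 1)
```

Here `imputed` stands for the continuous values that a multivariate normal imputation produced for the indicator variables of one missing ordinal observation; for three levels, `distance_round([0.1, 0.8], 3)` picks the level whose pattern `[0, 1]` is nearest.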

16.
Abstract. CG-regressions are multivariate regression models for mixed continuous and discrete responses that result from conditioning in the class of conditional Gaussian (CG) models. Their conditional independence structure can be read off a marked graph. The property of collapsibility, in this context, means that the multivariate CG-regression can be decomposed into lower dimensional regressions that are still CG and are consistent with the corresponding subgraphs. We derive conditions for this property that can easily be checked on the graph, and indicate computational advantages of this kind of collapsibility. Further, a simple graphical condition is given for checking whether a decomposition into univariate regressions is possible.

17.
This article presents flexible new models for the dependence structure, or copula, of economic variables based on a latent factor structure. The proposed models are particularly attractive for relatively high-dimensional applications, involving 50 or more variables, and can be combined with semiparametric marginal distributions to obtain flexible multivariate distributions. Factor copulas generally lack a closed-form density, but we obtain analytical results for the implied tail dependence using extreme value theory, and we verify that simulation-based estimation using rank statistics is reliable even in high dimensions. We consider “scree” plots to aid the choice of the number of factors in the model. The model is applied to daily returns on all 100 constituents of the S&P 100 index, and we find significant evidence of tail dependence, heterogeneous dependence, and asymmetric dependence, with dependence being stronger in crashes than in booms. We also show that factor copula models provide superior estimates of some measures of systemic risk. Supplementary materials for this article are available online.

18.
Some general remarks are made about likelihood factorizations, distinguishing parameter-based factorizations and concentration-graph factorizations. Two parametric families of distributions for mixed discrete and continuous variables are discussed. Conditions on graphs are given for the circumstances under which their joint analysis can be split into separate analyses, each involving a reduced set of component variables and parameters. The result shows marked differences between the two families although both involve the same necessary condition on prime graphs. This condition is both necessary and sufficient for simplified estimation in Gaussian and in discrete log-linear models.

19.
We propose a general Bayesian joint modeling approach for mixed longitudinal outcomes from the exponential family that takes into account any differential misclassification that may exist among categorical outcomes. Under this framework, outcomes observed without measurement error are related to latent trait variables through generalized linear mixed effect models. The misclassified outcomes are related to the latent class variables, which represent unobserved real states, using mixed hidden Markov models (MHMMs). In addition to enabling the estimation of parameters in prevalence, transition and misclassification probabilities, MHMMs capture cluster level heterogeneity. A transition modeling structure allows the latent trait and latent class variables to depend on observed predictors at the same time period and also on latent trait and latent class variables at previous time periods for each individual. Simulation studies are conducted to make comparisons with traditional models in order to illustrate the gains from the proposed approach. The new approach is applied to data from the Southern California Children Health Study to jointly model questionnaire-based asthma state and multiple lung function measurements in order to gain better insight about the underlying biological mechanism that governs the inter-relationship between asthma state and lung function development.

20.
This paper aims at evaluating different aspects of the Monte Carlo expectation-maximization algorithm for estimating heavy-tailed mixed logistic regression (MLR) models. As a novelty, it also proposes a multiple chain Gibbs sampler to generate from the latent variable distributions, thus obtaining independent samples. In heavy-tailed MLR models, the analytical forms of the full conditional distributions for the random effects are unknown. Four different Metropolis–Hastings algorithms are considered to generate from them. We also discuss stopping rules in order to obtain more efficient algorithms in heavy-tailed MLR models. The algorithms are compared through the analysis of simulated and Ascaris suum data.
