Found 20 similar articles (search time: 203 ms)
Latent variable models are widely used for the joint modeling of mixed data including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent, and that the count and continuous variables follow Poisson and normal distributions, respectively. As such data may be drawn from different subpopulations, unobserved heterogeneity has to be taken into account. A mixture distribution is assumed for the latent variable to account for this heterogeneity. The generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed using the supplemented EM algorithm. An analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

The Cramér-von Mises test methodology is applied to build a goodness-of-fit test for the mixed Rasch model. The mixed Rasch model is a probability model of a multivariate discrete random variable driven by an unknown latent continuous variable. The problem of estimating the unknown fixed difficulty parameters and the latent density function is also considered. The theoretical results are illustrated through simulations and an application to real Quality of Life data.
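
As a point of reference for the abstract above, the following is a minimal sketch of the classical one-sample Cramér-von Mises statistic for a fully specified continuous distribution. It is not the mixed Rasch test the paper constructs; the function names `cramer_von_mises` and `uniform_cdf` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def cramer_von_mises(sample, cdf):
    """Classical one-sample statistic W^2 = 1/(12n) + sum_i (F(x_(i)) - (2i-1)/(2n))^2."""
    x = np.sort(np.asarray(sample, dtype=float))  # order statistics x_(1) <= ... <= x_(n)
    n = x.size
    i = np.arange(1, n + 1)
    u = cdf(x)                                    # hypothesised CDF at each order statistic
    return 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution (illustrative null hypothesis)."""
    return np.clip(x, 0.0, 1.0)

# A sample sitting exactly on the expected quantiles gives the minimum value 1/(12n).
w2_good = cramer_von_mises([0.25, 0.75], uniform_cdf)
# A sample bunched near 1 fits the uniform distribution worse, so W^2 is larger.
w2_bad = cramer_von_mises([0.90, 0.95], uniform_cdf)
```

Larger values of the statistic indicate a worse fit; turning it into a test requires the null distribution of W^2, which this sketch does not compute.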

We propose a general latent variable model for multivariate ordinal categorical variables, in which both the responses and the covariates are ordinal, to assess the effect of the covariates on the responses and to model the covariance structure of the response variables. A fully Bayesian approach is employed to analyze the model. The Gibbs sampler is used to simulate the joint posterior distribution of the latent variables and the parameters, and the parameter expansion and reparameterization techniques are used to speed up the convergence procedure. The proposed model and method are demonstrated by simulation studies and a real data example.

In biomedical and public health research, both repeated measures of biomarkers Y and times T to key clinical events are often collected for a subject. The scientific question is how the distribution of the responses [T, Y | X] changes with covariates X. [T | X] may be the focus of the estimation where Y can be used as a surrogate for T. Alternatively, T may be the time to drop-out in a study in which [Y | X] is the target for estimation. Also, the focus of a study might be on the effects of covariates X on both T and Y or on some underlying latent variable which is thought to be manifested in the observable outcomes. In this paper, we present a general model for the joint analysis of [T, Y | X] and apply the model to estimate [T | X] and other related functionals by using the relevant information in both T and Y. We adopt a latent variable formulation like that of Fawcett and Thomas and use it to estimate several quantities of clinical relevance to determine the efficacy of a treatment in a clinical trial setting. We use a Markov chain Monte Carlo algorithm to estimate the model's parameters. We illustrate the methodology with an analysis of data from a clinical trial comparing risperidone with a placebo for the treatment of schizophrenia.

We propose a random partition model that implements prediction with many candidate covariates and interactions. The model is based on a modified product partition model that includes a regression on covariates by favouring homogeneous clusters in terms of these covariates. Additionally, the model allows for a cluster‐specific choice of the covariates that are included in this evaluation of homogeneity. The variable selection is implemented by introducing a set of cluster‐specific latent indicators that include or exclude covariates. The proposed model is motivated by an application to predicting mortality in an intensive care unit in Lisboa, Portugal.

We develop Bayesian models for density regression with emphasis on discrete outcomes. The problem of density regression is approached by considering methods for multivariate density estimation of mixed scale variables, and obtaining conditional densities from the multivariate ones. The approach to multivariate mixed scale outcome density estimation that we describe represents discrete variables, either responses or covariates, as discretised versions of continuous latent variables. We present and compare several models for obtaining these thresholds in the challenging context of count data analysis where the response may be over‐ and/or under‐dispersed in some of the regions of the covariate space. We utilise a nonparametric mixture of multivariate Gaussians to model the directly observed and the latent continuous variables. The paper presents a Markov chain Monte Carlo algorithm for posterior sampling, sufficient conditions for weak consistency, and illustrations on density, mean and quantile regression utilising simulated and real datasets.

We propose a latent variable model for informative missingness in longitudinal studies which is an extension of the latent dropout class model. In our model, the value of the latent variable is affected by the missingness pattern and it is also used as a covariate in modeling the longitudinal response. The latent variable thus links the longitudinal response and the missingness process. In our model, the latent variable is continuous instead of categorical and we assume that it follows a normal distribution. The EM algorithm is used to obtain estimates of the parameters of interest, and Gauss–Hermite quadrature is used to approximate the integral over the latent variable. The standard errors of the parameter estimates can be obtained from the bootstrap method or from the inverse of the Fisher information matrix of the final marginal likelihood. Comparisons are made to the mixed model and complete-case analysis using a clinical trial dataset, the Weight Gain Prevention among Women (WGPW) study. We use the generalized Pearson residuals to assess the fit of the proposed latent variable model.
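
The Gauss–Hermite step mentioned in this abstract can be sketched as follows. This is only an illustration of the quadrature rule for averaging a function over a normally distributed latent variable, not the paper's likelihood; the function name `normal_expectation`, the default of 20 nodes, and the logistic example are assumptions.

```python
import numpy as np

def normal_expectation(g, mu=0.0, sigma=1.0, n_nodes=20):
    """Approximate E[g(Z)] for Z ~ N(mu, sigma^2) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for integrals of the form
    # int exp(-x^2) f(x) dx, so substitute z = mu + sqrt(2) * sigma * x.
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    z = mu + np.sqrt(2.0) * sigma * nodes
    return np.sum(weights * g(z)) / np.sqrt(np.pi)

# Illustrative use: marginal success probability of a logistic response whose
# linear predictor is a standard normal latent variable (hypothetical model).
marginal_p = normal_expectation(lambda z: 1.0 / (1.0 + np.exp(-z)))

# Sanity checks against known moments of the normal distribution.
second_moment = normal_expectation(lambda z: z ** 2)   # equals 1 for N(0, 1)
mgf_at_one = normal_expectation(np.exp)                # equals exp(1/2) for N(0, 1)
```

In an EM setting, a rule of this kind would replace the integral over the latent variable in each subject's marginal likelihood contribution with a finite weighted sum.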

A general model is proposed for flexibly estimating the density of a continuous response variable conditional on a possibly high-dimensional set of covariates. The model is a finite mixture of asymmetric Student t densities with covariate-dependent mixture weights. The four parameters of the components, the mean, degrees of freedom, scale and skewness, are all modeled as functions of the covariates. Inference is Bayesian and the computation is carried out using Markov chain Monte Carlo simulation. To enable model parsimony, a variable selection prior is used in each set of covariates and among the covariates in the mixing weights. The model is used to analyze the distribution of daily stock market returns, and is shown to forecast the distribution of returns more accurately than other widely used models for financial data.

We are concerned with cumulative regression models for an ordered categorical response variable Y. We propose two methods to build partial residuals from regression on a subset Z1 of covariates Z, which take into account the ordinal character of the response. The first method makes use of a multivariate GLM-representation of the model and produces residual measures for diagnostic purposes. The second uses a latent continuous variable model and yields new (adjusted) ordinal data Y*. Both methods are illustrated by a data set from forestry.
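
The link between a cumulative regression model and a latent continuous variable can be sketched as below. The abstract does not state the link function, so a probit link (standard normal error) is assumed here; the names `category_probs` and `discretise`, the thresholds, and the linear predictor value are all hypothetical.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probs(eta, thresholds):
    """P(Y = k) when Y = k iff tau_{k-1} < Y* <= tau_k and Y* = eta + N(0, 1) noise."""
    # Cumulative probabilities P(Y <= k) = Phi(tau_k - eta), padded with 0 and 1.
    cum = [0.0] + [normal_cdf(t - eta) for t in thresholds] + [1.0]
    return np.diff(cum)

def discretise(y_star, thresholds):
    """Map latent values to ordinal categories 1..K using the same thresholds."""
    return np.searchsorted(thresholds, y_star, side="left") + 1

thresholds = [-1.0, 1.0]                      # two thresholds give three categories
probs = category_probs(0.0, thresholds)       # category probabilities at eta = 0
cats = discretise(np.array([-2.0, 0.0, 3.0]), thresholds)
```

The same thresholds drive both functions, which is what lets a latent-variable formulation produce adjusted ordinal data from an adjusted latent value.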

We consider an extension of the recursive bivariate probit model for estimating the effect of a binary variable on a binary outcome in the presence of unobserved confounders, nonlinear covariate effects and overdispersion. Specifically, the model consists of a system of two binary outcomes with a binary endogenous regressor which includes smooth functions of covariates, hence allowing for flexible functional dependence of the responses on the continuous regressors, and arbitrary random intercepts to deal with overdispersion arising from correlated observations on clusters or from the omission of non‐confounding covariates. We fit the model by maximizing a penalized likelihood using an Expectation‐Maximisation algorithm. The issues of automatic multiple smoothing parameter selection and inference are also addressed. The empirical properties of the proposed algorithm are examined in a simulation study. The method is then illustrated using data from a survey on health, aging and wealth.

In longitudinal observational studies, repeated measures are often correlated with observation times as well as censoring time. This article proposes joint modeling and analysis of longitudinal data with time-dependent covariates in the presence of informative observation and censoring times via a latent variable. Estimating equation approaches are developed for parameter estimation and asymptotic properties of the proposed estimators are established. In addition, a generalization of the semiparametric model with time-varying coefficients for the longitudinal response is considered. Furthermore, a lack-of-fit test is provided for assessing the adequacy of the model, and some tests are presented for investigating whether or not covariate effects vary with time. The finite-sample behavior of the proposed methods is examined in simulation studies, and an application to a bladder cancer study is presented.

In this article, we develop a Bayesian variable selection method for selecting covariates in the Poisson change-point regression model with both discrete and continuous candidate covariates. Ranging from a null model with no selected covariates to a full model including all covariates, the Bayesian variable selection method searches the entire model space, estimates posterior inclusion probabilities of covariates, and obtains model-averaged estimates of the covariate coefficients, while simultaneously estimating a time-varying baseline rate due to change-points. For posterior computation, a Metropolis-Hastings within partially collapsed Gibbs sampler is developed to efficiently fit the Poisson change-point regression model with variable selection. We illustrate the proposed method using simulated and real datasets.

We propose a class of state-space models for multivariate longitudinal data where the components of the response vector may have different distributions. The approach is based on the class of Tweedie exponential dispersion models, which accommodates a wide variety of discrete, continuous and mixed data. The latent process is assumed to be a Markov process, and the observations are conditionally independent given the latent process, over time as well as over the components of the response vector. This provides a fully parametric alternative to the quasilikelihood approach of Liang and Zeger. We estimate the regression parameters for time-varying covariates entering either via the observation model or via the latent process, based on an estimating equation derived from the Kalman smoother. We also consider analysis of residuals from both the observation model and the latent process.

Current methods of testing the equality of conditional correlations of bivariate data on a third variable of interest (covariate) are limited because the covariate must be discretized when it is continuous. In this study, we propose a linear model approach for estimation and hypothesis testing of the Pearson correlation coefficient, where the correlation itself can be modeled as a function of continuous covariates. The restricted maximum likelihood method is applied for parameter estimation, and the corrected likelihood ratio test is performed for hypothesis testing. This approach allows for flexible and robust inference and prediction of the conditional correlations based on the linear model. Simulation studies show that the proposed method is statistically more powerful and more flexible in accommodating complex covariate patterns than the existing methods. In addition, we illustrate the approach by analyzing the correlation between the physical component summary and the mental component summary of the MOS SF-36 form across a fair number of covariates in the national survey data.

The main purpose of this paper is the longitudinal analysis of the poverty phenomenon. By interpreting poverty as a latent variable, we are able to resort to the statistical methodology developed for latent structure analysis. In particular, we propose to use the mixture latent Markov model which allows us to achieve two goals: (i) a time-invariant classification of households into homogeneous groups, representing different levels of poverty; (ii) the dynamic analysis of the poverty phenomenon which highlights the distinction between transitory and permanent poverty situations. Furthermore, we exploit the flexibility provided by the model in order to achieve the measurement of poverty in a multidisciplinary framework, using several socio-economic indicators as covariates and identifying the main relevant factors which influence permanent and transitory poverty. The analysis of the longitudinal data of the Survey on Households Income and Wealth of the Bank of Italy identifies two groups of households which are characterized by different dynamic features. Moreover, the inclusion of socio-economic covariates such as level of education, employment status, geographical area and residence size of the household head shows a direct association with permanent poverty.

A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.

Latent variable models have been widely used for modelling the dependence structure of multiple outcomes data. However, the formulation of a latent variable model is often unknown a priori, and misspecification will distort the dependence structure and lead to unreliable model inference. Moreover, multiple outcomes with varying types present enormous analytical challenges. In this paper, we present a class of general latent variable models that can accommodate mixed types of outcomes. We propose a novel selection approach that simultaneously selects latent variables and estimates parameters. We show that the proposed estimator is consistent, asymptotically normal and has the oracle property. The practical utility of the methods is confirmed via simulations as well as an application to the analysis of the World Values Survey, a global research project that explores people's values and beliefs and the social and personal characteristics that might influence them.

We discuss the use of latent variable models with observed covariates for computing response propensities for sample respondents. A response propensity score is often used to weight item and unit responders to account for item and unit non-response and to obtain adjusted means and proportions. In the context of attitude scaling, we discuss computing response propensity scores by using latent variable models for binary or nominal polytomous manifest items with covariates. Our models allow the response propensity scores to be found for several different items without refitting. They allow any pattern of missing responses for the items. If one prefers, it is possible to estimate population proportions directly from the latent variable models, so avoiding the use of propensity scores. Artificial data sets and a real data set extracted from the 1996 British Social Attitudes Survey are used to compare the various methods proposed.

The concepts of relative risk and hazard ratio are generalized for ordinary ordinal and continuous response variables, respectively. Under the generalized concepts, the Cox proportional hazards model with Breslow's and Efron's methods can be regarded as a generalization of the Mantel–Haenszel estimator for dealing with broader types of covariates and responses. When ordinal responses can be regarded as discretized observations of a hypothetical continuous variable, the estimated relative risks from the Cox model reflect the associations between the responses and covariates. Examples are given to illustrate the generalized concepts and wider applications of the Cox model and the Kaplan–Meier estimator.
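
For readers unfamiliar with the Kaplan–Meier estimator named in this abstract, here is a minimal sketch of the standard product-limit form for right-censored data. It does not implement the generalized concepts the paper proposes; the function name `kaplan_meier` and the toy data are assumptions.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate S(t) = prod over event times t_j <= t of (1 - d_j / n_j)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)     # True = event observed, False = censored
    event_times = np.unique(times[events])      # distinct observed event times, sorted
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)            # n_j: subjects still under observation at t
        deaths = np.sum((times == t) & events)  # d_j: events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)

# Toy data: events at times 1 and 3, one censored observation at time 2.
km_times, km_surv = kaplan_meier([1, 2, 3], [1, 0, 1])
```

The censored subject at time 2 contributes to the risk set at time 1 but not at time 3, which is why the estimate drops from 2/3 to 0 rather than to 1/3.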

Both continuous and categorical covariates are common in traditional Chinese medicine (TCM) research, especially in clinical syndrome identification and in risk prediction research. For groups of dummy variables that are generated by the same categorical covariate, it is important to penalize them group-wise rather than individually. In this paper, we discuss the group lasso method for a risk prediction analysis in TCM osteoporosis research. This is the first application of such a group-wise variable selection method in this field. It may lead to new insights into using the grouped penalization method to select appropriate covariates in TCM research. The introduced methodology can select categorical and continuous variables and estimate their parameters simultaneously. In our application to the osteoporosis data, four covariates (including both categorical and continuous covariates) are selected out of 52 covariates. The accuracy of the prediction model is excellent. Compared with prediction models using different covariates, the group lasso risk prediction model can significantly decrease the error rate and help TCM doctors to identify patients with a high risk of osteoporosis in clinical practice. Simulation results show that the application of the group lasso method is reasonable for the categorical covariates selection model in this TCM osteoporosis research.
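
The group-wise penalization described above can be illustrated with the block soft-thresholding operator that group lasso solvers commonly apply to each group of coefficients. This is a sketch under simplifying assumptions (equal penalty weights for all groups, hypothetical coefficient values), not the fitting procedure used in the paper; the name `group_soft_threshold` is illustrative.

```python
import numpy as np

def group_soft_threshold(beta, groups, lam):
    """Shrink each coefficient group toward zero; drop a group whose L2 norm is <= lam."""
    beta = np.asarray(beta, dtype=float)
    out = np.zeros_like(beta)
    for idx in groups:
        idx = list(idx)
        norm = np.linalg.norm(beta[idx])
        if norm > lam:
            # The whole group is scaled by a common factor, so the dummy
            # variables of one categorical covariate enter or leave together.
            out[idx] = (1.0 - lam / norm) * beta[idx]
    return out

# Hypothetical coefficients: indices 0-1 are dummies of one categorical
# covariate, indices 2-3 are dummies of another.
shrunk = group_soft_threshold([3.0, 4.0, 0.1, -0.1], groups=[[0, 1], [2, 3]], lam=1.0)
```

In this example the first group is kept and shrunk, while the second group is removed as a unit, which is the behaviour that distinguishes group-wise from coefficient-wise penalization.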
