Similar literature
1.
ABSTRACT

Latent variable modeling is commonly used in behavioral, social, and medical science research. The models used in such analysis relate all observed variables to latent common factors. In many applications, the observations are highly non-normal or discrete, e.g., polytomous responses or counts. The existing approaches for non-normal observations can be considered lacking in several aspects, especially for multi-group sample situations. We propose a generalized linear model approach for multi-sample latent variable analysis that can handle a broad class of non-normal and discrete observations, and that furnishes meaningful interpretation and inference in multi-group studies through maximum likelihood analysis. A Monte Carlo EM algorithm is proposed for parameter estimation. Convergence assessment and standard error estimation are addressed. Simulation studies are reported to show the usefulness of our approach. An example from a substance abuse prevention study is also presented.

2.
Particle MCMC involves using a particle filter within an MCMC algorithm. For inference of a model which involves an unobserved stochastic process, the standard implementation uses the particle filter to propose new values for the stochastic process, and MCMC moves to propose new values for the parameters. We show how particle MCMC can be generalised beyond this. Our key idea is to introduce new latent variables. We then use the MCMC moves to update the latent variables, and the particle filter to propose new values for the parameters and stochastic process given the latent variables. A generic way of defining these latent variables is to model them as pseudo-observations of the parameters or of the stochastic process. By choosing the amount of information these latent variables have about the parameters and the stochastic process we can often improve the mixing of the particle MCMC algorithm by trading off the Monte Carlo error of the particle filter and the mixing of the MCMC moves. We show that using pseudo-observations within particle MCMC can improve its efficiency in certain scenarios: dealing with initialisation problems of the particle filter; speeding up the mixing of particle Gibbs when there is strong dependence between the parameters and the stochastic process; and enabling further MCMC steps to be used within the particle filter.

3.
We develop Bayesian models for density regression with emphasis on discrete outcomes. The problem of density regression is approached by considering methods for multivariate density estimation of mixed scale variables, and obtaining conditional densities from the multivariate ones. The approach to multivariate mixed scale outcome density estimation that we describe represents discrete variables, either responses or covariates, as discretised versions of continuous latent variables. We present and compare several models for obtaining these thresholds in the challenging context of count data analysis where the response may be over‐ and/or under‐dispersed in some of the regions of the covariate space. We utilise a nonparametric mixture of multivariate Gaussians to model the directly observed and the latent continuous variables. The paper presents a Markov chain Monte Carlo algorithm for posterior sampling, sufficient conditions for weak consistency, and illustrations on density, mean and quantile regression utilising simulated and real datasets.

4.
The continuous extension of a discrete random variable is amongst the computational methods used for estimation of multivariate normal copula-based models with discrete margins. Its advantage is that the likelihood can be derived conveniently under the theory for copula models with continuous margins, but there has not been a clear analysis of the adequacy of this method. We investigate the asymptotic and small-sample efficiency of two variants of the method for estimating the multivariate normal copula with univariate binary, Poisson, and negative binomial regressions, and show that they lead to biased estimates for the latent correlations, and the univariate marginal parameters that are not regression coefficients. We implement a maximum simulated likelihood method, which is based on evaluating the multidimensional integrals of the likelihood with randomized quasi-Monte Carlo methods. Asymptotic and small-sample efficiency calculations show that our method is nearly as efficient as maximum likelihood for fully specified multivariate normal copula-based models. An illustrative example is given to show the use of our simulated likelihood method.

5.
Models that involve an outcome variable, covariates, and latent variables are frequently the target for estimation and inference. The presence of missing covariate or outcome data presents a challenge, particularly when missingness depends on the latent variables. This missingness mechanism is called latent ignorable or latent missing at random and is a generalisation of missing at random. Several authors have previously proposed approaches for handling latent ignorable missingness, but these methods rely on prior specification of the joint distribution for the complete data. In practice, specifying the joint distribution can be difficult and/or restrictive. We develop a novel sequential imputation procedure for imputing covariate and outcome data for models with latent variables under latent ignorable missingness. The proposed method does not require a joint model; rather, we use results under a joint model to inform imputation with less restrictive modelling assumptions. We discuss identifiability and convergence‐related issues, and simulation results are presented in several modelling settings. The method is motivated and illustrated by a study of head and neck cancer recurrence. Imputing missing data for models with latent variables under latent‐dependent missingness without specifying a full joint model.

6.
Abstract. We consider a semi‐nonparametric specification for the density of latent variables in Generalized Linear Latent Variable Models (GLLVM). This specification is flexible enough to allow for an asymmetric, multi‐modal, heavy or light tailed smooth density. The degree of flexibility required by many applications of GLLVM can be achieved through this semi‐nonparametric specification with a finite number of parameters estimated by maximum likelihood. Even with this additional flexibility, we obtain an explicit expression of the likelihood for conditionally normal manifest variables. We show by simulations that the estimated density of latent variables captures the true one with a good degree of accuracy and is easy to visualize. By analysing two real data sets we show that a flexible distribution of latent variables is a useful tool for exploring the adequacy of the GLLVM in practice.

7.
A latent Markov model for detecting patterns of criminal activity (cited by: 1; self-citations: 0; citations by others: 1)
Summary.  The paper investigates the problem of determining patterns of criminal behaviour from official criminal histories, concentrating on the variety and type of offending convictions. The analysis is carried out on the basis of a multivariate latent Markov model which allows for discrete covariates affecting the initial and the transition probabilities of the latent process. We also show some simplifications which reduce the number of parameters substantially; we include a Rasch-like parameterization of the conditional distribution of the response variables given the latent process and a constraint of partial homogeneity of the latent Markov chain. For the maximum likelihood estimation of the model we outline an EM algorithm based on recursions known in the hidden Markov literature, which make the estimation feasible also when the number of time occasions is large. Through this model, we analyse the conviction histories of a cohort of offenders who were born in England and Wales in 1953. The final model identifies five latent classes and specifies common transition probabilities for males and females between 5-year age periods, but with different initial probabilities.

8.
Directed acyclic graph (DAG) models, also called Bayesian networks, are widely used in probabilistic reasoning, machine learning and causal inference. If latent variables are present, then the set of possible marginal distributions over the remaining (observed) variables is generally not represented by any DAG. Larger classes of mixed graphical models have been introduced to overcome this; however, as we show, these classes are not sufficiently rich to capture all the marginal models that can arise. We introduce a new class of hyper‐graphs, called mDAGs, and a latent projection operation to obtain an mDAG from the margin of a DAG. We show that each distinct marginal of a DAG model is represented by at least one mDAG and provide graphical results towards characterizing equivalence of these models. Finally, we show that mDAGs correctly capture the marginal structure of causally interpreted DAGs under interventions on the observed variables.

9.
Abstract

In internet financial transactions and reliability engineering, data can contain an excess of both zero and one observations at the same time. Because such data fall outside the range that conventional models can fit, this paper proposes a zero-and-one-inflated geometric distribution regression model. By introducing Pólya-Gamma latent variables into the Bayesian inference, posterior sampling of high-dimensional parameters is converted into sampling of the latent variables and posterior sampling of lower-dimensional parameters. This circumvents the need for Metropolis-Hastings sampling and yields samples with higher sampling efficiency. A simulation study is conducted to assess the performance of the proposed estimation for various sample sizes. Finally, a doctoral dissertation data set is analyzed to illustrate the practicability of the proposed method; the results show that the zero-and-one-inflated geometric distribution regression model using Pólya-Gamma latent variables can achieve better fitting results.

10.
For manifest variables with additive noise and for a given number of latent variables with an assumed distribution, we propose to nonparametrically estimate the association between latent and manifest variables. Our estimation is a two step procedure: first it employs standard factor analysis to estimate the latent variables as theoretical quantiles of the assumed distribution; second, it employs the additive models’ backfitting procedure to estimate the monotone nonlinear associations between latent and manifest variables. The estimated fit may suggest a different latent distribution or point to nonlinear associations. We show on simulated data how, based on mean squared errors, the nonparametric estimation improves on factor analysis. We then employ the new estimator on real data to illustrate its use for exploratory data analysis.

11.
This article generalizes the Markov chain Monte Carlo (MCMC) algorithm, based on the Gibbs weighted Chinese restaurant (gWCR) process algorithm, for a class of kernel mixture of time series models over the Dirichlet process. This class of models is an extension of Lo’s (Ann. Stat. 12:351–357, 1984) kernel mixture model for independent observations. The kernel represents a known distribution of time series conditional on past time series and both present and past latent variables. The latent variables are independent samples from a Dirichlet process, which is a random discrete (almost surely) distribution. This class of models includes an infinite mixture of autoregressive processes and an infinite mixture of generalized autoregressive conditional heteroskedasticity (GARCH) processes.

12.
Model-based clustering methods for continuous data are well established and commonly used in a wide range of applications. However, model-based clustering methods for categorical data are less standard. Latent class analysis is a commonly used method for model-based clustering of binary data and/or categorical data, but due to an assumed local independence structure there may not be a correspondence between the estimated latent classes and groups in the population of interest. The mixture of latent trait analyzers model extends latent class analysis by assuming a model for the categorical response variables that depends on both a categorical latent class and a continuous latent trait variable; the discrete latent class accommodates group structure and the continuous latent trait accommodates dependence within these groups. Fitting the mixture of latent trait analyzers model is potentially difficult because the likelihood function involves an integral that cannot be evaluated analytically. We develop a variational approach for fitting the mixture of latent trait models and this provides an efficient model fitting strategy. The mixture of latent trait analyzers model is demonstrated on the analysis of data from the National Long Term Care Survey (NLTCS) and voting in the U.S. Congress. The model is shown to yield intuitive clustering results and it gives a much better fit than either latent class analysis or latent trait analysis alone.

13.
We propose a class of multidimensional Item Response Theory models for polytomously-scored items with ordinal response categories. This class extends an existing class of multidimensional models for dichotomously-scored items in which the latent abilities are represented by a random vector assumed to have a discrete distribution, with support points corresponding to different latent classes in the population. In the proposed approach, we allow for different parameterizations for the conditional distribution of the response variables given the latent traits, which depend on the type of link function and the constraints imposed on the item parameters. Moreover, we suggest a strategy for model selection that is based on a series of steps consisting of selecting specific features, such as the dimension of the model (number of latent traits), the number of latent classes, and the specific parameterization. In order to illustrate the proposed approach, we analyze a dataset from a study on anxiety and depression on a sample of oncological patients.

14.
For observable indicators with ordered categories one can assume underlying latent variables following certain marginal distributions. Transforming the latent variables changes their marginal distributions but not the observable qualitative indicators. The joint distribution of the latent variables can be constructed from the marginal distributions. There is a broad class of multivariate distributions for which the observable indicators are equivalent. By choosing the multivariate normal distribution from this class we can analyse a linear relationship between the transformed latent variables. This leads to latent structural equation models. Estimation of these latter models is therefore more general than the distributional assumption might initially suggest. Robustness of the estimation procedure is also discussed for deviations from this distribution family. Using ordinal business survey data of the German Ifo-institute we test the efficiency of firms' price expectations implied by the rational expectation hypothesis.

15.
Latent Variable Models for Mixed Discrete and Continuous Outcomes (cited by: 1; self-citations: 0; citations by others: 1)
We propose a latent variable model for mixed discrete and continuous outcomes. The model accommodates any mixture of outcomes from an exponential family and allows for arbitrary covariate effects, as well as direct modelling of covariates on the latent variable. An EM algorithm is proposed for parameter estimation and estimates of the latent variables are produced as a by-product of the analysis. A generalized likelihood ratio test can be used to test the significance of covariates affecting the latent outcomes. This method is applied to birth defects data, where the outcomes of interest are continuous measures of size and binary indicators of minor physical anomalies. Infants who were exposed in utero to anticonvulsant medications are compared with controls.

16.
Owing to the nature of the problems and the design of questionnaires, discrete polytomous data are very common in behavioural, medical and social research. Analysing the relationships between the manifest and the latent variables based on mixed polytomous and continuous data has proven to be difficult. A general structural equation model is investigated for these mixed outcomes. Maximum likelihood (ML) estimates of the unknown thresholds and the structural parameters in the covariance structure are obtained. A Monte Carlo–EM algorithm is implemented to produce the ML estimates. It is shown that closed form solutions can be obtained for the M-step, and estimates of the latent variables are produced as a by-product of the analysis. The method is illustrated with a real example.

17.
Latent variable models have been widely used for modelling the dependence structure of multiple outcomes data. However, the formulation of a latent variable model is often unknown a priori, and misspecification will distort the dependence structure and lead to unreliable model inference. Moreover, multiple outcomes with varying types present enormous analytical challenges. In this paper, we present a class of general latent variable models that can accommodate mixed types of outcomes. We propose a novel selection approach that simultaneously selects latent variables and estimates parameters. We show that the proposed estimator is consistent, asymptotically normal and has the oracle property. The practical utility of the methods is confirmed via simulations as well as an application to the analysis of the World Values Survey, a global research project that explores people’s values and beliefs and the social and personal characteristics that might influence them.

18.
The use of ridits, as probability scores, is a very common practice for comparing discrete random variables in discrete data analysis. In the present work we formulate ridit reliability functionals for some comparison of K independent binary random variables. We use such functionals to provide a generalized response-adaptive design (GRAD) on K (≥ 2) treatment arms for dichotomous response variables. We exhibit some properties of the proposed design and compare it with some of the existing competitors by computing its various performance measures. We also provide a discussion towards a possible modification of the GRAD in the presence of covariates.
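
As a reminder of what a ridit score is, the following minimal sketch computes classical ridits for ordered categories relative to a reference group: the proportion of the reference group below a category plus half the proportion in it. This is the textbook definition only, not the reliability functionals or the GRAD design of the abstract above; the function names `ridit_scores` and `mean_ridit` are hypothetical.

```python
def ridit_scores(ref_counts):
    """Ridit of each ordered category relative to a reference group.

    ref_counts[j] is the number of reference observations in category j.
    The ridit of category j is P(ref < j) + 0.5 * P(ref = j).
    """
    total = sum(ref_counts)
    if total <= 0:
        raise ValueError("reference group must contain observations")
    scores = []
    below = 0  # running count of reference observations in lower categories
    for count in ref_counts:
        scores.append((below + 0.5 * count) / total)
        below += count
    return scores


def mean_ridit(group_counts, scores):
    """Average ridit of a comparison group, using reference-based scores.

    This estimates P(ref < group) + 0.5 * P(ref = group).
    """
    n = sum(group_counts)
    return sum(c * s for c, s in zip(group_counts, scores)) / n
```

By construction the reference group has mean ridit 0.5 against itself, so a comparison group with a mean ridit above 0.5 tends to fall in higher categories than the reference.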

19.
The alias method of Walker is a clever, new, fast method for generating random variables from an arbitrary, specified discrete distribution. A simple probabilistic proof is given, in terms of mixtures, that the method works for any discrete distribution with a finite number of outcomes. A more efficient version of the table-generating portion of the method is described. Finally, a brief discussion on efficiency of the method is given. We believe that the generality, speed, and simplicity of the method make it attractive for use in generating discrete random variables.
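
The alias method can be sketched in a few lines. The version below builds the tables with a commonly used small/large worklist scheme, which is an assumption made here for illustration and not necessarily the table-generating procedure described in the paper; the names `build_alias_table` and `alias_sample` are hypothetical.

```python
import random


def build_alias_table(probs):
    """Build alias tables for a finite discrete distribution.

    Each of the n columns is a two-point mixture: with probability
    prob[i] it returns i, otherwise it returns alias[i].
    """
    n = len(probs)
    scaled = [p * n for p in probs]  # each column should hold mass 1
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l  # the rest of column s is filled by outcome l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)
    # Leftovers hold mass 1 up to rounding error.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_sample(prob, alias, rng=random):
    """Draw one outcome in O(1): pick a column, then flip a biased coin."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


# Example usage with an arbitrary four-outcome distribution.
prob, alias = build_alias_table([0.1, 0.2, 0.3, 0.4])
draws = [alias_sample(prob, alias) for _ in range(5)]
```

Table construction is linear in the number of outcomes, and every draw afterwards costs one uniform index and one uniform comparison, which is where the speed of the method comes from.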

20.
We investigate two options for performing Bayesian inference on spatial log-Gaussian Cox processes assuming a spatially continuous latent field: Markov chain Monte Carlo (MCMC) and the integrated nested Laplace approximation (INLA). We first describe the device of approximating a spatially continuous Gaussian field by a Gaussian Markov random field on a discrete lattice, and present a simulation study showing that, with careful choice of parameter values, small neighbourhood sizes can give excellent approximations. We then introduce the spatial log-Gaussian Cox process and describe MCMC and INLA methods for spatial prediction within this model class. We report the results of a simulation study in which we compare the Metropolis-adjusted Langevin Algorithm (MALA) and the technique of approximating the continuous latent field by a discrete one, followed by approximate Bayesian inference via INLA over a selection of 18 simulated scenarios. The results question the notion that the latter technique is both significantly faster and more robust than MCMC in this setting; 100,000 iterations of the MALA algorithm running in 20 min on a desktop PC delivered greater predictive accuracy than the default INLA strategy, which ran in 4 min and gave comparable performance to the full Laplace approximation which ran in 39 min.
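
For readers unfamiliar with MALA, the sketch below shows one generic form of the update on a toy one-dimensional target. It is only an illustration of the Langevin proposal and the Metropolis-Hastings correction; the log-Gaussian Cox process target, the tuning, and the implementation used in the paper are not reproduced, and the names `mala_step` and `run_mala` are hypothetical.

```python
import math
import random


def mala_step(x, log_density, grad_log_density, step, rng):
    """One MALA update for a scalar state x with step size `step`."""

    def drift(z):
        # Langevin drift: move half a step along the gradient of the log density.
        return z + 0.5 * step * grad_log_density(z)

    def log_q(to, frm):
        # Log density (up to a constant) of proposing `to` from `frm`.
        return -((to - drift(frm)) ** 2) / (2.0 * step)

    proposal = drift(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
    log_alpha = (log_density(proposal) + log_q(x, proposal)
                 - log_density(x) - log_q(proposal, x))
    # Metropolis-Hastings accept/reject corrects the discretisation error.
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return proposal
    return x


def run_mala(log_density, grad_log_density, x0, step, n_iter, seed=0):
    """Run a MALA chain and return all visited states."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_iter):
        x = mala_step(x, log_density, grad_log_density, step, rng)
        samples.append(x)
    return samples


# Toy target (an assumption for illustration): a standard normal density.
samples = run_mala(lambda x: -0.5 * x * x, lambda x: -x,
                   x0=0.0, step=1.0, n_iter=5000)
```

In a spatial setting the state would be a high-dimensional latent field rather than a scalar, and the step size would typically need tuning to reach a reasonable acceptance rate.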
