Similar articles
 20 similar articles found (search time: 546 ms)
1.
A general framework is proposed for modelling clustered mixed outcomes. A mixture of generalized linear models is used to describe the joint distribution of a set of underlying variables, and an arbitrary function relates the underlying variables to the observed outcomes. The model accommodates multilevel data structures, general covariate effects, and distinct link functions and error distributions for each underlying variable. Within the proposed framework, novel models are developed for clustered multiple binary, unordered categorical, and joint discrete and continuous outcomes. A Markov chain Monte Carlo sampling algorithm is described for estimating the posterior distributions of the parameters and latent variables. Because of the flexibility of the modelling framework and estimation procedure, extensions to ordered categorical outcomes and more complex data structures are straightforward. The methods are illustrated using data from a reproductive toxicity study.

2.
Social network data represent the interactions between a group of social actors; interactions between colleagues and friendship networks are typical examples of such data. The latent space model for social network data locates each actor in a network in a latent (social) space and models the probability of an interaction between two actors as a function of their locations. The latent position cluster model extends the latent space model to deal with network data in which clusters of actors exist: actor locations are drawn from a finite mixture model, each component of which represents a cluster of actors. A mixture of experts model builds on the structure of a mixture model by taking account of both observations and associated covariates when modeling a heterogeneous population. Herein, a mixture of experts extension of the latent position cluster model is developed. The mixture of experts framework allows covariates to enter the latent position cluster model in a number of ways, yielding different model interpretations. Estimates of the model parameters are derived in a Bayesian framework using a Markov chain Monte Carlo algorithm. Because the algorithm is generally computationally expensive, surrogate proposal distributions that shadow the target distributions are derived, reducing the computational burden. The methodology is demonstrated through an illustrative example detailing relationships between a group of lawyers in the USA.
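As an illustrative sketch of the latent space idea (not code from the paper; the intercept `alpha` and the logistic link are a common formulation, assumed here), the probability of a tie between two actors can be modelled as a decreasing function of the distance between their latent positions:

```python
import math

def tie_probability(z_i, z_j, alpha=1.0):
    """Logistic tie probability: logit p_ij = alpha - |z_i - z_j|.

    z_i, z_j are latent positions (tuples of coordinates);
    alpha is a base-rate intercept (illustrative value).
    """
    dist = math.dist(z_i, z_j)          # Euclidean distance in latent space
    eta = alpha - dist                  # linear predictor
    return 1.0 / (1.0 + math.exp(-eta))

# Actors close together in the latent space interact with high probability.
p_near = tie_probability((0.0, 0.0), (0.1, 0.0))
p_far = tie_probability((0.0, 0.0), (5.0, 0.0))
```

Clusters then arise when the latent positions themselves are drawn from a finite mixture, as in the latent position cluster model.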

3.
Spatial generalised linear mixed models are commonly used for modelling non-Gaussian discrete spatial responses. In these models, the spatial correlation structure of the data is modelled by spatial latent variables. A normal distribution is routinely assumed for these variables, but in many applications it is unclear whether this assumption holds. This assumption is relaxed in the present work by using a closed skew normal distribution for the spatial latent variables, which is more flexible and includes the normal and skew normal distributions. The parameter estimates and spatial predictions are calculated using Markov chain Monte Carlo methods. Finally, the performance of the proposed model is analysed via two simulation studies, followed by a case study in which practical aspects are addressed. The proposed model gives a smaller cross-validation mean square error of spatial prediction than the normal prior when modelling the temperature data set.

4.
Communications in Statistics - Theory and Methods, 2012, 41(16-17): 3079-3093
The paper presents an extension of a new class of multivariate latent growth models (Bianconcini and Cagnone, 2012) that allows for covariate effects on manifest variables, latent variables, and random effects. The new class of models combines: (i) multivariate latent curves that describe the temporal behavior of the responses, and (ii) a factor model that specifies the relationship between manifest and latent variables. Based on the Generalized Linear and Latent Variable Model framework (Bartholomew and Knott, 1999), the response variables are assumed to follow different distributions of the exponential family, with item-specific linear predictors depending on both latent variables and measurement errors. A full maximum likelihood method is used to estimate all the model parameters simultaneously. Data from the Data WareHouse of the University of Bologna are used to illustrate the methodology.

5.
The issue of normalization arises whenever two different values for a vector of unknown parameters imply the identical economic model. A normalization implies not just a rule for selecting which among equivalent points to call the maximum likelihood estimate (MLE); it also governs the topography of the set of points that go into a small-sample confidence interval associated with that MLE. A poor normalization can lead to multimodal distributions, disjoint confidence intervals, and very misleading characterizations of the true statistical uncertainty. This paper introduces an identification principle as a framework upon which a normalization should be imposed, according to which the boundaries of the allowable parameter space should correspond to loci along which the model is locally unidentified. We illustrate these issues with examples taken from mixture models, structural vector autoregressions, and cointegration models.
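A minimal sketch of the identification problem behind normalization (hypothetical, not taken from the paper): in a two-component normal mixture, relabelling the components yields a different parameter vector but the identical likelihood, so some rule is needed to pick one of the equivalent points.

```python
import math

def mixture_loglik(x, w, mu1, mu2, sigma=1.0):
    """Log-likelihood of a two-component normal mixture with common sigma.

    Swapping (w, mu1, mu2) -> (1 - w, mu2, mu1) gives the identical model,
    which is exactly why a normalization must be imposed.
    """
    norm = 1.0 / (sigma * math.sqrt(2 * math.pi))
    ll = 0.0
    for xi in x:
        d1 = math.exp(-0.5 * ((xi - mu1) / sigma) ** 2)
        d2 = math.exp(-0.5 * ((xi - mu2) / sigma) ** 2)
        ll += math.log(norm * (w * d1 + (1 - w) * d2))
    return ll

data = [-1.2, 0.3, 2.1, 1.8, -0.7]
ll_a = mixture_loglik(data, 0.4, -1.0, 2.0)
ll_b = mixture_loglik(data, 0.6, 2.0, -1.0)  # relabelled: same likelihood
```

Both parameter vectors are global maxima of the same surface; a normalization such as mu1 < mu2 selects one, and the paper's point is that the choice of such a rule shapes the small-sample confidence sets.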

6.
The paper deals with discrete-time regression models for analyzing multistate-multiepisode event history data or failure time data collected in follow-up studies, retrospective studies, or longitudinal panels. The models are applicable when events are not dated exactly and only a time interval is recorded. The models include individual-specific parameters to account for unobserved heterogeneity, and the explanatory variables may be time-varying and random, with distributions depending on the observed history of the process. Different estimation procedures are considered: estimation of structural as well as individual-specific parameters by maximizing a joint likelihood function; estimation of the structural parameters by maximizing a conditional likelihood function, conditioning on a set of sufficient statistics for the individual-specific parameters; and estimation of the structural parameters by maximizing a marginal likelihood function, assuming that the individual-specific parameters follow a distribution. The advantages and limitations of the different approaches are discussed.
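As a hedged illustration of the interval-recorded setting (the logistic link and variable names are common conventions assumed here, not the paper's specification), a discrete-time hazard model turns a linear predictor into per-interval event probabilities, and survival past each interval is the product of the interval-wise survival factors:

```python
import math

def logit_hazard(eta):
    """Per-interval hazard from a linear predictor via the logistic link."""
    return 1.0 / (1.0 + math.exp(-eta))

def interval_survival(hazards):
    """Discrete-time survival: given hazards h_t (probability the event
    occurs in interval t given survival so far), return the probability
    of surviving past each interval."""
    surv, s = [], 1.0
    for h in hazards:
        s *= (1.0 - h)
        surv.append(s)
    return surv

# Illustrative linear predictor with an increasing time effect.
surv = interval_survival([logit_hazard(-2.0 + 0.3 * t) for t in range(5)])
```

Individual-specific parameters would enter the linear predictor as an extra intercept per subject, which is what the conditional- and marginal-likelihood strategies in the abstract handle.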

7.
A class of prior distributions for multivariate autoregressive models is presented. This class of priors is built taking into account the latent component structure that characterizes a collection of autoregressive processes. In particular, the state-space representation of a vector autoregressive process leads to the decomposition of each time series in the multivariate process into simple underlying components. These components may have a common structure across the series. A key feature of the proposed priors is that they allow the modeling of such common structure. This approach also takes into account the uncertainty in the number of latent processes, consequently handling model order uncertainty in the multivariate autoregressive framework. Posterior inference is achieved via standard Markov chain Monte Carlo (MCMC) methods. Issues related to inference and exploration of the posterior distribution are discussed. We illustrate the methodology analyzing two data sets: a synthetic data set with quasi-periodic latent structure, and seasonally adjusted US monthly housing data consisting of housing starts and housing sales over the period 1965 to 1974.

8.
This paper introduces non-linear and non-Gaussian state space models with analytic updating recursions for filtering and prediction. This new class of models involves some well-known results in the theory of exponential models and of exponential dispersion models, and the latent process is defined in such a way that both the filtering and the prediction distributions turn out to be conjugate to the observation distribution at each step. The corresponding analytic and inferential properties are investigated and some simple examples are presented.

9.
Data sets with excess zeroes are frequently analyzed in many disciplines. A common framework used to analyze such data is the zero-inflated (ZI) regression model, which mixes a degenerate distribution with point mass at zero and a non-degenerate distribution. The estimates from ZI models quantify the effects of covariates on the means of latent random variables, which are often not the quantities of primary interest. Recently, marginal zero-inflated Poisson (MZIP; Long et al. [A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 33 (2014), pp. 5151–5165]) and marginal zero-inflated negative binomial (MZINB; Preisser et al., 2016) models have been introduced that model the mean response directly. These models yield covariate effects with simple interpretations that are, for many applications, more appealing than those available from ZI regression. This paper outlines a general framework for marginal zero-inflated models where the latent distribution is a member of the exponential dispersion family, focusing on common distributions for count data. In particular, the discussion includes the marginal zero-inflated binomial (MZIB) model, which has not been discussed previously. The details of maximum likelihood estimation via the EM algorithm are presented, and the properties of the estimators as well as Wald and likelihood ratio-based inference are examined via simulation. Two examples are presented to illustrate the advantages of MZIP, MZINB, and MZIB models for practical data analysis.
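A hypothetical numerical sketch of the marginalization idea (notation assumed, not taken from the cited papers): a ZIP variable with zero-inflation probability pi and latent Poisson rate lam has overall mean (1 - pi) * lam, and marginal models parameterize this overall mean directly rather than lam.

```python
import math

def zip_pmf(y, pi, lam):
    """Zero-inflated Poisson pmf: point mass at zero mixed with Poisson(lam)."""
    base = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (y == 0) + (1 - pi) * base

def zip_marginal_mean(pi, lam):
    """Marginal (overall) mean of a ZIP variable: E[Y] = (1 - pi) * lam.

    Marginalized ZIP models place covariates on E[Y] itself, so a
    coefficient describes the effect on the overall mean response.
    """
    return (1 - pi) * lam

pi, lam = 0.3, 2.0
mean = zip_marginal_mean(pi, lam)                       # (1 - 0.3) * 2.0 = 1.4
total = sum(zip_pmf(y, pi, lam) for y in range(50))     # pmf sums to 1
empirical = sum(y * zip_pmf(y, pi, lam) for y in range(50))
```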

10.
We propose a class of state-space models for multivariate longitudinal data where the components of the response vector may have different distributions. The approach is based on the class of Tweedie exponential dispersion models, which accommodates a wide variety of discrete, continuous and mixed data. The latent process is assumed to be a Markov process, and the observations are conditionally independent given the latent process, over time as well as over the components of the response vector. This provides a fully parametric alternative to the quasilikelihood approach of Liang and Zeger. We estimate the regression parameters for time-varying covariates entering either via the observation model or via the latent process, based on an estimating equation derived from the Kalman smoother. We also consider analysis of residuals from both the observation model and the latent process.

11.
For observable indicators with ordered categories one can assume underlying latent variables following certain marginal distributions. Transforming the latent variables changes their marginal distributions but not the observable qualitative indicators. The joint distribution of the latent variables can be constructed from the marginal distributions, and there is a broad class of multivariate distributions for which the observable indicators are equivalent. By choosing the multivariate normal distribution from this class, we can analyse a linear relationship between the transformed latent variables, which leads to latent structural equation models. Estimation of the latter models is therefore more general than the distributional assumption might initially suggest. Robustness of the estimation procedure under deviations from this distribution family is also discussed. Using ordinal business survey data from the German Ifo Institute, we test the efficiency of firms' price expectations implied by the rational expectations hypothesis.

12.
A new class of distributions called the log-logistic Weibull–Poisson distribution is introduced and its properties are explored. This new distribution provides a more flexible model for lifetime data. Statistical properties of the proposed distribution, including an expansion of the density function, the quantile function, hazard and reverse hazard functions, moments, conditional moments, the moment generating function, skewness, and kurtosis, are presented. Mean deviations, Bonferroni and Lorenz curves, Rényi entropy, and the distribution of the order statistics are derived. The maximum likelihood estimation technique is used to estimate the model parameters. A simulation study is conducted to examine the bias and mean square error of the maximum likelihood estimators and the width of the confidence interval for each parameter, and applications of the model to real data sets are presented to illustrate the usefulness of the proposed distribution.

13.
This paper presents a new Bayesian clustering approach based on an infinite mixture model, specifically designed for time-course microarray data. The problem is to group together genes that have "similar" expression profiles, given a set of noisy measurements of their expression levels over a specific time interval. To capture the temporal variation of each curve, a non-parametric regression approach is used: each expression profile is expanded over a set of basis functions, and the sets of coefficients of each curve are subsequently modeled through a Bayesian infinite mixture of Gaussian distributions. The task of finding clusters of genes with similar expression profiles is thus reduced to the problem of grouping together genes whose coefficients are sampled from the same distribution in the mixture. A Dirichlet process prior is naturally employed in such models, since it deals automatically with the uncertainty about the number of clusters. Posterior inference is carried out by a split-and-merge MCMC sampling scheme which integrates out the parameters of the component distributions and updates only the latent vector of cluster memberships. The final configuration is obtained via the maximum a posteriori estimator. The performance of the method is studied using synthetic and real microarray data and is compared with that of competing techniques.
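The Dirichlet process handles the unknown number of clusters through its predictive rule, the Chinese restaurant process: each item joins an existing cluster with probability proportional to the cluster's size, or starts a new cluster with probability proportional to the concentration parameter alpha. A self-contained sketch of that rule (illustrative only, not the split-and-merge sampler used in the paper):

```python
import random

def crp_partition(n, alpha, seed=0):
    """Draw a random partition of n items from the Chinese restaurant
    process implied by a Dirichlet process prior with concentration alpha."""
    rng = random.Random(seed)
    counts = []   # size of each cluster so far
    labels = []   # cluster label of each item
    for _ in range(n):
        weights = counts + [alpha]      # existing clusters, then "new cluster"
        r = rng.uniform(0.0, sum(weights))
        k, acc = 0, weights[0]
        while r > acc:                  # pick cluster k proportional to weight
            k += 1
            acc += weights[k]
        if k == len(counts):
            counts.append(1)            # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts

labels, counts = crp_partition(100, alpha=1.0)
```

The number of clusters is random and grows roughly like alpha * log(n), which is how these models avoid fixing the cluster count in advance.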

14.
Change-point time series specifications constitute flexible models that capture unknown structural changes by allowing for switches in the model parameters. Nevertheless, most models suffer from an over-parametrization issue, since typically a single latent state variable drives the switches in all parameters, implying that every parameter has to change when a break happens. To gauge whether and where there are structural breaks in realized variance, we introduce the sparse change-point HAR model. The approach controls for model parsimony by limiting the number of parameters that evolve from one regime to another; sparsity is achieved by employing a nonstandard shrinkage prior distribution. We derive a Gibbs sampler for inferring the parameters of this process, and simulation studies illustrate the excellent performance of the sampler. Relying on this new framework, we study the stability of the HAR model using realized variance series of several major international indices between January 2000 and August 2015.
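For reference, the HAR design that the change-point model builds on regresses today's realized variance on yesterday's value and on weekly (5-day) and monthly (22-day) averages of past realized variance. A minimal sketch with illustrative data (not the paper's code; the 5/22 windows are the conventional choice, assumed here):

```python
def har_design(rv):
    """Build HAR regressors for a realized-variance series.

    Row t has (intercept, daily, weekly, monthly) lagged regressors and
    predicts rv[t]; the first 22 observations are used only as lags.
    """
    X, y = [], []
    for t in range(22, len(rv)):
        daily = rv[t - 1]
        weekly = sum(rv[t - 5:t]) / 5
        monthly = sum(rv[t - 22:t]) / 22
        X.append((1.0, daily, weekly, monthly))
        y.append(rv[t])
    return X, y

rv = [1.0 + 0.1 * (i % 7) for i in range(60)]  # toy series
X, y = har_design(rv)
```

The sparse change-point version then lets only a subset of the four coefficients switch at a break, rather than forcing all of them to change.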

15.
Very often in psychometric research, as in educational assessment, it is necessary to analyze item responses from clustered respondents. The multiple group item response theory (IRT) model proposed by Bock and Zimowski [12] provides a useful framework for analyzing this type of data. In this model, the selected groups of respondents are of specific interest, so group-specific population distributions need to be defined. The usual assumption for parameter estimation in this model, that the latent traits are random variables following different symmetric normal distributions, has been questioned in many works in the IRT literature; when this assumption does not hold, misleading inference can result. In this paper, we assume that the latent traits for each group follow different skew-normal distributions under the centered parameterization, and name the result the skew multiple group IRT model. This modeling extends the works of Azevedo et al. [4], Bazán et al. [11], and Bock and Zimowski [12] concerning the latent trait distribution, and our approach ensures that the model is identifiable. We propose and compare, with respect to convergence, two Markov chain Monte Carlo (MCMC) algorithms for parameter estimation. A simulation study was performed to evaluate parameter recovery for the proposed model and the convergence of the selected algorithm; results reveal that the proposed algorithm properly recovers all model parameters. Furthermore, we analyzed a real data set that presents asymmetry in the latent trait distribution. The results obtained using our approach confirmed the presence of negative asymmetry for some latent trait distributions.

16.
To model extreme spatial events, a general approach is to use the generalized extreme value (GEV) distribution with spatially varying parameters, as in spatial GEV models and latent variable models. In the literature, this approach is mostly used to capture spatial dependence for only one type of event, which limits its application to air pollutant data, as different pollutants may chemically interact with each other. A recent advancement in spatial extremes modelling for multiple variables is the multivariate max-stable process; like its univariate counterpart, it assumes standard distributions such as unit Fréchet as margins, so additional modelling is required for applications such as spatial prediction. In this paper, we extend marginal methods such as spatial GEV models and latent variable models to a multivariate setting based on copulas, so that both the spatial dependence and the dependence among multiple pollutants can be handled. We apply the proposed model to analyse weekly maxima of nitrogen dioxide, sulphur dioxide, respirable suspended particles, fine suspended particles, and ozone collected in the Pearl River Delta in China.
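For concreteness, the GEV distribution underlying these models has CDF F(x) = exp{-[1 + xi (x - mu)/sigma]^(-1/xi)} on its support, with the Gumbel limit as xi tends to zero. A standalone sketch of the CDF itself (not tied to the paper's spatial or copula machinery):

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """CDF of the generalized extreme value distribution.

    For xi != 0: F(x) = exp(-(1 + xi*(x - mu)/sigma)**(-1/xi)) where the
    base is positive; outside that region F is 0 (xi > 0, left of the
    lower endpoint) or 1 (xi < 0, right of the upper endpoint).
    xi near 0 gives the Gumbel case exp(-exp(-(x - mu)/sigma)).
    """
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))
```

Spatial GEV and latent variable models let (mu, sigma, xi) vary over locations; the paper's contribution is tying several such marginal fits together with a copula across pollutants.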

17.
This article describes a convenient method of selecting Metropolis–Hastings proposal distributions for multinomial logit models. There are two key ideas involved. The first is that multinomial logit models have a latent variable representation similar to that exploited by Albert and Chib (J Am Stat Assoc 88:669–679, 1993) for probit regression. Augmenting the latent variables replaces the multinomial logit likelihood function with the complete data likelihood for a linear model with extreme value errors. While no conjugate prior is available for this model, a least squares estimate of the parameters is easily obtained, and the asymptotic sampling distribution of the least squares estimate is Gaussian with known variance. The second key idea is to generate a Metropolis–Hastings proposal distribution by conditioning on the estimator instead of the full data set. The resulting sampler has many of the benefits of so-called tailored or approximation Metropolis–Hastings samplers; however, because the proposal distributions are available in closed form, they can be implemented without numerical methods for exploring the posterior distribution. The sampler is geometrically ergodic, its computational burden is minor, and it requires minimal user input. Improvements to the sampler's mixing rate are investigated. The algorithm is also applied to partial credit models describing ordinal item response data from the 1998 National Assessment of Educational Progress. Its application to hierarchical models and Poisson regression is briefly discussed.
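The tailored proposal described above is a form of independence Metropolis–Hastings, in which candidates come from a fixed approximation to the posterior and the accept/reject step corrects the approximation error. A generic, hypothetical sketch of that scheme on a toy one-dimensional target (not the paper's algorithm; the target and proposal here are illustrative):

```python
import math
import random

def independence_mh(log_target, propose, log_proposal, n, x0, seed=0):
    """Independence Metropolis-Hastings: draw candidates from a fixed
    proposal and accept with the usual ratio, which corrects for the
    mismatch between proposal and target."""
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n):
        cand = propose(rng)
        log_ratio = (log_target(cand) - log_target(x)
                     + log_proposal(x) - log_proposal(cand))
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = cand
        chain.append(x)
    return chain

# Toy run: target N(1, 1), proposal N(0, 2); the sampler recovers the
# target mean despite the mis-centered proposal.
log_target = lambda x: -0.5 * (x - 1.0) ** 2          # unnormalized N(1, 1)
log_prop = lambda x: -0.5 * (x / 2.0) ** 2            # unnormalized N(0, 2)
chain = independence_mh(log_target, lambda r: r.gauss(0.0, 2.0),
                        log_prop, 20000, 0.0)
mean = sum(chain) / len(chain)
```

The paper's sampler plays the same game with a closed-form Gaussian proposal built from the least squares estimate, which is what makes it cheap and free of numerical tuning.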

18.
Latent class models have recently drawn considerable attention among researchers and practitioners as a class of useful tools for capturing heterogeneity across different segments in a target market or population. In this paper, we consider a latent class logit model with parameter constraints and deal with two important issues in latent class models, parameter estimation and selection of an appropriate number of classes, within a Bayesian framework. A simple Gibbs sampling algorithm is proposed for sample generation from the posterior distribution of the unknown parameters. Using the Gibbs output, we propose a method for determining an appropriate number of latent classes. A real-world marketing example of market segmentation is provided to illustrate the proposed method.

19.
A new four-parameter distribution called the exponentiated power Lindley–Poisson distribution, an extension of the power Lindley and Lindley–Poisson distributions, is introduced. Statistical properties of the distribution, including the shapes of the density and hazard functions, moments, entropy measures, and the distribution of order statistics, are given. The maximum likelihood estimation technique is used to estimate the parameters. A simulation study is conducted to examine the bias and mean square error of the maximum likelihood estimators and the width of the confidence interval for each parameter. Finally, applications to real data sets are presented to illustrate the usefulness of the proposed distribution.

20.
A new method for analyzing high-dimensional categorical data, Linear Latent Structure (LLS) analysis, is presented. LLS models belong to the family of latent structure models, which are mixture distribution models constrained to satisfy the local independence assumption. LLS analysis explicitly considers the family of mixing distributions as a linear space, and LLS models are obtained by imposing linear constraints on the mixing distribution. LLS models are identifiable under modest conditions and are consistently estimable. A remarkable feature of LLS analysis is the existence of a high-performance numerical algorithm, which reduces parameter estimation to a sequence of linear algebra problems. Simulation experiments with a prototype of the algorithm demonstrated good-quality restoration of model parameters.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号