Similar Documents (20 results)
1.
In this article, a general approach to latent variable models is introduced, based on an underlying generalized linear model (GLM) with a factor-analysis observation process. We call these models Generalized Linear Factor Models (GLFM). The observations are produced from a general framework involving observed and latent variables whose distributions belong to the exponential family. More specifically, we concentrate on situations where the observed variables are both discretely measured (e.g., binomial, Poisson) and continuously distributed (e.g., gamma). The common latent factors are assumed to be independent with a standard multivariate normal distribution. Practical details of training such models with a new local expectation-maximization (EM) algorithm, which can be considered a generalized EM-type algorithm, are also discussed. In conjunction with an approximated version of the Fisher score algorithm (FSA), we show how to calculate maximum likelihood estimates of the model parameters and to draw inferences about the unobservable path of the common factors. The methodology is illustrated by an extensive Monte Carlo simulation study, and the results show promising performance.
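As a rough illustration of the observation process described above, the following sketch simulates data from a factor model with Poisson-distributed observables and a log link; the dimensions, loadings, and seed are arbitrary choices for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 500, 6, 2                           # samples, observed variables, latent factors

Lambda = rng.normal(scale=0.5, size=(p, q))   # factor loadings
b = rng.normal(size=p)                        # intercepts

F = rng.standard_normal((n, q))               # latent factors ~ N(0, I)
eta = F @ Lambda.T + b                        # linear predictor
Y = rng.poisson(np.exp(eta))                  # Poisson observations via log link
```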

2.
The exponential–Poisson (EP) distribution with scale and shape parameters β>0 and λ∈ℝ, respectively, is a lifetime distribution obtained by mixing exponential and zero-truncated Poisson models. The EP distribution has been a good alternative to the gamma distribution for modelling lifetime, reliability and time intervals between successive natural disasters. The EP and gamma distributions share several similarities and properties: for example, their densities may be strictly decreasing or unimodal, and their hazard rate functions may be decreasing, increasing or constant, depending on their shape parameters. On the other hand, the EP distribution has several interesting applications based on stochastic representations involving maxima and minima of iid exponential variables (with random sample size), which give it a scientific relevance distinct from that of the gamma distribution. Given the similarities and the distinct scientific relevance of these models, a question of interest is how to discriminate between them. With this in mind, we propose a likelihood ratio test based on Cox's statistic to discriminate the EP and gamma distributions. The asymptotic distribution of the normalized logarithm of the ratio of the maximized likelihoods under the two null hypotheses (the data come from the EP or the gamma distribution) is provided. From this we obtain the probabilities of correct selection, and we propose to choose the model that maximizes the probability of correct selection (PCS). We also determine the minimum sample size required to discriminate the EP and gamma distributions when the PCS and a tolerance level based on some distance are specified in advance. A simulation study evaluating the accuracy of the asymptotic probabilities of correct selection is also presented. The paper is motivated by two applications to real data sets.
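A minimal sketch of the discrimination step, assuming the Kus (2007) parameterization of the EP density with λ>0; the toy data and starting values are illustrative only. The sign of the log-likelihood ratio between the two maximized likelihoods drives the model choice.

```python
import numpy as np
from scipy import optimize, stats

def ep_negloglik(params, x):
    # exponential-Poisson log-density (Kus, 2007): rate beta, shape lam > 0
    lam, beta = params
    if lam <= 0 or beta <= 0:
        return np.inf
    return -np.sum(np.log(lam * beta) - np.log1p(-np.exp(-lam))
                   - lam - beta * x + lam * np.exp(-beta * x))

x = stats.gamma.rvs(a=0.8, scale=2.0, size=200, random_state=1)  # toy data

res = optimize.minimize(ep_negloglik, x0=[1.0, 1.0 / x.mean()], args=(x,),
                        method="Nelder-Mead")
loglik_ep = -res.fun

a, _, scale = stats.gamma.fit(x, floc=0)                  # gamma MLE
loglik_ga = np.sum(stats.gamma.logpdf(x, a, scale=scale))

T = loglik_ep - loglik_ga    # choose EP if T > 0, gamma otherwise
print(T)
```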

3.
Frailty models for survival data
A frailty model is a random effects model for time variables, where the random effect (the frailty) has a multiplicative effect on the hazard. It can be used for univariate (independent) failure times, i.e., to describe the influence of unobserved covariates in a proportional hazards model. More interesting, however, is to consider multivariate (dependent) failure times generated as conditionally independent times given the frailty. This approach can be used both for survival times of individuals, such as twins or family members, and for repeated events for the same individual. The standard assumption is a gamma distribution for the frailty, but this is a restriction implying that the dependence is most important for late events. More generally, the distribution can be stable, inverse Gaussian, or follow a power variance function exponential family. Theoretically, large differences are seen between these choices. In practice, using the largest model makes it possible to allow for more general dependence structures without making the formulas too complicated. This paper is a revised version of a review which, together with ten papers by the author, made up a thesis for a Doctor of Science degree at the University of Copenhagen.
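As a quick numerical illustration of how a shared gamma frailty induces dependence between failure times (a constant baseline hazard is an assumption of this sketch, not of the model class):

```python
import numpy as np

rng = np.random.default_rng(42)
n_pairs, theta, base_rate = 10_000, 2.0, 0.1   # theta: frailty variance

# shared gamma frailty with mean 1 and variance theta
Z = rng.gamma(shape=1 / theta, scale=theta, size=n_pairs)

# conditional on Z, two exponential times with multiplicative hazard Z * base_rate
T1 = rng.exponential(1.0 / (Z * base_rate))
T2 = rng.exponential(1.0 / (Z * base_rate))

print(np.corrcoef(T1, T2)[0, 1])   # positive dependence induced by the frailty
```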

4.
Approximate Bayesian computation (ABC) methods permit approximate inference for intractable likelihoods when it is possible to simulate from the model. However, they perform poorly for high-dimensional data and in practice must usually be used in conjunction with dimension reduction methods, resulting in a loss of accuracy which is hard to quantify or control. We propose a new ABC method for high-dimensional data based on rare event methods which we refer to as RE-ABC. This uses a latent variable representation of the model. For a given parameter value, we estimate the probability of the rare event that the latent variables correspond to data roughly consistent with the observations. This is performed using sequential Monte Carlo and slice sampling to systematically search the space of latent variables. In contrast, standard ABC can be viewed as using a more naive Monte Carlo estimate. We use our rare event probability estimator as a likelihood estimate within the pseudo-marginal Metropolis–Hastings algorithm for parameter inference. We provide asymptotics showing that RE-ABC has a lower computational cost for high-dimensional data than standard ABC methods. We also illustrate our approach empirically, on a Gaussian distribution and an application in infectious disease modelling.
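For contrast with RE-ABC, here is a sketch of the naive ABC rejection baseline that the paper improves upon; the Gaussian toy model, summary statistic, and tolerance are illustrative assumptions, not the paper's examples.

```python
import numpy as np

rng = np.random.default_rng(0)
y_obs = rng.normal(loc=2.0, scale=1.0, size=50)   # stand-in "observed" data

def simulate(theta, rng):
    return rng.normal(loc=theta, scale=1.0, size=50)

# plain ABC rejection: uniform prior, mean as summary, tolerance eps
eps, draws = 0.1, 20_000
theta_prior = rng.uniform(-5, 5, size=draws)
accepted = [t for t in theta_prior
            if abs(simulate(t, rng).mean() - y_obs.mean()) < eps]
print(np.mean(accepted), len(accepted) / draws)   # posterior mean, acceptance rate
```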

5.
We develop a Bayesian framework for estimating the means of two random variables when only the sum of those random variables can be observed. Mixture models are proposed for establishing conjugacy between the joint prior distribution and the distribution for observations. Among other desirable features, conjugate distributions allow Bayesian methods to be applied in sequential decision problems.
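A hedged worked example in the fully Gaussian special case (the paper's mixture-model machinery is more general): with independent N(m_i, τ²) priors on the two means and one observed sum s of two N(μ_i, σ²) variables, the posterior shifts both prior means equally along the direction (1, 1), since only μ1 + μ2 is identified.

```python
# Gaussian special case: observe only s = x + y
sigma2 = 1.0                     # known observation variances
m1, m2, tau2 = 0.0, 0.0, 4.0     # independent N(m_i, tau2) priors on the means
s = 5.0                          # one observed sum

# conjugate update along (1, 1): each mean absorbs half the surprise in s
shift = tau2 * (s - m1 - m2) / (2 * tau2 + 2 * sigma2)
post_mean_mu1, post_mean_mu2 = m1 + shift, m2 + shift
print(post_mean_mu1, post_mean_mu2)   # the data only inform mu1 + mu2
```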

6.
The authors discuss a class of likelihood functions involving weak assumptions on the data-generating mechanism. These likelihoods may be appropriate when it is difficult to propose models for the data. The properties of these likelihoods are given, and it is shown how they can be computed numerically using the Blahut–Arimoto algorithm. The authors then show how these likelihoods yield useful inferences on a data set for which no plausible physical model is apparent. The plausibility of the inferences is enhanced by the extensive robustness analysis these likelihoods permit.
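The sketch below implements the Blahut–Arimoto algorithm in its classical channel-capacity form, which is the iteration family the authors adapt; how the likelihoods themselves are assembled from it is specific to the paper and not reproduced here.

```python
import numpy as np

def blahut_arimoto(W, iters=500):
    """Classical Blahut-Arimoto iteration for channel capacity.
    W[x, y] = p(y | x); returns capacity (nats) and the optimal input law."""
    n_x = W.shape[0]
    r = np.full(n_x, 1.0 / n_x)
    for _ in range(iters):
        q = r[:, None] * W                          # joint law, then posterior p(x|y)
        q /= q.sum(axis=0, keepdims=True)
        log_r = (W * np.log(q, where=q > 0, out=np.zeros_like(q))).sum(axis=1)
        r = np.exp(log_r - log_r.max())
        r /= r.sum()
    cap = (r[:, None] * W * np.log(W / (r @ W), where=W > 0,
                                   out=np.zeros_like(W))).sum()
    return cap, r

W = np.array([[0.9, 0.1], [0.2, 0.8]])              # a toy binary channel
print(blahut_arimoto(W))
```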

7.
The problem of simulating from distributions with intractable normalizing constants has received much attention in the recent literature. In this article, we propose an asymptotic algorithm, the so-called double Metropolis–Hastings (MH) sampler, for tackling this problem. Unlike other auxiliary-variable algorithms, the double MH sampler removes the need for exact sampling, the auxiliary variables being generated using MH kernels, and it can thus be applied to a wide range of problems for which exact sampling is not available. For problems where exact sampling is available, it typically produces results as accurate as the exchange algorithm, but using much less CPU time. The new method is illustrated by various spatial models.
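A compact sketch of the double MH sampler on a toy one-dimensional Ising chain (whose normalizing constant is actually tractable, so this is purely illustrative); a flat prior and a symmetric random-walk proposal are assumed. The inner MH run that replaces exact auxiliary sampling is started at the observed data, and the normalizing constants cancel in the acceptance ratio.

```python
import numpy as np

rng = np.random.default_rng(7)
d, m = 30, 50                       # chain length, inner MH steps

def u(x, theta):                    # unnormalised log f(x | theta)
    return theta * np.sum(x[:-1] * x[1:])

def inner_mh(x, theta, m, rng):     # m single-flip MH steps targeting f(. | theta)
    y = x.copy()
    for _ in range(m):
        i = rng.integers(d)
        y_new = y.copy()
        y_new[i] = -y_new[i]
        if np.log(rng.uniform()) < u(y_new, theta) - u(y, theta):
            y = y_new
    return y

x_obs = rng.choice([-1, 1], size=d)                  # stand-in "data"
theta, chain = 0.0, []
for _ in range(2000):
    theta_new = theta + 0.2 * rng.standard_normal()  # symmetric proposal
    y = inner_mh(x_obs, theta_new, m, rng)           # auxiliary draw, started at data
    # double MH ratio: unnormalised densities suffice
    log_r = (u(x_obs, theta_new) - u(x_obs, theta)
             + u(y, theta) - u(y, theta_new))
    if np.log(rng.uniform()) < log_r:                # flat prior assumed
        theta = theta_new
    chain.append(theta)
print(np.mean(chain[500:]))
```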

8.
Variable selection is an important task in regression analysis. The performance of a statistical model depends heavily on the chosen subset of predictors, and several methods exist for selecting the most relevant variables to construct a good model. In practice, however, the dependent variable may take positive continuous values and not be normally distributed. In such situations the gamma distribution is more suitable than the normal for building a regression model. This paper introduces a heuristic approach that performs variable selection for gamma regression models using artificial bee colony optimization. We evaluated the proposed method against classical selection methods such as backward and stepwise selection. Both simulation studies and real data examples demonstrate the accuracy of our selection procedure.
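A sketch of the fitness evaluation such a search needs: the AIC of a gamma GLM with log link fitted to a candidate subset of predictors (via statsmodels). The bee-colony search itself is replaced here by scoring a few hand-picked subsets, and the data-generating choices are assumptions.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n, p = 300, 6
X = rng.standard_normal((n, p))
mu = np.exp(0.5 * X[:, 0] - 0.7 * X[:, 2])        # only x1 and x3 matter
y = rng.gamma(shape=2.0, scale=mu / 2.0)          # gamma response with mean mu

def aic_of_subset(cols):
    design = sm.add_constant(X[:, cols])
    fam = sm.families.Gamma(link=sm.families.links.Log())
    return sm.GLM(y, design, family=fam).fit().aic

# stand-in for the bee-colony search: score a few candidate subsets
for cols in ([0], [0, 2], [0, 1, 2], list(range(p))):
    print(cols, round(aic_of_subset(cols), 1))
```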

9.
In this paper, we consider a generalisation of the backward simulation method of Duch et al. [New approaches to operational risk modeling. IBM J Res Develop. 2014;58:1–9] to build bivariate Poisson processes with flexible time correlation structures, and to simulate the arrival times of the processes. The proposed backward construction approach uses the Marshall–Olkin bivariate binomial distribution for the conditional law and some well-known families of bivariate copulas for the joint success probability in lieu of the typical conditional independence assumption. The resulting bivariate Poisson process can exhibit various time correlation structures which are commonly observed in real data.
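The univariate version of the backward idea is easy to state: draw the terminal count first, then place the arrival times as order statistics of uniforms. The bivariate construction in the paper couples the two counts through Marshall–Olkin binomials and copulas, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(11)
lam, T = 3.0, 10.0

# backward construction: terminal count first, then conditionally uniform arrivals
n = rng.poisson(lam * T)
arrivals = np.sort(rng.uniform(0.0, T, size=n))
print(n, arrivals[:5])
```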

10.
When we are given only a transform, such as the moment-generating function of a distribution, it is rare that we can efficiently simulate random variables. Possible approaches, such as the inverse transform combined with numerical inversion of the transform, are computationally very expensive. However, the saddlepoint approximation is known to be exact for the normal, gamma, and inverse Gaussian distributions and remarkably accurate for a large number of others. We explore the efficient use of the saddlepoint approximation for simulating distributions and provide three examples of the accuracy of these simulations.
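A sketch of the saddlepoint density built from a cumulant generating function, using the gamma CGF K(s) = -α log(1 - s/β), for which the saddlepoint equation solves in closed form and the approximation is exact up to normalization:

```python
import numpy as np
from scipy import stats

alpha, beta = 3.0, 2.0            # gamma shape and rate

def saddlepoint_density(x):
    s_hat = beta - alpha / x      # solves K'(s) = x
    K = -alpha * np.log1p(-s_hat / beta)
    K2 = alpha / (beta - s_hat) ** 2
    return np.exp(K - s_hat * x) / np.sqrt(2 * np.pi * K2)

x = np.linspace(0.5, 5, 5)
print(saddlepoint_density(x) / stats.gamma.pdf(x, alpha, scale=1 / beta))
# constant ratio: for the gamma the approximation is exact up to normalisation
```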

11.
In this paper, we discuss graphical models for mixed types of continuous and discrete variables with incomplete data. We use a set of hyperedges to represent an observed data pattern, where a hyperedge is a set of variables observed for a group of individuals. In a mixed graph with two types of vertices and two types of edges, dots and circles represent discrete and continuous variables, respectively. A normal graph represents a graphical model and a hypergraph represents an observed data pattern. In terms of the mixed graph, we discuss decomposition of mixed graphical models with incomplete data, and we present a partial imputation method which can be used in the EM algorithm and the Gibbs sampler to speed their convergence. For a given mixed graphical model and an observed data pattern, we try to decompose a large graph into several small ones, so that the original likelihood can be factored into a product of likelihoods with distinct parameters for the small graphs. For the case where a graph cannot be decomposed due to its observed data pattern, we can impute missing data partially so that the graph becomes decomposable.

12.
A modified normal-based approximation for calculating the percentiles of a linear combination of independent random variables is proposed. The approximation is applicable in situations where the expectations and percentiles of the individual random variables can be readily obtained. Its merits are evaluated for the chi-square and beta distributions using Monte Carlo simulation. An approximation to the percentiles of the ratio of two independent random variables is also given. Solutions based on the approximations are given for some classical problems, such as interval estimation of the normal coefficient of variation, the survival probability, and the difference between or the ratio of two binomial proportions, among others. An approximation to the percentiles of a doubly noncentral F distribution is provided as well. For all the problems considered, the approximation yields simple, satisfactory solutions. Two examples illustrate applications of the approximation.
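A Monte Carlo benchmark of the kind used to evaluate the approximation, for an illustrative linear combination of a chi-square and a beta variable (the coefficients and parameters are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
N = 1_000_000

# linear combination of a chi-square and a beta variable
L = 2.0 * stats.chi2.rvs(df=4, size=N, random_state=rng) \
    - 1.5 * stats.beta.rvs(a=2, b=5, size=N, random_state=rng)

print(np.percentile(L, [5, 50, 95]))   # simulation benchmark for the percentiles
```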

13.
In this article, we utilize a scale mixture of Gaussian random fields as a tool for modeling spatial ordered categorical data with non-Gaussian latent variables. Specifically, we assume the categorical random field is created by truncating a Gaussian log-Gaussian (GLG) latent variable model that accommodates heavy tails. Since the traditional likelihood approach for this model involves high-dimensional integrations that are computationally intensive, the maximum likelihood estimates are obtained using a stochastic approximation expectation–maximization algorithm. For this purpose, Markov chain Monte Carlo methods are employed to draw from the posterior distribution of the latent variables. A numerical example illustrates the methodology.
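A sketch of the truncation step only, using a plain Gaussian latent field with exponential correlation; the paper's GLG model would additionally scale the latent field by a log-Gaussian mixing variable to obtain heavy tails. The thresholds and range parameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)
coords = rng.uniform(0, 10, size=(100, 2))             # spatial locations

# Gaussian latent field with exponential covariance
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
Sigma = np.exp(-d / 2.0)
z = rng.multivariate_normal(np.zeros(100), Sigma)
# the GLG model would divide z by a log-Gaussian mixing field for heavy tails

# ordered categories from fixed thresholds on the latent field
cutoffs = [-0.5, 0.5]
cats = np.digitize(z, cutoffs)                         # values in {0, 1, 2}
print(np.bincount(cats))
```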

14.
15.
The Poisson-binomial distribution is useful in many applied problems in engineering, actuarial science and data mining. It models the distribution of a sum of independent but non-identically distributed random indicators whose success probabilities vary. In this paper, we extend the Poisson-binomial distribution to a generalized Poisson-binomial (GPB) distribution, which corresponds to the case where the random indicators are replaced by two-point random variables taking two arbitrary values instead of 0 and 1. The GPB distribution has found applications in many areas such as voting theory, actuarial science, warranty prediction and probability theory. As the GPB distribution has not been studied in detail so far, we introduce this distribution first and then derive its theoretical properties. We develop an efficient algorithm for computing its distribution function, based on the fast Fourier transform, and test its accuracy by comparing it with an enumeration-based exact method and with results from the binomial distribution. We also study the computational time of the algorithm under various parameter settings. Finally, we discuss the factors affecting the computational efficiency of the proposed algorithm and illustrate the use of the software package.
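For the ordinary Poisson-binomial special case, the characteristic-function/FFT computation is short enough to sketch; the GPB case replaces the 0/1 support with arbitrary two-point supports, which requires working on a discretized value grid instead.

```python
import numpy as np

def poisson_binomial_pmf(p):
    """Exact pmf of a sum of independent Bernoulli(p_j) via the
    characteristic function evaluated at roots of unity (DFT-CF)."""
    n = len(p)
    l = np.arange(n + 1)
    w = np.exp(2j * np.pi * l / (n + 1))               # roots of unity
    cf = np.prod(1 - p[:, None] + p[:, None] * w, axis=0)
    return np.real(np.fft.fft(cf)) / (n + 1)

p = np.array([0.1, 0.4, 0.8, 0.3])
print(poisson_binomial_pmf(p))    # compare with enumeration for small n
```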

16.
Non-negative limited normal or gamma distributed random variables are commonly used to model physical phenomena such as the concentration of compounds within gaseous clouds. This paper demonstrates that when a collection of random variables with limited normal or gamma distributions represents a stationary process whose underlying variables have exponentially decreasing correlations, a central limit theorem applies to the correlated random variables.
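A quick numerical illustration (not a proof), under one reading of "limited normal", namely a normal censored at zero, driven by an AR(1) Gaussian process so that the correlations decay exponentially:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, phi = 5_000, 2_000, 0.8

# stationary AR(1) Gaussian process: exponentially decaying correlations
e = rng.standard_normal((reps, n))
z = np.zeros_like(e)
for t in range(1, n):
    z[:, t] = phi * z[:, t - 1] + np.sqrt(1 - phi**2) * e[:, t]

x = np.maximum(z, 0.0)                 # "limited" normal: point mass at the bound 0
s = x.sum(axis=1)
s_std = (s - s.mean()) / s.std()
print(np.mean(np.abs(s_std) < 1.96))   # near 0.95 under approximate normality
```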

17.
We study the properties of truncated gamma distributions and derive simulation algorithms which dominate the standard algorithms for these distributions. For the right-truncated gamma distribution, an optimal accept–reject algorithm is based on the fact that its density can be expressed as an infinite mixture of beta distributions. For integer values of the parameters, the density of the left-truncated distribution can be rewritten as a mixture which is easily generated, and we give an optimal accept–reject algorithm for the other parameter values. We compare the efficiency of our algorithm with the previous method and show the improvement in terms of minimum acceptance probability. The algorithm proposed here has an acceptance probability greater than e/4.
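For comparison, a simple inverse-CDF sampler for the right-truncated gamma distribution via the regularized incomplete gamma function; this is the kind of numerical baseline the paper's accept–reject algorithms are designed to beat.

```python
import numpy as np
from scipy.special import gammainc, gammaincinv

def rtrunc_gamma(a, rate, t, size, rng):
    """Gamma(a, rate) restricted to (0, t), by inverting the rescaled CDF."""
    u = rng.uniform(size=size)
    return gammaincinv(a, u * gammainc(a, rate * t)) / rate

rng = np.random.default_rng(4)
x = rtrunc_gamma(a=2.5, rate=1.0, t=1.0, size=5, rng=rng)
print(x.max() <= 1.0, x[:3])
```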

18.
This paper provides a simple methodology for approximating the distribution of indefinite quadratic forms in normal random variables. It is shown that the density function of a positive definite quadratic form can be approximated in terms of the product of a gamma density function and a polynomial. An extension which makes use of a generalized gamma density function is also considered. Such representations are based on the moments of a quadratic form, which can be determined from its cumulants by means of a recursive formula. After expressing an indefinite quadratic form as the difference of two positive definite quadratic forms, one can obtain an approximation to its density function by means of the transformation of variable technique. An explicit representation of the resulting density approximant is given in terms of a degenerate hypergeometric function. An easily implementable algorithm is provided. The proposed approximants produce very accurate percentiles over the entire range of the distribution. Several numerical examples illustrate the results. In particular, the methodology is applied to the Durbin–Watson statistic which is expressible as the ratio of two quadratic forms in normal random variables. Quadratic forms being ubiquitous in statistics, the approximating technique introduced herewith has numerous potential applications. Some relevant computational considerations are also discussed.
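A sketch of the moment machinery the approximation rests on, for the central case x ~ N(0, I), where the cumulants of Q = x'Ax are κ_r = 2^(r-1) (r-1)! tr(A^r) and the raw moments follow from the standard cumulant-to-moment recursion:

```python
import numpy as np
from math import comb, factorial

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 4))
A = (A + A.T) / 2                  # symmetric matrix of the quadratic form

# cumulants of Q = x'Ax with x ~ N(0, I): k_r = 2^(r-1) (r-1)! tr(A^r)
def cumulants(A, R):
    Ap, out = np.eye(A.shape[0]), []
    for r in range(1, R + 1):
        Ap = Ap @ A
        out.append(2 ** (r - 1) * factorial(r - 1) * np.trace(Ap))
    return out

# raw moments from cumulants via the standard recursion
def moments(k):
    m = [1.0]
    for n in range(1, len(k) + 1):
        m.append(sum(comb(n - 1, j) * k[j] * m[n - 1 - j] for j in range(n)))
    return m[1:]

k = cumulants(A, 4)
print(moments(k))   # inputs for the gamma-times-polynomial approximant
```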

19.
In spatial generalized linear mixed models (SGLMMs), statistical inference is difficult because the random effects make the marginal likelihood a high-dimensional integral. In this article, we temporarily treat the parameters as random variables and express the marginal likelihood function as a posterior expectation; the marginal likelihood is then approximated using samples from the posterior density of the latent variables and parameters given the data. In this setting, however, misspecification of the prior distribution of the correlation-function parameter and convergence problems of Markov chain Monte Carlo (MCMC) methods can adversely affect the likelihood approximation. To avoid these difficulties, we use an empirical Bayes approach to estimate the prior hyperparameters, together with a computationally efficient hybrid algorithm combining the inverse Bayes formula (IBF) and Gibbs sampler procedures. A simulation study is conducted to assess the performance of our method. Finally, we illustrate the method by applying it to a dataset of standard penetration tests of soil in an area in southern Iran.
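The point-wise identity m(y) = f(y|θ)π(θ)/π(θ|y), valid at any θ, is the engine behind IBF-type marginal-likelihood computations; here is a sketch in a conjugate normal toy model where the posterior ordinate is available in closed form (in the SGLMM it must be estimated from Gibbs output).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
sigma, m0, tau = 1.0, 0.0, 2.0           # known sd, prior mean and prior sd
y = rng.normal(1.5, sigma, size=20)
n, ybar = len(y), y.mean()

# closed-form conjugate posterior N(m_post, v_post)
v_post = 1.0 / (1.0 / tau**2 + n / sigma**2)
m_post = v_post * (m0 / tau**2 + n * ybar / sigma**2)

theta = 0.7                              # the identity holds at any theta
log_m = (stats.norm.logpdf(y, theta, sigma).sum()
         + stats.norm.logpdf(theta, m0, tau)
         - stats.norm.logpdf(theta, m_post, np.sqrt(v_post)))
print(log_m)                             # log marginal likelihood of the data
```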

20.
The Additive Genetic Gamma Frailty Model
In this paper the additive genetic gamma frailty model is defined, in which individual frailties are correlated as a result of an additive genetic model. An algorithm to construct additive genetic gamma frailties for any pedigree is given, such that the variance–covariance structure among the individual frailties equals the numerator relationship matrix times a variance. The EM algorithm can be used to estimate the parameters in the model; the calculations are similar to those for the shared frailty model, although the E-step is not correspondingly simple. This is illustrated by re-analysing data from the Danish adoption register, previously analysed with the shared frailty model by Nielsen et al. (1992). Goodness of fit of the additive genetic gamma frailty model can be tested after analysing the data with the correlated frailty model; in doing so, a "defect" was found in the frequently used and otherwise well-behaved likelihood.
