Found 20 similar articles (search time: 31 ms)
1.
Loukia Meligkotsidou 《Statistics and Computing》2007,17(2):93-107
In this paper we present Bayesian analysis of finite mixtures of multivariate Poisson distributions with an unknown number
of components. The multivariate Poisson distribution can be regarded as the discrete counterpart of the multivariate normal
distribution, which is suitable for modelling multivariate count data. Mixtures of multivariate Poisson distributions allow
for overdispersion and for negative correlations between variables. To perform Bayesian analysis of these models we adopt
a reversible jump Markov chain Monte Carlo (MCMC) algorithm with birth and death moves for updating the number of components.
We present results obtained from applying our modelling approach to simulated and real data. Furthermore, we apply our approach
to a problem in multivariate disease mapping, namely joint modelling of diseases with correlated counts.
2.
A new Markov chain Monte Carlo method for the Bayesian analysis of finite mixture distributions with an unknown number of
components is presented. The sampler is characterized by a state space consisting only of the number of components and the
latent allocation variables. Its main advantage is that it can be used, with minimal changes, for mixtures of components from
any parametric family, under the assumption that the component parameters can be integrated out of the model analytically.
Artificial and real data sets are used to illustrate the method and mixtures of univariate and of multivariate normals are
explicitly considered. The problem of label switching, when parameter inference is of interest, is addressed in a post-processing
stage.
3.
Statistics and Computing - Recent work on overfitting Bayesian mixtures of distributions offers a powerful framework for clustering multivariate data using a latent Gaussian model which resembles...
4.
We present full Bayesian analysis of finite mixtures of multivariate normals with unknown number of components. We adopt reversible
jump Markov chain Monte Carlo and we construct, in a manner similar to that of Richardson and Green (1997), split and merge
moves that produce good mixing of the Markov chains. The split moves are constructed on the space of eigenvectors and eigenvalues
of the current covariance matrix so that the proposed covariance matrices are positive definite. Our proposed methodology
has applications in classification and discrimination as well as heterogeneity modelling. We test our algorithm with real
and simulated data.
5.
6.
Oscar M. Rueda Cristina Rueda Ramon Diaz-Uriarte 《Journal of Statistical Computation and Simulation》2013,83(1):82-96
Hidden Markov models (HMMs) have been shown to be a flexible tool for modelling complex biological processes. However, choosing the number of hidden states remains an open question and the inclusion of random effects also deserves more research, as it is a recent addition to the fixed-effect HMM in many application fields. We present a Bayesian mixed HMM with an unknown number of hidden states and fixed covariates. The model is fitted using reversible-jump Markov chain Monte Carlo, avoiding the need to select the number of hidden states. We show through simulations that the estimations produced are more precise than those from a fixed-effect HMM and illustrate its practical application to the analysis of DNA copy number data, a field where HMMs are widely used.
7.
Mixture distributions have recently become increasingly popular in many scientific fields. Statistical computation and analysis of mixture models, however, are extremely complex due to the large number of parameters involved. Both EM algorithms for likelihood inference and MCMC procedures for Bayesian analysis have various difficulties in dealing with mixtures with an unknown number of components. In this paper, we propose a direct sampling approach to the computation of Bayesian finite mixture models with a varying number of components. This approach requires only the knowledge of the density function up to a multiplicative constant. It is easy to implement, numerically efficient and very practical in real applications. A simulation study shows that it performs quite satisfactorily on relatively high dimensional distributions. A well-known genetic data set is used to demonstrate the simplicity of this method and its power for the computation of high dimensional Bayesian mixture models.
8.
We consider the analysis of data under mixture models where the number of components in the mixture is unknown. We concentrate on mixture Dirichlet process models, and in particular we consider such models under conjugate priors. This conjugacy enables us to integrate out many of the parameters in the model, and to discretize the posterior distribution. Particle filters are particularly well suited to such discrete problems, and we propose the use of the particle filter of Fearnhead and Clifford for this problem. The performance of this particle filter, when analyzing both simulated and real data from a Gaussian mixture model, is uniformly better than the particle filter algorithm of Chen and Liu. In many situations it outperforms a Gibbs Sampler. We also show how models without the required amount of conjugacy can be efficiently analyzed by the same particle filter algorithm.
9.
In this paper, we study a new Bayesian approach for the analysis of linearly mixed structures. In particular, we consider the case of hyperspectral images, which have to be decomposed into a collection of distinct spectra, called endmembers, and a set of associated proportions for every pixel in the scene. This problem, often referred to as spectral unmixing, is usually considered on the basis of the linear mixing model (LMM). In unsupervised approaches, the endmember signatures have to be calculated by an endmember extraction algorithm, which generally relies on the supposition that there are pure (unmixed) pixels contained in the image. In practice, this assumption may not hold for highly mixed data and consequently the extracted endmember spectra differ from the true ones. A way out of this dilemma is to consider the problem under the normal compositional model (NCM). Contrary to the LMM, the NCM treats the endmembers as random Gaussian vectors and not as deterministic quantities. Existing Bayesian approaches for estimating the proportions under the NCM are restricted to the case that the covariance matrix of the Gaussian endmembers is a multiple of the identity matrix. The self-evident conclusion is that this model is not suitable when the variance differs from one spectral channel to the other, which is a common phenomenon in practice. In this paper, we first propose a Bayesian strategy for the estimation of the mixing proportions under the assumption of varying variances in the spectral bands. Then we generalize this model to handle the case of a completely unknown covariance structure. For both algorithms, we present Gibbs sampling strategies and compare their performance with other, state of the art, unmixing routines on synthetic as well as on real hyperspectral fluorescence spectroscopy data.
10.
《Journal of Statistical Computation and Simulation》2012,82(12):2308-2334
We propose a new unsupervised learning algorithm to fit regression mixture models with an unknown number of components. The developed approach consists in a penalized maximum likelihood estimation carried out by a robust expectation–maximization (EM)-like algorithm. We derive it for polynomial, spline, and B-spline regression mixtures. The proposed learning approach is unsupervised: (i) it simultaneously infers the model parameters and the optimal number of the regression mixture components from the data as the learning proceeds, rather than in a two-fold scheme as in standard model-based clustering using afterward model selection criteria, and (ii) it does not require accurate initialization unlike the standard EM for regression mixtures. The developed approach is applied to curve clustering problems. Numerical experiments on simulated and real data show that the proposed algorithm performs well and provides accurate clustering results, and confirm its benefit for practical applications.
11.
Finite mixture models arise in a natural way in that they are modeling unobserved population heterogeneity. It is assumed that the population consists of an unknown number k of subpopulations with parameters λ_1, ..., λ_k receiving weights p_1, ..., p_k. Because of the irregularity of the parameter space, the log-likelihood-ratio statistic (LRS) does not have a χ² limit distribution and therefore it is difficult to use the LRS to test for the number of components. These problems are circumvented by using the nonparametric bootstrap such that the mixture algorithm is applied B times to bootstrap samples obtained from the original sample with replacement. The number of components k is obtained as the mode of the bootstrap distribution of k. This approach is presented using the Times newspaper data and investigated in a simulation study for mixtures of Poisson data.
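The bootstrap procedure described in the abstract above (resample with replacement B times, estimate k on each resample, report the mode) can be sketched as follows. This is only an illustration under stated assumptions: `bootstrap_mode_k` and `toy_estimate_k` are hypothetical names, and the gap-counting estimator is a stand-in for whatever mixture-fitting algorithm the paper actually uses, which the abstract does not specify.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def bootstrap_mode_k(
    sample: Sequence[float],
    estimate_k: Callable[[List[float]], int],
    n_boot: int = 200,
    seed: int = 0,
) -> Tuple[int, Counter]:
    """Return the mode of the bootstrap distribution of k and the full counts.

    `estimate_k` is an assumed, user-supplied routine that fits a mixture to
    a sample and returns the estimated number of components.
    """
    rng = random.Random(seed)
    n = len(sample)
    counts: Counter = Counter()
    for _ in range(n_boot):
        # Nonparametric bootstrap: draw n observations with replacement.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        counts[estimate_k(resample)] += 1
    # Mode of the bootstrap distribution; ties are broken toward the smaller k.
    k_hat = max(counts, key=lambda k: (counts[k], -k))
    return k_hat, counts


def toy_estimate_k(xs: List[float]) -> int:
    """Hypothetical stand-in estimator: one component per well-separated cluster.

    Counts gaps larger than 5 between consecutive sorted distinct values.
    A real analysis would fit a mixture model here instead.
    """
    values = sorted(set(xs))
    return 1 + sum(1 for a, b in zip(values, values[1:]) if b - a > 5)
```

Calling `bootstrap_mode_k(data, toy_estimate_k, n_boot=100)` returns the modal k together with a `Counter` of how often each k was selected, which gives a rough picture of how stable the choice is across resamples.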
12.
Marco Riani Anthony C. Atkinson Andrea Cerioli 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(2):447-466
Summary. We use the forward search to provide robust Mahalanobis distances to detect the presence of outliers in a sample of multivariate normal data. Theoretical results on order statistics and on estimation in truncated samples provide the distribution of our test statistic. We also introduce several new robust distances with associated distributional results. Comparisons of our procedure with tests using other robust Mahalanobis distances show the good size and high power of our procedure. We also provide a unification of results on correction factors for estimation from truncated samples.
13.
Baba B. Alhaji Yoshiko Hayashi Veronica Vinciotti Andrew Harrison Berthold Lausen 《Journal of applied statistics》2016,43(8):1369-1385
Bayesian finite mixture modelling is a flexible parametric modelling approach for classification and density fitting. Many areas of application require distinguishing a signal from a noise component. In practice, it is often difficult to justify a specific distribution for the signal component; therefore, the signal distribution is usually further modelled via a mixture of distributions. However, modelling the signal as a mixture of distributions is computationally non-trivial due to the difficulties in justifying the exact number of components to be used and due to the label switching problem. This paper proposes the use of a non-parametric distribution to model the signal component. We consider the case of discrete data and show how this new methodology leads to more accurate parameter estimation and smaller false non-discovery rate. Moreover, it does not incur the label switching problem. We show an application of the method to data generated by ChIP-sequencing experiments.
14.
15.
Vee Ming Ng 《Communications in Statistics - Theory and Methods》2013,42(1):111-120
This paper analyses a linear model in which both the mean and the precision change exactly once at an unknown point in time. Posterior distributions are found for the unknown time point at which the changes occurred and for the ratio of the precisions. The Bayesian predictive distribution of k future observations is also derived. It is shown that the unconditional posterior distribution of the ratio of precisions is a mixture of F-type distributions and the predictive distribution is a mixture of multivariate t distributions.
16.
This paper presents a Bayesian analysis of partially linear additive models for quantile regression. We develop a semiparametric Bayesian approach to quantile regression models using a spectral representation of the nonparametric regression functions and the Dirichlet process (DP) mixture for error distribution. We also consider Bayesian variable selection procedures for both parametric and nonparametric components in a partially linear additive model structure based on the Bayesian shrinkage priors via a stochastic search algorithm. Based on the proposed Bayesian semiparametric additive quantile regression model referred to as BSAQ, the Bayesian inference is considered for estimation and model selection. For the posterior computation, we design a simple and efficient Gibbs sampler based on a location-scale mixture of exponential and normal distributions for an asymmetric Laplace distribution, which facilitates the commonly used collapsed Gibbs sampling algorithms for the DP mixture models. Additionally, we discuss the asymptotic property of the semiparametric quantile regression model in terms of consistency of posterior distribution. Simulation studies and real data application examples illustrate the proposed method and compare it with Bayesian quantile regression methods in the literature.
17.
S. P. Brooks 《Statistics and Computing》2001,11(2):179-190
When the results of biological experiments are tested for a possible difference between treatment and control groups, the inference is only valid if based upon a model that fits the experimental results satisfactorily. In dominant-lethal testing, foetal death has previously been assumed to follow a variety of models, including a Poisson, Binomial, Beta-binomial and various mixture models. However, discriminating between models has always been a particularly difficult problem. In this paper, we consider the data from 6 separate dominant-lethal assay experiments and discriminate between the competing models which could be used to describe them. We adopt a Bayesian approach and illustrate how a variety of different models may be considered, using Markov chain Monte Carlo (MCMC) simulation techniques and comparing the results with the corresponding maximum likelihood analyses. We present an auxiliary variable method for determining the probability that any particular data cell is assigned to a given component in a mixture and we illustrate the value of this approach. Finally, we show how the Bayesian approach provides a natural and unique perspective on the model selection problem via reversible jump MCMC and illustrate how probabilities associated with each of the different models may be calculated for each data set. In terms of estimation we show how, by averaging over the different models, we obtain reliable and robust inference for any statistic of interest.
18.
《Journal of statistical planning and inference》2006,136(3):578-596
Bayesian nonparametric methods have been applied to survival analysis problems since the emergence of the area of Bayesian nonparametrics. However, the use of the flexible class of Dirichlet process mixture models has been rather limited in this context. This is, arguably, to a large extent, due to the standard way of fitting such models that precludes full posterior inference for many functionals of interest in survival analysis applications. To overcome this difficulty, we provide a computational approach to obtain the posterior distribution of general functionals of a Dirichlet process mixture. We model the survival distribution employing a flexible Dirichlet process mixture, with a Weibull kernel, that yields rich inference for several important functionals. In the process, a method for hazard function estimation emerges. Methods for simulation-based model fitting, in the presence of censoring, and for prior specification are provided. We illustrate the modeling approach with simulated and real data.
19.
Aldo M. Garay Heleno Bolfarine Celso R.B. Cabral 《Journal of applied statistics》2015,42(12):2694-2714
As is the case in many studies, the data collected are limited and an exact value is recorded only if it falls within an interval range. Hence, the responses can be either left, interval or right censored. Linear (and nonlinear) regression models are routinely used to analyze these types of data and are based on normality assumptions for the error terms. However, those analyses might not provide robust inference when the normality assumptions are questionable. In this article, we develop a Bayesian framework for censored linear regression models by replacing the Gaussian assumptions for the random errors with scale mixtures of normal (SMN) distributions. The SMN is an attractive class of symmetric heavy-tailed densities that includes the normal, Student-t, Pearson type VII, slash and the contaminated normal distributions, as special cases. Using a Bayesian paradigm, an efficient Markov chain Monte Carlo algorithm is introduced to carry out posterior inference. A new hierarchical prior distribution is suggested for the degrees of freedom parameter in the Student-t distribution. The likelihood function is utilized to compute not only some Bayesian model selection measures but also to develop Bayesian case-deletion influence diagnostics based on the q-divergence measure. The proposed Bayesian methods are implemented in the R package BayesCR. The newly developed procedures are illustrated with applications using real and simulated data.
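To make the scale-mixture-of-normals idea in the abstract above concrete, the sketch below draws Student-t variates as a normal with a gamma-mixed scale: if Z ~ N(0, 1) and U ~ Gamma(shape ν/2, rate ν/2) are independent, then μ + σ·Z/√U follows a Student-t distribution with ν degrees of freedom. The function name `sample_student_t_smn` and its parameters are illustrative assumptions; this is not the BayesCR implementation and does not cover censoring or posterior inference.

```python
import math
import random
from typing import List


def sample_student_t_smn(
    nu: float,
    n: int,
    mu: float = 0.0,
    sigma: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Draw n Student-t(nu) variates via the scale-mixture-of-normals form.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
        u = rng.gammavariate(nu / 2.0, 2.0 / nu)
        z = rng.gauss(0.0, 1.0)
        # Conditional on u, the draw is normal with variance sigma^2 / u.
        draws.append(mu + sigma * z / math.sqrt(u))
    return draws
```

For ν > 2 the sample variance of the draws should be close to σ²·ν/(ν − 2), which is larger than σ² and reflects the heavier tails relative to the normal case.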
20.
This paper is concerned with the problem of obtaining Bayesian prediction bounds for future observations based on a type I censored sample from a nonhomogeneous population having a distribution which is a mixture of two Lomax components. A numerical example is given to illustrate our results.