Similar literature
20 similar documents found (search time: 312 ms)
1.
Weak consistency and asymptotic normality are shown for a stochastic EM algorithm for censored data from a mixture of distributions under lognormal assumptions. The asymptotic properties hold for all parameters of the distributions, including the mixing parameter. For parameter estimation to be meaningful, the censored mixture distribution must be identifiable; general conditions under which this is the case are given. The stochastic EM algorithm addressed in this paper is used to estimate wood fibre length distributions from optically measured data on cylindrical wood samples (increment cores).

2.
Matrix-variate distributions represent a natural way for modeling random matrices. Realizations from random matrices are generated by the simultaneous observation of variables in different situations or locations, and are commonly arranged in three-way data structures. Among the matrix-variate distributions, the matrix normal density plays the same pivotal role as the multivariate normal distribution in the family of multivariate distributions. In this work we define and explore finite mixtures of matrix normals. An EM algorithm for the model estimation is developed and some useful properties are demonstrated. We finally show that the proposed mixture model can be a powerful tool for classifying three-way data both in supervised and unsupervised problems. A simulation study and some real examples are presented.

3.
In the present paper we examine finite mixtures of multivariate Poisson distributions as an alternative class of models for multivariate count data. The proposed models allow for both overdispersion in the marginal distributions and negative correlation, while they are computationally tractable using standard ideas from finite mixture modelling. An EM type algorithm for maximum likelihood (ML) estimation of the parameters is developed. The identifiability of this class of mixtures is proved. Properties of ML estimators are derived. A real data application concerning model based clustering for multivariate count data related to different types of crime is presented to illustrate the practical potential of the proposed class of models.
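As a rough illustration of the EM idea mentioned in this abstract, the sketch below fits a two-component mixture of univariate Poisson distributions. It is a simplification under stated assumptions, not the multivariate algorithm of the paper: the function names (`poisson_pmf`, `log_likelihood`, `em_poisson_mixture`), the starting values, and the count data are all hypothetical.

```python
import math

def poisson_pmf(k, lam):
    # Evaluated on the log scale to avoid overflow for large counts.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def log_likelihood(counts, weights, rates):
    return sum(
        math.log(sum(w * poisson_pmf(k, lam) for w, lam in zip(weights, rates)))
        for k in counts
    )

def em_poisson_mixture(counts, weights, rates, n_iter=50):
    """Run EM; return final weights, rates and the log-likelihood trace."""
    weights, rates = list(weights), list(rates)
    trace = [log_likelihood(counts, weights, rates)]
    n = len(counts)
    for _ in range(n_iter):
        # E-step: posterior probability that each count came from each component.
        resp = []
        for k in counts:
            joint = [w * poisson_pmf(k, lam) for w, lam in zip(weights, rates)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: responsibility-weighted proportions and means.
        for j in range(len(weights)):
            mass = sum(r[j] for r in resp)
            weights[j] = mass / n
            rates[j] = sum(r[j] * k for r, k in zip(resp, counts)) / mass
        trace.append(log_likelihood(counts, weights, rates))
    return weights, rates, trace

# Made-up counts with a low-rate group and a high-rate group (illustration only).
counts = [0, 1, 1, 2, 0, 1, 2, 1, 9, 11, 10, 8, 12, 10]
weights, rates, trace = em_poisson_mixture(counts, [0.5, 0.5], [1.0, 5.0])
```

The log-likelihood trace should never decrease from one iteration to the next, which is a convenient sanity check for any EM implementation. A Poisson mixture of this kind has variance larger than its mean, which is the overdispersion the abstract refers to.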

4.
We propose here a robust multivariate extension of the bivariate Birnbaum–Saunders (BS) distribution derived by Kundu et al. [Bivariate Birnbaum–Saunders distribution and associated inference. J Multivariate Anal. 2010;101:113–125], based on scale mixtures of normal (SMN) distributions that are used for modelling symmetric data. The resulting multivariate BS-type distribution is absolutely continuous, and its marginal and conditional distributions are of the BS type introduced by Balakrishnan et al. [Estimation in the Birnbaum–Saunders distribution based on scale-mixture of normals and the EM algorithm. Stat Oper Res Trans. 2009;33:171–192]. Due to the complexity of the likelihood function, parameter estimation by direct maximization is very difficult to achieve. For this reason, we exploit the hierarchical representation of the proposed distribution to derive a fast and accurate EM algorithm for computing the maximum likelihood (ML) estimates of the model parameters. We then evaluate the finite-sample performance of the developed EM algorithm and the asymptotic properties of the ML estimates through empirical experiments. Finally, we illustrate the results with a real data set and demonstrate the robustness of the estimation procedure developed here.

5.
Iterative reweighting (IR) is a popular method for computing M-estimates of location and scatter in multivariate robust estimation. When the objective function comes from a scale mixture of normal distributions, the iterative reweighting algorithm can be identified as an EM algorithm. The purpose of this paper is to show that in the special case of the multivariate t-distribution, substantial improvements to the convergence rate can be obtained by modifying the EM algorithm.
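The link between iterative reweighting and EM can be sketched in the simplest setting: a univariate t model with known degrees of freedom. This is the plain, unmodified EM iteration, not the accelerated variant the abstract proposes. The function name `t_location_scale_irls`, the choice `nu=4.0`, and the sample are assumptions made for illustration.

```python
def t_location_scale_irls(xs, nu=4.0, n_iter=100):
    """Iteratively reweighted estimates of location and squared scale for a t model."""
    n = len(xs)
    # Start from the ordinary sample mean and variance.
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    for _ in range(n_iter):
        # E-step: expected latent precision weights; large residuals get small weights.
        ws = [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in xs]
        # M-step: weighted mean, then weighted scatter about the new mean.
        mu = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
        sigma2 = sum(w * (x - mu) ** 2 for w, x in zip(ws, xs)) / n
    return mu, sigma2

# Made-up sample with one gross outlier (illustration only).
xs = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 25.0]
mu, sigma2 = t_location_scale_irls(xs)
sample_mean = sum(xs) / len(xs)
```

Because the outlier receives a small weight, the fitted location should stay close to the bulk of the data, whereas the ordinary sample mean is pulled toward the outlier.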

6.
In some situations, the distribution of the error terms of a multivariate linear regression model may depart from normality. This problem has been addressed, for example, by specifying a different parametric distribution family for the error terms, such as multivariate skewed and/or heavy-tailed distributions. A new solution is proposed, which is obtained by modelling the error term distribution through a finite mixture of multi-dimensional Gaussian components. The multivariate linear regression model is studied under this assumption. Identifiability conditions are proved and maximum likelihood estimation of the model parameters is performed using the EM algorithm. The number of mixture components is chosen through model selection criteria; when this number is equal to one, the proposal reduces to the classical approach. The performance of the proposed approach is evaluated through Monte Carlo experiments and compared with that of other approaches. Finally, the results obtained from the analysis of a real dataset are presented.

7.
In this article, we propose mixtures of skew Laplace normal (SLN) distributions to model both skewness and heavy-tailedness in heterogeneous data sets as an alternative to mixtures of skew Student-t-normal (STN) distributions. We give the expectation–maximization (EM) algorithm to obtain the maximum likelihood (ML) estimators for the parameters of interest. We also analyze the mixture regression model based on the SLN distribution and provide the ML estimators of the parameters using the EM algorithm. The performance of the proposed mixture model is illustrated by a simulation study and two real data examples.

8.
Mixture distribution models are more useful than single distributions for modeling heterogeneous data sets. The aim of this paper is to propose, for the first time, a mixture of Weibull–Poisson (WP) distributions for modeling heterogeneous data sets, which provides a powerful alternative mixture distribution for such data. Many features of the proposed mixture of WP distributions are examined. The expectation maximization (EM) algorithm is used to determine the maximum-likelihood estimates of the parameters, and a simulation study is conducted to evaluate the performance of the proposed EM scheme. Applications to two real heterogeneous data sets are given to show the flexibility and potential of the new mixture distribution.

9.
Motivated by examples in protein bioinformatics, we study a mixture model of multivariate angular distributions. The distribution treated here (multivariate sine distribution) is a multivariate extension of the well-known von Mises distribution on the circle. The density of the sine distribution has an intractable normalizing constant and here we propose to replace it in the concentrated case by a simple approximation. We study the EM algorithm for this distribution and apply it to a practical example from protein bioinformatics.

10.
It is well known that the log-likelihood function for samples coming from normal mixture distributions may present spurious maxima and singularities. For this reason, we reformulate some of Hathaway's results and propose two constrained estimation procedures for multivariate normal mixture modelling according to the likelihood approach. Their performance is illustrated by numerical simulations based on the EM algorithm. A comparison between multivariate normal mixtures and the hot-deck approach in missing data imputation is also considered. Salvatore Ingrassia: S. Ingrassia carried out the research as part of the project Metodi Statistici e Reti Neuronali per l'Analisi di Dati Complessi (PRIN 2000, resp. G. Lunetta).
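The singularities mentioned in this abstract can be seen in a small numerical experiment with a univariate two-component normal mixture: if one component's mean is pinned to an observation and its standard deviation shrinks toward zero, the log-likelihood grows without bound. The helper names (`normal_pdf`, `mixture_loglik`), the sample, and the parameter values below are hypothetical and chosen only to make the effect visible.

```python
import math

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_loglik(xs, w, mu1, s1, mu2, s2):
    # Log-likelihood of a two-component normal mixture with weight w on component 1.
    return sum(
        math.log(w * normal_pdf(x, mu1, s1) + (1.0 - w) * normal_pdf(x, mu2, s2))
        for x in xs
    )

# Made-up sample (illustration only).
xs = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]

# Pin the first component's mean on one observation and shrink its standard deviation.
sigmas = (1.0, 1e-2, 1e-4, 1e-6)
logliks = [mixture_loglik(xs, 0.5, xs[0], s, 0.0, 1.0) for s in sigmas]
```

The values in `logliks` should increase as the standard deviation shrinks, which is why unconstrained maximization can run off to a degenerate solution and why constraints on the component variances are a natural remedy.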

11.
The majority of the existing literature on model-based clustering deals with symmetric components. In some cases, especially when dealing with skewed subpopulations, the estimate of the number of groups can be misleading; if symmetric components are assumed we need more than one component to describe an asymmetric group. Existing mixture models, based on multivariate normal distributions and multivariate t distributions, try to fit symmetric distributions, i.e. they fit symmetric clusters. In the present paper, we propose the use of finite mixtures of the normal inverse Gaussian distribution (and its multivariate extensions). Such finite mixture models start from a density that allows for skewness and fat tails, generalize the existing models, are tractable and have desirable properties. We examine both the univariate case, to gain insight, and the multivariate case, which is more useful in real applications. EM type algorithms are described for fitting the models. Real data examples are used to demonstrate the potential of the new model in comparison with existing ones.

12.
The iteratively reweighting algorithm is one of the most widely used algorithms for computing M-estimates of the location and scatter parameters of a multivariate dataset. If the M estimating equations are the maximum likelihood estimating equations from some scale mixture of normal distributions (e.g. from a multivariate t-distribution), the iteratively reweighting algorithm is identified as an EM algorithm, whose convergence behavior is well established. However, as Tyler (J. Roy. Statist. Soc. Ser. B 59 (1997) 550) pointed out, little is known about the theoretical convergence properties of the iteratively reweighting algorithm if it cannot be identified as an EM algorithm. In this paper, we consider the convergence behavior of the iteratively reweighting algorithm induced from M estimating equations that cannot be identified as an EM algorithm. We give some general results on the convergence properties, and we show that the convergence behavior of a general iteratively reweighting algorithm induced from the M estimating equations is similar to that of an EM algorithm even if it cannot be identified as one.

13.
We present an algorithm for multivariate robust Bayesian linear regression with missing data. The iterative algorithm computes an approximate posterior for the model parameters based on the variational Bayes (VB) method. Compared to the EM algorithm, the VB method has the advantage that the variance of the model parameters is also computed directly by the algorithm. We consider three families of Gaussian scale mixture models for the measurements, which include as special cases the multivariate t distribution, the multivariate Laplace distribution, and the contaminated normal model. The observations can contain missing values, assuming that the missing data mechanism can be ignored. A Matlab/Octave implementation of the algorithm is presented and applied to solve three reference examples from the literature.

14.
Finite mixtures of multivariate skew t (MST) distributions have proven to be useful in modelling heterogeneous data with asymmetric and heavy tail behaviour. Recently, they have been exploited as an effective tool for modelling flow cytometric data. A number of algorithms for the computation of the maximum likelihood (ML) estimates for the model parameters of mixtures of MST distributions have been put forward in recent years. These implementations use various characterizations of the MST distribution, which are similar but not identical. While exact implementation of the expectation-maximization (EM) algorithm can be achieved for ‘restricted’ characterizations of the component skew t-distributions, Monte Carlo (MC) methods have been used to fit the ‘unrestricted’ models. In this paper, we review several recent fitting algorithms for finite mixtures of multivariate skew t-distributions, at the same time clarifying some of the connections between the various existing proposals. In particular, recent results have shown that the EM algorithm can be implemented exactly for faster computation of ML estimates for mixtures with unrestricted MST components. The gain in computational time is effected by noting that the semi-infinite integrals on the E-step of the EM algorithm can be put in the form of moments of the truncated multivariate non-central t-distribution, similar to the restricted case, which subsequently can be expressed in terms of the non-truncated form of the central t-distribution function for which fast algorithms are available. We present comparisons to illustrate the relative performance of the restricted and unrestricted models, and demonstrate the usefulness of the recently proposed methodology for the unrestricted MST mixture, by some applications to three real datasets.

15.
Estimators derived from the expectation‐maximization (EM) algorithm are not robust since they are based on the maximization of the likelihood function. We propose an iterative proximal‐point algorithm based on the EM algorithm to minimize a divergence criterion between a mixture model and the unknown distribution that generates the data. The algorithm estimates in each iteration the proportions and the parameters of the mixture components in two separate steps. Resulting estimators are generally robust against outliers and misspecification of the model. Convergence properties of our algorithm are studied. The convergence of the introduced algorithm is discussed on a two‐component Weibull mixture entailing a condition on the initialization of the EM algorithm in order for the latter to converge. Simulations on Gaussian and Weibull mixture models using different statistical divergences are provided to confirm the validity of our work and the robustness of the resulting estimators against outliers in comparison to the EM algorithm. An application to a dataset of velocities of galaxies is also presented. The Canadian Journal of Statistics 47: 392–408; 2019 © 2019 Statistical Society of Canada

16.
Multivariate mixtures of Erlang distributions form a versatile, yet analytically tractable, class of distributions making them suitable for multivariate density estimation. We present a flexible and effective fitting procedure for multivariate mixtures of Erlangs, which iteratively uses the EM algorithm, by introducing a computationally efficient initialization and adjustment strategy for the shape parameter vectors. We furthermore extend the EM algorithm for multivariate mixtures of Erlangs to be able to deal with randomly censored and fixed truncated data. The effectiveness of the proposed algorithm is demonstrated on simulated as well as real data sets.

17.
Hea-Jung Kim. Statistics, 2015, 49(4): 878-899
A screening problem is tackled by proposing a parametric class of distributions designed to match the behavior of the partially observed screened data. This class is obtained from the nontruncated marginal of the rectangle-truncated multivariate normal distributions. Motivations for the screened distribution as well as some of its basic properties, such as its characteristic function, are presented. These allow a detailed exploration of other important properties, including closure under linear transformations, under marginal and conditional operations, and under mixture operations, as well as the first two moments and some sampling distributions. Various applications of these results to statistical modelling and data analysis are also provided.

18.
Cluster analysis is the automated search for groups of homogeneous observations in a data set. A popular modeling approach for clustering is based on finite normal mixture models, which assume that each cluster is modeled as a multivariate normal distribution. However, the normality assumption that each component is symmetric is often unrealistic. Furthermore, normal mixture models are not robust against outliers; they often require extra components for modeling outliers and/or give a poor representation of the data. To address these issues, we propose a new class of distributions, multivariate t distributions with the Box-Cox transformation, for mixture modeling. This class of distributions generalizes the normal distribution with the more heavy-tailed t distribution, and introduces skewness via the Box-Cox transformation. As a result, this provides a unified framework to simultaneously handle outlier identification and data transformation, two interrelated issues. We describe an Expectation-Maximization algorithm for parameter estimation along with transformation selection. We demonstrate the proposed methodology with three real data sets and simulation studies. Compared with a wealth of approaches including the skew-t mixture model, the proposed t mixture model with the Box-Cox transformation performs favorably in terms of accuracy in the assignment of observations, robustness against model misspecification, and selection of the number of components.

19.
This paper deals with a software reliability model based on a nonhomogeneous Poisson process (NHPP). We introduce new types of mean functions which can be either NHPP-I or NHPP-II according to the choice of the distribution function. The proposed mean function is motivated by the fact that a strictly monotone increasing function can be modeled by a distribution function, and an unknown distribution function can be approximated by a mixture of beta distributions. Some existing mean functions can be regarded as special cases of the proposed mean functions. The EM algorithm is used to obtain maximum likelihood estimates of the parameters in the proposed model.

20.
While the literature on multivariate models for continuous data flourishes, there is a lack of models for multivariate counts. We aim to contribute to this framework by extending the well-known class of univariate hidden Markov models to the multidimensional case, by introducing multivariate Poisson hidden Markov models. Each state of the extended model is associated with a different multivariate discrete distribution. We consider different distributions with Poisson marginals, starting from the multivariate Poisson distribution and then extending to copula based distributions to allow flexible dependence structures. An EM type algorithm is developed for maximum likelihood estimation. A real data application is presented to illustrate the usefulness of the proposed models. In particular, we apply the models to the occurrence of strong earthquakes (surface wave magnitude ≥5) in three seismogenic subregions in the broad region of the North Aegean Sea for the time period from 1 January 1981 to 31 December 2008. Earthquakes occurring in one subregion may trigger events in adjacent ones and hence the observed time series of events are cross‐correlated. It is evident from the results that the three subregions interact with each other at times differing by up to a few months. This migration of seismic activity is captured by the model as a transition to a state of higher seismicity.
