Found 20 similar articles (search time: 15 ms)
1.
Hidden semi-Markov models (HSMMs) were introduced to overcome the constraint of a geometric sojourn time distribution for the hidden states in classical hidden Markov models. Several variations of HSMMs have been proposed that model the sojourn times by a parametric or a nonparametric family of distributions. In this article, we concentrate on the nonparametric case where the duration distributions are attached to transitions and not to states, as they are in most published papers on HSMMs. We therefore treat the underlying hidden semi-Markov chain in its general probabilistic structure. For that case, Barbu and Limnios (2008) proposed an Expectation–Maximization (EM) algorithm to estimate the semi-Markov kernel and the emission probabilities that characterize the dynamics of the model. In this article, we consider an improved version of Barbu and Limnios' EM algorithm which is faster than the original one. Moreover, we propose a stochastic version of the EM algorithm that achieves estimates comparable to those of the EM algorithm in less execution time. Some numerical examples are provided which illustrate the efficient performance of the proposed algorithms.
2.
Journal of Statistical Computation and Simulation, 2012, 82(8):713-729
The mixture transition distribution (MTD) model was introduced by Raftery to address the need for parsimony in the modeling of high-order Markov chains in discrete time. The particularity of this model comes from the fact that the effect of each lag upon the present is considered separately and additively, so that the number of parameters required is drastically reduced. However, the efficiency of the MTD parameter estimation methods proposed to date remains problematic on account of the large number of constraints on the parameters. In this article, an iterative procedure, commonly known as the expectation–maximization (EM) algorithm, is developed in conjunction with the principle of maximum likelihood estimation (MLE) to estimate the MTD parameters. Some applications of MTD modeling show that the proposed EM algorithm is easier to use than the algorithm developed by Berchtold. Moreover, the EM parameter estimates for high-order MTD models fitted to DNA sequences outperform the corresponding fully parametrized Markov chain in terms of the Bayesian information criterion. A software implementation of our algorithm is available in the library seq++ at http://stat.genopole.cnrs.fr/seqpp.
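As a reading aid for the abstract above, the additive lag structure of an MTD model of order ℓ on m states can be written as follows. The symbols (λ_g for the lag weights, q for the entries of a single m × m transition matrix) are notation assumed here, not taken from the abstract, and the non-negativity of the weights is a simplifying assumption.

```latex
P(X_t = i_0 \mid X_{t-1} = i_1, \dots, X_{t-\ell} = i_\ell)
  = \sum_{g=1}^{\ell} \lambda_g \, q_{i_g i_0},
\qquad \lambda_g \ge 0, \quad \sum_{g=1}^{\ell} \lambda_g = 1 .
% Free parameters: m(m-1) + (\ell - 1) for the MTD model,
% versus m^{\ell}(m-1) for the fully parametrized chain of order \ell.
```

The parameter counts in the comment are what makes the model parsimonious: the MTD count grows linearly in the order ℓ, whereas the fully parametrized count grows exponentially.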
3.
Interval-censored data arise in a wide variety of application and research areas such as, for example, AIDS studies (Kim et al., 1993) and cancer research (Finkelstein, 1986; Becker & Melbye, 1991). Peto (1973) proposed a Newton–Raphson algorithm for obtaining a generalized maximum likelihood estimate (GMLE) of the survival function with interval-censored observations. Turnbull (1976) proposed a self-consistent algorithm for interval-censored data and obtained the same GMLE. Groeneboom & Wellner (1992) used the convex minorant algorithm for constructing an estimator of the survival function with "case 2" interval-censored data. However, as is known, the GMLE is not uniquely defined on the interval [0, ∞]. In addition, Turnbull's algorithm leads to a self-consistent equation which is not in the form of an integral equation. Large sample properties of the GMLE have not been previously examined because of, we believe, among other things, the lack of such an integral equation. In this paper, we present an EM algorithm for constructing a GMLE on [0, ∞]. The GMLE is expressed as a solution of an integral equation. More recently, with the help of this integral equation, Yu et al. (1997a, b) have shown that the GMLE is consistent and asymptotically normally distributed. An application of the proposed GMLE is presented.
4.
A multidimensional scaling methodology (STUNMIX) for the analysis of subjects' preference/choice of stimuli that sets out to integrate the previous work in this area into a single framework, as well as to provide a variety of new options and models, is presented. Locations of the stimuli and the ideal points of derived segments of subjects on latent dimensions are estimated simultaneously. The methodology is formulated in the framework of the exponential family of distributions, whereby a wide range of different data types can be analyzed. Possible reparameterizations of stimulus coordinates by stimulus characteristics, as well as of probabilities of segment membership by subject background variables, are permitted. The models are estimated in a maximum likelihood framework. The performance of the models is demonstrated on synthetic data, and robustness is investigated. An empirical application is provided, concerning intentions to buy portable telephones.
5.
Finite mixture models provide a reasonable tool for modeling various types of observed phenomena, especially those which are random in nature. In this article, a finite mixture of Weibull and Pareto (IV) distributions is considered and studied. Some structural properties of the resulting model are discussed, including estimation of the model parameters via the expectation maximization (EM) algorithm. A real-life data application shows that in certain situations this mixture model might be a better alternative than popular rival models.
6.
S. R. Paul. The American Statistician, 2013, 67(2):136-139
Binary-response data arise in teratology and mutagenicity studies in which each treatment is applied to a group of litters. In a large experiment, a contingency table can be constructed to test the treatment × litter size interaction (see Kastenbaum and Lamphiear 1959). In situations in which there is a clumped category, as in the Kastenbaum and Lamphiear mice-depletion data, a clumped binomial model (Koch et al. 1976) or a clumped beta-binomial model (Paul 1979) can be used to analyze these data. When a clumped binomial model is appropriate, the maximum likelihood estimates of the parameters of the model under the hypothesis of no treatment × litter size interaction, as well as under the hypothesis of the said interaction, can be computed via the EM algorithm for computing maximum likelihood estimates from incomplete data (Dempster et al. 1977). In this article the EM algorithm is described and used to test treatment × litter size interaction for the Kastenbaum and Lamphiear data and for a set of data given in Luning et al. (1966).
7.
Journal of Statistical Computation and Simulation, 2012, 82(10):2166-2186
The lasso is a popular technique for simultaneous estimation and variable selection in many research areas. The marginal posterior mode of the regression coefficients is equivalent to the estimates given by the non-Bayesian lasso when the regression coefficients have independent Laplace priors. Because of its flexibility in statistical inference, the Bayesian approach has attracted a growing body of research in recent years. Current approaches primarily either perform a fully Bayesian analysis using a Markov chain Monte Carlo (MCMC) algorithm or use Monte Carlo expectation maximization (MCEM) methods with an MCMC algorithm in each E-step. However, MCMC-based Bayesian methods carry a heavy computational burden and converge slowly. Tan et al. [An efficient MCEM algorithm for fitting generalized linear mixed models for correlated binary data. J Stat Comput Simul. 2007;77:929–943] proposed a non-iterative sampling approach, the inverse Bayes formula (IBF) sampler, for computing posteriors of a hierarchical model in the structure of MCEM. Motivated by their paper, we develop this IBF sampler in the structure of MCEM to give the marginal posterior mode of the regression coefficients for the Bayesian lasso, by adjusting the weights of importance sampling, when the full conditional distribution is not explicit. Simulation experiments show that our method, based on the expectation maximization algorithm, greatly reduces computational time and performs comparably with other Bayesian lasso methods in both prediction accuracy and variable selection accuracy, and even better when the sample size is relatively large.
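The link between Laplace priors and the lasso mentioned in the abstract above can be sketched in a simplified setting. The block below assumes a Gaussian linear model with the noise variance σ² held fixed (the abstract itself speaks of the marginal posterior mode, which this sketch does not derive); the symbols y, X, β, σ² and τ are notation assumed for illustration.

```latex
% Assumed notation: y (responses), X (design matrix), \beta (coefficients),
% \sigma^2 (noise variance, held fixed here), \tau (rate of the Laplace prior).
\pi(\beta \mid y, \sigma^2)
  \propto \exp\!\Big(-\tfrac{1}{2\sigma^2}\lVert y - X\beta\rVert_2^2\Big)
          \prod_{j=1}^{p} \tfrac{\tau}{2}\exp\big(-\tau\lvert\beta_j\rvert\big),
\qquad
\hat\beta_{\mathrm{mode}}
  = \arg\min_{\beta}\; \lVert y - X\beta\rVert_2^2
    + 2\sigma^2\tau \sum_{j=1}^{p}\lvert\beta_j\rvert .
```

Taking the negative logarithm of the posterior and multiplying by 2σ² gives the penalized least-squares criterion on the right, so under these assumptions the lasso penalty parameter corresponds to 2σ²τ.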
8.
J. Portela. Communications in Statistics - Theory and Methods, 2013, 42(20):3250-3263
In this work, the multinomial mixture model is studied through a maximum likelihood approach. The convergence of the maximum likelihood estimator to a set with characteristics of interest is shown. A method to select the number of mixture components is developed based on the form of the maximum likelihood estimator. A simulation study is then carried out to verify its behavior. Finally, two applications of multinomial mixtures to real data are presented.
9.
We propose a modification to the variant of link-tracing sampling suggested by Félix-Medina and Thompson [M.H. Félix-Medina, S.K. Thompson, Combining cluster sampling and link-tracing sampling to estimate the size of hidden populations, Journal of Official Statistics 20 (2004) 19–38] that gives the researcher some control over the final sample size, the precision of the estimates, or other characteristics of the sample that the researcher is interested in controlling. We achieve this goal by selecting an initial sequential sample of sites instead of an initial simple random sample of sites as those authors suggested. We estimate the population size by means of the maximum likelihood estimators suggested by the above-mentioned authors or by the Bayesian estimators proposed by Félix-Medina and Monjardin [M.H. Félix-Medina, P.E. Monjardin, Combining link-tracing sampling and cluster sampling to estimate the size of hidden populations: A Bayesian-assisted approach, Survey Methodology 32 (2006) 187–195]. Variances are estimated by means of jackknife and bootstrap estimators as well as by the delta estimators proposed in the two above-mentioned papers. Interval estimates of the population size are obtained by means of Wald and bootstrap confidence intervals. The results of an exploratory simulation study indicate good performance of the proposed sampling strategy.
10.
A faster alternative to the EM algorithm in finite mixture distributions is described, which alternates EM iterations with Gauss-Newton iterations using the observed information matrix. At the expense of modest additional analytical effort in obtaining the observed information, the hybrid algorithm reduces the computing time required and provides asymptotic standard errors at convergence. The algorithm is illustrated on the two-component normal mixture.
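For orientation, the plain EM iteration that such a hybrid scheme would alternate with Gauss-Newton steps can be sketched for the two-component normal mixture. This is a minimal illustration of standard EM only, not the hybrid algorithm described in the abstract; the function names, starting values and simulated data are all assumptions made for this sketch.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def sample_mixture(n_per_component, mu1, mu2, seed):
    """Hypothetical helper: draw an equal-weight mixture of N(mu1, 1) and N(mu2, 1)."""
    rng = random.Random(seed)
    first = [rng.gauss(mu1, 1.0) for _ in range(n_per_component)]
    second = [rng.gauss(mu2, 1.0) for _ in range(n_per_component)]
    return first + second


def em_two_normals(data, mu1, mu2, sigma1=1.0, sigma2=1.0, p=0.5, n_iter=200):
    """Plain EM for p * N(mu1, sigma1^2) + (1 - p) * N(mu2, sigma2^2)."""
    n = len(data)
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        w = []
        for x in data:
            a = p * normal_pdf(x, mu1, sigma1)
            b = (1.0 - p) * normal_pdf(x, mu2, sigma2)
            w.append(a / (a + b))
        # M-step: weighted means, standard deviations and mixing proportion.
        s1 = sum(w)
        s2 = n - s1
        mu1 = sum(wi * x for wi, x in zip(w, data)) / s1
        mu2 = sum((1.0 - wi) * x for wi, x in zip(w, data)) / s2
        sigma1 = math.sqrt(sum(wi * (x - mu1) ** 2 for wi, x in zip(w, data)) / s1)
        sigma2 = math.sqrt(sum((1.0 - wi) * (x - mu2) ** 2 for wi, x in zip(w, data)) / s2)
        p = s1 / n
    return p, mu1, sigma1, mu2, sigma2
```

A typical call would be `em_two_normals(sample_mixture(300, 0.0, 5.0, seed=0), mu1=-1.0, mu2=6.0)`. Plain EM of this kind yields no standard errors by itself, which is one of the gaps the abstract says the hybrid algorithm fills.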
11.
This paper extends some of the work presented in Redner and Walker [1984] on the maximum likelihood estimate of parameters in a mixture model to a Bayesian modal estimate. The problem of determining the mode of the joint posterior distribution is discussed. Necessary conditions are given for a choice of parameters to be the mode and a numerical scheme based on the EM algorithm is presented. Some theoretical remarks on the resulting iterative scheme and simulation results are also given.
12.
In this article, we propose semiparametric methods to estimate the cumulative incidence function of two dependent competing risks for left-truncated and right-censored data. The proposed method is based on work by Huang and Wang (1995). We extend the previous model by allowing for a general parametric truncation distribution and a third competing risk before recruitment. Based on work by Vardi (1989), several iterative algorithms are proposed to obtain the semiparametric estimates of cumulative incidence functions. The asymptotic properties of the semiparametric estimators are derived. Simulation results show that a semiparametric approach, assuming the parametric truncation distribution is correctly specified, produces estimates with smaller mean squared error than those obtained in a fully nonparametric model.
13.
A nonparametric method based on the empirical likelihood is proposed to detect a change-point in the coefficients of linear regression models. The empirical likelihood ratio test statistic is proved to have the same asymptotic null distribution as that with classical parametric likelihood. Under some mild conditions, the maximum empirical likelihood change-point estimator is also shown to be consistent. The simulation results show the sensitivity and robustness of the proposed approach. The method is applied to some real datasets to illustrate its effectiveness.
14.
Mahdi Teimouri. Journal of Applied Statistics, 2021, 48(7):1154
Grouped data are frequently used in several fields of study. In this work, we use the expectation-maximization (EM) algorithm for fitting the skew-normal (SN) mixture model to grouped data. Implementing the EM algorithm requires computing one-dimensional integrals for each group or class. Our simulation study and real data analyses reveal that the EM algorithm not only always converges but also can be implemented in just a few seconds even when the number of components is large, in contrast to the Bayesian paradigm, which is computationally expensive. The accuracy of the EM algorithm and the superiority of the SN mixture model over the traditional normal mixture model in modelling grouped data are demonstrated through the simulation and three real data illustrations. For implementing the EM algorithm, we use the package ForestFit, developed for the R environment and available at https://cran.r-project.org/web/packages/ForestFit/index.html.
15.
Communications in Statistics - Theory and Methods, 2012, 41(1):78-87
In this article, we revisit the problem of fitting a mixture model under the assumption that the mixture components are symmetric and log-concave. To this end, we first study the nonparametric maximum likelihood estimation (MLE) of a monotone log-concave probability density. To fit the mixture model, we propose a semiparametric EM (SEM) algorithm, which can be adapted to other semiparametric mixture models. In our numerical experiments, we compare our algorithm to that of Balabdaoui and Doss (2018, Inference for a two-component mixture of symmetric distributions under log-concavity. Bernoulli 24 (2):1053–71) and other mixture models on both simulated and real-world datasets.
16.
It is well known that the normal mixture with unequal variances has an unbounded likelihood and thus the corresponding global maximum likelihood estimator (MLE) is undefined. One commonly used solution is to put a constraint on the parameter space so that the likelihood is bounded; one can then run the EM algorithm on this constrained parameter space to find the constrained global MLE. However, choosing the constraint parameter is a difficult issue and in many cases different choices may give different constrained global MLEs. In this article, we propose a profile log likelihood method and a graphical way to find the maximum interior mode. Based on our proposed method, we can also see how the constraint parameter, used in the constrained EM algorithm, affects the constrained global MLE. Using two simulation examples and a real data application, we demonstrate the success of our new method in solving the unboundedness of the mixture likelihood and locating the maximum interior mode.
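The unboundedness described in the abstract above is easy to reproduce numerically: centre one component exactly on a data point and shrink its standard deviation, and the log-likelihood keeps growing. The sketch below only illustrates that degeneracy and does not implement the profile log likelihood method of the abstract; the data values and function names are made up for illustration.

```python
import math


def log_normal_pdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def mixture_loglik(data, p, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of p * N(mu1, sigma1^2) + (1 - p) * N(mu2, sigma2^2)."""
    total = 0.0
    for x in data:
        a = math.log(p) + log_normal_pdf(x, mu1, sigma1)
        b = math.log(1.0 - p) + log_normal_pdf(x, mu2, sigma2)
        m = max(a, b)
        # Log-sum-exp keeps the sum stable when one term underflows.
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total


def degenerate_path(data, sigmas):
    """Log-likelihoods with component 1 centred on data[0] and sigma1 shrinking."""
    mean = sum(data) / len(data)
    return [mixture_loglik(data, 0.5, data[0], s, mean, 1.0) for s in sigmas]


# Illustrative (made-up) sample and a sequence of shrinking standard deviations.
toy_data = [-1.2, -0.4, 0.1, 0.7, 1.5, 2.3]
shrinking_sigmas = [1e-1, 1e-3, 1e-6, 1e-12]
```

Calling `degenerate_path(toy_data, shrinking_sigmas)` returns a strictly increasing sequence, roughly growing like the negative logarithm of the shrinking standard deviation, which is why a constraint or an interior-mode search is needed.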
17.
Mixture model-based clustering is widely used in many applications. In certain real-time applications the rapid increase of data size with time makes classical clustering algorithms too slow. An online clustering algorithm based on mixture models is presented in the context of a real-time flaw-diagnosis application for pressurized containers which uses data from acoustic emission signals. The proposed algorithm is a stochastic gradient algorithm derived from the classification version of the EM algorithm (CEM). It provides a model-based generalization of the well-known online k-means algorithm, able to handle non-spherical clusters. Using synthetic and real data sets, the proposed algorithm is compared with the batch CEM algorithm and the online EM algorithm. The three approaches generate comparable solutions in terms of the resulting partition when clusters are relatively well separated, but online algorithms become faster as the size of the available observations increases.
18.
Yu-Wen Wen. Communications in Statistics - Simulation and Computation, 2013, 42(9):1914-1929
The two-part model and Heckman's sample selection model are often used in economic studies which involve analyzing the demand for limited variables. This study proposed a simultaneous equation model (SEM) and used the expectation-maximization algorithm to obtain the maximum likelihood estimate. We then constructed a simulation to compare the performance of estimates of price elasticity using SEM with those estimates from the two-part model and the sample selection model. The simulation shows that the estimates of price elasticity by SEM are more precise than those by the sample selection model and the two-part model when the model includes limited independent variables. Finally, we analyzed a real example of cigarette consumption as an application. We found an increase in cigarette price associated with a decrease in both the propensity to consume cigarettes and the amount actually consumed.
19.
In this paper, we assume the number of competing causes to follow an exponentially weighted Poisson distribution. By assuming the initial number of competing causes can undergo destruction and that the population of interest has a cure fraction, we develop the EM algorithm for the determination of the MLEs of the model parameters of such a general cure model. This model is more flexible than the promotion time cure model and also provides an interesting and realistic interpretation of the biological mechanism of the occurrence of an event of interest. Instead of assuming a particular parametric distribution for the lifetime, we assume the lifetime to belong to the wider class of generalized gamma distributions. This allows us to carry out a model discrimination to select a parsimonious lifetime distribution that provides the best fit to the data. Within the EM framework, a two-way profile likelihood approach is proposed to estimate the shape parameters. An extensive Monte Carlo simulation study is carried out to demonstrate the performance of the proposed estimation method. Model discrimination is carried out by means of the likelihood ratio test and information-based methods. Finally, a data set on melanoma is analyzed for illustrative purposes.
20.
Improving the EM algorithm for mixtures (total citations: 1; self-citations: 0; citations by others: 1)
One of the estimating equations of the Maximum Likelihood Estimation method, for finite mixtures of the one-parameter exponential family, is the first moment equation. This can help considerably in reducing the labor and the cost of calculating the Maximum Likelihood estimates. In this paper it is shown that the EM algorithm can be substantially improved by using this result when applied to mixture models. A short discussion of other methods proposed for the calculation of the Maximum Likelihood estimates is also given, showing that the above findings can help in this direction too.
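The first moment equation mentioned above can be checked numerically for a Poisson mixture, a one-parameter exponential family case: after every standard EM update, the mixture mean Σ p_j λ_j equals the sample mean, because the responsibilities of each observation sum to one. The sketch below shows only this property of plain EM, not the improved algorithm of the paper; the counts and function names are assumptions made for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability mass function, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def em_step(counts, probs, lams):
    """One standard EM update for a finite Poisson mixture."""
    n = len(counts)
    k = len(probs)
    resp_sum = [0.0] * k   # sum of responsibilities per component
    resp_x = [0.0] * k     # responsibility-weighted sum of observations
    for x in counts:
        dens = [probs[j] * poisson_pmf(x, lams[j]) for j in range(k)]
        total = sum(dens)
        for j in range(k):
            w = dens[j] / total
            resp_sum[j] += w
            resp_x[j] += w * x
    new_probs = [s / n for s in resp_sum]
    new_lams = [resp_x[j] / resp_sum[j] for j in range(k)]
    return new_probs, new_lams


def mixture_mean(probs, lams):
    """First moment of the fitted mixture."""
    return sum(p * lam for p, lam in zip(probs, lams))


# Illustrative (made-up) counts and starting values.
toy_counts = [0, 1, 1, 2, 0, 3, 7, 8, 6, 9, 1, 2, 10, 5, 0, 7]
start_probs, start_lams = [0.5, 0.5], [1.0, 6.0]
```

Because the constraint already holds along the whole EM path, one parameter can in principle be eliminated through the moment equation, which is the kind of saving the abstract alludes to.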