1.

Ascent-based Monte Carlo expectation–maximization

Brian S. Caffo Wolfgang Jank Galin L. Jones 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(2):235-251

Summary. The expectation–maximization (EM) algorithm is a popular tool for maximizing likelihood functions in the presence of missing data. Unfortunately, EM often requires the evaluation of analytically intractable and high dimensional integrals. The Monte Carlo EM (MCEM) algorithm is the natural extension of EM that employs Monte Carlo methods to estimate the relevant integrals. Typically, a very large Monte Carlo sample size is required to estimate these integrals within an acceptable tolerance when the algorithm is near convergence. Even if this sample size were known at the onset of implementation of MCEM, its use throughout all iterations is wasteful, especially when accurate starting values are not available. We propose a data-driven strategy for controlling Monte Carlo resources in MCEM. The algorithm proposed improves on similar existing methods by recovering EM's ascent (i.e. likelihood increasing) property with high probability, being more robust to the effect of user-defined inputs and handling classical Monte Carlo and Markov chain Monte Carlo methods within a common framework. Because of the first of these properties we refer to the algorithm as 'ascent-based MCEM'. We apply ascent-based MCEM to a variety of examples, including one where it is used to accelerate the convergence of deterministic EM dramatically.
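To make the ascent idea concrete, here is a minimal Python sketch on a toy problem that is not taken from the paper: exponential lifetimes with right censoring. It is a simplification under stated assumptions, not the authors' algorithm. The function name `ascent_mcem`, the doubling rule for the sample size, the stopping tolerance and the choice to redraw the Monte Carlo sample (rather than append to it) are all illustrative. A proposed update is accepted only when a normal-approximation lower bound on the estimated increase in the Q-function is positive; otherwise the Monte Carlo sample size is increased.

```python
import math
import random


def ascent_mcem(y, censored, lam=1.0, m=20, z=1.645, tol=1e-3,
                max_iter=60, m_max=20_000, seed=0):
    """Toy ascent-style MCEM for the rate of censored exponential lifetimes.

    y[i] is the observed time (the censoring time when censored[i] is True).
    Assumes at least one observation is censored. All tuning constants are
    illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    n = len(y)
    n_cens = sum(censored)
    base = sum(y)
    for _ in range(max_iter):
        # Monte Carlo E-step: by memorylessness each censored lifetime is its
        # censoring time plus an Exp(lam) draw, so the imputed total lifetime
        # is base plus a Gamma(n_cens, scale=1/lam) draw.
        totals = [base + rng.gammavariate(n_cens, 1.0 / lam) for _ in range(m)]
        # M-step: maximize the Monte Carlo Q-function n*log(lam) - lam*mean(T).
        lam_new = n / (sum(totals) / m)
        # Per-draw change in the complete-data log-likelihood.
        dq = [n * math.log(lam_new / lam) - (lam_new - lam) * t for t in totals]
        mean = sum(dq) / m
        var = sum((d - mean) ** 2 for d in dq) / (m - 1)
        se = math.sqrt(var / m)
        if mean + z * se < tol:      # increase is negligible: stop
            return lam_new, m
        if mean - z * se > 0:        # increase is credible: accept the step
            lam = lam_new
        else:                        # step swamped by Monte Carlo error
            m = min(2 * m, m_max)
    return lam, m
```

In this toy model the exact maximum likelihood estimate is the number of uncensored observations divided by the sum of observed times, which gives a convenient check on the output of the sketch.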

2.

Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm

J. G. Booth & J. P. Hobert 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(1):265-285

Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
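The rejection sampling step described above can be illustrated with a small Python sketch. It assumes a single cluster of binary responses with a logistic link and one normal random intercept; this toy model, the function name `sample_random_effect` and its parameters are assumptions for illustration and do not come from the paper. A candidate is drawn from the marginal distribution of the random effect and accepted with probability equal to the conditional likelihood of the data divided by its supremum over the random effect.

```python
import math
import random


def expit(x):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def sample_random_effect(y, beta, sigma, n_draws, rng):
    """Draw from the conditional distribution of u given binary data y, where
    P(y_i = 1 | u) = expit(beta + u) and u ~ N(0, sigma^2) (toy model)."""
    k, s = len(y), sum(y)
    p_hat = s / k
    # Supremum of the likelihood p^s (1-p)^(k-s) over p, attained at p = s/k.
    tau = p_hat ** s * (1.0 - p_hat) ** (k - s)
    draws = []
    while len(draws) < n_draws:
        u = rng.gauss(0.0, sigma)            # candidate from the marginal
        p = expit(beta + u)
        lik = p ** s * (1.0 - p) ** (k - s)  # conditional likelihood of y given u
        if rng.random() * tau <= lik:        # accept with probability lik / tau
            draws.append(u)
    return draws
```

Because the accepted draws are independent and identically distributed, any E-step expectation can be approximated by a plain average over them, and its Monte Carlo error can be assessed with the usual central limit theorem.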

3.

A new REML (parameter expanded) EM algorithm for linear mixed models

S. M. Diffey A. B. Smith A. H. Welsh B. R. Cullis 《Australian & New Zealand Journal of Statistics》2017,59(4):433-448

Linear mixed models are regularly applied to animal and plant breeding data to evaluate genetic potential. Residual maximum likelihood (REML) is the preferred method for estimating variance parameters associated with this type of model. Typically an iterative algorithm is required for the estimation of variance parameters. Two algorithms which can be used for this purpose are the expectation‐maximisation (EM) algorithm and the parameter expanded EM (PX‐EM) algorithm. Both, particularly the EM algorithm, can be slow to converge when compared to a Newton‐Raphson type scheme such as the average information (AI) algorithm. The EM and PX‐EM algorithms require specification of the complete data, including the incomplete and missing data. We consider a new incomplete data specification based on a conditional derivation of REML. We illustrate the use of the resulting new algorithm through two examples: a sire model for lamb weight data and a balanced incomplete block soybean variety trial. In the cases where the AI algorithm failed, a REML PX‐EM based on the new incomplete data specification converged in 28% to 30% fewer iterations than the alternative REML PX‐EM specification. For the soybean example a REML EM algorithm using the new specification converged in fewer iterations than the current standard specification of a REML PX‐EM algorithm. The new specification integrates linear mixed models, Henderson's mixed model equations, REML and the REML EM algorithm into a cohesive framework.

4.

A SEMIPARAMETRIC MIXTURE MODEL FOR THE ANALYSIS OF COMPETING RISKS DATA

Anthony Y.C. Kuk 《Australian & New Zealand Journal of Statistics》1992,34(2):169-180

This paper deals with the regression analysis of failure time data when there are censoring and multiple types of failures. We propose a semiparametric generalization of a parametric mixture model of Larson & Dinse (1985), for which the marginal probabilities of the various failure types are logistic functions of the covariates. Given the type of failure, the conditional distribution of the time to failure follows a proportional hazards model. A marginal likelihood approach to estimating regression parameters is suggested, whereby the baseline hazard functions are eliminated as nuisance parameters. The Monte Carlo method is used to approximate the marginal likelihood; the resulting function is maximized easily using existing software. Some guidelines for choosing the number of Monte Carlo replications are given. Fixing the regression parameters at their estimated values, the full likelihood is maximized via an EM algorithm to estimate the baseline survivor functions. The methods suggested are illustrated using the Stanford heart transplant data.

5.

Stochastic approximation Monte Carlo EM for change-point analysis

Hwa Kyung Lim Jaejun Lee 《Journal of Statistical Computation and Simulation》2017,87(1):69-87

In the expectation–maximization (EM) algorithm for maximum likelihood estimation from incomplete data, Markov chain Monte Carlo (MCMC) methods have been used in change-point inference for a long time when the expectation step is intractable. However, the conventional MCMC algorithms tend to get trapped in a local mode when simulating from the posterior distribution of change points. To overcome this problem, in this paper we propose a stochastic approximation Monte Carlo version of EM (SAMCEM), which is a combination of adaptive Markov chain Monte Carlo and EM utilizing a maximum likelihood method. SAMCEM is compared with the stochastic approximation version of EM and reversible jump Markov chain Monte Carlo version of EM on simulated and real datasets. The numerical results indicate that SAMCEM can outperform the other two methods by producing much more accurate parameter estimates and by obtaining change-point positions and estimates simultaneously.

6.

Local influence for generalized linear mixed models

Hong‐Tu Zhu Sik‐Yum Lee 《Revue canadienne de statistique》2003,31(3):293-309

The authors describe a method for assessing model inadequacy in maximum likelihood estimation of a generalized linear mixed model. They treat the latent random effects in the model as missing data and develop the influence analysis on the basis of a Q‐function which is associated with the conditional expectation of the complete‐data log‐likelihood function in the EM algorithm. They propose a procedure to detect influential observations in six model perturbation schemes. They also illustrate their methodology in a hypothetical situation and in two real cases.

7.

Full information maximum likelihood estimation in factor analysis with a large number of missing values

《Journal of Statistical Computation and Simulation》2012,82(1):91-104

We consider the problem of full information maximum likelihood (FIML) estimation in factor analysis when a majority of the data values are missing. The expectation–maximization (EM) algorithm is often used to find the FIML estimates, in which the missing values on manifest variables are included in complete data. However, the ordinary EM algorithm has an extremely high computational cost. In this paper, we propose a new algorithm that is based on the EM algorithm but that efficiently computes the FIML estimates. A significant improvement in the computational speed is realized by not treating the missing values on manifest variables as a part of complete data. When there are many missing data values, it is not clear if the FIML procedure can achieve good estimation accuracy. In order to investigate this, we conduct Monte Carlo simulations under a wide variety of sample sizes.

8.

Fitting finite mixture models using iterative Monte Carlo classification

Jing Xu Jun Ma 《Communications in Statistics - Theory and Methods》2017,46(13):6684-6693

Parameters of a finite mixture model are often estimated by the expectation–maximization (EM) algorithm where the observed data log-likelihood function is maximized. This paper proposes an alternative approach for fitting finite mixture models. Our method, called the iterative Monte Carlo classification (IMCC), is also an iterative fitting procedure. Within each iteration, it first estimates the membership probabilities for each data point, namely the conditional probability of a data point belonging to a particular mixing component given the observed value of that data point. It then classifies each data point into a component distribution using the estimated conditional probabilities and the Monte Carlo method. It finally updates the parameters of each component distribution based on the classified data. Simulation studies were conducted to compare IMCC with some other algorithms for fitting mixtures of normal densities and mixtures of t densities.
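The three steps within each iteration can be sketched in Python for a univariate normal mixture. This is a loose illustration of the idea in the abstract, not the authors' implementation; the function name `imcc`, the fixed number of iterations, and the rule of leaving a component unchanged when fewer than two points are assigned to it are all assumptions.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def imcc(data, mus, sds, weights, n_iter=50, seed=0):
    """Sketch of iterative Monte Carlo classification for a normal mixture."""
    rng = random.Random(seed)
    mus, sds, weights = list(mus), list(sds), list(weights)
    k = len(mus)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x in data:
            # Step 1: unnormalized membership probabilities of x.
            dens = [w * normal_pdf(x, mu, sd)
                    for w, mu, sd in zip(weights, mus, sds)]
            if sum(dens) <= 0.0:           # guard against numerical underflow
                dens = [1.0] * k
            # Step 2: classify x by drawing a label from those probabilities.
            j = rng.choices(range(k), weights=dens)[0]
            groups[j].append(x)
        # Step 3: update each component from the points classified into it.
        for j, g in enumerate(groups):
            if len(g) >= 2:                # assumption: otherwise keep old values
                mus[j] = statistics.fmean(g)
                sds[j] = max(statistics.stdev(g), 1e-6)
                weights[j] = len(g) / len(data)
    return mus, sds, weights
```

Unlike EM, the labels are sampled rather than averaged over, so the parameter sequence fluctuates from iteration to iteration instead of converging to a fixed point; averaging the later iterates is one possible way to summarize it.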

9.

Imputation for statistical inference with coarse data

Jae Kwang Kim Minki Hong 《Revue canadienne de statistique》2012,40(3):604-618

Coarse data is a general type of incomplete data that includes grouped data, censored data, and missing data. The likelihood‐based estimation approach with coarse data is challenging because the likelihood function is in integral form. The Monte Carlo EM algorithm of Wei & Tanner [Wei & Tanner (1990). Journal of the American Statistical Association, 85, 699–704] is adapted to compute the maximum likelihood estimator in the presence of coarse data. Stochastic coarse data is also covered and the computation can be implemented using the parametric fractional imputation method proposed by Kim [Kim (2011). Biometrika, 98, 119–132]. Results from a limited simulation study are presented. The proposed method is also applied to the Korean Longitudinal Study of Aging (KLoSA). The Canadian Journal of Statistics 40: 604–618; 2012 © 2012 Statistical Society of Canada

10.

Statistical inference for discretely observed Markov jump processes

Mogens Bladt Michael Sørensen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2005,67(3):395-410

Summary. Likelihood inference for discretely observed Markov jump processes with finite state space is investigated. The existence and uniqueness of the maximum likelihood estimator of the intensity matrix are investigated. This topic is closely related to the imbedding problem for Markov chains. It is demonstrated that the maximum likelihood estimator can be found either by the EM algorithm or by a Markov chain Monte Carlo procedure. When the maximum likelihood estimator does not exist, an estimator can be obtained by using a penalized likelihood function or by the Markov chain Monte Carlo procedure with a suitable prior. The methodology and its implementation are illustrated by examples and simulation studies.

11.

Estimation and Properties of a Time-Varying EGARCH(1,1) in Mean Model

Sofia Anyfantaki Antonis Demos 《Econometric Reviews》2016,35(2):293-310

Time-varying GARCH-M models are commonly employed in econometrics and financial economics. Yet the recursive nature of the conditional variance makes likelihood analysis of these models computationally infeasible. This article outlines the issues and suggests employing a Markov chain Monte Carlo algorithm which allows the calculation of a classical estimator via the simulated EM algorithm or a simulated Bayesian solution in only O(T) computational operations, where T is the sample size. Furthermore, the theoretical dynamic properties of a time-varying-parameter EGARCH(1,1)-M are derived. We discuss them and apply the suggested Bayesian estimation to three major stock markets.

12.

A Simple Solution to Bayesian Mixture Labeling

Weixin Yao 《Communications in Statistics - Simulation and Computation》2013,42(4):800-813

The label-switching problem is one of the fundamental problems in Bayesian mixture analysis. Using all the Markov chain Monte Carlo samples as initial values for the expectation-maximization (EM) algorithm, we propose to label the samples based on the modes they converge to. Our method is based on the assumption that samples that converge to the same mode have the same labels. If a relatively noninformative prior is used or the sample size is large, the posterior will be close to the likelihood and then the posterior modes can be located approximately by the EM algorithm for mixture likelihood, without assuming the availability of the closed form of the posterior. In order to speed up the computation of this labeling method, we also propose to first cluster the samples by K-means with a large number of clusters K. Then, by assuming that the samples within each cluster have the same labels, we only need to find one converged mode for each cluster. Using a Monte Carlo simulation study and a real dataset, we demonstrate the success of our new method in dealing with the label-switching problem.

13.

Pointwise and functional approximations in Monte Carlo maximum likelihood estimation

Kuk Anthony Y. C. Cheng Yuk W. 《Statistics and Computing》1999,9(2):91-99

We consider the use of Monte Carlo methods to obtain maximum likelihood estimates for random effects models and distinguish between the pointwise and functional approaches. We explore the relationship between the two approaches and compare them with the EM algorithm. The functional approach is more ambitious but the approximation is local in nature, which we demonstrate graphically using two simple examples. A remedy is to obtain successively better approximations of the relative likelihood function near the true maximum likelihood estimate. To save computing time, we use only one Newton iteration to approximate the maximiser of each Monte Carlo likelihood and show that this is equivalent to the pointwise approach. The procedure is applied to fit a latent process model to a set of polio incidence data. The paper ends with a comparison between the marginal likelihood and the recently proposed hierarchical likelihood, which avoids integration altogether.

14.

Parameter Estimation for Hidden Markov Models with Intractable Likelihoods

Thomas A. Dean Sumeetpal S. Singh Ajay Jasra Gareth W. Peters 《Scandinavian Journal of Statistics》2014,41(4):970-987

Approximate Bayesian computation (ABC) is a popular technique for analysing data for complex models where the likelihood function is intractable. It involves using simulation from the model to approximate the likelihood, with this approximate likelihood then being used to construct an approximate posterior. In this paper, we consider methods that estimate the parameters by maximizing the approximate likelihood used in ABC. We give a theoretical analysis of the asymptotic properties of the resulting estimator. In particular, we derive results analogous to those of consistency and asymptotic normality for standard maximum likelihood estimation. We also discuss how sequential Monte Carlo methods provide a natural method for implementing our likelihood‐based ABC procedures.

15.

Missing covariates in generalized linear models when the missing data mechanism is non-ignorable

J. G. Ibrahim S. R. Lipsitz & M.-H. Chen 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(1):173-190

We propose a method for estimating parameters in generalized linear models with missing covariates and a non-ignorable missing data mechanism. We use a multinomial model for the missing data indicators and propose a joint distribution for them which can be written as a sequence of one-dimensional conditional distributions, with each one-dimensional conditional distribution consisting of a logistic regression. We allow the covariates to be either categorical or continuous. The joint covariate distribution is also modelled via a sequence of one-dimensional conditional distributions, and the response variable is assumed to be completely observed. We derive the E- and M-steps of the EM algorithm with non-ignorable missing covariate data. For categorical covariates, we derive a closed form expression for the E- and M-steps of the EM algorithm for obtaining the maximum likelihood estimates (MLEs). For continuous covariates, we use a Monte Carlo version of the EM algorithm to obtain the MLEs via the Gibbs sampler. Computational techniques for Gibbs sampling are proposed and implemented. The parametric form of the assumed missing data mechanism itself is not 'testable' from the data, and thus the non-ignorable modelling considered here can be viewed as a sensitivity analysis concerning a more complicated model. Therefore, although a model may have 'passed' the tests for a certain missing data mechanism, this does not mean that we have captured, even approximately, the correct missing data mechanism. Hence, model checking for the missing data mechanism and sensitivity analyses play an important role in this problem and are discussed in detail. Several simulations are given to demonstrate the methodology. In addition, a real data set from a melanoma cancer clinical trial is presented to illustrate the methods proposed.

16.

Publication bias and meta-analysis for 2×2 tables: an average Markov chain Monte Carlo EM algorithm

Jian Qing Shi John Copas 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2002,64(2):221-236

Summary. A major difficulty in meta-analysis is publication bias. Studies with positive outcomes are more likely to be published than studies reporting negative or inconclusive results. Correcting for this bias is not possible without making untestable assumptions. In this paper, a sensitivity analysis is discussed for the meta-analysis of 2×2 tables using exact conditional distributions. A Markov chain Monte Carlo EM algorithm is used to calculate maximum likelihood estimates. A rule for increasing the accuracy of estimation and automating the choice of the number of iterations is suggested.

17.

A computational framework for empirical Bayes inference

Yves F. Atchadé 《Statistics and Computing》2011,21(4):463-473

In empirical Bayes inference one is typically interested in sampling from the posterior distribution of a parameter with a hyper-parameter set to its maximum likelihood estimate. This is often problematic, particularly when the likelihood function of the hyper-parameter is not available in closed form and the posterior distribution is intractable. Previous works have dealt with this problem using a multi-step approach based on the EM algorithm and Markov Chain Monte Carlo (MCMC). We propose a framework based on recent developments in adaptive MCMC, where this problem is addressed more efficiently using a single Monte Carlo run. We discuss the convergence of the algorithm and its connection with the EM algorithm. We apply our algorithm to the Bayesian Lasso of Park and Casella (J. Am. Stat. Assoc. 103:681–686, 2008) and to the empirical Bayes variable selection of George and Foster (J. Am. Stat. Assoc. 87:731–747, 2000).

18.

Analysis of generalized linear mixed models via a stochastic approximation algorithm with Markov chain Monte-Carlo method

Zhu Hong-Tu Lee Sik-Yum 《Statistics and Computing》2002,12(2):175-183

In recent years much effort has been devoted to maximum likelihood estimation of generalized linear mixed models. Most of the existing methods use the EM algorithm, with various techniques for handling the intractable E-step. In this paper, a new implementation of a stochastic approximation algorithm with a Markov chain Monte Carlo method is investigated. The proposed algorithm is computationally straightforward and its convergence is guaranteed. A simulation and three real data sets, including the challenging salamander data, are used to illustrate the procedure and to compare it with some existing methods. The results indicate that the proposed algorithm is an attractive alternative for problems with a large number of random effects or with high dimensional intractable integrals in the likelihood function.

19.

Statistical inference for discrete middle-censored data

Nasser Davarzani Ahmad Parsian 《Journal of statistical planning and inference》2011,141(4):1455-1462

In this paper we consider discrete middle censoring where the lifetime, the lower bound and the length of the censoring interval are geometrically distributed random variables. We obtain the likelihood function of the observed data and derive the MLE of the unknown parameter using the EM algorithm. We also obtain the Bayes estimator of the unknown parameter under the squared error loss (SEL) function and a credible interval for the unknown parameter using Monte Carlo methods.

20.

An automated (Markov chain) Monte Carlo EM algorithm

《Journal of Statistical Computation and Simulation》2012,82(5):349-360

We present an automated Monte Carlo EM (MCEM) algorithm which efficiently assesses Monte Carlo error in the presence of dependent Monte Carlo, particularly Markov chain Monte Carlo, E-step samples and chooses an appropriate Monte Carlo sample size to minimize this Monte Carlo error with respect to progressive EM step estimates. Monte Carlo error is gauged through an application of the central limit theorem during renewal periods of the MCMC sampler used in the E-step. The resulting normal approximation allows us to construct a rigorous and adaptive rule for updating the Monte Carlo sample size at each iteration of the MCEM algorithm. We illustrate our automated routine and compare the performance with competing MCEM algorithms in an analysis of a data set fit by a generalized linear mixed model.