Similar Literature
20 similar documents found.
1.
An automated (Markov chain) Monte Carlo EM algorithm
We present an automated Monte Carlo EM (MCEM) algorithm which efficiently assesses Monte Carlo error in the presence of dependent Monte Carlo, particularly Markov chain Monte Carlo, E-step samples and chooses an appropriate Monte Carlo sample size to minimize this Monte Carlo error with respect to progressive EM step estimates. Monte Carlo error is gauged through an application of the central limit theorem during renewal periods of the MCMC sampler used in the E-step. The resulting normal approximation allows us to construct a rigorous and adaptive rule for updating the Monte Carlo sample size at each iteration of the MCEM algorithm. We illustrate our automated routine and compare the performance with competing MCEM algorithms in an analysis of a data set fit by a generalized linear mixed model.
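
The sketch below is a minimal illustration of the general idea only: the E-step is approximated by Monte Carlo, the Monte Carlo standard error of the parameter update is estimated with a normal approximation, and the sample size is increased whenever the EM step is swamped by Monte Carlo noise. The abstract's method gauges that error through renewal periods of an MCMC sampler; this toy uses independent importance-sampling draws instead, and the model, the factor 2, and the growth rule m <- m + m//3 are all assumptions for illustration.

# Toy latent-variable model: z_i ~ N(mu, 1), y_i | z_i ~ Poisson(exp(z_i)); estimate mu by MCEM.
import numpy as np

rng = np.random.default_rng(0)
n, true_mu = 200, 1.0
z_true = rng.normal(true_mu, 1.0, n)
y = rng.poisson(np.exp(z_true))

def mcem_update(mu_t, m):
    """One MCEM iteration: return the updated mu and an estimate of its Monte Carlo std. error."""
    # Importance-sampling E-step: proposal z ~ N(mu_t, 1) (the latent-variable prior),
    # weights proportional to the Poisson likelihood p(y_i | z).
    z = rng.normal(mu_t, 1.0, size=(m, n))              # m draws per observation
    logw = y * z - np.exp(z)                             # Poisson log-likelihood (up to a constant)
    w = np.exp(logw - logw.max(axis=0))
    w /= w.sum(axis=0)
    cond_mean = (w * z).sum(axis=0)                      # approx. E[z_i | y_i, mu_t]
    mu_new = cond_mean.mean()                            # M-step for this toy model
    # Rough Monte Carlo variance of mu_new via the effective sample size per observation
    ess = 1.0 / (w ** 2).sum(axis=0)
    cond_var = (w * (z - cond_mean) ** 2).sum(axis=0)
    mc_se = np.sqrt((cond_var / ess).sum()) / n
    return mu_new, mc_se

mu, m = np.log(y.mean() + 1.0), 100
for it in range(50):
    mu_new, mc_se = mcem_update(mu, m)
    if abs(mu_new - mu) < 2.0 * mc_se:   # EM step indistinguishable from Monte Carlo noise
        m += m // 3                      # -> increase the Monte Carlo sample size
    mu = mu_new
    if m > 20000:                        # crude stopping rule, for the sketch only
        break

print(f"estimated mu = {mu:.3f}, final Monte Carlo sample size m = {m}")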

2.
Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
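
A minimal sketch (toy setting, not the paper's code) of the first idea: drawing from the exact conditional distribution of a random intercept u given cluster data y by rejection sampling, with the marginal N(0, sigma^2) as the candidate. For Bernoulli responses the conditional likelihood f(y | u) is a product of probabilities and hence bounded by 1, which gives a valid though crude acceptance bound; the method in the abstract uses the supremum of f(y | u) over u instead, which yields a higher acceptance rate.

import numpy as np
from scipy.special import expit

rng = np.random.default_rng(1)

def sample_conditional_u(y, beta, sigma, n_draws, max_tries=200000):
    """Rejection sampling from f(u | y) for a logistic random-intercept cluster."""
    draws, tries = [], 0
    while len(draws) < n_draws and tries < max_tries:
        u = rng.normal(0.0, sigma)                  # candidate from the marginal of u
        p = expit(beta + u)
        loglik = np.sum(y * np.log(p) + (1 - y) * np.log1p(-p))
        if np.log(rng.uniform()) < loglik:          # accept with prob f(y | u) (envelope = 1)
            draws.append(u)
        tries += 1
    return np.array(draws)

# Toy cluster: 5 Bernoulli responses generated with true random intercept u = 0.7
beta, sigma, u_true = -0.5, 1.0, 0.7
y = rng.binomial(1, expit(beta + u_true), size=5)
u_samples = sample_conditional_u(y, beta, sigma, n_draws=1000)
print(f"posterior mean of u given y: {u_samples.mean():.3f} ({len(u_samples)} accepted draws)")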

3.
The expectation-maximization (EM) method facilitates computation of maximum likelihood (ML) and maximum penalized likelihood (MPL) solutions. The procedure requires specification of unobservable complete data which augment the measured or incomplete data. This specification defines a conditional expectation of the complete-data log-likelihood function which is computed in the E-step. The EM algorithm is most effective when maximizing the function Q(θ) defined in the E-step is easier than maximizing the likelihood function.

The Monte Carlo EM (MCEM) algorithm of Wei & Tanner (1990) was introduced for problems where computation of Q is difficult or intractable. However, Monte Carlo can be computationally expensive, e.g. in signal processing applications involving large numbers of parameters. We provide another approach: a modification of the standard EM algorithm avoiding computation of conditional expectations.
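
For reference, a worked statement of the quantities involved (the notation is ours, not the paper's): the E-step function and its Monte Carlo approximation in the Wei & Tanner MCEM are

Q(\theta \mid \theta^{(t)}) = \mathrm{E}\left[\log L_c(\theta; y, z) \mid y, \theta^{(t)}\right],
\qquad
\hat{Q}_m(\theta \mid \theta^{(t)}) = \frac{1}{m}\sum_{k=1}^{m} \log L_c(\theta; y, z^{(k)}),

where L_c is the complete-data likelihood and z^{(1)},\ldots,z^{(m)} are draws from p(z \mid y, \theta^{(t)}); the M-step then maximizes \hat{Q}_m in place of Q.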

4.
Clustered binary data are common in medical research and can be fitted with a logistic regression model with random effects, which belongs to a wider class of models called generalized linear mixed models. Likelihood-based estimation of the model parameters often has to handle intractable integration, which has led to several estimation methods designed to overcome this difficulty. The penalized quasi-likelihood (PQL) method is popular and computationally efficient in most cases. The expectation–maximization (EM) algorithm yields maximum-likelihood estimates, but requires computing possibly intractable integrals in the E-step. Variants of the EM algorithm for evaluating the E-step are introduced: the Monte Carlo EM (MCEM) method approximates the expectation using Monte Carlo samples, while the Modified EM (MEM) method approximates the expectation using Laplace's method. All of these methods involve several layers of approximation, so the corresponding parameter estimates contain errors (large or small) induced by the approximation. Understanding and quantifying this discrepancy theoretically is difficult because of the complexity of the approximations in each method, even when the focus is restricted to clustered binary data. As an alternative competing computational method, we also consider a non-parametric maximum-likelihood (NPML) method. We review and compare the PQL, MCEM, MEM and NPML methods for clustered binary data via a simulation study, which will be useful for researchers when choosing an estimation method for their analysis.
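
The sketch below illustrates the kind of approximation the Laplace-based (MEM) route relies on, in a deliberately simplified one-dimensional toy: the cluster-level integral of a logistic random-intercept model, \int f(y | u) N(u; 0, sigma^2) du, approximated by Laplace's method and checked against plain Monte Carlo. The names, the one-cluster setting, and the choice of checking by brute force are assumptions for illustration, not the paper's procedure.

import numpy as np
from scipy.special import expit
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(6)
beta, sigma = 0.3, 1.2
y = rng.binomial(1, expit(beta + rng.normal(0.0, sigma)), size=6)   # one toy cluster
s, k = y.sum(), y.size

def neg_h(u):
    # -(log f(y | u) + Gaussian log-kernel of u); the normalizing constant is added back later
    p = expit(beta + u)
    return -(s * np.log(p) + (k - s) * np.log1p(-p) - 0.5 * u ** 2 / sigma ** 2)

opt = minimize_scalar(neg_h)                                        # mode of the integrand
u_hat, p_hat = opt.x, expit(beta + opt.x)
h2 = -k * p_hat * (1.0 - p_hat) - 1.0 / sigma ** 2                  # h''(u_hat), analytic
laplace = np.exp(-opt.fun) * np.sqrt(2.0 * np.pi / -h2) / np.sqrt(2.0 * np.pi * sigma ** 2)

u_mc = rng.normal(0.0, sigma, size=200000)                          # brute-force Monte Carlo check
p_mc = expit(beta + u_mc)
mc = np.mean(p_mc ** s * (1.0 - p_mc) ** (k - s))
print(f"Laplace approximation: {laplace:.5f}   plain Monte Carlo: {mc:.5f}")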

5.
Empirical Bayes spatial prediction using a Monte Carlo EM algorithm
This paper deals with an empirical Bayes approach for spatial prediction of a Gaussian random field. In fact, we estimate the hyperparameters of the prior distribution by using the maximum likelihood method. In order to maximize the marginal distribution of the data, the EM algorithm is used. Since this algorithm requires the evaluation of analytically intractable, high-dimensional integrals, a Monte Carlo method based on discretizing the parameter space is proposed to estimate the relevant integrals. Then, the approach is illustrated by its application to a spatial data set. Finally, we compare the predictive performance of this approach with the reference prior method.
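
A minimal sketch (toy one-dimensional example; the grid, model, and names are assumptions) of the basic device of Monte Carlo integration over a discretized parameter space: a posterior expectation is estimated from uniform draws on a finite grid, reweighted by the unnormalized posterior.

import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(1.5, 1.0, size=30)                  # toy data, y_i ~ N(phi, 1)

grid = np.linspace(-5.0, 5.0, 2001)                # discretized parameter space for phi
idx = rng.integers(0, grid.size, size=5000)        # uniform Monte Carlo draws over the grid
phi_draws = grid[idx]
logw = -0.5 * ((y[None, :] - phi_draws[:, None]) ** 2).sum(axis=1)   # flat-prior log posterior
w = np.exp(logw - logw.max())
w /= w.sum()

post_mean = np.sum(w * phi_draws)                  # Monte Carlo estimate of E[phi | y]
print(f"posterior mean of phi on the grid: {post_mean:.3f} (sample mean {y.mean():.3f})")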

6.
The lasso is a popular technique for simultaneous estimation and variable selection in many research areas. The marginal posterior mode of the regression coefficients is equivalent to the estimates given by the non-Bayesian lasso when the regression coefficients have independent Laplace priors. Because of its flexibility for statistical inference, the Bayesian approach has attracted a growing body of research in recent years. Current approaches either perform a fully Bayesian analysis using a Markov chain Monte Carlo (MCMC) algorithm or use Monte Carlo expectation maximization (MCEM) methods with an MCMC algorithm in each E-step. However, the MCMC-based Bayesian method carries a heavy computational burden and converges slowly. Tan et al. [An efficient MCEM algorithm for fitting generalized linear mixed models for correlated binary data. J Stat Comput Simul. 2007;77:929–943] proposed a non-iterative sampling approach, the inverse Bayes formula (IBF) sampler, for computing posteriors of a hierarchical model within the MCEM structure. Motivated by their paper, we develop this IBF sampler within the MCEM structure to obtain the marginal posterior mode of the regression coefficients for the Bayesian lasso, by adjusting the weights of importance sampling when the full conditional distribution is not explicit. Simulation experiments show that the computational time is much reduced with our expectation-maximization-based method, and that our method behaves comparably with other Bayesian lasso methods in both prediction accuracy and variable selection accuracy, and can even do better, especially when the sample size is relatively large.
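
A minimal sketch (toy data; the IBF sampler that is the paper's contribution is not shown) of the equivalence noted in the opening sentences: with independent Laplace priors on the coefficients and Gaussian errors, the posterior mode of beta solves the lasso problem minimize 0.5 * ||y - X beta||^2 + lam * ||beta||_1, computed here by coordinate descent with soft-thresholding.

import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 8
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 1.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for the lasso / Laplace-prior posterior mode."""
    beta = np.zeros(X.shape[1])
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            r_j = y - X @ beta + X[:, j] * beta[j]   # partial residual excluding feature j
            beta[j] = soft(X[:, j] @ r_j, lam) / col_ss[j]
    return beta

print("posterior mode / lasso estimate:", np.round(lasso_cd(X, y, lam=10.0), 3))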

7.
In the expectation–maximization (EM) algorithm for maximum likelihood estimation from incomplete data, Markov chain Monte Carlo (MCMC) methods have long been used in change-point inference when the expectation step is intractable. However, conventional MCMC algorithms tend to get trapped in local modes when simulating from the posterior distribution of change points. To overcome this problem, in this paper we propose a stochastic approximation Monte Carlo version of EM (SAMCEM), a combination of adaptive Markov chain Monte Carlo and EM that utilizes a maximum likelihood method. SAMCEM is compared with the stochastic approximation version of EM and a reversible jump Markov chain Monte Carlo version of EM on simulated and real datasets. The numerical results indicate that SAMCEM outperforms the other two methods, producing much more accurate parameter estimates and recovering change-point positions and parameter estimates simultaneously.
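
A minimal sketch (toy latent-variable model; a plain random-walk Metropolis step stands in for the adaptive MCMC sampler described in the abstract, and the model has no change points) of the stochastic approximation EM recursion that underlies this family of methods: the E-step is replaced by a single MCMC draw of the latent variables, and the sufficient statistic is updated with a decreasing gain sequence gamma_t.

import numpy as np

rng = np.random.default_rng(4)

# Toy model: z_i ~ N(mu, 1), y_i | z_i ~ Poisson(exp(z_i)); estimate mu.
n, true_mu = 300, 0.8
z_sim = rng.normal(true_mu, 1.0, n)
y = rng.poisson(np.exp(z_sim))

def log_target(z, mu):                      # log p(z | y, mu) up to a constant, elementwise
    return y * z - np.exp(z) - 0.5 * (z - mu) ** 2

mu, s = 0.0, 0.0
z = np.zeros(n)                             # current MCMC state of the latent variables
for t in range(1, 2001):
    # One random-walk Metropolis sweep targeting p(z | y, mu)
    prop = z + rng.normal(scale=0.5, size=n)
    accept = np.log(rng.uniform(size=n)) < log_target(prop, mu) - log_target(z, mu)
    z = np.where(accept, prop, z)
    # Stochastic approximation update of the sufficient statistic, then the M-step
    gamma = 1.0 / t                          # decreasing gain sequence
    s = s + gamma * (z.mean() - s)
    mu = s                                   # complete-data MLE of mu is the mean of z

print(f"stochastic approximation EM estimate of mu: {mu:.3f} (true value {true_mu})")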

8.
This paper addresses the estimation of the unknown scale parameter of the half-logistic distribution based on a Type-I progressively hybrid censoring scheme. We evaluate the maximum likelihood estimate (MLE) via a numerical method and the EM algorithm, and also obtain the approximate maximum likelihood estimate (AMLE). We use a modified acceptance-rejection method to obtain the Bayes estimate and the corresponding highest posterior confidence intervals. We perform Monte Carlo simulations to compare the performance of the different methods, and we analyze one dataset for illustrative purposes.

9.
In recent years much effort has been devoted to maximum likelihood estimation of generalized linear mixed models. Most of the existing methods use the EM algorithm, with various techniques for handling the intractable E-step. In this paper, a new implementation of a stochastic approximation algorithm with a Markov chain Monte Carlo method is investigated. The proposed algorithm is computationally straightforward and its convergence is guaranteed. A simulation and three real data sets, including the challenging salamander data, are used to illustrate the procedure and to compare it with some existing methods. The results indicate that the proposed algorithm is an attractive alternative for problems with a large number of random effects or with high-dimensional intractable integrals in the likelihood function.

10.
In statistical models involving constrained or missing data, likelihoods containing integrals emerge. In the case of both constrained and missing data, the result is a ratio of integrals, which for multivariate data may defy exact or approximate analytic expression. Seeking maximum-likelihood estimates in such settings, we propose Monte Carlo approximants for these integrals, and subsequently maximize the resulting approximate likelihood. Iteration of this strategy expedites the maximization, while the Gibbs sampler is useful for the required Monte Carlo generation. As a result, we handle a class of models broader than the customary EM setting without using an EM-type algorithm. Implementation of the methodology is illustrated in two numerical examples.

11.
In empirical Bayes inference one is typically interested in sampling from the posterior distribution of a parameter with a hyper-parameter set to its maximum likelihood estimate. This is often problematic particularly when the likelihood function of the hyper-parameter is not available in closed form and the posterior distribution is intractable. Previous works have dealt with this problem using a multi-step approach based on the EM algorithm and Markov chain Monte Carlo (MCMC). We propose a framework based on recent developments in adaptive MCMC, where this problem is addressed more efficiently using a single Monte Carlo run. We discuss the convergence of the algorithm and its connection with the EM algorithm. We apply our algorithm to the Bayesian Lasso of Park and Casella (J. Am. Stat. Assoc. 103:681–686, 2008) and on the empirical Bayes variable selection of George and Foster (J. Am. Stat. Assoc. 87:731–747, 2000).

12.
We present a maximum likelihood estimation procedure for the multivariate frailty model. The estimation is based on a Monte Carlo EM algorithm. The expectation step is approximated by averaging over random samples drawn from the posterior distribution of the frailties using rejection sampling. The maximization step reduces to a standard partial likelihood maximization. We also propose a simple rule based on the relative change in the parameter estimates to decide on sample size in each iteration and a stopping time for the algorithm. An important new concept is acquiring absolute convergence of the algorithm through sample size determination and an efficient sampling technique. The method is illustrated using a rat carcinogenesis dataset and data on vase lifetimes of cut roses. The estimation results are compared with approximate inference based on penalized partial likelihood using these two examples. Unlike the penalized partial likelihood estimation, the proposed full maximum likelihood estimation method accounts for all the uncertainty while estimating standard errors for the parameters.

13.
We consider the use of Monte Carlo methods to obtain maximum likelihood estimates for random effects models and distinguish between the pointwise and functional approaches. We explore the relationship between the two approaches and compare them with the EM algorithm. The functional approach is more ambitious, but the approximation is local in nature, which we demonstrate graphically using two simple examples. A remedy is to obtain successively better approximations of the relative likelihood function near the true maximum likelihood estimate. To save computing time, we use only one Newton iteration to approximate the maximiser of each Monte Carlo likelihood and show that this is equivalent to the pointwise approach. The procedure is applied to fit a latent process model to a set of polio incidence data. The paper ends with a comparison between the marginal likelihood and the recently proposed hierarchical likelihood, which avoids integration altogether.
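
For orientation, one common form of the Monte Carlo relative log-likelihood used in the functional approach (our notation; the paper's exact construction may differ) is

\ell_m(\theta) = \log\left\{ \frac{1}{m}\sum_{k=1}^{m} \frac{f(y, z^{(k)}; \theta)}{f(y, z^{(k)}; \theta_0)} \right\},
\qquad z^{(1)},\ldots,z^{(m)} \sim p(z \mid y; \theta_0),

which estimates \log\{f(y;\theta)/f(y;\theta_0)\}; a single Newton iteration from \theta_0 then gives
\theta_1 = \theta_0 + \left[-\ell_m''(\theta_0)\right]^{-1} \ell_m'(\theta_0),
and iterating this with fresh draws at each \theta_t corresponds to the pointwise approach described above.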

14.
We consider the problem of full information maximum likelihood (FIML) estimation in factor analysis when a majority of the data values are missing. The expectation–maximization (EM) algorithm is often used to find the FIML estimates, in which the missing values on manifest variables are included in complete data. However, the ordinary EM algorithm has an extremely high computational cost. In this paper, we propose a new algorithm that is based on the EM algorithm but that efficiently computes the FIML estimates. A significant improvement in the computational speed is realized by not treating the missing values on manifest variables as a part of complete data. When there are many missing data values, it is not clear if the FIML procedure can achieve good estimation accuracy. In order to investigate this, we conduct Monte Carlo simulations under a wide variety of sample sizes.

15.
A special source of difficulty in statistical analysis is the possibility that some subjects may not have a complete observation of the response variable. Such incomplete observation of the response variable is called censoring. Censoring can occur for a variety of reasons, including limitations of measurement equipment, the design of the experiment, and non-occurrence of the event of interest before the end of the study. In the presence of censoring, the dependence of the response variable on the explanatory variables can be explored through regression analysis. In this paper, we examine the censoring problem in the context of a class of asymmetric distributions; that is, we propose a linear regression model with censored responses based on skew scale mixtures of normal distributions. We develop a Monte Carlo EM (MCEM) algorithm to perform maximum likelihood inference of the parameters in the proposed linear censored regression models with skew scale mixtures of normal distributions. The MCEM algorithm is discussed with an emphasis on the skew-normal, skew Student-t-normal, skew-slash and skew-contaminated normal distributions. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset.

16.
The lognormal distribution is quite commonly used as a lifetime distribution. Data arising from life-testing and reliability studies are often left truncated and right censored. Here, the EM algorithm is used to estimate the parameters of the lognormal model based on left truncated and right censored data. The maximization step of the algorithm is carried out by two alternative methods, one involving approximation using a Taylor series expansion (leading to an approximate maximum likelihood estimate) and the other based on the EM gradient algorithm (Lange, 1995). These two methods are compared based on Monte Carlo simulations. The Fisher scoring method for obtaining the maximum likelihood estimates exhibits convergence problems under this setup, except when the truncation percentage is small. The asymptotic variance-covariance matrix of the MLEs is derived by using the missing information principle (Louis, 1982), and the asymptotic confidence intervals for the scale and shape parameters are then obtained and compared with the corresponding bootstrap confidence intervals. Finally, some numerical examples are given to illustrate all the methods of inference developed here.
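
For reference, the generic EM gradient update of Lange (1995) mentioned above replaces the full M-step by one Newton step on the E-step function (notation assumed, written for a generic parameter vector):

\theta^{(t+1)} = \theta^{(t)} - \left[\nabla^2_{\theta}\, Q(\theta \mid \theta^{(t)})\right]^{-1} \nabla_{\theta}\, Q(\theta \mid \theta^{(t)}) \Big|_{\theta = \theta^{(t)}},

which avoids an inner maximization at each iteration while keeping the EM structure.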

17.
This paper deals with the regression analysis of failure time data when there are censoring and multiple types of failures. We propose a semiparametric generalization of a parametric mixture model of Larson & Dinse (1985), for which the marginal probabilities of the various failure types are logistic functions of the covariates. Given the type of failure, the conditional distribution of the time to failure follows a proportional hazards model. A marginal likelihood approach to estimating regression parameters is suggested, whereby the baseline hazard functions are eliminated as nuisance parameters. The Monte Carlo method is used to approximate the marginal likelihood; the resulting function is maximized easily using existing software. Some guidelines for choosing the number of Monte Carlo replications are given. Fixing the regression parameters at their estimated values, the full likelihood is maximized via an EM algorithm to estimate the baseline survivor functions. The methods suggested are illustrated using the Stanford heart transplant data.

18.
For linear regression models with non-normally distributed errors, the least squares estimate (LSE) will lose some efficiency compared to the maximum likelihood estimate (MLE). In this article, we propose a kernel density-based regression estimate (KDRE) that is adaptive to the unknown error distribution. The key idea is to approximate the likelihood function by using a nonparametric kernel density estimate of the error density based on some initial parameter estimate. The proposed estimate is shown to be asymptotically as efficient as the oracle MLE, which assumes the error density is known. In addition, we propose an EM-type algorithm to maximize the estimated likelihood function and show that the KDRE can be considered as an iterated weighted least squares estimate, which provides some insight into the adaptiveness of the KDRE to the unknown error distribution. Our Monte Carlo simulation studies show that, while comparable to the traditional LSE for normal errors, the proposed estimation procedure can have substantial efficiency gains for non-normal errors. Moreover, the efficiency gain can be achieved even for a small sample size.
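
A minimal sketch (toy data; direct numerical maximization is used here in place of the EM-type algorithm described in the abstract) of the key idea: estimate the error density by a Gaussian kernel density estimate of the OLS residuals, then choose the regression coefficients to maximize the resulting estimated log-likelihood.

import numpy as np
from scipy.stats import gaussian_kde
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.standard_t(df=3, size=n)       # heavy-tailed (non-normal) errors

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]        # initial parameter estimate
kde = gaussian_kde(y - X @ beta_ols)                   # nonparametric estimate of the error density

def neg_est_loglik(beta):
    # negative of the estimated log-likelihood built from the kernel density estimate
    return -np.sum(np.log(kde(y - X @ beta) + 1e-300))

beta_kdre = minimize(neg_est_loglik, beta_ols, method="Nelder-Mead").x
print("OLS :", np.round(beta_ols, 3))
print("KDRE:", np.round(beta_kdre, 3))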

19.
This article focuses on data analyses under the scenario of missing at random within discrete-time Markov chain models. The naive method, the nonlinear (NL) method, and the Expectation-Maximization (EM) algorithm are discussed. We extend the NL method into a Bayesian framework, using an adjusted rejection algorithm to sample the posterior distribution, and estimating the transition probabilities with a Monte Carlo algorithm. We compare the Bayesian nonlinear (BNL) method with the naive method and the EM algorithm under various missing rates, and comprehensively evaluate the estimators in terms of biases, variances, mean square errors, and coverage probabilities (CPs). Our simulation results show that the EM algorithm usually offers the smallest variances but the poorest CP, while the BNL method has smaller variances and better or similar CP compared to the naive method. When the missing rate is low (about 9%, MAR), the three methods are comparable. When the missing rate is high (about 25%, MAR), the BNL method overall performs slightly but consistently better than the naive method with respect to variances and CP. Data from a longitudinal study of stress level among caregivers of individuals with Alzheimer's disease are used to illustrate these methods.

20.
In this paper, a generalized partially linear model (GPLM) with missing covariates is studied, and a Monte Carlo EM (MCEM) algorithm with a penalized-spline (P-spline) technique is developed to estimate the regression coefficients and the nonparametric function, respectively. As classical model selection procedures such as Akaike's information criterion become invalid for the models we consider with incomplete data, some new model selection criteria for GPLMs with missing covariates are proposed under two different missingness mechanisms, namely missing at random (MAR) and missing not at random (MNAR). The most attractive point of our method is that it is rather general and can be extended, based on the EM algorithm, to various situations with missing observations; in particular, when no missing data are involved, our new model selection criteria reduce to the classical AIC. Therefore, we can not only compare models with missing observations under MAR/MNAR settings, but also compare missing-data models with complete-data models simultaneously. Theoretical properties of the proposed estimator, including consistency of the model selection criteria, are investigated. A simulation study and a real example are used to illustrate the proposed methodology.
