1.

A Monte Carlo comparison of the smoothing, scoring and EM algorithms for dispersion matrix estimation with incomplete growth curve data

《Journal of Statistical Computation and Simulation》2012,82(1-2):77-92

Incomplete growth curve data often result from missing or mistimed observations in a repeated measures design. Virtually all methods of analysis rely on the dispersion matrix estimates. A Monte Carlo simulation was used to compare three methods of estimation of dispersion matrices for incomplete growth curve data. The three methods were: 1) maximum likelihood estimation with a smoothing algorithm, which finds the closest positive semidefinite estimate of the pairwise estimated dispersion matrix; 2) a mixed effects model using the EM (expectation-maximization) algorithm; and 3) a mixed effects model with the scoring algorithm. The simulation included 5 dispersion structures, 20 or 40 subjects with 4 or 8 observations per subject, and 10% or 30% missing data. In all the simulations, the smoothing algorithm was the poorest estimator of the dispersion matrix. In most cases, there were no significant differences between the scoring and EM algorithms. The EM algorithm tended to be better than the scoring algorithm when the variances of the random effects were close to zero, especially for the simulations with 4 observations per subject and two random effects.

2.

Acceleration of Expectation-Maximization algorithm for length-biased right-censored data

Kwun Chuen Gary Chan 《Lifetime data analysis》2017,23(1):102-112

Vardi’s Expectation-Maximization (EM) algorithm is frequently used for computing the nonparametric maximum likelihood estimator of length-biased right-censored data, which does not admit a closed-form representation. The EM algorithm may converge slowly, particularly for heavily censored data. We studied two algorithms for accelerating the convergence of the EM algorithm, based on iterative convex minorant and Aitken’s delta squared process. Numerical simulations demonstrate that the acceleration algorithms converge more rapidly than the EM algorithm in terms of number of iterations and actual timing. The acceleration method based on a modification of Aitken’s delta squared performed the best under a variety of settings.
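
As a rough illustration of the Aitken delta-squared idea mentioned in this abstract, the sketch below accelerates a generic scalar fixed-point iteration x = g(x). It is not the paper's algorithm: the function names `plain_fixed_point` and `aitken_fixed_point` are hypothetical, and the map g(x) = cos(x) is only a stand-in for an EM update, which in practice acts on a parameter vector.

```python
import math


def plain_fixed_point(g, x0, tol=1e-12, max_iter=1000):
    """Iterate x <- g(x) until successive values differ by less than tol."""
    x = x0
    for step in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, step
        x = x_new
    return x, max_iter


def aitken_fixed_point(g, x0, tol=1e-12, max_iter=1000):
    """Apply Aitken's delta-squared extrapolation after every two plain steps."""
    x = x0
    for step in range(1, max_iter + 1):
        x1 = g(x)
        x2 = g(x1)
        denom = x2 - 2.0 * x1 + x  # second difference of the sequence
        if denom == 0.0:
            # The sequence has stopped changing; no extrapolation is possible.
            return x2, step
        x_new = x - (x1 - x) ** 2 / denom
        if abs(x_new - x) < tol:
            return x_new, step
        x = x_new
    return x, max_iter


if __name__ == "__main__":
    # Stand-in for an EM map: g(x) = cos(x), a slowly (linearly) converging iteration.
    print(plain_fixed_point(math.cos, 1.0))
    print(aitken_fixed_point(math.cos, 1.0))
```

Each accelerated step costs two evaluations of g, so a fair comparison counts evaluations rather than steps; for linearly convergent maps the extrapolated sequence typically still needs far fewer evaluations.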

3.

Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm

J. G. Booth & J. P. Hobert 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1999,61(1):265-285

Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.

4.

Parameter estimation of nonlinear mixed-effects models using first-order conditional linearization and the EM algorithm

Liyong Fu, Yuancai Lei, Ram P. Sharma 《Journal of applied statistics》2013,40(2):252-265

Nonlinear mixed-effects (NLME) models are flexible enough to handle repeated-measures data from various disciplines. In this article, we propose both maximum-likelihood and restricted maximum-likelihood estimations of NLME models using first-order conditional expansion (FOCE) and the expectation–maximization (EM) algorithm. The FOCE-EM algorithm implemented in the ForStat procedure SNLME is compared with the Lindstrom and Bates (LB) algorithm implemented in both the SAS macro NLINMIX and the S-Plus/R function nlme in terms of computational efficiency and statistical properties. Two real-world data sets, an orange tree data set and a Chinese fir (Cunninghamia lanceolata) data set, and a simulated data set were used for evaluation. FOCE-EM converged for all mixed models derived from the base model in the two real-world cases, while LB did not, especially for the models in which random effects are simultaneously considered in several parameters to account for between-subject variation. However, both algorithms had identical estimated parameters and fit statistics for the converged models. We therefore recommend using FOCE-EM in NLME models, particularly when convergence is a concern in model selection.

5.

A note on model selection using information criteria for general linear models estimated using REML

Arunas Petras Verbyla 《Australian & New Zealand Journal of Statistics》2019,61(1):39-50

It is common practice to compare the fit of non‐nested models using the Akaike (AIC) or Bayesian (BIC) information criteria. The basis of these criteria is the log‐likelihood evaluated at the maximum likelihood estimates of the unknown parameters. For the general linear model (and the linear mixed model, which is a special case), estimation is usually carried out using residual or restricted maximum likelihood (REML). However, for models with different fixed effects, the residual likelihoods are not comparable and hence information criteria based on the residual likelihood cannot be used. For model selection, it is often suggested that the models are refitted using maximum likelihood to enable the criteria to be used. The first aim of this paper is to highlight that both the AIC and BIC can be used for the general linear model by using the full log‐likelihood evaluated at the REML estimates. The second aim is to provide a derivation of the criteria under REML estimation. This aim is achieved by noting that the full likelihood can be decomposed into a marginal (residual) and conditional likelihood and this decomposition then incorporates aspects of both the fixed effects and variance parameters. Using this decomposition, the appropriate information criteria for model selection of models which differ in their fixed effects specification can be derived. An example is presented to illustrate the results and code is available for analyses using the ASReml‐R package.
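
For orientation, the textbook forms of the two criteria named in this abstract can be computed as in the sketch below. This is only the generic definition (AIC = -2·loglik + 2k, BIC = -2·loglik + k·ln n); the penalty terms derived in the paper for REML estimation may differ, and the helper name `information_criteria` and its arguments are hypothetical.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC) from a maximised log-likelihood.

    loglik   -- log-likelihood evaluated at the parameter estimates
    n_params -- number of estimated parameters, k (counting convention is an assumption)
    n_obs    -- number of observations, n
    """
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n_obs)
    return aic, bic


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(information_criteria(loglik=-120.5, n_params=4, n_obs=50))
```

Lower values indicate the preferred model under either criterion, and the comparison is only meaningful when the log-likelihoods being compared are on the same scale, which is the point the abstract makes about residual likelihoods.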

6.

On convergence of the EM algorithm and the Gibbs sampler

Sujit K. Sahu, Gareth O. Roberts 《Statistics and Computing》1999,9(1):55-64

In this article we investigate the relationship between the EM algorithm and the Gibbs sampler. We show that the approximate rate of convergence of the Gibbs sampler by Gaussian approximation is equal to that of the corresponding EM-type algorithm. This helps in implementing either of the algorithms as improvement strategies for one algorithm can be directly transported to the other. In particular, by running the EM algorithm we know approximately how many iterations are needed for convergence of the Gibbs sampler. We also obtain a result that under certain conditions, the EM algorithm used for finding the maximum likelihood estimates can be slower to converge than the corresponding Gibbs sampler for Bayesian inference. We illustrate our results in a number of realistic examples all based on the generalized linear mixed models.

7.

On the difference between ML and REML estimators in the modelling of multivariate longitudinal data

《Journal of statistical planning and inference》2005,134(1):194-205

A random effects model is examined in the multivariate setting where more than one characteristic is measured at each time point. ML and REML estimators are obtained under the restriction that the estimates of the variance matrices are at least positive semidefinite (p.s.d.). It is shown that REML has a greater probability of giving full-rank estimates of the variance component matrices but, as regards efficiency in estimating the location parameter, correct specification of the number of random effects is needed. In general, REML provides larger estimates of the variance of model parameters than ML.

8.

Simplex Mixed‐Effects Models for Longitudinal Proportional Data

Zhenguo Qiu, Peter X.‐K. Song, Ming Tan 《Scandinavian Journal of Statistics》2008,35(4):577-596

Continuous proportional outcomes are collected from many practical studies, where responses are confined within the unit interval (0,1). Utilizing Barndorff‐Nielsen and Jørgensen's simplex distribution, we propose a new type of generalized linear mixed‐effects model for longitudinal proportional data, where the expected value of proportion is directly modelled through a logit function of fixed and random effects. We establish statistical inference along the lines of Breslow and Clayton's penalized quasi‐likelihood (PQL) and restricted maximum likelihood (REML) in the proposed model. We derive the PQL/REML using the high‐order multivariate Laplace approximation, which gives satisfactory estimation of the model parameters. The proposed model and inference are illustrated by simulation studies and a data example. The simulation studies conclude that the fourth order approximate PQL/REML performs satisfactorily. The data example shows that Aitchison's technique of the normal linear mixed model for logit‐transformed proportional outcomes is not robust against outliers.

9.

An augmented data scoring algorithm for maximum likelihood

Jun Ma, H. Malcolm Hudson 《Communications in Statistics - Theory and Methods》2013,42(11):2761-2776

The expectation-maximization (EM) method facilitates computation of maximum likelihood (ML) and maximum penalized likelihood (MPL) solutions. The procedure requires specification of unobservable complete data which augment the measured or incomplete data. This specification defines a conditional expectation of the complete data log-likelihood function which is computed in the E-step. The EM algorithm is most effective when maximizing the function Q(θ) defined in the E-step is easier than maximizing the likelihood function.

The Monte Carlo EM (MCEM) algorithm of Wei & Tanner (1990) was introduced for problems where computation of Q is difficult or intractable. However, Monte Carlo can be computationally expensive, e.g. in signal processing applications involving large numbers of parameters. We provide another approach: a modification of the standard EM algorithm avoiding computation of conditional expectations.

10.

Likelihood inference for small variance components

Steven E. Stern, A. H. Welsh 《Revue canadienne de statistique》2000,28(3):517-532

The authors explore likelihood‐based methods for making inferences about the components of variance in a general normal mixed linear model. In particular, they use local asymptotic approximations to construct confidence intervals for the components of variance when the components are close to the boundary of the parameter space. In the process, they explore the question of how to profile the restricted likelihood (REML). Also, they show that general REML estimates are less likely to fall on the boundary of the parameter space than maximum‐likelihood estimates and that the likelihood‐ratio test based on the local asymptotic approximation has higher power than the likelihood‐ratio test based on the usual chi‐squared approximation. They examine the finite‐sample properties of the proposed intervals by means of a simulation study.

11.

Fuzzy clustering algorithm for latent class model

Lin Chin-Tsai, Chen Chie-Bein, Wu Wen-Hsiang 《Statistics and Computing》2004,14(4):299-310

The expectation maximization (EM) algorithm is a widely used parameter approach for estimating the parameters of multivariate multinomial mixtures in a latent class model. However, this approach has unsatisfactory computing efficiency. This study proposes a fuzzy clustering algorithm (FCA) based on both the maximum penalized likelihood (MPL) for the latent class model and the modified penalty fuzzy c-means (PFCM) for normal mixtures. Numerical examples confirm that the FCA-MPL algorithm is more efficient (that is, requires fewer iterations) and more computationally effective (measured by the approximate relative ratio of accurate classification) than the EM algorithm.

12.

Fast EM-type implementations for mixed effects models

X.-L. Meng & D. van Dyk 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1998,60(3):559-578

The mixed effects model, in its various forms, is a common model in applied statistics. A useful strategy for fitting this model implements EM-type algorithms by treating the random effects as missing data. Such implementations, however, can be painfully slow when the variances of the random effects are small relative to the residual variance. In this paper, we apply the 'working parameter' approach to derive alternative EM-type implementations for fitting mixed effects models, which we show empirically can be hundreds of times faster than the common EM-type implementations. In our limited simulations, they also compare well with the routines in S-PLUS® and Stata® in terms of both speed and reliability. The central idea of the working parameter approach is to search for efficient data augmentation schemes for implementing the EM algorithm by minimizing the augmented information over the working parameter, and in the mixed effects setting this leads to a transfer of the mixed effects variances into the regression slope parameters. We also describe a variation for computing the restricted maximum likelihood estimate and an adaptive algorithm that takes advantage of both the standard and the alternative EM-type implementations.
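
To make concrete what "treating the random effects as missing data" means, here is a minimal sketch of the standard (not the accelerated) EM iteration for the simplest mixed model, a random-intercept model y_ij = mu + b_i + e_ij with b_i ~ N(0, tau2) and e_ij ~ N(0, sigma2). It does not implement the working-parameter approach of the paper; the function name `em_random_intercept`, the starting values and the toy data are all assumptions made for illustration.

```python
def em_random_intercept(groups, n_iter=200):
    """Standard EM (ML, not REML) for y_ij = mu + b_i + e_ij.

    groups -- list of lists, one inner list of observations per subject.
    Returns (mu, sigma2, tau2). Assumes tau2 and sigma2 stay strictly positive.
    """
    n_total = sum(len(g) for g in groups)
    mu = sum(sum(g) for g in groups) / n_total  # start at the grand mean
    sigma2, tau2 = 1.0, 1.0                     # arbitrary starting variances

    for _ in range(n_iter):
        # E-step: posterior mean and variance of each random intercept b_i.
        post_mean, post_var = [], []
        for g in groups:
            v = 1.0 / (len(g) / sigma2 + 1.0 / tau2)
            m = v * sum(y - mu for y in g) / sigma2
            post_mean.append(m)
            post_var.append(v)

        # M-step: maximise the expected complete-data log-likelihood.
        mu = sum(y - m for g, m in zip(groups, post_mean) for y in g) / n_total
        tau2 = sum(m * m + v for m, v in zip(post_mean, post_var)) / len(groups)
        sigma2 = sum(
            (y - mu - m) ** 2 + v
            for g, m, v in zip(groups, post_mean, post_var)
            for y in g
        ) / n_total

    return mu, sigma2, tau2


if __name__ == "__main__":
    toy = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]  # made-up balanced data
    print(em_random_intercept(toy))
```

In this toy data set the between-subject variance is large, so the iteration settles quickly; the slow behaviour described in the abstract arises in the opposite regime, when the random-effect variance is small relative to the residual variance.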

13.

Efficient estimation for incomplete multivariate data

Bent Jørgensen, Hans Chr. Petersen 《Journal of statistical planning and inference》2012,142(5):1215-1224

We review the Fisher scoring and EM algorithms for incomplete multivariate data from an estimating function point of view, and examine the corresponding quasi-score functions under second-moment assumptions. A bias-corrected REML-type estimator for the covariance matrix is derived, and the Fisher, Godambe and empirical sandwich information matrices are compared. We make a numerical investigation of the two algorithms, and compare with a hybrid algorithm, where Fisher scoring is used for the mean vector and the EM algorithm for the covariance matrix.

14.

Likelihood estimation of missing cell means in the fixed model analysis of variance

G.W. Fellingham, H.D. Tolley, D.T. Scott 《Communications in Statistics - Theory and Methods》2013,42(9):2429-2447

This paper examines the formation of maximum likelihood estimates of cell means in analysis of variance problems for cells with missing observations. Methods of estimating the means for missing cells have a long history which includes iterative maximum likelihood techniques, approximation techniques and ad hoc techniques. The use of the EM algorithm to form maximum likelihood estimates has resolved most of the issues associated with this problem. Implementation of the EM algorithm entails specification of a reduced model. As demonstrated in this paper, when there are several missing cells, it is possible to specify a reduced model that results in an unidentifiable likelihood. The EM algorithm in this case does not converge, although the slow divergence may often be mistaken by the unwary as convergence. This paper presents a simple matrix method of determining whether or not the reduced model results in an identifiable likelihood, and consequently in an EM algorithm that converges. We also show the EM algorithm in this case to be equivalent to a method which yields a closed form solution.

15.

Likelihood-based and Bayesian methods for Tweedie compound Poisson linear mixed models

Yanwei Zhang 《Statistics and Computing》2013,23(6):743-757

The Tweedie compound Poisson distribution is a subclass of the exponential dispersion family with a power variance function, in which the value of the power index lies in the interval (1,2). It is well known that the Tweedie compound Poisson density function is not analytically tractable, and numerical procedures that allow the density to be evaluated quickly and accurately did not appear until fairly recently. Unsurprisingly, there has been little statistical literature devoted to full maximum likelihood inference for Tweedie compound Poisson mixed models. To date, the focus has been on estimation methods in the quasi-likelihood framework. Further, Tweedie compound Poisson mixed models involve an unknown variance function, which has a significant impact on hypothesis tests and predictive uncertainty measures. The estimation of the unknown variance function is thus of independent interest in many applications. However, quasi-likelihood-based methods are not well suited to this task. This paper presents several likelihood-based inferential methods for the Tweedie compound Poisson mixed model that enable estimation of the variance function from the data. These algorithms include the likelihood approximation method, in which both the integral over the random effects and the compound Poisson density function are evaluated numerically; and the latent variable approach, in which maximum likelihood estimation is carried out via the Monte Carlo EM algorithm, without the need for approximating the density function. In addition, we derive the corresponding Markov Chain Monte Carlo algorithm for a Bayesian formulation of the mixed model. We demonstrate the use of the various methods through a numerical example, and conduct an array of simulation studies to evaluate the statistical properties of the proposed estimators.

16.

Asymptotic properties of restricted maximum likelihood (REML) estimates for hierarchical mixed linear models

A.M. Richardson, A.H. Welsh 《Australian & New Zealand Journal of Statistics》1994,36(1):31-43

This paper explores the asymptotic distribution of the restricted maximum likelihood estimator of the variance components in a general mixed model. Restricting attention to hierarchical models, central limit theorems are obtained using elementary arguments with only mild conditions on the covariates in the fixed part of the model and without having to assume that the data are either normally or spherically symmetrically distributed. Further, the REML and maximum likelihood estimators are shown to be asymptotically equivalent in this general framework, and the asymptotic distribution of the weighted least squares estimator (based on the REML estimator) of the fixed effect parameters is derived.

17.

Simple Fitting Algorithms for Incomplete Categorical Data

Geert Molenberghs & Els Goetghebeur 《Journal of the Royal Statistical Society. Series B, Statistical methodology》1997,59(2):401-414

A popular approach to estimation based on incomplete data is the EM algorithm. For categorical data, this paper presents a simple expression of the observed data log-likelihood and its derivatives in terms of the complete data for a broad class of models and missing data patterns. We show that using the observed data likelihood directly is easy and has some advantages. One can gain considerable computational speed over the EM algorithm and a straightforward variance estimator is obtained for the parameter estimates. The general formulation treats a wide range of missing data problems in a uniform way. Two examples are worked out in full.

18.

Standard errors for EM estimation

M. Jamshidian & R. I. Jennrich 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2000,62(2):257-270

The EM algorithm is a popular method for computing maximum likelihood estimates. One of its drawbacks is that it does not produce standard errors as a by-product. We consider obtaining standard errors by numerical differentiation. Two approaches are considered. The first differentiates the Fisher score vector to yield the Hessian of the log-likelihood. The second differentiates the EM operator and uses an identity that relates its derivative to the Hessian of the log-likelihood. The well-known SEM algorithm uses the second approach. We consider three additional algorithms: one that uses the first approach and two that use the second. We evaluate the complexity and precision of these three and the SEM algorithm in seven examples. The first is a single-parameter example used to give insight. The others are three examples in each of two areas of EM application: Poisson mixture models and the estimation of covariance from incomplete data. The examples show that there are algorithms that are much simpler and more accurate than the SEM algorithm. Hopefully their simplicity will increase the availability of standard error estimates in EM applications. It is shown that, as previously conjectured, a symmetry diagnostic can accurately estimate errors arising from numerical differentiation. Some issues related to the speed of the EM algorithm and algorithms that differentiate the EM operator are identified.
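
The first approach described in this abstract, differentiating the score numerically to obtain the Hessian, can be sketched in a one-parameter setting. The example below uses a plain Poisson sample, where no EM is needed and the answer is known in closed form, purely to show the mechanics; it is not one of the paper's algorithms, and the names `poisson_score` and `se_from_score`, the step size `h` and the data are assumptions.

```python
import math


def poisson_score(lam, data):
    """Score (derivative of the log-likelihood) for an i.i.d. Poisson(lam) sample."""
    return sum(data) / lam - len(data)


def se_from_score(score, theta_hat, h=1e-4):
    """Standard error from a central-difference derivative of the score.

    The derivative of the score is the Hessian of the log-likelihood; the
    standard error is the square root of the inverse observed information.
    """
    hessian = (score(theta_hat + h) - score(theta_hat - h)) / (2.0 * h)
    return math.sqrt(-1.0 / hessian)


if __name__ == "__main__":
    data = [2, 3, 1, 4, 0, 2, 3, 1]      # made-up counts
    lam_hat = sum(data) / len(data)      # Poisson MLE is the sample mean
    se = se_from_score(lambda lam: poisson_score(lam, data), lam_hat)
    print(lam_hat, se)
```

For this model the analytic standard error is sqrt(lam_hat / n), which gives a direct check on the numerical derivative; in multi-parameter problems the same idea is applied to each component of the score vector to build the Hessian matrix.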

19.

Editorial collaborators

《Journal of Statistical Computation and Simulation》2012,82(2):169-170

A simulation study of the binomial-logit model with correlated random effects is carried out based on the generalized linear mixed model (GLMM) methodology. Simulated data with various numbers of regression parameters and different values of the variance component are considered. The performance of approximate maximum likelihood (ML) and residual maximum likelihood (REML) estimators is evaluated. For a range of true parameter values, we report the average biases of estimators, the standard error of the average bias and the standard error of estimates over the simulations. In general, in terms of bias, the two methods do not show significant differences in estimating regression parameters. The REML estimation method is slightly better in reducing the bias of variance component estimates.

20.

LASSO-type estimators for semiparametric nonlinear mixed-effects models estimation

Ana Arribas-Gil, Karine Bertin, Cristian Meza, Vincent Rivoirard 《Statistics and Computing》2014,24(3):443-460

Parametric nonlinear mixed effects models (NLMEs) are now widely used in biometrical studies, especially in pharmacokinetics research and HIV dynamics models, due to, among other aspects, the computational advances achieved in recent years. However, this kind of model may not be flexible enough for complex longitudinal data analysis. Semiparametric NLMEs (SNMMs) have been proposed as an extension of NLMEs. These models are a good compromise and retain nice features of both parametric and nonparametric models, resulting in more flexible models than standard parametric NLMEs. However, SNMMs are complex models for which estimation still remains a challenge. Previous estimation procedures are based on a combination of log-likelihood approximation methods for parametric estimation and smoothing spline techniques for nonparametric estimation. In this work, we propose new estimation strategies in SNMMs. On the one hand, we use the Stochastic Approximation version of the EM algorithm (SAEM) to obtain exact ML and REML estimates of the fixed effects and variance components. On the other hand, we propose a LASSO-type method to estimate the unknown nonlinear function. We derive oracle inequalities for this nonparametric estimator. We combine the two approaches in a general estimation procedure that we illustrate with simulations and through the analysis of a real data set of price evolution in on-line auctions.