Similar articles
 20 similar articles found (search time: 31 ms)
1.
In this article, a general approach to latent variable models based on an underlying generalized linear model (GLM) with a factor-analysis observation process is introduced. We call these models Generalized Linear Factor Models (GLFM). The observations are produced from a general model framework involving observed and latent variables assumed to be distributed in the exponential family. More specifically, we concentrate on situations where the observed variables are both discretely measured (e.g., binomial, Poisson) and continuously distributed (e.g., gamma). The common latent factors are assumed to be independent and to follow a standard multivariate normal distribution. Practical details of training such models with a new local expectation-maximization (EM) algorithm, which can be considered a generalized EM-type algorithm, are also discussed. In conjunction with an approximated version of the Fisher scoring algorithm (FSA), we show how to calculate maximum likelihood estimates of the model parameters and to draw inferences about the unobservable path of the common factors. The methodology is illustrated by an extensive Monte Carlo simulation study, and the results show promising performance.
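
To make the model structure concrete, here is a minimal, hypothetical sketch (in Python) of how data could be generated from a GLFM with Poisson, binomial, and gamma observed variables driven by independent standard-normal common factors. The loadings, intercepts, link functions and gamma shape below are illustrative placeholders, not values from the article, and the estimation step (local EM with approximate Fisher scoring) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 500, 2                        # observations, latent common factors

# Independent standard multivariate normal factors, as assumed in the abstract
Z = rng.standard_normal((n, q))

# Illustrative loadings and intercepts (not taken from the article)
lam_pois = np.array([0.8, -0.4]); b_pois = 1.0
lam_bin  = np.array([1.2,  0.5]); b_bin  = -0.3
lam_gam  = np.array([0.3,  0.9]); b_gam  = 0.5

eta_pois = b_pois + Z @ lam_pois     # log link for the Poisson mean
eta_bin  = b_bin  + Z @ lam_bin      # logit link for the success probability
eta_gam  = b_gam  + Z @ lam_gam      # log link for the gamma mean

y_pois = rng.poisson(np.exp(eta_pois))
y_bin  = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_bin)))
y_gam  = rng.gamma(shape=2.0, scale=np.exp(eta_gam) / 2.0)   # mean = exp(eta_gam)

Y = np.column_stack([y_pois, y_bin, y_gam])   # mixed discrete/continuous responses
print(Y[:5])
```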

2.
A new technique is devised to mitigate the errors-in-variables bias in linear regression. The procedure mimics two-stage least squares: an auxiliary regression that generates a better-behaved predictor variable is derived, and the generated variable is then used as a substitute for the error-prone variable in the first-stage model. The performance of the algorithm is tested by simulation and regression analyses. Simulations suggest the algorithm efficiently captures the additive error term used to contaminate the artificial variables. The regressions lend further support to the simulations, clearly showing that the compact genetic algorithm-based estimate of the true but unobserved regressor yields considerably better results. These conclusions are robust across different sample sizes and different variance structures imposed on both the measurement error and the regression disturbances.
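
The bias being attacked can be seen in a few lines. The sketch below simulates additive measurement error, shows the attenuated naive OLS slope, and then substitutes a generated regressor obtained from an auxiliary regression; here the auxiliary information is a second noisy measurement, an assumption made purely for illustration, since the article builds the substitute variable with a compact genetic algorithm instead.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 2000, 1.5

x_true = rng.normal(0, 1, n)                 # true but unobserved regressor
y = 2.0 + beta * x_true + rng.normal(0, 1, n)

w1 = x_true + rng.normal(0, 1, n)            # error-prone measurement used in the model
w2 = x_true + rng.normal(0, 1, n)            # hypothetical second noisy measurement

def ols_slope(x, y):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Naive OLS on the error-prone variable: slope attenuated toward zero
print("naive slope:", round(ols_slope(w1, y), 3))

# Auxiliary regression of w1 on w2; its fitted values act as a generated,
# better-behaved regressor that replaces w1 in the first-stage model.
X2 = np.column_stack([np.ones_like(w2), w2])
w1_hat = X2 @ np.linalg.lstsq(X2, w1, rcond=None)[0]
print("generated-regressor slope:", round(ols_slope(w1_hat, y), 3))
print("true slope:", beta)
```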

3.
The theory and properties of trend-free (TF) and nearly trend-free (NTF) block designs are well developed. Applications have been hampered because a methodology for design construction has not been available.

This article begins with a short review of concepts and properties of TF and NTF block designs. The major contribution is the provision of an algorithm for the construction of linear and nearly linear trend-free block designs. The algorithm is incorporated in a FORTRAN 77 computer program, provided in an appendix, for the IBM PC or compatible microcomputers; the program is also adaptable to other computers. Three sets of block designs generated by the program are given as examples.

A numerical example of the analysis of a linear trend-free balanced incomplete block design is provided.

4.
This paper proposes a method for estimating the parameters in a generalized linear model with missing covariates. The missing covariates are assumed to come from a continuous distribution, and are assumed to be missing at random. In particular, Gaussian quadrature methods are used on the E-step of the EM algorithm, leading to an approximate EM algorithm. The parameters are then estimated using the weighted EM procedure given in Ibrahim (1990). This approximate EM procedure leads to approximate maximum likelihood estimates, whose standard errors and asymptotic properties are given. The proposed procedure is illustrated on a data set.
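
The Gaussian quadrature idea in the E-step can be illustrated on its own: an expectation over a normally distributed missing covariate is replaced by a weighted sum over Gauss-Hermite nodes. The logistic integrand and parameter values below are placeholders chosen only to show the change of variables; they are not the paper's model.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def expect_gh(g, mu, sigma, n_nodes=20):
    """Approximate E[g(X)] for X ~ N(mu, sigma^2) with Gauss-Hermite quadrature."""
    t, w = hermgauss(n_nodes)              # nodes/weights for the weight exp(-t^2)
    x = mu + np.sqrt(2.0) * sigma * t      # change of variables to N(mu, sigma^2)
    return np.sum(w * g(x)) / np.sqrt(np.pi)

# Hypothetical E-step-style integrand: a Bernoulli likelihood contribution with a
# logit link, integrated over a missing covariate X ~ N(0, 1), for y = 1.
def lik(x, y=1, b0=0.5, b1=1.2):
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    return p**y * (1 - p)**(1 - y)

approx = expect_gh(lik, mu=0.0, sigma=1.0)

# Brute-force Monte Carlo check of the quadrature approximation
x_mc = np.random.default_rng(2).normal(0, 1, 200_000)
print(approx, lik(x_mc).mean())
```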

5.
In the optimal experimental design literature, G-optimality is defined as minimizing the maximum prediction variance over the entire experimental design space. Although G-optimality is a highly desirable property in many applications, few computer algorithms have been developed for constructing G-optimal designs. Some existing methods employ an exhaustive search over all candidate designs, which is time-consuming and inefficient. In this paper, a new algorithm for constructing G-optimal experimental designs is developed for both linear and generalized linear models. The new algorithm is based on clustering candidate or evaluation points over the design space and combines the point-exchange and coordinate-exchange algorithms. In addition, a robust design algorithm is proposed for generalized linear models by modifying an existing method. The proposed algorithm is compared with the methods of Rodriguez et al. [Generating and assessing exact G-optimal designs. J. Qual. Technol. 2010;42(1):3–20] and Borkowski [Using a genetic algorithm to generate small exact response surface designs. J. Prob. Stat. Sci. 2003;1(1):65–88] for linear models, and with the simulated annealing method and the genetic algorithm for generalized linear models, through several examples in terms of G-efficiency and computation time. The results show that the proposed algorithm can obtain a design with higher G-efficiency in a much shorter time. Moreover, the computation time of the proposed algorithm increases only polynomially as the size of the model increases.
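
For readers unfamiliar with the criterion, the sketch below evaluates the G-criterion (maximum scaled prediction variance over an evaluation grid) for a full quadratic model in two factors; a point-exchange or coordinate-exchange step would then swap design points to reduce this maximum. The grid resolution, model, and starting design are illustrative choices, not the paper's settings.

```python
import numpy as np
from itertools import product

def model_matrix(pts):
    """Full quadratic model in two factors: 1, x1, x2, x1*x2, x1^2, x2^2."""
    x1, x2 = pts[:, 0], pts[:, 1]
    return np.column_stack([np.ones(len(pts)), x1, x2, x1 * x2, x1**2, x2**2])

def g_criterion(design, grid):
    """Maximum scaled prediction variance N * f(x)'(X'X)^{-1} f(x) over the grid."""
    X = model_matrix(design)
    XtX_inv = np.linalg.inv(X.T @ X)
    F = model_matrix(grid)
    spv = len(design) * np.einsum("ij,jk,ik->i", F, XtX_inv, F)
    return spv.max()

# Candidate/evaluation grid on [-1, 1]^2 and a 3^2 factorial starting design
grid = np.array(list(product(np.linspace(-1, 1, 21), repeat=2)))
design = np.array(list(product([-1.0, 0.0, 1.0], repeat=2)))

print("G-criterion of the 3^2 factorial:", round(g_criterion(design, grid), 3))
# An exchange algorithm would repeatedly replace the design point whose swap with
# a grid (or cluster-representative) point most reduces this criterion.
```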

6.
In this article, a non-iterative sampling algorithm is developed to obtain approximately independent and identically distributed samples from the posterior distribution of the parameters in the Laplace linear regression model. By combining the inverse Bayes formulae, sampling/importance resampling, and the expectation-maximization algorithm, the algorithm eliminates the convergence diagnostics required by iterative Gibbs sampling, and the samples it generates can be used for inference immediately. Simulations are conducted to illustrate the robustness and effectiveness of the algorithm. Finally, real data are studied to show the usefulness of the proposed methodology.
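
The sampling/importance resampling (SIR) building block used above is easy to sketch in isolation: draw from a heavier-tailed proposal, weight by the target-to-proposal density ratio, and resample. The Laplace-type target and Student-t proposal below are toy choices, not the posterior of the article's Laplace regression model.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Toy target known up to a constant: log density proportional to -|x| (Laplace-like)
log_target = lambda x: -np.abs(x)

# Heavier-tailed proposal so that importance weights stay well behaved
proposal = stats.t(df=3, loc=0.0, scale=1.5)

m, k = 50_000, 5_000                          # proposal draws, resampled draws
x = proposal.rvs(size=m, random_state=rng)

log_w = log_target(x) - proposal.logpdf(x)    # unnormalised importance weights
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Resampling step: an approximately i.i.d. sample from the target; no iteration,
# and therefore no convergence diagnostics, is needed.
sample = rng.choice(x, size=k, replace=True, p=w)
print(round(sample.mean(), 3), round(sample.std(), 3))   # near 0 and sqrt(2)
```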

7.
A novel application of the expectation-maximization (EM) algorithm is proposed for modeling right-censored multiple regression. Parameter estimation, variability assessment, and model selection are summarized in a multiple regression setting assuming a normal model. The performance of this method is assessed through a simulation study. New formulas for measuring model utility and diagnostics are derived based on the EM algorithm; they include a reconstructed coefficient of determination and influence diagnostics based on a one-step deletion method. A real data set, provided by the North Dakota Department of Veterans Affairs, is modeled using the proposed methodology. Empirical findings should be of benefit to government policy-makers.
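
A compact version of such an EM scheme for normal linear regression with right-censored responses is sketched below: the E-step replaces each censored response by its truncated-normal first and second moments, and the M-step is least squares on the completed data with a matching variance update. This is a generic textbook-style sketch under a normal model, not the article's exact implementation, and the simulated data are placeholders.

```python
import numpy as np
from scipy.stats import norm

def em_censored_regression(X, y, censored, n_iter=200):
    """EM for y = X beta + e, e ~ N(0, sigma^2), with right-censored responses.
    censored[i] is True when y[i] is a censoring threshold rather than an exact value."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # start from naive OLS
    sigma2 = np.mean((y - X @ beta) ** 2)

    for _ in range(n_iter):
        mu, sigma = X @ beta, np.sqrt(sigma2)

        # E-step: moments of a censored response given y_i > c_i (inverse Mills ratio)
        a = (y - mu) / sigma
        lam = norm.pdf(a) / np.clip(norm.sf(a), 1e-12, None)
        ey = np.where(censored, mu + sigma * lam, y)
        ey2 = np.where(censored, mu**2 + sigma2 + sigma * lam * (y + mu), y**2)

        # M-step: OLS on the completed first moments, variance from second moments
        beta = np.linalg.lstsq(X, ey, rcond=None)[0]
        fitted = X @ beta
        sigma2 = np.mean(ey2 - 2 * fitted * ey + fitted**2)

    return beta, sigma2

rng = np.random.default_rng(4)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y_full = X @ np.array([1.0, 2.0]) + rng.normal(0, 1, n)
c = 2.0                                                  # fixed right-censoring point
censored = y_full > c
y = np.where(censored, c, y_full)

print(em_censored_regression(X, y, censored))            # roughly (1, 2) and 1
```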

8.
The medical costs in an ageing society substantially increase when the incidences of chronic diseases, disabilities and inability to live independently are high. Healthy lifestyles not only affect elderly individuals but also influence the entire community. When assessing treatment efficacy, survival and quality of life should be considered simultaneously. This paper proposes the joint likelihood approach for modelling survival and longitudinal binary covariates simultaneously. Because some unobservable information is present in the model, the Monte Carlo EM algorithm and Metropolis-Hastings algorithm are used to find the estimators. Monte Carlo simulations are performed to evaluate the performance of the proposed model based on the accuracy and precision of the estimates. Real data are used to demonstrate the feasibility of the proposed model.

9.
We propose a hybrid two-group classification method that integrates linear discriminant analysis, a polynomial expansion of the basis (or variable space), and a genetic algorithm with multiple crossover operations to select variables from the expanded basis. Using new product launch data from the biochemical industry, we found that the proposed algorithm offers mean percentage decreases in the misclassification error rate of 50%, 56%, 59%, 77%, and 78% in comparison to a support vector machine, artificial neural network, quadratic discriminant analysis, linear discriminant analysis, and logistic regression, respectively. These improvements correspond to annual cost savings of $4.40–$25.73 million.
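
The basis-expansion step is the easiest part to reproduce. A hedged sketch, on synthetic data standing in for the product-launch data set and with scikit-learn assumed available: expand the variables to a quadratic basis and fit LDA on the expanded space. The genetic-algorithm variable selection with multiple crossover operators is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic two-group data (placeholder for the biochemical product-launch data)
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           random_state=0)

plain_lda = LinearDiscriminantAnalysis()
expanded_lda = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                             LinearDiscriminantAnalysis())

for name, clf in [("LDA", plain_lda), ("LDA on quadratic basis", expanded_lda)]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: cross-validated accuracy = {acc:.3f}")

# The article's genetic algorithm would then search subsets of the expanded basis
# (using multiple crossover operations) rather than keeping every expanded term.
```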

10.
In this work, we propose a new model called the generalized symmetrical partial linear model, based on the theory of generalized linear models and symmetrical distributions. In our model the response variable follows a symmetrical distribution such as the normal, Student-t, or power exponential, among others. Following the generalized linear models framework, we replace the traditional linear predictor with a more general predictor in which one covariate is related to the response variable in a non-parametric fashion, i.e., without specifying a parametric function. As an example, one could imagine a regression model in which the intercept term is believed to vary with time or geographical location. The backfitting algorithm is used for estimating the parameters of the proposed model. We perform a simulation study to assess the behavior of the penalized maximum likelihood estimators, and we use quantile residuals to check the model assumptions. Finally, we analyze a real data set on river pH in Ireland.
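
A bare-bones backfitting loop for the Gaussian special case of such a partial linear model, y = X*beta + f(t) + error, is sketched below with a simple Nadaraya-Watson smoother; the bandwidth, the sine-shaped f(t) and the data are illustrative, and the symmetrical (e.g., Student-t) error case and penalized-likelihood details of the article are not reproduced.

```python
import numpy as np

def kernel_smooth(t, r, bandwidth=0.15):
    """Nadaraya-Watson smoother of the partial residuals r against the covariate t."""
    d = (t[:, None] - t[None, :]) / bandwidth
    w = np.exp(-0.5 * d**2)
    return (w @ r) / w.sum(axis=1)

def backfit_partial_linear(X, t, y, n_iter=50):
    """Alternate between the parametric part (OLS) and the smooth part f(t)."""
    beta = np.zeros(X.shape[1])
    f = np.zeros_like(y)
    for _ in range(n_iter):
        beta = np.linalg.lstsq(X, y - f, rcond=None)[0]    # update linear part
        f = kernel_smooth(t, y - X @ beta)                 # update nonparametric part
        f -= f.mean()                                      # centre f for identifiability
    return beta, f

rng = np.random.default_rng(5)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=n)])
t = rng.uniform(0, 1, n)
y = X @ np.array([1.0, -0.8]) + np.sin(2 * np.pi * t) + rng.normal(0, 0.3, n)

beta_hat, f_hat = backfit_partial_linear(X, t, y)
print("beta:", beta_hat.round(2))                          # roughly (1, -0.8)
```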

11.
In recent years much effort has been devoted to maximum likelihood estimation of generalized linear mixed models. Most of the existing methods use the EM algorithm, with various techniques for handling the intractable E-step. In this paper, a new implementation of a stochastic approximation algorithm with a Markov chain Monte Carlo method is investigated. The proposed algorithm is computationally straightforward and its convergence is guaranteed. A simulation and three real data sets, including the challenging salamander data, are used to illustrate the procedure and to compare it with some existing methods. The results indicate that the proposed algorithm is an attractive alternative for problems with a large number of random effects or with high dimensional intractable integrals in the likelihood function.

12.
An algorithm, in the form of a Fortran subroutine TRIPLE, is given to compute statistics for the triples test for symmetry. The computational complexity of the algorithm is O(n²), which is an improvement over the straightforward method, which is O(n³).
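
The article's TRIPLE subroutine is not reproduced here, but the straightforward O(n³) computation it improves upon is easy to write down, assuming the usual triples-test kernel (as in Randles et al.): every triple of observations casts a vote on whether it looks right- or left-skewed, and the statistic averages these votes.

```python
import math
from itertools import combinations

import numpy as np

def triples_statistic(x):
    """Straightforward O(n^3) evaluation of the triples-test statistic: each
    triple contributes the average sign of its three 'skewness' comparisons."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    total = 0.0
    for i, j, k in combinations(range(n), 3):
        a, b, c = x[i], x[j], x[k]
        total += (np.sign(a + b - 2 * c) +
                  np.sign(a + c - 2 * b) +
                  np.sign(b + c - 2 * a)) / 3.0
    return total / math.comb(n, 3)

rng = np.random.default_rng(6)
print(triples_statistic(rng.normal(size=60)))        # near 0 for symmetric data
print(triples_statistic(rng.exponential(size=60)))   # positive for right-skewed data
```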

13.
In most applications, the parameters of a mixture of linear regression models are estimated by maximum likelihood using the expectation-maximization (EM) algorithm. In this article, we compare three algorithms for computing maximum likelihood estimates of the parameters of these models: the EM algorithm, the classification EM algorithm and the stochastic EM algorithm. The comparison of the three procedures was carried out through a simulation study of their performance (computational effort, statistical properties of the estimators and goodness of fit) on simulated data sets.

Simulation results show that the choice of the approach depends essentially on the configuration of the true regression lines and the initialization of the algorithms.
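
For reference, a compact EM implementation for a two-component mixture of normal linear regressions (the first of the three algorithms compared above) is sketched below on simulated crossing lines; the classification EM and stochastic EM variants differ only in hard-assigning or randomly drawing the component labels in place of the posterior weights. The data configuration and starting values are illustrative.

```python
import numpy as np
from scipy.stats import norm

def em_mix_linreg(X, y, K=2, n_iter=200, seed=0):
    """EM for a K-component mixture of normal linear regressions."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    pi = np.full(K, 1.0 / K)
    beta = rng.normal(size=(K, p))
    sigma2 = np.full(K, y.var())

    for _ in range(n_iter):
        # E-step: posterior component probabilities (responsibilities)
        dens = np.stack([pi[k] * norm.pdf(y, X @ beta[k], np.sqrt(sigma2[k]))
                         for k in range(K)], axis=1)
        resp = dens / np.clip(dens.sum(axis=1, keepdims=True), 1e-300, None)

        # M-step: weighted least squares and weighted variance per component
        for k in range(K):
            w = resp[:, k]
            Xw = X * w[:, None]
            beta[k] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
            sigma2[k] = np.sum(w * (y - X @ beta[k]) ** 2) / w.sum()
        pi = resp.mean(axis=0)

    return pi, beta, sigma2

rng = np.random.default_rng(7)
n = 600
x = rng.uniform(-2, 2, n)
X = np.column_stack([np.ones(n), x])
z = rng.integers(0, 2, n)                      # hidden component labels
y = np.where(z == 0, 1.0 + 2.0 * x, -1.0 - 1.5 * x) + rng.normal(0, 0.4, n)

pi_hat, beta_hat, s2_hat = em_mix_linreg(X, y)
print(pi_hat.round(2), beta_hat.round(2))      # roughly 0.5/0.5 and the two lines
```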

14.
The Expectation–Maximization (EM) algorithm is a very popular technique for maximum likelihood estimation in incomplete data models. When the expectation step cannot be performed in closed form, a stochastic approximation of EM (SAEM) can be used. Under very general conditions, the authors have shown that the attractive stationary points of the SAEM algorithm correspond to the global and local maxima of the observed likelihood. In order to avoid convergence towards a local maximum, a simulated annealing version of SAEM is proposed. An illustrative application to the convolution model for estimating the coefficients of the filter is given.

15.
An automated (Markov chain) Monte Carlo EM algorithm
We present an automated Monte Carlo EM (MCEM) algorithm which efficiently assesses Monte Carlo error in the presence of dependent Monte Carlo, particularly Markov chain Monte Carlo, E-step samples and chooses an appropriate Monte Carlo sample size to minimize this Monte Carlo error with respect to progressive EM step estimates. Monte Carlo error is gauged through an application of the central limit theorem during renewal periods of the MCMC sampler used in the E-step. The resulting normal approximation allows us to construct a rigorous and adaptive rule for updating the Monte Carlo sample size at each iteration of the MCEM algorithm. We illustrate our automated routine and compare its performance with competing MCEM algorithms in an analysis of a data set fit by a generalized linear mixed model.

16.
The EM algorithm is a popular method for computing maximum likelihood estimates or posterior modes in models that can be formulated in terms of missing data or latent structure. Although easy implementation and stable convergence help to explain the popularity of the algorithm, its convergence is sometimes notoriously slow. In recent years, however, various adaptations have significantly improved the speed of EM while maintaining its stability and simplicity. One especially successful method for maximum likelihood is known as the parameter expanded EM or PXEM algorithm. Unfortunately, PXEM does not generally have a closed form M-step when computing posterior modes, even when the corresponding EM algorithm is in closed form. In this paper we confront this problem by adapting the one-step-late EM algorithm to PXEM to establish a fast closed form algorithm that improves on the one-step-late EM algorithm by ensuring monotone convergence. We use this algorithm to fit a probit regression model and a variety of dynamic linear models, showing computational savings of as much as 99.9%, with the biggest savings occurring when the EM algorithm is the slowest to converge.

17.
The expectation–maximization (EM) algorithm is a popular tool for maximizing likelihood functions in the presence of missing data. Unfortunately, EM often requires the evaluation of analytically intractable and high dimensional integrals. The Monte Carlo EM (MCEM) algorithm is the natural extension of EM that employs Monte Carlo methods to estimate the relevant integrals. Typically, a very large Monte Carlo sample size is required to estimate these integrals within an acceptable tolerance when the algorithm is near convergence. Even if this sample size were known at the onset of implementation of MCEM, its use throughout all iterations is wasteful, especially when accurate starting values are not available. We propose a data-driven strategy for controlling Monte Carlo resources in MCEM. The algorithm proposed improves on similar existing methods by recovering EM's ascent (i.e. likelihood increasing) property with high probability, being more robust to the effect of user-defined inputs and handling classical Monte Carlo and Markov chain Monte Carlo methods within a common framework. Because of the first of these properties we refer to the algorithm as 'ascent-based MCEM'. We apply ascent-based MCEM to a variety of examples, including one where it is used to accelerate the convergence of deterministic EM dramatically.

18.
A special source of difficulty in statistical analysis is the possibility that some subjects may not have a complete observation of the response variable. Such incomplete observation of the response variable is called censoring. Censoring can occur for a variety of reasons, including limitations of measurement equipment, the design of the experiment, and non-occurrence of the event of interest by the end of the study. In the presence of censoring, the dependence of the response variable on the explanatory variables can be explored through regression analysis. In this paper, we examine the censoring problem in the context of the class of asymmetric distributions: we propose a linear regression model with censored responses based on skew scale mixtures of normal distributions. We develop a Monte Carlo EM (MCEM) algorithm to perform maximum likelihood inference of the parameters in the proposed linear censored regression models with skew scale mixtures of normal distributions. The MCEM algorithm is discussed with an emphasis on the skew-normal, skew Student-t-normal, skew-slash and skew-contaminated normal distributions. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset.

19.
It is known that the Fisher scoring iteration for generalized linear models has the same form as the Gauss-Newton algorithm for normal regression. This note shows that exponential dispersion models are the most general families to preserve this form for the scoring iteration. Therefore exponential dispersion models are the most general extension of generalized linear models for which the analogy with normal regression is preserved. The multinomial distribution is used as an example.
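
The equivalence the note starts from can be seen directly by writing Fisher scoring for a GLM as iteratively reweighted least squares on a working response, which is exactly the Gauss-Newton form for normal regression. The Poisson/log-link example and simulated data below are illustrative, not the note's multinomial example.

```python
import numpy as np

def fisher_scoring_poisson(X, y, n_iter=25):
    """Fisher scoring for Poisson regression with log link, written in the
    Gauss-Newton / IRLS form: weighted least squares on a working response z."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        w = mu                            # GLM weight: (dmu/deta)^2 / Var(Y) = mu
        z = eta + (y - mu) / mu           # working (adjusted) response
        WX = X * w[:, None]
        beta = np.linalg.solve(WX.T @ X, WX.T @ z)
    return beta

rng = np.random.default_rng(8)
n = 1000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.poisson(np.exp(X @ np.array([0.5, 0.8])))

print(fisher_scoring_poisson(X, y).round(3))    # close to (0.5, 0.8)
```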

20.
Combining data from several tests or markers to classify patients according to their health status, and thereby assign better treatments, is a major issue in the study of diseases such as cancer. In order to tackle this problem, several approaches have been proposed in the literature. In this paper, a step-by-step algorithm for estimating the parameters of a linear classifier that combines several measures is considered. The optimization criterion is to maximize the area under the receiver operating characteristic curve. The algorithm is applied to different simulated data sets and its performance is evaluated. Finally, the method is illustrated with a prostate cancer staging database.
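
The optimization target can be made concrete with a small sketch: the empirical (Mann-Whitney) AUC of a linear combination of two simulated markers, maximized here by a crude grid search over the direction of the combination. The marker distributions are placeholders, and the article's step-by-step estimation algorithm is not reproduced.

```python
import numpy as np

def empirical_auc(score_cases, score_controls):
    """Mann-Whitney estimate of the area under the ROC curve."""
    diff = score_cases[:, None] - score_controls[None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

rng = np.random.default_rng(9)
cov = [[1.0, 0.3], [0.3, 1.0]]
cases = rng.multivariate_normal([1.0, 0.8], cov, 200)      # diseased subjects
controls = rng.multivariate_normal([0.0, 0.0], cov, 200)   # healthy subjects

# Grid search over the direction of the linear combination of the two markers
best_auc, best_w = 0.0, None
for theta in np.linspace(0.0, np.pi, 361):
    w = np.array([np.cos(theta), np.sin(theta)])
    auc = empirical_auc(cases @ w, controls @ w)
    auc = max(auc, 1.0 - auc)              # reversing the sign reverses the ROC curve
    if auc > best_auc:
        best_auc, best_w = auc, w

print("best AUC:", round(best_auc, 3), "weights:", best_w.round(2))
```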

