Similar literature
 20 similar articles found.
1.
This work presents a new linear calibration model with replication, assuming that the model error follows a distribution from the skew scale mixtures of normal distributions family, a class of asymmetric thick-tailed distributions that includes the skew normal distribution and symmetric distributions. In the literature, most calibration models assume that the errors are normally distributed. However, the normal distribution is not suitable when there are atypical observations and asymmetry. The calibration model parameters are estimated numerically by the EM algorithm. A simulation study is carried out to verify the properties of the maximum likelihood estimators. This new approach is applied to a real dataset from a chemical analysis.

2.
A Monte Carlo algorithm is said to be adaptive if it automatically calibrates its current proposal distribution using past simulations. The choice of the parametric family that defines the set of proposal distributions is critical for good performance. In this paper, we present such a parametric family for adaptive sampling on high dimensional binary spaces. A practical motivation for this problem is variable selection in a linear regression context. We want to sample from a Bayesian posterior distribution on the model space using an appropriate version of Sequential Monte Carlo. Raw versions of Sequential Monte Carlo are easily implemented using binary vectors with independent components. For high dimensional problems, however, these simple proposals do not yield satisfactory results. The key to an efficient adaptive algorithm lies in binary parametric families that take correlations into account, analogously to the multivariate normal distribution on continuous spaces. We provide a review of models for binary data and make one of them work in the context of Sequential Monte Carlo sampling. Computational studies on real life data with about a hundred covariates suggest that, on difficult instances, our Sequential Monte Carlo approach clearly outperforms standard techniques based on Markov chain exploration.

3.
We introduce the log-odd Weibull regression model based on the odd Weibull distribution (Cooray, 2006). We derive some mathematical properties of the log-transformed distribution. The new regression model represents a parametric family of models that includes as sub-models some widely known regression models that can be applied to censored survival data. We employ a frequentist analysis and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to assess global influence. Further, some simulations are performed for different parameter settings, sample sizes and censoring percentages. In addition, the empirical distribution of some modified residuals is given and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to check the model assumptions. The extended regression model is very useful for the analysis of real data.

4.
In this article, a general approach to latent variable models based on an underlying generalized linear model (GLM) with a factor analysis observation process is introduced. We call these models Generalized Linear Factor Models (GLFM). The observations are produced from a general model framework that involves observed and latent variables that are assumed to be distributed in the exponential family. More specifically, we concentrate on situations where the observed variables are both discretely measured (e.g., binomial, Poisson) and continuously distributed (e.g., gamma). The common latent factors are assumed to be independent with a standard multivariate normal distribution. Practical details of training such models with a new local expectation-maximization (EM) algorithm, which can be considered as a generalized EM-type algorithm, are also discussed. In conjunction with an approximated version of the Fisher score algorithm (FSA), we show how to calculate maximum likelihood estimates of the model parameters and how to draw inferences about the unobservable path of the common factors. The methodology is illustrated by an extensive Monte Carlo simulation study and the results show promising performance.

5.
Partially linear models (PLMs) are an important tool in modelling economic and biometric data and are considered a flexible generalization of the linear model, obtained by including a nonparametric component of some covariate in the linear predictor. Usually, the error component is assumed to follow a normal distribution. However, theory and application (through simulation or experimentation) often generate a large number of data sets that are skewed. The objective of this paper is to extend PLMs by allowing the errors to follow a skew-normal distribution [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178], increasing the flexibility of the model. In particular, we develop the expectation-maximization (EM) algorithm for linear regression models and diagnostic analysis via local influence as well as generalized leverage, following [H. Zhu and S. Lee, Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B 63 (2001), pp. 111–126]. A simulation study is also conducted to evaluate the efficiency of the EM algorithm. Finally, a suitable transformation is applied to a data set on ragweed pollen concentration in order to fit PLMs under asymmetric distributions. An illustrative comparison is performed between normal and skew-normal errors.
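
As a rough illustration of the skew-normal error distribution mentioned in this abstract, the sketch below evaluates the standard Azzalini density, 2·φ(z)·Φ(shape·z)/scale with z = (x − location)/scale. The function names and the location/scale/shape parameterisation are assumptions made for this example; they are not taken from the paper, and no part of the paper's EM algorithm is reproduced here.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, location=0.0, scale=1.0, shape=0.0):
    """Skew-normal density (hypothetical helper for illustration).

    shape = 0 recovers the ordinary normal density; positive values
    skew the distribution to the right, negative values to the left.
    """
    z = (x - location) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(shape * z)


if __name__ == "__main__":
    for shape in (0.0, 1.0, 3.0):
        print(shape, skew_normal_pdf(0.5, shape=shape))
```

With shape set to zero the density coincides with the normal one, which is why a normal-error model is a special case of the skew-normal-error model.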

6.
The main objective of this paper is to develop a full Bayesian analysis for the Birnbaum–Saunders (BS) regression model based on scale mixtures of the normal (SMN) distribution with right-censored survival data. BS distributions based on SMN models provide a very general approach for analysing lifetime data. They include as special cases the Student-t-BS, slash-BS and contaminated normal-BS distributions, and are a flexible alternative to the corresponding BS distribution or any other well-known compatible model, such as the log-normal distribution. A Gibbs sampling algorithm combined with the Metropolis–Hastings algorithm is used to obtain the Bayesian estimates of the parameters. Moreover, model selection criteria for comparing the fitted models are discussed, and case-deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback–Leibler divergence. The newly developed procedures are illustrated on a real data set previously analysed under BS regression models.

7.
This paper concerns the geometric treatment of graphical models using Bayes linear methods. We introduce Bayes linear separation as a second order generalised conditional independence relation, and Bayes linear graphical models are constructed using this property. A system of interpretive and diagnostic shadings is given, which summarises the analysis over the associated moral graph. Principles of local computation are outlined for the graphical models, and an algorithm for implementing such computation over the junction tree is described. The approach is illustrated with two examples. The first concerns sales forecasting using a multivariate dynamic linear model. The second concerns inference for the error variance matrices of the model for sales, and illustrates the generality of our geometric approach by treating the matrices directly as random objects. The examples are implemented using a freely available set of object-oriented programming tools for Bayes linear local computation and graphical diagnostic display.

8.
9.
In some situations, the distribution of the error terms of a multivariate linear regression model may depart from normality. This problem has been addressed, for example, by specifying a different parametric distribution family for the error terms, such as multivariate skewed and/or heavy-tailed distributions. A new solution is proposed, which is obtained by modelling the error term distribution through a finite mixture of multi-dimensional Gaussian components. The multivariate linear regression model is studied under this assumption. Identifiability conditions are proved and maximum likelihood estimation of the model parameters is performed using the EM algorithm. The number of mixture components is chosen through model selection criteria; when this number is equal to one, the proposal reduces to the classical approach. The performance of the proposed approach is evaluated through Monte Carlo experiments and compared with that of other approaches. Finally, the results obtained from the analysis of a real dataset are presented.

10.
A general class of mixed Poisson regression models is introduced. This class is based on a mixing between the Poisson distribution and a distribution belonging to the exponential family. With this, we unify some overdispersed models which have been studied separately, such as negative binomial and Poisson inverse Gaussian models. We consider a regression structure for both the mean and dispersion parameters of the mixed Poisson models, thus extending, and in some cases correcting, some previous models considered in the literature. An expectation–maximization (EM) algorithm is proposed for estimation of the parameters, and some diagnostic measures based on the EM algorithm are considered. We also obtain an explicit expression for the observed information matrix. An empirical illustration is presented in order to show the performance of our class of mixed Poisson models. Supplementary material accompanies this paper.
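
To make the idea of a mixed Poisson model concrete, here is a minimal sketch of the best-known special case named in the abstract: mixing a Poisson(mu·Z) count over a gamma-distributed Z with mean 1 and shape phi gives a negative binomial distribution with mean mu and variance mu + mu²/phi. The function names and the (mu, phi) parameterisation are assumptions for this illustration only; the paper's general class, its regression structure and its EM algorithm are not reproduced.

```python
import math


def negative_binomial_pmf(y, mu, phi):
    """P(Y = y) for the Poisson-gamma mixture (hypothetical helper).

    mu  : mean of the count
    phi : dispersion (gamma shape); larger phi means less overdispersion
    """
    log_pmf = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )
    return math.exp(log_pmf)


def summary(mu, phi, upper=500):
    """Total probability, mean and variance from the pmf truncated at `upper`."""
    probs = [negative_binomial_pmf(y, mu, phi) for y in range(upper)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


if __name__ == "__main__":
    # The variance exceeds the mean, unlike a plain Poisson model.
    print(summary(mu=3.0, phi=2.0))
```

The excess of the variance over the mean is the overdispersion that a plain Poisson regression cannot capture.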

11.
A class of log-linear models, referred to as labelled graphical models (LGMs), is introduced for multinomial distributions. These models generalize graphical models (GMs) by employing partial conditional independence restrictions which are valid only in subsets of an outcome space. Theoretical results concerning model identifiability, decomposability and estimation are derived. A decision theoretical framework and a search algorithm for the identification of plausible models are described. Real data sets are used to illustrate that LGMs may provide a simpler interpretation of a dependence structure than GMs.

12.
An exploratory model analysis device we call CDF knotting is introduced. It is a technique we have found useful for exploring relationships between points in the parameter space of a model and global properties of associated distribution functions. It can be used to alert the model builder to a condition we call lack of distinguishability, which is to nonlinear models what multicollinearity is to linear models. While there are simple remedial actions to deal with multicollinearity in linear models, techniques such as deleting redundant variables in those models do not have obvious parallels for nonlinear models. In some of these nonlinear situations, however, CDF knotting may lead to alternative models with fewer parameters whose distribution functions are very similar to those of the original overparameterized model. We also show how CDF knotting can be exploited as a mathematical tool for deriving limiting distributions and illustrate the technique for the 3-parameter Weibull family, obtaining limiting forms and moment ratios which correct and extend previously published results. Finally, geometric insights obtained by CDF knotting are verified relative to data fitting and estimation.

13.
In this paper, we consider a new mixture of varying coefficient models, in which each mixture component follows a varying coefficient model and the mixing proportions and dispersion parameters are also allowed to be unknown smooth functions. We systematically study the identifiability, estimation and inference for the new mixture model. The proposed mixture model is rather general, encompassing many mixture models as special cases, such as mixtures of linear regression models, mixtures of generalized linear models, mixtures of partially linear models and mixtures of generalized additive models, some of which are themselves new and have not been investigated before. The new mixture of varying coefficient models is shown to be identifiable under mild conditions. We develop a local likelihood procedure and a modified expectation–maximization algorithm for the estimation of the unknown non-parametric functions. Asymptotic normality is established for the proposed estimator. A generalized likelihood ratio test is further developed for testing whether some of the unknown functions are constants. We derive the asymptotic distribution of the proposed generalized likelihood ratio test statistic and prove that the Wilks phenomenon holds. The proposed methodology is illustrated by Monte Carlo simulations and an analysis of a CO2-GDP data set.

14.
Sampling from the posterior distribution in generalized linear mixed models   Cited by: 5 (self-citations: 0, citations by others: 5)
Generalized linear mixed models provide a unified framework for treatment of exponential family regression models, overdispersed data and longitudinal studies. These problems typically involve the presence of random effects and this paper presents a new methodology for making Bayesian inference about them. The approach is simulation-based and involves the use of Markov chain Monte Carlo techniques. The usual iterative weighted least squares algorithm is extended to include a sampling step based on the Metropolis–Hastings algorithm, thus providing a unified iterative scheme. Non-normal prior distributions for the regression coefficients and for the random effects distribution are considered. Random effect structures with nesting required by longitudinal studies are also considered. Of particular interest are the significance of regression coefficients and the assessment of the form of the random effects. Extensions to unknown scale parameters, unknown link functions, survival and frailty models are outlined.
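
For readers unfamiliar with the Metropolis–Hastings step that this abstract builds on, the following is a minimal generic sketch: a one-dimensional random-walk sampler with a symmetric Gaussian proposal. It is a plainly simpler stand-in, not the paper's scheme, which embeds the sampling step in iterative weighted least squares. The function name, the step size and the standard normal target are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings sampler (illustrative sketch).

    log_target : function returning the log of an unnormalised density
    step       : standard deviation of the Gaussian random-walk proposal
    """
    rng = random.Random(seed)
    current = start
    current_log = log_target(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_log = log_target(proposal)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_log - current_log))
        if rng.random() < accept_prob:
            current, current_log = proposal, proposal_log
        samples.append(current)
    return samples


if __name__ == "__main__":
    # Hypothetical target: a standard normal "posterior".
    draws = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 50000, step=2.0)
    print(sum(draws) / len(draws))
```

Only ratios of the target density enter the acceptance step, which is why the normalising constant of a posterior distribution never has to be computed.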

15.
In this paper, we compare three residuals to assess departures from the error assumptions as well as to detect outlying observations in log-Burr XII regression models with censored observations. These residuals can also be used for the log-logistic regression model, which is a special case of the log-Burr XII regression model. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and the empirical distribution of each residual is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to the modified martingale-type residual in log-Burr XII regression models with censored data.

16.
Some conditional models to deal with binary longitudinal responses are proposed, extending random effects models to include serial dependence of Markovian form, and hence allowing for quite general association structures between repeated observations recorded on the same individual. The presence of both these components implies a form of dependence between them, and so a complicated expression for the resulting likelihood. To handle this problem, we introduce, as a first instance, what Follmann and Wu (1995) called, in a different setting, an approximate conditional model, which represents an optimal choice for the general framework of categorical longitudinal responses. Then we define two more formally correct models for the binary case, with no assumption about the distribution of the random effect. All of the discussed models are estimated by means of an EM algorithm for nonparametric maximum likelihood. The algorithm, an adaptation of that used by Aitkin (1996) for the analysis of overdispersed generalized linear models, is initially derived as a form of Gaussian quadrature, and then extended to a completely unknown mixing distribution. A large-scale simulation study is described to explore the behaviour of the proposed approaches in a number of different situations.

17.
This paper considers the analysis of linear models where the response variable is a linear function of observable component variables. For example, scores on two or more psychometric measures (the component variables) might be weighted and summed to construct a single response variable in a psychological study. A linear model is then fit to the response variable. The question addressed in this paper is how to optimally transform the component variables so that the response is approximately normally distributed. The transformed component variables, themselves, need not be jointly normal. Two cases are considered; in both cases, the Box-Cox power family of transformations is employed. In Case I, the coefficients of the linear transformation are known constants. In Case II, the linear function is the first principal component based on the matrix of correlations among the transformed component variables. For each case, an algorithm is described for finding the transformation powers that minimize a generalized Anderson-Darling statistic. The proposed transformation procedure is compared to likelihood-based methods by means of simulation. The proposed method rarely performed worse than likelihood-based methods and for many data sets performed substantially better. As an illustration, the algorithm is applied to a problem from rural sociology and social psychology, namely, scaling family residences along an urban-rural dimension.
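
The Box-Cox power family used in this abstract can be sketched as follows, together with a Case I style response built as a weighted sum of transformed components with known weights. This is only a minimal illustration under stated assumptions: the function names, the example scores, powers and weights are hypothetical, and the search for powers that minimise the generalized Anderson-Darling statistic is not implemented.

```python
import math


def box_cox(value, power):
    """Box-Cox power transformation of a positive value.

    (value**power - 1) / power for power != 0, and log(value) for power == 0,
    which is the limit of the first expression as power tends to 0.
    """
    if value <= 0:
        raise ValueError("Box-Cox requires positive values")
    if power == 0:
        return math.log(value)
    return (value ** power - 1.0) / power


def transformed_response(components, powers, weights):
    """Weighted sum of Box-Cox transformed components (Case I style sketch)."""
    return sum(
        w * box_cox(x, p) for x, p, w in zip(components, powers, weights)
    )


if __name__ == "__main__":
    # Hypothetical scores on two measures, with assumed powers and weights.
    print(transformed_response([4.0, 3.0], powers=[0.5, 1.0], weights=[0.5, 0.5]))
```

Each component receives its own power, so the components can be reshaped individually while the normality target is imposed only on their weighted sum.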

18.
In this article, we compare three residuals based on the deviance component in generalised log-gamma regression models with censored observations. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and the empirical distribution of each residual is displayed and compared with the standard normal distribution. For all cases studied, the empirical distributions of the proposed residuals are in general symmetric around zero, but only a martingale-type residual presented negligible kurtosis for the majority of the cases studied. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to the martingale-type residual in generalised log-gamma regression models with censored data. A lifetime data set is analysed under log-gamma regression models and model checking based on the martingale-type residual is performed.

19.
Seemingly unrelated linear regression models are introduced in which the distribution of the errors is a finite mixture of Gaussian distributions. Identifiability conditions are provided. The score vector and the Hessian matrix are derived. Parameter estimation is performed using the maximum likelihood method and an Expectation–Maximisation algorithm is developed. The usefulness of the proposed methods and a numerical evaluation of their properties are illustrated through the analysis of simulated and real datasets.

20.
We present a class of truncated nonlinear regression models for location and scale where the truncated nature of the data is incorporated into the statistical model by assuming that the response variable follows a truncated distribution. The location parameter of the response variable is assumed to be modeled by a continuous nonlinear function of covariates and unknown parameters. In addition, the proposed model also allows for the scale parameter of the responses to be characterized by a continuous function of the covariates and unknown parameters. Three particular cases of the proposed models are presented by considering the response variable to follow a truncated normal, truncated skew normal, and truncated beta distribution. These truncated nonlinear regression models are constructed assuming fixed known truncation limits, and model parameters are estimated by direct maximization of the log-likelihood using a nonlinear optimization algorithm. Standardized residuals and diagnostic metrics based on case deletion are considered to verify the adequacy of the model and to detect outliers and influential observations. Results based on simulated data are presented to assess the frequentist properties of estimates, and a real data set on soil-water retention from the Buriti Vermelho River Basin database is analyzed using the proposed methodology.
