Similar literature (20 results)
1.
We review Bayesian analysis of hierarchical non-standard Poisson regression models with an emphasis on microlevel heterogeneity and macrolevel autocorrelation. For the former case, we confirm that negative binomial regression usually accounts for microlevel heterogeneity (overdispersion) satisfactorily; for the latter case, we apply the simple first-order Markov transition model to conveniently capture the macrolevel autocorrelation that often arises from temporal and/or spatial count data, rather than attaching complex random effects directly to the regression parameters. Specifically, we extend the hierarchical (multilevel) Poisson model to negative binomial models with macrolevel autocorrelation, using a restricted gamma mixture with unit mean and a Markov transition covariate created from the preceding residuals. We prove a mild sufficient condition for posterior propriety under a flat prior for the fixed effects of interest. Our methodology is illustrated by analyzing the Baltic Sea peracarid diurnal activity data published in the marine biology and ecology literature.
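As a rough frequentist sketch of the two ideas in this abstract (negative binomial regression for micro-level overdispersion, and a transition covariate built from preceding residuals for macro-level autocorrelation), the following Python snippet fits a Poisson baseline, forms a lagged-residual covariate, and refits a negative binomial model. The simulated data and the statsmodels calls are illustrative assumptions, not the paper's Bayesian hierarchical implementation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
x = rng.normal(size=n)

# simulate overdispersed counts: gamma mixing of the Poisson mean (illustrative only)
lam = np.exp(0.5 + 0.3 * x) * rng.gamma(shape=2.0, scale=0.5, size=n)
y = rng.poisson(lam)

# step 1: baseline Poisson fit, keep the Pearson residuals
X = sm.add_constant(x)
pois = sm.GLM(y, X, family=sm.families.Poisson()).fit()

# step 2: Markov-transition-style covariate built from the preceding residual
z = np.r_[0.0, pois.resid_pearson[:-1]]

# step 3: negative binomial refit including the lagged-residual covariate
X2 = sm.add_constant(np.column_stack([x, z]))
nb = sm.NegativeBinomial(y, X2).fit(disp=False)
print(nb.summary())
```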

2.
Modelling count data with overdispersion and spatial effects
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider models more flexible than the common Poisson model, allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distribution are addressed, where overdispersion is modelled by an additional model parameter. Further, zero-inflated models, in which overdispersion is assumed to be caused by an excessive number of zeros, are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure, which is modelled using a conditional autoregressive prior based on Pettitt et al. (Stat Comput 12(4):353–367, 2002). In an application the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. (J R Stat Soc B 64(4):583–640, 2002) and using proper scoring rules, see for example Gneiting and Raftery (Technical Report no. 463, University of Washington, 2004). We observe a rather high degree of overdispersion in the data, which is captured best by the GP model when spatial effects are neglected. While adding spatial effects to the models allowing for overdispersion gives no or only little improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.
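A simple non-Bayesian stand-in for the model comparison described here: fit Poisson, negative binomial and zero-inflated Poisson regressions to the same counts and compare AIC (instead of DIC); the generalized Poisson distribution and the spatial CAR random effects are not included, the data are simulated, and the statsmodels class names assume a recent statsmodels release.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = sm.add_constant(x)

# simulate zero-inflated, overdispersed counts (illustrative only)
lam = np.exp(0.3 + 0.5 * x) * rng.gamma(2.0, 0.5, n)
y = np.where(rng.uniform(size=n) < 0.3, 0, rng.poisson(lam))

fits = {
    "Poisson": sm.Poisson(y, X).fit(disp=False),
    "NegBin": sm.NegativeBinomial(y, X).fit(disp=False),
    "ZIP": ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1))).fit(disp=False),
}
for name, res in fits.items():
    print(f"{name:8s} AIC = {res.aic:.1f}")
```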

3.
Count data are routinely assumed to have a Poisson distribution, especially when there are no straightforward diagnostic procedures for checking this assumption. We reanalyse two data sets from crossover trials of treatments for angina pectoris, in which the outcomes are counts of anginal attacks. Standard analyses focus on treatment effects, averaged over subjects; we are also interested in the dispersion of these effects (treatment heterogeneity). We set up a log-Poisson model with random coefficients to estimate the distribution of the treatment effects and show that the analysis is very sensitive to the distributional assumption: the population variance of the treatment effects is confounded with the (variance) function that relates the conditional variance of the outcomes, given the subject's rate of attacks, to the conditional mean. Diagnostic model checks based on resampling from the fitted distribution indicate that the default choice of the Poisson distribution for the analysed data sets is poorly supported. We propose to augment the data sets with observations of the counts, possibly made outside the clinical setting, so that the conditional distribution of the counts can be established.
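A minimal version of the resampling-based model check mentioned here: fit the Poisson model, simulate replicate data sets from the fit, and compare an observed dispersion statistic with its resampling distribution. The data below are simulated stand-ins for the anginal-attack counts; nothing here reproduces the paper's random-coefficient analysis.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
# hypothetical attack counts; overdispersed draws play the role of the real data
y = rng.negative_binomial(5, 0.5, size=60)
X = np.ones((len(y), 1))

fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
mu = fit.mu

def pearson_dispersion(y, mu):
    # Pearson chi-square divided by degrees of freedom; ~1 under a correct Poisson model
    return np.sum((y - mu) ** 2 / mu) / (len(y) - 1)

obs = pearson_dispersion(y, mu)
# resample replicate data sets from the fitted Poisson model
reps = np.array([pearson_dispersion(rng.poisson(mu), mu) for _ in range(2000)])
print(f"observed dispersion {obs:.2f}, resampling p-value {np.mean(reps >= obs):.3f}")
```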

4.
Hall (2000) described zero-inflated Poisson and binomial regression models that include random effects to account for excess zeros and additional sources of heterogeneity in the data. The authors of the present paper propose a general score test for the null hypothesis that the variance components associated with these random effects are zero. For a zero-inflated Poisson model with random intercept, the new test reduces to an alternative to the overdispersion test of Ridout, Demétrio & Hinde (2001). The authors also examine their general test in the special case of the zero-inflated binomial model with random intercept and propose an overdispersion test in that context which is based on a beta-binomial alternative.
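The paper's general score test targets variance components in zero-inflated mixed models; as a simpler, classical point of reference (not the authors' test), here is a Dean-and-Lawless-type score test of a Poisson fit against an overdispersed alternative, with a small simulated usage example.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

def overdispersion_score_test(y, X):
    """Score test of the Poisson model against a negative-binomial-type alternative."""
    fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
    mu = fit.mu
    stat = np.sum((y - mu) ** 2 - y) / np.sqrt(2.0 * np.sum(mu ** 2))
    return stat, 1.0 - norm.cdf(stat)   # one-sided p-value; large values indicate overdispersion

# usage on simulated, gamma-mixed (overdispersed) counts
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = rng.poisson(np.exp(0.2 + 0.4 * x) * rng.gamma(2.0, 0.5, 300))
print(overdispersion_score_test(y, sm.add_constant(x)))
```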

5.
Frailty models can be fit as mixed-effects Poisson models after transforming time-to-event data to the Poisson model framework. We assess, through simulations, the robustness of Poisson likelihood estimation for Cox proportional hazards models with log-normal frailties under a misspecified frailty distribution. The log-gamma and Laplace distributions were used as the true distributions for the frailties on a natural log scale. Factors such as the magnitude of heterogeneity, the censoring rate, and the number and sizes of groups were explored. In the simulations, the Poisson modeling approach that assumes log-normally distributed frailties provided accurate estimates of within- and between-group fixed effects even under a misspecified frailty distribution. Non-robust estimation of variance components was observed in situations of substantial heterogeneity, large event rates, or high data dimensions.
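The transformation alluded to here is easiest to see in its simplest case: with a constant baseline hazard, the exponential proportional hazards likelihood is proportional to a Poisson likelihood for the event indicator with offset log(time), so adding a random intercept to that Poisson model yields a frailty model. The sketch below shows only the fixed-effects version of this equivalence on simulated data; the paper's piecewise baseline and log-normal frailties are not included.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
x = rng.binomial(1, 0.5, n)
rate = np.exp(-0.5 + 0.8 * x)              # true hazard, log-hazard ratio 0.8
t = rng.exponential(1.0 / rate)            # event times
c = rng.exponential(2.0, n)                # censoring times
time, event = np.minimum(t, c), (t <= c).astype(int)

# exponential PH likelihood == Poisson likelihood for the event indicator with offset log(time)
X = sm.add_constant(x)
fit = sm.GLM(event, X, family=sm.families.Poisson(), offset=np.log(time)).fit()
print(fit.params)   # slope should be close to 0.8; a frailty model would add a random intercept here
```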

6.
Count data with excess zeros are widely encountered in biomedical, medical, public health and social survey research, among other fields. Zero-inflated Poisson (ZIP) regression models with mixed effects are useful tools for analyzing such data, in which covariates are usually incorporated in the model to explain inter-subject variation and a normal distribution is assumed for both the random effects and the random errors. However, in many practical applications such assumptions may be violated, as the data often exhibit skewness and some covariates may be measured with measurement error. In this paper, we deal with these issues simultaneously by developing a Bayesian joint hierarchical modeling approach. Specifically, by treating the intercepts and slopes in the logistic and Poisson regressions as random, a flexible two-level ZIP regression model is proposed, in which a covariate process with measurement errors is established and a skew-t distribution is considered for both the random errors and the random effects. Under the Bayesian framework, model selection is carried out using the deviance information criterion (DIC), and a goodness-of-fit statistic is also developed for assessing the plausibility of the posited model. The main advantage of our method is that it allows for a more robust and accurate investigation of heterogeneity at different levels, while simultaneously accommodating the skewness and measurement errors. An application to the Shanghai Youth Fitness Survey is used as an illustrative example and shows that our approach is useful in practice.

7.
In this paper we consider spatial regression models for count data. We examine not only the Poisson distribution but also the generalized Poisson distribution, capable of modeling over-dispersion, the negative binomial, and the zero-inflated Poisson distribution, which allows for excess zeros, as possible response distributions. We add random spatial effects for modeling spatial dependency and develop and implement MCMC algorithms in R for Bayesian estimation. The corresponding R library 'spatcounts' is available on CRAN. In an application the presented models are used to analyze the number of benefits received per patient in a German private health insurance company. Since the deviance information criterion is only appropriate for exponential family models, we additionally use the Vuong and Clarke tests with a Schwarz correction to compare possibly non-nested models, and we illustrate how they can be used in a Bayesian context.
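A small helper showing how the Vuong test with a Schwarz (BIC) correction compares two possibly non-nested count models from their per-observation log-likelihoods; this is a generic sketch, not code from the 'spatcounts' package, and the argument names (ll1, ll2, k1, k2) are assumptions for illustration.

```python
import numpy as np
from scipy.stats import norm

def vuong_schwarz(ll1, ll2, k1, k2):
    """Vuong test for non-nested models with a Schwarz (BIC) correction.

    ll1, ll2 : per-observation log-likelihoods of the two fitted models
    k1, k2   : numbers of estimated parameters in each model
    Large positive values favour model 1, large negative values favour model 2.
    """
    ll1, ll2 = np.asarray(ll1), np.asarray(ll2)
    n = len(ll1)
    m = ll1 - ll2
    correction = (k1 - k2) * np.log(n) / (2.0 * n)        # Schwarz correction per observation
    z = np.sqrt(n) * (m.mean() - correction) / m.std(ddof=1)
    return z, 2.0 * (1.0 - norm.cdf(abs(z)))              # statistic and two-sided p-value
```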

8.
Data sets with excess zeroes are frequently analyzed in many disciplines. A common framework used to analyze such data is the zero-inflated (ZI) regression model, which mixes a degenerate distribution with point mass at zero with a non-degenerate distribution. The estimates from ZI models quantify the effects of covariates on the means of latent random variables, which are often not the quantities of primary interest. Recently, marginalized zero-inflated Poisson (MZIP; Long et al. [A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 33 (2014), pp. 5151–5165]) and negative binomial (MZINB; Preisser et al., 2016) models have been introduced that model the mean response directly. These models yield covariate effects with simple interpretations that are, for many applications, more appealing than those available from ZI regression. This paper outlines a general framework for marginalized zero-inflated models in which the latent distribution is a member of the exponential dispersion family, focusing on common distributions for count data. In particular, our discussion includes the marginalized zero-inflated binomial (MZIB) model, which has not been discussed previously. The details of maximum likelihood estimation via the EM algorithm are presented, and the properties of the estimators as well as Wald and likelihood-ratio-based inference are examined via simulation. Two examples illustrate the advantages of the MZIP, MZINB, and MZIB models for practical data analysis.
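The defining feature of the marginalized models described here is that the regression acts on the overall mean E[Y] rather than on the latent Poisson mean. The sketch below writes the MZIP log-likelihood under that parameterization and fits it by direct numerical maximization instead of the EM algorithm used in the paper; the design matrices X (mean part) and Z (zero part) are assumed to be supplied by the user.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln, expit

def mzip_negloglik(theta, y, X, Z):
    """Negative log-likelihood of a marginalized zero-inflated Poisson model (sketch)."""
    p = X.shape[1]
    beta, gamma = theta[:p], theta[p:]
    nu = np.exp(X @ beta)            # marginal mean E[Y], modelled directly
    psi = expit(Z @ gamma)           # structural-zero probability
    mu = nu / (1.0 - psi)            # latent Poisson mean implied by the marginal mean
    ll_zero = np.log(psi + (1.0 - psi) * np.exp(-mu))
    ll_pos = np.log(1.0 - psi) - mu + y * np.log(mu) - gammaln(y + 1)
    return -np.sum(np.where(y == 0, ll_zero, ll_pos))

# usage (y, X, Z assumed given): start at zero and maximize numerically
# theta0 = np.zeros(X.shape[1] + Z.shape[1])
# fit = minimize(mzip_negloglik, theta0, args=(y, X, Z), method="BFGS")
```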

9.
Frailty models are often used to model heterogeneity in survival analysis. The most common frailty model has an individual intensity which is the product of a random factor and a basic intensity common to all individuals. This paper uses the compound Poisson distribution as the random factor. It allows some individuals to be non-susceptible, which can be useful in many settings. In some diseases, one may suppose that a number of families have an increased susceptibility due to genetic circumstances. Then it is logical to use a frailty model in which individuals within each family share a common factor, while individuals in different families have different factors. This can be attained by randomizing the Poisson parameter of the compound Poisson distribution. To our knowledge, this is a new distribution. The power variance function distributions are used for the Poisson parameter. The resulting distributions are studied in some detail, with regard to both their form and various statistical properties. An application to infant mortality data from the Medical Birth Registry of Norway is included, where the model is compared to more traditional shared frailty models.

10.
Linear mixed models are widely used when multiple correlated measurements are made on each unit of interest. In many applications, the units may form several distinct clusters, and such heterogeneity can be more appropriately modelled by a finite mixture linear mixed model. The classical estimation approach, in which both the random effects and the error terms are assumed to follow a normal distribution, is sensitive to outliers, and failure to accommodate outliers may greatly jeopardize model estimation and inference. We propose a new mixture linear mixed model using the multivariate t distribution. For each mixture component, we assume the response and the random effects jointly follow a multivariate t distribution, which conveniently robustifies the estimation procedure. An efficient expectation conditional maximization algorithm is developed for maximum likelihood estimation. The degrees-of-freedom parameters of the t distributions are chosen data-adaptively, to achieve a flexible trade-off between estimation robustness and efficiency. Simulation studies and an application to longitudinal lung growth data showcase the efficacy of the proposed approach.

11.
In this paper, we consider the distribution of the life length of a series system with a random number of components, say Z. Taking the distribution of Z to be generalized Poisson, an exponential-generalized Poisson (EGP) distribution is developed. The generalized Poisson distribution is a generalization of the Poisson distribution with one extra parameter. The structural properties of the resulting distribution are presented and maximum likelihood estimation of the parameters is investigated. Extensive simulation studies are carried out to study the performance of the estimates. A score test is developed to test the importance of the extra parameter. For illustration, two real data sets are examined and it is shown that the EGP model presented here fits better than the exponential-Poisson distribution.

12.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has typically been considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed relative to the Poisson distribution, in which case the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model assumes that an observed zero may come either from a point-mass distribution at zero or from the negative binomial component. The likelihood function is used to compute not only Bayesian model selection measures, but also Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than their Poisson counterparts.

13.
孟生旺, 杨亮. 《统计研究》 (Statistical Research), 2015, 32(11): 97–103.
Claim frequency prediction is an important component of non-life insurance ratemaking. The most commonly used claim frequency models are Poisson regression and negative binomial regression, together with their zero-inflated counterparts. However, when the observed claim counts are both zero-inflated and dependent within groups, none of these models fits real data well. To address this, we build random-effects zero-inflated claim count regression models under the Poisson, negative binomial, generalized Poisson and NB-P (type-P negative binomial) distributions. To improve predictive performance, quadratic smoothing terms are introduced for continuous explanatory variables, and the proportion of structural zeros is regressed on the explanatory variables. An empirical analysis of a real claim count data set shows that the proposed models significantly improve the fit of existing models.
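As a rough illustration of two ingredients of this model, a quadratic term for a continuous rating variable and a structural-zero proportion that depends on covariates, the sketch below fits a zero-inflated Poisson regression with both features to simulated claim counts; the random effects and the NB-P variant of the paper are omitted, and the statsmodels class name assumes a recent release.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(7)
n = 1000
x = rng.uniform(-2, 2, n)                      # a continuous rating variable (illustrative)
lam = np.exp(0.2 + 0.4 * x - 0.3 * x**2)       # curved effect on claim frequency
p0 = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * x)))   # structural-zero proportion depends on x
y = np.where(rng.uniform(size=n) < p0, 0, rng.poisson(lam))

# quadratic term in the count part; the inflation part regresses the zero proportion on x
X = sm.add_constant(np.column_stack([x, x**2]))
fit = ZeroInflatedPoisson(y, X, exog_infl=sm.add_constant(x)).fit(disp=False)
print(fit.summary())
```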

14.
Inflated data and over-dispersion are two common problems when modeling count data with traditional Poisson regression models. In this study, we propose a latent class inflated Poisson (LCIP) regression model to address the unobserved heterogeneity that leads to inflation and over-dispersion. The performance of the model estimation is evaluated through simulation studies. We illustrate the usefulness of introducing a latent class variable by analyzing the Behavioral Risk Factor Surveillance System (BRFSS) data, which contain several excessive values and are characterized by over-dispersion. As a result, the proposed model displays a better fit than the standard Poisson regression and zero-inflated Poisson regression models for the inflated counts.
Keywords: inflated data, latent class, heterogeneity, Poisson regression, over-dispersion
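The simplest latent-class count model is a two-component Poisson mixture, which already captures the kind of unobserved heterogeneity the LCIP model targets (the inflation component of the paper is not modelled here). A short, self-contained EM fit as a sketch:

```python
import numpy as np
from scipy.stats import poisson

def poisson_mixture_em(y, n_iter=200):
    """EM algorithm for a two-component Poisson mixture (a minimal latent-class count model)."""
    y = np.asarray(y, dtype=float)
    pi = 0.5
    lam = np.array([0.5 * y.mean() + 0.1, 1.5 * y.mean() + 0.1])   # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs to class 1
        w1 = pi * poisson.pmf(y, lam[0])
        w2 = (1.0 - pi) * poisson.pmf(y, lam[1])
        r = w1 / (w1 + w2)
        # M-step: update the mixing weight and the class-specific means
        pi = r.mean()
        lam[0] = np.sum(r * y) / np.sum(r)
        lam[1] = np.sum((1.0 - r) * y) / np.sum(1.0 - r)
    return pi, lam

rng = np.random.default_rng(5)
y = np.r_[rng.poisson(0.5, 700), rng.poisson(6.0, 300)]   # two latent classes
print(poisson_mixture_em(y))
```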

15.

For non-negative integer-valued random variables, the concept of “damaged” observations was introduced for the first time by Rao and Rubin [Rao, C. R., Rubin, H. (1964). On a characterization of the Poisson distribution. Sankhya 26:295–298] in 1964, in a paper concerning the characterization of the Poisson distribution. In 1965, Rao [Rao, C. R. (1965). On discrete distributions arising out of methods of ascertainment. Sankhya Ser. A 27:311–324] discussed results related to inference for the parameters of a Poisson model when partial destruction of observations has occurred. A random variable is said to be damaged if it is unobservable, due to a damage mechanism which randomly reduces its magnitude. In subsequent years, considerable attention has been given to characterizations of the distributions of such random variables that satisfy the “Rao–Rubin” condition. This article presents some inference aspects of a damaged Poisson distribution, under the reasonable assumption that, when an observation on the random variable is made, it is also possible to determine whether or not some damage has occurred. In other words, we do not know how many items are damaged, but we can identify the existence of damage. In particular, we illustrate the situation in which the occurrence of damage can be identified although the number of damaged items cannot be determined. Maximum likelihood estimators of the underlying parameters and their asymptotic covariance matrix are obtained. The convergence of the parameter estimates to their asymptotic values is studied through Monte Carlo simulations.

16.
Typical joint modeling of longitudinal measurements and time-to-event data assumes that the two models share a common set of random effects with a normal distribution assumption. However, the underlying population from which the sample is drawn is sometimes heterogeneous, and detecting homogeneous subsamples of it is an important scientific question. In this paper, a finite mixture of normal distributions for the shared random effects is proposed to account for heterogeneity in the population. To detect whether unobserved heterogeneity exists, we use a simple graphical exploratory diagnostic tool proposed by Verbeke and Molenberghs [34] to assess whether the traditional normality assumption for the random effects in the mixed model is adequate. In the joint modeling setting, in the case of evidence against normality (homogeneity), a finite mixture of normals is used for the shared random-effects distribution. A Bayesian MCMC procedure is developed for parameter estimation and inference. The methodology is illustrated using simulation studies. The proposed approach is also applied to a real HIV data set; using the heterogeneous joint model for this data set, the individuals are classified into two groups: a group with high risk and a group with moderate risk.

17.
Consider repeated event-count data from a sequence of exposures, during each of which a subject can experience some number of events, which is reported at ‘visits’ following each exposure. Within-subject heterogeneity not accounted for by visit-varying covariates is called ‘visit-level’ heterogeneity. Using generalized linear mixed models with a log link for longitudinal Poisson regression, I model visit-level heterogeneity by cumulatively adding ‘disturbances’ to the random intercept of each subject over visits to create a ‘disturbed-random-intercept’ model. I also create a ‘disturbed-random-slope’ model, where the slope is over visits, and both the intercept and slope are random but only the slope is disturbed. Simulation studies compare fixed-effect estimation for these models in data with 15 visits, large visit-level heterogeneity, and large multiplicative overdispersion. These studies show statistically significant superiority of the disturbed-random-intercept model. Examples with epidemiological data compare the results of this model with those from other published models.
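A minimal simulation of the ‘disturbed-random-intercept’ idea: each subject has a random intercept to which independent visit-level disturbances are added cumulatively, so the random effect follows a random walk over visits and the visit-level heterogeneity grows with visit number. The parameter values below are arbitrary, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_subjects, n_visits = 200, 15

b = rng.normal(0.0, 0.5, size=(n_subjects, 1))                    # subject-level random intercept
d = rng.normal(0.0, 0.2, size=(n_subjects, n_visits)).cumsum(1)   # cumulative visit-level disturbances
eta = 0.3 + b + d                                                 # linear predictor on the log scale
y = rng.poisson(np.exp(eta))                                      # longitudinal Poisson counts

# visit-level heterogeneity: the between-subject variance of the counts grows over visits
print(np.var(y, axis=0).round(2))
```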

18.
In this paper, we consider a mixed compound Poisson process, that is, a random sum of independent and identically distributed (i.i.d.) random variables where the number of terms is a Poisson process with random intensity. We study nonparametric estimators of the jump density by specific deconvolution methods. First, assuming that the random intensity has an exponential distribution with unknown expectation, we propose two types of estimators based on the observation of an i.i.d. sample. Risk bounds and adaptive procedures are provided. Then, with no assumption on the distribution of the random intensity, we propose two nonparametric estimators of the jump density based on the joint observation of the number of jumps and the random sum of jumps. Risk bounds are provided, leading to unusual rates for one of the two estimators. The methods are implemented and compared via simulations.

19.
Latent variable models are widely used for the joint modeling of mixed data, including nominal, ordinal, count and continuous data. In this paper, we consider a latent variable model for jointly modeling relationships between mixed binary, count and continuous variables with some observed covariates. We assume that, given a latent variable, the mixed variables of interest are independent, and that the count and continuous variables have Poisson and normal distributions, respectively. As such data may be extracted from different subpopulations, unobserved heterogeneity has to be taken into account; a mixture distribution is therefore considered for the distribution of the latent variable to account for this heterogeneity. The generalized EM algorithm, which uses the Newton–Raphson algorithm inside the EM algorithm, is used to compute the maximum likelihood estimates of the parameters. The standard errors of the maximum likelihood estimates are computed using the supplemented EM algorithm. An analysis of the primary biliary cirrhosis data is presented as an application of the proposed model.

20.
In life-testing and survival analysis, the components are sometimes arranged in a series or parallel system and the number of components is initially unknown. Thus, the number of components, say Z, is considered random with an appropriate probability mass function. In this paper, we model the survival data with a Weibull baseline distribution and a generalized Poisson distribution for Z, giving rise to a four-parameter model that can accommodate increasing, decreasing, bathtub and upside-down bathtub failure rates. Two examples are provided and maximum likelihood estimation of the parameters is studied. Rao's score test is developed to compare the results with the exponential-Poisson model studied by Kus [17] and with the exponential-generalized Poisson distribution, which has an exponential baseline distribution and a generalized Poisson distribution for Z. Simulation studies are carried out to examine the performance of the estimates.
