Similar literature
20 similar documents found
1.
This paper is concerned with the analysis of repeated measures count data overdispersed relative to a Poisson distribution, with the overdispersion possibly heterogeneous. To accommodate the overdispersion, the Poisson random variable is compounded with a gamma random variable, and both the mean of the Poisson and the variance of the gamma are modelled using log linear models. Maximum likelihood estimates (MLE) are then obtained. The paper also gives extended quasi-likelihood estimates for a more general class of compounding distributions which are shown to be approximations to the MLEs obtained for the gamma case. The theory is illustrated by modelling the determination of asbestos fibre intensity on membrane filters mounted on microscope slides.
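The compounding step described in this abstract can be sketched as a short derivation. The parameterization below (a gamma multiplier G with mean 1 and variance ν, and the symbols μ, ν, β, γ, x, z) is an assumption chosen for illustration; the abstract does not state the paper's own notation.

```latex
% Assumed setup: Y | G ~ Poisson(\mu G), with G ~ Gamma, E[G] = 1, Var(G) = \nu.
\begin{align*}
  \mathrm{E}[Y]   &= \mathrm{E}\bigl[\mathrm{E}[Y \mid G]\bigr] = \mu\,\mathrm{E}[G] = \mu, \\
  \mathrm{Var}(Y) &= \mathrm{E}\bigl[\mathrm{Var}(Y \mid G)\bigr]
                     + \mathrm{Var}\bigl(\mathrm{E}[Y \mid G]\bigr)
                   = \mu + \nu\mu^{2}, \\
  \log \mu &= x^{\top}\beta, \qquad \log \nu = z^{\top}\gamma .
\end{align*}
```

Under these assumptions the variance exceeds the Poisson variance μ whenever ν > 0, and letting ν depend on covariates through its own log-linear model is one way to allow heterogeneous overdispersion.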

2.
Modelling count data with overdispersion and spatial effects   Total citations: 1, self-citations: 1, citations by others: 0
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distribution are addressed where overdispersion is modelled by an additional model parameter. Further, zero-inflated models in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure which is modelled using a conditional autoregressive prior based on Pettitt et al. in Stat Comput 12(4):353–367, (2002). In an application the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. in J R Stat Soc B64(4):583–640, (2002) and using proper scoring rules, see for example Gneiting and Raftery in Technical Report no. 463, University of Washington, (2004). We observe a rather high degree of overdispersion in the data which is captured best by the GP model when spatial effects are neglected. While the addition of spatial effects to the models allowing for overdispersion gives no or only little improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.

3.
Summary.  For rare diseases the observed disease count may exhibit extra Poisson variability, particularly in areas with low or sparse populations. Hence the variance of the estimates of disease risk, the standardized mortality ratios, may be highly unstable. This overdispersion must be taken into account otherwise subsequent maps based on standardized mortality ratios will be misleading and, rather than displaying the true spatial pattern of disease risk, the most extreme values will be highlighted. Neighbouring areas tend to exhibit spatial correlation as they may share more similarities than non-neighbouring areas. The need to address overdispersion and spatial correlation has led to the proposal of Bayesian approaches for smoothing estimates of disease risk. We propose a new model for investigating the spatial variation of disease risks in conjunction with an alternative specification for estimates of disease risk in geographical areas: the multivariate Poisson–gamma model. The main advantages of this new model lie in its simplicity and ability to account naturally for overdispersion and spatial auto-correlation. Exact expressions for important quantities such as expectations, variances and covariances can be easily derived.

4.
We describe a class of random field models for geostatistical count data based on Gaussian copulas. Unlike hierarchical Poisson models often used to describe this type of data, Gaussian copula models allow a more direct modelling of the marginal distributions and association structure of the count data. We study in detail the correlation structure of these random fields when the family of marginal distributions is either negative binomial or zero‐inflated Poisson; these represent two types of overdispersion often encountered in geostatistical count data. We also contrast the correlation structure of one of these Gaussian copula models with that of a hierarchical Poisson model having the same family of marginal distributions, and show that the former is more flexible than the latter in terms of range of feasible correlation, sensitivity to the mean function and modelling of isotropy. An exploratory analysis of a dataset of Japanese beetle larvae counts illustrates some of the findings. All of these investigations show that Gaussian copula models are useful alternatives to hierarchical Poisson models, especially for geostatistical count data that display substantial correlation and small overdispersion.

5.
The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on ZINB model for use in ZIP regression models have been developed. Due to similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternate model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates the proposed score statistics are preferred to other tests for higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis.

6.
"This paper considers parametric graduation for mortality, fertility and migration with particular reference to the development of parameterized local and regional demographic projections. Parametric graduations facilitate comparisons of demographic schedules across many areas and across time points--a feature which can be used to advantage in making forecasts of the three demographic components and thus in setting the assumptions for projections. Particular methodological issues raised are the questions of parsimony in fit and...of overdispersion in relation to binomial or Poisson assumptions. The analysis is illustrated with cross-sectional material for the 32 London boroughs and with time series at the level of Greater London."

7.
This article is about the statistical analysis of overdispersed paired count data for comparing two treatments. The data consist of the number of events obtained in a stratum during the fixed observation period. Three types of model are discussed: the Poisson, a mixed, and a semiparametric model. Overdispersion is represented in the last two models but not in the Poisson model. Of particular interest is whether there is any loss of efficiency in using the estimate of the treatment effect obtained under the other two models if the mixed model is true, and also whether overdispersion leads to a larger variance of the estimate than that expected from the Poisson model. It is shown that all three models provide the same estimate of the treatment effect (i.e., there is no loss of efficiency) and that the variance of the estimate of the treatment effect obtained under the Poisson model is the same as that based on the mixed model. However, the semiparametric model yields a larger variance of the estimate than those obtained under the other two models.

8.
Event counts are response variables with non-negative integer values representing the number of times that an event occurs within a fixed domain such as a time interval, a geographical area or a cell of a contingency table. Analysis of counts by Gaussian regression models ignores the discreteness, asymmetry and heteroscedasticity and is inefficient, providing unrealistic standard errors or possibly negative predictions of the expected number of events. The Poisson regression is the standard model for count data with underlying assumptions on the generating process which may be implausible in many applications. Statisticians have long recognized the limitation of imposing equidispersion under the Poisson regression model. A typical situation is when the conditional variance exceeds the conditional mean, in which case models allowing for overdispersion are routinely used. Less reported is the case of underdispersion with fewer modeling alternatives and assessments available in the literature. One such alternative, the Gamma-count model, is adopted here in the analysis of an agronomic experiment designed to investigate the effect of levels of defoliation on different phenological states upon the number of cotton bolls. Data set and code for analysis are available as online supplements. Results show improvements over the Poisson model and the semi-parametric quasi-Poisson model in capturing the observed variability in the data. Estimating rather than assuming the underlying variance process leads to important insights into the process.

9.
Count data often display more zero outcomes than are expected under the Poisson regression model. The zero-inflated Poisson regression model has been suggested to handle zero-inflated data, whereas the zero-inflated negative binomial (ZINB) regression model has been fitted for zero-inflated data with additional overdispersion. For bivariate and zero-inflated cases, several regression models such as the bivariate zero-inflated Poisson (BZIP) and bivariate zero-inflated negative binomial (BZINB) have been considered. This paper introduces several forms of nested BZINB regression model which can be fitted to bivariate and zero-inflated count data. The mean–variance approach is used for comparing the BZIP and our forms of BZINB regression model in this study. A similar approach was also used by past researchers for defining several negative binomial and zero-inflated negative binomial regression models based on the appearance of linear and quadratic terms of the variance function. The nested BZINB regression models proposed in this study have several advantages: the likelihood ratio tests can be performed for choosing the best model, the models have flexible forms of marginal mean–variance relationship, the models can be fitted to bivariate zero-inflated count data with positive or negative correlations, and the models allow additional overdispersion of the two dependent variables.

10.
Hall (2000) has described zero‐inflated Poisson and binomial regression models that include random effects to account for excess zeros and additional sources of heterogeneity in the data. The authors of the present paper propose a general score test for the null hypothesis that variance components associated with these random effects are zero. For a zero‐inflated Poisson model with random intercept, the new test reduces to an alternative to the overdispersion test of Ridout, Demétrio & Hinde (2001). The authors also examine their general test in the special case of the zero‐inflated binomial model with random intercept and propose an overdispersion test in that context which is based on a beta‐binomial alternative.

11.
This paper presents results from a simulation study motivated by a recent study of the relationships between ambient levels of air pollution and human health in the community of Prince George, British Columbia. The simulation study was designed to evaluate the performance of methods based on overdispersed Poisson regression models for the analysis of series of count data. Aspects addressed include estimation of the dispersion parameter, estimation of regression coefficients and their standard errors, and the performance of model selection tests. The effects of varying amounts of overdispersion and differing underlying variance structure on this performance were of particular interest. This study is related to work reported by Breslow (1990) although the context is quite different. Preliminary work led to the conclusion that estimation of the dispersion parameter should be based on Pearson's chi-square statistic rather than the Poisson deviance. Regression coefficients are well estimated, even in the presence of substantial overdispersion and when the model for the variance function is incorrectly specified. Despite potential greater variability, the empirical estimator of the covariance matrix is preferred because the model-based estimator is unreliable in general. When the model for the variance function is incorrect, model-based test statistics may perform poorly, in sharp contrast to empirical test statistics, which performed very well in this study.
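The Pearson-based dispersion estimate mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual Poisson variance function V(μ) = μ and takes fitted means as given; the function name `pearson_dispersion` and its arguments are hypothetical, not taken from the paper.

```python
def pearson_dispersion(y, mu, n_params):
    """Estimate the dispersion parameter from Pearson's chi-square statistic.

    Assumes the Poisson variance function V(mu) = mu, so the estimate is
    sum((y_i - mu_i)^2 / mu_i) / (n - p), where p is the number of fitted
    regression parameters. `mu` holds fitted means from some earlier fit.
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    dof = len(y) - n_params  # residual degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / dof


# Toy data: three counts, intercept-only fit (fitted mean equals the sample mean 2).
phi_hat = pearson_dispersion([0, 2, 4], [2.0, 2.0, 2.0], n_params=1)
print(phi_hat)
```

A value near 1 is consistent with Poisson variation; values well above 1 suggest overdispersion.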

12.
Overdispersion is a common phenomenon in Poisson modeling. The generalized Poisson (GP) regression model accommodates both overdispersion and underdispersion in count data modeling, and is an increasingly popular platform for modeling overdispersed count data. The Poisson model is one of the special cases in the collection of models which may be specified by GP regression. Thus, we may derive a test of overdispersion which compares the equi-dispersion Poisson model within the context of the more general GP regression model. The score test has an advantage over the likelihood ratio test (LRT) and over the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis (the Poisson model). Herein, we propose a score test for overdispersion based on the GP model (specifically the GP-2 model) and compare the power of the test with the LRT and Wald tests. A simulation study indicates the proposed score test based on asymptotic standard normal distribution is more appropriate in practical applications.
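To show why a score test needs only the Poisson (null) fit, here is a minimal sketch. The abstract does not give the GP-2 statistic, so the code substitutes the widely used score statistic for Poisson against negative binomial (NB-2) alternatives, T = Σ[(y − μ)² − y] / sqrt(2 Σ μ²); that substitution, the function name and the one-sided normal p-value are assumptions made for illustration.

```python
import math


def overdispersion_score_test(y, mu):
    """Score statistic for overdispersion, computed from a Poisson fit only.

    Stand-in statistic (NB-2 alternative, not the paper's GP-2 derivation):
        T = sum((y_i - mu_i)^2 - y_i) / sqrt(2 * sum(mu_i^2))
    `mu` holds fitted means under the Poisson null. Returns (T, p), where p is
    the one-sided upper-tail p-value from the standard normal approximation.
    """
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    numerator = sum((yi - mi) ** 2 - yi for yi, mi in zip(y, mu))
    denominator = math.sqrt(2.0 * sum(mi ** 2 for mi in mu))
    t = numerator / denominator
    p_value = 0.5 * math.erfc(t / math.sqrt(2.0))  # P(Z > t)
    return t, p_value


# Toy data with fitted means from an intercept-only Poisson fit.
t_stat, p = overdispersion_score_test([0, 2, 4], [2.0, 2.0, 2.0])
print(t_stat, p)
```

No model with an extra dispersion parameter is fitted at any point, which is the practical advantage over the LRT and Wald tests that the abstract describes.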

13.
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are Poisson and negative binomial. However, a Poisson distribution can only handle equidispersed data and a negative binomial distribution can only cope with overdispersion. A Conway–Maxwell–Poisson (CMP) distribution [4], by contrast, can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike the Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via a Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.

14.
The complex triparametric Pearson (CTP) distribution is a flexible model belonging to the Gaussian hypergeometric family that can account for over- and underdispersion. However, despite its good properties, not much attention has been paid to it. We therefore revive the CTP, comparing it with some well-known distributions that cope with overdispersion (negative binomial, generalized Poisson and univariate generalized Waring) as well as underdispersion (Conway–Maxwell–Poisson (CMP) and hyper-Poisson (HP)). We conduct a simulation study that reveals the performance of the CTP and shows that it has its own space among count data models. In this sense, we also explore some overdispersed datasets which seem to be more appropriately modelled by the CTP than by other usual models. Moreover, we include two underdispersed examples to illustrate that the CTP can provide similar fits to the CMP or HP (sometimes even more accurate) without the computational problems of these models.

15.
This paper proposes a simple and flexible count data regression model which is able to incorporate overdispersion (the variance is greater than the mean) and which can be considered a competitor to the Poisson model. As is well known, this classical model imposes the restriction that the conditional mean of each count variable must equal the conditional variance. Nevertheless, for the common case of well-dispersed counts the Poisson regression may not be appropriate, while the count regression model proposed here is potentially useful. We consider an application to model counts of medical care utilization by the elderly in the USA using a well-known data set from the National Medical Expenditure Survey (1987), where the dependent variable is the number of stays after hospital admission, and where 10 explanatory variables are analysed.

16.
Summary. We propose modelling short-term pollutant exposure effects on health by using dynamic generalized linear models. The time series of count data are modelled by a Poisson distribution having mean driven by a latent Markov process; estimation is performed by the extended Kalman filter and smoother. This modelling strategy allows us to take into account possible overdispersion and time-varying effects of the covariates. These ideas are illustrated by reanalysing data on the relationship between daily non-accidental deaths and air pollution in the city of Birmingham, Alabama.

17.
This article derives score tests for extra-Poisson variation in the positive or truncated-at-zero Poisson regression model against truncated-at-zero negative binomial family alternatives. It also develops size-corrected tests of overdispersion that are expected to improve their small-sample properties. Further, small-sample performance of the tests is investigated by means of Monte Carlo experiments. As an illustration, the proposed tests are applied to a model of strikes in U.S. manufacturing. The proposed tests have an interpretation as conditional moment tests and require only the positive Poisson model to be estimated. It is shown that most of the tests for overdispersion in the regular Poisson model given in the econometric and statistical literature can be obtained as special cases of the tests developed in this article. Monte Carlo experiments indicate that the size correction, based on the asymptotic expansions of the score function, is effective in improving the accuracy of the size and power of the tests in small samples.

18.
Two ways of modelling overdispersion in non-normal data   Total citations: 2, self-citations: 0, citations by others: 2
For non-normal data assumed to have distributions, such as the Poisson distribution, which have an a priori dispersion parameter, there are two ways of modelling overdispersion: by a quasi-likelihood approach or with a random-effect model. The two approaches yield different variance functions for the response, which may be distinguishable if adequate data are available. The epilepsy data of Thall and Vail and the fabric data of Bissell are used to exemplify the ideas.

19.
20.
Overdispersion due to a large proportion of zero observations in data sets is a common occurrence in applications across many fields of research; we consider such scenarios in count panel (longitudinal) data. A well-known and widely implemented technique for handling such data is that of random effects modeling, which addresses the serial correlation inherent in panel data, as well as overdispersion. To deal with the excess zeros, a zero-inflated Poisson distribution has come to be canonical, which relaxes the equal mean-variance specification of a traditional Poisson model and allows for the larger variance characteristic of overdispersed data. A natural proposal then to approach count panel data with overdispersion due to excess zeros is to combine these two methodologies, deriving a likelihood from the resulting conditional probability. In performing simulation studies, we find that this approach in fact poses problems of identifiability. In this article, we construct and explain in full detail why a model obtained from the marriage of two classical and well-established techniques is unidentifiable and provide results of simulation studies demonstrating this effect. A discussion on alternative methodologies to resolve the problem is provided in the conclusion.
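How a zero-inflated Poisson relaxes the equal mean-variance restriction can be checked with a small sketch. It assumes the standard ZIP mixture (a structural zero with probability π, otherwise a Poisson(λ) draw) with no random effects; the function name `zip_moments` and its parameters are hypothetical.

```python
def zip_moments(pi_zero, lam):
    """Mean and variance of a zero-inflated Poisson variable.

    Assumed mixture: Y = 0 with probability pi_zero, otherwise Y ~ Poisson(lam).
    Then E[Y] = (1 - pi_zero) * lam and Var(Y) = E[Y] * (1 + pi_zero * lam),
    so the variance exceeds the mean whenever pi_zero > 0 and lam > 0.
    """
    if not 0.0 <= pi_zero < 1.0:
        raise ValueError("pi_zero must lie in [0, 1)")
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    mean = (1.0 - pi_zero) * lam
    variance = mean * (1.0 + pi_zero * lam)
    return mean, variance


# Half the observations are structural zeros; the Poisson part has rate 2.
mean, variance = zip_moments(0.5, 2.0)
print(mean, variance)
```

With π = 0 the function returns equal mean and variance, which recovers the ordinary Poisson case.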
