Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
In count data models, overdispersion of the dependent variable can be incorporated into the model by adding a heterogeneity term to the mean parameter of the Poisson distribution. We use a nonparametric estimator of the heterogeneity density based on a squared Kth-order polynomial expansion, which we generalize to panel data. A numerical illustration using an insurance dataset is discussed. Although some statistical analyses showed no clear differences between these new models and the standard Poisson model with gamma random effects, we show that the choice of the random effects distribution has a significant influence on the interpretation of our results.
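
As an illustration of the mechanism this abstract describes, the sketch below multiplies a Poisson mean by a gamma heterogeneity term with unit mean and checks that the resulting counts have variance above their mean. It is not the paper's nonparametric estimator; the function names, the values `mu=3.0` and `shape=2.0`, and the use of Knuth's sampler are all assumptions made for this example.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the small-to-moderate means used in this illustration.
    """
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n, mu, shape, seed=0):
    """Simulate counts whose Poisson mean is mu times a gamma heterogeneity term."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        # Gamma(shape, scale=1/shape) has mean 1 and variance 1/shape.
        heterogeneity = rng.gammavariate(shape, 1.0 / shape)
        counts.append(poisson_draw(rng, mu * heterogeneity))
    return counts


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, variance


counts = simulate_counts(n=20000, mu=3.0, shape=2.0)
sample_mean, sample_var = mean_and_variance(counts)
# Marginal (negative binomial) variance: mu + mu**2 / shape.
theoretical_var = 3.0 + 3.0 ** 2 / 2.0
print(sample_mean, sample_var, theoretical_var)
```

With a gamma heterogeneity term the marginal distribution is negative binomial, so the sample variance should sit near `mu + mu**2 / shape` rather than near `mu`.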

2.
Hall (2000) has described zero-inflated Poisson and binomial regression models that include random effects to account for excess zeros and additional sources of heterogeneity in the data. The authors of the present paper propose a general score test for the null hypothesis that variance components associated with these random effects are zero. For a zero-inflated Poisson model with random intercept, the new test reduces to an alternative to the overdispersion test of Ridout, Demétrio & Hinde (2001). The authors also examine their general test in the special case of the zero-inflated binomial model with random intercept and propose an overdispersion test in that context which is based on a beta-binomial alternative.

3.
This paper concerns maximum likelihood estimation for the semiparametric shared gamma frailty model; that is, the Cox proportional hazards model with the hazard function multiplied by a gamma random variable with mean 1 and variance θ. A hybrid ML-EM algorithm is applied to 26 400 simulated samples of 400 to 8000 observations with Weibull hazards. The hybrid algorithm is much faster than the standard EM algorithm, faster than standard direct maximum likelihood (ML, Newton–Raphson) for large samples, and gives almost identical results to the penalised likelihood method in S-PLUS 2000. When the true value θ0 of θ is zero, the estimates of θ are asymptotically distributed as a 50–50 mixture between a point mass at zero and a normal random variable on the positive axis. When θ0 > 0, the asymptotic distribution is normal. However, for small samples, simulations suggest that the estimates of θ are approximately distributed as an x–(100 − x)% mixture, 0 ≤ x ≤ 50, between a point mass at zero and a normal random variable on the positive axis even for θ0 > 0. In light of this, p-values and confidence intervals need to be adjusted accordingly. We indicate an approximate method for carrying out the adjustment.
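
To make the boundary adjustment concrete, here is a minimal sketch of the textbook correction for testing θ = 0: because the null value lies on the boundary of the parameter space, the likelihood ratio statistic is referred to a 50–50 mixture of a point mass at zero and a chi-squared distribution with one degree of freedom. This is the standard large-sample rule, not the small-sample adjustment the abstract alludes to, and the function names are hypothetical.

```python
import math


def chi2_one_df_sf(x):
    """Survival function P(X > x) of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_p_value(lr_statistic):
    """P-value for H0: theta = 0 under the 50-50 mixture of chi2_0 and chi2_1."""
    if lr_statistic <= 0.0:
        # The point mass at zero: no evidence against H0.
        return 1.0
    return 0.5 * chi2_one_df_sf(lr_statistic)


naive_p = chi2_one_df_sf(2.7055)
adjusted_p = boundary_p_value(2.7055)
print(naive_p, adjusted_p)
```

The adjusted p-value is half the naive one, so ignoring the boundary makes the test conservative.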

4.
Non-Gaussian outcomes are often modeled using members of the so-called exponential family. The Poisson model for count data falls within this tradition. The family in general, and the Poisson model in particular, are convenient because they are mathematically elegant, yet in need of extension because they are often somewhat restrictive. Two of the main rationales for existing extensions are (1) the occurrence of overdispersion, in the sense that the variability in the data is not adequately captured by the model's prescribed mean-variance link, and (2) the accommodation of data hierarchies owing to, for example, repeatedly measuring the outcome on the same subject, recording information from various members of the same family, etc. There is a variety of overdispersion models for count data, such as the negative-binomial model. Hierarchies are often accommodated through the inclusion of subject-specific random effects. Though not always, one conventionally assumes such random effects to be normally distributed. While both of these issues may occur simultaneously, models accommodating them at once are less than common. This paper proposes a generalized linear model accommodating overdispersion and clustering through two separate sets of random effects, of gamma and normal type, respectively. This is in line with the proposal by Booth et al. (Stat Model 3:179-181, 2003). The model extends both classical overdispersion models for count data (Breslow, Appl Stat 33:38-44, 1984), in particular the negative binomial model, as well as the generalized linear mixed model (Breslow and Clayton, J Am Stat Assoc 88:9-25, 1993). Apart from model formulation, we briefly discuss several estimation options, and then settle for maximum likelihood estimation with both fully analytic integration and a hybrid of analytic and numerical integration. The latter is implemented in the SAS procedure NLMIXED. The methodology is applied to data from a study of epileptic seizures.

5.
It is common to fit generalized linear models with binomial and Poisson responses, where the data show a variability that is greater than the theoretical variability assumed by the model. This phenomenon, known as overdispersion, may spoil inferences about the model by declaring significant the parameters of variables that have no real effect on the dependent variable. This paper explains some methods to detect overdispersion and presents and evaluates three well-known methodologies that have shown their usefulness in correcting this problem, using random mean models, quasi-likelihood methods and a double exponential family. In addition, it proposes some new Bayesian model extensions that have proved their usefulness in correcting the overdispersion problem. Finally, using the information provided by the National Demographic and Health Survey 2005, the departmental factors that have an influence on the mortality of children under 5 years and female postnatal period screening are determined. Based on the results, extensions that generalize some of the aforementioned models are also proposed, and their use is motivated by the data set under study. The results indicate that the proposed overdispersion models provide a better statistical fit of the data.

6.
Summary.  For rare diseases the observed disease count may exhibit extra-Poisson variability, particularly in areas with low or sparse populations. Hence the variance of the estimates of disease risk, the standardized mortality ratios, may be highly unstable. This overdispersion must be taken into account; otherwise subsequent maps based on standardized mortality ratios will be misleading and, rather than displaying the true spatial pattern of disease risk, will highlight the most extreme values. Neighbouring areas tend to exhibit spatial correlation as they may share more similarities than non-neighbouring areas. The need to address overdispersion and spatial correlation has led to the proposal of Bayesian approaches for smoothing estimates of disease risk. We propose a new model for investigating the spatial variation of disease risks in conjunction with an alternative specification for estimates of disease risk in geographical areas: the multivariate Poisson–gamma model. The main advantages of this new model lie in its simplicity and ability to account naturally for overdispersion and spatial auto-correlation. Exact expressions for important quantities such as expectations, variances and covariances can be easily derived.

7.
Modelling count data with overdispersion and spatial effects   (Total citations: 1; self-citations: 1; citations by others: 0)
In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distribution are addressed where overdispersion is modelled by an additional model parameter. Further, zero-inflated models in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure which is modelled using a conditional autoregressive prior based on Pettitt et al. in Stat Comput 12(4):353–367, (2002). In an application the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. in J R Stat Soc B64(4):583–640, (2002) and using proper scoring rules, see for example Gneiting and Raftery in Technical Report no. 463, University of Washington, (2004). We observe a rather high degree of overdispersion in the data which is captured best by the GP model when spatial effects are neglected. While the addition of spatial effects to the models allowing for overdispersion gives no or only little improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.

8.
Global regression assumes that a single model adequately describes all parts of a study region. However, the heterogeneity in the data may be sufficiently strong that relationships between variables cannot be spatially constant. In addition, the factors involved are often sufficiently complex that it is difficult to identify them in the form of explanatory variables. As a result, Geographically Weighted Regression (GWR) was introduced as a tool for the modeling of non-stationary spatial data. Using kernel functions, the GWR methodology allows the model parameters to vary spatially and produces non-parametric surfaces of their estimates. To model count data with overdispersion, it is more appropriate to use a negative binomial distribution instead of a Poisson distribution. Therefore, we propose the Geographically Weighted Negative Binomial Regression (GWNBR) method for the modeling of data with overdispersion. The results obtained using simulated and real data show the superiority of this method for the modeling of non-stationary count data with overdispersion compared with competing models, such as global regressions (e.g., Poisson and negative binomial) and Geographically Weighted Poisson Regression (GWPR). Moreover, we illustrate that these competing models are special cases of the more robust model GWNBR.

9.
Overdispersion is a common phenomenon in count data and is usually treated with the negative binomial model. This paper shows that measurement errors in covariates in general also lead to overdispersion in the observed data if the true data generating process is indeed the Poisson regression. This kind of overdispersion cannot be treated using the negative binomial model, as otherwise biases will occur. To provide consistent estimates, we propose a new type of corrected score estimator assuming that the distribution of the latent variables is known. The consistency and asymptotic normality of the proposed estimator are established. Simulation results show that this estimator has good finite sample performance. We also illustrate that the Akaike information criterion and Bayesian information criterion work well for selecting the correct model if the true model is the errors-in-variables Poisson regression.

10.
Methods for modelling overdispersed data are compared. These methods are considered to be of two kinds: a likelihood based approach and a method-of-moments based approach. The likelihood method facilitates computation of maximum likelihood estimates which can be obtained through the same algorithm as that of weighted least squares. The quasi-likelihood or moment approaches seem to be appropriate when severe overdispersion may be present. The comparisons are made via analyses of the Ames Salmonella Reverse Mutagenicity Assay (Margolin et al., 1981) and a seed dataset (Crowder, 1978).

11.
Estimation in Semiparametric Marginal Shared Gamma Frailty Models   (Total citations: 1; self-citations: 0; citations by others: 1)
The semiparametric marginal shared frailty models in survival analysis have the non-parametric hazard functions multiplied by a random frailty in each cluster, and the survival times conditional on frailties are assumed to be independent. In addition, the marginal hazard functions have the same form as in the usual Cox proportional hazard models. In this paper, an approach based on maximum likelihood and expectation-maximization is applied to semiparametric marginal shared gamma frailty models, where the frailties are assumed to be gamma distributed with mean 1 and variance θ. The estimates of the fixed-effect parameters and their standard errors obtained using this approach are compared in terms of both bias and efficiency with those obtained using the extended marginal approach. Similarly, the standard errors of our frailty variance estimates are found to compare favourably with those obtained using other methods. The asymptotic distribution of the frailty variance estimates is shown to be a 50–50 mixture of a point mass at zero and a truncated normal random variable on the positive axis for θ0 = 0. Simulations demonstrate that, for θ0 > 0, it is approximately an x–(100 − x)%, 0 ≤ x ≤ 50, mixture between a point mass at zero and a truncated normal random variable on the positive axis for small samples and small values of θ0; otherwise, it is approximately normal.

12.
We describe a class of random field models for geostatistical count data based on Gaussian copulas. Unlike hierarchical Poisson models often used to describe this type of data, Gaussian copula models allow a more direct modelling of the marginal distributions and association structure of the count data. We study in detail the correlation structure of these random fields when the family of marginal distributions is either negative binomial or zero-inflated Poisson; these represent two types of overdispersion often encountered in geostatistical count data. We also contrast the correlation structure of one of these Gaussian copula models with that of a hierarchical Poisson model having the same family of marginal distributions, and show that the former is more flexible than the latter in terms of range of feasible correlation, sensitivity to the mean function and modelling of isotropy. An exploratory analysis of a dataset of Japanese beetle larvae counts illustrates some of the findings. All of these investigations show that Gaussian copula models are useful alternatives to hierarchical Poisson models, especially for geostatistical count data that display substantial correlation and small overdispersion.

13.
We review Bayesian analysis of hierarchical non-standard Poisson regression models with an emphasis on microlevel heterogeneity and macrolevel autocorrelation. For the former case, we confirm that negative binomial regression usually accounts for microlevel heterogeneity (overdispersion) satisfactorily; for the latter case, we apply the simple first-order Markov transition model to conveniently capture the macrolevel autocorrelation which often arises from temporal and/or spatial count data, rather than attaching complex random effects directly to the regression parameters. Specifically, we extend the hierarchical (multilevel) Poisson model into negative binomial models with macrolevel autocorrelation using a restricted gamma mixture with unit mean and a Markov transition covariate created from preceding residuals. We prove a mild sufficient condition for posterior propriety under a flat prior for the fixed effects of interest. Our methodology is implemented by analyzing the Baltic Sea peracarids diurnal activity data published in the marine biology and ecology literature.

14.
The identification of seasonality and trend patterns of the weekly number of hospitalizations may be useful to plan the structure of health care and the vaccination calendar. A generalized additive model with the negative binomial distribution and a generalized additive model with autoregressive terms (GAMAR) and Poisson distribution are fitted including seasonal parameters and nonlinear trend using splines. The GAMAR includes autoregressive terms to take into account the serial correlation, yielding correct standard errors and reducing overdispersion. For the number of hospitalizations of people older than 60 years due to respiratory diseases in São Paulo city, both models present similar estimates but the Poisson-GAMAR presents uncorrelated residuals, no overdispersion and provides smaller confidence intervals for the weekly percentage changes. Forecasts for the next year based on both models are obtained by simulation and the Poisson-GAMAR presented better performance.

15.
In this paper, we establish several connections of the Poisson weight function to overdispersion and underdispersion. Specifically, we establish that the logconvexity (logconcavity) of the mean weight function is a necessary and sufficient condition for overdispersion (underdispersion) when the Poisson weight function does not depend on the original Poisson parameter. We also discuss some properties of the weighted Poisson distributions (WPD). We then introduce a notion of pointwise duality between two WPDs and discuss some associated properties. Next, we present some illustrative examples and provide a discussion on various Poisson weight functions used in practice. Finally, some concluding remarks are made.

16.
Abstract

For non-negative integer-valued random variables, the concept of “damaged” observations was first introduced by Rao and Rubin [Rao, C. R., Rubin, H. (1964). On a characterization of the Poisson distribution. Sankhya 26:295–298] in 1964 in a paper concerning the characterization of the Poisson distribution. In 1965, Rao [Rao, C. R. (1965). On discrete distribution arising out of methods of ascertainment. Sankhya Ser. A. 27:311–324] discussed some results related to inference for the parameters of a Poisson model when partial destruction of observations has occurred. A random variable is said to be damaged if it is unobservable, due to a damage mechanism which randomly reduces its magnitude. In subsequent years, considerable attention has been given to characterizations of distributions of such random variables that satisfy the “Rao–Rubin” condition. This article presents some inference aspects of a damaged Poisson distribution, under the reasonable assumption that, when an observation on the random variable is made, it is also possible to determine whether or not some damage has occurred. In other words, we do not know how many items are damaged, but we can identify the existence of damage. In particular, the situation in which the occurrence of some damage can be identified, although the number of damaged items cannot be determined, is illustrated. Maximum likelihood estimators of the underlying parameters and their asymptotic covariance matrix are obtained. Convergence of the parameter estimates to their asymptotic values is studied through Monte Carlo simulations.

17.
In this paper, we introduce a new first-order generalized Poisson integer-valued autoregressive process, for modeling integer-valued time series exhibiting a piecewise structure and overdispersion. Basic probabilistic and statistical properties of this model are discussed. Conditional least squares and conditional maximum likelihood estimators are derived. The asymptotic properties of the estimators are established. Moreover, two special cases of the process are discussed. Finally, some numerical results of the estimates and a real data example are presented.

18.
Overdispersion due to a large proportion of zero observations in data sets is a common occurrence in applications across many fields of research; we consider such scenarios in count panel (longitudinal) data. A well-known and widely implemented technique for handling such data is that of random effects modeling, which addresses the serial correlation inherent in panel data, as well as overdispersion. To deal with the excess zeros, a zero-inflated Poisson distribution has come to be canonical, which relaxes the equal mean-variance specification of a traditional Poisson model and allows for the larger variance characteristic of overdispersed data. A natural proposal then to approach count panel data with overdispersion due to excess zeros is to combine these two methodologies, deriving a likelihood from the resulting conditional probability. In performing simulation studies, we find that this approach in fact poses problems of identifiability. In this article, we construct and explain in full detail why a model obtained from the marriage of two classical and well-established techniques is unidentifiable and provide results of simulation studies demonstrating this effect. A discussion on alternative methodologies to resolve the problem is provided in the conclusion.
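
The way a zero-inflated Poisson distribution relaxes the equal mean-variance constraint can be checked numerically. The sketch below covers only the plain zero-inflated Poisson distribution without random effects, so it does not reproduce the identifiability problem studied in the paper; the function names and the values `lam=2.0` and `pi=0.3` are assumptions made for illustration.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) when Y is 0 with probability pi and Poisson(lam) otherwise."""
    poisson = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


def zip_moments(lam, pi, k_max=200):
    """Mean and variance obtained by summing the pmf over 0..k_max."""
    mean = sum(k * zip_pmf(k, lam, pi) for k in range(k_max + 1))
    second = sum(k * k * zip_pmf(k, lam, pi) for k in range(k_max + 1))
    return mean, second - mean ** 2


lam, pi = 2.0, 0.3
mean, variance = zip_moments(lam, pi)
# Closed forms: mean = (1 - pi) * lam, variance = (1 - pi) * lam * (1 + pi * lam).
closed_mean = (1.0 - pi) * lam
closed_variance = (1.0 - pi) * lam * (1.0 + pi * lam)
print(mean, variance, closed_mean, closed_variance)
```

The variance exceeds the mean by the factor `1 + pi * lam`, which is larger than one whenever `pi > 0`.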

19.
The negative binomial (NB) model and the generalized Poisson (GP) model are common alternatives to Poisson models when overdispersion is present in the data. Having accounted for initial overdispersion, we may require further investigation as to whether there is evidence for zero-inflation in the data. Two score statistics are derived from the GP model for testing zero-inflation. These statistics, unlike Wald-type test statistics, do not require that we fit the more complex zero-inflated overdispersed models to evaluate zero-inflation. A simulation study illustrates that the developed score statistics reasonably follow a χ2 distribution and maintain the nominal level. Extensive simulation results also indicate that the power behavior differs depending on whether a continuous or a binary variable is included in the zero-inflation (ZI) part of the model. These differences are the basis from which suggestions are provided for real data analysis. Two practical examples are presented in this article. Results from these examples along with practical experience lead us to suggest performing the developed score test before fitting a zero-inflated NB model to the data.

20.
We consider a generalization of a standard test for overdispersion (underdispersion) of possibly Poisson data. Under the null hypothesis observed counts are increments of Poisson processes. Particular applications are to a random sample of identically distributed processes and a single observed process. The test has intuitive appeal beyond the specific alternatives considered.
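
The abstract does not say which standard test it generalizes. Assuming it is the classical index-of-dispersion test, a minimal sketch looks as follows: under the Poisson null the statistic is approximately chi-squared with n − 1 degrees of freedom, and the z-score uses a rough normal approximation to that distribution. The function name and sample data are hypothetical.

```python
import math


def dispersion_test(counts):
    """Index-of-dispersion statistic, its degrees of freedom, and a rough z-score.

    Large positive z suggests overdispersion; large negative z, underdispersion.
    """
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("all counts are zero; the statistic is undefined")
    statistic = sum((y - mean) ** 2 for y in counts) / mean
    df = n - 1
    # Normal approximation: a chi-squared(df) variable has mean df and variance 2 * df.
    z = (statistic - df) / math.sqrt(2 * df)
    return statistic, df, z


statistic, df, z = dispersion_test([2, 3, 1, 4, 2, 3])
print(statistic, df, z)
```

For samples this small the normal approximation is crude; an exact chi-squared tail probability would be preferable in practice.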
