1.

Efficient regression modeling for correlated and overdispersed count data

《Communications in Statistics - Theory and Methods》2012,41(24):6005-6018

Abstract

The objective of this paper is to propose an efficient estimation procedure in a marginal mean regression model for longitudinal count data and to develop a hypothesis test for detecting the presence of overdispersion. We extend the matrix expansion idea of quadratic inference functions to the negative binomial regression framework, which entails accommodating both within-subject correlation and overdispersion. Theoretical and numerical results show that the proposed procedure yields an asymptotically more efficient estimator than one that ignores either the within-subject correlation or the overdispersion. When overdispersion is absent from the data, the proposed method might reduce estimation efficiency in practice, whereas a Poisson-based regression model fits the data sufficiently well. Therefore, we construct a hypothesis test that recommends an appropriate model for the analysis of correlated count data. Extensive simulation studies indicate that the proposed test identifies the effective model consistently. The proposed procedure is also applied to a transportation safety study, where it recommends the proposed negative binomial regression model.

2.

Bivariate zero-inflated negative binomial regression model with applications

Pouya Faroughi 《Journal of Statistical Computation and Simulation》2017,87(3):457-477

Count data often display more zero outcomes than are expected under the Poisson regression model. The zero-inflated Poisson regression model has been suggested to handle zero-inflated data, whereas the zero-inflated negative binomial (ZINB) regression model has been fitted for zero-inflated data with additional overdispersion. For bivariate and zero-inflated cases, several regression models such as the bivariate zero-inflated Poisson (BZIP) and bivariate zero-inflated negative binomial (BZINB) have been considered. This paper introduces several forms of nested BZINB regression model which can be fitted to bivariate and zero-inflated count data. The mean–variance approach is used for comparing the BZIP and our forms of BZINB regression model in this study. A similar approach was also used by past researchers for defining several negative binomial and zero-inflated negative binomial regression models based on the appearance of linear and quadratic terms of the variance function. The nested BZINB regression models proposed in this study have several advantages: likelihood ratio tests can be performed for choosing the best model, the models have flexible forms of marginal mean–variance relationship, the models can be fitted to bivariate zero-inflated count data with positive or negative correlations, and the models allow additional overdispersion of the two dependent variables.

3.

Score test for testing zero-inflated Poisson regression against zero-inflated generalized Poisson alternatives

Hossein Zamani 《Journal of applied statistics》2013,40(9):2056-2068

In many cases, count data have an excessive number of zero outcomes. This zero-inflated phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed for accommodating zero-inflated data. However, if the data continue to suggest additional overdispersion, zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models have been considered as alternatives. This study proposes a score test for testing the ZIP regression model against ZIGP alternatives and proves that it is equal to the score test for testing the ZIP regression model against ZINB alternatives. The advantage of the score test over alternatives such as the likelihood ratio and Wald tests is that it can be used to determine whether a more complex model is appropriate without fitting the more complex model. Applications of the proposed score test to several datasets are also illustrated.
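The claim that zero inflation is itself a source of overdispersion can be checked numerically. The following is a minimal sketch, not the paper's method: it assumes the usual ZIP mixture (a point mass at zero with probability `pi`, otherwise Poisson with rate `lam`), and the function names, parameter values, and truncation point `y_max` are all illustrative choices.

```python
import math

def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson: point mass at zero with
    probability pi, mixed with Poisson(lam) with probability 1 - pi."""
    # Work in log space so large y does not overflow the factorial.
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    return pi * (y == 0) + (1.0 - pi) * poisson

def zip_moments(lam, pi, y_max=100):
    """Mean and variance obtained by summing the pmf up to y_max (truncation is an assumption)."""
    probs = [zip_pmf(y, lam, pi) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

lam, pi = 2.0, 0.3  # illustrative values, not taken from the paper
mean, var = zip_moments(lam, pi)
print(f"mean={mean:.4f}, variance={var:.4f}")
```

Under this mixture the mean is (1 - pi) * lam and the variance is (1 - pi) * lam * (1 + pi * lam), so the variance exceeds the mean whenever pi > 0, even though the non-zero component is an ordinary Poisson.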

4.

Analysis of hypoglycemic events using negative binomial models

Junxiang Luo Yongming Qu 《Pharmaceutical statistics》2013,12(4):233-242

Negative binomial regression is a standard model to analyze hypoglycemic events in diabetes clinical trials. Adjusting for baseline covariates could potentially increase the estimation efficiency of negative binomial regression. However, adjusting for covariates raises concerns about model misspecification, to which negative binomial regression is not robust because of its requirement for strong model assumptions. Some of the literature suggests correcting the standard error of the maximum likelihood estimator by introducing overdispersion, which can be estimated by the deviance or the Pearson chi‐square statistic. We propose conducting the negative binomial regression using sandwich estimation to calculate the covariance matrix of the parameter estimates, together with the Pearson overdispersion correction (denoted by NBSP). In this research, we compared several commonly used negative binomial model options with our proposed NBSP. Simulations and real data analyses showed that NBSP is the most robust to model misspecification, and that estimation efficiency is improved by adjusting for baseline hypoglycemia. Copyright © 2013 John Wiley & Sons, Ltd.

5.

Score tests for heterogeneity and overdispersion in zero‐inflated Poisson and binomial regression models

Daniel B. Hall Kenneth S. Berenhaut 《Revue canadienne de statistique》2002,30(3):415-430

Hall (2000) has described zero‐inflated Poisson and binomial regression models that include random effects to account for excess zeros and additional sources of heterogeneity in the data. The authors of the present paper propose a general score test for the null hypothesis that variance components associated with these random effects are zero. For a zero‐inflated Poisson model with random intercept, the new test reduces to an alternative to the overdispersion test of Ridout, Demétrio & Hinde (2001). The authors also examine their general test in the special case of the zero‐inflated binomial model with random intercept and propose an overdispersion test in that context which is based on a beta‐binomial alternative.

6.

Modelling count data with overdispersion and spatial effects (total citations: 1; self-citations: 1; citations by others: 0)

Susanne Gschlößl Claudia Czado 《Statistical Papers》2008,49(3):531-552

In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson (GP) distribution are addressed where overdispersion is modelled by an additional model parameter. Further, zero-inflated models in which overdispersion is assumed to be caused by an excessive number of zeros are discussed. On the other hand, extra spatial variability in the data is taken into account by adding correlated spatial random effects to the models. This approach allows for an underlying spatial dependency structure which is modelled using a conditional autoregressive prior based on Pettitt et al. in Stat Comput 12(4):353–367, (2002). In an application the presented models are used to analyse the number of invasive meningococcal disease cases in Germany in the year 2004. Models are compared according to the deviance information criterion (DIC) suggested by Spiegelhalter et al. in J R Stat Soc B64(4):583–640, (2002) and using proper scoring rules, see for example Gneiting and Raftery in Technical Report no. 463, University of Washington, (2004). We observe a rather high degree of overdispersion in the data which is captured best by the GP model when spatial effects are neglected. While the addition of spatial effects to the models allowing for overdispersion gives no or only little improvement, spatial Poisson models with spatially correlated or uncorrelated random effects are to be preferred over all other models according to the considered criteria.

7.

Geographically Weighted Negative Binomial Regression—incorporating overdispersion

Alan Ricardo da Silva Thais Carvalho Valadares Rodrigues 《Statistics and Computing》2014,24(5):769-783

Global regression assumes that a single model adequately describes all parts of a study region. However, the heterogeneity in the data may be sufficiently strong that relationships between variables cannot be spatially constant. In addition, the factors involved are often sufficiently complex that it is difficult to identify them in the form of explanatory variables. As a result, Geographically Weighted Regression (GWR) was introduced as a tool for the modeling of non-stationary spatial data. Using kernel functions, the GWR methodology allows the model parameters to vary spatially and produces non-parametric surfaces of their estimates. To model count data with overdispersion, it is more appropriate to use a negative binomial distribution instead of a Poisson distribution. Therefore, we propose the Geographically Weighted Negative Binomial Regression (GWNBR) method for the modeling of data with overdispersion. The results obtained using simulated and real data show the superiority of this method for the modeling of non-stationary count data with overdispersion compared with competing models, such as global regressions (e.g., Poisson and negative binomial) and Geographically Weighted Poisson Regression (GWPR). Moreover, we illustrate that these competing models are special cases of the more robust GWNBR model.

8.

Functional Form for the Zero-Inflated Generalized Poisson Regression Model

Hossein Zamani 《Communications in Statistics - Theory and Methods》2014,43(3):515-529

The generalized Poisson (GP) regression is an increasingly popular approach for modeling overdispersed as well as underdispersed count data. Several parameterizations have been proposed for the GP regression, and the two well known models, the GP-1 and the GP-2, have been applied. The GP-P regression, which has been recently proposed, has the advantage of nesting the GP-1 and the GP-2 parametrically, besides allowing the statistical tests of the GP-1 and the GP-2 against a more general alternative. In many cases, count data have more zero outcomes than are expected under the Poisson model. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed. However, if the data continue to suggest additional overdispersion, the zero-inflated negative binomial (ZINB-1 and ZINB-2) and the zero-inflated generalized Poisson (ZIGP-1 and ZIGP-2) regression models have been considered as alternatives. This article proposes a functional form of the ZIGP which mixes a distribution degenerate at zero with a GP-P distribution. The suggested model has the advantage of nesting the ZIP and the two well known ZIGP (ZIGP-1 and ZIGP-2) regression models, besides allowing the statistical tests of the ZIGP-1 and the ZIGP-2 against a more general alternative. The ZIP and the functional form of the ZIGP regression models are fitted, compared and tested on two sets of count data: the Malaysian insurance claim data and the German healthcare data.

9.

Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution

Hyoyoung Choo-Wosoba Somnath Datta 《Journal of applied statistics》2018,45(5):799-814

Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are Poisson and negative binomial. However, a Poisson distribution can only handle equidispersed data and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike the Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via a Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.
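To make the dispersion range of the CMP distribution concrete, here is a rough sketch that assumes the standard CMP form, with P(Y = y) proportional to lam^y / (y!)^nu. The function names, the parameter values, and the truncation of the normalising sum at `y_max` are assumptions made for illustration; none of this reproduces the paper's zero-inflated random-effect model.

```python
import math

def cmp_pmf(lam, nu, y_max=500):
    """CMP probabilities for y = 0..y_max, with P(Y = y) proportional to
    lam**y / (y!)**nu. The infinite normalising sum is truncated at y_max."""
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(y_max + 1)]
    shift = max(log_w)                      # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

def mean_var(probs):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

for nu in (0.5, 1.0, 2.0):                  # illustrative dispersion parameters
    m, v = mean_var(cmp_pmf(3.0, nu))
    print(f"nu={nu}: mean={m:.3f}, variance={v:.3f}, variance/mean={v / m:.3f}")
```

With nu = 1 the distribution reduces to a Poisson, so the variance-to-mean ratio is 1; values of nu below 1 give overdispersion and values above 1 give underdispersion.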

10.

Analysis of the human sex ratio by using overdispersion models (total citations: 2; self-citations: 1; citations by others: 1)

Lindsey JK Altham PM 《Journal of the Royal Statistical Society. Series C, Applied statistics》1998,47(1):149-157

For study of the human sex ratio, one of the most important data sets was collected in Saxony in the 19th century by Geissler. The data contain the sizes of families, with the sex of all children, at the time of registration of the birth of a child. These data are reanalysed to determine how the probability for each sex changes with family size. Three models for overdispersion are fitted: the beta–binomial model of Skellam, the 'multiplicative' binomial model of Altham and the double-binomial model of Efron. For each distribution, both the probability and the dispersion parameters are allowed to vary simultaneously with family size according to two separate regression equations. A finite mixture model is also fitted. The models are fitted using non-linear Poisson regression. They are compared using direct likelihood methods based on the Akaike information criterion. The multiplicative and beta–binomial models provide similar fits, substantially better than that of the double-binomial model. All models show that both the probability that the child is a boy and the dispersion are greater in larger families. There is also some indication that a point probability mass is needed for families containing children uniquely of one sex.

11.

Testing overdispersion in the zero-inflated Poisson model

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on the ZINB model for use in ZIP regression models have been developed. Because of its similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternative model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates that the proposed score statistics are preferred to other tests because of their higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on the asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis.

12.

On the use of corrections for overdispersion (total citations: 3; self-citations: 0; citations by others: 3)

J. K. Lindsey 《Journal of the Royal Statistical Society. Series C, Applied statistics》1999,48(4):553-561

In studying fluctuations in the size of a black grouse (Tetrao tetrix) population, an autoregressive model using climatic conditions appears to follow the change quite well. However, the deviance of the model is considerably larger than its number of degrees of freedom. A widely used statistical rule of thumb holds that overdispersion is present in such situations, but model selection based on a direct likelihood approach can produce opposing results. Two further examples, of binomial and of Poisson data, have models with deviances that are almost twice the degrees of freedom and yet various overdispersion models do not fit better than the standard model for independent data. This can arise because the rule of thumb only considers a point estimate of dispersion, without regard for any measure of its precision. A reasonable criterion for detecting overdispersion is that the deviance be at least twice the number of degrees of freedom, the familiar Akaike information criterion, but the actual presence of overdispersion should then be checked by some appropriate modelling procedure.
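The rule of thumb discussed in this abstract compares a model's deviance with its residual degrees of freedom. The sketch below computes that ratio for the simplest case, an intercept-only Poisson fit, using the standard Poisson deviance formula. The counts are made up for illustration and the function name is hypothetical; as the abstract cautions, the resulting ratio is only a point estimate and says nothing about its own precision.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y*log(y/mu) - (y - mu)), taking 0*log(0) as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += term - (yi - mi)
    return 2.0 * total

counts = [0, 1, 0, 7, 2, 0, 12, 1, 3, 0]   # made-up counts, not from the paper
mu_hat = sum(counts) / len(counts)          # intercept-only Poisson fit: fitted mean is the sample mean
deviance = poisson_deviance(counts, [mu_hat] * len(counts))
df = len(counts) - 1                        # n observations minus one fitted parameter
ratio = deviance / df
print(f"deviance={deviance:.2f}, df={df}, deviance/df={ratio:.2f}")
```

A ratio well above 1 would conventionally be read as a sign of overdispersion, and the abstract suggests a threshold of 2 before investigating further with an explicit overdispersion model.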

13.

Hierarchical overdispersed Poisson model with macrolevel autocorrelation

《Statistical Methodology》2007,4(3):354-370

We review Bayesian analysis of hierarchical non-standard Poisson regression models with an emphasis on microlevel heterogeneity and macrolevel autocorrelation. For the former case, we confirm that negative binomial regression usually accounts for microlevel heterogeneity (overdispersion) satisfactorily; for the latter case, we apply the simple first-order Markov transition model to conveniently capture the macrolevel autocorrelation which often arises from temporal and/or spatial count data, rather than attaching complex random effects directly to the regression parameters. Specifically, we extend the hierarchical (multilevel) Poisson model into negative binomial models with macrolevel autocorrelation using a restricted gamma mixture with unit mean and a Markov transition covariate created from preceding residuals. We prove a mild sufficient condition for posterior propriety under a flat prior for the fixed effects of interest. Our methodology is implemented by analyzing the Baltic Sea peracarids diurnal activity data published in the marine biology and ecology literature.

14.

Tests for Detecting Overdispersion in the Positive Poisson Regression Model

Shiferaw Gurmu 《Journal of Business & Economic Statistics》2013,31(2):215-222

This article derives score tests for extra-Poisson variation in the positive or truncated-at-zero Poisson regression model against truncated-at-zero negative binomial family alternatives. It also develops size-corrected tests of overdispersion that are expected to improve their small-sample properties. Further, small-sample performance of the tests is investigated by means of Monte Carlo experiments. As an illustration, the proposed tests are applied to a model of strikes in U.S. manufacturing. The proposed tests have an interpretation as conditional moment tests and require only the positive Poisson model to be estimated. It is shown that most of the tests for overdispersion in the regular Poisson model given in the econometric and statistical literature can be obtained as special cases of the tests developed in this article. Monte Carlo experiments indicate that the size correction, based on the asymptotic expansions of the score function, is effective in improving the accuracy of the size and power of the tests in small samples.

15.

Comparisons of some bivariate regression models

《Journal of Statistical Computation and Simulation》2012,82(7):937-949

The bivariate negative binomial regression (BNBR) and the bivariate Poisson log-normal regression (BPLR) models have been used to describe count data that are over-dispersed. In this paper, a new bivariate generalized Poisson regression (BGPR) model is defined. An advantage of the new regression model over the BNBR and BPLR models is that the BGPR can be used to model bivariate count data with either over-dispersion or under-dispersion. In this paper, we carry out a simulation study to compare the three regression models when the true data-generating process exhibits over-dispersion. In the simulation experiment, we observe that the bivariate generalized Poisson regression model performs better than the bivariate negative binomial regression model and the BPLR model.

16.

A note on Dean's overdispersion test

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

This note discusses an extension to the score test statistics for overdispersion in Poisson and binomial regression models [Dean, C.B., 1992. Testing for overdispersion in Poisson and binomial regression models. J. Amer. Statist. Assoc. 87, 451–457]. Examples illustrate the application of the extended results.
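As background for score tests of this kind, the sketch below computes one widely quoted score-type statistic for overdispersion in a Poisson model against negative-binomial-type alternatives, T = Σ{(y − μ̂)² − y} / √(2 Σ μ̂²), which is compared with a standard normal distribution. This is an assumption about the basic form only: it is not the extension discussed in the note, the intercept-only fit is a simplification, and the data and function name are made up for illustration.

```python
import math

def overdispersion_score_stat(y, mu):
    """T = sum((y - mu)^2 - y) / sqrt(2 * sum(mu^2)); large positive values
    point towards overdispersion relative to the Poisson model."""
    num = sum((yi - mi) ** 2 - yi for yi, mi in zip(y, mu))
    den = math.sqrt(2.0 * sum(mi ** 2 for mi in mu))
    return num / den

counts = [0, 1, 0, 7, 2, 0, 12, 1, 3, 0]    # made-up counts for illustration
mu_hat = sum(counts) / len(counts)           # intercept-only Poisson fit (simplifying assumption)
t_stat = overdispersion_score_stat(counts, [mu_hat] * len(counts))
p_value = 0.5 * math.erfc(t_stat / math.sqrt(2.0))   # one-sided upper-tail normal probability
print(f"T={t_stat:.3f}, one-sided p-value={p_value:.3g}")
```

Only the fitted Poisson means are needed here, which reflects the general appeal of score tests: the more complex alternative model never has to be fitted.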

17.

Estimating Abundance from Presence–Absence Maps via a Paired Negative‐Binomial Model

Wen‐Han Hwang Richard Huggins 《Scandinavian Journal of Statistics》2016,43(2):573-586

The estimation of abundance from presence–absence data is an intriguing problem in applied statistics. The classical Poisson model makes strong independence and homogeneity assumptions and in practice generally underestimates the true abundance. A controversial ad hoc method based on negative‐binomial counts (Am. Nat.) has been empirically successful but lacks theoretical justification. We first present an alternative estimator of abundance based on a paired negative binomial model that is consistent and asymptotically normally distributed. A quadruple negative binomial extension is also developed, which yields the previous ad hoc approach and resolves the controversy in the literature. We examine the performance of the estimators in a simulation study and estimate the abundance of 44 tree species in a permanent forest plot.

18.

Tests for independence in a bivariate negative binomial model

Sooyoung Cheon Seuck Heun Song Byoung Cheol Jung 《Journal of the Korean Statistical Society》2009,38(2):185-190

Score and LR test statistics for testing independence are proposed in a bivariate negative binomial regression model. We also propose an adjusted score test in order to enhance the efficiency of the score test. This study is an extension of the work in a univariate model by Dean and Lawless [Dean, C., Lawless, F. (1989). Tests for detecting overdispersion in Poisson regression models. Journal of the American Statistical Association, 84, 467–472]. The adjusted score test proposed in this study is more efficient than the complicated LR test.

19.

Bayesian estimation and case influence diagnostics for the zero-inflated negative binomial regression model (total citations: 1; self-citations: 0; citations by others: 1)

Aldo M. Garay Victor H. Lachos Heleno Bolfarine 《Journal of applied statistics》2015,42(6):1148-1165

In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is utilized not only to compute some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than the Poisson counterpart.

20.

Threshold negative binomial autoregressive model

Mengya Liu Qi Li 《Statistics》2019,53(1):1-25

This article studies an observation-driven model for time series of counts, which allows for overdispersion and negative serial dependence in the observations. The observations are supposed to follow a negative binomial distribution conditioned on past information in the form of threshold models, which generates a two-regime structure on the basis of the magnitude of the lagged observations. We use the weak dependence approach to establish stationarity and ergodicity, and inference for the regression parameters is obtained by quasi-likelihood. Moreover, asymptotic properties of both the quasi-maximum likelihood estimators and the threshold estimator are established. Simulation studies are considered, as are two applications: one to the trading volume of a stock and the other to the number of major earthquakes.