1.

A score test for overdispersion in Poisson regression based on the generalized Poisson-2 model

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

Overdispersion is a common phenomenon in Poisson modeling. The generalized Poisson (GP) regression model accommodates both overdispersion and underdispersion in count data modeling, and is an increasingly popular platform for modeling overdispersed count data. The Poisson model is one of the special cases in the collection of models which may be specified by GP regression. Thus, we may derive a test of overdispersion which assesses the equidispersed Poisson model within the context of the more general GP regression model. The score test has an advantage over the likelihood ratio test (LRT) and over the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis (the Poisson model). Herein, we propose a score test for overdispersion based on the GP model (specifically the GP-2 model) and compare the power of the test with the LRT and Wald tests. A simulation study indicates the proposed score test based on the asymptotic standard normal distribution is more appropriate in practical applications.

2.

Testing overdispersion in the zero-inflated Poisson model

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on the ZINB model for use in ZIP regression models have been developed. Due to its similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternate model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates the proposed score statistics are preferred to other tests for higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on the asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis.

3.

Testing for varying zero-inflation and dispersion in generalized Poisson regression models

Feng-Chang Xie Jin-Guan Lin Bo-Cheng Wei 《Journal of applied statistics》2010,37(9):1509-1522

Homogeneity of dispersion parameters and zero-inflation parameters is a standard assumption in zero-inflated generalized Poisson regression (ZIGPR) models. However, this assumption may not be appropriate in some situations. This work develops a score test for varying dispersion and/or zero-inflation parameters in the ZIGPR models, and the corresponding test statistics are obtained. Two numerical examples are given to illustrate our methodology, and the properties of the score test statistics are investigated through Monte Carlo simulations.

4.

Functional Form for the Zero-Inflated Generalized Poisson Regression Model

Hossein Zamani 《Communications in Statistics - Theory and Methods》2014,43(3):515-529

The generalized Poisson (GP) regression is an increasingly popular approach for modeling overdispersed as well as underdispersed count data. Several parameterizations have been proposed for the GP regression, and the two well-known models, the GP-1 and the GP-2, have been applied. The GP-P regression, which has been recently proposed, has the advantage of nesting the GP-1 and the GP-2 parametrically, besides allowing statistical tests of the GP-1 and the GP-2 against a more general alternative. In several cases, count data have more zero outcomes than are expected under the Poisson model. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed. However, if the data continue to suggest additional overdispersion, the zero-inflated negative binomial (ZINB-1 and ZINB-2) and the zero-inflated generalized Poisson (ZIGP-1 and ZIGP-2) regression models have been considered as alternatives. This article proposes a functional form of the ZIGP which mixes a distribution degenerate at zero with a GP-P distribution. The suggested model has the advantage of nesting the ZIP and the two well-known ZIGP (ZIGP-1 and ZIGP-2) regression models, besides allowing statistical tests of the ZIGP-1 and the ZIGP-2 against a more general alternative. The ZIP and the functional form of the ZIGP regression models are fitted, compared and tested on two sets of count data: the Malaysian insurance claim data and the German healthcare data.

5.

Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution

Hyoyoung Choo-Wosoba Somnath Datta 《Journal of applied statistics》2018,45(5):799-814

Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are Poisson and negative binomial. However, a Poisson distribution can only handle equidispersed data and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike the Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via a Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.

6.

Tests for detecting overdispersion in Poisson models

Sunho Lee Cheolyong Park Byung Soo Kim 《Communications in Statistics - Theory and Methods》2013,42(9):2405-2420

Collings and Margolin (1985) developed a locally most powerful unbiased test for detecting negative binomial departures from a Poisson model when the variance is a quadratic function of the mean. Kim and Park (1992) developed a locally most powerful unbiased test when the variance is a linear function of the mean. It is found that a different mean-variance structure of a negative binomial yields a different locally optimal test statistic.

In this paper, Collings and Margolin's and Kim and Park's results are unified and extended by developing a test for overdispersion in the Poisson model against the Katz family of distributions. Our setup has two extensions. First, the Katz family of distributions is employed as an extension of the negative binomial distribution. Second, the mean-variance structure of the mixed Poisson model is given by σ² = μ+cμ^r for arbitrary but fixed r. We derive a local score test for testing H₀ : c = 0. The superiority of the new test is shown by the asymptotic relative efficiency as well as a simulation study.
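As a reading aid, the mean-variance structure quoted in this abstract can be written out for the two values of r that correspond to the linear and quadratic cases mentioned above. This is only a restatement of the abstract's formula, not a derivation from the paper itself.

```latex
\sigma^2 = \mu + c\,\mu^{r}, \qquad H_0 : c = 0 \ \text{(Poisson, } \sigma^2 = \mu\text{)}
\\[4pt]
r = 1:\quad \sigma^2 = (1 + c)\,\mu \qquad \text{(variance linear in the mean, as in Kim and Park)}
\\[4pt]
r = 2:\quad \sigma^2 = \mu + c\,\mu^{2} \qquad \text{(variance quadratic in the mean, as in Collings and Margolin)}
```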

7.

A note on Dean's overdispersion test

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

This note discusses an extension to the score test statistics for overdispersion in Poisson and binomial regression models [Dean, C.B., 1992. Testing for overdispersion in Poisson and binomial regression models. J. Amer. Statist. Assoc. 87, 451–457]. Examples illustrate the application of the extended results.
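For orientation, here is a minimal Python sketch of a commonly cited form of the score statistic for overdispersion against a quadratic (NB-2 type) variance alternative, T = Σ[(yᵢ − μ̂ᵢ)² − yᵢ] / √(2 Σ μ̂ᵢ²), referred to a standard normal distribution. It is an assumption that this form matches the statistic the note extends; the function names and the count data are hypothetical, and an intercept-only Poisson model is used so that the fitted mean is simply the sample mean.

```python
import math


def score_statistic(counts, fitted_means):
    """Score-type statistic against a quadratic variance alternative.

    Only the Poisson (null) fit is needed: fitted_means are the Poisson
    fitted values, so no overdispersed model has to be estimated.
    """
    numerator = sum((y - mu) ** 2 - y for y, mu in zip(counts, fitted_means))
    denominator = math.sqrt(2.0 * sum(mu ** 2 for mu in fitted_means))
    return numerator / denominator


def upper_tail_p_value(z):
    """One-sided p-value from the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical count data; for an intercept-only Poisson model the
# maximum likelihood fitted mean is the sample mean.
counts = [0, 0, 1, 1, 2, 3, 5, 8, 0, 10]
sample_mean = sum(counts) / len(counts)
fitted = [sample_mean] * len(counts)

stat = score_statistic(counts, fitted)
p_value = upper_tail_p_value(stat)
print(f"score statistic = {stat:.3f}, one-sided p-value = {p_value:.2e}")
```

With covariates, `fitted` would instead hold the fitted means from a Poisson regression; the statistic itself is unchanged.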

8.

Score test for testing zero-inflated Poisson regression against zero-inflated generalized Poisson alternatives

Hossein Zamani 《Journal of applied statistics》2013,40(9):2056-2068

In several cases, count data have an excessive number of zero outcomes. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed for accommodating zero-inflated data. However, if the data continue to suggest additional overdispersion, zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models have been considered as alternatives. This study proposes the score test for testing the ZIP regression model against ZIGP alternatives and proves that it is equal to the score test for testing the ZIP regression model against ZINB alternatives. The advantage of using the score test over other alternative tests such as the likelihood ratio and Wald tests is that the score test can be used to determine whether a more complex model is appropriate without fitting the more complex model. Applications of the proposed score test on several datasets are also illustrated.
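The claim that zero-inflation is itself a source of overdispersion can be seen from the standard ZIP distribution. In the sketch below, the symbols ω (zero-inflation probability) and λ (Poisson mean) are notation assumed here for illustration, not taken from the abstract.

```latex
P(Y = 0) = \omega + (1 - \omega)\,e^{-\lambda}, \qquad
P(Y = y) = (1 - \omega)\,\frac{e^{-\lambda}\lambda^{y}}{y!}, \quad y = 1, 2, \dots
\\[6pt]
E[Y] = (1 - \omega)\,\lambda, \qquad
\operatorname{Var}[Y] = (1 - \omega)\,\lambda\,(1 + \omega\lambda) = E[Y]\,(1 + \omega\lambda) \;\ge\; E[Y]
```

The variance exceeds the mean whenever ω > 0, and ω = 0 recovers the equidispersed Poisson model.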

9.

Time series count data regression

Kurt Brännäs Per Johansson 《Communications in Statistics - Theory and Methods》2013,42(10):2907-2925

The count data model studied in the paper extends the Poisson model by allowing for overdispersion and serial correlation. Alternative approaches to estimate nuisance parameters, required for the correction of the Poisson maximum likelihood covariance matrix estimator and for a quasi-likelihood estimator, are studied. The estimators are evaluated by finite sample Monte Carlo experimentation. It is found that the Poisson maximum likelihood estimator with corrected covariance matrix estimators provides reliable inferences for longer time series. Overdispersion test statistics are well behaved, while conventional portmanteau statistics for white noise have too large sizes. Two empirical illustrations are included.

10.

Overdispersed Poisson regression models for studies of air pollution and human health

Brad McNeney John Petkau 《Revue canadienne de statistique》1994,22(4):421-440

This paper presents results from a simulation study motivated by a recent study of the relationships between ambient levels of air pollution and human health in the community of Prince George, British Columbia. The simulation study was designed to evaluate the performance of methods based on overdispersed Poisson regression models for the analysis of series of count data. Aspects addressed include estimation of the dispersion parameter, estimation of regression coefficients and their standard errors, and the performance of model selection tests. The effects of varying amounts of overdispersion and differing underlying variance structure on this performance were of particular interest. This study is related to work reported by Breslow (1990) although the context is quite different. Preliminary work led to the conclusion that estimation of the dispersion parameter should be based on Pearson's chi-square statistic rather than the Poisson deviance. Regression coefficients are well estimated, even in the presence of substantial overdispersion and when the model for the variance function is incorrectly specified. Despite potential greater variability, the empirical estimator of the covariance matrix is preferred because the model-based estimator is unreliable in general. When the model for the variance function is incorrect, model-based test statistics may perform poorly, in sharp contrast to empirical test statistics, which performed very well in this study.
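To make the Pearson-based dispersion estimate concrete, a minimal Python sketch follows. It uses the usual estimator φ̂ = X² / (n − p), where X² = Σ (yᵢ − μ̂ᵢ)² / μ̂ᵢ is Pearson's chi-square statistic and p is the number of regression parameters. The function name and the count data are hypothetical, and an intercept-only Poisson fit is assumed; this is not the paper's simulation setup.

```python
def pearson_dispersion(counts, fitted_means, n_params):
    """Pearson chi-square statistic divided by residual degrees of freedom."""
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(counts, fitted_means))
    return chi_square / (len(counts) - n_params)


# Hypothetical count data with an intercept-only Poisson fit,
# so every fitted mean equals the sample mean and n_params is 1.
counts = [0, 0, 1, 1, 2, 3, 5, 8, 0, 10]
sample_mean = sum(counts) / len(counts)
fitted = [sample_mean] * len(counts)

phi_hat = pearson_dispersion(counts, fitted, n_params=1)
print(f"Pearson dispersion estimate: {phi_hat:.3f}")
```

Values well above 1 suggest overdispersion relative to the Poisson model; values near 1 are consistent with equidispersion.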

11.

Underdispersion models: Models that are “under the radar”

Kimberly F. Sellers Darcy S. Morris 《Communications in Statistics - Theory and Methods》2017,46(24):12075-12086

The Poisson distribution is a benchmark for modeling count data. Its equidispersion constraint, however, does not accurately represent real data. Most real datasets express overdispersion; hence attention in the statistics community focuses on associated issues. More examples are surfacing, however, that display underdispersion, warranting the need to highlight this phenomenon and bring more attention to those models that can better describe such data structures. This work addresses various sources of data underdispersion and surveys several distributions that can model underdispersed data, comparing their performance on applied datasets.

12.

Analysis of discrete data by Conway–Maxwell Poisson distribution

Ramesh C. Gupta S. Z. Sim S. H. Ong 《AStA Advances in Statistical Analysis》2014,98(4):327-343

In this paper, we further study the Conway–Maxwell Poisson distribution, which has one more parameter than the Poisson distribution, and compare it with the Poisson distribution with respect to some stochastic orderings used in reliability theory. A likelihood ratio test and a score test are developed to test the importance of this additional parameter. Simulation studies are carried out to examine the performance of the two tests. Two examples are presented, one showing overdispersion and the other showing underdispersion, to illustrate the procedure. It is shown that the COM-Poisson model fits better than the generalized Poisson distribution.
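A short Python sketch may help show how the extra parameter controls dispersion. It assumes the standard Conway–Maxwell Poisson probability mass function, P(Y = y) ∝ λ^y / (y!)^ν, and approximates the normalizing constant by truncating the series. The function names, the truncation point, and the parameter values are illustrative assumptions, not details from the paper.

```python
import math


def cmp_pmf(lam, nu, max_count=200):
    """Truncated Conway-Maxwell Poisson pmf on 0..max_count.

    Weights are computed on the log scale, y*log(lam) - nu*log(y!),
    and shifted by their maximum before exponentiating for stability.
    """
    log_weights = [
        y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_count + 1)
    ]
    shift = max(log_weights)
    weights = [math.exp(w - shift) for w in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_variance(pmf):
    """Mean and variance of a pmf supported on 0, 1, 2, ..."""
    mean = sum(y * p for y, p in enumerate(pmf))
    variance = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, variance


# nu < 1 gives overdispersion, nu = 1 is the Poisson case, nu > 1 gives underdispersion.
for nu in (0.5, 1.0, 2.0):
    mean, variance = mean_and_variance(cmp_pmf(3.0, nu))
    print(f"nu = {nu}: variance / mean = {variance / mean:.3f}")
```

Testing whether the additional parameter matters then amounts to testing ν = 1 against ν ≠ 1.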

13.

Tests for Detecting Overdispersion in the Positive Poisson Regression Model

Shiferaw Gurmu 《Journal of Business & Economic Statistics》2013,31(2):215-222

This article derives score tests for extra-Poisson variation in the positive or truncated-at-zero Poisson regression model against truncated-at-zero negative binomial family alternatives. It also develops size-corrected tests of overdispersion that are expected to improve their small-sample properties. Further, small-sample performance of the tests is investigated by means of Monte Carlo experiments. As an illustration, the proposed tests are applied to a model of strikes in U.S. manufacturing. The proposed tests have an interpretation as conditional moment tests and require only the positive Poisson model to be estimated. It is shown that most of the tests for overdispersion in the regular Poisson model given in the econometric and statistical literature can be obtained as special cases of the tests developed in this article. Monte Carlo experiments indicate that the size correction, based on the asymptotic expansions of the score function, is effective in improving the accuracy of the size and power of the tests in small samples.

14.

An extended random-effects approach to modeling repeated, overdispersed count data

Molenberghs G Verbeke G Demétrio CG 《Lifetime data analysis》2007,13(4):513-531

Non-Gaussian outcomes are often modeled using members of the so-called exponential family. The Poisson model for count data falls within this tradition. The family in general, and the Poisson model in particular, are at the same time convenient since mathematically elegant, but in need of extension since often somewhat restrictive. Two of the main rationales for existing extensions are (1) the occurrence of overdispersion, in the sense that the variability in the data is not adequately captured by the model's prescribed mean-variance link, and (2) the accommodation of data hierarchies owing to, for example, repeatedly measuring the outcome on the same subject, recording information from various members of the same family, etc. There is a variety of overdispersion models for count data, such as, for example, the negative-binomial model. Hierarchies are often accommodated through the inclusion of subject-specific, random effects. Though not always, one conventionally assumes such random effects to be normally distributed. While both of these issues may occur simultaneously, models accommodating them at once are less than common. This paper proposes a generalized linear model, accommodating overdispersion and clustering through two separate sets of random effects, of gamma and normal type, respectively. This is in line with the proposal by Booth et al. (Stat Model 3:179-181, 2003). The model extends both classical overdispersion models for count data (Breslow, Appl Stat 33:38-44, 1984), in particular the negative binomial model, as well as the generalized linear mixed model (Breslow and Clayton, J Am Stat Assoc 88:9-25, 1993). Apart from model formulation, we briefly discuss several estimation options, and then settle for maximum likelihood estimation with both fully analytic integration as well as hybrid between analytic and numerical integration. The latter is implemented in the SAS procedure NLMIXED. The methodology is applied to data from a study in epileptic seizures.

15.

Score test for homogeneity of dispersion in generalized Poisson mixed models with excess zeros

Feng-Chang Xie Jin-Guan Lin Bo-Cheng Wei 《Communications in Statistics - Simulation and Computation》2017,46(1):301-314

In many applications, clustered count data often contain excess zeros, and the zero-inflated generalized Poisson mixed (ZIGPM) regression model may be suitable. However, dispersion in the ZIGPM model is often treated as a fixed unknown parameter, and this assumption may not be appropriate in some situations. In this article, a score test for homogeneity of the dispersion parameter in the ZIGPM regression model is developed and the corresponding test statistic is obtained. The sampling distribution and power of the score test statistic are investigated through Monte Carlo simulation. Finally, results from a biological example illustrate the usefulness of the diagnostic statistic.

16.

Efficient regression modeling for correlated and overdispersed count data

《Communications in Statistics - Theory and Methods》2012,41(24):6005-6018

The objective of this paper is to propose an efficient estimation procedure in a marginal mean regression model for longitudinal count data and to develop a hypothesis test for detecting the presence of overdispersion. We extend the matrix expansion idea of quadratic inference functions to the negative binomial regression framework, which entails accommodating both the within-subject correlation and the overdispersion issue. Theoretical and numerical results show that the proposed procedure yields a more efficient estimator asymptotically than one ignoring either the within-subject correlation or overdispersion. When overdispersion is absent in the data, the proposed method might hinder the estimation efficiency in practice, yet the Poisson-based regression model fits the data sufficiently well. Therefore, we construct a hypothesis test that recommends an appropriate model for the analysis of the correlated count data. Extensive simulation studies indicate that the proposed test can identify the effective model consistently. The proposed procedure is also applied to a transportation safety study and recommends the proposed negative binomial regression model.

17.

Empirical Bayes estimates of finite mixture of negative binomial regression models and its application to highway safety

Yajie Zou John E. Ash Dominique Lord Lingtao Wu 《Journal of applied statistics》2018,45(9):1652-1669

The empirical Bayes (EB) method is commonly used by transportation safety analysts for conducting different types of safety analyses, such as before–after studies and hotspot analyses. To date, most implementations of the EB method have been applied using a negative binomial (NB) model, as it can easily accommodate the overdispersion commonly observed in crash data. Recent studies have shown that a generalized finite mixture of NB models with K mixture components (GFMNB-K) can also be used to model crash data subjected to overdispersion and generally offers better statistical performance than the traditional NB model. So far, no one has shown how the EB method could be used with finite mixtures of NB models. The main objective of this study is therefore to use a GFMNB-K model in the calculation of EB estimates. Specifically, GFMNB-K models with varying weight parameters are developed to analyze crash data from Indiana and Texas. The main finding shows that the rankings produced by the NB and GFMNB-2 models for hotspot identification are often quite different, and this was especially noticeable with the Texas dataset. Finally, a simulation study designed to examine which model formulation can better identify the hotspot is recommended as our future research.

18.

Testing for zero-inflation in count series: application to occupational health

Y. Zhao V. Burke K. K.W. Yau 《Journal of applied statistics》2009,36(12):1353-1359

Count data series with extra zeros relative to a Poisson distribution are common in many biomedical applications. A score test is presented to assess whether the zero-inflation problem is significant enough to warrant analysis by the more complex zero-inflated Poisson autoregression model. The score test is implemented as a computer program on the S-PLUS platform. For illustration, the test procedure is applied to a workplace injury series where many zero counts are observed due to the heterogeneity in injury risk and the dynamic population involved.

19.

The multivariate component zero-inflated Poisson model for correlated count data analysis

Qin Wu Guo-Liang Tian Tao Li Man-Lai Tang Chi Zhang 《Australian & New Zealand Journal of Statistics》2023,65(3):234-261

Multivariate zero-inflated Poisson (ZIP) distributions are important tools for modelling and analysing correlated count data with extra zeros. Unfortunately, existing multivariate ZIP distributions consider only the overall zero-inflation while the component zero-inflation is not well addressed. This paper proposes a flexible multivariate ZIP distribution, called the multivariate component ZIP distribution, in which both the overall and component zero-inflations are taken into account. Likelihood-based inference procedures including the calculation of maximum likelihood estimates of parameters in the model without and with covariates are provided. Simulation studies indicate that the performance of the proposed methods on the multivariate component ZIP model is satisfactory. The Australia health care utilisation data set is analysed to demonstrate that the new distribution is more appropriate than the existing multivariate ZIP distributions.

20.

The VGAM package for negative binomial regression

Thomas W. Yee 《Australian & New Zealand Journal of Statistics》2020,62(1):116-131

Negative binomial (NB) regression is the most common full‐likelihood method for analysing count data exhibiting overdispersion with respect to the Poisson distribution. Usually most practitioners are content to fit one of two NB variants; however, other important variants exist. It is demonstrated here that the VGAM R package can fit them all under a common statistical framework founded upon a generalised linear and additive model approach. Additionally, other modifications such as zero‐altered (hurdle), zero‐truncated and zero‐inflated NB distributions are naturally handled. Rootograms are also available for graphically checking the goodness of fit. Two data sets and some recently added features of the VGAM package are used here for illustration.