1.

GEE-based zero-inflated generalized Poisson model for clustered over or under-dispersed count data

Fatemeh Sarvi Hossein Mahjub 《Journal of Statistical Computation and Simulation》2019,89(14):2711-2732

Zero-inflated regression models such as the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), or zero-inflated generalized Poisson (ZIGP) regression models can model count data with excess zeros. The ZINB model can handle over-dispersed count data, and the ZIGP model can handle over- or under-dispersed count data with excess zeros as well. Moreover, the count data may be correlated because of the data collection procedure or a special study design. Clustered sampling is one example in which the correlation among subjects can be defined. In such situations, a marginal model using the generalized estimating equation (GEE) approach can incorporate these correlations and yield relationships at the population level. In this study, a GEE-based zero-inflated generalized Poisson regression model is proposed to fit over- and under-dispersed clustered count data with excess zeros.
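As a rough illustration of the distribution named in this abstract, the sketch below evaluates a zero-inflated generalized Poisson probability mass function. It assumes the mean parametrization of the generalized Poisson distribution (mean `mu`, variance `mu * (1 + alpha * mu) ** 2`), which may differ from the parametrization used in the paper. The function names and the symbols `mu`, `alpha`, and `pi` are illustrative, only `alpha >= 0` (equi- or over-dispersion) is handled, and the GEE estimation step is not shown.

```python
import math


def gp_pmf(y, mu, alpha):
    """Generalized Poisson pmf in an assumed mean parametrization.

    mu > 0 is the mean; alpha >= 0 is the dispersion parameter
    (alpha = 0 gives the ordinary Poisson distribution).
    Under-dispersion (alpha < 0) needs a truncated support and is
    deliberately not handled in this sketch.
    """
    if alpha < 0:
        raise ValueError("this sketch only covers alpha >= 0")
    if y < 0:
        return 0.0
    rate = mu / (1.0 + alpha * mu)
    log_p = (
        y * math.log(rate)
        + (y - 1) * math.log1p(alpha * y)
        - rate * (1.0 + alpha * y)
        - math.lgamma(y + 1)
    )
    return math.exp(log_p)


def zigp_pmf(y, mu, alpha, pi):
    """Mixture of a point mass at zero (weight pi) and a generalized Poisson."""
    base = (1.0 - pi) * gp_pmf(y, mu, alpha)
    return pi + base if y == 0 else base


# Usage: probability of a zero with 20% structural zeros.
p_zero = zigp_pmf(0, mu=2.0, alpha=0.3, pi=0.2)
```

Under this parametrization the overall mean of the mixture is `(1 - pi) * mu`, so the zero-inflation weight and the count mean jointly determine the marginal mean.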

2.

Random effect exponentiated-exponential geometric model for clustered/longitudinal zero-inflated count data

Leili Tapak Omid Hamidi Payam Amini Geert Verbeke 《Journal of applied statistics》2020,47(12):2272

For count responses, there are situations in biomedical and sociological applications in which extra zeros occur. Modeling correlated (e.g. repeated-measures and clustered) zero-inflated count data poses special challenges because the correlation between measurements for a subject or a cluster needs to be taken into account. Moreover, zero-inflated count data often exhibit over- or under-dispersion. In this paper, we propose a random effect model for repeated measurements or clustered data with an over- or under-dispersed response, called the random effect zero-inflated exponentiated-exponential geometric regression model. The proposed method is illustrated through real examples. The performance of the model and the asymptotic properties of the estimates are investigated using simulation studies. KEYWORDS: Count model, under- and over-dispersion, zero-inflation, mixture model, zero-inflated Poisson model

3.

Bivariate zero-inflated negative binomial regression model with applications

Pouya Faroughi 《Journal of Statistical Computation and Simulation》2017,87(3):457-477

Count data often display more zero outcomes than are expected under the Poisson regression model. The zero-inflated Poisson regression model has been suggested to handle zero-inflated data, whereas the zero-inflated negative binomial (ZINB) regression model has been fitted for zero-inflated data with additional overdispersion. For bivariate and zero-inflated cases, several regression models such as the bivariate zero-inflated Poisson (BZIP) and bivariate zero-inflated negative binomial (BZINB) have been considered. This paper introduces several forms of nested BZINB regression model which can be fitted to bivariate and zero-inflated count data. The mean–variance approach is used for comparing the BZIP and our forms of BZINB regression model in this study. A similar approach was also used by past researchers for defining several negative binomial and zero-inflated negative binomial regression models based on the appearance of linear and quadratic terms of the variance function. The nested BZINB regression models proposed in this study have several advantages: likelihood ratio tests can be performed for choosing the best model, the models have flexible forms of marginal mean–variance relationship, the models can be fitted to bivariate zero-inflated count data with positive or negative correlations, and the models allow additional overdispersion of the two dependent variables.

4.

Score test for testing zero-inflated Poisson regression against zero-inflated generalized Poisson alternatives

Hossein Zamani 《Journal of applied statistics》2013,40(9):2056-2068

Count data often have an excessive number of zero outcomes. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed for accommodating zero-inflated data. However, if the data continue to suggest additional overdispersion, zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models have been considered as alternatives. This study proposes the score test for testing the ZIP regression model against ZIGP alternatives and proves that it is equal to the score test for testing the ZIP regression model against ZINB alternatives. The advantage of using the score test over other alternative tests such as the likelihood ratio and Wald tests is that the score test can be used to determine whether a more complex model is appropriate without fitting the more complex model. Applications of the proposed score test to several datasets are also illustrated.

5.

A Bayesian Approach for Zero-Inflated Count Regression Models by Using the Reversible Jump Markov Chain Monte Carlo Method and an Application

İlknur Özmen 《Communications in Statistics - Theory and Methods》2013,42(12):2109-2127

In this study, estimation of the parameters of zero-inflated count regression models and computation of posterior model probabilities of the log-linear models defined for each zero-inflated count regression model are investigated from the Bayesian point of view. In addition, determination of the most suitable log-linear and regression models is investigated. It is known that zero-inflated count regression models cover zero-inflated Poisson, zero-inflated negative binomial, and zero-inflated generalized Poisson regression models. The classical approach has some problematic points but the Bayesian approach does not have similar flaws. This work points out the reasons for using the Bayesian approach. It also lists advantages and disadvantages of the classical and Bayesian approaches. As an application, a zoological data set, including structural and sampling zeros, is used in the presence of extra zeros. In this work, it is observed that fitting a zero-inflated negative binomial regression model creates no problems at all, even though it is known that fitting a zero-inflated negative binomial regression model is the most problematic procedure in the classical approach. Additionally, it is found that the best fitting model is the log-linear model under the negative binomial regression model, which does not include three-way interactions of factors.

6.

Score test for homogeneity of dispersion in generalized Poisson mixed models with excess zeros

Feng-Chang Xie Jin-Guan Lin Bo-Cheng Wei 《Communications in Statistics - Simulation and Computation》2017,46(1):301-314

In many applications, clustered count data often contain excess zeros and the zero-inflated generalized Poisson mixed (ZIGPM) regression model may be suitable. However, dispersion in ZIGPM is often treated as a fixed unknown parameter, and this assumption may not be appropriate in some situations. In this article, a score test for homogeneity of the dispersion parameter in the ZIGPM regression model is developed and the corresponding test statistic is obtained. The sampling distribution and power of the score test statistic are investigated through Monte Carlo simulation. Finally, results from a biological example illustrate the usefulness of the diagnostic statistic.

7.

Quasi-binomial zero-inflated regression model suitable for variables with bounded support

E. Gómez-Déniz D. I. Gallardo H. W. Gómez 《Journal of applied statistics》2020,47(12):2208

In recent years, a variety of regression models, including zero-inflated and hurdle versions, have been proposed to explain the case of a dependent variable with respect to exogenous covariates. Apart from the classical Poisson, negative binomial and generalised Poisson distributions, many proposals have appeared in the statistical literature, perhaps in response to the new possibilities offered by advanced software that now enables researchers to implement numerous special functions in a relatively simple way. However, we believe that a significant research gap remains, since very little attention has been paid to the quasi-binomial distribution, which was first proposed over fifty years ago. We believe this distribution might constitute a valid alternative to existing regression models, in situations in which the variable has bounded support. Therefore, in this paper we present a zero-inflated regression model based on the quasi-binomial distribution, taking into account the moments and maximum likelihood estimators, and perform a score test to compare the zero-inflated quasi-binomial distribution with the zero-inflated binomial distribution, and the zero-inflated model with the homogeneous model (the model in which covariates are not considered). This analysis is illustrated with two data sets that are well known in the statistical literature and which contain a large number of zeros.

8.

Solving unobserved heterogeneity with latent class inflated Poisson regression model

Ting Hsiang Lin Min-Hsiao Tsai 《Journal of applied statistics》2022,49(11):2953

Inflated data and over-dispersion are two common problems when modeling count data with traditional Poisson regression models. In this study, we propose a latent class inflated Poisson (LCIP) regression model to address the unobserved heterogeneity that leads to inflation and over-dispersion. The performance of the model estimation is evaluated through simulation studies. We illustrate the usefulness of introducing a latent class variable by analyzing the Behavioral Risk Factor Surveillance System (BRFSS) data, which contain several excessive values and are characterized by over-dispersion. The proposed model displays a better fit than the standard Poisson regression and zero-inflated Poisson regression models for the inflated counts. KEYWORDS: Inflated data, latent class, heterogeneity, Poisson regression, over-dispersion

9.

Analyzing clustered count data with a cluster-specific random effect zero-inflated Conway–Maxwell–Poisson distribution

Hyoyoung Choo-Wosoba Somnath Datta 《Journal of applied statistics》2018,45(5):799-814

Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are Poisson and negative binomial. However, a Poisson distribution can only handle equidispersed data and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike the Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via a Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.
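The claim that a CMP distribution covers both under- and overdispersion can be checked numerically. The sketch below assumes the standard CMP form with weights proportional to `lam**y / (y!)**nu` and approximates the normalizing constant by truncating the series at a hypothetical cutoff `max_y`. It does not include the zero-inflation component, the random effect, or the Gaussian quadrature described in the abstract.

```python
import math


def cmp_pmf(lam, nu, max_y=200):
    """Approximate CMP probabilities for y = 0..max_y.

    Weights are lam**y / (y!)**nu, normalized over a truncated support;
    max_y is an illustrative cutoff that should be large enough for the
    tail mass to be negligible.
    """
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_y + 1)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    z = sum(w)
    return [v / z for v in w]


def dispersion_index(pmf):
    """Variance-to-mean ratio of a pmf given as a list indexed by y."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return var / mean


# nu = 1 recovers the Poisson case; nu < 1 and nu > 1 move the
# variance-to-mean ratio above and below 1 respectively.
for nu in (0.5, 1.0, 2.0):
    print(nu, dispersion_index(cmp_pmf(3.0, nu)))
```

Working on the log scale and subtracting the largest log-weight keeps the truncated sum stable even when `lam` is large or `nu` is small.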

10.

Zero-spiked regression models generated by gamma random variables with application in the resin oil production

Elizabeth M. Hashimoto Gauss M. Cordeiro Vicente G. Cancho Carine Klauberg 《Journal of Statistical Computation and Simulation》2019,89(1):52-70

Zero-inflated data are more frequent when the data represent counts. However, there are practical situations in which continuous data contain an excess of zeros. In these cases, the zero-inflated Poisson, binomial or negative binomial models are not suitable. To fill this gap, we propose the zero-spiked gamma-Weibull (ZSGW) model by mixing a distribution which is degenerate at zero with the gamma-Weibull distribution, which has positive support. The model attempts to estimate simultaneously the effects of explanatory variables on the response variable and on the zero spike. We consider a frequentist analysis and a non-parametric bootstrap for estimating the parameters of the ZSGW regression model. We derive the appropriate matrices for assessing local influence on the model parameters. We illustrate the performance of the proposed regression model by means of a real data set (copaiba oil resin production) from a study carried out at the Department of Forest Science of the Luiz de Queiroz School of Agriculture, University of São Paulo. Based on the ZSGW regression model, we determine the explanatory variables that can influence the excess of zeros of the resin oil production and identify influential observations. We also show empirically that the proposed regression model can be superior to the zero-adjusted inverse Gaussian regression model for fitting zero-inflated positive continuous data.

11.

Estimation of Median in Two-Phase Sampling Using Two Auxiliary Variables

Sat Gupta Javid Shabbir Shabbir Ahmad 《Communications in Statistics - Theory and Methods》2013,42(11):1815-1822

In recent years, zero-inflated count data models, such as zero-inflated Poisson (ZIP) models, have been widely used because count data with extra zeros are very common in many practical problems. In order to model correlated count data which are either clustered or repeated and to assess the effects of continuous covariates or of time scales in a flexible way, a class of semiparametric mixed-effects models for zero-inflated count data is considered. In this article, we propose a fully Bayesian inference for such models based on a data augmentation scheme that reflects both the random effects of covariates and the mixture of the zero-inflated distribution. A computationally efficient MCMC method which combines the Gibbs sampler and the M-H algorithm is implemented to obtain the estimates of the model parameters. Finally, a simulation study and a real example are used to illustrate the proposed methodologies.

12.

Bayesian zero-inflated generalized Poisson regression model: estimation and case influence diagnostics

Feng-Chang Xie Jin-Guan Lin Bo-Cheng Wei 《Journal of applied statistics》2014,41(6):1383-1392

Count data with excess zeros arise in many contexts. Here our concern is to develop a Bayesian analysis for the zero-inflated generalized Poisson (ZIGP) regression model to address this problem. This model provides a useful generalization of the zero-inflated Poisson model since the generalized Poisson distribution is overdispersed/underdispersed relative to the Poisson. Due to the complexity of the ZIGP model, Markov chain Monte Carlo methods are used to develop a Bayesian procedure for the considered model. Additionally, some discussion of the model selection criteria is presented and Bayesian case-deletion influence diagnostics are investigated for the joint posterior distribution based on the Kullback–Leibler divergence. Finally, a simulation study and a psychological example are given to illustrate our methodology.

13.

Bayesian estimation and case influence diagnostics for the zero-inflated negative binomial regression model (total citations: 1; self-citations: 0; citations by others: 1)

Aldo M. Garay Victor H. Lachos Heleno Bolfarine 《Journal of applied statistics》2015,42(6):1148-1165

In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is utilized not only to compute some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than the Poisson counterpart.
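The two sources of zeros described in this abstract can be written out explicitly. The block below is a sketch of the usual ZINB probability function, not a formula quoted from the paper; the symbols $\pi$ (zero-inflation probability), $\mu$ (negative binomial mean), and $\phi$ (negative binomial shape) are notation assumed here.

```latex
\Pr(Y = y) =
\begin{cases}
\pi + (1-\pi)\left(\dfrac{\phi}{\phi+\mu}\right)^{\phi}, & y = 0,\\[2ex]
(1-\pi)\,\dfrac{\Gamma(y+\phi)}{\Gamma(\phi)\,y!}
\left(\dfrac{\phi}{\phi+\mu}\right)^{\phi}
\left(\dfrac{\mu}{\phi+\mu}\right)^{y}, & y = 1, 2, \dots
\end{cases}
\qquad
\mathrm{E}(Y) = (1-\pi)\,\mu,
\quad
\mathrm{Var}(Y) = (1-\pi)\,\mu\left(1 + \frac{\mu}{\phi} + \pi\mu\right).
```

In this notation, letting $\phi \to \infty$ recovers the ZIP model, which is why overdispersion in the non-zero counts shows up as a finite $\phi$.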

14.

Bivariate zero-inflated generalized Poisson regression model with flexible covariance

Pouya Faroughi 《Communications in Statistics - Theory and Methods》2017,46(15):7769-7785

This paper introduces several forms of nested bivariate zero-inflated generalized Poisson (BZIGP) regression model which can be fitted to bivariate and zero-inflated count data. The main advantage of having several forms of BZIGP regression model is that they are nested and allow likelihood ratio tests to be performed for choosing the best model. In addition, the BZIGP regression models have flexible forms of marginal mean–variance relationship, can be fitted to bivariate and zero-inflated count data with positive or negative correlations, and allow additional overdispersion of the two response variables. The BZIGP regression models are fitted to the Australian Health Survey data.

15.

Marginal zero-inflated regression models for count data

Jacob Martin Daniel B. Hall 《Journal of applied statistics》2017,44(10):1807-1826

Data sets with excess zeroes are frequently analyzed in many disciplines. A common framework used to analyze such data is the zero-inflated (ZI) regression model. It mixes a degenerate distribution with point mass at zero with a non-degenerate distribution. The estimates from ZI models quantify the effects of covariates on the means of latent random variables, which are often not the quantities of primary interest. Recently, marginal zero-inflated Poisson (MZIP; Long et al. [A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 33 (2014), pp. 5151–5165]) and negative binomial (MZINB; Preisser et al., 2016) models have been introduced that model the mean response directly. These models yield covariate effects that have simple interpretations that are, for many applications, more appealing than those available from ZI regression. This paper outlines a general framework for marginal zero-inflated models where the latent distribution is a member of the exponential dispersion family, focusing on common distributions for count data. In particular, our discussion includes the marginal zero-inflated binomial (MZIB) model, which has not been discussed previously. The details of maximum likelihood estimation via the EM algorithm are presented and the properties of the estimators as well as Wald and likelihood ratio-based inference are examined via simulation. Two examples presented illustrate the advantages of MZIP, MZINB, and MZIB models for practical data analysis.
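The difference between a conventional ZI model and a marginal one is easiest to see for the Poisson case. The sketch below contrasts the two link specifications as they are commonly written; the symbols $\psi_i$, $\mu_i$, $\nu_i$, $x_i$, $z_i$, $\beta$, $\alpha$, and $\gamma$ are notation assumed here rather than taken from the paper.

```latex
\begin{aligned}
\text{ZIP:}\quad
  & \operatorname{logit}(\psi_i) = z_i^{\top}\gamma,
  && \log(\mu_i) = x_i^{\top}\beta,
  && \mathrm{E}(Y_i) = (1-\psi_i)\,\mu_i, \\
\text{MZIP:}\quad
  & \operatorname{logit}(\psi_i) = z_i^{\top}\gamma,
  && \log(\nu_i) = x_i^{\top}\alpha,
  && \nu_i = \mathrm{E}(Y_i) = (1-\psi_i)\,\mu_i .
\end{aligned}
```

Under the marginal specification the latent Poisson mean is recovered as $\mu_i = \exp(x_i^{\top}\alpha)/(1-\psi_i)$, and $\exp(\alpha_j)$ can be read as a ratio of overall means, whereas $\exp(\beta_j)$ in the ZIP model describes only the latent Poisson component.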

16.

Heterogeneous credit union production technologies with endogenous switching and correlated effects

Emir Malikov Diego A. Restrepo-Tobón Subal C. Kumbhakar 《Econometric Reviews》2018,37(10):1095-1119

Credit unions differ in the types of financial services they offer to their members. This article explicitly models this observed heterogeneity using a generalized model of endogenous ordered switching. Our approach captures the endogenous choice that credit unions make when adding new products to their financial services mix. The model that we consider also allows for the dependence between unobserved effects and regressors in both the selection and outcome equations and can accommodate the presence of predetermined covariates in the model. We use this model to estimate returns to scale for U.S. retail credit unions from 1996 to 2011. We document strong evidence of persistent technological heterogeneity among credit unions offering different financial service mixes, which, if ignored, can produce quite misleading results. Employing our model, we find that credit unions of all types exhibit substantial economies of scale.

17.

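Application of an improved zero-inflated Poisson model to the modeling of zero claims

Guo Nianguo (郭念国)《Statistics & Information Forum (统计与信息论坛)》2010,25(7):22-25

Zero inflation is a common phenomenon in non-life actuarial science, and many researchers in China and abroad have studied it. The most influential approaches are the zero-inflated Poisson model and the hurdle model, but both fall short when it comes to distinguishing among different kinds of zeros. In practice, policyholders with zero claims are not all homogeneous, and extracting the information contained in the zeros is important for insurance companies. In view of this, and building on the ideas of the zero-inflated Poisson and hurdle models, a modified zero-inflated Poisson model is proposed and fitted to real non-life actuarial data. A comparison with the fit of the zero-inflated Poisson model shows that the modified zero-inflated model treats zeros in a way that is closer to reality and better reflects the information contained in the zeros.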

18.

Stochastic variational inference for large-scale discrete choice models using adaptive batch sizes

Linda S. L. Tan 《Statistics and Computing》2017,27(1):237-257

Discrete choice models describe the choices made by decision makers among alternatives and play an important role in transportation planning, marketing research and other applications. The mixed multinomial logit (MMNL) model is a popular discrete choice model that captures heterogeneity in the preferences of decision makers through random coefficients. While Markov chain Monte Carlo methods provide the Bayesian analogue to classical procedures for estimating MMNL models, computations can be prohibitively expensive for large datasets. Approximate inference can be obtained using variational methods at a lower computational cost with competitive accuracy. In this paper, we develop variational methods for estimating MMNL models that allow random coefficients to be correlated in the posterior and can be extended easily to large-scale datasets. We explore three alternatives: (1) Laplace variational inference, (2) nonconjugate variational message passing and (3) stochastic linear regression. Their performances are compared using real and simulated data. To accelerate convergence for large datasets, we develop stochastic variational inference for MMNL models using each of the above alternatives. Stochastic variational inference allows data to be processed in minibatches by optimizing global variational parameters using stochastic gradient approximation. A novel strategy for increasing minibatch sizes adaptively within stochastic variational inference is proposed.

19.

Likelihood estimation for longitudinal zero-inflated power series regression models

E. Bahrami Samani Y. Amirian M. Ganjali 《Journal of applied statistics》2012,39(9):1965-1974

In this paper, a zero-inflated power series regression model for longitudinal count data with excess zeros is presented. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. Simulation studies indicate that this method can provide improvements in obtaining standard errors of the estimates. We also calculate the dispersion index for this model. The influence of a small perturbation of the dispersion index of the zero-inflated model on likelihood displacement is also studied. The zero-inflated negative binomial regression model is illustrated on data regarding joint damage in psoriatic arthritis.

20.

Testing overdispersion in the zero-inflated Poisson model

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on the ZINB model for use in ZIP regression models have been developed. Due to its similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternative model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates the proposed score statistics are preferred to other tests for higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on the asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis.