Similar literature
Found 20 similar documents; search took 46 ms
1.
Zero-inflated regression models such as the zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), and zero-inflated generalized Poisson (ZIGP) models can model count data with excess zeros. The ZINB model can additionally handle over-dispersed count data, and the ZIGP model can handle both over- and under-dispersed count data. Moreover, count data may be correlated because of the data collection procedure or a special study design; clustered sampling is one example in which correlation among subjects arises. In such situations, a marginal model using the generalized estimating equation (GEE) approach can incorporate these correlations and yield relationships at the population level. In this study, a GEE-based zero-inflated generalized Poisson regression model is proposed to fit over- and under-dispersed clustered count data with excess zeros.

2.
Count data with structural zeros are common in public health applications. Considerable research has focused on zero-inflated models, such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models, for such data when they are used as the response variable. However, when such variables are used as predictors, the difference between structural and random zeros is often ignored, which may result in biased estimates. One remedy is to include an indicator of the structural zero in the model as a predictor, if it is observed. However, structural zeros are often not observed in practice, in which case no statistical method is available to address the bias. This paper aims to fill this methodological gap by developing parametric methods, based on the maximum likelihood approach, to model zero-inflated count data used as predictors. The response variable can be of any type, including continuous, binary, count, or even zero-inflated count responses. Simulation studies are performed to assess the numerical performance of the new approach when the sample size is small to moderate. A real data example is also used to demonstrate the application of the method.

3.
Count data with excess zeros often occur in areas such as public health, epidemiology, psychology, sociology, engineering, and agriculture. Zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression are useful for modeling such data, but because of hierarchical study design or the data collection procedure, zero-inflation and correlation may occur simultaneously. To overcome these challenges, ZIP or ZINB may still be used. In this paper, multilevel ZINB regression is used to overcome these problems. Parameters are estimated with an expectation-maximization algorithm in conjunction with the penalized likelihood and restricted maximum likelihood estimates for variance components. Alternative modeling strategies, namely the ZIP distribution, are also considered. The proposed model is applied to data on decayed, missing, and filled teeth of 12-year-old children.

4.
While excess zeros are often thought to cause data over-dispersion (i.e. when the variance exceeds the mean), this implication is not absolute. One should instead consider a flexible class of distributions that can address data dispersion along with excess zeros. This work develops a zero-inflated sum-of-Conway-Maxwell-Poissons (ZISCMP) regression as a flexible analysis tool to model count data that express significant data dispersion and contain excess zeros. This class of models contains several special case zero-inflated regressions, including zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated binomial (ZIB), and the zero-inflated Conway-Maxwell-Poisson (ZICMP). Through simulated and real data examples, we demonstrate class flexibility and usefulness. We further utilize it to analyze shark species data from Australia's Great Barrier Reef to assess the environmental impact of human action on the number of various species of sharks.

5.
Count data often contain an excessive number of zero outcomes. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed for accommodating zero-inflated data. However, if the data continue to suggest additional overdispersion, zero-inflated negative binomial (ZINB) and zero-inflated generalized Poisson (ZIGP) regression models have been considered as alternatives. This study proposes a score test for testing the ZIP regression model against ZIGP alternatives and proves that it is equal to the score test for testing the ZIP regression model against ZINB alternatives. The advantage of the score test over alternatives such as the likelihood ratio and Wald tests is that it can be used to determine whether a more complex model is appropriate without fitting the more complex model. Applications of the proposed score test to several datasets are also illustrated.

6.
Generalized Poisson (GP) regression is an increasingly popular approach for modeling overdispersed as well as underdispersed count data. Several parameterizations have been proposed for GP regression, and the two well-known models, the GP-1 and the GP-2, have been applied. The recently proposed GP-P regression has the advantage of nesting the GP-1 and the GP-2 parametrically, besides allowing statistical tests of the GP-1 and the GP-2 against a more general alternative. Count data often have more zero outcomes than are expected under the Poisson. This zero-inflation phenomenon is a specific cause of overdispersion, and the zero-inflated Poisson (ZIP) regression model has been proposed. However, if the data continue to suggest additional overdispersion, the zero-inflated negative binomial (ZINB-1 and ZINB-2) and the zero-inflated generalized Poisson (ZIGP-1 and ZIGP-2) regression models have been considered as alternatives. This article proposes a functional form of the ZIGP which mixes a distribution degenerate at zero with a GP-P distribution. The suggested model has the advantage of nesting the ZIP and the two well-known ZIGP (ZIGP-1 and ZIGP-2) regression models, besides allowing statistical tests of the ZIGP-1 and the ZIGP-2 against a more general alternative. The ZIP and the functional form of the ZIGP regression models are fitted, compared, and tested on two sets of count data: the Malaysian insurance claim data and the German healthcare data.

7.
In this study, estimation of the parameters of zero-inflated count regression models and computation of the posterior model probabilities of the log-linear models defined for each zero-inflated count regression model are investigated from the Bayesian point of view. In addition, the determination of the most suitable log-linear and regression models is investigated. Zero-inflated count regression models cover the zero-inflated Poisson, zero-inflated negative binomial, and zero-inflated generalized Poisson regression models. The classical approach has some problematic points, but the Bayesian approach does not have similar flaws. This work points out the reasons for using the Bayesian approach and lists the advantages and disadvantages of the classical and Bayesian approaches. As an application, a zoological data set with extra zeros, including structural and sampling zeros, is used. It is observed that fitting a zero-inflated negative binomial regression model creates no problems at all, even though this is known to be the most problematic procedure in the classical approach. Additionally, the best-fitting model is found to be the log-linear model under the negative binomial regression model that does not include three-way interactions of factors.

8.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has typically been considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, in which case the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is used not only to compute some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than the Poisson counterpart.
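The two sources of zeros described in this abstract can be written down as a mixture probability mass function. The following is a minimal sketch, not the paper's implementation: the mean/size parameterization of the negative binomial and the names `nb_log_pmf`, `zinb_pmf`, and `zinb_log_likelihood` are assumptions made for illustration, and covariates are omitted.

```python
import math

def nb_log_pmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and size (dispersion) r."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def zinb_pmf(y, pi, mu, r):
    """ZINB pmf: point mass at zero with weight pi, NB(mu, r) with weight 1 - pi."""
    nb = math.exp(nb_log_pmf(y, mu, r))
    if y == 0:
        # A zero is either structural (pi) or a sampling zero from the NB part.
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb

def zinb_log_likelihood(counts, pi, mu, r):
    """Log-likelihood of i.i.d. counts under the ZINB model (no covariates)."""
    return sum(math.log(zinb_pmf(y, pi, mu, r)) for y in counts)

# Hypothetical parameter values, chosen only to exercise the functions.
print(zinb_pmf(0, pi=0.2, mu=2.0, r=1.5))
print(zinb_log_likelihood([0, 0, 1, 3], pi=0.2, mu=2.0, r=1.5))
```

In a regression setting, `mu` and `pi` would typically be linked to covariates (for example through log and logit links); that step is left out here.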

9.
In this article, we compare the zero-inflated Poisson (ZIP) and negative binomial (NB) distributions based on the three most important criteria: the probability of zero, the mean value, and the variance. Our results show that with the same mean value and variance, the ZIP distribution always has a larger probability of zero; with the same mean value and probability of zero, the NB distribution always has a larger variance; and with the same variance and probability of zero, the ZIP distribution always has a larger mean value. We also study the properties of the Vuong test in model selection in three cases by simulation.
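The first comparison in this abstract (same mean and variance, different probability of zero) can be checked numerically for one parameter choice. This is a sketch under stated assumptions: the ZIP is parameterized by an inflation weight `pi` and Poisson rate `lam`, the NB by its mean and a size parameter `r` with variance mean + mean²/r, and the values 0.3 and 2.0 are arbitrary illustrations rather than values from the article.

```python
import math

def zip_moments(pi, lam):
    """Mean, variance, and P(Y = 0) of a ZIP(pi, lam) distribution."""
    mean = (1.0 - pi) * lam
    var = (1.0 - pi) * lam * (1.0 + pi * lam)
    p0 = pi + (1.0 - pi) * math.exp(-lam)
    return mean, var, p0

def nb_matching_mean_var(mean, var):
    """Size r and P(Y = 0) of the NB with the given mean and variance (var > mean)."""
    r = mean ** 2 / (var - mean)   # from var = mean + mean^2 / r
    p0 = (r / (r + mean)) ** r
    return r, p0

mean, var, zip_p0 = zip_moments(pi=0.3, lam=2.0)
r, nb_p0 = nb_matching_mean_var(mean, var)
print(f"mean={mean:.3f} var={var:.3f} NB size r={r:.3f}")
print(f"P(0) under ZIP={zip_p0:.4f}, under NB={nb_p0:.4f}")
```

A single parameter pair only illustrates the claim; the article's statement that the ordering always holds is a general result that this check does not prove.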

10.
The zero-inflated Poisson (ZIP) distribution is widely used for modeling a count data set when the frequency of zeros is higher than that expected under the Poisson distribution. There are many methods for making inferences about the inflation parameter in ZIP models, e.g. methods for testing the Poisson distribution (the inflation parameter is zero) against the ZIP distribution (the inflation parameter is positive). Most of these methods are based on maximum likelihood estimators, which do not have an explicit expression. However, estimators obtained by the method of moments are powerful enough and easy to obtain and implement. In this paper, we propose an approach based on the method of moments for making inferences about the inflation parameter in the ZIP distribution. Our method is also compared with some recent methods via a simulation study and is illustrated by an example.
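To illustrate why moment estimators for the ZIP are explicit, the sketch below solves the mean and variance equations, mean = (1 − π)λ and variance = (1 − π)λ(1 + πλ), for π and λ. This is the textbook moment estimator and is not necessarily the inference procedure proposed in the paper; the simulation settings and all function names are assumptions for illustration.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_zip(rng, n, pi, lam):
    """Draw n ZIP(pi, lam) variates: structural zero with probability pi."""
    return [0 if rng.random() < pi else sample_poisson(rng, lam) for _ in range(n)]

def zip_moment_estimates(counts):
    """Closed-form moment estimates (pi_hat, lam_hat) from sample mean and variance."""
    n = len(counts)
    m = sum(counts) / n
    s2 = sum((y - m) ** 2 for y in counts) / (n - 1)
    lam_hat = m + s2 / m - 1.0
    pi_hat = (s2 - m) / (m * m + s2 - m)
    return pi_hat, lam_hat

rng = random.Random(0)
counts = sample_zip(rng, 50_000, pi=0.3, lam=2.0)
pi_hat, lam_hat = zip_moment_estimates(counts)
print(f"pi_hat={pi_hat:.3f}, lam_hat={lam_hat:.3f}")
```

Note that `pi_hat` is negative whenever the sample variance falls below the sample mean, which is one reason inference about the inflation parameter needs more care than the point estimate alone.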

11.
Zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models are recommended for handling excessive zeros in count data. For various reasons, researchers may not address zero inflation. This paper helps educate researchers on (1) the importance of accounting for zero inflation and (2) the consequences of misspecifying the statistical model. Using simulations, we found that when the zero inflation in the data was ignored, estimation was poor and statistically significant findings were missed. When overdispersion within the zero-inflated data was ignored, poor estimation and inflated Type I errors resulted. Recommendations on when to use the ZINB and ZIP models are provided. In an illustration using a two-step model selection procedure (the likelihood ratio test and the Vuong test), the procedure correctly identified the ZIP model only when the distributions had moderate means and sample sizes, and it did not correctly identify the ZINB model or the zero inflation in the ZIP and ZINB distributions.

12.
Count data often display more zero outcomes than are expected under the Poisson regression model. The zero-inflated Poisson regression model has been suggested to handle zero-inflated data, whereas the zero-inflated negative binomial (ZINB) regression model has been fitted to zero-inflated data with additional overdispersion. For bivariate zero-inflated cases, several regression models, such as the bivariate zero-inflated Poisson (BZIP) and bivariate zero-inflated negative binomial (BZINB), have been considered. This paper introduces several forms of nested BZINB regression model which can be fitted to bivariate zero-inflated count data. The mean–variance approach is used for comparing the BZIP and our forms of the BZINB regression model. A similar approach was used by past researchers for defining several negative binomial and zero-inflated negative binomial regression models based on the appearance of linear and quadratic terms in the variance function. The nested BZINB regression models proposed in this study have several advantages: likelihood ratio tests can be performed for choosing the best model, the models have flexible forms of marginal mean–variance relationship, the models can be fitted to bivariate zero-inflated count data with positive or negative correlations, and the models allow additional overdispersion of the two dependent variables.

13.
Count data with excess zeros are common in many biomedical and public health applications. The zero-inflated Poisson (ZIP) regression model has been widely used in practice to analyze such data. In this paper, we extend the classical ZIP regression framework to model count time series with excess zeros. A Markov regression model is presented and developed, and the partial likelihood is employed for statistical inference. Partial likelihood inference has been successfully applied in modeling time series where the conditional distribution of the response lies within the exponential family. Extending this approach to ZIP time series poses methodological and theoretical challenges, since the ZIP distribution is a mixture and therefore lies outside the exponential family. In the partial likelihood framework, we develop an EM algorithm to compute the maximum partial likelihood estimator (MPLE). We establish the asymptotic theory of the MPLE under mild regularity conditions and investigate its finite sample behavior in a simulation study. The performances of different partial-likelihood based model selection criteria are compared in the presence of model misspecification. Finally, we present an epidemiological application to illustrate the proposed methodology.

14.
The objective of this study is to provide a comparative assessment for researchers dealing with the challenges of analyzing count data and to examine the factors associated with daily cigarette consumption among young people in Turkey. We fitted Poisson (P), negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), Poisson hurdle (PH), and negative binomial hurdle (NBH) regressions to cigarette consumption count data from the 2014 Turkey Health Survey. Our results showed that the ZINB and NBH models should be preferred. We also found that gender, employment, and tobacco use at home are the more influential factors for smokers and nonsmokers in the 15–24 age group in Turkey.

15.
In recent years, zero-inflated count data models, such as zero-inflated Poisson (ZIP) models, have been widely used, as count data with extra zeros are very common in many practical problems. In order to model correlated count data which are either clustered or repeated, and to assess the effects of continuous covariates or of time scales in a flexible way, a class of semiparametric mixed-effects models for zero-inflated count data is considered. In this article, we propose a fully Bayesian inference for such models based on a data augmentation scheme that reflects both the random effects of covariates and the mixture of the zero-inflated distribution. A computationally efficient MCMC method which combines the Gibbs sampler and the Metropolis-Hastings (M-H) algorithm is implemented to obtain estimates of the model parameters. Finally, a simulation study and a real example are used to illustrate the proposed methodologies.

16.
The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on ZINB model for use in ZIP regression models have been developed. Due to similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternate model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates the proposed score statistics are preferred to other tests for higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis.

17.
Longitudinal count data with excessive zeros frequently occur in social, biological, medical, and health research. To model such data, zero-inflated Poisson (ZIP) models are commonly used, after separating zero and positive responses. As longitudinal count responses are likely to be serially correlated, such separation may destroy the underlying serial correlation structure. To overcome this problem, observation- and parameter-driven modelling approaches have recently been proposed. In the observation-driven model, the response at a specific time point is modelled through the responses at previous time points after incorporating serial correlation. One limitation of the observation-driven model is that it fails to accommodate possible over-dispersion, which frequently occurs in count responses. This limitation is overcome in a parameter-driven model, where the serial correlation is captured through a latent process using random effects. We compare the results obtained by the two models. A quasi-likelihood approach has been developed to estimate the model parameters. The methodology is illustrated with the analysis of two real-life datasets. To examine model performance, the models are also compared through a simulation study.

18.
For frequency counts, the situation of extra zeros often arises in biomedical applications. This is demonstrated with count data from a dental epidemiological study in Belo Horizonte (the Belo Horizonte caries prevention study) which evaluated various programmes for reducing caries. Extra zeros, however, violate the variance–mean relationship of the Poisson error structure. This extra-Poisson variation can easily be explained by a special mixture model, the zero-inflated Poisson (ZIP) model. On the basis of the ZIP model, a graphical device is presented which not only summarizes the mixing distribution but also provides visual information about the overall mean. This device can be exploited to evaluate and compare various groups. Ways are discussed to include covariates and to develop an extension of the conventional Poisson regression. Finally, a method to evaluate intervention effects on the basis of the ZIP regression model is described and applied to the data of the Belo Horizonte caries prevention study.

19.
Multivariate zero-inflated Poisson (ZIP) distributions are important tools for modelling and analysing correlated count data with extra zeros. Unfortunately, existing multivariate ZIP distributions consider only the overall zero-inflation while the component zero-inflation is not well addressed. This paper proposes a flexible multivariate ZIP distribution, called the multivariate component ZIP distribution, in which both the overall and component zero-inflations are taken into account. Likelihood-based inference procedures including the calculation of maximum likelihood estimates of parameters in the model without and with covariates are provided. Simulation studies indicate that the performance of the proposed methods on the multivariate component ZIP model is satisfactory. The Australia health care utilisation data set is analysed to demonstrate that the new distribution is more appropriate than the existing multivariate ZIP distributions.

20.
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are the Poisson and the negative binomial. However, a Poisson distribution can only handle equidispersed data, and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike in Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.
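The claim that the CMP distribution covers both under- and overdispersion can be illustrated numerically. The sketch below assumes the usual CMP form, with probabilities proportional to λ^y / (y!)^ν, and approximates the normalizing constant by truncating the infinite sum; the truncation point `max_count`, the function names, and the parameter values are assumptions for illustration and do not come from the paper.

```python
import math

def cmp_pmf(lam, nu, max_count=400):
    """Truncated CMP pmf: P(Y = y) proportional to lam**y / (y!)**nu."""
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_count + 1)]
    shift = max(log_w)                       # subtract the max for stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)                           # truncated normalizing constant
    return [v / total for v in w]

def dispersion_index(pmf):
    """Variance-to-mean ratio of a pmf on 0, 1, 2, ..."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return var / mean

# nu < 1: overdispersed, nu = 1: Poisson (ratio 1), nu > 1: underdispersed.
for nu in (0.5, 1.0, 2.0):
    print(nu, round(dispersion_index(cmp_pmf(3.0, nu)), 4))
```

The truncation is adequate for moderate `lam`; for very small `nu` or large `lam` the sum converges slowly and `max_count` would need to grow.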
