Similar articles
20 similar articles found.
1.
Count data with excess zeros often occur in areas such as public health, epidemiology, psychology, sociology, engineering, and agriculture. Zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression are useful for modeling such data, but because of hierarchical study design or the data collection procedure, zero-inflation and correlation may occur simultaneously. To overcome these challenges, ZIP or ZINB may still be used. In this paper, multilevel ZINB regression is used to overcome these problems. The method of parameter estimation is an expectation-maximization algorithm in conjunction with the penalized likelihood and restricted maximum likelihood estimates for variance components. Alternative modeling strategies, namely the ZIP distribution, are also considered. An application of the proposed model is shown on decayed, missing, and filled teeth of 12-year-old children.

2.
Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and nonparametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.
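
The link this abstract draws between random effect terms and the negative binomial distribution can be sketched as the standard Poisson-gamma mixture. The symbols $\mu$ (mean), $u$ (multiplicative random effect), and $\alpha$ (dispersion) are notation assumed here for illustration and are not taken from the paper.

```latex
\begin{align*}
Y \mid u &\sim \mathrm{Poisson}(\mu u), \qquad
u \sim \mathrm{Gamma}(\text{shape} = 1/\alpha,\ \text{scale} = \alpha),
\quad E[u] = 1,\ \mathrm{Var}(u) = \alpha \\
P(Y = y) &= \frac{\Gamma(y + 1/\alpha)}{\Gamma(1/\alpha)\, y!}
\left(\frac{1}{1 + \alpha\mu}\right)^{1/\alpha}
\left(\frac{\alpha\mu}{1 + \alpha\mu}\right)^{y},
\qquad y = 0, 1, 2, \dots \\
E[Y] &= \mu, \qquad \mathrm{Var}(Y) = \mu + \alpha\mu^{2}
\end{align*}
```

Under a log link, $\log \mu = x^\top \beta$, each $e^{\beta_j}$ reads as a rate ratio, and $\alpha \to 0$ recovers the Poisson variance.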

3.
We extend the family of Poisson and negative binomial models to derive the joint distribution of clustered count outcomes with extra zeros. Two random effects models are formulated. The first model assumes a shared random effects term between the conditional probability of perfect zeros and the conditional mean of the imperfect state. The second formulation relaxes the shared random effects assumption by relating the conditional probability of perfect zeros and the conditional mean of the imperfect state to two different but correlated random effects variables. Under the conditional independence and the missing data at random assumption, a direct optimization of the marginal likelihood and an EM algorithm are proposed to fit the proposed models. Our proposed models are fitted to dental caries counts of children under the age of six in the city of Detroit.

4.
Data sets with excess zeroes are frequently analyzed in many disciplines. A common framework used to analyze such data is the zero-inflated (ZI) regression model. It mixes a degenerate distribution with point mass at zero with a non-degenerate distribution. The estimates from ZI models quantify the effects of covariates on the means of latent random variables, which are often not the quantities of primary interest. Recently, marginal zero-inflated Poisson (MZIP; Long et al. [A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 33 (2014), pp. 5151–5165]) and negative binomial (MZINB; Preisser et al., 2016) models have been introduced that model the mean response directly. These models yield covariate effects that have simple interpretations that are, for many applications, more appealing than those available from ZI regression. This paper outlines a general framework for marginal zero-inflated models where the latent distribution is a member of the exponential dispersion family, focusing on common distributions for count data. In particular, our discussion includes the marginal zero-inflated binomial (MZIB) model, which has not been discussed previously. The details of maximum likelihood estimation via the EM algorithm are presented and the properties of the estimators as well as Wald and likelihood ratio-based inference are examined via simulation. Two examples presented illustrate the advantages of MZIP, MZINB, and MZIB models for practical data analysis.
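
The contrast between modeling the latent mean and modeling the overall mean can be written out as follows. This is a sketch in assumed notation ($\pi$, $\mu$, $\nu$, $\gamma$, $\beta$, $\alpha$, and the covariate vectors $x$, $z$ are not taken from the abstract), and the log link is shown as an assumption that suits the Poisson and negative binomial cases; other latent distributions would call for a different link.

```latex
\begin{align*}
P(Y = y) &= \pi\,\mathbf{1}\{y = 0\} + (1 - \pi)\, f(y;\ \mu, \phi),
\qquad y = 0, 1, 2, \dots \\
\nu \equiv E[Y] &= (1 - \pi)\,\mu \\
\text{ZI regression:}\quad
& \operatorname{logit}(\pi) = z^\top \gamma, \qquad \log \mu = x^\top \beta \\
\text{marginal ZI regression:}\quad
& \operatorname{logit}(\pi) = z^\top \gamma, \qquad \log \nu = x^\top \alpha
\end{align*}
```

In the first parameterization $e^{\beta_j}$ describes the latent (non-degenerate) component only; in the second, $e^{\alpha_j}$ is a ratio of overall means, which is the simpler interpretation the abstract refers to.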

5.
The objective of this paper is to propose an efficient estimation procedure in a marginal mean regression model for longitudinal count data and to develop a hypothesis test for detecting the presence of overdispersion. We extend the matrix expansion idea of quadratic inference functions to the negative binomial regression framework, which entails accommodating both the within-subject correlation and overdispersion. Theoretical and numerical results show that the proposed procedure yields a more efficient estimator asymptotically than the one ignoring either the within-subject correlation or overdispersion. When overdispersion is absent in the data, the proposed method might hinder the estimation efficiency in practice, yet the Poisson-regression-based model fits the data sufficiently well. Therefore, we construct a hypothesis test that recommends an appropriate model for the analysis of the correlated count data. Extensive simulation studies indicate that the proposed test can identify the effective model consistently. The proposed procedure is also applied to a transportation safety study and recommends the proposed negative binomial regression model.

6.
A semiparametric zero-inflated negative binomial (ZINB) regression model is proposed for the analysis of count data sets in which zero counts are excessive, nonzero counts are overdispersed, and the effect of a continuous covariate might be nonlinear. The unspecified smooth functional form for the continuous covariate effect is approximated by a cubic spline. The semiparametric ZINB regression model is fitted by maximizing the likelihood function. The likelihood ratio procedure is used to evaluate the adequacy of a postulated parametric functional form for the continuous covariate effect. An extensive simulation study is conducted to assess the finite-sample performance of the proposed test. The practicality of the proposed methodology is demonstrated with data of a motorcycle survey of traffic regulations conducted in 2007 in Taiwan by the Ministry of Transportation and Communication.

7.
In this study, we deal with the problem of overdispersion beyond extra zeros for a collection of counts that can be correlated. Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial distributions have been considered. First, we propose a multivariate count model in which all counts follow the same distribution and are correlated. Then we extend this model so that correlated counts may follow different distributions. To accommodate correlation among counts, we have considered correlated random effects for each individual in the mean structure, thus inducing dependency among observations common to an individual. The method is applied to real data to investigate variation in food resource use in a species of marsupial in a locality of the Brazilian Cerrado biome.

8.
The objective of this study is to provide a comparative assessment for researchers dealing with the challenges of analyzing count data and to examine the factors associated with daily cigarette consumption among young people in Turkey. We fitted Poisson (P), negative binomial (NB), zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), Poisson hurdle (PH) and negative binomial hurdle (NBH) regressions to cigarette consumption count data by using the 2014 Turkey Health Survey. Our results showed that the ZINB and NBH models should be preferred. We also found that gender, employment and tobacco use at home are more effective factors for smokers and nonsmokers in the 15–24 age group in Turkey.
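
The zero-inflated and hurdle families compared in this abstract differ in how they produce zeros. The sketch below contrasts the two probability mass functions for the Poisson case only; the function and parameter names (`zip_pmf`, `poisson_hurdle_pmf`, `pi_zero`, `p_zero`) are hypothetical and chosen for illustration, and no regression structure from the study is reproduced.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of observing count y with rate lam."""
    return math.exp(-lam) * lam**y / math.factorial(y)


def zip_pmf(y, lam, pi_zero):
    """Zero-inflated Poisson: a structural zero with probability pi_zero,
    otherwise a Poisson(lam) draw, which can itself be zero."""
    base = (1.0 - pi_zero) * poisson_pmf(y, lam)
    return pi_zero + base if y == 0 else base


def poisson_hurdle_pmf(y, lam, p_zero):
    """Poisson hurdle: all zeros come from one binary process with
    probability p_zero; positive counts follow a zero-truncated Poisson."""
    if y == 0:
        return p_zero
    truncated = poisson_pmf(y, lam) / (1.0 - poisson_pmf(0, lam))
    return (1.0 - p_zero) * truncated


# Illustrative values only (assumptions, not estimates from the study).
zip_zero = zip_pmf(0, lam=2.0, pi_zero=0.3)
hurdle_zero = poisson_hurdle_pmf(0, lam=2.0, p_zero=0.3)
```

With the same nominal zero parameter, the ZIP model assigns more mass to zero than the hurdle model because Poisson sampling zeros are added on top of the structural zeros.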

9.
This article proposes a variable selection approach for zero-inflated count data analysis based on the adaptive lasso technique. Two models including the zero-inflated Poisson and the zero-inflated negative binomial are investigated. An efficient algorithm is used to minimize the penalized log-likelihood function in an approximate manner. Both the generalized cross-validation and Bayesian information criterion procedures are employed to determine the optimal tuning parameter, and a consistent sandwich formula of standard errors for nonzero estimates is given based on local quadratic approximation. We evaluate the performance of the proposed adaptive lasso approach through extensive simulation studies, and apply it to analyze real-life data about doctor visits.

10.
Lesion count observed on brain magnetic resonance imaging scans is a common end point in phase 2 clinical trials evaluating therapeutic treatment in relapsing remitting multiple sclerosis (MS). This paper compares the performances of Poisson, zero‐inflated Poisson (ZIP), negative binomial (NB), and zero‐inflated NB (ZINB) mixed‐effects regression models in fitting lesion count data in a clinical trial evaluating the efficacy and safety of fingolimod in comparison with placebo, in MS. The NB and ZINB models prove to be superior to the Poisson and ZIP models. We discuss the advantages and limitations of zero‐inflated models in the context of MS treatment. Copyright © 2012 John Wiley & Sons, Ltd.

11.
Overdispersion is a problem encountered in the analysis of count data that can lead to invalid inference if unaddressed. The decision about whether data are overdispersed is often reached by checking whether the ratio of the Pearson chi-square statistic to its degrees of freedom is greater than one; however, there is currently no fixed threshold for declaring the need for statistical intervention. We consider simulated cross-sectional and longitudinal datasets containing varying magnitudes of overdispersion caused by outliers or zero inflation, as well as real datasets, to determine an appropriate threshold value of this statistic which indicates when overdispersion should be addressed.
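
The diagnostic described in this abstract, the Pearson chi-square statistic divided by its degrees of freedom, can be sketched for a Poisson model as below. The function name `pearson_dispersion`, its arguments, and the toy data are assumptions made for illustration; the fitted means would normally come from a regression fit rather than the intercept-only mean used here.

```python
def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom,
    using the Poisson variance function V(mu) = mu."""
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    df = len(y) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / df


# Toy data with many zeros and one large count (made up for illustration).
counts = [0, 0, 0, 0, 10]
# Intercept-only Poisson fit: every fitted mean equals the sample mean.
fitted = [sum(counts) / len(counts)] * len(counts)
ratio = pearson_dispersion(counts, fitted, n_params=1)
```

A ratio near one is consistent with the Poisson variance assumption, while values well above one point to overdispersion; where to place the cut-off is the question the abstract raises and is not settled by this sketch.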

12.
Count data with excess zeros are widely encountered in biomedicine, medicine, public health, social surveys, and other fields. Zero-inflated Poisson (ZIP) regression models with mixed effects are useful tools for analyzing such data, in which covariates are usually incorporated in the model to explain inter-subject variation and a normal distribution is assumed for both random effects and random errors. However, in many practical applications, such assumptions may be violated as the data often exhibit skewness and some covariates may be measured with error. In this paper, we deal with these issues simultaneously by developing a Bayesian joint hierarchical modeling approach. Specifically, by treating intercepts and slopes in logistic and Poisson regression as random, a flexible two-level ZIP regression model is proposed, where a covariate process with measurement errors is established and a skew-t-distribution is considered for both random errors and random effects. Under the Bayesian framework, model selection is carried out using the deviance information criterion (DIC), and a goodness-of-fit statistic is also developed for assessing the plausibility of the posited model. The main advantage of our method is that it allows for more robustness and correctness in investigating heterogeneity from different levels, while accommodating the skewness and measurement errors simultaneously. An application to the Shanghai Youth Fitness Survey is used as an illustrative example. Through this real example, it is shown that our approach is of interest and useful for applications.

13.
We present a bivariate regression model for count data that allows for positive as well as negative correlation of the response variables. The covariance structure is based on the Sarmanov distribution and consists of a product of generalised Poisson marginals and a factor that depends on particular functions of the response variables. The closed form of the probability function is derived by means of the moment-generating function. The model is applied to a large real dataset on health care demand. Its performance is compared with alternative models presented in the literature. We find that our model is significantly better than or at least equivalent to the benchmark models. It gives insights into influences on the variance of the response variables.

14.
Negative binomial regression (NBR) and Poisson regression (PR) applications have become very popular in the analysis of count data in recent years. However, if the independent variables are highly correlated, the problem of multicollinearity arises in these models. We introduce new two-parameter estimators (TPEs) for the NBR and the PR models by unifying the two-parameter estimator (TPE) of Özkale and Kaçıranlar [The restricted and unrestricted two-parameter estimators. Commun Stat Theory Methods. 2007;36:2707–2725]. These new estimators are general estimators which include the maximum likelihood (ML) estimator, ridge estimator (RE), Liu estimator (LE) and contraction estimator (CE) as special cases. Furthermore, biasing parameters of these estimators are given and a Monte Carlo simulation is done to evaluate the performance of these estimators using the mean square error (MSE) criterion. The benefits of the new TPEs are also illustrated in an empirical application. The results show that the newly proposed TPEs for the NBR and the PR models are better than the ML estimator, the RE and the LE.

15.
The constrained, non-normal nature of time-use data poses a challenge to ordinary analysis of variance. This paper investigates a computationally simple variance decomposition technique suitable for those data. As a by-product of the analysis, a measure of fit for systems of time-demand equations is proposed that possesses several useful properties.

16.
Much research has been performed in the area of multiple linear regression, with the result that the field is well-developed. This is not true of logistic regression, however. The latter presents special problems because the response is not continuous. Some of these problems are: the difficulty of developing a suitable R² statistic, possibly poor results produced by the method of maximum likelihood, and the challenge of developing suitable graphical techniques. We describe recent work in some of these directions, and discuss the need for additional research.

17.
18.
19.
The coefficient of determination, a.k.a. R², is well-defined in linear regression models, and measures the proportion of variation in the dependent variable explained by the predictors included in the model. To extend it to generalized linear models, we use the variance function to define the total variation of the dependent variable, as well as the remaining variation of the dependent variable after modeling the predictive effects of the independent variables. Unlike other definitions that demand complete specification of the likelihood function, our definition of R² only needs the mean and variance functions, so it is applicable to more general quasi-models. It is consistent with the classical measure of uncertainty using variance, and reduces to the classical definition of the coefficient of determination when linear regression models are considered.

20.
The present paper investigates, for the first time, the robustness of some of the familiar transformations of the sample correlation coefficient when the parent population is discrete. The three specific cases examined are the bivariate Poisson (BVP), the bivariate negative binomial (BNB), and the trinomial (TN). Investigation of the (near) normality of the transformed statistics is done by the techniques considered by Subrahmaniam and Gajjar. In addition, an empirical examination of their behaviour is carried out by the density estimation technique due to Tarter and Kronmal.
