Similar articles
1.
In this study, we deal with the problem of overdispersion beyond extra zeros for a collection of counts that can be correlated. Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial distributions have been considered. First, we propose a multivariate count model in which all counts follow the same distribution and are correlated. Then we extend this model in the sense that correlated counts may follow different distributions. To accommodate correlation among counts, we have considered correlated random effects for each individual in the mean structure, thus inducing dependency among observations common to an individual. The method is applied to real data to investigate variation in food resource use in a species of marsupial in a locality of the Brazilian Cerrado biome.

2.
Count data are routinely assumed to have a Poisson distribution, especially when there are no straightforward diagnostic procedures for checking this assumption. We reanalyse two data sets from crossover trials of treatments for angina pectoris, in which the outcomes are counts of anginal attacks. Standard analyses focus on treatment effects, averaged over subjects; we are also interested in the dispersion of these effects (treatment heterogeneity). We set up a log-Poisson model with random coefficients to estimate the distribution of the treatment effects and show that the analysis is very sensitive to the distributional assumption; the population variance of the treatment effects is confounded with the (variance) function that relates the conditional variance of the outcomes, given the subject's rate of attacks, to the conditional mean. Diagnostic model checks based on resampling from the fitted distribution indicate that the default choice of the Poisson distribution for the analysed data sets is poorly supported. We propose to augment the data sets with observations of the counts, made possibly outside the clinical setting, so that the conditional distribution of the counts could be established.

3.
Zero-inflated Poisson regression is a model commonly used to analyze data with excessive zeros. Although many models have been developed to fit zero-inflated data, most of them strongly depend on the special features of the individual data. For example, there is a need for new models when dealing with truncated and inflated data. In this paper, we propose a new model that is sufficiently flexible to model inflation and truncation simultaneously, and which is a mixture of a multinomial logistic and a truncated Poisson regression, in which the multinomial logistic component models the occurrence of excessive counts. The truncated Poisson regression models the counts that are assumed to follow a truncated Poisson distribution. The performance of our proposed model is evaluated through simulation studies, and our model is found to have the smallest mean absolute error and best model fit. In the empirical example, the data are truncated with inflated values of zero and fourteen, and the results show that our model has a better fit than the other competing models.

4.
This paper proposes a simple and flexible count data regression model which is able to incorporate overdispersion (the variance is greater than the mean) and which can be considered a competitor to the Poisson model. As is well known, this classical model imposes the restriction that the conditional mean of each count variable must equal the conditional variance. Nevertheless, for the common case of overdispersed counts the Poisson regression may not be appropriate, while the count regression model proposed here is potentially useful. We consider an application to model counts of medical care utilization by the elderly in the USA using a well-known data set from the National Medical Expenditure Survey (1987), where the dependent variable is the number of stays after hospital admission, and where 10 explanatory variables are analysed.
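
The equidispersion restriction mentioned in this abstract (mean equal to variance) can be checked informally with a dispersion index. The following is only a minimal sketch: the function name `dispersion_index` and the counts in `hypothetical_stays` are illustrative assumptions and are not taken from the paper or from the National Medical Expenditure Survey.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return sample variance divided by sample mean.

    Under a Poisson model this ratio should be close to 1;
    values well above 1 suggest overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


# Hypothetical counts (made up for illustration only).
hypothetical_stays = [0, 0, 0, 1, 0, 2, 0, 0, 7, 0, 1, 0]
print(dispersion_index(hypothetical_stays))
```

A ratio far above 1, as for these made-up counts, is the kind of evidence that would motivate moving away from a plain Poisson regression.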

5.
We present a novel model, which is a two-parameter extension of the Poisson distribution. Its normalizing constant is related to the Touchard polynomials, hence the name of this model. It is a flexible distribution that can account for both under- and overdispersion and concentration of zeros that are frequently found in non-Poisson count data. In contrast to some other generalizations, the Hessian matrix for maximum likelihood estimation of the Touchard parameters has a simple form. We exemplify with three data sets, showing that our suggested model is a competitive candidate for fitting non-Poisson counts.

6.
In this article we propose a new cure rate survival model. In our approach the number of competing causes of the event of interest is assumed to follow an exponential discrete power series distribution. An advantage of our model is that it is very flexible, including several particular cases, such as Bernoulli, geometric, Poisson, etc. Moreover, we derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to perform global influence analysis. Which distribution fits best can be tested in a straightforward way. Maximum likelihood estimation is discussed. Our proposed model is illustrated through cutaneous melanoma data.

7.
The mixed Poisson–inverse-Gaussian distribution has been used by Holla, Sankaran, Sichel, and others in univariate problems involving counts. We propose a Poisson–inverse-Gaussian regression model which can be used for regression analysis of counts. The model provides an attractive framework for incorporating random effects in Poisson regression models and in handling extra-Poisson variation. Maximum-likelihood and quasilikelihood-moment estimation is investigated and illustrated with an example involving motor-insurance claims.

8.
A useful discrete distribution (the Conway–Maxwell–Poisson distribution) is revived and its statistical and probabilistic properties are introduced and explored. This distribution is a two-parameter extension of the Poisson distribution that generalizes some well-known discrete distributions (Poisson, Bernoulli and geometric). It also leads to the generalization of distributions derived from these discrete distributions (i.e. the binomial and negative binomial distributions). We describe three methods for estimating the parameters of the Conway–Maxwell–Poisson distribution. The first is a fast simple weighted least squares method, which leads to estimates that are sufficiently accurate for practical purposes. The second method, using maximum likelihood, can be used to refine the initial estimates. This method requires iterations and is more computationally intensive. The third estimation method is Bayesian. Using the conjugate prior, the posterior density of the parameters of the Conway–Maxwell–Poisson distribution is easily computed. It is a flexible distribution that can account for overdispersion or underdispersion that is commonly encountered in count data. We also explore two sets of real world data demonstrating the flexibility and elegance of the Conway–Maxwell–Poisson distribution in fitting count data which do not seem to follow the Poisson distribution.
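
For readers unfamiliar with the Conway–Maxwell–Poisson distribution, its probability mass function is proportional to λ^y / (y!)^ν, with ν = 1 recovering the Poisson case. The sketch below evaluates it by truncating the normalizing series; the function name `com_poisson_pmf`, the argument names and the fixed truncation at `terms=200` are assumptions made for illustration, and none of the three estimation methods from the abstract is implemented here.

```python
import math


def com_poisson_pmf(y, lam, nu, terms=200):
    """Approximate P(Y = y) for a Conway-Maxwell-Poisson variable.

    Assumes lam > 0, nu > 0 and 0 <= y < terms. The normalizing constant
    sum_j lam**j / (j!)**nu is truncated after `terms` summands, which is
    only adequate when the series has effectively converged by then.
    """
    if lam <= 0 or nu <= 0:
        raise ValueError("this sketch assumes lam > 0 and nu > 0")
    log_lam = math.log(lam)
    # Work in log space to avoid overflow in lam**j and (j!)**nu.
    log_terms = [j * log_lam - nu * math.lgamma(j + 1) for j in range(terms)]
    largest = max(log_terms)
    log_z = largest + math.log(sum(math.exp(t - largest) for t in log_terms))
    return math.exp(y * log_lam - nu * math.lgamma(y + 1) - log_z)


def mean_and_variance(lam, nu, terms=200):
    """Mean and variance computed from the truncated pmf."""
    probs = [com_poisson_pmf(y, lam, nu, terms) for y in range(terms)]
    m = sum(y * p for y, p in enumerate(probs))
    v = sum((y - m) ** 2 * p for y, p in enumerate(probs))
    return m, v
```

With these definitions, `mean_and_variance(1.5, 0.5)` gives a variance above the mean (overdispersion) and `mean_and_variance(1.5, 2.0)` gives a variance below the mean (underdispersion), which is the flexibility the abstract refers to.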

9.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is utilized not only to compute some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than the Poisson counterpart.
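
The two sources of zeros described in this abstract (a point mass at zero and the negative binomial component) correspond to a simple mixture. The sketch below writes out that mixture for a single observation without covariates. The mean/size parameterization of the negative binomial and the names `nb_pmf`, `zinb_pmf`, `pi_zero`, `mu` and `size` are assumptions chosen for illustration; the Bayesian fitting and the influence diagnostics of the paper are not reproduced.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu > 0 and dispersion parameter size > 0."""
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi_zero, mu, size):
    """Zero-inflated negative binomial pmf.

    With probability pi_zero the count is a structural zero; otherwise it is
    drawn from the negative binomial component, which can also produce zeros.
    """
    nb = nb_pmf(y, mu, size)
    if y == 0:
        return pi_zero + (1.0 - pi_zero) * nb
    return (1.0 - pi_zero) * nb
```

In a regression setting `mu` and `pi_zero` would typically depend on covariates through link functions, which is omitted here to keep the sketch short.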

10.
We propose a model for count data from two-stage cluster sampling, where observations within each cluster are subjected simultaneously to internal influences and external factors at the cluster level. This model can be seen as a two-stage hierarchical model with local and global predictors. This parameter-driven model causes the counts within a cluster to share a common latent factor and to be correlated. Maximum likelihood (ML) estimation based on an EM algorithm for the model is discussed. A simulation study is carried out to assess the benefit of using ML estimates compared to a standard Poisson regression analysis that ignores the within-cluster correlation.

11.
Medical and public health research often involves the analysis of repeated or longitudinal count data that exhibit excess zeros, such as the number of yearly doctor visits by a group of individuals over a number of years. Zero-inflated Poisson (ZIP) regression models can be used to account for excess zeros in count data. We propose an extension of the ZIP model that is appropriate for longitudinal data. Our extension includes a correlation structure based on a non-stationary, observation-driven time series model. We discuss estimation of the model parameters and the inefficiency of the estimators when the correlation structure is mis-specified. The model's application to the analysis of health care utilization data is also discussed.

12.
We present a graphical method based on the empirical probability generating function for preliminary statistical analysis of distributions for counts. The method is especially useful in fitting a Poisson model, or for identifying alternative models as well as possible outlying observations from general discrete distributions.

13.
Mixed Poisson distributions are widely used in various applications of count data mainly when extra variation is present. This paper introduces an extension in terms of a mixed strategy to jointly deal with extra-Poisson variation and zero-inflated counts. In particular, we propose the Poisson log-skew-normal distribution which utilizes the log-skew-normal as a mixing prior and present its main properties. This is directly done through an additional hierarchy level to the lognormal prior and includes the Poisson lognormal distribution as its special case. Two numerical methods are developed for the evaluation of associated likelihoods based on the Gauss–Hermite quadrature and the Lambert W function. By conducting simulation studies, we show that the proposed distribution performs better than several commonly used distributions that allow for over-dispersion or zero inflation. The usefulness of the proposed distribution in empirical work is highlighted by the analysis of a real data set taken from health economics contexts.

14.
The log-logistic distribution is commonly used to model lifetime data. We propose a wider distribution, named the exponentiated log-logistic geometric distribution, based on a double activation approach. We obtain the quantile function, ordinary moments, and generating function. The method of maximum likelihood is used to estimate the model parameters. We propose a new extended regression model based on the logarithm of the exponentiated log-logistic geometric distribution. This regression model can be very useful in the analysis of real data and could provide better fits than other special regression models. The potential of the new models is illustrated by means of two applications to real lifetime data sets.

15.
The zero truncated inverse Gaussian–Poisson model, obtained by first mixing the Poisson model assuming its expected value has an inverse Gaussian distribution and then truncating the model at zero, is very useful when modelling frequency count data. A Bayesian analysis based on this statistical model is implemented on the word frequency counts of various texts, and its validity is checked by exploring the posterior distribution of the Pearson errors and by implementing posterior predictive consistency checks. The analysis based on this model is useful because it allows one to use the posterior distribution of the model mixing density as an approximation of the posterior distribution of the density of the word frequencies of the vocabulary of the author, which is useful to characterize the style of that author. The posterior distribution of the expectation and of measures of the variability of that mixing distribution can be used to assess the size and diversity of his vocabulary. An alternative analysis is proposed based on the inverse Gaussian-zero truncated Poisson mixture model, which is obtained by switching the order of the mixing and the truncation stages. Even though this second model fits some of the word frequency data sets more accurately than the first model, in practice the analysis based on it is not as useful because it does not allow one to estimate the word frequency distribution of the vocabulary.

16.
Bayesian methods are often used to reduce the sample sizes and/or increase the power of clinical trials. The right choice of the prior distribution is a critical step in Bayesian modeling. If the prior is not completely specified, historical data may be used to estimate it. In the empirical Bayesian analysis, the resulting prior can be used to produce the posterior distribution. In this paper, we describe a Bayesian Poisson model with a conjugate Gamma prior. The parameters of the Gamma distribution are estimated in the empirical Bayesian framework under two estimation schemes. The straightforward numerical search for the maximum likelihood (ML) solution using the marginal negative binomial distribution is occasionally infeasible. We propose a simplification to the maximization procedure. The Markov Chain Monte Carlo method is used to create a set of Poisson parameters from the historical count data. These Poisson parameters are used to uniquely define the Gamma likelihood function. Easily computable approximation formulae may be used to find the ML estimates for the parameters of the Gamma distribution. For the sample size calculations, the ML solution is replaced by its upper confidence limit to reflect an incomplete exchangeability of historical trials as opposed to current studies. The exchangeability is measured by the confidence interval for the historical rate of the events. With this prior, the formula for the sample size calculation is completely defined. Published in 2009 by John Wiley & Sons, Ltd.
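
The conjugacy this abstract relies on can be illustrated in a few lines: with a Gamma prior on the Poisson rate, the posterior is again Gamma and the marginal distribution of a count is negative binomial. The sketch below assumes a shape/rate parameterization and unit exposure per observation; the names `gamma_posterior` and `marginal_count_pmf` and the example counts are hypothetical, and the paper's empirical Bayes estimation schemes and sample size formula are not implemented.

```python
import math


def gamma_posterior(shape, rate, counts):
    """Conjugate update for a Poisson rate with a Gamma(shape, rate) prior.

    Assumes each count was observed over one unit of exposure.
    Returns the posterior (shape, rate).
    """
    return shape + sum(counts), rate + len(counts)


def marginal_count_pmf(y, shape, rate):
    """Marginal pmf of one Poisson count after integrating out the Gamma rate.

    This is a negative binomial distribution with mean shape / rate.
    """
    log_p = (
        math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
        + shape * math.log(rate / (rate + 1.0))
        - y * math.log(rate + 1.0)
    )
    return math.exp(log_p)


# Hypothetical prior and counts, for illustration only.
post_shape, post_rate = gamma_posterior(2.0, 1.0, [3, 0, 4, 1])
posterior_mean_rate = post_shape / post_rate
```

Here the posterior mean of the rate is the ratio of the updated shape to the updated rate, so it moves from the prior mean toward the sample mean as more counts are observed.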

17.
When incomplete repeated failure times are collected from a large number of independent individuals, interest is focused primarily on the consistent and efficient estimation of the effects of the associated covariates on the failure times. Since repeated failure times are likely to be correlated, it is important to exploit the correlation structure of the failure data in order to obtain such consistent and efficient estimates. However, it may be difficult to specify an appropriate correlation structure for a real life data set. We propose a robust correlation structure that can be used irrespective of the true correlation structure. This structure is used in constructing an estimating equation for the hazard ratio parameter, under the assumption that the number of repeated failure times for an individual is random. The consistency and efficiency of the estimates are examined through a simulation study, where we consider failure times that marginally follow an exponential distribution and a Poisson distribution is assumed for the random number of repeated failure times. We conclude by using the proposed method to analyze a bladder cancer dataset.

18.
Zero-inflated data are more frequent when the data represent counts. However, there are practical situations in which continuous data contain an excess of zeros. In these cases, the zero-inflated Poisson, binomial or negative binomial models are not suitable. In order to reduce this gap, we propose the zero-spiked gamma-Weibull (ZSGW) model by mixing a distribution which is degenerate at zero with the gamma-Weibull distribution, which has positive support. The model attempts to estimate simultaneously the effects of explanatory variables on the response variable and on the zero spike. We consider a frequentist analysis and a non-parametric bootstrap for estimating the parameters of the ZSGW regression model. We derive the appropriate matrices for assessing local influence on the model parameters. We illustrate the performance of the proposed regression model by means of a real data set (copaiba oil resin production) from a study carried out at the Department of Forest Science of the Luiz de Queiroz School of Agriculture, University of São Paulo. Based on the ZSGW regression model, we determine the explanatory variables that can influence the excess of zeros of the resin oil production and identify influential observations. We also show empirically that the proposed regression model can be superior to the zero-adjusted inverse Gaussian regression model to fit zero-inflated positive continuous data.

19.
We describe a class of random field models for geostatistical count data based on Gaussian copulas. Unlike hierarchical Poisson models often used to describe this type of data, Gaussian copula models allow a more direct modelling of the marginal distributions and association structure of the count data. We study in detail the correlation structure of these random fields when the family of marginal distributions is either negative binomial or zero-inflated Poisson; these represent two types of overdispersion often encountered in geostatistical count data. We also contrast the correlation structure of one of these Gaussian copula models with that of a hierarchical Poisson model having the same family of marginal distributions, and show that the former is more flexible than the latter in terms of range of feasible correlation, sensitivity to the mean function and modelling of isotropy. An exploratory analysis of a dataset of Japanese beetle larvae counts illustrates some of the findings. All of these investigations show that Gaussian copula models are useful alternatives to hierarchical Poisson models, especially for geostatistical count data that display substantial correlation and small overdispersion.

20.
In this paper, we propose a cure rate survival model by assuming that the number of competing causes of the event of interest follows the Poisson distribution and the time to event has the Birnbaum–Saunders (BS) distribution. We define the Poisson BS distribution and provide two useful representations for its density function which make it easier to obtain some mathematical properties. Two closed-form expressions for the moments of the new distribution are given. We estimate the parameters of the model with cure rate using maximum likelihood. For different parameter settings, sample sizes and censoring percentages, several simulations are performed. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to perform a global influence study. We analyse a real data set from the medical area.
