Similar literature
 Found 20 similar documents (search time: 27 ms)
1.
Modelling count data is one of the most important issues in statistical research. In this paper, a new probability mass function is introduced by discretizing the continuous failure model of the Lindley distribution. The resulting model is over-dispersed and competitive with the Poisson distribution for fitting automobile claim frequency data. After reviewing some of its properties, a compound discrete Lindley distribution is obtained in closed form. This model is suitable for the collective risk model when both the number of claims and the size of a single claim are incorporated into the model. The new compound distribution decays to zero much more slowly than the classical compound Poisson distribution and is therefore suitable for modelling extreme data.
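The discretization step can be illustrated with a minimal sketch. It assumes the common survival-function construction P(X = x) = S(x) - S(x + 1), with the Lindley survival function S(x) = (1 + θ + θx) / (1 + θ) · exp(-θx); the abstract does not state which construction the authors use, and the function names and the value θ = 0.5 are illustrative only.

```python
import math

def lindley_survival(x, theta):
    # Survival function of the continuous Lindley distribution (theta > 0).
    return (1.0 + theta + theta * x) / (1.0 + theta) * math.exp(-theta * x)

def discrete_lindley_pmf(x, theta):
    # Assumed discretization: mass on integer x is S(x) - S(x + 1).
    return lindley_survival(x, theta) - lindley_survival(x + 1, theta)

def mean_and_variance(theta, upper=2000):
    # Moments from a truncated sum; the tail beyond `upper` is negligible
    # for moderate theta because S(x) decays exponentially.
    probs = [discrete_lindley_pmf(x, theta) for x in range(upper)]
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return mean, second - mean ** 2

if __name__ == "__main__":
    m, v = mean_and_variance(0.5)
    print("mean:", m, "variance:", v)  # variance exceeds the mean here
```

For this parameter value the variance exceeds the mean, which is consistent with the over-dispersion claimed in the abstract.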

2.
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model the excess zeros that are often present in these assays. The most common count distributions for analyzing such data are the Poisson and the negative binomial. However, a Poisson distribution can only handle equidispersed data, and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike in Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via Gaussian quadrature to circumvent this issue. A test for zero-inflation has also been developed in our setting. The finite-sample properties of our estimators and test have been investigated by extensive simulations. Finally, the methodology is applied to the maize data mentioned above.

3.
The Bernoulli and Poisson processes are two popular discrete count processes; however, both rely on strict assumptions. We instead propose a generalized homogeneous count process (which we name the Conway–Maxwell–Poisson or COM-Poisson process) that not only includes the Bernoulli and Poisson processes as special cases, but also serves as a flexible mechanism to describe count processes that approximate data with over- or under-dispersion. We introduce the process and an associated generalized waiting time distribution, with several real-data applications to illustrate its flexibility for a variety of data structures. We consider model estimation under different scenarios of data availability, and assess performance through simulated and real datasets. This new generalized process will enable analysts to model count processes with data dispersion in a more accommodating and flexible manner.

4.
In this paper we extend the Poisson regression model to deal with the situation in which the event count is observed in “grouped” form. By this we mean that for some observations, all that is known about the count is that it falls within a certain range of integers, and the actual value is unknown. A typical likelihood contribution for this extended model is the sum of a set of consecutive Poisson probabilities. The log-likelihood function is derived for a general grouping rule, using a logarithmic link for the Poisson mean. This log-likelihood function is shown to be globally concave. The model is applied to grouped count data on the frequency of trips to pubs made over a one-week period by a sample of Norfolk young persons.
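The likelihood contribution described above (a sum of consecutive Poisson probabilities under a log link) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the `(lo, hi)` interval encoding, and the use of `hi=None` for an open-ended top group are all hypothetical choices.

```python
import math

def poisson_pmf(k, mu):
    # Poisson probability computed in log space for numerical stability.
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def grouped_poisson_loglik(beta, X, intervals):
    """Log-likelihood for counts known only to lie in [lo, hi] (inclusive).

    beta      : list of regression coefficients
    X         : list of covariate rows, one per observation
    intervals : list of (lo, hi); hi=None means the group is open-ended
    """
    total = 0.0
    for x_row, (lo, hi) in zip(X, intervals):
        # Logarithmic link: mu = exp(x'beta).
        mu = math.exp(sum(b * x for b, x in zip(beta, x_row)))
        if hi is None:
            # P(Y >= lo) = 1 - P(Y <= lo - 1)
            prob = 1.0 - sum(poisson_pmf(k, mu) for k in range(lo))
        else:
            prob = sum(poisson_pmf(k, mu) for k in range(lo, hi + 1))
        total += math.log(prob)
    return total
```

An exactly observed count is the special case `lo == hi`, so the ordinary Poisson regression log-likelihood is recovered when no observation is grouped.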

5.
In count data models, overdispersion of the dependent variable can be incorporated into the model if a heterogeneity term is added to the mean parameter of the Poisson distribution. We use a nonparametric estimator of the heterogeneity density based on a squared Kth-order polynomial expansion, which we generalize to panel data. A numerical illustration using an insurance dataset is discussed. Even though some statistical analyses showed no clear differences between these new models and the standard Poisson with gamma random effects, we show that the choice of the random effects distribution has a significant influence on the interpretation of our results.

6.
Dependent multivariate count data occur in several research studies. These data can be modelled by a multivariate Poisson or negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the number of observations in some cells is much larger than in other cells, the copula-based multivariate Poisson (or negative binomial) distribution may not fit well and is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher than those of the other cells and develop a doubly inflated multivariate Poisson distribution function using a multivariate Gaussian copula. We also discuss procedures for regression on covariates for doubly inflated multivariate count data. To illustrate the proposed methodologies, we present real data containing bivariate count observations with inflation in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation of the unknown model parameters.

7.
This paper presents a new family of distributions for count data, the so-called zero-modified power series (ZMPS), which is an extension of the power series (PS) distribution family, whose support starts at zero. This extension consists of modifying the probability of observing zero in each PS distribution, enabling the new zero-modified distribution to appropriately accommodate data with any amount of zero observations (for instance, zero-inflated or zero-deflated data). The hurdle version of the ZMPS distribution is presented. PS distributions included in the proposed ZMPS family are the Poisson, generalized Poisson, geometric, binomial, negative binomial and generalized negative binomial distributions. The paper also describes the properties and particularities of the new distribution family for count data. The distribution parameters are estimated by the maximum likelihood method, and the use of the new family is illustrated with three real data sets. We emphasize that the new distribution family can accommodate sets of count data without any prior knowledge of whether zero-inflation or zero-deflation is present in the data.
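As a rough illustration of zero modification, the sketch below uses the Poisson member of the family and assumes the parameterization P(Y = 0) = (1 - p) + p·π(0) and P(Y = y) = p·π(y) for y > 0, where π is the Poisson probability mass function. The abstract does not give the parameterization, so this form, the parameter name `p`, and the function names are assumptions.

```python
import math

def poisson_pmf(y, mu):
    # Baseline power series member: Poisson with mean mu.
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def zero_modified_poisson_pmf(y, mu, p):
    """Zero-modified Poisson under the assumed parameterization.

    p < 1 inflates the zero probability, p = 1 gives the ordinary Poisson,
    and 1 < p <= 1 / (1 - pi0) deflates it (p at the upper bound removes
    zeros entirely, i.e. the zero-truncated case).
    """
    pi0 = math.exp(-mu)
    if not 0.0 <= p <= 1.0 / (1.0 - pi0):
        raise ValueError("p must lie in [0, 1 / (1 - P(0))]")
    base = p * poisson_pmf(y, mu)
    return (1.0 - p) + base if y == 0 else base
```

A single parameter therefore covers inflation and deflation, which is the property the abstract highlights: the analyst does not need to decide in advance which one is present.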

8.
Lifetime Data Analysis - In this article we extend the factor copula model to deal with right-censored event time data grouped in clusters. The new methodology allows for clusters to have variable...

9.
Summary. A useful discrete distribution (the Conway–Maxwell–Poisson distribution) is revived and its statistical and probabilistic properties are introduced and explored. This distribution is a two-parameter extension of the Poisson distribution that generalizes some well-known discrete distributions (Poisson, Bernoulli and geometric). It also leads to the generalization of distributions derived from these discrete distributions (i.e. the binomial and negative binomial distributions). We describe three methods for estimating the parameters of the Conway–Maxwell–Poisson distribution. The first is a fast, simple weighted least squares method, which leads to estimates that are sufficiently accurate for practical purposes. The second method, using maximum likelihood, can be used to refine the initial estimates. This method requires iterations and is more computationally intensive. The third estimation method is Bayesian. Using the conjugate prior, the posterior density of the parameters of the Conway–Maxwell–Poisson distribution is easily computed. The Conway–Maxwell–Poisson distribution is flexible and can account for the overdispersion or underdispersion that is commonly encountered in count data. We also explore two sets of real-world data demonstrating the flexibility and elegance of the Conway–Maxwell–Poisson distribution in fitting count data which do not seem to follow the Poisson distribution.
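To make the special cases concrete, here is a minimal sketch of the Conway–Maxwell–Poisson probability mass function, P(X = x) = λ^x / (x!)^ν / Z(λ, ν), with the normalizing constant Z approximated by a truncated series. The truncation length `max_terms` and the function names are illustrative assumptions; the sketch does not implement any of the three estimation methods described in the abstract.

```python
import math

def cmp_pmf_table(lam, nu, max_terms=500):
    # Unnormalized log-weights: x * log(lam) - nu * log(x!).
    logw = [x * math.log(lam) - nu * math.lgamma(x + 1) for x in range(max_terms)]
    # Log-sum-exp for a stable approximation of log Z(lam, nu).
    m = max(logw)
    log_z = m + math.log(sum(math.exp(w - m) for w in logw))
    return [math.exp(w - log_z) for w in logw]

def cmp_mean_and_variance(lam, nu, max_terms=500):
    probs = cmp_pmf_table(lam, nu, max_terms)
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return mean, second - mean ** 2

if __name__ == "__main__":
    # nu = 1 reproduces the Poisson; nu = 0 with lam < 1 gives the geometric.
    print(cmp_pmf_table(3.0, 1.0)[2], math.exp(-3.0) * 4.5)
    print(cmp_pmf_table(0.5, 0.0)[0])
    print(cmp_mean_and_variance(3.0, 2.0))  # variance below the mean
```

The truncated sum is only adequate when the series converges quickly; for ν = 0 the series requires λ < 1, and for ν close to 0 with large λ more terms would be needed.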

10.
An excess of zeros is not a rare feature in count data. Statisticians advocate the Poisson-type hurdle model (among other techniques) as an interesting approach to handling this data peculiarity. However, the frequency of gross errors and the intrinsic complexity of some of the phenomena considered may render this classical model unreliable and too limiting. In this paper, we develop a robust version of the Poisson hurdle model by extending the robust procedure for GLM of Cantoni and Ronchetti (2001) to the truncated Poisson regression model. The performance of the new robust approach is then investigated via a simulation study, a real data application and a sensitivity analysis. The results show the reliability of the new technique in the neighborhood of the truncated Poisson model. This robust modelling approach is therefore a valuable complement to the classical one, providing a tool for drawing reliable statistical conclusions and making more effective decisions.

11.
Count responses with structural zeros are very common in medical and psychosocial research, especially in alcohol and HIV research, and the zero-inflated Poisson (ZIP) and zero-inflated negative binomial models are widely used for modeling such outcomes. However, as alcohol drinking outcomes such as days of drinking are counts within a given period, their distributions are bounded above by an upper limit (total days in the period) and thus inherently follow a binomial or zero-inflated binomial (ZIB) distribution, rather than a Poisson or ZIP distribution, in the presence of structural zeros. In this paper, we develop a new semiparametric approach for modeling ZIB-like count responses for cross-sectional as well as longitudinal data. We illustrate this approach with both simulated and real study data.

12.
The problem of discriminating between the Poisson and binomial models is discussed in the context of a detailed statistical analysis of the number of appointments of the U.S. Supreme Court justices from 1789 to 2004. Various new and existing tests are examined. The analysis shows that both simple Poisson and simple binomial models are equally appropriate for describing the data. No firm statistical evidence in favour of an exponential Poisson regression model was found. Two attendant results were obtained by simulation: firstly, that the likelihood ratio test is the most powerful of those considered when testing for the Poisson versus binomial and, secondly, that the classical variance test with an upper-tail critical region is biased.
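The classical variance test mentioned above is usually based on the index-of-dispersion statistic D = Σ(x_i - x̄)² / x̄, which is approximately chi-square with n - 1 degrees of freedom under the Poisson hypothesis. The sketch below computes only the statistic; the function name is illustrative, and the critical value would have to come from a chi-square table or a statistics library, neither of which is shown here.

```python
def dispersion_statistic(counts):
    """Index-of-dispersion statistic D = sum((x - mean)^2) / mean.

    Under a Poisson model D is approximately chi-square with
    len(counts) - 1 degrees of freedom. Values of D well below n - 1
    point toward under-dispersion (e.g. binomial), values well above
    toward over-dispersion.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("statistic is undefined when all counts are zero")
    return sum((c - mean) ** 2 for c in counts) / mean
```

Because the binomial alternative has variance below its mean, a test against it would naturally look at the lower tail of D, which gives some intuition for the abstract's finding about the upper-tail critical region.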

13.
The purpose of this paper is to develop a new linear regression model for count data, namely the generalized-Poisson Lindley (GPL) linear model. The GPL linear model is constructed by applying the generalized linear model framework to the GPL distribution. The model parameters are estimated by maximum likelihood. We use the GPL linear model to fit two real data sets and compare it with the Poisson, negative binomial (NB) and Poisson-weighted exponential (P-WE) models for count data. We find that the GPL linear model can fit over-dispersed count data, and that it yields the highest log-likelihood and the smallest AIC and BIC values. Consequently, the linear regression model based on the GPL distribution is a valuable alternative to the Poisson, NB and P-WE models.

14.
We explore the standard life table (actuarial) estimator for grouped right-censored survival data and its extensions in order to consider its relationship with the Kaplan–Meier estimator, and to investigate the critical properties of the extended life table estimators (ELTEs). We discuss certain conditions for the ELTE to be consistent and develop a characterization of the standard life table estimator using the consistency property under any choice of at least two observation times of a finite interval. We also perform a comparative analysis of the ELTEs with the corresponding maximum likelihood estimators for grouped right-censored survival data.
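For readers unfamiliar with the two estimators being compared, the sketch below shows textbook versions of both: the Kaplan–Meier product-limit estimator for individual right-censored times, and the standard actuarial life table estimator, which treats censored subjects as at risk for half of each interval. The function names and data layout are illustrative, and the extended estimators (ELTEs) studied in the paper are not implemented.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (time, survival) at event times.

    events[i] is 1 for an observed event and 0 for right-censoring.
    Subjects censored at time t are counted as at risk at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose recorded time equals t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def life_table_survival(n_start, deaths, censored):
    """Actuarial estimator: survival at the end of each interval.

    deaths[j] and censored[j] are counts within interval j; the
    effective number at risk is n_j - censored[j] / 2.
    """
    surv = 1.0
    n = n_start
    out = []
    for d, c in zip(deaths, censored):
        effective = n - c / 2.0  # assumed positive for every interval
        surv *= 1.0 - d / effective
        out.append(surv)
        n -= d + c
    return out
```

When no censoring occurs within an interval, the two estimators give the same conditional survival probability for that interval, which is one simple way to see the relationship the abstract refers to.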

15.
In this article, a new mixed Poisson distribution is introduced. The new distribution is obtained through a mixing process, with the Poisson distribution as the mixed distribution and the transmuted exponential as the mixing distribution. Distributional properties such as unimodality, moments, over-dispersion and infinite divisibility are studied. Three methods, namely the method of moments, the method of moments and proportion, and the maximum likelihood method, are used for parameter estimation. Further, an actuarial application in the context of the aggregate claim distribution is presented. Finally, to show the applicability and superiority of the proposed model, we discuss count data and count regression modeling and compare the model with some well-established models.

16.
The bivariate negative binomial regression (BNBR) and the bivariate Poisson log-normal regression (BPLR) models have been used to describe count data that are over-dispersed. In this paper, a new bivariate generalized Poisson regression (BGPR) model is defined. An advantage of the new regression model over the BNBR and BPLR models is that the BGPR can be used to model bivariate count data with either over-dispersion or under-dispersion. We carry out a simulation study to compare the three regression models when the true data-generating process exhibits over-dispersion. In the simulation experiment, we observe that the BGPR model performs better than the BNBR and BPLR models.

17.
The maximum likelihood estimation of the parameters of the Poisson binomial distribution, based on a sample with exact and grouped observations, is considered by applying the EM algorithm (Dempster et al., 1977). The results of Louis (1982) are used to obtain the observed information matrix and to accelerate the convergence of the EM algorithm substantially. Maximum likelihood estimation from samples consisting entirely of complete (Sprott, 1958) or grouped observations is treated as a special case of the estimation problem mentioned above. A brief account is given of the implementation of the EM algorithm when the sampling distribution is the Neyman Type A, since the latter is a limiting form of the Poisson binomial. Numerical examples based on real data are included.

18.
In this article, a new three-parameter extension of the two-parameter log-logistic distribution is introduced. Several distributional properties such as the moment-generating function, quantile function, mean residual lifetime, the Rényi and Shannon entropies, and order statistics are considered. The estimation of the model parameters for complete and right-censored cases is investigated by maximum likelihood estimation (MLE). A simulation study is conducted to show that these MLEs are consistent in moderate samples. Two real datasets are considered; one is a right-censored data set, used to show that the proposed model performs better than several existing popular models.

19.
This paper proposes a simple and flexible count data regression model which is able to incorporate overdispersion (the variance is greater than the mean) and which can be considered a competitor to the Poisson model. As is well known, this classical model imposes the restriction that the conditional mean of each count variable must equal the conditional variance. Nevertheless, for the common case of well-dispersed counts the Poisson regression may not be appropriate, while the count regression model proposed here is potentially useful. We consider an application to model counts of medical care utilization by the elderly in the USA using a well-known data set from the National Medical Expenditure Survey (1987), where the dependent variable is the number of stays after hospital admission, and where 10 explanatory variables are analysed.

20.
Starting from the compound Poisson INGARCH models, we introduce in this paper a new family of integer-valued models suitable for describing count data without zeros, which we name zero-truncated CP-INGARCH (ZTCP-INGARCH) processes. For this class of models, a probabilistic study concerning the existence of moments, stationarity and ergodicity is developed. The conditional quasi-maximum likelihood method is introduced to consistently estimate the parameters of a wide zero-truncated compound Poisson subclass of models. The conditional maximum likelihood method is also used to estimate the parameters of ZTCP-INGARCH processes associated with well-specified conditional laws. A simulation study that compares some of those estimators and illustrates their finite distance behaviour as well as a real-data application conclude the paper.

