Similar literature
 Found 20 similar documents (search time: 25 ms)
1.
In many applications, clustered count data contain excess zeros, and the zero-inflated generalized Poisson mixed (ZIGPM) regression model may be suitable. However, the dispersion in ZIGPM is often treated as a fixed unknown parameter, an assumption that may not be appropriate in some situations. In this article, a score test for homogeneity of the dispersion parameter in the ZIGPM regression model is developed and the corresponding test statistic is obtained. The sampling distribution and power of the score test statistic are investigated through Monte Carlo simulation. Finally, results from a biological example illustrate the usefulness of the diagnostic statistic.

2.
We present a test of the fit to a Poisson model based on the empirical probability generating function (epgf). We derive the limiting distribution of the test under the Poisson hypothesis and show that a rescaling of it is approximately independent of the mean parameter in the Poisson distribution. In a simulation study over a range of alternative distributions, we find that this test behaves reasonably compared with other goodness-of-fit tests such as the Poisson index of dispersion and the smooth test applied to the Poisson model. These results illustrate that epgf-based methods for analyzing count data are promising.
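
As a rough illustration of two quantities this abstract names, the sketch below computes the Poisson index of dispersion and the empirical probability generating function for a small sample. It does not reproduce the paper's test statistic or its limiting distribution; the function names and the sample counts are hypothetical, chosen only for this example.

```python
from math import exp
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson-like data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


def dispersion_statistic(counts):
    """(n - 1) * index; roughly chi-square with n - 1 df under a Poisson model."""
    return (len(counts) - 1) * dispersion_index(counts)


def epgf(counts, t):
    """Empirical probability generating function: the sample average of t**x."""
    return sum(t ** x for x in counts) / len(counts)


def poisson_pgf(lam, t):
    """Probability generating function of a Poisson distribution with mean lam."""
    return exp(lam * (t - 1.0))


counts = [2, 3, 1, 4, 2, 3, 2, 3]  # made-up counts, for illustration only
lam_hat = mean(counts)             # Poisson mean estimated by the sample mean

print("index of dispersion:", dispersion_index(counts))
print("dispersion statistic:", dispersion_statistic(counts))
for t in (0.25, 0.5, 0.75):
    # A large gap between the two curves would suggest a poor Poisson fit.
    print(t, epgf(counts, t), poisson_pgf(lam_hat, t))
```

The gap between `epgf` and `poisson_pgf` over a grid of `t` values is the kind of discrepancy an epgf-based test summarises; how that gap is aggregated and rescaled is specific to the paper and is not attempted here.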

3.
In this paper, a zero-inflated power series regression model for longitudinal count data with excess zeros is presented. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. Simulation studies indicate that this method can provide improvements in obtaining standard errors of the estimates. We also calculate the dispersion index for this model. The influence of a small perturbation of the dispersion index of the zero-inflated model on likelihood displacement is also studied. The zero-inflated negative binomial regression model is illustrated on data regarding joint damage in psoriatic arthritis.

4.
Artur J. Lemonte 《Statistics》2013,47(6):1249-1265
The class of generalized linear models with dispersion covariates, which allows us to jointly model the mean and dispersion parameters, is a natural extension of the classical generalized linear models. In this paper, we derive the asymptotic expansions under a sequence of Pitman alternatives (up to order n^{-1/2}) for the nonnull distribution functions of the likelihood ratio, Wald, Rao score and gradient statistics in this class of models. The asymptotic distributions of these statistics are obtained for testing a subset of regression parameters and for testing a subset of dispersion parameters. Based on these nonnull asymptotic expansions, the powers of all four tests, which are equivalent to first order, are compared. Furthermore, we use Monte Carlo simulations to compare the finite-sample performance of these tests in this class of models. We present empirical applications to two real data sets for illustrative purposes.

5.
Traditional techniques for calculating control limits for processes with discrete responses are based on the Poisson distribution. However, for many processes, the assumption of a Poisson distribution is violated. In such cases, use of traditional Poisson control limits may result in an inflated risk of Type I error. The negative binomial distribution is a natural extension of the Poisson distribution and allows for over‐dispersion relative to the Poisson distribution. A simple approach to calculating exact and approximate control limits for count data based on the negative binomial distribution is described. The approach is illustrated by application to water bacteria count data taken from a water purification system. Copyright © 2003 John Wiley & Sons, Ltd.
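
A minimal sketch of the idea, assuming a mean/size parameterisation of the negative binomial and an upper limit defined as the smallest count whose exceedance probability is at most a chosen alpha. The parameter values and function names are hypothetical and are not taken from the paper, whose exact and approximate limits may be defined differently.

```python
from math import exp, lgamma, log


def poisson_pmf(x, mu):
    """Poisson probability P(X = x) with mean mu, computed on the log scale."""
    return exp(x * log(mu) - mu - lgamma(x + 1))


def nb_pmf(x, mu, k):
    """Negative binomial P(X = x) with mean mu and size k (variance mu + mu**2 / k)."""
    p = k / (k + mu)
    return exp(lgamma(x + k) - lgamma(k) - lgamma(x + 1)
               + k * log(p) + x * log(1.0 - p))


def upper_control_limit(pmf, alpha, max_x=10_000):
    """Smallest integer c such that P(X > c) <= alpha."""
    cdf = 0.0
    for c in range(max_x + 1):
        cdf += pmf(c)
        if 1.0 - cdf <= alpha:
            return c
    raise ValueError("no limit found below max_x")


# Illustrative values only: same mean for both models, extra variance in the NB.
MU, K, ALPHA = 4.0, 2.0, 0.00135
poisson_ucl = upper_control_limit(lambda x: poisson_pmf(x, MU), ALPHA)
nb_ucl = upper_control_limit(lambda x: nb_pmf(x, MU, K), ALPHA)
print("Poisson UCL:", poisson_ucl, "negative binomial UCL:", nb_ucl)
```

With the same mean, the negative binomial limit sits further out than the Poisson limit, which is why Poisson limits applied to over-dispersed counts signal too often.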

6.
The Bernoulli and Poisson processes are two popular discrete count processes; however, both rely on strict assumptions. We instead propose a generalized homogeneous count process (which we name the Conway–Maxwell–Poisson or COM-Poisson process) that not only includes the Bernoulli and Poisson processes as special cases, but also serves as a flexible mechanism to describe count processes that approximate data with over- or under-dispersion. We introduce the process and an associated generalized waiting time distribution with several real-data applications to illustrate its flexibility for a variety of data structures. We consider model estimation under different scenarios of data availability, and assess performance through simulated and real datasets. This new generalized process will enable analysts to model count processes with data dispersion in a more accommodating and flexible manner.

7.
A multivariate generalized Poisson regression model based on the multivariate generalized Poisson distribution is defined and studied. The regression model can be used to describe count data with any type of dispersion. The model allows for both positive and negative correlation between any pair of the response variables. The parameters of the regression model are estimated by the maximum likelihood method. Some test statistics are discussed, and two numerical data sets are used to illustrate the applications of the multivariate count data regression model.

8.
In many situations, negative binomial (NB)-mixed regression is more appropriate for analysing correlated and over-dispersed count data. In this paper, a score test for assessing extra zeros against the NB-mixed regression in correlated count data with excess zeros is developed. The sampling distribution and power of the score test statistic are evaluated using a simulation study. The results show that under a wide range of conditions, the score statistic performs satisfactorily. Finally, the use of the score test is illustrated on DMFT index data of 12-year-old children.

9.
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are Poisson and negative binomial. However, a Poisson distribution can only handle equidispersed data and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike in Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.

10.
This paper is concerned with semiparametric discrete kernel estimators when the unknown count distribution can be considered to have a general weighted Poisson form. The estimator is constructed by multiplying the Poisson estimate with a nonparametric discrete kernel-type estimate of the Poisson weight function. Comparisons are then carried out with the ordinary discrete kernel probability mass function estimators. The Poisson weight function is thus a local multiplicative correction factor, and is considered as the uniform measure to detect departures from the equidispersed Poisson distribution. In this way, the effects of dispersion and zero-proportion with respect to the standard Poisson distribution are also minimized. This method of estimation is also applied to the weighted binomial form for the count distribution having a finite support. The proposed estimators, in addition to being simple, easy-to-implement and effective, also outperform the competing nonparametric and parametric estimators in finite-sample situations. Two examples illustrate this new semiparametric estimation.

11.
Contamination of a sampled distribution, for example by a heavy-tailed distribution, can degrade the performance of a statistical estimator. We suggest a general approach to alleviating this problem, using a version of the weighted bootstrap. The idea is to 'tilt' away from the contaminated distribution by a given (but arbitrary) amount, in a direction that minimizes a measure of the new distribution's dispersion. This theoretical proposal has a simple empirical version, which results in each data value being assigned a weight according to an assessment of its influence on dispersion. Importantly, distance can be measured directly in terms of the likely level of contamination, without reference to an empirical measure of scale. This makes the procedure particularly attractive for use in multivariate problems. It has several forms, depending on the definitions taken for dispersion and for distance between distributions. Examples of dispersion measures include variance and generalizations based on high order moments. Practicable measures of the distance between distributions may be based on power divergence, which includes Hellinger and Kullback–Leibler distances. The resulting location estimator has a smooth, redescending influence curve and appears to avoid computational difficulties that are typically associated with redescending estimators. Its breakdown point can be located at any desired value ε∈ (0, ½) simply by 'trimming' to a known distance (depending only on ε and the choice of distance measure) from the empirical distribution. The estimator has an affine equivariant multivariate form. Further, the general method is applicable to a range of statistical problems, including regression.

12.
Poisson regression is the most well-known method for modeling count data. When data display over-dispersion, thereby violating the underlying equi-dispersion assumption of Poisson regression, the common solution is to use negative-binomial regression. We show, however, that count data that appear to be equi- or over-dispersed may actually stem from a mixture of populations with different dispersion levels. To detect and model such a mixture, we introduce a generalization of the Conway-Maxwell-Poisson (COM-Poisson) regression model that allows for group-level dispersion. We illustrate mixed dispersion effects and the proposed methodology via semi-authentic data.

13.
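Beta-Binomial regression model and its applications   Total citations: 1, self-citations: 0, citations by others: 1
In success/failure trials and in surveys of satisfaction or approval ratings, the Beta-Binomial distribution is often used to describe overdispersed count-type proportion data. On this basis, a Beta-Binomial regression model is proposed. Maximum likelihood estimation of the parameters is studied, and an iterative estimation procedure based on the Newton-Raphson algorithm is given. The paper focuses on testing for the presence of the regression parameters and of the correlation parameter: a score test is proposed, and the power of the score test statistic is studied through numerical simulation. A real-data analysis demonstrates the usefulness of the Beta-Binomial regression model.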
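
To make the overdispersion claim concrete, the following sketch evaluates the Beta-Binomial probability mass function and compares its variance with that of a binomial distribution having the same mean. It assumes the standard shape-parameter form with parameters `a` and `b`; the regression structure, the Newton-Raphson iteration and the score test from the paper are not reproduced, and the numerical values are illustrative only.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """P(X = x) for a Beta-Binomial with n trials and shape parameters a, b."""
    log_choose = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_choose + log_beta(x + a, n - x + b) - log_beta(a, b))


N, A, B = 10, 2.0, 3.0  # illustrative values, not taken from the paper
probs = [beta_binomial_pmf(x, N, A, B) for x in range(N + 1)]

bb_mean = sum(x * p for x, p in enumerate(probs))
bb_var = sum(x * x * p for x, p in enumerate(probs)) - bb_mean ** 2

# Binomial with the same success probability p = a / (a + b), for comparison.
p = A / (A + B)
binomial_var = N * p * (1.0 - p)

print("mean:", bb_mean)
print("Beta-Binomial variance:", bb_var, "binomial variance:", binomial_var)
```

The Beta-Binomial variance equals the binomial variance multiplied by (a + b + n) / (a + b + 1), which exceeds 1 whenever n > 1; this is the extra dispersion the regression model is designed to capture.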

14.
In recent years, zero-inflated count data models, such as zero-inflated Poisson (ZIP) models, have been widely used because count data with extra zeros are very common in many practical problems. To model correlated count data that are either clustered or repeated, and to assess the effects of continuous covariates or of time scales in a flexible way, a class of semiparametric mixed-effects models for zero-inflated count data is considered. In this article, we propose fully Bayesian inference for such models based on a data augmentation scheme that reflects both the random effects of covariates and the mixture structure of the zero-inflated distribution. A computationally efficient MCMC method that combines the Gibbs sampler and the M-H algorithm is implemented to estimate the model parameters. Finally, a simulation study and a real example are used to illustrate the proposed methodologies.

15.
Overdispersion due to a large proportion of zero observations is a common occurrence in applications across many fields of research; we consider such scenarios in count panel (longitudinal) data. A well-known and widely implemented technique for handling such data is random effects modeling, which addresses the serial correlation inherent in panel data, as well as overdispersion. To deal with the excess zeros, a zero-inflated Poisson distribution has become canonical; it relaxes the equal mean-variance specification of a traditional Poisson model and allows for the larger variance characteristic of overdispersed data. A natural proposal for count panel data with overdispersion due to excess zeros is then to combine these two methodologies, deriving a likelihood from the resulting conditional probability. In simulation studies, we find that this approach in fact poses problems of identifiability. In this article, we explain in full detail why a model obtained from the marriage of two classical and well-established techniques is unidentifiable, and we provide results of simulation studies demonstrating this effect. A discussion of alternative methodologies to resolve the problem is provided in the conclusion.
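
The sketch below shows how zero inflation alone relaxes the Poisson equal mean-variance property. It assumes the usual two-component ZIP form with a structural-zero probability `pi` and a Poisson mean `lam`; it does not include random effects and says nothing about the identifiability problem the article studies. Names and values are hypothetical.

```python
from math import exp, lgamma, log


def zip_pmf(x, lam, pi):
    """Zero-inflated Poisson: a structural zero with probability pi, else Poisson(lam)."""
    poisson = exp(x * log(lam) - lam - lgamma(x + 1))
    return pi * (x == 0) + (1.0 - pi) * poisson


def moments(pmf, max_x=500):
    """Mean and variance from a pmf, summing over 0..max_x (tail assumed negligible)."""
    mean = sum(x * pmf(x) for x in range(max_x + 1))
    second = sum(x * x * pmf(x) for x in range(max_x + 1))
    return mean, second - mean ** 2


LAM, PI = 3.0, 0.3  # illustrative values only
zip_mean, zip_var = moments(lambda x: zip_pmf(x, LAM, PI))

# Closed forms: mean = (1 - pi) * lam, variance = (1 - pi) * lam * (1 + pi * lam).
print("mean:", zip_mean, "variance:", zip_var)
```

Because the variance carries the extra factor (1 + pi * lam), any positive amount of zero inflation pushes the variance above the mean, which is the overdispersion the abstract attributes to excess zeros.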

16.
COM-Poisson regression is an increasingly popular model for count data. Its main advantage is that it allows the mean and the variance of the counts to be modelled separately, so that the same covariate can affect the average level and the variability of the response variable in different ways. A key limiting factor in the use of the COM-Poisson distribution is the calculation of the normalisation constant: its accurate evaluation can be time-consuming and is not always feasible. We circumvent this problem, in the context of estimating a Bayesian COM-Poisson regression, by resorting to the exchange algorithm, an MCMC method applicable to situations where the sampling model (likelihood) can only be computed up to a normalisation constant. The algorithm requires drawing from the sampling model, which in the case of the COM-Poisson distribution can be done efficiently using rejection sampling. We illustrate the method and the benefits of using a Bayesian COM-Poisson regression model through a simulation and two real-world data sets with different levels of dispersion.
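
For context on why the normalisation constant is a bottleneck, here is a minimal sketch of the brute-force alternative that the exchange algorithm avoids: approximating the constant by a truncated sum. It assumes the standard COM-Poisson form with unnormalised weight lam**x / (x!)**nu; the truncation point `max_x` and the parameter values are arbitrary choices for illustration, and this is not the method used in the paper.

```python
from math import exp, lgamma, log


def cmp_pmf(lam, nu, max_x=200):
    """COM-Poisson probabilities for x = 0..max_x.

    The normalisation constant is approximated by truncating the infinite
    sum at max_x, which is only adequate when the tail beyond it is negligible.
    """
    log_w = [x * log(lam) - nu * lgamma(x + 1) for x in range(max_x + 1)]
    shift = max(log_w)                    # subtract the maximum for numerical stability
    w = [exp(v - shift) for v in log_w]
    z = sum(w)                            # truncated normalisation constant (rescaled)
    return [v / z for v in w]


def mean_and_variance(probs):
    """Mean and variance of a distribution given as a list of probabilities."""
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return mean, second - mean ** 2


# nu = 1 recovers the Poisson; nu > 1 gives under-dispersion, nu < 1 over-dispersion.
for lam, nu in [(3.0, 1.0), (3.0, 2.0), (1.5, 0.5)]:
    m, v = mean_and_variance(cmp_pmf(lam, nu))
    print(f"lam={lam}, nu={nu}: mean={m:.4f}, variance={v:.4f}")
```

The truncated sum has to be recomputed for every parameter value visited by a sampler, and the number of terms needed grows as `nu` approaches zero, which is the cost the abstract describes as time-consuming and not always feasible.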

17.
Poisson log-linear regression is a popular model for count responses. We examine two popular extensions of this model – the generalized estimating equations (GEE) and the generalized linear mixed-effects model (GLMM) – to longitudinal data analysis and complement the existing literature on characterizing the relationship between the two dueling paradigms in this setting. Unlike linear regression, the GEE and the GLMM carry significant conceptual and practical implications when applied to modeling count data. Our findings shed additional light on the differences between the two classes of models when used for count data. Our considerations are demonstrated by both real study and simulated data.

18.
The importance of the dispersion parameter in counts occurring in toxicology, biology, clinical medicine, epidemiology, and other similar studies is well known. A couple of procedures for the construction of confidence intervals (CIs) of the dispersion parameter have been investigated, but little attention has been paid to the accuracy of these CIs. In this paper, we introduce the profile likelihood (PL) approach and the hybrid profile variance (HPV) approach for constructing the CIs of the dispersion parameter for counts based on the negative binomial model. The non-parametric bootstrap (NPB) approach based on the maximum likelihood (ML) estimates of the dispersion parameter is also considered. We then compare our proposed approaches with an asymptotic approach based on the ML and the restricted ML (REML) estimates of the dispersion parameter as well as the parametric bootstrap (PB) approach based on the ML estimates of the dispersion parameter. As assessed by Monte Carlo simulations, the PL approach has the best small-sample performance, followed by the REML, HPV, NPB, and PB approaches. Three examples using biological count data are presented.

19.
The essence of the generalised multivariate Behrens–Fisher problem (BFP) is how to test the null hypothesis of equality of mean vectors for two or more populations when their dispersion matrices differ. Solutions to the BFP usually assume variables are multivariate normal and do not handle high‐dimensional data. In ecology, species' count data are often high‐dimensional, non‐normal and heterogeneous. Also, interest lies in analysing compositional dissimilarities among whole communities in non‐Euclidean (semi‐metric or non‐metric) multivariate space. Hence, dissimilarity‐based tests by permutation (e.g., PERMANOVA, ANOSIM) are used to detect differences among groups of multivariate samples. Such tests are not robust, however, to heterogeneity of dispersions in the space of the chosen dissimilarity measure, most conspicuously for unbalanced designs. Here, we propose a modification to the PERMANOVA test statistic, coupled with either permutation or bootstrap resampling methods, as a solution to the BFP for dissimilarity‐based tests. Empirical simulations demonstrate that the type I error remains close to nominal significance levels under classical scenarios known to cause problems for the un‐modified test. Furthermore, the permutation approach is found to be more powerful than the (more conservative) bootstrap for detecting changes in community structure for real ecological datasets. The utility of the approach is shown through analysis of 809 species of benthic soft‐sediment invertebrates from 101 sites in five areas spanning 1960 km along the Norwegian continental shelf, based on the Jaccard dissimilarity measure.

20.
In this paper we study some problems associated with count data from a bivariate Poisson distribution, in which the marginal means are functions of explanatory variables. The estimates of these regression coefficients are developed under a variety of conditions: unrestricted linear model; parallelism of the regression planes; the coincidence of the regression planes. Tests are also developed for the validity of hypotheses involved in these models. The techniques are illustrated using simulated data.
