Similar literature
 20 similar articles found (search time: 31 ms)
1.
In survey sampling and in stereology, it is often desirable to estimate the ratio of means θ = E(Y)/E(X) from bivariate count data (X, Y) with unknown joint distribution. We review methods that are available for this problem, with particular reference to stereological applications. We also develop new methods based on explicit statistical models for the data, and associated model diagnostics. The methods are tested on a stereological dataset. For point-count data, binomial regression and bivariate binomial models are generally adequate. Intercept-count data are often overdispersed relative to Poisson regression models, but adequately fitted by negative binomial regression.
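As a rough illustration of the quantity discussed in this abstract, the sketch below computes the classical ratio-of-means estimate sum(Y)/sum(X) together with a delta-method (linearization) standard error. This is a generic textbook estimator, not necessarily one of the model-based methods the authors develop; the function name `ratio_of_means` and the sample counts are hypothetical.

```python
import math

def ratio_of_means(xs, ys):
    """Estimate theta = E(Y)/E(X) by sum(y)/sum(x), with a delta-method standard error.

    Assumes the pairs (x_i, y_i) are independent and identically distributed.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    x_bar = sum(xs) / n
    if x_bar == 0:
        raise ValueError("mean of x is zero; the ratio is undefined")
    theta = sum(ys) / sum(xs)
    # Linearized residuals d_i = y_i - theta * x_i (they sum to zero by construction).
    resid_ss = sum((y - theta * x) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(resid_ss / (n - 1) / n) / x_bar
    return theta, se

# Hypothetical point counts: x = points hitting the reference space, y = points hitting the feature.
theta_hat, se_hat = ratio_of_means([10, 12, 8, 10], [5, 6, 4, 5])
```

The standard error here ignores any overdispersion structure beyond what the residuals capture, which is one reason explicit models such as those in the abstract may be preferable.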

2.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is used not only to compute some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seem to fit the data better than the Poisson counterpart.
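The two-component structure described in this abstract (a point mass at zero mixed with a negative binomial) can be sketched as a probability mass function. The mean/size parameterization of the negative binomial and the names `nb_pmf`, `zinb_pmf`, `pi_zero`, `mu` and `size` are assumptions made for illustration; the abstract does not state which parameterization the authors use.

```python
import math

def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean mu > 0 and dispersion (size) parameter size > 0."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(k, pi_zero, mu, size):
    """Zero-inflated NB: with probability pi_zero a structural zero, otherwise an NB draw."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    # A zero can come from either component, so the point mass is added only at k == 0.
    return pi_zero + base if k == 0 else base

# Probability of observing a zero with 30% structural zeros, mean 2 and size 1.
p_zero = zinb_pmf(0, 0.3, 2.0, 1.0)
```

In a regression setting, `mu` (and possibly `pi_zero`) would be linked to covariates; that step is omitted here.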

3.
Data in the form of proportions are often analyzed under a binomial model. However, because genuine random sampling is often infeasible, the subjects in the sample may be collected in clumps and the variances of the observed proportions may be considerably larger than those corresponding to the binomial model. A set of data from a study of the proportion of subjects testing positive to the disease toxoplasmosis is used in this article to motivate partially correlated binomial models capable of describing data observed in practical situations where clumped sampling is likely to appear. According to these models, the extra-binomial variance of the observed frequencies may range from a linear to a quadratic function of the sample size. An efficient algorithm for the evaluation of the resulting probability mass function is given.

4.
SOME MODELS FOR OVERDISPERSED BINOMIAL DATA   (total citations: 1, self-citations: 0, citations by others: 1)
Various models are currently used to model overdispersed binomial data. It is not always clear which model is appropriate for a given situation. Here we examine the assumptions and discuss the problems and pitfalls of some of these models. We focus on clustered data with one level of nesting, briefly touching on more complex strata and longitudinal data. The estimation procedures are illustrated and some critical comments are made about the various models. We indicate which models are restrictive, and how, and which can be extended to model more complex situations. In addition, some inadequacies in testing procedures are noted. Recommendations as to which models should be used, and when, are made.

5.
The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on the ZINB model for use in ZIP regression models have been developed. Due to its similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternate model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates the proposed score statistics are preferred to other tests for higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on the asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis.

6.
This paper introduces an exchangeable negative binomial distribution resulting from relaxing the independence of the Bernoulli sequence associated with a negative binomial distribution to exchangeability. It is demonstrated that the introduced distribution is a mixture of negative binomial distributions and can be characterized by infinitely many parameters that form a completely monotone sequence. The moments of the distribution are derived and a small simulation is conducted to illustrate the distribution. For data analytic purposes, two methods, truncation and completely-monotone links, are given for converting the saturated distribution of infinitely many parameters to parsimonious distributions of finitely many parameters. A full likelihood procedure is described which can be used to investigate correlated and overdispersed count data common in biomedical sciences and teratology. Finally, the introduced distribution is applied to analyze a real clinical data set on burn wounds in patients.

7.
The two-sided power (TSP) distribution is a flexible two-parameter distribution having the uniform, power function and triangular distributions as sub-distributions, and it is a reasonable alternative to the beta distribution in some cases. In this work, we introduce the TSP-binomial model, which is defined as a mixture of binomial distributions, with the binomial parameter p having a TSP distribution. We study its distributional properties and demonstrate its use on some data. It is shown that the newly defined model is a useful candidate for overdispersed binomial data.

8.
In this paper we propose a new stationary first-order non-negative integer valued autoregressive process with geometric marginals based on a generalised version of the negative binomial thinning operator. In this manner we obtain another process that we refer to as a generalised stationary integer-valued autoregressive process of the first order with geometric marginals. This new process will enable one to tackle the problem of overdispersion inherent in the analysis of integer-valued time series data, and contains the new geometric process as a particular case. In addition, various properties of the new process, such as conditional distribution, autocorrelation structure and innovation structure, are derived. We discuss conditional maximum likelihood estimation of the model parameters. We evaluate the performance of the conditional maximum likelihood estimators by a Monte Carlo study. The proposed process is fitted to time series of the number of weekly sales (economics) and the weekly number of syphilis cases (medicine), illustrating its capabilities in challenging cases of highly overdispersed count data.

9.
The negative binomial distribution offers an alternative view to the binomial distribution for modeling count data. This alternative view is particularly useful when the probability of success is very small, because, unlike the fixed sampling scheme of the binomial distribution, the inverse sampling approach allows one to collect enough data in order to adequately estimate the proportion of success. However, despite work that has been done on the joint estimation of two binomial proportions from independent samples, there is little, if any, similar work for negative binomial proportions. In this paper, we construct and investigate three confidence regions for two negative binomial proportions based on three statistics: the Wald (W), score (S) and likelihood ratio (LR) statistics. For large-to-moderate sample sizes, this paper finds that all three regions have good coverage properties, with comparable average areas for large sample sizes but with the S method producing the smaller regions for moderate sample sizes. In the small sample case, the LR method has good coverage properties, but often at the expense of comparatively larger areas. Finally, we apply these three regions to some real data for the joint estimation of liver damage rates in patients taking one of two drugs.

10.
This paper presents an EM algorithm for maximum likelihood estimation in generalized linear models with overdispersion. The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully non-parametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters may be sensitive to the specification of a parametric form for the mixing distribution. A listing of a GLIM4 algorithm for fitting the overdispersed binomial logit model is given in an appendix. A simple method is given for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed.

11.
Real count data time series often show the phenomena of underdispersion and overdispersion. In this paper, we develop two extensions of the first-order integer-valued autoregressive process with Poisson innovations, based on binomial thinning, for modeling integer-valued time series with equidispersion, underdispersion, and overdispersion. The main properties of the models are derived. The methods of conditional maximum likelihood, Yule–Walker, and conditional least squares are used for estimating the parameters, and their asymptotic properties are established. We also use a test based on our processes for checking whether the count time series considered is overdispersed or underdispersed. The proposed models are fitted to time series of the weekly number of syphilis cases and monthly counts of family violence, illustrating their capabilities in challenging cases of overdispersed and underdispersed count data.
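For orientation, the baseline model that this abstract extends, the first-order integer-valued autoregressive process with Poisson innovations and binomial thinning, can be simulated as below. This is a sketch of the standard equidispersed process only, not of the authors' two extensions; the function names, the seed handling and the parameter names `alpha` (thinning probability) and `lam` (innovation mean) are assumptions for illustration.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw from Poisson(lam) by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def binomial_thinning(rng, alpha, count):
    """Binomial thinning: each of `count` units survives independently with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)

def simulate_inar1(n_steps, alpha, lam, seed=0):
    """Simulate X_t = (alpha thinned X_{t-1}) + e_t with e_t ~ Poisson(lam), 0 <= alpha < 1."""
    rng = random.Random(seed)
    # Start from the stationary marginal, which is Poisson(lam / (1 - alpha)).
    x = poisson_draw(rng, lam / (1.0 - alpha))
    path = []
    for _ in range(n_steps):
        x = binomial_thinning(rng, alpha, x) + poisson_draw(rng, lam)
        path.append(x)
    return path

series = simulate_inar1(500, alpha=0.5, lam=2.0, seed=1)
```

Because the stationary marginal of this baseline process is Poisson, its mean and variance coincide, which is why extensions are needed for underdispersed or overdispersed series.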

12.
A general class of mixed Poisson regression models is introduced. This class is based on a mixing between the Poisson distribution and a distribution belonging to the exponential family. With this, we unify some overdispersed models which have been studied separately, such as negative binomial and Poisson inverse Gaussian models. We consider a regression structure for both the mean and dispersion parameters of the mixed Poisson models, thus extending, and in some cases correcting, some previous models considered in the literature. An expectation–maximization (EM) algorithm is proposed for estimation of the parameters and some diagnostic measures, based on the EM algorithm, are considered. We also obtain an explicit expression for the observed information matrix. An empirical illustration is presented in order to show the performance of our class of mixed Poisson models. This paper contains Supplementary Material.

13.
Control charts are widely used for monitoring quality characteristics of high-yield processes. In such processes, where a large number of zero observations exists in count data, the zero-inflated binomial (ZIB) models are more appropriate than the ordinary binomial models. In ZIB models, random shocks occur with probability θ, and upon the occurrence of random shocks, the number of non-conforming items in a sample of size n follows the binomial distribution with proportion p. In the present article, we study in more detail the exponentially weighted moving average control chart based on the ZIB distribution (ZIB-EWMA) and we also propose a new control chart based on the double exponentially weighted moving average statistic for monitoring ZIB data (ZIB-DEWMA). The two control charts are studied in detecting upward shifts in θ or p individually, as well as in both parameters simultaneously. Through a simulation study, we compare the performance of the proposed chart with the ZIB-Shewhart, ZIB-EWMA and ZIB-CUSUM charts. Finally, an illustrative example is also presented to display the practical application of the ZIB charts.
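The ZIB model described in this abstract (a shock with probability θ, then a binomial count with parameters n and p) has simple moments, and an EWMA statistic can be run over the observed counts. The sketch below shows those pieces only; control limits and the DEWMA statistic are not reproduced because the abstract does not give them. The smoothing constant `lam`, the starting value and all function names are assumptions for illustration.

```python
def zib_mean(n, theta, p):
    """Mean of a zero-inflated binomial: a Binomial(n, p) count occurs with probability theta."""
    return theta * n * p

def zib_variance(n, theta, p):
    """Variance via the law of total variance over the shock indicator."""
    return theta * n * p * (1.0 - p) + theta * (1.0 - theta) * (n * p) ** 2

def ewma(counts, lam, start):
    """EWMA recursion z_t = lam * x_t + (1 - lam) * z_{t-1}, started at z_0 = start."""
    z, out = start, []
    for x in counts:
        z = lam * x + (1.0 - lam) * z
        out.append(z)
    return out

# Hypothetical in-control parameters and a short run of non-conforming counts.
z0 = zib_mean(10, 0.5, 0.2)
stats = ewma([0, 0, 10], lam=0.2, start=z0)
```

Starting the recursion at the in-control mean is a common convention; a chart would then signal when a statistic exceeds an upper limit chosen for a target in-control run length.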

14.
We describe a new discrete probability distribution with several useful properties for the analysis and modelling of survival processes and dispersion. First, the model can be used to describe survival processes with monotonically decreasing, constant, or increasing hazard functions, simply by tuning one parameter. Also, the model can describe counts that are overdispersed (contagious) or underdispersed, since the variance can exceed, equal, or be less than the mean. All of these properties are demonstrated both theoretically and with ecological examples, using ad-hoc parameter estimation techniques. Finally, the equations are tractable compared with, say, the negative binomial, and easily incorporated into larger models.

15.
Summary.  A common application of multilevel models is to apportion the variance in the response according to the different levels of the data. Whereas partitioning variances is straightforward in models with a continuous response variable with a normal error distribution at each level, the extension of this partitioning to models with binary responses or to proportions or counts is less obvious. We describe methodology due to Goldstein and co-workers for apportioning variance that is attributable to higher levels in multilevel binomial logistic models. They referred to this partitioning as the variance partition coefficient. We consider extending the variance partition coefficient concept to data sets when the response is a proportion and where the binomial assumption may not be appropriate owing to overdispersion in the response variable. Using the literacy data from the 1991 Indian census, we estimate simple and complex variance partition coefficients at multiple levels of geography in models with significant overdispersion and thereby establish the relative importance of different geographic levels that influence educational disparities in India.
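One commonly used version of the variance partition coefficient for a two-level logistic model is the latent-variable formulation, in which the level-1 residual is taken to follow a standard logistic distribution with variance π²/3. The sketch below computes that version; it is an assumption that this is among the variants the abstract refers to, and it does not cover the overdispersed extension the authors propose. The function name and the example variance are hypothetical.

```python
import math

def vpc_latent_logistic(sigma_u2):
    """Latent-variable VPC for a two-level random-intercept logistic model.

    sigma_u2 is the level-2 (between-group) variance on the logit scale; the
    level-1 variance is fixed at pi^2 / 3, the variance of a standard logistic.
    """
    if sigma_u2 < 0:
        raise ValueError("variance must be non-negative")
    level1 = math.pi ** 2 / 3.0
    return sigma_u2 / (sigma_u2 + level1)

# Hypothetical between-district variance of 0.8 on the logit scale.
share = vpc_latent_logistic(0.8)
```

With more than two levels, the same idea applies with the sum of all higher-level variances in the denominator and the variance of the level of interest in the numerator.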

16.
Generalized linear mixed models are widely used for describing overdispersed and correlated data. Such data arise frequently in studies involving clustered and hierarchical designs. A more flexible class of models has been developed here through the Dirichlet process mixture. An additional advantage of using such mixture models is that the observations can be grouped together on the basis of the overdispersion present in the data. This paper proposes a partial empirical Bayes method for estimating all the model parameters by adopting a version of the EM algorithm. An augmented model that helps to implement an efficient Gibbs sampling scheme, under the non-conjugate Dirichlet process generalized linear model, generates observations from the conditional predictive distribution of unobserved random effects and provides an estimate of the average number of mixing components in the Dirichlet process mixture. A simulation study has been carried out to demonstrate the consistency of the proposed method. The approach is also applied to a study on outdoor bacteria concentration in the air and to data from 14 retrospective lung-cancer studies.

17.
We extend proportional hazards frailty models for lifetime data to allow a negative binomial, Poisson, geometric or other discrete distribution of the frailty variable. This might represent, for example, the unknown number of flaws in an item under test. Zero frailty corresponds to a limited failure model containing a proportion of units that never fail (long-term survivors). Ways of modifying the model to avoid this are discussed. The models are illustrated on a previously published set of data on failures of printed circuit boards and on new data on breaking strengths of samples of cord.

18.
The negative binomial (NB) model and the generalized Poisson (GP) model are common alternatives to Poisson models when overdispersion is present in the data. Having accounted for initial overdispersion, we may require further investigation as to whether there is evidence for zero-inflation in the data. Two score statistics are derived from the GP model for testing zero-inflation. These statistics, unlike Wald-type test statistics, do not require that we fit the more complex zero-inflated overdispersed models to evaluate zero-inflation. A simulation study illustrates that the developed score statistics reasonably follow a χ² distribution and maintain the nominal level. Extensive simulation results also indicate that the power behavior differs depending on whether a continuous or a binary variable is included in the zero-inflation (ZI) part of the model. These differences are the basis from which suggestions are provided for real data analysis. Two practical examples are presented in this article. Results from these examples, along with practical experience, lead us to suggest performing the developed score test before fitting a zero-inflated NB model to the data.

19.
We present a model for data in the form of matched pairs of counts. Our work is motivated by a problem in fission-track analysis, where the determination of a crystal's age is based on the ratio of counts of spontaneous and induced tracks. It is often reasonable to assume that the counts follow a Poisson distribution, but typically they are overdispersed and there exists a positive correlation between the numbers of spontaneous and induced tracks in the same crystal. We propose a model that allows for both overdispersion and correlation by assuming that the mean densities follow a bivariate Wishart distribution. Our model is quite general, having the usual negative-binomial and Poisson models as special cases. We propose a maximum-likelihood estimation method based on a stochastic implementation of the EM algorithm, and we derive the asymptotic standard errors of the parameter estimates. We illustrate the method with a data set of fission-track counts in matched areas of zircon crystals.

20.
This paper considers a sequence of independent counts, with each count arising from a mixture of binomial distributions; the mixing distribution is fixed but the number of trials varies from count to count. In this common situation, an estimate of the underlying mean binomial proportion is needed. Two estimators are in general use: the arithmetic average and a weighted average of the observed proportions. Variances of the two estimators are compared and used to decide which estimator is preferred in a given context. The relative merits depend on the distribution of the proportions and the numbers of trials used.
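The two estimators named in this abstract can be contrasted on a small example. The sketch assumes that the weighted average uses the numbers of trials as weights, which makes it the pooled proportion sum(y)/sum(n); the abstract does not specify the weights, and the function names and data are hypothetical.

```python
def average_of_proportions(successes, trials):
    """Unweighted arithmetic mean of the observed proportions y_i / n_i."""
    if len(successes) != len(trials) or not trials:
        raise ValueError("need equally long, non-empty sequences")
    return sum(y / n for y, n in zip(successes, trials)) / len(trials)

def pooled_proportion(successes, trials):
    """Average of the proportions weighted by n_i, which equals sum(y_i) / sum(n_i)."""
    if len(successes) != len(trials) or not trials:
        raise ValueError("need equally long, non-empty sequences")
    return sum(successes) / sum(trials)

# Two hypothetical counts with very different numbers of trials.
y, n = [1, 9], [2, 10]
simple = average_of_proportions(y, n)   # treats both counts equally
pooled = pooled_proportion(y, n)        # dominated by the count with more trials
```

When the underlying proportion varies strongly between counts, equal weighting tends to be more attractive; when it varies little, weighting by the number of trials uses the data more efficiently, which is the trade-off the abstract describes.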
