Similar literature
 20 similar articles found (search time: 500 ms)
1.
Count responses with structural zeros are very common in medical and psychosocial research, especially in alcohol and HIV research, and the zero-inflated Poisson (ZIP) and zero-inflated negative binomial models are widely used for modeling such outcomes. However, as alcohol drinking outcomes such as days of drinking are counts within a given period, their distributions are bounded above by an upper limit (total days in the period) and thus inherently follow a binomial or zero-inflated binomial (ZIB) distribution, rather than a Poisson or ZIP distribution, in the presence of structural zeros. In this paper, we develop a new semiparametric approach for modeling ZIB-like count responses for cross-sectional as well as longitudinal data. We illustrate this approach with both simulated and real study data.
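
For orientation, the fully parametric ZIB distribution that the abstract contrasts with ZIP can be sketched as below. This is only the textbook mixture, not the semiparametric approach the paper develops; the function names and the symbol pi0 (probability of a structural zero) are assumptions introduced here for illustration.

```python
from math import comb


def zib_pmf(k: int, n: int, p: float, pi0: float) -> float:
    """P(Y = k) for a zero-inflated binomial count bounded above by n.

    pi0 is the (assumed) probability of a structural zero; with
    probability 1 - pi0 the count is drawn from Binomial(n, p).
    """
    if not 0 <= k <= n:
        return 0.0  # the support is bounded, unlike Poisson or ZIP
    binom = comb(n, k) * p**k * (1 - p) ** (n - k)
    # A zero can be structural or an ordinary binomial (sampling) zero.
    return pi0 * (k == 0) + (1 - pi0) * binom


def zib_mean(n: int, p: float, pi0: float) -> float:
    """Mean of the ZIB distribution: only the binomial part contributes."""
    return (1 - pi0) * n * p
```

For example, `zib_pmf(0, 30, 0.2, 0.3)` gives the probability of zero drinking days in a 30-day period when 30% of subjects are structural non-drinkers.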

2.
The classical Shewhart c-chart and p-chart, which are constructed based on the Poisson and binomial distributions, are inappropriate for monitoring zero-inflated counts. They tend to underestimate the dispersion of zero-inflated counts and subsequently lead to a higher false alarm rate in detecting out-of-control signals. Another drawback of these charts is that their 3-sigma control limits, evaluated based on the asymptotic normality assumption of the attribute counts, have a systematic negative bias in their coverage probability. We recommend that zero-inflated models, which account for the excess number of zeros, should first be fitted to the zero-inflated Poisson and binomial counts. The Poisson parameter λ estimated from a zero-inflated Poisson model is then used to construct a one-sided c-chart with its upper control limit constructed based on the Jeffreys prior interval, which provides good coverage probability for λ. Similarly, the binomial parameter p estimated from a zero-inflated binomial model is used to construct a one-sided np-chart with its upper control limit constructed based on the Jeffreys prior interval or Blyth–Still interval of the binomial proportion p. A simple two-of-two control rule is also recommended to further improve the performance of these two proposed charts.
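
The dispersion problem mentioned above can be made concrete with the standard moments of a zero-inflated Poisson variable. The symbol \pi for the zero-inflation probability is an assumption introduced here; the abstract does not name it.

```latex
\begin{align*}
P(Y = 0) &= \pi + (1-\pi)e^{-\lambda}, &
P(Y = y) &= (1-\pi)\frac{e^{-\lambda}\lambda^{y}}{y!}, \quad y \ge 1, \\
\mu = E[Y] &= (1-\pi)\lambda, &
\operatorname{Var}(Y) &= (1-\pi)\lambda(1+\pi\lambda) = \mu(1+\pi\lambda).
\end{align*}
```

Since \operatorname{Var}(Y) > \mu whenever \pi > 0, a plain Poisson chart, which takes the variance to equal the mean, understates the spread of such counts.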

3.
While excess zeros are often thought to cause data over-dispersion (i.e. when the variance exceeds the mean), this implication is not absolute. One should instead consider a flexible class of distributions that can address data dispersion along with excess zeros. This work develops a zero-inflated sum-of-Conway-Maxwell-Poissons (ZISCMP) regression as a flexible analysis tool to model count data that express significant data dispersion and contain excess zeros. This class of models contains several special case zero-inflated regressions, including zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), zero-inflated binomial (ZIB), and the zero-inflated Conway-Maxwell-Poisson (ZICMP). Through simulated and real data examples, we demonstrate class flexibility and usefulness. We further utilize it to analyze shark species data from Australia's Great Barrier Reef to assess the environmental impact of human action on the number of various species of sharks.

4.
In recent years, a variety of regression models, including zero-inflated and hurdle versions, have been proposed to explain the case of a dependent variable with respect to exogenous covariates. Apart from the classical Poisson, negative binomial and generalised Poisson distributions, many proposals have appeared in the statistical literature, perhaps in response to the new possibilities offered by advanced software that now enables researchers to implement numerous special functions in a relatively simple way. However, we believe that a significant research gap remains, since very little attention has been paid to the quasi-binomial distribution, which was first proposed over fifty years ago. We believe this distribution might constitute a valid alternative to existing regression models, in situations in which the variable has bounded support. Therefore, in this paper we present a zero-inflated regression model based on the quasi-binomial distribution, taking into account the moments and maximum likelihood estimators, and perform a score test to compare the zero-inflated quasi-binomial distribution with the zero-inflated binomial distribution, and the zero-inflated model with the homogeneous model (the model in which covariates are not considered). This analysis is illustrated with two data sets that are well known in the statistical literature and which contain a large number of zeros.

5.
We extend the family of Poisson and negative binomial models to derive the joint distribution of clustered count outcomes with extra zeros. Two random effects models are formulated. The first model assumes a shared random effects term between the conditional probability of perfect zeros and the conditional mean of the imperfect state. The second formulation relaxes the shared random effects assumption by relating the conditional probability of perfect zeros and the conditional mean of the imperfect state to two different but correlated random effects variables. Under the conditional independence and the missing data at random assumption, a direct optimization of the marginal likelihood and an EM algorithm are proposed to fit the proposed models. Our proposed models are fitted to dental caries counts of children under the age of six in the city of Detroit.

6.
In this paper, a zero-inflated power series regression model for longitudinal count data with excess zeros is presented. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. Simulation studies indicate that this method can provide improvements in obtaining standard errors of the estimates. We also calculate the dispersion index for this model. The influence of a small perturbation of the dispersion index of the zero-inflated model on likelihood displacement is also studied. The zero-inflated negative binomial regression model is illustrated on data regarding joint damage in psoriatic arthritis.

7.
Zero-inflated data are more frequent when the data represent counts. However, there are practical situations in which continuous data contain an excess of zeros. In these cases, the zero-inflated Poisson, binomial or negative binomial models are not suitable. In order to reduce this gap, we propose the zero-spiked gamma-Weibull (ZSGW) model by mixing a distribution which is degenerate at zero with the gamma-Weibull distribution, which has positive support. The model attempts to estimate simultaneously the effects of explanatory variables on the response variable and on the zero spike. We consider a frequentist analysis and a non-parametric bootstrap for estimating the parameters of the ZSGW regression model. We derive the appropriate matrices for assessing local influence on the model parameters. We illustrate the performance of the proposed regression model by means of a real data set (copaiba oil resin production) from a study carried out at the Department of Forest Science of the Luiz de Queiroz School of Agriculture, University of São Paulo. Based on the ZSGW regression model, we determine the explanatory variables that can influence the excess of zeros of the resin oil production and identify influential observations. We also show empirically that the proposed regression model can be superior to the zero-adjusted inverse Gaussian regression model for fitting zero-inflated positive continuous data.

8.
In this article, the zero-one inflated binomial mixed regression is proposed to model proportional data with large frequencies of both zeros and binomial denominators. Score tests for assessing both extra zeros and extra binomial denominators in proportional data are developed. The empirical levels and empirical powers of the score test statistics are evaluated using a simulation study. Finally, the application of the proposed model is illustrated on the whitefly data.

9.
The modelling and analysis of count-data time series are areas of emerging interest with various applications in practice. We consider the particular case of the binomial AR(1) model, which is well suited for describing binomial counts with a first-order autoregressive serial dependence structure. We derive explicit expressions for the joint (central) moments and cumulants up to order 4. Then, we apply these results for expressing moments and asymptotic distribution of the squared difference estimator as an alternative to the sample autocovariance. We also analyse the asymptotic distribution of the conditional least-squares estimators of the parameters of the binomial AR(1) model. The finite-sample performance of these estimators is investigated in a simulation study, and we apply them to real data about computerized workstations.
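
A minimal simulation sketch of a binomial AR(1) process follows, assuming the usual binomial-thinning formulation in which each of the X_{t-1} "active" units stays active with probability alpha and each of the n - X_{t-1} "inactive" units becomes active with probability beta. The abstract does not spell out this parameterization, so the names n, pi, rho, alpha and beta, and both function names, are assumptions made for illustration.

```python
import random


def simulate_binomial_ar1(n, pi, rho, length, seed=0):
    """Simulate a binomial AR(1) path with Bin(n, pi) marginal and lag-1 autocorrelation rho."""
    rng = random.Random(seed)
    beta = pi * (1 - rho)   # activation probability for inactive units
    alpha = beta + rho      # survival probability for active units
    if not (0 <= alpha <= 1 and 0 <= beta <= 1):
        raise ValueError("rho is outside the admissible range for this pi")

    def thin(count, prob):
        # Binomial thinning: each of `count` units is kept independently.
        return sum(rng.random() < prob for _ in range(count))

    x = thin(n, pi)  # start from the stationary Bin(n, pi) marginal
    path = [x]
    for _ in range(length - 1):
        x = thin(x, alpha) + thin(n - x, beta)
        path.append(x)
    return path


def lag1_autocorrelation(path):
    """Sample lag-1 autocorrelation of a series."""
    m = sum(path) / len(path)
    num = sum((a - m) * (b - m) for a, b in zip(path, path[1:]))
    den = sum((a - m) ** 2 for a in path)
    return num / den
```

Comparing `lag1_autocorrelation(simulate_binomial_ar1(10, 0.4, 0.5, 5000))` with the chosen rho is a quick sanity check of the serial dependence structure.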

10.
In this study, we deal with the problem of overdispersion beyond extra zeros for a collection of counts that can be correlated. Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial distributions have been considered. First, we propose a multivariate count model in which all counts follow the same distribution and are correlated. Then we extend this model so that correlated counts may follow different distributions. To accommodate correlation among counts, we have considered correlated random effects for each individual in the mean structure, thus inducing dependency among observations common to an individual. The method is applied to real data to investigate variation in food resource use in a species of marsupial in a locality of the Brazilian Cerrado biome.

11.
This article considers the problem of detecting changes in persistence with heavy-tailed innovations. We adopt a ratio-type test and derive its null asymptotic distribution, which depends on the stable index. Then a residual-based bootstrap is proposed for when the stable index is unknown. Our procedure requires drawing bootstrap samples of size m < T, T being the size of the original sample. We establish the convergence in probability of the bootstrap distribution function assuming that m → ∞ and m/T → 0. A Monte Carlo study shows that the bootstrap improves the finite-sample size and power compared to the asymptotic test, especially for a small stable index.

12.
We consider a likelihood ratio test of independence for large two-way contingency tables having both structural (non-random) and sampling (random) zeros in many cells. The solution of this problem is not available using standard likelihood ratio tests. One way to bypass this problem is to remove the structural zeros from the table and implement a test on the remaining cells which incorporates the randomness in the sampling zeros; the resulting test is a test of quasi-independence of the two categorical variables. This test is based only on the positive counts in the contingency table and is valid when there is at least one sampling (random) zero. The proposed (likelihood ratio) test is an alternative to the commonly used ad hoc procedures of converting the zero cells to positive ones by adding a small constant. One practical advantage of our procedure is that there is no need to know if a zero cell is a structural zero or a sampling zero. We model the positive counts using a truncated multinomial distribution. In fact, we have two truncated multinomial distributions; one for the null hypothesis of independence and the other for the unrestricted parameter space. We use Monte Carlo methods to obtain the maximum likelihood estimators of the parameters and also the p-value of our proposed test. To obtain the sampling distribution of the likelihood ratio test statistic, we use bootstrap methods. We discuss many examples, and also empirically compare the power function of the likelihood ratio test relative to those of some well-known test statistics.

13.
The article considers Bayesian analysis of hierarchical models for count, binomial and multinomial data using efficient MCMC sampling procedures. To this end, an improved method of auxiliary mixture sampling is proposed. In contrast to previously proposed samplers the method uses a bounded number of latent variables per observation, independent of the intensity of the underlying Poisson process in the case of count data, or of the number of experiments in the case of binomial and multinomial data. The bounded number of latent variables results in a more general error distribution, which is a negative log-Gamma distribution with arbitrary integer shape parameter. The required approximations of these distributions by Gaussian mixtures have been computed. Overall, the improvement leads to a substantial increase in efficiency of auxiliary mixture sampling for highly structured models. The method is illustrated for finite mixtures of generalized linear models and an epidemiological case study.

14.
The binomial exponential 2 (BE2) distribution was proposed by Bakouch et al. as a distribution of a random sum of independent exponential random variables, when the sample size has a zero-truncated binomial distribution. In this article, we introduce a generalization of the BE2 distribution which offers a more flexible model for lifetime data than the BE2 distribution. The hazard rate function of the proposed distribution can be decreasing, increasing, decreasing–increasing–decreasing or unimodal, so it turns out to be quite flexible for analyzing non-negative real-life data. Some statistical properties and parameter estimation of the distribution are investigated. Three different algorithms are proposed for generating random data from the new distribution. Two real data applications regarding the strength data and Proschan's air-conditioner data are used to show that the new distribution is better than the BE2 distribution and some other well-known distributions in modeling lifetime data.

15.
This study considers the exact hypothesis test for the shape parameter of a new two-parameter distribution with a bathtub-shaped or increasing failure rate function under type II progressive censoring with random removals, where the number of units removed at each failure time follows a binomial or a uniform distribution. Several test statistics are proposed and one numerical example is provided to illustrate the proposed hypothesis test for the shape parameter. Finally, a simulation study is performed to compare the power performances of all proposed test statistics. We conclude that the test statistic w_1 is more attractive than the other methods, as it performs better than the other test statistics in most cases based on the criterion of maximum power.

16.
Regression analyses are commonly performed with doubly limited continuous dependent variables; for instance, when modeling the behavior of rates, proportions and income concentration indices. Several models are available in the literature for use with such variables, one of them being the unit gamma regression model. In all such models, parameter estimation is typically performed using the maximum likelihood method and testing inferences on the model's parameters are usually based on the likelihood ratio test. Such a test can, however, deliver quite imprecise inferences when the sample size is small. In this paper, we propose two modified likelihood ratio test statistics for use with unit gamma regressions that deliver much more accurate inferences when the number of data points is small. Numerical (i.e. simulation) evidence is presented for both fixed dispersion and varying dispersion models, and also for tests that involve nonnested models. We also present and discuss two empirical applications.

17.
This paper investigates improved testing inferences under a general multivariate elliptical regression model. The model is very flexible in terms of the specification of the mean vector and the dispersion matrix, and of the choice of the error distribution. The error terms are allowed to follow a multivariate distribution in the class of the elliptical distributions, which has the multivariate normal and Student-t distributions as special cases. We obtain Skovgaard's adjusted likelihood ratio (LR) statistics and Barndorff-Nielsen's adjusted signed LR statistics and we compare the methods through simulations. The simulations suggest that the proposed tests display superior finite sample behaviour as compared to the standard tests. Two applications are presented in order to illustrate the methods.

18.
Count data analysis techniques have been developed in biological and medical research areas. In particular, zero-inflated versions of parametric count distributions have been used to model excessive zeros that are often present in these assays. The most common count distributions for analyzing such data are Poisson and negative binomial. However, a Poisson distribution can only handle equidispersed data and a negative binomial distribution can only cope with overdispersion. In contrast, a Conway–Maxwell–Poisson (CMP) distribution [4] can handle a wide range of dispersion. We show, with an illustrative data set on next-generation sequencing of maize hybrids, that both underdispersion and overdispersion can be present in genomic data. Furthermore, the maize data set consists of clustered observations and, therefore, we develop inference procedures for a zero-inflated CMP regression that incorporates a cluster-specific random effect term. Unlike in Gaussian models, the underlying likelihood is computationally challenging. We use a numerical approximation via Gaussian quadrature to circumvent this issue. A test for checking zero-inflation has also been developed in our setting. Finite sample properties of our estimators and test have been investigated by extensive simulations. Finally, the statistical methodology has been applied to analyze the maize data mentioned before.
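
The dispersion range of the CMP distribution can be checked numerically with a small sketch. It assumes the common parameterization P(Y = y) proportional to lam^y / (y!)^nu and truncates the infinite normalizing series at a fixed number of terms; the function name, the parameter names lam and nu, and the truncation point are illustrative assumptions, not taken from the paper.

```python
from math import exp, lgamma, log


def cmp_moments(lam, nu, terms=500):
    """Approximate mean and variance of CMP(lam, nu) from a truncated series."""
    # Work with log-weights to avoid overflow in lam**y and (y!)**nu.
    log_w = [y * log(lam) - nu * lgamma(y + 1) for y in range(terms)]
    shift = max(log_w)
    w = [exp(v - shift) for v in log_w]
    z = sum(w)  # truncated normalizing constant (up to the common shift)
    mean = sum(y * wy for y, wy in enumerate(w)) / z
    second = sum(y * y * wy for y, wy in enumerate(w)) / z
    return mean, second - mean**2
```

With nu = 1 the distribution reduces to the Poisson (mean equals variance), nu > 1 gives a variance below the mean, and nu < 1 gives a variance above the mean, which is the flexibility the abstract refers to.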

19.
Hee-Young Kim, Statistics, 2015, 49(2): 291-315
The binomial AR(1) model describes a nonlinear process with a first-order autoregressive (AR(1)) structure and a binomial marginal distribution. To develop goodness-of-fit tests for the binomial AR(1) model, we investigate the observed marginal distribution of the binomial AR(1) process, and we tackle its autocorrelation structure. Motivated by the family of power-divergence statistics for handling discrete multivariate data, we derive the asymptotic distribution of certain categorized power-divergence statistics for the case of a binomial AR(1) process. Then we consider Bartlett's formula, which is widely used in time series analysis to provide estimates of the asymptotic covariance between sample autocorrelations, but which is not applicable when the underlying process is nonlinear. Hence, we derive a novel Bartlett-type formula for the asymptotic distribution of the sample autocorrelations of a binomial AR(1) process, which is then applied to develop tests concerning the autocorrelation structure. Simulation studies are carried out to evaluate the size and power of the proposed tests under diverse alternative process models. Several real examples are used to illustrate our methods and findings.

20.
The negative binomial distribution offers an alternative view to the binomial distribution for modeling count data. This alternative view is particularly useful when the probability of success is very small, because, unlike the fixed sampling scheme of the binomial distribution, the inverse sampling approach allows one to collect enough data in order to adequately estimate the proportion of success. However, despite work that has been done on the joint estimation of two binomial proportions from independent samples, there is little, if any, similar work for negative binomial proportions. In this paper, we construct and investigate three confidence regions for two negative binomial proportions based on three statistics: the Wald (W), score (S) and likelihood ratio (LR) statistics. For large-to-moderate sample sizes, this paper finds that all three regions have good coverage properties, with comparable average areas for large sample sizes but with the S method producing the smaller regions for moderate sample sizes. In the small sample case, the LR method has good coverage properties, but often at the expense of comparatively larger areas. Finally, we apply these three regions to some real data for the joint estimation of liver damage rates in patients taking one of two drugs.
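
A Wald-type joint region of the kind described can be sketched as follows, under stated assumptions: each sample is collected by inverse sampling until r successes are seen in n total trials, the estimate is p_hat = r / n, and its large-sample variance is p^2 (1 - p) / r (the inverse Fisher information). This is a textbook construction and may differ in detail from the statistics used in the paper; the function name and arguments are hypothetical.

```python
from math import log


def nb_wald_region_contains(p1, p2, r1, n1, r2, n2, alpha=0.05):
    """Return True if (p1, p2) lies in the approximate 1 - alpha Wald region.

    r_i is the fixed number of successes and n_i the observed total number
    of trials (n_i > r_i) in independent inverse-sampling experiment i.
    """
    stat = 0.0
    for p, r, n in ((p1, r1, n1), (p2, r2, n2)):
        if n <= r:
            raise ValueError("need at least one failure for a positive variance")
        p_hat = r / n
        var_hat = p_hat**2 * (1 - p_hat) / r  # plug-in asymptotic variance
        stat += (p - p_hat) ** 2 / var_hat
    # The chi-square(2) survival function is exp(-x / 2), so its upper-alpha
    # quantile has the closed form -2 * log(alpha).
    return stat <= -2.0 * log(alpha)
```

Because the chi-square quantile with two degrees of freedom has a closed form, the sketch needs no statistical library; the resulting region is an ellipse centred at the two estimates.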
