期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Analysis of mixed correlated bivariate zero-inflated count and (k,l)-inflated beta responses with application to social network datasets

E. Tabrizi M. Ganjali 《统计学通讯:理论与方法》2019,48(7):1651-1681

This paper presents a new model that monitors the basic network formation mechanisms via the attributes through time. It considers the issue of joint modeling of longitudinal inflated (0, 1)-support continuous and inflated count response variables. For joint model of mentioned response variables, a correlated generalized linear mixed model is studied. The fraction response is inflated in two points k and l (k < l) and a k and l inflated beta distribution is introduced to use as its distribution. Also, the count response is inflated in zero and we use some members of zero-inflated power series distributions, hurdle-at-zero, members of zero-inflated double power series distributions and zero-inflated generalized Poisson distribution as our count response distribution. A full likelihood-based approach is used to yield maximum likelihood estimates of the model parameters and the model is applied to a real social network obtained from an observational study where the rate of the ith node’s responsiveness to the jth node and the number of arrows or edges with some specific characteristics from the ith node to the jth node are the correlated inflated (0, 1)-support continuous and inflated count response variables, respectively. The effect of the sender and receiver positions in an office environment on the responses are investigated simultaneously. 相似文献

2.

Testing overdispersion in the zero-inflated Poisson model

Zhao Yang James W. Hardin Cheryl L. Addy 《Journal of statistical planning and inference》2009

The zero-inflated negative binomial (ZINB) model is used to account for commonly occurring overdispersion detected in data that are initially analyzed under the zero-inflated Poisson (ZIP) model. Tests for overdispersion (Wald test, likelihood ratio test [LRT], and score test) based on ZINB model for use in ZIP regression models have been developed. Due to similarity to the ZINB model, we consider the zero-inflated generalized Poisson (ZIGP) model as an alternate model for overdispersed zero-inflated count data. The score test has an advantage over the LRT and the Wald test in that the score test only requires that the parameter of interest be estimated under the null hypothesis. This paper proposes score tests for overdispersion based on the ZIGP model and illustrates that the derived score statistics are exactly the same as the score statistics under the ZINB model. A simulation study indicates the proposed score statistics are preferred to other tests for higher empirical power. In practice, based on the approximate mean–variance relationship in the data, the ZINB or ZIGP model can be considered, and a formal score test based on asymptotic standard normal distribution can be employed for assessing overdispersion in the ZIP model. We provide an example to illustrate the procedures for data analysis. 相似文献

3.

Attribute Charts for Zero-Inflated Processes

C. H. Sim M. H. Lim 《统计学通讯:模拟与计算》2013,42(7):1440-1452

The classical Shewhart c-chart and p-chart which are constructed based on the Poisson and binomial distributions are inappropriate in monitoring zero-inflated counts. They tend to underestimate the dispersion of zero-inflated counts and subsequently lead to higher false alarm rate in detecting out-of-control signals. Another drawback of these charts is that their 3-sigma control limits, evaluated based on the asymptotic normality assumption of the attribute counts, have a systematic negative bias in their coverage probability. We recommend that the zero-inflated models which account for the excess number of zeros should first be fitted to the zero-inflated Poisson and binomial counts. The Poisson parameter λ estimated from a zero-inflated Poisson model is then used to construct a one-sided c-chart with its upper control limit constructed based on the Jeffreys prior interval that provides good coverage probability for λ. Similarly, the binomial parameter p estimated from a zero-inflated binomial model is used to construct a one-sided np-chart with its upper control limit constructed based on the Jeffreys prior interval or Blyth–Still interval of the binomial proportion p. A simple two-of-two control rule is also recommended to improve further on the performance of these two proposed charts. 相似文献

4.

Model fitting and inference under latent equilibrium processes

Bhattacharya S Gelfand AE Holsinger KE 《Statistics and Computing》2007,17(2):193-208

This paper presents a methodology for model fitting and inference in the context of Bayesian models of the type f(Y | X,θ)f(X|θ)f(θ), where Y is the (set of) observed data, θ is a set of model parameters and X is an unobserved (latent) stationary stochastic process induced by the first order transition model f(X ^(t+1)|X ^(t),θ), where X ^(t) denotes the state of the process at time (or generation) t. The crucial feature of the above type of model is that, given θ, the transition model f(X ^(t+1)|X ^(t),θ) is known but the distribution of the stochastic process in equilibrium, that is f(X|θ), is, except in very special cases, intractable, hence unknown. A further point to note is that the data Y has been assumed to be observed when the underlying process is in equilibrium. In other words, the data is not collected dynamically over time. We refer to such specification as a latent equilibrium process (LEP) model. It is motivated by problems in population genetics (though other applications are discussed), where it is of interest to learn about parameters such as mutation and migration rates and population sizes, given a sample of allele frequencies at one or more loci. In such problems it is natural to assume that the distribution of the observed allele frequencies depends on the true (unobserved) population allele frequencies, whereas the distribution of the true allele frequencies is only indirectly specified through a transition model. As a hierarchical specification, it is natural to fit the LEP within a Bayesian framework. Fitting such models is usually done via Markov chain Monte Carlo (MCMC). However, we demonstrate that, in the case of LEP models, implementation of MCMC is far from straightforward. The main contribution of this paper is to provide a methodology to implement MCMC for LEP models. We demonstrate our approach in population genetics problems with both simulated and real data sets. The resultant model fitting is computationally intensive and thus, we also discuss parallel implementation of the procedure in special cases. 相似文献

5.

Marginal zero-inflated regression models for count data

Jacob Martin Daniel B. Hall 《Journal of applied statistics》2017,44(10):1807-1826

Data sets with excess zeroes are frequently analyzed in many disciplines. A common framework used to analyze such data is the zero-inflated (ZI) regression model. It mixes a degenerate distribution with point mass at zero with a non-degenerate distribution. The estimates from ZI models quantify the effects of covariates on the means of latent random variables, which are often not the quantities of primary interest. Recently, marginal zero-inflated Poisson (MZIP; Long et al. [A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat. Med. 33 (2014), pp. 5151–5165]) and negative binomial (MZINB; Preisser et al., 2016) models have been introduced that model the mean response directly. These models yield covariate effects that have simple interpretations that are, for many applications, more appealing than those available from ZI regression. This paper outlines a general framework for marginal zero-inflated models where the latent distribution is a member of the exponential dispersion family, focusing on common distributions for count data. In particular, our discussion includes the marginal zero-inflated binomial (MZIB) model, which has not been discussed previously. The details of maximum likelihood estimation via the EM algorithm are presented and the properties of the estimators as well as Wald and likelihood ratio-based inference are examined via simulation. Two examples presented illustrate the advantages of MZIP, MZINB, and MZIB models for practical data analysis. 相似文献

6.

Bayesian estimation and case influence diagnostics for the zero-inflated negative binomial regression model 总被引：1，自引：0，他引：1

Aldo M. Garay Victor H. Lachos Heleno Bolfarine 《Journal of applied statistics》2015,42(6):1148-1165

In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is utilized to compute not only some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that ZINB regression models seems to fit the data better than the Poisson counterpart. 相似文献

7.

Score Tests for Zero-Inflation in Overdispersed Count Data

Zhao Yang James W. Hardin Cheryl L. Addy 《统计学通讯:理论与方法》2013,42(11):2008-2030

The negative binomial (NB) model and the generalized Poisson (GP) model are common alternatives to Poisson models when overdispersion is present in the data. Having accounted for initial overdispersion, we may require further investigation as to whether there is evidence for zero-inflation in the data. Two score statistics are derived from the GP model for testing zero-inflation. These statistics, unlike Wald-type test statistics, do not require that we fit the more complex zero-inflated overdispersed models to evaluate zero-inflation. A simulation study illustrates that the developed score statistics reasonably follow a χ² distribution and maintain the nominal level. Extensive simulation results also indicate the power behavior is different for including a continuous variable than a binary variable in the zero-inflation (ZI) part of the model. These differences are the basis from which suggestions are provided for real data analysis. Two practical examples are presented in this article. Results from these examples along with practical experience lead us to suggest performing the developed score test before fitting a zero-inflated NB model to the data. 相似文献

8.

A Bayesian analysis for the Wilcoxon signed-rank statistic

Richard A. Chechile 《统计学通讯:理论与方法》2018,47(21):5241-5254

A Bayesian analysis is provided for the Wilcoxon signed-rank statistic (T⁺). The Bayesian analysis is based on a sign-bias parameter φ on the (0, 1) interval. For the case of a uniform prior probability distribution for φ and for small sample sizes (i.e., 6 ? n ? 25), values for the statistic T⁺ are computed that enable probabilistic statements about φ. For larger sample sizes, approximations are provided for the asymptotic likelihood function P(T⁺|φ) as well as for the posterior distribution P(φ|T⁺). Power analyses are examined both for properly specified Gaussian sampling and for misspecified non Gaussian models. The new Bayesian metric has high power efficiency in the range of 0.9–1 relative to a standard t test when there is Gaussian sampling. But if the sampling is from an unknown and misspecified distribution, then the new statistic still has high power; in some cases, the power can be higher than the t test (especially for probability mixtures and heavy-tailed distributions). The new Bayesian analysis is thus a useful and robust method for applications where the usual parametric assumptions are questionable. These properties further enable a way to do a generic Bayesian analysis for many non Gaussian distributions that currently lack a formal Bayesian model. 相似文献

9.

The Likelihood Ratio Test with the Box–Cox Transformation for the Normal Mixture Problem: Power and Sample Size Study

《统计学通讯:模拟与计算》2013,42(3):553-565

Abstract

Through simulation and regression, we study the alternative distribution of the likelihood ratio test in which the null hypothesis postulates that the data are from a normal distribution after a restricted Box–Cox transformation and the alternative hypothesis postulates that they are from a mixture of two normals after a restricted (possibly different) Box–Cox transformation. The number of observations in the sample is called N. The standardized distance between components (after transformation) is D = (μ₂ ? μ₁)/σ, where μ₁ and μ₂ are the component means and σ² is their common variance. One component contains the fraction π of observed, and the other 1 ? π. The simulation results demonstrate a dependence of power on the mixing proportion, with power decreasing as the mixing proportion differs from 0.5. The alternative distribution appears to be a non-central chi-squared with approximately 2.48 + 10N ^?0.75 degrees of freedom and non-centrality parameter 0.174N(D ? 1.4)² × [π(1 ? π)]. At least 900 observations are needed to have power 95% for a 5% test when D = 2. For fixed values of D, power, and significance level, substantially more observations are necessary when π ≥ 0.90 or π ≤ 0.10. We give the estimated powers for the alternatives studied and a table of sample sizes needed for 50%, 80%, 90%, and 95% power. 相似文献

10.

A Bayesian approach to zero-inflated data in extremes

Alexandre Henrique Quadros Gramosa Fidel Ernesto Castro Morales 《统计学通讯:理论与方法》2020,49(17):4150-4161

Abstract

The generalized extreme value (GEV) distribution is known as the limiting result for the modeling of maxima blocks of size n, which is used in the modeling of extreme events. However, it is possible for the data to present an excessive number of zeros when dealing with extreme data, making it difficult to analyze and estimate these events by using the usual GEV distribution. The Zero-Inflated Distribution (ZID) is widely known in literature for modeling data with inflated zeros, where the inflator parameter w is inserted. The present work aims to create a new approach to analyze zero-inflated extreme values, that will be applied in data of monthly maximum precipitation, that can occur during months where there was no precipitation, being these computed as zero. An inference was made on the Bayesian paradigm, and the parameter estimation was made by numerical approximations of the posterior distribution using Markov Chain Monte Carlo (MCMC) methods. Time series of some cities in the northeastern region of Brazil were analyzed, some of them with predominance of non-rainy months. The results of these applications showed the need to use this approach to obtain more accurate and with better adjustment measures results when compared to the standard distribution of extreme value analysis. 相似文献

11.

On double hysteretic heteroskedastic model

Cathy W.S. Chen Buu-Chau Truong 《Journal of Statistical Computation and Simulation》2016,86(13):2684-2705

ABSTRACT

This paper proposes a hysteretic autoregressive model with GARCH specification and a skew Student's t-error distribution for financial time series. With an integrated hysteresis zone, this model allows both the conditional mean and conditional volatility switching in a regime to be delayed when the hysteresis variable lies in a hysteresis zone. We perform Bayesian estimation via an adaptive Markov Chain Monte Carlo sampling scheme. The proposed Bayesian method allows simultaneous inferences for all unknown parameters, including threshold values and a delay parameter. To implement model selection, we propose a numerical approximation of the marginal likelihoods to posterior odds. The proposed methodology is illustrated using simulation studies and two major Asia stock basis series. We conduct a model comparison for variant hysteresis and threshold GARCH models based on the posterior odds ratios, finding strong evidence of the hysteretic effect and some asymmetric heavy-tailness. Versus multi-regime threshold GARCH models, this new collection of models is more suitable to describe real data sets. Finally, we employ Bayesian forecasting methods in a Value-at-Risk study of the return series. 相似文献

12.

Three estimators for the poisson regression model with measurement errors

Alexander Kukush Hans Schneeweis Roland Wolf 《Statistical Papers》2004,45(3):351-368

We consider two consistent estimators for the parameters of the linear predictor in the Poisson regression model, where the covariate is measured with errors. The measurement errors are assumed to be normally distributed with known error variance σ _u ² . The SQS estimator, based on a conditional mean-variance model, takes the distribution of the latent covariate into account, and this is here assumed to be a normal distribution. The CS estimator, based on a corrected score function, does not use the distribution of the latent covariate. Nevertheless, for small σ _u ² , both estimators have identical asymptotic covariance matrices up to the order of σ _u ² . We also compare the consistent estimators to the naive estimator, which is based on replacing the latent covariate with its (erroneously) measured counterpart. The naive estimator is biased, but has a smaller covariance matrix than the consistent estimators (at least up to the order of σ _u ² ). 相似文献

13.

Methods for calculating stationary distribution in linear models of time series

J. Anděl I. Netuka 《Statistics》2013,47(4):279-287

The article deals with methods for computing the stationary marginal distribution in linear models of time series. Two approaches are described. First, an algorithm based on approximation of solution of the corresponding integral equation is briefly reviewed. Then, we study the limit behaviour of the partial sums c ₁ η₁+c ₂ η₂+···+c _n η_n where η_i are i.i.d. random variables and c _i real constants. We generalize procedure of Haiman (1998) [Haiman, G., 1998, Upper and lower bounds for the tail of the invariant distribution of some AR(1) processes. Asymptotic Methods in Probability and Statistics, 45, 723–730.] to an arbitrary causal linear process and relax the assumptions of his result significantly. This is achieved by investigating the properties of convolution of densities. 相似文献

14.

Moments of the Product-Limit Estimator Under Left-Truncation and Right-Censoring

《统计学通讯:理论与方法》2013,42(8):1833-1850

ABSTRACT

The product-limit estimator (PLE) is a well-known nonparametric estimator for the distribution function of the lifetime when data are left-truncated and right-censored. Much work has focused on developing its asymptotic properties. Finite sample results have been difficult to obtain. This article is concerned about finite moments of the PLE. The moments of the PLE can be represented as a power series in n ^?1. In addition, through the U-statistic mechanism, we obtain also computable formulas for the first, second, third, and fourth of the PLE up to o(n ^?2). Finally, a numerical example is presented. 相似文献

15.

Testing for normality in linear regression models

《Journal of Statistical Computation and Simulation》2012,82(10):1101-1113

The importance of the normal distribution for fitting continuous data is well known. However, in many practical situations data distribution departs from normality. For example, the sample skewness and the sample kurtosis are far away from 0 and 3, respectively, which are nice properties of normal distributions. So, it is important to have formal tests of normality against any alternative. D'Agostino et al. [A suggestion for using powerful and informative tests of normality, Am. Statist. 44 (1990), pp. 316–321] review four procedures Z ²(g ₁), Z ²(g ₂), D and K ² for testing departure from normality. The first two of these procedures are tests of normality against departure due to skewness and kurtosis, respectively. The other two tests are omnibus tests. An alternative to the normal distribution is a class of skew-normal distributions (see [A. Azzalini, A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178]). In this paper, we obtain a score test (W) and a likelihood ratio test (LR) of goodness of fit of the normal regression model against the skew-normal family of regression models. It turns out that the score test is based on the sample skewness and is of very simple form. The performance of these six procedures, in terms of size and power, are compared using simulations. The level properties of the three statistics LR, W and Z ²(g ₁) are similar and close to the nominal level for moderate to large sample sizes. Also, their power properties are similar for small departure from normality due to skewness (γ₁≤0.4). Of these, the score test statistic has a very simple form and computationally much simpler than the other two statistics. The LR statistic, in general, has highest power, although it is computationally much complex as it requires estimates of the parameters under the normal model as well as those under the skew-normal model. So, the score test may be used to test for normality against small departure from normality due to skewness. Otherwise, the likelihood ratio statistic LR should be used as it detects general departure from normality (due to both skewness and kurtosis) with, in general, largest power. 相似文献

16.

Likelihood estimation for longitudinal zero-inflated power series regression models

E. Bahrami Samani Y. Amirian M. Ganjali 《Journal of applied statistics》2012,39(9):1965-1974

In this paper, a zero-inflated power series regression model for longitudinal count data with excess zeros is presented. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. Simulation studies indicate that this method can provide improvements in obtaining standard errors of the estimates. We also calculate the dispersion index for this model. The influence of a small perturbation of the dispersion index of the zero-inflated model on likelihood displacement is also studied. The zero-inflated negative binomial regression model is illustrated on data regarding joint damage in psoriatic arthritis. 相似文献

17.

Prediction based on linear combinations of order statistics and bivariate concomitants in the case of multivariate elliptical distributions

《Journal of Statistical Computation and Simulation》2012,82(5):1079-1098

In this paper, by considering a (3n+1) -dimensional random vector (X₀, X^T, Y^T, Z^T)^T having a multivariate elliptical distribution, we derive the exact joint distribution of (X₀, a^TX_(n), b^TY_[n], c^TZ_[n])^T, where a, b, c∈?ⁿ, X_(n)=(X₍₁₎, …, X_(n))^T, X₍₁₎<···<X_(n), is the vector of order statistics arising from X, and Y_[n]=(Y_[1], …, Y_[n])^T and Z_[n]=(Z_[1], …, Z_[n])^T denote the vectors of concomitants corresponding to X_(n) ((Y_[r], Z_[r])^T, for r=1, …, n, is the vector of bivariate concomitants corresponding to X_(r)). We then present an alternate approach for the derivation of the exact joint distribution of (X₀, X_(r), Y_[r], Z_[r])^T, for r=1, …, n. We show that these joint distributions can be expressed as mixtures of four-variate unified skew-elliptical distributions and these mixture forms facilitate the prediction of X_(r), say, based on the concomitants Y_[r] and Z_[r]. Finally, we illustrate the usefulness of our results by a real data. 相似文献

18.

Adjusted empirical likelihood inference for additive hazards regression

Shanshan Wang Hengjian Cui 《统计学通讯:理论与方法》2013,42(24):7294-7305

ABSTRACT

This article develops an adjusted empirical likelihood (EL) method for the additive hazards model. The adjusted EL ratio is shown to have a central chi-squared limiting distribution under the null hypothesis. We also evaluate its asymptotic distribution as a non central chi-squared distribution under the local alternatives of order n^{? 1/2}, deriving the expression for the asymptotic power function. Simulation studies and a real example are conducted to evaluate the finite sample performance of the proposed method. Compared with the normal approximation-based method, the proposed method tends to have more larger empirical power and smaller confidence regions with comparable coverage probabilities. 相似文献

19.

A lack-of-fit test for parametric zero-inflated Poisson models

《Journal of Statistical Computation and Simulation》2012,82(9):1081-1098

Count data often contain many zeros. In parametric regression analysis of zero-inflated count data, the effect of a covariate of interest is typically modelled via a linear predictor. This approach imposes a restrictive, and potentially questionable, functional form on the relation between the independent and dependent variables. To address the noted restrictions, a flexible parametric procedure is employed to model the covariate effect as a linear combination of fixed-knot cubic basis splines or B-splines. The semiparametric zero-inflated Poisson regression model is fitted by maximizing the likelihood function through an expectation–maximization algorithm. The smooth estimate of the functional form of the covariate effect can enhance modelling flexibility. Within this modelling framework, a log-likelihood ratio test is used to assess the adequacy of the covariate function. Simulation results show that the proposed test has excellent power in detecting the lack of fit of a linear predictor. A real-life data set is used to illustrate the practicality of the methodology. 相似文献

20.

Characterizing Relationships Between Estimations Under a General Linear Model with Explicit and Implicit Restrictions by Rank of Matrix

《统计学通讯:理论与方法》2012,41(13-14):2588-2601

In the investigation of the restricted linear model ?_r = {y, X β | A β = b, σ² Σ}, the parameter constraints A β = b are often handled by transforming the model into certain implicitly restricted model. Any estimation derived from the explicitly and implicitly restricted models on the vector β and its functions should be equivalent, although the expressions of the estimation under the two models may be different. However, people more likely want to directly compare different expressions of estimations and yield a conclusion on their equivalence by using some algebraic operations on expressions of estimations. In this article, we give some results on equivalence of the well-known OLSEs and BLUEs under the explicitly and implicitly restricted linear models by using some expansion formulas for ranks of matrices. 相似文献