Similar Documents
Found 20 similar documents (search time: 78 ms)
1.
We present a model for data in the form of matched pairs of counts. Our work is motivated by a problem in fission-track analysis, where the determination of a crystal's age is based on the ratio of counts of spontaneous and induced tracks. It is often reasonable to assume that the counts follow a Poisson distribution, but typically they are overdispersed and there exists a positive correlation between the numbers of spontaneous and induced tracks in the same crystal. We propose a model that allows for both overdispersion and correlation by assuming that the mean densities follow a bivariate Wishart distribution. Our model is quite general, having the usual negative-binomial and Poisson models as special cases. We propose a maximum-likelihood estimation method based on a stochastic implementation of the EM algorithm, and we derive the asymptotic standard errors of the parameter estimates. We illustrate the method with a data set of fission-track counts in matched areas of zircon crystals.
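The negative-binomial special case mentioned above arises when a Poisson rate is itself gamma distributed, which produces exactly the kind of overdispersion the model targets. A minimal simulation sketch (this is not the authors' bivariate Wishart model; the parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

def gamma_mixed_poisson(mu, shape, n, rng):
    """Draw overdispersed counts: Poisson rates are gamma distributed
    with mean `mu`, giving a negative-binomial marginal distribution."""
    rates = rng.gamma(shape, mu / shape, size=n)  # E[rate] = mu
    return rng.poisson(rates)

counts = gamma_mixed_poisson(mu=5.0, shape=2.0, n=20_000, rng=rng)
# overdispersion: variance (theory: mu + mu^2/shape = 17.5) exceeds the mean (theory: 5)
print(counts.mean(), counts.var())
```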

2.
The analysis of incomplete contingency tables is a practical and interesting problem. In this paper, we provide characterizations for the various missing mechanisms of a variable in terms of response and non-response odds for two and three dimensional incomplete tables. Log-linear parametrization and some distinctive properties of the missing data models for the above tables are discussed. All possible cases in which data on one, two or all variables may be missing are considered. We study the missingness of each variable in a model, which is more insightful for analyzing cross-classified data than the missingness of the outcome vector. For sensitivity analysis of the incomplete tables, we propose easily verifiable procedures to evaluate the missing at random (MAR), missing completely at random (MCAR) and not missing at random (NMAR) assumptions of the missing data models. These methods depend only on joint and marginal odds computed from fully and partially observed counts in the tables, respectively. Finally, some real-life datasets are analyzed to illustrate our results, which are confirmed based on simulation studies.

3.
In this paper, we investigate Bayesian generalized nonlinear mixed-effects (NLME) regression models for zero-inflated longitudinal count data. The methodology is motivated by and applied to colony forming unit (CFU) counts in extended bactericidal activity tuberculosis (TB) trials. Furthermore, for model comparisons, we present a generalized method for calculating the marginal likelihoods required to determine Bayes factors. A simulation study shows that the proposed zero-inflated negative binomial regression model has good accuracy, precision, and credibility interval coverage. In contrast, conventional normal NLME regression models applied to log-transformed count data, which handle zero counts as left censored values, may yield credibility intervals that undercover the true bactericidal activity of anti-TB drugs. We therefore recommend that zero-inflated NLME regression models should be fitted to CFU counts on the original scale, as an alternative to conventional normal NLME regression models on the logarithmic scale.

4.
We study methods to estimate regression and variance parameters for over-dispersed and correlated count data from highly stratified surveys. Our application involves counts of fish catches from stratified research surveys, and we propose a novel model in fisheries science to address changes in survey protocols. A challenge with this model is the large number of nuisance parameters, which leads to computational issues and biased statistical inferences. We use a computationally efficient profile generalized estimating equation method and compare it to marginal maximum likelihood (ML) and restricted maximum likelihood (REML) methods. We use REML to address the bias and inaccurate confidence intervals caused by the many nuisance parameters. The marginal ML and REML approaches involve intractable integrals, and we use a new R package that is designed for estimating complex nonlinear models that may include random effects. We conclude from simulation analyses that the REML method provides the most reliable statistical inferences among the three methods we investigated.

5.
We consider a likelihood ratio test of independence for large two-way contingency tables having both structural (non-random) and sampling (random) zeros in many cells. The solution of this problem is not available using standard likelihood ratio tests. One way to bypass this problem is to remove the structural zeros from the table and implement a test on the remaining cells which incorporates the randomness in the sampling zeros; the resulting test is a test of quasi-independence of the two categorical variables. This test is based only on the positive counts in the contingency table and is valid when there is at least one sampling (random) zero. The proposed (likelihood ratio) test is an alternative to the commonly used ad hoc procedures of converting the zero cells to positive ones by adding a small constant. One practical advantage of our procedure is that there is no need to know whether a zero cell is a structural zero or a sampling zero. We model the positive counts using a truncated multinomial distribution. In fact, we have two truncated multinomial distributions: one for the null hypothesis of independence and the other for the unrestricted parameter space. We use Monte Carlo methods to obtain the maximum likelihood estimators of the parameters and also the p-value of our proposed test. To obtain the sampling distribution of the likelihood ratio test statistic, we use bootstrap methods. We discuss many examples, and also empirically compare the power function of the likelihood ratio test relative to those of some well-known test statistics.
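The bootstrap p-value computation described above can be illustrated in simplified form for an ordinary (untruncated) multinomial independence test; the paper's quasi-independence test additionally conditions on structural zeros. The table below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def g2_independence(table):
    """Likelihood-ratio (G^2) statistic for independence, using positive cells only."""
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    mask = table > 0
    return 2.0 * np.sum(table[mask] * np.log(table[mask] / expected[mask]))

def bootstrap_pvalue(table, n_boot=2000, rng=rng):
    """Parametric-bootstrap p-value: resample tables from the fitted
    independence model and compare G^2 statistics with the observed one."""
    n = int(table.sum())
    p_ind = (np.outer(table.sum(axis=1), table.sum(axis=0)) / n**2).ravel()
    observed = g2_independence(table)
    draws = rng.multinomial(n, p_ind, size=n_boot)
    stats = np.array([g2_independence(d.reshape(table.shape)) for d in draws])
    return np.mean(stats >= observed)

table = np.array([[30, 10, 0], [5, 25, 10]], dtype=float)  # strongly dependent, one sampling zero
print(bootstrap_pvalue(table))
```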

6.
7.
The generalized estimating equation (GEE) approach to the analysis of longitudinal data has many attractive robustness properties and can provide a 'population average' characterization of interest, for example, to clinicians who have to treat patients on the basis of their observed characteristics. However, these methods have limitations which restrict their usefulness in both the social and the medical sciences. This conclusion is based on the premise that the main motivations for longitudinal analysis are insight into microlevel dynamics and improved control for omitted or unmeasured variables. We claim that to address these issues a properly formulated random-effects model is required. In addition to a theoretical assessment of some of the issues, we illustrate this by reanalysing data on polyp counts. In this example, the covariates include a base-line outcome, and the effectiveness of the treatment seems to vary by base-line. We compare the random-effects approach with the GEE approach and conclude that the GEE approach is inappropriate for assessing the treatment effects for these data.

8.
In this article, we examine the limiting behavior of generalized method of moments (GMM) sample moment conditions and point out an important discontinuity that arises in their asymptotic distribution. We show that the part of the scaled sample moment conditions that gives rise to degeneracy in the asymptotic normal distribution is T-consistent and has a nonstandard limiting distribution. We derive the appropriate asymptotic (weighted chi-squared) distribution when this degeneracy occurs and show how to conduct asymptotically valid statistical inference. We also propose a new rank test that provides guidance on which (standard or nonstandard) asymptotic framework should be used for inference. The finite-sample properties of the proposed asymptotic approximation are demonstrated using simulated data from some popular asset pricing models.

9.
The missing response problem is ubiquitous in survey sampling, medical, social science and epidemiology studies. It is well known that non-ignorable missingness is the most difficult missing data problem, where the missingness of a response depends on its own value. In the statistical literature, unlike the ignorable missing data problem, few papers on non-ignorable missing data are available apart from fully parametric model-based approaches. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing Owen's (1988) empirical likelihood method we obtain the constrained maximum empirical likelihood estimators of the parameters in the missing probability and the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of a real AIDS trial data set shows that the missingness of CD4 counts around two years is non-ignorable and the sample mean based on observed data only is biased.

10.
In this paper, we propose the use of the data cloning (DC) approach to estimate parameter-driven zero-inflated Poisson and negative binomial models for time series of counts. The data cloning algorithm obtains the familiar maximum likelihood estimators and their standard errors via a fully Bayesian estimation. This provides some computational ease as well as inferential tools such as confidence intervals and diagnostic methods which, otherwise, are not readily available for parameter-driven models. To illustrate the performance of the proposed method, we use Monte Carlo simulations and real data on asthma-related emergency department visits in the Canadian province of Ontario.
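The key mechanism of data cloning — replicating the data K times so the posterior collapses onto the MLE, with K times the posterior variance approaching the asymptotic variance of the MLE — can be checked in closed form for a conjugate Poisson–gamma toy model (prior and data invented for illustration, far simpler than the parameter-driven models above):

```python
# Gamma(a, b) shape-rate prior on a Poisson mean; cloning the data K times
# multiplies the sufficient statistics by K in the conjugate update.
a, b = 2.0, 1.0
data = [3, 5, 4, 6, 2]
S, n = sum(data), len(data)
mle = S / n  # sample mean, 4.0

for K in (1, 10, 100, 1000):
    shape = a + K * S
    rate = b + K * n
    post_mean = shape / rate        # -> mle as K grows
    post_var = shape / rate ** 2    # K * post_var -> mle / n (asymptotic variance)
    print(K, post_mean, K * post_var)
```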

11.
Random effect models have often been used in longitudinal data analysis since they allow for association among repeated measurements due to unobserved heterogeneity. Various approaches have been proposed to extend mixed models for repeated count data to include dependence on baseline counts. Dependence between baseline counts and individual-specific random effects results in a complex form of the (conditional) likelihood. An approximate solution can be achieved by ignoring this dependence, but this approach could result in biased parameter estimates and in wrong inferences. We propose a computationally feasible approach to overcome this problem, leaving the random effect distribution unspecified. In this context, we show how the EM algorithm for nonparametric maximum likelihood (NPML) can be extended to deal with dependence of repeated measures on baseline counts.

12.
We analyse longitudinal data on CD4 cell counts from patients who participated in clinical trials that compared two therapeutic treatments: zidovudine and didanosine. The investigators were interested in modelling the CD4 cell count as a function of treatment, age at base-line and disease stage at base-line. Serious concerns can be raised about the normality assumption of CD4 cell counts that is implicit in many methods and therefore an analysis may have to start with a transformation. Instead of assuming that we know the transformation (e.g. logarithmic) that makes the outcome normal and linearly related to the covariates, we estimate the transformation, by using maximum likelihood, within the Box–Cox family. There has been considerable work on the Box–Cox transformation for univariate regression models. Here, we discuss the Box–Cox transformation for longitudinal regression models when the outcome can be missing over time, and we also implement a maximization method for the likelihood, assuming that the missing data are missing at random.
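For a single response, the Box–Cox parameter is typically estimated by maximizing the profile log-likelihood over a grid. A minimal sketch under a normal model with invented data (this ignores the longitudinal and missing-data extensions discussed above):

```python
import math

def boxcox(y, lam):
    """Box-Cox transform of a positive observation."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def profile_loglik(ys, lam):
    """Profile log-likelihood of the Box-Cox parameter under a normal model."""
    n = len(ys)
    z = [boxcox(y, lam) for y in ys]
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(y) for y in ys)

# data built as exp(x) for roughly symmetric x, so the optimum should lie near 0
ys = [math.exp(x) for x in (0.1, -0.4, 0.9, 0.3, -0.2, 1.1, 0.5, -0.7, 0.2, 0.6)]
grid = [i / 10 for i in range(-20, 21)]
best = max(grid, key=lambda lam: profile_loglik(ys, lam))
print(best)
```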

13.
Count data are routinely assumed to have a Poisson distribution, especially when there are no straightforward diagnostic procedures for checking this assumption. We reanalyse two data sets from crossover trials of treatments for angina pectoris, in which the outcomes are counts of anginal attacks. Standard analyses focus on treatment effects, averaged over subjects; we are also interested in the dispersion of these effects (treatment heterogeneity). We set up a log-Poisson model with random coefficients to estimate the distribution of the treatment effects and show that the analysis is very sensitive to the distributional assumption; the population variance of the treatment effects is confounded with the (variance) function that relates the conditional variance of the outcomes, given the subject's rate of attacks, to the conditional mean. Diagnostic model checks based on resampling from the fitted distribution indicate that the default choice of the Poisson distribution for the analysed data sets is poorly supported. We propose to augment the data sets with observations of the counts, possibly made outside the clinical setting, so that the conditional distribution of the counts could be established.

14.
15.
Joint damage in psoriatic arthritis can be measured by clinical and radiological methods, the former being done more frequently during longitudinal follow-up of patients. Motivated by the need to compare findings based on the different methods with different observation patterns, we consider longitudinal data where the outcome variable is a cumulative total of counts that can be unobserved when other, informative, explanatory variables are recorded. We demonstrate how to calculate the likelihood for such data when it is assumed that the increment in the cumulative total follows a discrete distribution with a location parameter that depends on a linear function of explanatory variables. An approach to the incorporation of informative observation is suggested. We present analyses based on an observational database from a psoriatic arthritis clinic. Although the use of the new statistical methodology has relatively little effect in this example, simulation studies indicate that the method can provide substantial improvements in bias and coverage in some situations where there is an important time-varying explanatory variable.

16.
Automated public health surveillance of disease counts for rapid outbreak, epidemic or bioterrorism detection using conventional control chart methods can be hampered by over-dispersion and background (‘in-control’) mean counts that vary over time. An adaptive cumulative sum (CUSUM) plan is developed for signalling unusually high incidence in prospectively monitored time series of over-dispersed daily disease counts with a non-homogeneous mean. Negative binomial transitional regression is used to prospectively model background counts and provide ‘one-step-ahead’ forecasts of the next day's count. A CUSUM plan then accumulates departures of observed counts from an offset (reference value) that is dynamically updated using the modelled forecasts. The CUSUM signals whenever the accumulated departures exceed a threshold. The amount of memory of past observations retained by the CUSUM plan is determined by the offset value; a smaller offset retains more memory and is efficient at detecting smaller shifts. Our approach optimises early outbreak detection by dynamically adjusting the offset value. We demonstrate the practical application of the ‘optimal’ CUSUM plans to daily counts of laboratory-notified influenza and Ross River virus diagnoses, with particular emphasis on the steady-state situation (i.e. changes that occur after the CUSUM statistic has run through several in-control counts).
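The accumulation step of such a CUSUM plan is simple to state: departures of each observed count from the forecast plus an offset k are summed, resetting at zero, and an alarm is raised when the sum exceeds a threshold h. A minimal sketch with invented counts and a flat forecast (the plan described above instead derives forecasts from negative binomial regression and adjusts the offset dynamically):

```python
def cusum_signal(counts, forecasts, k=0.5, h=5.0):
    """One-sided CUSUM against dynamic forecasts: accumulate departures of
    each count from (forecast + k), truncating at zero, and flag days where
    the accumulated sum exceeds the threshold h."""
    s, alarms = 0.0, []
    for y, f in zip(counts, forecasts):
        s = max(0.0, s + (y - f - k))
        alarms.append(s > h)
    return alarms

# in-control days track the forecast; a simulated outbreak starts on day 8
forecasts = [10.0] * 14
counts = [10, 9, 11, 10, 10, 9, 10, 13, 14, 15, 16, 15, 14, 16]
print(cusum_signal(counts, forecasts))
```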

17.
It is known that the maximum likelihood method does not provide explicit estimators for the mean and standard deviation of the normal distribution based on Type II censored samples. In this paper we present a simple method of deriving explicit estimators by approximating the likelihood equations appropriately. We obtain the variances and covariance of these estimators. We also show that these estimators are almost as efficient as the maximum likelihood (ML) estimators and just as efficient as the best linear unbiased (BLU) and the modified maximum likelihood (MML) estimators. Finally, we illustrate this method of estimation by applying it to Gupta's and Darwin's data.

18.
In this study, we deal with the problem of overdispersion beyond extra zeros for a collection of counts that can be correlated. Poisson, negative binomial, zero-inflated Poisson and zero-inflated negative binomial distributions have been considered. First, we propose a multivariate count model in which all counts follow the same distribution and are correlated. Then we extend this model in the sense that correlated counts may follow different distributions. To accommodate correlation among counts, we have considered correlated random effects for each individual in the mean structure, thus inducing dependency among observations common to an individual. The method is applied to real data to investigate variation in food resources use in a species of marsupial in a locality of the Brazilian Cerrado biome.
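The zero-inflated Poisson building block used above mixes a point mass at zero with an ordinary Poisson count. Its pmf is easy to write down (parameter values are illustrative):

```python
import math

def zip_pmf(y, lam, pi):
    """Zero-inflated Poisson pmf: extra mass pi at zero, plus a Poisson(lam)
    count with weight 1 - pi."""
    pois = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi * (y == 0) + (1 - pi) * pois

# inflated zero probability: pi + (1 - pi) * exp(-lam)
print(zip_pmf(0, 2.0, 0.3))
```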

19.
This article extends the concept of using the steady state ranked simulated sampling approach (SRSIS) by Al-Saleh and Samawi (2000) for improving Monte Carlo methods from the single-integration problem to multiple-integration problems. We demonstrate that this approach provides unbiased estimators and substantially improves the performance of some Monte Carlo methods for bivariate integral approximations, which can be extended to multiple-integral approximations. This results in a significant reduction in the costs and time required to attain a given level of accuracy. In order to compare the performance of our method with the Samawi and Al-Saleh (2007) method, we use the same two illustrations for the bivariate case.
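As a baseline for comparison, crude Monte Carlo approximates a bivariate integral over the unit square by averaging the integrand at uniform random points; SRSIS-type schemes aim to reduce the variance of exactly this kind of estimator. A minimal sketch with an integrand whose true value is known:

```python
import random

random.seed(1)

def mc_bivariate(f, n):
    """Crude Monte Carlo estimate of the integral of f over the unit square."""
    return sum(f(random.random(), random.random()) for _ in range(n)) / n

est = mc_bivariate(lambda x, y: x * y, 100_000)
print(est)  # true value is 1/4
```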

20.
Dependent multivariate count data occur in several research studies. These data can be modelled by a multivariate Poisson or negative binomial distribution constructed using copulas. However, when some of the counts are inflated, that is, the numbers of observations in some cells are much larger than in other cells, then the copula-based multivariate Poisson (or negative binomial) distribution may not fit well and is not an appropriate statistical model for the data. There is a need to modify or adjust the multivariate distribution to account for the inflated frequencies. In this article, we consider the situation where the frequencies of two cells are higher than those of the other cells and develop a doubly inflated multivariate Poisson distribution function using a multivariate Gaussian copula. We also discuss procedures for regression on covariates for the doubly inflated multivariate count data. To illustrate the proposed methodologies, we present real data containing bivariate count observations with inflations in two cells. Several models and linear predictors with log link functions are considered, and we discuss maximum likelihood estimation of the unknown parameters of the models.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.) · 京ICP备09084417号