Similar Documents
Found 20 similar documents (search time: 453 ms)
1.
In survey sampling and in stereology, it is often desirable to estimate the ratio of means θ = E(Y)/E(X) from bivariate count data (X, Y) with unknown joint distribution. We review methods that are available for this problem, with particular reference to stereological applications. We also develop new methods based on explicit statistical models for the data, and associated model diagnostics. The methods are tested on a stereological dataset. For point-count data, binomial regression and bivariate binomial models are generally adequate. Intercept-count data are often overdispersed relative to Poisson regression models, but are adequately fitted by negative binomial regression.
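A minimal sketch of the baseline ratio-of-means estimator, with a delta-method (Taylor linearisation) standard error; the counts here are invented for illustration and are not the paper's stereological dataset:

```python
import math

def ratio_of_means(x, y):
    """Estimate theta = E(Y)/E(X) by the ratio of sample means,
    with a delta-method (linearisation) standard error."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    theta = ybar / xbar
    # Linearised residuals d_i = (y_i - theta * x_i) / xbar
    d = [(yi - theta * xi) / xbar for xi, yi in zip(x, y)]
    var = sum(di * di for di in d) / (n * (n - 1))
    return theta, math.sqrt(var)

x = [4, 7, 5, 6, 8, 5]   # e.g. counts per sampling window (invented)
y = [2, 3, 2, 3, 4, 2]
theta, se = ratio_of_means(x, y)
```

The model-based methods reviewed in the paper (binomial and negative binomial regression) can be seen as refinements of this design-based baseline.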

2.
Abstract

Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. Nonlinear sampling models are becoming increasingly popular for small area estimation problems when the normal model is inadequate. This paper studies the use of a beta distribution as an alternative to the normal distribution as a sampling model for survey estimates of proportions which take values in (0, 1). Inference for small area proportions based on the posterior distribution of a beta regression model ensures that point estimates and credible intervals take values in (0, 1). Properties of a hierarchical Bayesian small area model with a beta sampling distribution and logistic link function are presented and compared to those of the linear mixed effects model. Propriety of the posterior distribution using certain noninformative priors is shown, and the behavior of the posterior mean as a function of the sampling variance and the model variance is described. An example using 2010 Small Area Income and Poverty Estimates (SAIPE) data is given, and a numerical example studying small sample properties of the model is presented.

3.
Confidence intervals for parameters of distributions with discrete sample spaces will be less conservative (i.e. have smaller coverage probabilities that are closer to the nominal level) when defined by inverting a test that does not require equal probability in each tail. However, the P-value obtained from such tests can exhibit undesirable properties, which in turn result in undesirable properties in the associated confidence intervals. We illustrate these difficulties using P-values for binomial proportions and the difference between binomial proportions.
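For reference, the equal-tailed construction that such tests relax can be sketched as the Clopper–Pearson interval, obtained by numerically inverting two one-sided binomial tests; this is a generic illustration, not code from the article:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05, tol=1e-8):
    """Equal-tailed exact interval for a binomial proportion, found by
    bisection: each endpoint puts exactly alpha/2 in one tail."""
    def solve(keep_lower, lo=0.0, hi=1.0):
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if keep_lower(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower limit L solves P(X >= x | L) = alpha/2
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p) < alpha / 2)
    # Upper limit U solves P(X <= x | U) = alpha/2
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p) >= alpha / 2)
    return lower, upper

lo, hi = clopper_pearson(5, 10)   # 95% interval for 5 successes in 10 trials
```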

4.
The negative binomial distribution offers an alternative view to the binomial distribution for modeling count data. This alternative view is particularly useful when the probability of success is very small, because, unlike the fixed sampling scheme of the binomial distribution, the inverse sampling approach allows one to collect enough data in order to adequately estimate the proportion of success. However, despite work that has been done on the joint estimation of two binomial proportions from independent samples, there is little, if any, similar work for negative binomial proportions. In this paper, we construct and investigate three confidence regions for two negative binomial proportions based on three statistics: the Wald (W), score (S) and likelihood ratio (LR) statistics. For large-to-moderate sample sizes, this paper finds that all three regions have good coverage properties, with comparable average areas for large sample sizes but with the S method producing smaller regions for moderate sample sizes. In the small sample case, the LR method has good coverage properties, but often at the expense of comparatively larger areas. Finally, we apply these three regions to some real data for the joint estimation of liver damage rates in patients taking one of two drugs.

5.
In recent years, there has been considerable interest in regression models based on zero-inflated distributions. These models are commonly encountered in many disciplines, such as medicine, public health, and environmental sciences, among others. The zero-inflated Poisson (ZIP) model has been typically considered for these types of problems. However, the ZIP model can fail if the non-zero counts are overdispersed in relation to the Poisson distribution, hence the zero-inflated negative binomial (ZINB) model may be more appropriate. In this paper, we present a Bayesian approach for fitting the ZINB regression model. This model considers that an observed zero may come from a point mass distribution at zero or from the negative binomial model. The likelihood function is utilized to compute not only some Bayesian model selection measures, but also to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. The approach can be easily implemented using standard Bayesian software, such as WinBUGS. The performance of the proposed method is evaluated with a simulation study. Further, a real data set is analyzed, where we show that the ZINB regression model seems to fit the data better than its Poisson counterpart.
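A minimal sketch of the zero-inflation idea, written for the simpler ZIP likelihood (the ZINB version replaces the Poisson term with a negative binomial one); the counts below are invented for illustration:

```python
import math

def zip_loglik(counts, lam, pi0):
    """Zero-inflated Poisson log-likelihood: each observed zero comes
    either from a point mass at zero (probability pi0) or from
    Poisson(lam); positive counts must come from the Poisson part."""
    ll = 0.0
    for y in counts:
        if y == 0:
            ll += math.log(pi0 + (1 - pi0) * math.exp(-lam))
        else:
            ll += math.log(1 - pi0) - lam + y * math.log(lam) - math.lgamma(y + 1)
    return ll

counts = [0, 0, 0, 1, 2, 0, 3, 0, 1, 0]   # excess zeros relative to Poisson(1.2)
ll = zip_loglik(counts, lam=1.2, pi0=0.4)
```

With six zeros in ten counts, the inflated model attains a higher log-likelihood than the plain Poisson (pi0 = 0) at the same rate, which is the motivation for zero-inflated fits.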

6.
The exact and asymptotic upper tail probabilities (α = .10, .05, .01, .001) of the three chi-squared goodness-of-fit statistics, Pearson's X², the likelihood ratio G², and the power-divergence statistic D²(λ) with λ = 2/3, are compared by complete enumeration for the binomial and the mixture binomial. For the two-component mixture binomial, three cases have been distinguished: 1. both success probabilities and the mixing weights are unknown; 2. one of the two success probabilities is known; and 3. the mixing weights are known. The binomial was investigated for the number of cells k between 3 and 6 with sample sizes between 5 and 100, for k = 7 with sample sizes between 5 and 45, and for k = 10 with sample sizes ranging from 5 to 20. For the mixture binomial, only k = 5 cells were considered with sample sizes from 5 to 100 and k = 8 cells with sample sizes between 4 and 20. Rating the relative accuracy of the chi-squared approximation in terms of ±10% and ±20% intervals around α led to the following conclusions for the binomial: 1. using G² is not recommendable; 2. at the significance levels α = .10 and α = .05, X² should be preferred over D²; D² is the best choice at α = .01; 3. Cochran's (1954; Biometrics, 10, 417-451) rule for the minimum expectation when using X² seems to generalize to the binomial for G² and D²; as a compromise, it gives a rather strong lower limit for the expected cell frequencies in some circumstances, but a rather liberal one in others. Similar conclusions could not be drawn for the mixture binomial because, in that case, the accuracy of the chi-squared approximation is not only a function of the chosen test statistic and of the significance level, but also depends heavily on the numerical values of the unknown parameters involved and on the hypothesis to be tested. In this respect, the present study may only give rise to warnings against the application of mixture models to small samples.
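The three statistics compared above can be written down directly; the following sketch uses arbitrary illustrative counts, with λ = 2/3 giving the Cressie–Read power-divergence member studied in the paper:

```python
import math

def gof_statistics(observed, expected, lam=2/3):
    """Pearson X^2, likelihood-ratio G^2, and the power-divergence
    statistic D^2(lambda); lam = 2/3 is the Cressie-Read choice."""
    X2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    G2 = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    D2 = (2 / (lam * (lam + 1))) * sum(
        o * ((o / e) ** lam - 1) for o, e in zip(observed, expected))
    return X2, G2, D2

obs = [12, 18, 20, 10]      # observed cell counts (invented)
exp_ = [15, 15, 15, 15]     # expected counts under the null
X2, G2, D2 = gof_statistics(obs, exp_)
```

All three are asymptotically chi-squared with the same degrees of freedom; the paper's comparison concerns how quickly each reaches that limit in small samples.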

7.
Consider the exchangeable Bayesian hierarchical model where observations yi are independently distributed from sampling densities with unknown means, the means µi are a random sample from a distribution g, and the parameters of g are assigned a known distribution h. A simple algorithm is presented for summarizing the posterior distribution based on Gibbs sampling and the Metropolis algorithm. The software program Matlab is used to implement the algorithm and provide a graphical output analysis. A binomial example is used to illustrate the modeling flexibility made possible by this algorithm. Methods of model checking and extensions to hierarchical regression modeling are discussed.
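A minimal random-walk Metropolis sketch (in Python rather than the Matlab of the paper), for a single binomial observation with a flat prior on the logit scale; this illustrates the Metropolis ingredient only, not the hierarchical Gibbs algorithm itself:

```python
import math
import random

def metropolis_binomial(y, n, iters=20000, step=0.5, seed=1):
    """Random-walk Metropolis on theta = logit(p) for y successes in
    n Bernoulli trials with a flat prior on theta; returns draws of p."""
    random.seed(seed)
    def logpost(theta):
        # log-likelihood of binomial on the logit scale (prior is flat)
        return y * theta - n * math.log(1 + math.exp(theta))
    theta, draws = 0.0, []
    for _ in range(iters):
        prop = theta + random.gauss(0, step)
        # accept with probability min(1, posterior ratio); 1 - random() is in (0, 1]
        if math.log(1 - random.random()) < logpost(prop) - logpost(theta):
            theta = prop
        draws.append(1 / (1 + math.exp(-theta)))
    return draws

samples = metropolis_binomial(y=7, n=10)
post_mean = sum(samples[5000:]) / len(samples[5000:])   # discard burn-in
```

Under this flat-logit prior the implied posterior for p is Beta(7, 3), so the chain's mean should settle near 0.7.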

8.
Biological control of pests is an important branch of entomology, providing environmentally friendly forms of crop protection. Bioassays are used to find the optimal conditions for the production of parasites and strategies for application in the field. In some of these assays, proportions are measured and, often, these data have an inflated number of zeros. In this work, six models will be applied to data sets obtained from biological control assays for Diatraea saccharalis, a common pest in sugar cane production. A natural choice for modelling proportion data is the binomial model. The second model will be an overdispersed version of the binomial model, estimated by a quasi-likelihood method. This model was initially built to model overdispersion generated by individual variability in the probability of success. When interest is only in the positive proportion data, a model can be based on the truncated binomial distribution and on its overdispersed version. The last two models include the zero proportions and are based on a finite mixture model with the binomial distribution or its overdispersed version for the positive data. Here, we will present the models, discuss their estimation and compare the results.
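The quasi-likelihood step can be sketched as estimating a dispersion factor from the Pearson statistic; the data below are invented for illustration and are not the Diatraea saccharalis assays:

```python
def pearson_dispersion(successes, sizes, p_hat):
    """Quasi-binomial dispersion estimate: Pearson X^2 divided by its
    degrees of freedom; values well above 1 signal overdispersion."""
    X2 = sum((y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
             for y, n in zip(successes, sizes))
    return X2 / (len(sizes) - 1)   # one parameter (the common p) fitted

phi = pearson_dispersion([2, 8, 5, 9, 1], [10] * 5, p_hat=0.5)
```

In a quasi-likelihood binomial fit, standard errors are inflated by the square root of this factor.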

9.
Point process models are a natural approach for modelling data that arise as point events. In the case of Poisson counts, these may be fitted easily as a weighted Poisson regression. Point processes lack the notion of sample size. This is problematic for model selection, because various classical criteria such as the Bayesian information criterion (BIC) are a function of the sample size, n, and are derived in an asymptotic framework where n tends to infinity. In this paper, we develop an asymptotic result for Poisson point process models in which the observed number of point events, m, plays the role that sample size does in the classical regression context. Following from this result, we derive a version of BIC for point process models, and when fitted via penalised likelihood, conditions for the LASSO penalty that ensure consistency in estimation and the oracle property. We discuss challenges extending these results to the wider class of Gibbs models, of which the Poisson point process model is a special case.

10.
Biased sampling occurs often in observational studies. With one biased sample, the problem of nonparametrically estimating both a target density function and a selection bias function is unidentifiable. This paper studies the nonparametric estimation problem when there are two biased samples that have some overlapping observations (i.e. recaptures) from a finite population. Since an intelligent subject sampled previously may experience a memory effect if sampled again, two general 2-stage models that incorporate both a selection bias and a possible memory effect are proposed. Nonparametric estimators of the target density, selection bias, and memory functions, as well as the population size are developed. Asymptotic properties of these estimators are studied and confidence bands for the selection function and memory function are provided. Our procedures are compared with those ignoring the memory effect or the selection bias in finite sample situations. A nonparametric model selection procedure is also given for choosing a model from the two 2-stage models and a mixture of these two models. Our procedures work well with or without a memory effect, and with or without a selection bias. The paper concludes with an application to a real survey data set.

11.
Two-phase sampling is often used for estimating a population total or mean when the cost per unit of collecting auxiliary variables, x, is much smaller than the cost per unit of measuring a characteristic of interest, y. In the first phase, a large sample s1 is drawn according to a specific sampling design p(s1), and auxiliary data x are observed for the units i ∈ s1. Given the first-phase sample s1, a second-phase sample s2 is selected from s1 according to a specified sampling design p(s2 | s1), and (y, x) is observed for the units i ∈ s2. In some cases, the population totals of some components of x may also be known. Two-phase sampling is used for stratification at the second phase or both phases and for regression estimation. Horvitz–Thompson-type variance estimators are used for variance estimation. However, the Horvitz–Thompson (Horvitz & Thompson, J. Amer. Statist. Assoc., 1952) variance estimator in uni-phase sampling is known to be highly unstable and may take negative values when the units are selected with unequal probabilities. On the other hand, the Sen–Yates–Grundy variance estimator is relatively stable and non-negative for several unequal probability sampling designs with fixed sample sizes. In this paper, we extend the Sen–Yates–Grundy (Sen, J. Ind. Soc. Agric. Statist., 1953; Yates & Grundy, J. Roy. Statist. Soc. Ser. B, 1953) variance estimator to two-phase sampling, assuming a fixed first-phase sample size and a fixed second-phase sample size given the first-phase sample. We apply the new variance estimators to two-phase sampling designs with stratification at the second phase or both phases. We also develop Sen–Yates–Grundy-type variance estimators of the two-phase regression estimators that make use of the first-phase auxiliary data and known population totals of some of the auxiliary variables.
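The uni-phase Sen–Yates–Grundy estimator that the paper extends can be sketched as follows; the design quantities below are for a toy simple-random-sampling example, not the two-phase setting:

```python
def syg_variance(y, pi, pij):
    """Sen-Yates-Grundy variance estimator of the Horvitz-Thompson total
    for a fixed-size design: sampled values y, first-order inclusion
    probabilities pi, and joint probabilities pij[(i, j)] for i < j."""
    v = 0.0
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            weight = (pi[i] * pi[j] - pij[(i, j)]) / pij[(i, j)]
            diff = y[i] / pi[i] - y[j] / pi[j]
            v += weight * diff * diff
    return v

# SRSWOR of n = 2 from N = 4: pi_i = 1/2, pi_ij = n(n-1) / (N(N-1)) = 1/6
v = syg_variance([3.0, 5.0], [0.5, 0.5], {(0, 1): 1 / 6})
```

The terms are non-negative whenever π_i π_j ≥ π_ij, which is the stability property the abstract contrasts with the Horvitz–Thompson form.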

12.
The classical Shewhart c-chart and p-chart which are constructed based on the Poisson and binomial distributions are inappropriate in monitoring zero-inflated counts. They tend to underestimate the dispersion of zero-inflated counts and subsequently lead to higher false alarm rate in detecting out-of-control signals. Another drawback of these charts is that their 3-sigma control limits, evaluated based on the asymptotic normality assumption of the attribute counts, have a systematic negative bias in their coverage probability. We recommend that the zero-inflated models which account for the excess number of zeros should first be fitted to the zero-inflated Poisson and binomial counts. The Poisson parameter λ estimated from a zero-inflated Poisson model is then used to construct a one-sided c-chart with its upper control limit constructed based on the Jeffreys prior interval that provides good coverage probability for λ. Similarly, the binomial parameter p estimated from a zero-inflated binomial model is used to construct a one-sided np-chart with its upper control limit constructed based on the Jeffreys prior interval or Blyth–Still interval of the binomial proportion p. A simple two-of-two control rule is also recommended to improve further on the performance of these two proposed charts.

13.
Left-truncated data often arise in epidemiology and individual follow-up studies due to a biased sampling plan, since subjects with shorter survival times tend to be excluded from the sample. Moreover, the survival times of recruited subjects are often subject to right censoring. In this article, a general class of semiparametric transformation models that includes the proportional hazards model and the proportional odds model as special cases is studied for the analysis of left-truncated and right-censored data. We propose a conditional likelihood approach and develop the conditional maximum likelihood estimators (cMLE) for the regression parameters and cumulative hazard function of these models. The derived score equations for the regression parameters and the infinite-dimensional function suggest an iterative algorithm for the cMLE. The cMLE is shown to be consistent and asymptotically normal. The limiting variances of the estimators can be consistently estimated using the inverse of the negative Hessian matrix. Intensive simulation studies are conducted to investigate the performance of the cMLE. An application to the Channing House data is given to illustrate the methodology.

14.
Layer Sampling
Layer sampling is an algorithm for generating variates from a non-normalized multidimensional distribution p( · ). It empirically constructs a majorizing function for p( · ) from a sequence of layers. The method first selects a layer based on the previous variate. Next, a sample is drawn from the selected layer, using a method such as rejection sampling. Layer sampling is regenerative. At regeneration times, the layers may be adapted to increase mixing of the Markov chain. Layer sampling may also be used to estimate arbitrary integrals, including normalizing constants.
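The within-layer draw mentioned above relies on rejection sampling; that building block alone can be sketched as follows (a generic illustration with a flat majorizing function, not the paper's adaptive layer construction):

```python
import math
import random

def rejection_sample(logp, majorizer_height, lo, hi, n=2000, seed=0):
    """Rejection sampling from a non-normalised density p on [lo, hi]
    under a flat majorising function of the given height: propose
    uniformly, accept with probability p(x) / height."""
    random.seed(seed)
    out = []
    while len(out) < n:
        x = random.uniform(lo, hi)
        if random.random() * majorizer_height <= math.exp(logp(x)):
            out.append(x)
    return out

# Non-normalised density proportional to exp(-x^2 / 2) on [-4, 4]
samples = rejection_sample(lambda x: -x * x / 2, 1.0, -4.0, 4.0)
mean = sum(samples) / len(samples)
```

The accepted draws are exact samples from the truncated target, which is why rejection within a layer leaves the layer-sampling chain with the correct stationary distribution.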

15.
ABSTRACT

Inverse binomial sampling is preferred when a quick report is required. It is also recommended when the population proportion is very small, to ensure that a positive sample is obtained. Group testing has been discussed extensively under the binomial model, but not so much under the negative binomial model. In this study, we investigate the problem of how to determine the group size under inverse binomial group testing. We propose to choose the optimal group size by minimizing the asymptotic variance of the estimator or the cost relative to Fisher information. We show the good performance of our estimator by applying it to Chlamydia data.
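A minimal sketch of plain inverse binomial sampling (without the group-testing layer): trials are drawn until a fixed number k of positives is seen, and p is estimated by the unbiased (k − 1)/(n − 1); the true proportion below is invented for illustration:

```python
import random

def inverse_binomial_estimate(p_true, k=20, seed=7):
    """Inverse (negative binomial) sampling: draw Bernoulli trials until
    k successes are observed; the unbiased estimate of p is
    (k - 1) / (n - 1), where n is the total number of trials used."""
    random.seed(seed)
    successes, n = 0, 0
    while successes < k:
        n += 1
        if random.random() < p_true:
            successes += 1
    return (k - 1) / (n - 1), n

p_hat, n_used = inverse_binomial_estimate(0.05, k=20)
```

Because the number of successes is fixed in advance, the estimate is available as soon as the k-th positive appears, which is the "quick report" property the abstract mentions.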

16.
Previous work has been carried out on the use of double sampling schemes for inference from binomial data which are subject to misclassification. The double sampling scheme utilizes a sample of n units which are classified by both a fallible and a true device and another sample of n2 units which are classified only by the fallible device. A triple sampling scheme incorporates an additional sample of n1 units which are classified only by the true device. In this paper we apply this triple sampling to estimation from binomial data. First, estimation of a binomial proportion is discussed under different misclassification structures. Then, the problem of optimal allocation of sample sizes is discussed.

17.
This article considers the construction of level 1 − α fixed-width 2d confidence intervals for a Bernoulli success probability p, assuming no prior knowledge about p and so p can be anywhere in the interval [0, 1]. It is shown that some fixed-width 2d confidence intervals that combine sequential sampling of Hall [Asymptotic theory of triple sampling for sequential estimation of a mean, Ann. Stat. 9 (1981), pp. 1229–1238] and fixed-sample-size confidence intervals of Agresti and Coull [Approximate is better than 'exact' for interval estimation of binomial proportions, Am. Stat. 52 (1998), pp. 119–126], Wilson [Probable inference, the law of succession, and statistical inference, J. Am. Stat. Assoc. 22 (1927), pp. 209–212] and Brown et al. [Interval estimation for binomial proportion (with discussion), Stat. Sci. 16 (2001), pp. 101–133] have close to 1 − α confidence level. These sequential confidence intervals require a much smaller sample size than a fixed-sample-size confidence interval. For the coin jamming example considered, a fixed-sample-size confidence interval requires a sample size of 9457, while a sequential confidence interval requires a sample size that rarely exceeds 2042.
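The Wilson fixed-sample-size interval used as one of the ingredients above can be sketched as:

```python
import math

def wilson_interval(x, n, z=1.96):
    """Wilson score interval for a binomial proportion: invert the
    score test |p_hat - p| / sqrt(p(1-p)/n) <= z in closed form."""
    phat = x / n
    denom = 1 + z * z / n
    centre = (phat + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(phat * (1 - phat) / n + z * z / (4 * n * n))
    return centre - half, centre + half

lo, hi = wilson_interval(5, 10)   # 95% interval for 5 successes in 10 trials
```

Its width shrinks like 1/sqrt(n), which is what the sequential rule exploits: sampling stops once the interval's half-width falls below d.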

18.
The authors study the asymptotic behaviour of the likelihood ratio statistic for testing homogeneity in the finite mixture models of a general parametric distribution family. They prove that the limiting distribution of this statistic is the squared supremum of a truncated standard Gaussian process. The autocorrelation function of the Gaussian process is explicitly presented. A re-sampling procedure is recommended to obtain the asymptotic p-value. Three kernel functions, normal, binomial and Poisson, are used in a simulation study which illustrates the procedure.

19.
ABSTRACT

Extra-binomial variation in longitudinal/clustered binomial data is frequently observed in biomedical and observational studies. The usual generalized estimating equations method treats the extra-binomial parameter as a constant across all subjects. In this paper, a two-parameter variance function modelling the extraneous variance is proposed to account for heterogeneity among subjects. The new approach allows modelling the extra-binomial variation as a function of the mean and binomial size.

20.
Model summaries based on the ratio of fitted and null likelihoods have been proposed for generalised linear models, reducing to the familiar R2 coefficient of determination in the Gaussian model with identity link. In this note I show how to define the Cox–Snell and Nagelkerke summaries under arbitrary probability sampling designs, giving a design-consistent estimator of the population model summary. It is also shown that for logistic regression models under case–control sampling the usual Cox–Snell and Nagelkerke R2 are not design-consistent, but are systematically larger than would be obtained with a cross-sectional or cohort sample from the same population, even in settings where the weighted and unweighted logistic regression estimators are similar or identical. Implementation of the new estimators is straightforward and code is provided in R.
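The unweighted versions of the two summaries follow directly from the null and fitted log-likelihoods; the sketch below uses hypothetical log-likelihood values (the design-consistent estimators in the note additionally weight by the sampling design):

```python
import math

def cox_snell_nagelkerke(loglik_null, loglik_fitted, n):
    """Cox-Snell R^2 and its Nagelkerke rescaling (dividing by the
    maximum attainable Cox-Snell value) for a model on n observations."""
    r2_cs = 1 - math.exp((2 / n) * (loglik_null - loglik_fitted))
    r2_max = 1 - math.exp((2 / n) * loglik_null)
    return r2_cs, r2_cs / r2_max

# Hypothetical log-likelihoods, for illustration only
r2_cs, r2_nag = cox_snell_nagelkerke(-68.3, -52.1, n=100)
```

The Nagelkerke rescaling guarantees the summary can reach 1, which the raw Cox–Snell version cannot for discrete outcomes.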


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号