Similar Literature

20 similar documents retrieved.
1.
Sample size determination for testing the hypothesis of equality of two proportions against an alternative, with specified type I and type II error probabilities, is considered for two finite populations. When the two finite populations involved are quite different in size, the equal-size assumption may not be appropriate. In this paper, we impose a balanced sampling condition to determine the necessary samples taken without replacement from the finite populations. It is found that our solution requires smaller samples than those based on binomial distributions. Furthermore, our solution is consistent with that for sampling with replacement, or when the population size is large. Finally, three examples are given to show the application of the derived sample size formula.
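As a point of reference for this abstract, the sketch below shows the classical binomial-based (infinite-population) sample size for testing equality of two proportions, followed by a textbook finite-population correction. It is not the paper's balanced-sampling solution; the function names and the example numbers are illustrative assumptions.

```python
# Minimal sketch: binomial-based two-sample size plus a finite-population correction.
from scipy.stats import norm

def two_proportion_n(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size under the binomial (infinite-population) model."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p1 - p2) ** 2

def fpc_adjust(n0, N):
    """Standard finite-population correction for sampling without replacement."""
    return n0 / (1 + n0 / N)

n0 = two_proportion_n(0.10, 0.20)
print(round(n0), round(fpc_adjust(n0, N=500)))  # the correction shrinks the required n
```

The correction illustrates the qualitative point of the abstract: accounting for the finite population requires smaller samples than the plain binomial calculation.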

2.
Confidence intervals for the difference of two binomial proportions are well known; however, confidence intervals for the weighted sum of two binomial proportions are less studied. We develop and compare seven methods for constructing confidence intervals for the weighted sum of two independent binomial proportions. The interval estimates are constructed by inverting the Wald test, the score test and the likelihood ratio test. The weights can be negative, so our results generalize those for the difference between two independent proportions. We provide a numerical study showing that these confidence intervals based on large-sample approximations perform very well, even when a relatively small amount of data is available. The intervals based on inversion of the score test showed the best performance. Finally, we show that, as for the difference of two binomial proportions, adding four pseudo-outcomes to the Wald interval for the weighted sum of two binomial proportions improves its coverage significantly, and we provide a justification for this correction.
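A minimal sketch of the Wald interval for a weighted sum of two independent proportions, with an Agresti-Caffo-style "add four pseudo-outcomes" adjustment (one success and one failure added to each sample). The paper's exact correction and the score and likelihood-ratio inversions are not reproduced here; the function name and data are illustrative.

```python
# Wald interval for w1*p1 + w2*p2, with an optional add-four adjustment.
from scipy.stats import norm

def wald_weighted_sum(x1, n1, x2, n2, w1, w2, level=0.95, add4=False):
    if add4:  # add one success and one failure to each sample
        x1, n1, x2, n2 = x1 + 1, n1 + 2, x2 + 1, n2 + 2
    p1, p2 = x1 / n1, x2 / n2
    est = w1 * p1 + w2 * p2
    se = (w1**2 * p1 * (1 - p1) / n1 + w2**2 * p2 * (1 - p2) / n2) ** 0.5
    z = norm.ppf(0.5 + level / 2)
    return est - z * se, est + z * se

# w1 = 1, w2 = -1 recovers the usual interval for a difference of proportions.
print(wald_weighted_sum(12, 40, 20, 50, 1.0, -1.0))
print(wald_weighted_sum(12, 40, 20, 50, 1.0, -1.0, add4=True))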

3.
Generally, confidence regions for the probabilities of a multinomial population are constructed based on the Pearson χ2 statistic. Morales et al. (Bootstrap confidence regions in multinomial sampling. Appl Math Comput. 2004;155:295–315) considered bootstrap and asymptotic confidence regions based on a broader family of test statistics known as power-divergence test statistics. In this study, we extend their work and propose confidence regions based on penalized power-divergence test statistics. We only consider small sample sizes, where asymptotic properties fail and alternative methods are needed. Both bootstrap and asymptotic confidence regions are constructed. We consider the percentile and the bias-corrected and accelerated bootstrap confidence regions. The latter confidence region has not been studied previously for the power-divergence statistics, much less for the penalized ones. Designed simulation studies are carried out to calculate average coverage probabilities. The mean absolute deviation between actual and nominal coverage probabilities is used to compare the proposed confidence regions.
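For readers unfamiliar with the power-divergence family, the sketch below computes the Cressie-Read statistic that these confidence regions invert; λ = 1 gives the Pearson χ2 and λ → 0 the likelihood-ratio G2. The penalization and the bootstrap calibration discussed in the abstract are not shown, and the example data are made up.

```python
# Cressie-Read power-divergence statistic for a multinomial null hypothesis.
import numpy as np
from scipy.stats import chi2

def power_divergence_stat(counts, probs, lam):
    """2/(lam*(lam+1)) * sum O_j * [(O_j/E_j)^lam - 1]; assumes all counts positive."""
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum() * np.asarray(probs, dtype=float)
    if lam == 0:  # limiting case: likelihood-ratio statistic G^2
        return 2 * np.sum(counts * np.log(counts / expected))
    return 2 / (lam * (lam + 1)) * np.sum(counts * ((counts / expected) ** lam - 1))

obs = [18, 25, 57]
p0 = [0.2, 0.3, 0.5]
t = power_divergence_stat(obs, p0, lam=2 / 3)   # Cressie-Read's recommended lambda
print(t, chi2.sf(t, df=len(obs) - 1))           # asymptotic p-value
```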

4.
In this article, we present a procedure for approximate negative binomial tolerance intervals. We utilize an approach, well studied for approximating tolerance intervals in the binomial and Poisson settings, which is based on the confidence interval for the parameter of the respective distribution. A simulation study is performed to assess the coverage probabilities and expected widths of the tolerance intervals. The simulation study also compares eight different confidence interval approaches for the negative binomial proportion. We recommend for practical use the approaches that perform best in our simulation results. The method is also illustrated using two real data examples.
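A hedged sketch of the general recipe the abstract alludes to, confidence interval for the parameter first, then distribution quantiles at the interval endpoints, applied to negative binomial data with a known number of successes r. The Wald interval used here is only one of the eight candidates the paper compares, and the parameterization and data are illustrative assumptions.

```python
# Approximate two-sided tolerance interval for negative binomial counts.
from scipy.stats import norm, nbinom

def nb_tolerance_interval(failures, r, content=0.90, conf=0.95):
    """failures: observed failure counts before the r-th success, one per unit."""
    m = len(failures)
    p_hat = m * r / (m * r + sum(failures))          # MLE of the success probability
    se = (p_hat**2 * (1 - p_hat) / (m * r)) ** 0.5   # from Fisher information r/(p^2 (1-p))
    z = norm.ppf(0.5 + conf / 2)
    p_lo = max(p_hat - z * se, 1e-9)
    p_hi = min(p_hat + z * se, 1 - 1e-9)
    # Smaller p makes large counts likelier, so the upper tolerance limit uses p_lo.
    lower = nbinom.ppf((1 - content) / 2, r, p_hi)
    upper = nbinom.ppf((1 + content) / 2, r, p_lo)
    return lower, upper

print(nb_tolerance_interval([3, 7, 5, 2, 9, 4], r=2))
```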

5.
The negative hypergeometric distribution arises as a waiting-time distribution when we sample without replacement from a finite population. It has applications in many areas, such as inspection sampling and estimation of wildlife populations. However, as is well known, the negative hypergeometric distribution is over-dispersed in the sense that its variance is greater than its mean. To make it more flexible and versatile, we propose a modified version of the negative hypergeometric distribution, called the COM-Negative Hypergeometric (COM-NH) distribution, by introducing a shape parameter as in the COM-Poisson and COMP-Binomial distributions. It is shown that, under some limiting conditions, the COM-NH approaches a distribution that we call the COM-Negative Binomial (COMP-NB), which, in turn, approaches the COM-Poisson distribution. For the proposed model, we investigate the dispersion characteristics and the shape of the probability mass function for different combinations of parameters. We also develop statistical inference for this model, including parameter estimation and hypothesis tests. In particular, we investigate properties such as bias, MSE, and coverage probabilities of the maximum likelihood estimators of its parameters by Monte Carlo simulation, and we use a likelihood ratio test to assess the shape parameter of the underlying model. We present illustrative data examples for discussion.

6.
Asymptotic Normality in Mixtures of Power Series Distributions
The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution under mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood, and an estimator for the individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of the individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit, in terms of achieving normality, of using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.

7.
Previous work has been carried out on the use of double sampling schemes for inference from binomial data which are subject to misclassification. The double sampling scheme utilizes a sample of n units which are classified by both a fallible and a true device and another sample of n2 units which are classified only by the fallible device. A triple sampling scheme incorporates an additional sample of n1 units which are classified only by the true device. In this paper we apply this triple sampling to estimation from binomial data. First, estimation of a binomial proportion is discussed under different misclassification structures. Then, the problem of optimal allocation of the sample sizes is discussed.

8.
Linear mixed effects models have been popular in small area estimation problems for modeling survey data when the sample size in one or more areas is too small for reliable inference. However, when the data are restricted to a bounded interval, the linear model may be inappropriate, particularly if the data are near the boundary. Nonlinear sampling models are becoming increasingly popular for small area estimation problems when the normal model is inadequate. This paper studies the use of a beta distribution as an alternative to the normal distribution as a sampling model for survey estimates of proportions which take values in (0, 1). Inference for small area proportions based on the posterior distribution of a beta regression model ensures that point estimates and credible intervals take values in (0, 1). Properties of a hierarchical Bayesian small area model with a beta sampling distribution and logistic link function are presented and compared to those of the linear mixed effect model. Propriety of the posterior distribution using certain noninformative priors is shown, and behavior of the posterior mean as a function of the sampling variance and the model variance is described. An example using 2010 Small Area Income and Poverty Estimates (SAIPE) data is given, and a numerical example studying small sample properties of the model is presented.

9.
Motivated by a study comparing the sensitivities and specificities of two diagnostic tests in a paired design when the sample size is small, we first derive an Edgeworth expansion for the studentized difference between two binomial proportions of paired data. The Edgeworth expansion helps explain why the usual Wald interval for the difference has poor coverage performance for small sample sizes. Based on the Edgeworth expansion, we then derive a transformation-based confidence interval for the difference. The new interval removes the skewness in the Edgeworth expansion; it is easy to compute, and its coverage probability converges to the nominal level at a rate of O(n^(-1/2)). Numerical results indicate that the new interval has average coverage probability very close to the nominal level even for sample sizes as small as 10, and that it has better average coverage accuracy than the best existing intervals in finite sample sizes.
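For context, the sketch below computes the ordinary Wald interval for a paired difference of proportions, i.e. the interval whose small-sample skewness the Edgeworth expansion explains; the transformation-based interval itself is not reproduced here. The function name and the small paired data set are illustrative assumptions.

```python
# Wald interval for p1 - p2 from paired binary data.
from scipy.stats import norm

def paired_wald_ci(n11, n10, n01, n00, level=0.95):
    """n10 and n01 are the discordant counts (positive on one test only)."""
    n = n11 + n10 + n01 + n00
    diff = (n10 - n01) / n                              # estimate of p1 - p2
    var = (n10 + n01 - (n10 - n01) ** 2 / n) / n ** 2   # Wald variance estimate
    z = norm.ppf(0.5 + level / 2)
    return diff - z * var ** 0.5, diff + z * var ** 0.5

print(paired_wald_ci(n11=14, n10=4, n01=1, n00=6))      # a small paired sample
```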

10.
This article compares the accuracy of the median unbiased estimator with that of the maximum likelihood estimator for a logistic regression model with two binary covariates. The former estimator is shown to be uniformly more accurate than the latter for small to moderately large sample sizes and a broad range of parameter values. In view of the recently developed efficient algorithms for generating exact distributions of sufficient statistics in binary-data problems, these results call for a serious consideration of median unbiased estimation as an alternative to maximum likelihood estimation, especially when the sample size is not large, or when the data structure is sparse.

11.
A challenge for implementing performance-based Bayesian sample size determination is selecting which of several methods to use. We compare three Bayesian sample size criteria: the average coverage criterion (ACC), which controls the coverage rate of fixed-length credible intervals over the predictive distribution of the data; the average length criterion (ALC), which controls the length of credible intervals with a fixed coverage rate; and the worst outcome criterion (WOC), which ensures the desired coverage rate and interval length over all (or a subset of) possible datasets. For most models, the WOC produces the largest sample size among the three criteria, and sample sizes obtained by the ACC and the ALC are not the same. For Bayesian sample size determination for normal means and differences between normal means, we investigate, for the first time, the direction and magnitude of the differences between the ACC and ALC sample sizes. For fixed hyperparameter values, we show that the difference between the ACC and ALC sample sizes depends on the nominal coverage, and not on the nominal interval length. There exists a threshold value of the nominal coverage level such that below the threshold the ALC sample size is larger than the ACC sample size, and above the threshold the ACC sample size is larger. Furthermore, the ACC sample size is more sensitive to changes in the nominal coverage. We also show that for fixed hyperparameter values, there exists an asymptotic constant ratio between the WOC sample size and the ALC (ACC) sample size. Simulation studies are conducted to show that similar relationships among the ACC, ALC, and WOC may hold for estimating binomial proportions. We provide a heuristic argument that the results can be generalized to a larger class of models.
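A hedged Monte Carlo sketch of the ACC and ALC ideas for the binomial case mentioned at the end of the abstract, using a conjugate Beta prior: the ACC averages the posterior coverage of fixed-length intervals over the prior predictive distribution, while the ALC averages the length of fixed-coverage intervals. The centering rule, function names and numbers are illustrative assumptions, not the paper's algorithm.

```python
# Monte Carlo evaluation of ACC and ALC for a binomial proportion with a Beta prior.
import numpy as np
from scipy.stats import beta, betabinom

def acc(n, a, b, length, sims=20000, rng=np.random.default_rng(1)):
    x = betabinom.rvs(n, a, b, size=sims, random_state=rng)   # prior predictive data
    post_a, post_b = a + x, b + n - x
    center = post_a / (post_a + post_b)                       # posterior mean
    lo, hi = center - length / 2, center + length / 2
    return np.mean(beta.cdf(hi, post_a, post_b) - beta.cdf(lo, post_a, post_b))

def alc(n, a, b, coverage, sims=20000, rng=np.random.default_rng(1)):
    x = betabinom.rvs(n, a, b, size=sims, random_state=rng)
    post_a, post_b = a + x, b + n - x
    tail = (1 - coverage) / 2
    return np.mean(beta.ppf(1 - tail, post_a, post_b) - beta.ppf(tail, post_a, post_b))

for n in (50, 100, 200):
    print(n, round(acc(n, 2, 2, length=0.2), 3), round(alc(n, 2, 2, coverage=0.95), 3))
```

A sample size criterion is then the smallest n for which the averaged quantity meets its target, with the WOC replacing the average by a worst case over the datasets considered.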

12.
Biological control of pests is an important branch of entomology, providing environmentally friendly forms of crop protection. Bioassays are used to find the optimal conditions for the production of parasites and strategies for application in the field. In some of these assays, proportions are measured and, often, these data have an inflated number of zeros. In this work, six models are applied to data sets obtained from biological control assays for Diatraea saccharalis, a common pest in sugar cane production. A natural choice for modelling proportion data is the binomial model. The second model is an overdispersed version of the binomial model, estimated by a quasi-likelihood method; it was initially built to model overdispersion generated by individual variability in the probability of success. When interest is only in the positive proportion data, a model can be based on the truncated binomial distribution and on its overdispersed version. The last two models include the zero proportions and are based on a finite mixture model with the binomial distribution, or its overdispersed version, for the positive data. Here, we present the models, discuss their estimation and compare the results.
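A minimal sketch of the last pair of models described, a finite mixture of a point mass at zero and a binomial for the positive part (a zero-inflated binomial), fitted by direct maximization of the log-likelihood. The overdispersed quasi-likelihood variants are not shown, and the data, function names and parameterization are illustrative assumptions.

```python
# Zero-inflated binomial fitted by maximum likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

def zib_negloglik(theta, y, m):
    """theta = (logit of zero-inflation weight, logit of success probability)."""
    omega = 1 / (1 + np.exp(-theta[0]))
    p = 1 / (1 + np.exp(-theta[1]))
    pmf = binom.pmf(y, m, p)
    lik = np.where(y == 0, omega + (1 - omega) * pmf, (1 - omega) * pmf)
    return -np.sum(np.log(lik))

# y: number of parasitized hosts out of m per experimental unit (made-up data)
y = np.array([0, 0, 0, 3, 5, 0, 7, 2, 0, 6, 4, 0])
m = 10
fit = minimize(zib_negloglik, x0=[0.0, 0.0], args=(y, m), method="Nelder-Mead")
omega_hat = 1 / (1 + np.exp(-fit.x[0]))
p_hat = 1 / (1 + np.exp(-fit.x[1]))
print(round(omega_hat, 3), round(p_hat, 3))
```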

13.
The inverse hypergeometric distribution is of interest in applications of inverse sampling without replacement from a finite population where a binary observation is made on each sampling unit. Thus, sampling is performed by randomly choosing units sequentially one at a time until a specified number of one of the two types is selected for the sample. Assuming the total number of units in the population is known but the number of each type is not, we consider the problem of estimating this parameter. We use the Delta method to develop approximations for the variance of three parameter estimators. We then propose three large sample confidence intervals for the parameter. Based on these results, we selected a sampling of parameter values for the inverse hypergeometric distribution to empirically investigate performance of these estimators. We evaluate their performance in terms of expected probability of parameter coverage and confidence interval length calculated as means of possible outcomes weighted by the appropriate outcome probabilities for each parameter value considered. The unbiased estimator of the parameter is the preferred estimator relative to the maximum likelihood estimator and an estimator based on a negative binomial approximation, as evidenced by empirical estimates of closeness to the true parameter value. Confidence intervals based on the unbiased estimator tend to be shorter than the two competitors because of its relatively small variance but at a slight cost in terms of coverage probability.

14.
The incorporation of prior information about θ, where θ is the success probability in a binomial sampling model, is an essential feature of Bayesian statistics. Methodology based on information-theoretic concepts is introduced which (a) quantifies the amount of information provided by the sample data relative to that provided by the prior distribution and (b) allows for a ranking of prior distributions with respect to conservativeness, where conservatism refers to restraint of extraneous information about θ which is embedded in any prior distribution. In effect, the most conservative prior distribution from a specified class (each member of which carries the available prior information about θ) is that prior distribution within the class over which the likelihood function has the greatest average domination. The most conservative prior distributions from five different families of prior distributions over the interval (0,1), including the beta distribution, are determined and compared for three situations: (1) no prior estimate of θ is available, (2) a prior point estimate of θ is available, and (3) a prior interval estimate of θ is available. The results of the comparisons not only advocate the use of the beta prior distribution in binomial sampling but also indicate which particular one to use in the three aforementioned situations.

15.
For the modeling of bounded counts, the binomial distribution is a common choice. In applications, however, one often observes an excessive number of zeros and extra-binomial variation, which cannot be explained by a binomial distribution. We propose statistics to evaluate the number of zeros and the dispersion with respect to a binomial model, based on the sample binomial index of dispersion and the sample binomial zero index. We apply these statistics to autocorrelated counts generated by a binomial autoregressive process of order one, which also includes the special case of independent and identically distributed (i.i.d.) bounded counts. The limiting null distributions of the proposed test statistics are derived. A Monte-Carlo study evaluates their size and power under various alternatives. Finally, we present two real-data applications as well as the derivation of effective sample sizes to illustrate the proposed methodology.
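A hedged sketch of the two descriptive quantities behind the proposed tests: a sample binomial index of dispersion and a comparison of the observed zero frequency with the binomial prediction. The exact standardizations and limiting null distributions derived in the paper are not reproduced; the function names are illustrative.

```python
# Descriptive dispersion and zero-frequency checks against a binomial model.
import numpy as np

def binomial_dispersion_index(x, n):
    """s^2 / (xbar * (1 - xbar/n)); roughly 1 for i.i.d. binomial counts."""
    x = np.asarray(x, dtype=float)
    pi_hat = x.mean() / n
    return x.var(ddof=1) / (x.mean() * (1 - pi_hat))

def zero_frequencies(x, n):
    """Observed proportion of zeros vs. the binomial prediction (1 - pi_hat)^n."""
    x = np.asarray(x, dtype=float)
    pi_hat = x.mean() / n
    return np.mean(x == 0), (1 - pi_hat) ** n

rng = np.random.default_rng(7)
sample = rng.binomial(10, 0.25, size=200)      # i.i.d. binomial benchmark
print(binomial_dispersion_index(sample, 10), zero_frequencies(sample, 10))
```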

16.
Several methods are available for generating confidence intervals for a rate difference, rate ratio, or odds ratio when comparing two independent binomial proportions or Poisson (exposure-adjusted) incidence rates. Most methods have some degree of systematic bias in one-sided coverage, so that a nominal 95% two-sided interval cannot be assumed to have tail probabilities of 2.5% at each end, and any associated hypothesis test is at risk of an inflated type I error rate. Skewness-corrected asymptotic score methods have been shown to have superior equal-tailed coverage properties for the binomial case. This paper completes this class of methods by introducing novel skewness corrections for the Poisson case and for the odds ratio, with and without stratification. Graphical methods are used to compare the performance of these intervals against selected alternatives. The skewness-corrected methods perform favourably in all situations, including those with small sample sizes or rare events, and the skewness correction should be considered essential for analysis of rate ratios. The stratified method is found to have excellent coverage properties for a fixed effects analysis. In addition, another new stratified score method is proposed, based on the t-distribution, which is suitable for use in either a fixed effects or random effects analysis. By using a novel weighting scheme, this approach improves on conventional and modern meta-analysis methods whose weights rely on crude estimation of stratum variances. In summary, this paper describes methods that are found to be robust for a wide range of applications in the analysis of rates.
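As a baseline for comparison, the sketch below gives the simple Wald interval for a Poisson rate ratio on the log scale, i.e. the kind of interval that the skewness-corrected asymptotic score methods are designed to improve on; the score and skewness corrections themselves are not reproduced. Counts and exposures are illustrative.

```python
# Wald (log-scale) confidence interval for an exposure-adjusted rate ratio.
from math import log, exp, sqrt
from scipy.stats import norm

def rate_ratio_wald_ci(x1, t1, x2, t2, level=0.95):
    """x: event counts, t: exposure times; assumes both counts are positive."""
    rr = (x1 / t1) / (x2 / t2)
    se_log = sqrt(1 / x1 + 1 / x2)          # delta-method SE of log rate ratio
    z = norm.ppf(0.5 + level / 2)
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)

print(rate_ratio_wald_ci(x1=8, t1=120.0, x2=3, t2=150.0))
```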

17.
In this paper, the Gompertz model is extended to incorporate time-dependent covariates in the presence of interval-censored, right-censored, left-censored and uncensored data. Its performance at different sample sizes, study periods and attendance probabilities is then studied, and the model is compared to a fixed-covariate model. Finally, two confidence interval estimation methods, Wald and likelihood ratio (LR), are explored and conclusions are drawn based on the results of a coverage probability study. The results indicate that the bias, standard error and root mean square error values of the parameter estimates decrease as the study period, attendance probability and sample size increase. The LR method was also found to work slightly better than the Wald method for the parameters of the model.

18.
The modelling and analysis of count-data time series are areas of emerging interest with various applications in practice. We consider the particular case of the binomial AR(1) model, which is well suited for describing binomial counts with a first-order autoregressive serial dependence structure. We derive explicit expressions for the joint (central) moments and cumulants up to order 4. Then, we apply these results for expressing moments and asymptotic distribution of the squared difference estimator as an alternative to the sample autocovariance. We also analyse the asymptotic distribution of the conditional least-squares estimators of the parameters of the binomial AR(1) model. The finite-sample performance of these estimators is investigated in a simulation study, and we apply them to real data about computerized workstations.
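A hedged sketch of the binomial AR(1) model based on binomial thinning, X_t = α ∘ X_{t-1} + β ∘ (n - X_{t-1}) with α = ρ + π(1 - ρ) and β = π(1 - ρ), together with a simple conditional least-squares recovery of (π, ρ) from the regression E[X_t | X_{t-1}] = ρ X_{t-1} + nπ(1 - ρ). Parameter values and function names are illustrative; the moment and asymptotic results of the paper are not reproduced.

```python
# Simulation and conditional least-squares estimation for a binomial AR(1) process.
import numpy as np

def simulate_binomial_ar1(T, n, pi, rho, rng=np.random.default_rng(0)):
    beta = pi * (1 - rho)
    alpha = beta + rho
    x = np.empty(T, dtype=int)
    x[0] = rng.binomial(n, pi)                        # start from the stationary marginal
    for t in range(1, T):
        x[t] = rng.binomial(x[t - 1], alpha) + rng.binomial(n - x[t - 1], beta)
    return x

def cls_estimates(x, n):
    """Recover (pi, rho) from the least-squares fit of X_t on X_{t-1}."""
    slope, intercept = np.polyfit(x[:-1], x[1:], 1)
    rho_hat = slope
    pi_hat = intercept / (n * (1 - rho_hat))
    return pi_hat, rho_hat

x = simulate_binomial_ar1(T=1000, n=20, pi=0.3, rho=0.5)
print(cls_estimates(x, n=20))
```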

19.
Three methods for testing the equality of nonindependent proportions were compared with the use of Monte Carlo techniques. The three methods included Cochran's test, an ANOVA F test, and Hotelling's T2 test. With respect to empirical significance levels, the ANOVA F test is recommended as the preferred method of analysis.

Oftentimes an experimenter is interested in testing the equality of several proportions. When the proportions are independent, Kemp and Butcher (1972) and Butcher and Kemp (1974) compared several methods for analysing large-sample binomial data for the case of a 3 x 3 factorial design without replication. In addition, Levy and Narula (1977) compared many of the same methods for analyzing binomial data; however, Levy and Narula investigated the relative utility of the methods for small sample sizes.
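For reference, the sketch below computes one of the three competitors, Cochran's Q test for equality of k matched (nonindependent) proportions, directly from a subjects-by-conditions binary matrix; the ANOVA F and Hotelling's T2 alternatives from the study are not reproduced, and the data are made up.

```python
# Cochran's Q test for equality of k non-independent proportions.
import numpy as np
from scipy.stats import chi2

def cochrans_q(x):
    """x: (subjects, k) array of 0/1 responses."""
    x = np.asarray(x)
    k = x.shape[1]
    col = x.sum(axis=0)                    # condition totals
    row = x.sum(axis=1)                    # subject totals
    N = x.sum()
    q = (k - 1) * (k * np.sum(col**2) - N**2) / (k * N - np.sum(row**2))
    return q, chi2.sf(q, df=k - 1)         # asymptotic chi-square reference

data = np.array([[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0],
                 [1, 1, 0], [1, 0, 1], [0, 0, 0], [1, 1, 0]])
print(cochrans_q(data))
```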

20.
The performances of six confidence intervals for estimating the arithmetic mean of a lognormal distribution are compared using simulated data. The first interval considered is based on an exact method and is recommended in U.S. EPA guidance documents for calculating upper confidence limits for contamination data. Two intervals are based on asymptotic properties due to the Central Limit Theorem, and the other three are based on transformations and maximum likelihood estimation. The effects of departures from lognormality on the performance of these intervals are also investigated; the gamma distribution is considered to represent such departures. The average width and coverage of each confidence interval are reported for varying mean, variance, and sample size. In the lognormal case, the exact interval gives good coverage, but for small sample sizes and large variances the confidence intervals are too wide. In these cases, an approximation that incorporates sampling variability of the sample variance tends to perform better. When the underlying distribution is a gamma distribution, the intervals based upon the Central Limit Theorem tend to perform better than those based upon lognormal assumptions.
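A hedged sketch of two of the interval types compared: the naive CLT interval on the original scale and Cox's log-scale approximation for the lognormal mean exp(μ + σ²/2), which incorporates the sampling variability of the sample variance. The exact (Land-type) method recommended in the EPA guidance is not reproduced, and the simulated data are illustrative.

```python
# CLT interval and Cox's approximate interval for a lognormal mean.
import numpy as np
from scipy.stats import norm, t

def clt_interval(x, level=0.95):
    x = np.asarray(x, dtype=float)
    n = len(x)
    half = t.ppf(0.5 + level / 2, df=n - 1) * x.std(ddof=1) / np.sqrt(n)
    return x.mean() - half, x.mean() + half

def cox_interval(x, level=0.95):
    y = np.log(np.asarray(x, dtype=float))
    n = len(y)
    ybar, s2 = y.mean(), y.var(ddof=1)
    se = np.sqrt(s2 / n + s2**2 / (2 * (n - 1)))   # SE of ybar + s2/2 on the log scale
    z = norm.ppf(0.5 + level / 2)
    return np.exp(ybar + s2 / 2 - z * se), np.exp(ybar + s2 / 2 + z * se)

rng = np.random.default_rng(3)
x = rng.lognormal(mean=1.0, sigma=1.0, size=30)    # simulated contamination data
print(clt_interval(x), cox_interval(x))
```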
