Similar Articles
1.
ABSTRACT

Dirichlet-process-based non-parametric Bayesian inference is developed for a Y-linked two-sex branching process with blind choice. This stochastic model is suitable for analysing the evolution of the number of carriers of two alleles of a Y-linked gene in a two-sex monogamous population where each female chooses her partner from among the male population without caring about his type (i.e. the allele he carries). The only data assumed to be available are the total number of females and males (regardless of their types) up to some generation and the numbers of each type of male in the last generation. A simulation method which is based on a Dirichlet process and a Gibbs sampler is developed to estimate the posterior distributions of the model's main parameters. Finally, the computational efficiency of the algorithm is illustrated with example simulations and an application to real data.

2.
The inverse hypergeometric distribution is of interest in applications of inverse sampling without replacement from a finite population where a binary observation is made on each sampling unit. Thus, sampling is performed by randomly choosing units sequentially one at a time until a specified number of one of the two types is selected for the sample. Assuming the total number of units in the population is known but the number of each type is not, we consider the problem of estimating this parameter. We use the Delta method to develop approximations for the variance of three parameter estimators. We then propose three large sample confidence intervals for the parameter. Based on these results, we selected a sampling of parameter values for the inverse hypergeometric distribution to empirically investigate performance of these estimators. We evaluate their performance in terms of expected probability of parameter coverage and confidence interval length, calculated as means of possible outcomes weighted by the appropriate outcome probabilities for each parameter value considered. The unbiased estimator of the parameter is the preferred estimator relative to the maximum likelihood estimator and an estimator based on a negative binomial approximation, as evidenced by empirical estimates of closeness to the true parameter value. Confidence intervals based on the unbiased estimator tend to be shorter than those of the two competitors because of its relatively small variance, but at a slight cost in terms of coverage probability.
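The inverse sampling scheme described in the abstract is easy to sketch directly: draw units one at a time without replacement until the k-th unit of the type of interest appears, and estimate the unknown count by maximizing the inverse (negative) hypergeometric likelihood. A minimal illustration follows; the function names and the brute-force numeric MLE are my own choices, not the authors' code, and the article's three estimators and intervals are not reproduced.

```python
import math
import random

def nhg_pmf(n, N, M, k):
    """P(total draws = n) when sampling without replacement from N units,
    M of them of the target type, until the k-th target unit appears."""
    if n < k or n - k > N - M:
        return 0.0
    # first n-1 draws contain exactly k-1 target units, and draw n is a target
    return (math.comb(M, k - 1) * math.comb(N - M, n - k)
            / math.comb(N, n - 1)) * (M - k + 1) / (N - n + 1)

def inverse_sample(N, M, k, rng):
    """Simulate inverse sampling; return the (random) total number of draws."""
    units = [1] * M + [0] * (N - M)
    rng.shuffle(units)
    seen = 0
    for n, u in enumerate(units, start=1):
        seen += u
        if seen == k:
            return n

def mle_M(n, N, k):
    """Maximum-likelihood estimate of M given the observed number of draws n,
    found by scanning the feasible range of M."""
    return max(range(k, N - (n - k) + 1), key=lambda M: nhg_pmf(n, N, M, k))
```

The feasibility bounds (at least k target units must exist, and at least n − k non-target units) keep the scan well defined.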

3.
This article focuses on two-phase sampling designs for a population with an unknown number of rare objects. The first phase is used to estimate the number of rare or potentially rare objects in the population, and the second phase to design sampling plans that capture a certain number or a certain proportion of objects of this type. A hypergeometric-binomial model is applied to infer the number of rare or potentially rare objects, and Monte Carlo simulation-based approaches are developed to calculate the needed sample sizes. Simulations and real data applications are discussed. The Canadian Journal of Statistics 37: 417–434; 2009 © 2009 Statistical Society of Canada
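The core second-phase computation, a sample size that captures at least m rare objects with high probability when R of the N units are rare, can be sketched with exact hypergeometric tail probabilities in place of the authors' Monte Carlo approach. The names below are illustrative, and the first-phase inference step is omitted.

```python
import math

def prob_at_least(N, R, n, m):
    """P(a without-replacement sample of size n from N units, R of them rare,
    contains at least m rare units): a hypergeometric tail probability."""
    total = math.comb(N, n)
    return sum(math.comb(R, x) * math.comb(N - R, n - x)
               for x in range(m, min(R, n) + 1)) / total

def needed_sample_size(N, R, m, gamma=0.95):
    """Smallest sample size whose capture probability is at least gamma;
    the capture probability is nondecreasing in n, so a linear scan works."""
    for n in range(m, N + 1):
        if prob_at_least(N, R, n, m) >= gamma:
            return n
    return N
```

With R unknown, the article's hypergeometric-binomial model would supply a posterior over R; one could then average this capture probability over that posterior.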

4.
Latent class models have recently drawn considerable attention among many researchers and practitioners as a class of useful tools for capturing heterogeneity across different segments in a target market or population. In this paper, we consider a latent class logit model with parameter constraints and deal with two important issues in latent class models, namely parameter estimation and selection of an appropriate number of classes, within a Bayesian framework. A simple Gibbs sampling algorithm is proposed for sample generation from the posterior distribution of the unknown parameters. Using the Gibbs output, we propose a method for determining an appropriate number of latent classes. A real-world marketing example as an application for market segmentation is provided to illustrate the proposed method.

5.
The likelihood ratio method is used to construct a confidence interval for a population mean when sampling from a population with certain characteristics found in many applications, such as auditing. Specifically, a sample taken from this type of population usually consists of a very large number of zero values, plus a small number of nonzero values that follow some continuous distribution. In this situation, the traditional confidence interval constructed for the population mean is known to be unreliable. This article derives confidence intervals based on the likelihood-ratio-test approach by assuming (1) a normal distribution (normal algorithm) and (2) an exponential distribution (exponential algorithm). Because the error population distribution is usually unknown, it is important to study the robustness of the proposed procedures. We perform an extensive simulation study to compare the percentage of confidence intervals containing the true population mean using the two proposed algorithms with the percentage obtained from the traditional method based on the central limit theorem. It is shown that the normal algorithm is the most robust procedure against many different distributional error assumptions.
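The population type the article targets, and the traditional interval it benchmarks against, are easy to mimic: mostly zeros plus a few continuously distributed nonzero values, with a central-limit-theorem interval around the sample mean. A minimal sketch under those assumptions (my own names; the likelihood-ratio intervals themselves are not reproduced here):

```python
import math
import random
import statistics

def audit_population(size, nonzero_rate, mean_error, rng):
    """Mostly-zero population: each unit is nonzero with prob nonzero_rate,
    and nonzero values are exponential with the given mean."""
    return [rng.expovariate(1 / mean_error) if rng.random() < nonzero_rate else 0.0
            for _ in range(size)]

def clt_interval(sample, z=1.96):
    """Traditional CLT confidence interval for the mean."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return m - z * se, m + z * se
```

With so many zeros the sampling distribution of the mean is strongly skewed, which is why the symmetric CLT interval's coverage degrades and the likelihood-ratio alternatives studied in the article become attractive.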

6.
A rotation scheme for a stratified multi-stage sample, discussed in this paper, was designed to satisfy the following conditions: (i) there is a constraint on the number of units that can be replaced in each round, and (ii) it is relatively inexpensive to increase the sample size gradually. An example of these conditions was observed in the development of a plan for measuring the accuracy of the billing process of a telephone company. Estimators of the population proportion of elements that possess a specified characteristic are also derived. Each estimator is a weighted average of the corresponding estimates based on the retained units from the original sample and on the new units, where the weight of the estimate based on the new units increases over time. While this rotation scheme is discussed in connection with the billing accuracy of a telephone company, the methodology can be applied to other similar problems.

7.
A consistent test for difference in locations between two bivariate populations is proposed. The test is similar to the Mann-Whitney test and depends on the exceedances of slopes of the two samples, where the slope for each sample observation is computed by taking the ratio of the observed values. In terms of the slopes, the problem reduces to a univariate one. The power of the test has been compared with those of various existing tests by simulation. The proposed test statistic is compared with Mardia's (1967) test statistic, Peters and Randles' (1991) test statistic, Wilcoxon's rank sum test statistic and Hotelling's T2 test statistic using the Monte Carlo technique. It performs better than the other statistics compared for small differences in location between the two populations when the underlying population is population 7 (a light-tailed population) and the sample sizes are 15 and 18, respectively. When the underlying population is population 6 (a heavy-tailed population) and the sample sizes are 15 and 18, it performs better than the other statistics compared, except Wilcoxon's rank sum test statistic, for small differences in location between the two populations. It performs better than Mardia's (1967) test statistic for large differences in location between the two populations when the underlying population is a bivariate normal mixture with probability p=0.5, population 6, the Pearson type II population, or the Pearson type VII population, for sample sizes 15 and 18. Under a bivariate normal population it performs as well as Mardia's (1967) test statistic for small differences in location between the two populations and sample sizes 15 and 18. For sample sizes 25 and 28, respectively, it performs better than Mardia's (1967) test statistic when the underlying population is population 6, the Pearson type II population, or the Pearson type VII population.

8.
Summary.  Complex survey sampling is often used to sample a fraction of a large finite population. In general, the survey is conducted so that each unit (e.g. subject) in the sample has a different probability of being selected into the sample. For generalizability of the sample to the population, both the design and the probability of being selected into the sample must be incorporated in the analysis. In this paper we focus on non-standard regression models for complex survey data. In our motivating example, which is based on data from the Medical Expenditure Panel Survey, the outcome variable is the subject's 'total health care expenditures in the year 2002'. Previous analyses of medical cost data suggest that the variance is approximately equal to the mean raised to the power of 1.5, which is a non-standard variance function. Currently, the regression parameters for this model cannot be easily estimated in standard statistical software packages. We propose a simple two-step method to obtain consistent regression parameter and variance estimates; the method proposed can be implemented within any standard sample survey package. The approach is applicable to complex sample surveys with any number of stages.

9.
We propose a class of multidimensional Item Response Theory models for polytomously-scored items with ordinal response categories. This class extends an existing class of multidimensional models for dichotomously-scored items in which the latent abilities are represented by a random vector assumed to have a discrete distribution, with support points corresponding to different latent classes in the population. In the proposed approach, we allow for different parameterizations for the conditional distribution of the response variables given the latent traits, which depend on the type of link function and the constraints imposed on the item parameters. Moreover, we suggest a strategy for model selection that is based on a series of steps consisting of selecting specific features, such as the dimension of the model (number of latent traits), the number of latent classes, and the specific parameterization. In order to illustrate the proposed approach, we analyze a dataset from a study on anxiety and depression on a sample of oncological patients.

10.
A method is described for determining the sample size required for a simultaneous confidence statement of specified precision about the parameters of a multinomial population. The method is based on a simultaneous confidence interval procedure due to Goodman, and the results are compared with those obtained by separately considering each cell of the multinomial population as a binomial.
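Goodman-style simultaneous intervals use a Bonferroni-adjusted chi-square(1) quantile A, and a sample-size rule can then be approximated by inverting the worst-case (p = 0.5) interval half-width. The sketch below works under those assumptions with the standard library only (so A is obtained as the square of a normal quantile); the function names and the Wald-style sample-size inversion are my simplifications, not necessarily the exact procedure of the article.

```python
import math
from statistics import NormalDist

def goodman_interval(x_i, n, k, alpha=0.05):
    """Simultaneous CI for one of k multinomial cell proportions, built from
    a Bonferroni-adjusted chi-square(1) quantile A = z^2 (Goodman-style)."""
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    A = z * z
    centre = A + 2 * x_i
    half = math.sqrt(A * (A + 4 * x_i * (n - x_i) / n))
    return (centre - half) / (2 * (n + A)), (centre + half) / (2 * (n + A))

def sample_size(d, k, alpha=0.05, p=0.5):
    """Approximate n so each of the k simultaneous intervals has half-width
    at most d in the worst case, via the Wald-style n ~ A p(1-p)/d^2."""
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    return math.ceil(z * z * p * (1 - p) / d ** 2)
```

Because the adjusted quantile grows with k, the required n for a given half-width is larger than in the single-binomial treatment the article compares against.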

11.
In the estimation of a population mean or total from a random sample, certain methods based on linear models are known to be automatically design consistent, regardless of how well the underlying model describes the population. A sufficient condition is identified for this type of robustness to model failure; the condition, which we call 'internal bias calibration', relates to the combination of a model and the method used to fit it. Included among the internally bias-calibrated models, in addition to the aforementioned linear models, are certain canonical link generalized linear models and nonparametric regressions constructed from them by a particular style of local likelihood fitting. Other models can often be made robust by using a suboptimal fitting method. Thus the class of model-based, but design consistent, analyses is enlarged to include more realistic models for certain types of survey variable such as binary indicators and counts. Particular applications discussed are the estimation of the size of a population subdomain, as arises in tax auditing for example, and the estimation of a bootstrap tail probability.

12.
Array-based comparative genomic hybridization (aCGH) is a high-resolution high-throughput technique for studying the genetic basis of cancer. The resulting data consists of log fluorescence ratios as a function of the genomic DNA location and provides a cytogenetic representation of the relative DNA copy number variation. Analysis of such data typically involves estimation of the underlying copy number state at each location and segmenting regions of DNA with similar copy number states. Most current methods proceed by modeling a single sample/array at a time, and thus fail to borrow strength across multiple samples to infer shared regions of copy number aberrations. We propose a hierarchical Bayesian random segmentation approach for modeling aCGH data that utilizes information across arrays from a common population to yield segments of shared copy number changes. These changes characterize the underlying population and allow us to compare different population aCGH profiles to assess which regions of the genome have differential alterations. Our method, referred to as BDSAcgh (Bayesian Detection of Shared Aberrations in aCGH), is based on a unified Bayesian hierarchical model that allows us to obtain probabilities of alteration states as well as probabilities of differential alteration that correspond to local false discovery rates. We evaluate the operating characteristics of our method via simulations and an application using a lung cancer aCGH data set.

13.
Communications in Statistics - Theory and Methods, 2012, 41(16-17): 3278-3300
Under complex survey sampling, in particular when selection probabilities depend on the response variable (informative sampling), the sample and population distributions are different, possibly resulting in selection bias. This article addresses this problem by fitting two statistical models, namely the variance components model (a two-stage model) and the fixed effects model (a single-stage model) for one-way analysis of variance, under a complex survey design involving, for example, two-stage sampling, stratification, and unequal probabilities of selection. Classical theory underlying the use of the two-stage model involves simple random sampling at each of the two stages. In such cases the model holding for the sample, after sample selection, is the same as the model for the population before sample selection. When the selection probabilities are related to the values of the response variable, standard estimates of the population model parameters may be severely biased, leading possibly to false inference. The idea behind the approach is to extract the model holding for the sample data as a function of the model in the population and of the first-order inclusion probabilities, and then to fit the sample model using analysis of variance, maximum likelihood, and pseudo maximum likelihood methods of estimation. The main feature of the proposed techniques relates to their behavior in terms of the informativeness parameter. We also show that use of the population model that ignores the informative sampling design yields biased model fitting.

14.
The problem of sample selection, when a one-stage superpopulation model-based approach is used to predict individual variate values for each unit in a finite population based on a sample of only some of the units, is investigated. The model framework is discussed and a sample selection scheme based on the model is derived. The sample selection scheme is evaluated using actual data. Future research topics including multiple predictions per unit are suggested.

15.
In this study we discuss group sequential procedures for comparing two treatments based on multivariate observations in clinical trials. We suppose that the response vector on each of the two treatments has a multivariate normal distribution with unknown covariance matrix. We then propose a group sequential chi-squared statistic in order to carry out repeated significance tests of the hypothesis of no difference between the two population mean vectors. In order to realize a group sequential test with reduced average sample number, we propose another, modified group sequential chi-squared statistic by extending Jennison and Turnbull (1991). After constructing repeated confidence boundaries for the repeated significance test, we compare the two group sequential procedures based on the two statistics with respect to average sample number and power of the test in simulations.

16.
This paper presents a new parametric model for recurrent events, in which the time of each recurrence is associated with one or multiple latent causes and no information is provided about the cause responsible for the event. The model is characterized by a rate function and is based on the Poisson-exponential distribution, namely the distribution of the maximum among a random number (truncated Poisson distributed) of exponential times. The time of each recurrence is then given by the maximum lifetime value among all latent causes. Inference is based on a maximum likelihood approach. A simulation study is performed in order to observe the frequentist properties of the estimation procedure for small and moderate sample sizes. We also investigate likelihood-based test procedures. A real example from a gastroenterology study concerning small bowel motility during the fasting state is used to illustrate the methodology. Finally, we apply the proposed model to a real data set and compare it with the classical homogeneous Poisson model, which is a particular case.
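The maximum-of-latent-causes construction is easy to check by simulation: draw N from a zero-truncated Poisson, take the maximum of N exponential lifetimes, and compare the empirical distribution with the cdf F(t) = (e^{λ(1−e^{−βt})} − 1)/(e^{λ} − 1) implied by that construction. All names below are mine; this sketches the distribution only, not the article's recurrent-event rate model or its likelihood.

```python
import math
import random

def zt_poisson(lam, rng):
    """Zero-truncated Poisson draw: Knuth's method plus rejection of zeros."""
    while True:
        L, k, p = math.exp(-lam), 0, 1.0
        while p > L:
            k += 1
            p *= rng.random()
        if k - 1 > 0:
            return k - 1

def poisson_exponential(lam, beta, rng):
    """Maximum of a zero-truncated-Poisson number of Exp(beta) lifetimes."""
    n = zt_poisson(lam, rng)
    return max(rng.expovariate(beta) for _ in range(n))

def cdf(t, lam, beta):
    """Closed-form cdf of the Poisson-exponential construction above."""
    s = 1 - math.exp(-beta * t)
    return (math.exp(lam * s) - 1) / (math.exp(lam) - 1)
```

The cdf follows from E[s^N] = (e^{λs} − 1)/(e^{λ} − 1) for a zero-truncated Poisson N, evaluated at s = P(Exp(β) ≤ t).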

17.
ABSTRACT

New invariant and consistent goodness-of-fit tests for multivariate normality are introduced. The tests are based on the Karhunen–Loève transformation of a multidimensional sample from a population. A comparison of the simulated powers of these tests and other well-known tests with respect to some alternatives is given. The simulation study demonstrates that the power of the proposed McCull test is almost independent of the number of grouping cells. The test shows an advantage over other chi-squared type tests. However, averaged over all of the simulated conditions examined in this article, the Anderson–Darling type and the Cramér–von Mises type tests seem to be the best.

18.
Wild Bootstrapping in Finite Populations with Auxiliary Information
Consider a finite population u, which can be viewed as a realization of a super-population model. A simple ratio model (linear regression without intercept) with heteroscedastic errors is supposed to have generated u. A random sample is drawn without replacement from u. In this set-up a two-stage wild bootstrap resampling scheme, as well as several other useful forms of bootstrapping in finite populations, will be considered. Some asymptotic results for various bootstrap approximations for normalized and Studentized versions of the well-known ratio and regression estimators are given. Bootstrap-based confidence intervals for the population total and for the regression parameter of the underlying ratio model are also discussed.
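The two-stage scheme itself is not reproduced here, but the basic wild bootstrap step for a heteroscedastic ratio model can be sketched: fit the ratio slope, then resample by perturbing each residual with an independent random sign and refitting. The function names and the Rademacher weights are my illustrative choices (Mammen's two-point weights are a common alternative), so this is a sketch of the idea, not the paper's exact scheme.

```python
import random

def ratio_slope(xs, ys):
    """Ratio-model slope estimate b = sum(y) / sum(x)."""
    return sum(ys) / sum(xs)

def wild_bootstrap_slopes(xs, ys, reps, rng):
    """Wild bootstrap: y*_i = b*x_i + sign_i * e_i with random signs,
    refitting the slope in each replicate to mimic its sampling variation."""
    b = ratio_slope(xs, ys)
    resid = [y - b * x for x, y in zip(xs, ys)]
    out = []
    for _ in range(reps):
        ystar = [b * x + rng.choice((-1.0, 1.0)) * e for x, e in zip(xs, resid)]
        out.append(ratio_slope(xs, ystar))
    return out
```

Because each weight has mean zero and unit variance, the replicate slopes are centred at the fitted slope while reproducing the unit-by-unit error variances, which is what makes the wild bootstrap suitable under heteroscedasticity.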

19.

A goodness-of-fit technique for random samples from the exponential distribution based on the sample Lorenz curve is adapted for use in the exponential order statistic (EOS) model. In the EOS model, only those observations in a random sample from the exponential distribution of unknown size N that are less than some known stopping time T are observable. The model is known as the Jelinski-Moranda model in software reliability, where it is used to estimate the number of bugs in software during development. Distributional results are derived for the distance between the sample Lorenz curve and the population Lorenz curve so that it can be used as a goodness-of-fit test statistic. Simulations show that the test has good power against several alternative distributions. Simulations also indicate that in some cases, model misspecification leads to poor parameter estimation. A plotting procedure provides a means of graphical assessment of fit.
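The distance statistic rests on two ingredients that are easy to write down: the sample Lorenz curve (normalized partial sums of the order statistics) and the exponential population Lorenz curve L(p) = p + (1 − p)log(1 − p). A sketch of the supremum distance between them follows; the names are mine, and the EOS truncation at the stopping time T is omitted for simplicity.

```python
import math
import random

def exp_lorenz(p):
    """Population Lorenz curve of the exponential distribution."""
    return 1.0 if p >= 1.0 else p + (1.0 - p) * math.log(1.0 - p)

def lorenz_distance(xs):
    """Sup distance between the sample Lorenz curve (evaluated at i/n)
    and the exponential population Lorenz curve."""
    xs = sorted(xs)
    n, total = len(xs), sum(xs)
    cum, dist = 0.0, 0.0
    for i, x in enumerate(xs, start=1):
        cum += x
        dist = max(dist, abs(cum / total - exp_lorenz(i / n)))
    return dist
```

A small distance is consistent with exponentiality; a systematically large one (as for a uniform sample, whose Lorenz curve is p²) signals departure, which is the behaviour the test statistic exploits.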

20.
A new method is described of drawing, without replacement, two sample units per stratum from any population. The method is developed from a consideration of the asymptotic properties of systematic sampling with unequal probabilities, as the sizes of the population units tend to zero. The essential properties of this method are very easily analysed. They also converge, over a large number of strata, to those of systematic sampling from the same strata with their population units arranged in random order. In proving this, the assumption is made that the underlying population is of the type to which it is appropriate to apply ratio estimation. The sampling method described is, however, simple enough to commend itself as an alternative to systematic sampling when the underlying population is not of this type. Consideration is given to the case where the sizes of some of the population units exceed the skip interval.
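The benchmark the proposed method is developed from, systematic sampling with probability proportional to size, can be sketched as follows: cumulate the unit sizes, fix a skip interval total/n, and select the unit whose cumulative-size interval contains each of n equally spaced random-start points. This is a minimal version under the usual assumption that no unit's size exceeds the skip interval (the case the abstract's final remark relaxes); the names are my own.

```python
import random

def systematic_pps(sizes, n, rng):
    """Systematic PPS sample of n unit indices; assumes every size is
    smaller than the skip interval sum(sizes)/n, so no unit repeats."""
    skip = sum(sizes) / n
    start = rng.random() * skip            # random start in [0, skip)
    points = [start + j * skip for j in range(n)]
    chosen, cum, j = [], 0.0, 0
    for i, s in enumerate(sizes):
        cum += s
        # take every selection point that falls in this unit's size interval
        while j < n and points[j] < cum:
            chosen.append(i)
            j += 1
    return chosen
```

Each unit's inclusion probability equals its size divided by the skip interval, which is what makes the scheme probability-proportional-to-size.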


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号