Similar literature
 20 similar documents found; search took 31 ms
1.
First a comprehensive treatment of the hierarchical-conjugate Bayesian predictive approach to binary survey data is presented, encompassing simple random, stratified, cluster, and two-stage sampling, as well as two-stage sampling within strata. For the case of two-stage sampling within strata when there is more than one variable of stratification, analysis using an unsaturated logit linear model on the prior means is proposed. This allows there to be cells containing no sampled clusters. Formulas for posterior predictive means, variances, and covariances of numbers of successes in unsampled portions of clusters are presented in terms of posterior expectations of certain functions of hyperparameters; these may be evaluated by existing methods. The technique is illustrated using a small subset of Canada Youth & AIDS Study data. A sample of students within each of various selected school boards was chosen and interviewed via questionnaire. The boards were stratified/poststratified in two dimensions, but some of the resulting cells contained no data. The additive logit linear model on the prior means produced estimates and posterior variances for boards in all cells. Data showed the additive model to be plausible.

2.
This paper considers the effects of informative two-stage cluster sampling on estimation and prediction. The aims of this article are twofold: first to estimate the parameters of the superpopulation model for two-stage cluster sampling from a finite population, when the sampling design for both stages is informative, using maximum likelihood estimation methods based on the sample-likelihood function; secondly to predict the finite population total and to predict the cluster-specific effects and the cluster totals for clusters in the sample and for clusters not in the sample. To achieve this we derive the sample and sample-complement distributions and the moments of the first and second stage measurements. Also we derive the conditional sample and conditional sample-complement distributions and the moments of the cluster-specific effects given the cluster measurements. It should be noted that classical design-based inference that consists of weighting the sample observations by the inverse of sample selection probabilities cannot be applied for the prediction of the cluster-specific effects for clusters not in the sample. Also we give an alternative justification of the Royall [1976. The linear least squares prediction approach to two-stage sampling. Journal of the American Statistical Association 71, 657–664] predictor of the finite population total under two-stage cluster population. Furthermore, small-area models are studied under informative sampling.

3.
We consider a Bayesian approach to the study of independence in a two-way contingency table which has been obtained from a two-stage cluster sampling design. If a procedure based on single-stage simple random sampling (rather than the appropriate cluster sampling) is used to test for independence, the p-value may be too small, resulting in a conclusion that the null hypothesis is false when it is, in fact, true. For many large complex surveys the Rao–Scott corrections to the standard chi-squared (or likelihood ratio) statistic provide appropriate inference. For smaller surveys, though, the Rao–Scott corrections may not be accurate, partly because the chi-squared test is inaccurate. In this paper, we use a hierarchical Bayesian model to convert the observed cluster samples to simple random samples. This provides surrogate samples which can be used to derive the distribution of the Bayes factor. We demonstrate the utility of our procedure using an example and also provide a simulation study which establishes our methodology as a viable alternative to the Rao–Scott approximations for relatively small two-stage cluster samples. We also show the additional insight gained by displaying the distribution of the Bayes factor rather than simply relying on a summary of the distribution.

4.
Market segmentation is a key concept in marketing research. Identification of consumer segments helps in setting up and improving a marketing strategy. Hence, the need is to improve existing methods and to develop new segmentation methods. We introduce two new consumer indicators that can be used as segmentation basis in two-stage methods, the forces and the dfbetas. Both bases express a subject’s effect on the aggregate estimates of the parameters in a conditional logit model. Further, individual-level estimates, obtained by either estimating a conditional logit model for each individual separately with maximum likelihood or by hierarchical Bayes (HB) estimation of a mixed logit choice model, and the respondents’ raw choices are also used as segmentation basis. In the second stage of the methods the bases are classified into segments with cluster analysis or latent class models. All methods are applied to choice data because of the increasing popularity of choice experiments to analyze choice behavior. To verify whether two-stage segmentation methods can compete with a one-stage approach, a latent class choice model is estimated as well. A simulation study reveals the superiority of the two-stage method that clusters the HB estimates and the one-stage latent class choice model. Additionally, very good results are obtained for two-stage latent class cluster analysis of the choices as well as for the two-stage methods clustering the forces, the dfbetas and the choices.

5.
Hypothesis Testing in Two-Stage Cluster Sampling   Cited by: 1 (self-citations: 0, citations by others: 1)
Correlated observations often arise in complex sampling schemes such as two-stage cluster sampling. The resulting observations from this sampling scheme usually exhibit certain positive intracluster correlation, as a result of which the standard statistical procedures for testing hypotheses concerning linear combinations of the parameters may lack some of the optimal properties that these possess when the data are uncorrelated. The aim of this paper is to present exact methods for testing these hypotheses by combining within and between cluster information much as in Zhou & Mathew (1993).
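
To make the effect of positive intracluster correlation concrete, the following is a minimal sketch of Kish's design effect, deff = 1 + (m - 1)ρ, which applies under the simplifying assumption of equal cluster sizes m and a common intracluster correlation ρ. It is not the exact testing method of the paper; the function names and the numbers are illustrative assumptions.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equal-size clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations carrying the same information."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical survey: 20 clusters of 20 units each, rho = 0.05.
deff = design_effect(cluster_size=20, icc=0.05)
n_eff = effective_sample_size(n_total=400, cluster_size=20, icc=0.05)
print(f"design effect = {deff:.2f}, effective n = {n_eff:.1f}")
```

Even a modest ρ roughly halves the effective sample size here, which is why tests that treat the 400 observations as uncorrelated can overstate significance.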

6.
Prediction of random effects is an important problem with expanding applications. In the simplest context, the problem corresponds to prediction of the latent value (the mean) of a realized cluster selected via two-stage sampling. Recently, Stanek and Singer [Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 119–130] developed best linear unbiased predictors (BLUP) under a finite population mixed model that outperform BLUPs from mixed models and superpopulation models. Their setup, however, does not allow for unequally sized clusters. To overcome this drawback, we consider an expanded finite population mixed model based on a larger set of random variables that span a higher dimensional space than those typically applied to such problems. We show that BLUPs for linear combinations of the realized cluster means derived under such a model have considerably smaller mean squared error (MSE) than those obtained from mixed models, superpopulation models, and finite population mixed models. We motivate our general approach by an example developed for two-stage cluster sampling and show that it faithfully captures the stochastic aspects of sampling in the problem. We also consider simulation studies to illustrate the increased accuracy of the BLUP obtained under the expanded finite population mixed model.

7.
Adaptive cluster sampling is usually applied when estimating the abundance of elusive, clustered biological populations. It is commonly supposed that all individuals in the selected area units are detected by the observer, but in many actual situations this assumption may be highly unrealistic and some individuals may be missed. This paper deals with the problem of handling imperfect detectability in adaptive cluster sampling by using a pure design-based approach. A two-stage adaptive procedure is proposed where the abundance in the selected units is estimated by replicated counts.

8.
This paper considers the analysis of multivariate survival data where the marginal distributions are specified by semiparametric transformation models, a general class including the Cox model and the proportional odds model as special cases. First, consideration is given to the situation where the joint distribution of all failure times within the same cluster is specified by the Clayton–Oakes model (Clayton, Biometrika 65:141–151, 1978; Oakes, J R Stat Soc B 44:412–422, 1982). A two-stage estimation procedure is adopted by first estimating the marginal parameters under the independence working assumption, and then the association parameter is estimated from the maximization of the full likelihood function with the estimators of the marginal parameters plugged in. The asymptotic properties of all estimators in the semiparametric model are derived. For the second situation, the third and higher order dependency structures are left unspecified, and interest focuses on the pairwise correlation between any two failure times. Thus, the pairwise association estimate can be obtained in the second stage by maximizing the pairwise likelihood function. Large sample properties for the pairwise association are also derived. Simulation studies show that the proposed approach is appropriate for practical use. To illustrate, a subset of the data from the Diabetic Retinopathy Study is used.

9.
We consider the adjustment, based upon a sample of size n, of collections of vectors drawn from either an infinite or finite population. The vectors may be judged to be either normally distributed or, more generally, second-order exchangeable. We develop the work of Goldstein and Wooff (1998) to show how the familiar univariate finite population corrections (FPCs) naturally generalise to individual quantities in the multivariate population. The types of information we gain by sampling are identified with the orthogonal canonical variable directions derived from a generalised eigenvalue problem. These canonical directions share the same co-ordinate representation for all sample sizes and, for equally defined individuals, all population sizes enabling simple comparisons between both the effects of different sample sizes and of different population sizes. We conclude by considering how the FPC is modified for multivariate cluster sampling with exchangeable clusters. In univariate two-stage cluster sampling, we may decompose the variance of the population mean into the sum of the variance of cluster means and the variance of the cluster members within clusters. The first term has a FPC relating to the sampling fraction of clusters, the second term has a FPC relating to the sampling fraction of cluster size. We illustrate how this generalises in the multivariate case. We decompose the variance into two terms: the first relating to multivariate finite population sampling of clusters and the second to multivariate finite population sampling within clusters. We solve two generalised eigenvalue problems to show how to generalise the univariate to the multivariate: each of the two FPCs attaches to one, and only one, of the two eigenbases.
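
The univariate decomposition described above can be sketched with the classical design-based formula for equal-size clusters sampled without replacement at both stages, V = (1 - n/N) S_b²/n + (1 - m/M) S_w²/(nm). This is a textbook illustration, not the Bayes linear adjustment developed in the paper; the function name, argument names, and numbers are assumptions made for the example.

```python
def two_stage_mean_variance(s2_between, s2_within, n, N, m, M):
    """Variance of the sample mean under two-stage sampling.

    Assumes N clusters of equal size M, with n clusters sampled and
    m units sampled per selected cluster, both without replacement.
    Returns (between-cluster term, within-cluster term, total).
    """
    f1 = n / N  # sampling fraction of clusters
    f2 = m / M  # sampling fraction within each cluster
    between = (1 - f1) * s2_between / n
    within = (1 - f2) * s2_within / (n * m)
    return between, within, between + within


# Hypothetical values: 10 of 50 clusters, 5 of 20 units per cluster.
between, within, total = two_stage_mean_variance(
    s2_between=4.0, s2_within=9.0, n=10, N=50, m=5, M=20
)
print(f"between = {between:.3f}, within = {within:.3f}, total = {total:.3f}")
```

Each finite population correction multiplies exactly one term: 1 - n/N scales the between-cluster part and 1 - m/M scales the within-cluster part.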

10.
In the health and social sciences, researchers often encounter categorical data for which complexities come from a nested hierarchy and/or cross-classification for the sampling structure. A common feature of these studies is a non-standard data structure with repeated measurements which may have some degree of clustering. In this paper, methodology is presented for the joint estimation of quantities of interest in the context of a stratified two-stage sample with bivariate dichotomous data. These quantities are the mean value π of an observed dichotomous response for a certain condition or time-point and a set of correlation coefficients for intra-cluster association for each condition or time period and for inter-condition correlation within and among clusters. The methodology uses the cluster means and pairwise joint probability parameters from each cluster. They together provide appropriate information across clusters for the estimation of the correlation coefficients.

11.
The seemingly unrelated regression model is viewed in the context of repeated measures analysis. Regression parameters and the variance-covariance matrix of the seemingly unrelated regression model can be estimated by using two-stage Aitken estimation. The first stage is to obtain a consistent estimator of the variance-covariance matrix. The second stage uses this matrix to obtain the generalized least squares estimators of the regression parameters. The maximum likelihood (ML) estimators of the regression parameters can be obtained by performing the two-stage estimation iteratively. The iterative two-stage estimation procedure is shown to be equivalent to the EM algorithm (Dempster, Laird, and Rubin, 1977) proposed by Jennrich and Schluchter (1986) and Laird, Lange, and Stram (1987) for repeated measures data. The equivalence of the iterative two-stage estimator and the ML estimator has been previously demonstrated empirically in a Monte Carlo study by Kmenta and Gilbert (1968). It does not appear to be widely known that the two estimators are equivalent theoretically. This paper demonstrates this equivalence.

12.
This paper considers a regression model in which coefficients obtained from a previous regression are themselves the object of analysis. It is shown that the parameters of interest can be obtained in two ways: pooling across observations and subsamples, or a two-stage process of first estimating the coefficients within each subsample, and then using these coefficients as dependent variables in a second stage regression. The relative properties of these estimators are analyzed, and the conditions under which the two estimators are equivalent are derived.
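
The two-stage process can be sketched as follows, assuming a single regressor x within each subsample and a single subsample-level covariate z that drives the slope. The data are noise-free and entirely made up, and the names `ols_line`, `z`, and `subsamples` are hypothetical; the sketch illustrates only the mechanics, not the estimators' relative properties analyzed in the paper.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data: in subsample g, y = 1 + b_g * x with b_g = 2 + 0.5 * z_g.
z = [0.0, 1.0, 2.0, 3.0]
x = [0.0, 1.0, 2.0, 3.0, 4.0]
subsamples = [(x, [1.0 + (2.0 + 0.5 * zg) * xi for xi in x]) for zg in z]

# Stage 1: estimate the slope separately within each subsample.
first_stage_slopes = [ols_line(xs, ys)[1] for xs, ys in subsamples]

# Stage 2: regress the estimated slopes on the subsample covariate z.
gamma0, gamma1 = ols_line(z, first_stage_slopes)
print(f"gamma0 = {gamma0:.3f}, gamma1 = {gamma1:.3f}")
```

With real data the first-stage slopes carry estimation error of differing precision, so a weighted second-stage regression may be more appropriate than the unweighted fit used in this sketch.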

13.
Many experiments aim at populations with persons nested within clusters. Randomization to treatment conditions can be done at the cluster level or at the person level within each cluster. The latter may result in control group contamination, and cluster randomization is therefore often preferred in practice. This article models the control group contamination, calculates the required sample sizes for both levels of randomization, and gives the degree of contamination for which cluster randomization is preferable to randomization of persons within clusters. Moreover, it provides examples of situations where one has to make a choice between both levels of randomization.

14.
The performance of several test statistics for comparing vectors of proportions from certain survey data was compared. The statistics were used to analyze a subsample of data from the 'High School and Beyond' survey. These tests include the Wald test statistic $X^2_W$ and the modified Wald test statistic $F_W$, the chi-squared test statistic $X^2_{RSB}$ and its modification $F_{RSB}$, a test $X^2_{DMB}$ based on a probability model, and a method of moments approach, $X^2_H$. Data were also simulated based on a two-stage cluster sampling design, and the type I error level and the power of these tests were obtained for selected combinations of parameter values. The statistics $X^2_{DMB}$, $X^2_{RSB}$, $F_{RSB}$ and $X^2_H$ performed well both for a small number of clusters and for a small number of units within clusters. The power performance of these tests is quite stable. Approximate intervals were constructed for design effect constants. Methods of estimating these constants based on a normality assumption worked best.

15.
Motivated by a real-life problem, we develop a Two-Stage Cluster Sampling with Ranked Set Sampling (TSCRSS) design in the second stage for which we derive an unbiased estimator of population mean and its variance. An unbiased estimator of the variance of mean estimator is also derived. It is proved that the TSCRSS is more efficient, in the sense of having smaller variance, than the conventional two-stage cluster simple random sampling in which the second-stage sampling is with replacement. Using a simulation study on a real-life population, we show that the TSCRSS is more efficient than the conventional two-stage cluster sampling when simple random sampling without replacement is used in both stages.

16.
In many experimental situations, the average treatment performance within its own group is used as a benchmark to be compared with each individual treatment. Multiple comparison procedures with the average (MCA) are thus proposed. In this paper, the traditional MCA, the single-stage MCA and the two-stage MCA for normal distributions under heteroscedasticity are compared in a Monte Carlo simulation study. It was found that the two-stage MCA has shorter confidence length than the single-stage MCA in most cases and is also more robust for non-normal distributions. Therefore, the two-stage MCA is recommended. But when the additional samples at the second stage could be costly, the data-analysis oriented single-stage MCA can be used. A biometrical example to illustrate the single-stage MCA and the two-stage MCA with equal confidence length is also given in this article.

17.
Because of limitations of the univariate frailty model in analysis of multivariate survival data, a bivariate frailty model is introduced for the analysis of bivariate survival data. This provides tremendous flexibility especially in allowing negative associations between subjects within the same cluster. The approach involves incorporating into the model two possibly correlated frailties for each cluster. The bivariate lognormal distribution is used as the frailty distribution. The model is then generalized to multivariate survival data with two distinguished groups and also to alternating process data. A modified EM algorithm is developed with no requirement of specification of the baseline hazards. The estimators are generalized maximum likelihood estimators with subject-specific interpretation. The model is applied to a mental health study on evaluation of health policy effects for inpatient psychiatric care.

18.
Communications in Statistics - Theory and Methods, 2012, 41(16-17): 3278-3300
Under complex survey sampling, in particular when selection probabilities depend on the response variable (informative sampling), the sample and population distributions are different, possibly resulting in selection bias. This article addresses this problem by fitting two statistical models, namely the variance components model (a two-stage model) and the fixed effects model (a single-stage model) for one-way analysis of variance, under a complex survey design, for example, two-stage sampling, stratification, and unequal probability of selection. Classical theory underlying the use of the two-stage model involves simple random sampling for each of the two stages. In such cases the model in the sample, after sample selection, is the same as the model for the population before sample selection. When the selection probabilities are related to the values of the response variable, standard estimates of the population model parameters may be severely biased, leading possibly to false inference. The idea behind the approach is to extract the model holding for the sample data as a function of the model in the population and of the first order inclusion probabilities, and then to fit the sample model using analysis of variance, maximum likelihood, and pseudo maximum likelihood methods of estimation. The main feature of the proposed techniques is related to their behavior in terms of the informativeness parameter. We also show that the use of the population model that ignores the informative sampling design yields biased model fitting.

19.
This paper discusses the large sample theory of the two-stage Welsh's trimmed mean for the limited information simultaneous equations model. Besides having asymptotic normality, this trimmed mean, as the two-stage least squares estimator, is a generalized least squares estimator. It also acts as a robust Aitken estimator for the simultaneous equations model. Examples illustrate real data analysis and large sample inferences based on this trimmed mean.

20.
When one wants to check a tentatively proposed model for departures that are not well specified, looking at residuals is the most common diagnostic technique. Here, we investigate the use of Bayesian standardized residuals to detect unknown hierarchical structure. Asymptotic theory, also supported by simulations, shows that the use of Bayesian standardized residuals is effective when the within group correlation, ρ, is large. However, we show that standardized residuals may not detect hierarchical structure when ρ is small. Thus, if it is important to detect modest hierarchical structure (i.e., ρ small) one should use other diagnostic techniques in addition to the standardized residuals. We use “quality of care” data from the Patterns of Care Study, a two-stage cluster sample of patients undergoing radiation therapy for cervix cancer, to illustrate the potential use of these residuals to detect missing hierarchical structure.
