Similar literature
1.
Summary. The number of people to select within selected households has significant consequences for the conduct and output of household surveys. The operational and data quality implications of this choice are carefully considered in many surveys, but the effect on statistical efficiency is not well understood. The usual approach is to select all people in each selected household, where operational and data quality concerns make this feasible. If not, one person is usually selected from each selected household. We find that this strategy is not always justified, and we develop intermediate designs between these two extremes. Current practices were developed when household survey field procedures needed to be simple and robust; however, more complex designs are now feasible owing to the increasing use of computer-assisted interviewing. We develop more flexible designs by optimizing survey cost, based on a simple cost model, subject to a required variance for an estimator of population total. The innovation lies in the fact that household sample sizes are small integers, which creates challenges in both design and estimation. The new methods are evaluated empirically by using census and health survey data, showing considerable improvement over existing methods in some cases.

2.
On the planning and design of sample surveys
Surveys rely on structured questions used to map out reality, using sample observations from a population frame, into data that can be statistically analyzed. This paper focuses on the planning and design of surveys, making a distinction between individual surveys, household surveys and establishment surveys. Knowledge from cognitive science is used to provide guidelines on questionnaire design. Non-standard, but simple, statistical methods are described for analyzing survey results. The paper is based on experience gained by conducting over 150 customer satisfaction surveys in Europe, America and the Far East.

3.
Statistical simulation in survey statistics is usually based on repeatedly drawing samples from population data. Furthermore, population data may be used in courses on survey statistics to explain issues regarding, e.g., sampling designs. Since the availability of real population data is in general very limited, it is necessary to generate synthetic data for such applications. The simulated data need to be as realistic as possible, while at the same time ensuring data confidentiality. This paper proposes a method for generating close-to-reality population data for complex household surveys. The procedure consists of four steps: setting up the household structure, simulating categorical variables, simulating continuous variables and splitting continuous variables into different components. Not all four steps need to be performed, so the framework is applicable to a broad class of surveys. In addition, the proposed method is evaluated in an application to the European Union Statistics on Income and Living Conditions (EU-SILC).

4.
Data from large surveys are often supplemented with sampling weights that are designed to reflect unequal probabilities of response and selection inherent in complex survey sampling methods. We propose two methods for Bayesian estimation of parametric models in a setting where the survey data and the weights are available, but where information on how the weights were constructed is unavailable. The first approach is to simply replace the likelihood with the pseudo likelihood in the formulation of Bayes' theorem. This is proven to lead to a consistent estimator but also leads to credible intervals that suffer from systematic undercoverage. Our second approach involves using the weights to generate a representative sample which is integrated into a Markov chain Monte Carlo (MCMC) or other simulation algorithm designed to estimate the parameters of the model. In extensive simulation studies, the latter methodology is shown to achieve performance comparable to the standard frequentist solution of pseudo maximum likelihood, with the added advantage of being applicable to models that require inference via MCMC. The methodology is demonstrated further by fitting a mixture of gamma densities to a sample of Australian household income.

5.
Summary. Over the past few years surveys have expanded to new populations, have incorporated measurement of new and more complex substantive issues and have adopted new data collection tools. At the same time there has been a growing reluctance among many household populations to participate in surveys. These factors have combined to present survey designers and survey researchers with increased uncertainty about the performance of any given survey design at any particular point in time. This uncertainty has, in turn, challenged the survey practitioner's ability to control the cost of data collection and quality of resulting statistics. The development of computer-assisted methods for data collection has provided survey researchers with tools to capture a variety of process data ('paradata') that can be used to inform cost–quality trade-off decisions in real time. The ability to monitor continually the streams of process data and survey data creates the opportunity to alter the design during the course of data collection to improve survey cost efficiency and to achieve more precise, less biased estimates. We label such surveys as 'responsive designs'. The paper defines responsive design and uses examples to illustrate the responsive use of paradata to guide mid-survey decisions affecting the non-response, measurement and sampling variance properties of resulting statistics.

6.
Summary. In sample surveys of finite populations, subpopulations for which the sample size is too small for estimation of adequate precision are referred to as small domains. Demand for small domain estimates has been growing in recent years among users of survey data. We explore the possibility of enhancing the precision of domain estimators by combining comparable information collected in multiple surveys of the same population. For this, we propose a regression method of estimation that is essentially an extended calibration procedure whereby comparable domain estimates from the various surveys are calibrated to each other. We show through analytic results and an empirical study that this method may greatly improve the precision of domain estimators for the variables that are common to these surveys, as these estimators make effective use of increased sample size for the common survey items. The design-based direct estimators proposed involve only domain-specific data on the variables of interest. This is in contrast with small domain (mostly small area) indirect estimators, based on a single survey, which incorporate through modelling data that are external to the targeted small domains. The approach proposed is also highly effective in handling the closely related problem of estimation for rare population characteristics.

7.
Important empirical information on household behavior and finances is obtained from surveys, and these data are used heavily by researchers and central banks and for policy consulting. However, various interdependent factors that can be controlled only to a limited extent lead to unit and item nonresponse, and missing data on certain items are a frequent source of difficulties in statistical practice. More than ever, it is important to explore techniques for the imputation of large survey data. This paper presents the theoretical underpinnings of a Markov chain Monte Carlo multiple imputation procedure and outlines important technical aspects of the application of MCMC-type algorithms to large socio-economic data sets. In an illustrative application it is found that MCMC algorithms have good convergence properties even on large data sets with complex patterns of missingness, and that the use of a rich set of covariates in the imputation models has a substantial effect on the distributions of key financial variables.

8.
Summary. Statistical agencies make changes to the data collection methodology of their surveys to improve the quality of the data collected or to improve the efficiency with which they are collected. For reasons of cost it may not be possible to estimate the effect of such a change on survey estimates or response rates reliably, without conducting an experiment that is embedded in the survey which involves enumerating some respondents by using the new method and some under the existing method. Embedded experiments are often designed for repeated and overlapping surveys; however, previous methods use sample data from only one occasion. The paper focuses on estimating the effect of a methodological change on estimates in the case of repeated surveys with overlapping samples from several occasions. Efficient design of an embedded experiment that covers more than one time point is also mentioned. All inference is unbiased over an assumed measurement model, the experimental design and the complex sample design. Other benefits of the approach proposed include the following: it exploits the correlation between the samples on each occasion to improve estimates of treatment effects; treatment effects are allowed to vary over time; it is robust against incorrectly rejecting the null hypothesis of no treatment effect; it allows a wide set of alternative experimental designs. This paper applies the methodology proposed to the Australian Labour Force Survey to measure the effect of replacing pen-and-paper interviewing with computer-assisted interviewing. This application considered alternative experimental designs in terms of their statistical efficiency and their risks to maintaining a consistent series. The approach proposed is significantly more efficient than using only 1 month of sample data in estimation.

9.
In non-experimental research, data on the same population process may be collected simultaneously by more than one instrument. For example, in the present application, two sample surveys and a population birth registration system all collect observations on first births by age and year, while the two surveys additionally collect information on women’s education. To make maximum use of the three data sources, the survey data are pooled and the population data introduced as constraints in a logistic regression equation. Reductions in standard errors about the age and birth-cohort parameters of the regression equation in the order of three-quarters are obtained by introducing the population data as constraints. A halving of the standard errors about the education parameters is achieved by pooling observations from the larger survey dataset with those from the smaller survey. The percentage reduction in the standard errors through imposing population constraints is independent of the total survey sample size.

10.
Survey statisticians make use of auxiliary information to improve estimates. One important example is calibration estimation, which constructs new weights that match benchmark constraints on auxiliary variables while remaining “close” to the design weights. Multiple-frame surveys are increasingly used by statistical agencies and private organizations to reduce sampling costs and/or avoid frame undercoverage errors. Several ways of combining estimates derived from such frames have been proposed elsewhere; in this paper, we extend the calibration paradigm, previously used for single-frame surveys, to estimate the total of a variable of interest in a dual-frame survey. Calibration is a general tool that allows auxiliary information from two frames to be included. It also incorporates, as a special case, certain dual-frame estimators that have been proposed previously. The theoretical properties of our class of estimators are derived and discussed, and simulation studies are conducted to compare the efficiency of the procedure using different sets of auxiliary variables. Finally, the proposed methodology is applied to real data obtained from the Barometer of Culture of Andalusia survey.

11.
Misclassifications in binary responses have long been a common problem in medical and health surveys. One way to handle misclassifications in clustered or longitudinal data is to incorporate the misclassification model through the generalized estimating equation (GEE) approach. However, existing methods are developed under a non-survey setting and cannot be used directly for complex survey data. We propose a pseudo-GEE method for the analysis of binary survey responses with misclassifications. We focus on cluster sampling and develop analysis strategies for analyzing binary survey responses with different forms of additional information for the misclassification process. The proposed methodology has several attractive features, including simultaneous inferences for both the response model and the association parameters. Finite sample performance of the proposed estimators is evaluated through simulation studies and an application using a real dataset from the Canadian Longitudinal Study on Aging.

12.
The author compares aspects of voluntary and involuntary sample surveys in West Germany. "The German microcensus as a non-voluntary survey draws a random sample from the total population which includes persons that would also respond in a voluntary survey (respondents) and persons that would not respond (non-respondents). The population of a voluntary survey, however, includes only respondents. Hence, statistical inference from a voluntary sample survey is only valid for the total population, if the population of respondents does not differ from the total population. This null hypothesis must be rejected from the comparisons of data from the German microcensus of 1985, 1986 and 1987 and corresponding voluntary test sample surveys. The discrepancies are great in central demographic and socio-economic variables such as region of residence, community size, age, marital status, income and social security." (SUMMARY IN ENG)

13.
Computer-assisted telephone interviewing and random digit dialling are increasingly being used to conduct household surveys in Australia. However, there is little published information concerning Australian experience with such surveys. In 1995 the Government Statistician's Office in Queensland conducted a household survey to study population migration using these techniques. The survey involved a sample of 110 000 telephone numbers resulting in 38 000 responding households. This article describes a computerized survey management system that was developed for the survey and that provided information concerning important operational and quality aspects of the survey.

14.
The problem of designing large-scale crop surveys which utilize Landsat data is addressed in this paper. Emphasis is on stratification and sampling approaches designed to achieve preselected precisions while permitting manageable data processing and survey costs. A dynamic area sampling frame which is versatile and highly suited for sampling of Landsat data is discussed, along with the use of rotation sampling for incorporating multiple years of information into the area estimation.

15.
Longitudinal surveys have emerged in recent years as an important data collection tool for population studies where the primary interest is to examine population changes over time at the individual level. Longitudinal data are often analyzed through the generalized estimating equations (GEE) approach. The vast majority of existing literature on the GEE method, however, is developed under non‐survey settings and is inappropriate for data collected through complex sampling designs. In this paper the authors develop a pseudo‐GEE approach for the analysis of survey data. They show that survey weights must and can be appropriately accounted for in the GEE method under a joint randomization framework. The consistency of the resulting pseudo‐GEE estimators is established under the proposed framework. Linearization variance estimators are developed for the pseudo‐GEE estimators when the finite population sampling fractions are small or negligible, a scenario that often holds for large‐scale surveys. Finite sample performances of the proposed estimators are investigated through an extensive simulation study using data from the National Longitudinal Survey of Children and Youth. The results show that the pseudo‐GEE estimators and the linearization variance estimators perform well under several sampling designs and for both continuous and binary responses. The Canadian Journal of Statistics 38: 540–554; 2010 © 2010 Statistical Society of Canada

16.
In studies about sensitive characteristics, randomized response (RR) methods are useful for generating reliable data while protecting respondents’ privacy. It is shown that all RR surveys for estimating a proportion can be encompassed in a common model and that some general results for statistical inference can be used for any given survey. The concepts of design and scheme are introduced for characterizing RR surveys. Some consequences of comparing RR designs based on statistical measures of efficiency and respondents’ protection are discussed. In particular, such comparisons lead to designs that may not be suitable in practice. It is suggested that one should consider other criteria and the scheme parameters when planning an RR survey.

17.
Summary. The first British National Survey of Sexual Attitudes and Lifestyles (NATSAL) was conducted in 1990–1991 and the second in 1999–2001. When surveys are repeated, the changes in population parameters are of interest and are generally estimated from a comparison of the data between surveys. However, since all surveys may be subject to bias, such comparisons may partly reflect a change in bias. Typically limited external data are available to estimate the change in bias directly. However, one approach, which is often possible, is to define in each survey a sample of participants who are eligible for both surveys, and then to compare the reporting of selected events that occurred before the earlier survey time point. A difference in reporting suggests a change in overall survey bias between time points, although other explanations are possible. In NATSAL, changes in bias are likely to be similar for groups of sexual experiences. The grouping of experiences allows the information that is derived from the selected events to be incorporated into inference concerning population changes in other sexual experiences. We use generalized estimating equations, which incorporate weighting for differential probabilities of sampling and non-response in a relatively straightforward manner. The results, combined with estimates of the change in reporting, are used to derive minimum established population changes, based on NATSAL data. For some key population parameters, the change in reporting is seen to be consistent with a change in bias alone. Recommendations are made for the design of future surveys.

18.
Before releasing survey data, statistical agencies usually perturb the original data to keep each survey unit's information confidential. One significant concern in releasing survey microdata is identity disclosure, which occurs when an intruder correctly identifies the records of a survey unit by matching the values of some key (or pseudo-identifying) variables. We examine a recently developed post-randomization method for strict control of identification risks in releasing survey microdata. While that procedure preserves the observed frequencies well, and hence statistical estimates, in the case of simple random sampling, we show that in general surveys it may induce considerable bias in commonly used survey-weighted estimators. We propose a modified procedure that better preserves weighted estimates. The procedure is illustrated and empirically assessed with an application to a publicly available US Census Bureau data set.

19.
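Liu Jianping, Luo Wei. Statistical Research (统计研究), 2016, 33(8): 3-11
An integrated design for household surveys involves both considering all household surveys as a whole and linking them coherently with censuses and administrative records. First, drawing on international experience and taking China's actual circumstances into account, the paper puts forward two basic requirements for an integrated design of China's household surveys. Second, it constructs a basic framework for that integrated design. Finally, making full use of the channels and mechanisms of the current national survey system, it streamlines and consolidates household survey programmes according to the characteristics and inherent logical relationships of their content, forming a household survey system centred on the labour force survey and the survey of household income, expenditure and living conditions, and it outlines an integrated design approach for China's household surveys built around a master sample.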

20.
We propose an orthogonal series density estimator for complex surveys, where samples are neither independent nor identically distributed. The proposed estimator is proved to be design-unbiased and asymptotically design-consistent. The asymptotic normality is proved under both design and combined spaces. Two data-driven estimators are proposed based on the proposed oracle estimator. We show the efficiency of the proposed estimators in simulation studies. A real survey data example is provided for illustration.
