Similar articles
 20 similar articles found (search took 15 ms)
1.
Summary.  Statistical agencies make changes to the data collection methodology of their surveys to improve the quality of the data collected or the efficiency with which they are collected. For reasons of cost it may not be possible to estimate the effect of such a change on survey estimates or response rates reliably without conducting an experiment embedded in the survey, in which some respondents are enumerated by the new method and some under the existing method. Embedded experiments are often designed for repeated and overlapping surveys; however, previous methods use sample data from only one occasion. The paper focuses on estimating the effect of a methodological change on estimates in the case of repeated surveys with overlapping samples from several occasions. Efficient design of an embedded experiment that covers more than one time point is also considered. All inference is unbiased over an assumed measurement model, the experimental design and the complex sample design. Other benefits of the approach proposed include the following: it exploits the correlation between the samples on each occasion to improve estimates of treatment effects; treatment effects are allowed to vary over time; it is robust against incorrectly rejecting the null hypothesis of no treatment effect; and it allows a wide set of alternative experimental designs. The paper applies the methodology proposed to the Australian Labour Force Survey to measure the effect of replacing pen-and-paper interviewing with computer-assisted interviewing. This application considered alternative experimental designs in terms of their statistical efficiency and their risks to maintaining a consistent series. The approach proposed is significantly more efficient than using only 1 month of sample data in estimation.
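A minimal sketch of the multi-occasion idea, under strong simplifying assumptions: each month the embedded experiment yields its own treatment-effect estimate, and the monthly estimates are pooled by inverse-variance weighting. This ignores the overlap correlation the paper exploits (so it understates the achievable gain), and all distributions and sample sizes are invented for illustration.

```python
import numpy as np

def pooled_effect(diffs, variances):
    """Inverse-variance pooling of per-occasion treatment-effect estimates."""
    w = 1.0 / np.asarray(variances)
    return float(np.sum(w * diffs) / np.sum(w)), float(1.0 / np.sum(w))

rng = np.random.default_rng(3)
true_effect, months, n = 0.5, 8, 500
diffs, variances = [], []
for _ in range(months):
    new = rng.normal(true_effect, 1.0, n)   # respondents enumerated by the new method
    old = rng.normal(0.0, 1.0, n)           # respondents under the existing method
    diffs.append(new.mean() - old.mean())
    variances.append(new.var(ddof=1) / n + old.var(ddof=1) / n)

est, var = pooled_effect(diffs, variances)  # pooled estimate beats any single month
```

Even this naive pooling shows why using several occasions is more efficient than estimating from one month of sample data alone.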

2.
刘建平, 罗薇. 《统计研究》2016, 33(8): 3-11
Integrated design of household surveys involves both overall coordination across the individual household surveys and their systematic linkage with censuses and administrative records. First, drawing on international experience and on China's actual conditions, two basic requirements for an integrated design of China's household surveys are proposed. Next, a basic framework for such an integrated design is constructed. Finally, making full use of the channels and mechanisms of the current national survey system, the household survey programmes are streamlined and integrated according to the characteristics of their content and their internal logical relationships, yielding a household survey system centred on the labour force survey and the household income, expenditure and living conditions survey; an integrated design approach for China's household surveys, built around a master sample, is then set out.

3.
It is of essential importance that researchers have access to linked employer–employee data, but such data sets are rarely available to researchers or the public. Even when survey data have been made available, the evaluation of estimation methods is usually done by complex design-based simulation studies. For this purpose, data at the population level are needed so that the true parameters are known and can be compared with the estimates derived from complex samples. These samples are usually drawn from the population under various sampling designs, missing-value and outlier scenarios. The structural earnings statistics sample survey provides accurate and harmonized data on the level and structure of remuneration of employees, their individual characteristics and the enterprise or place of employment to which they belong, for EU member states and candidate countries. On the basis of this data set, we show how to simulate a synthetic close-to-reality population representing the employer and employee structure of Austria. The proposed simulation builds on the work of A. Alfons, S. Kraft, M. Templ, and P. Filzmoser [On the simulation of complex universes in the case of applying the German microcensus, DACSEIS research paper series No. 4, University of Tübingen, 2003] and R. Münnich and J. Schürle [Simulation of close-to-reality population data for household surveys with application to EU-SILC, Statistical Methods & Applications 20(3) (2011), pp. 383–407]. New challenges arise, however, from the special structure of employer–employee data and from the complexity induced by the underlying two-stage design of the survey. Using quality measures in the form of simple summary statistics, benchmarking indicators and visualizations, the simulated population is analysed and evaluated. An accompanying literature study was carried out to select the most important benchmarking indicators.

4.
Small area statistics obtained from sample survey data provide a critical source of information used to study health, economic, and sociological trends. However, most large-scale sample surveys are not designed for the purpose of producing small area statistics. Moreover, data disseminators are prevented from releasing public-use microdata for small geographic areas for disclosure reasons, thus limiting the utility of the data they collect. This research evaluates a synthetic data method, intended for data disseminators, for releasing public-use microdata for small geographic areas based on complex sample survey data. The method replaces all observed survey values with synthetic (or imputed) values generated from a hierarchical Bayesian model that explicitly accounts for complex sample design features, including stratification, clustering, and sampling weights. The method is applied to restricted microdata from the National Health Interview Survey, and synthetic data are generated for both sampled and non-sampled small areas. The analytic validity of the resulting small area inferences is assessed by direct comparison with the actual data, a simulation study, and a cross-validation study.

5.
We present a new inverse sampling design for surveys of rare events: gap-based inverse sampling. Under the design, sampling stops if, after a predetermined interval, or gap, no new rare events are found; the length of the gap that follows the finding of a rare event thus limits the sampling effort. We present stopping rules based on the gap length, the total number of rare events found, and a fixed upper limit on survey effort. We illustrate the use of the design with stratified sampling of two biological populations. The design mimics the intuitive behaviour of a field biologist in stratified sampling: if nothing is found in a stratum after a long search, the surveyor would like to conclude that the stratum is empty and stop searching. Our design is therefore appealing for surveying rare events (for example, a rare species) with stratified sampling where some strata are likely to be completely empty.
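The stopping rules described above can be sketched directly. This is an illustrative implementation, not the paper's exact procedure: `gap`, `max_events` and `max_effort` are the three stopping criteria, and the within-stratum search order is randomised.

```python
import random

def gap_inverse_sample(stratum, gap, max_events, max_effort, rng=None):
    """Search units in `stratum` (booleans: True = rare event present) one at a
    time. Stop when `gap` consecutive units yield no event, when `max_events`
    events have been found, or when `max_effort` units have been inspected."""
    rng = rng or random.Random(0)
    order = list(range(len(stratum)))
    rng.shuffle(order)                     # random search order within the stratum
    found, since_last, effort = 0, 0, 0
    for idx in order:
        if effort >= max_effort:
            break
        effort += 1
        if stratum[idx]:
            found += 1
            since_last = 0
            if found >= max_events:
                break
        else:
            since_last += 1
            if since_last >= gap:          # long run with nothing: give up on the stratum
                break
    return found, effort

# An empty stratum is abandoned after exactly `gap` inspections.
empty = [False] * 500
found, effort = gap_inverse_sample(empty, gap=25, max_events=10, max_effort=200)
```

This captures the appeal for empty strata: effort spent on a stratum with no events is bounded by the gap length rather than by the stratum size.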

6.
Scientists in every discipline are generating data more rapidly than ever before, resulting in an increasing need for statistical skills at a time when the field of statistics is losing visibility. Resolving this paradox requires stronger statistical leadership to guide multidisciplinary teams in the design and planning of scientific research and in making decisions based on data. It requires more effective communication to nonstatisticians of the value of statistics in using data to answer questions, predict outcomes, and support decision-making in the face of uncertainty. It also requires a greater appreciation of the unique capabilities of alternative quantitative disciplines such as machine learning, data science, pharmacometrics, and bioinformatics, which represent an opportunity for statisticians to achieve greater impact through collaborative partnership. Examples taken from pharmaceutical drug development are used to illustrate the concept of statistical leadership in a collaborative multidisciplinary team environment.

7.
Summary.  Complex survey sampling is often used to sample a fraction of a large finite population. In general, the survey is conducted so that each unit (e.g. subject) in the sample has a different probability of being selected into the sample. For generalizability of the sample to the population, both the design and the probability of being selected into the sample must be incorporated in the analysis. In this paper we focus on non-standard regression models for complex survey data. In our motivating example, which is based on data from the Medical Expenditure Panel Survey, the outcome variable is the subject's 'total health care expenditures in the year 2002'. Previous analyses of medical cost data suggest that the variance is approximately equal to the mean raised to the power of 1.5, which is a non-standard variance function. Currently, the regression parameters for this model cannot be easily estimated in standard statistical software packages. We propose a simple two-step method to obtain consistent regression parameter and variance estimates; the method proposed can be implemented within any standard sample survey package. The approach is applicable to complex sample surveys with any number of stages.
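The variance function V(mu) = mu^1.5 with a log link can be fitted by quasi-likelihood via iteratively reweighted least squares. The sketch below is a stand-in for the paper's two-step survey method, assuming simple random sampling (the survey weights and design stages are omitted), with simulated cost-like data whose names and parameters are invented.

```python
import numpy as np

def power_variance_irls(X, y, power=1.5, n_iter=50):
    """Quasi-likelihood fit with log link and variance function V(mu) = mu**power,
    via iteratively reweighted least squares. Ignores the survey design; a
    design-consistent version would carry sampling weights into `w`."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())            # start from the intercept-only fit
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        w = mu ** (2.0 - power)           # IRLS weight mu^2 / V(mu)
        z = eta + (y - mu) / mu           # working response for the log link
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, X.T @ (w * z))
    return beta

rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(0, 1, n)
X = np.column_stack([np.ones(n), x])
mu_true = np.exp(1.0 + 0.5 * x)
y = rng.gamma(shape=4.0, scale=mu_true / 4.0)   # positive costs with mean mu_true
beta_hat = power_variance_irls(X, y)
```

Quasi-likelihood estimates of the mean parameters are consistent even when the working variance function is only approximate, which is why a misspecified power causes a loss of efficiency rather than bias.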

8.
In any other circumstance, it might make sense to define the extent of the terrain (Data Science) first, and then locate and describe the landmarks (Principles). But this data revolution we are experiencing defies a cadastral survey. Areas are continually being annexed into Data Science. For example, biometrics was traditionally statistics for agriculture in all its forms but now, in Data Science, it means the study of characteristics that can be used to identify an individual. Examples of non-intrusive measurements include height, weight, fingerprints, retina scan, voice, photograph/video (facial landmarks and facial expressions) and gait. A multivariate analysis of such data would be a complex project for a statistician, but a software engineer might appear to have no trouble with it at all. In any applied-statistics project, the statistician worries about uncertainty and quantifies it by modelling data as realisations generated from a probability space. Another approach to uncertainty quantification is to find similar data sets, and then use the variability of results between these data sets to capture the uncertainty. Both approaches allow ‘error bars’ to be put on estimates obtained from the original data set, although the interpretations are different. A third approach, that concentrates on giving a single answer and gives up on uncertainty quantification, could be considered as Data Engineering, although it has staked a claim in the Data Science terrain. This article presents a few (actually nine) statistical principles for data scientists that have helped me, and continue to help me, when I work on complex interdisciplinary projects.

9.
Summary.  Using mobile phones to conduct survey interviews has gathered momentum recently. However, using mobile telephones in surveys poses many new challenges. One important challenge involves properly classifying final case dispositions to understand response rates and non-response error and to implement responsive survey designs. Both purposes demand accurate assessments of the outcomes of individual call attempts. By looking at actual practices across three countries, we suggest how the disposition codes of the American Association for Public Opinion Research, which have been developed for telephone surveys, can be modified to fit mobile phones. Adding an international dimension to these standard definitions will improve survey methods by making systematic comparisons across different contexts possible.

10.
Summary.  The number of people to select within selected households has significant consequences for the conduct and output of household surveys. The operational and data quality implications of this choice are carefully considered in many surveys, but the effect on statistical efficiency is not well understood. The usual approach is to select all people in each selected household, where operational and data quality concerns make this feasible. If not, one person is usually selected from each selected household. We find that this strategy is not always justified, and we develop intermediate designs between these two extremes. Current practices were developed when household survey field procedures needed to be simple and robust; however, more complex designs are now feasible owing to the increasing use of computer-assisted interviewing. We develop more flexible designs by optimizing survey cost, based on a simple cost model, subject to a required variance for an estimator of population total. The innovation lies in the fact that household sample sizes are small integers, which creates challenges in both design and estimation. The new methods are evaluated empirically by using census and health survey data, showing considerable improvement over existing methods in some cases.
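The cost-versus-variance trade-off over small integer household sample sizes can be illustrated with a textbook clustering model. This is a sketch under assumed forms, not the paper's cost or variance model: cost is linear in households and persons, and the variance of a mean inflates by the design effect 1 + (k-1)*rho for k persons per household with intra-household correlation rho. All numbers are invented.

```python
import math

def optimal_design(c_house, c_person, rho, sigma2, var_target, k_max=8):
    """Choose integer persons-per-household k and number of households m that
    minimise cost c_house*m + c_person*m*k subject to
    Var(mean) = sigma2 * (1 + (k-1)*rho) / (m*k) <= var_target."""
    best = None
    for k in range(1, k_max + 1):
        deff = 1.0 + (k - 1) * rho        # design effect from household clustering
        m = math.ceil(sigma2 * deff / (var_target * k))
        cost = c_house * m + c_person * m * k
        if best is None or cost < best[0]:
            best = (cost, m, k)
    return best

cost, m, k = optimal_design(c_house=10.0, c_person=2.0, rho=0.3,
                            sigma2=1.0, var_target=0.001)
```

With these illustrative costs the optimum is an intermediate design (k = 3 persons per household), neither "everyone in the household" nor "one person per household", which is the paper's central point.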

11.
The development of randomized response models for personal interview surveys has attracted much attention since the pioneering work of Warner [1965. Randomized response: a survey technique for eliminating evasive answer bias. J. Amer. Statist. Assoc. 60, 63–69]. Several randomized response models have been developed for collecting data on both qualitative and quantitative variables, but none of these models handles matched-pair data. In this paper, we develop a new randomized response model and study its application to an important political question.
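For readers unfamiliar with Warner's original device, here is a small simulation of it (the cited 1965 model, not this paper's new matched-pair model). Each respondent privately spins a randomiser: with probability p they answer the sensitive question, otherwise its complement, so the interviewer learns only the yes/no answer, never which question was asked.

```python
import random

def warner_estimate(answers, p):
    """Warner (1965): P(yes) = p*pi + (1-p)*(1-pi), so the prevalence of the
    sensitive trait is pi = (P(yes) - (1-p)) / (2p - 1), for p != 0.5."""
    lam = sum(answers) / len(answers)
    return (lam - (1.0 - p)) / (2.0 * p - 1.0)

rng = random.Random(42)
p, pi_true, n = 0.7, 0.2, 100_000
answers = []
for _ in range(n):
    sensitive = rng.random() < pi_true       # respondent's true status (never revealed)
    asks_sensitive = rng.random() < p        # spinner outcome, unseen by interviewer
    answers.append(sensitive if asks_sensitive else not sensitive)

pi_hat = warner_estimate(answers, p)
```

Privacy comes at a price: the randomisation inflates the variance of `pi_hat` relative to direct questioning, which is the trade-off later models refine.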

12.
In recent years an increase in nonresponse rates in major government and social surveys has been observed. It is thought that decreasing response rates and changes in nonresponse bias may affect, potentially severely, the quality of survey data. This paper discusses the problem of unit and item nonresponse in government surveys from an applied perspective and highlights some newer developments in this field with a focus on official statistics in the United Kingdom (UK). The main focus of the paper is on post-survey adjustment methods, in particular adjustment for item nonresponse. The use of various imputation and weighting methods is discussed in an example. The application also illustrates the close relationship between missing data and measurement error. JEL classification C42, C81
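One widely used item-nonresponse adjustment of the kind discussed above is hot-deck imputation within adjustment classes. The sketch below is a generic illustration, not one of the paper's specific UK applications; the variable names and classes are invented.

```python
import random

def hotdeck_impute(values, classes, rng=None):
    """Within-class random hot-deck: each missing value (None) is replaced by a
    donor value drawn at random from observed respondents in the same
    adjustment class."""
    rng = rng or random.Random(0)
    donors = {}
    for v, c in zip(values, classes):
        if v is not None:
            donors.setdefault(c, []).append(v)   # pool of observed donors per class
    return [v if v is not None else rng.choice(donors[c])
            for v, c in zip(values, classes)]

values = [10, None, 12, 30, None, 31]            # item responses, None = nonresponse
classes = ["a", "a", "a", "b", "b", "b"]         # adjustment classes (e.g. age bands)
completed = hotdeck_impute(values, classes)
```

Because donors come from the same class as the recipient, the method preserves within-class distributions, but it assumes the data are missing at random given the class variable.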

13.
In recent years the focus of research in survey sampling has broadened to include a number of nontraditional topics such as nonsampling errors. In addition, the availability of data from large-scale sample surveys, along with computers and software to analyze the data, has changed the tools needed by survey sampling statisticians. It has also produced a diverse group of secondary data users who wish to learn how to analyze data from a complex survey. Thus it is time to reassess what we should be teaching students about survey sampling. This article brings together a panel of experts on survey sampling and teaching to discuss their views on what should be taught in survey sampling classes and how it should be taught.

14.
Many authors have shown that a combined analysis of data from two or more types of recapture survey brings advantages, such as the ability to provide more information about parameters of interest. For example, a combined analysis of annual resighting and monthly radio-telemetry data allows separate estimates of true survival and emigration rates, whereas only apparent survival can be estimated from the resighting data alone. For studies involving more than one type of survey, biologists should consider how to allocate the total budget to the surveys related to the different types of marks so as to gain optimal information from them. For example, since radio tags and subsequent monitoring are very costly, while leg bands are cheap, biologists should balance costs against information obtained in deciding how many animals should receive radios. Given a total budget and specific costs, it is possible to determine the allocation of sample sizes to different types of marks that minimizes the variance of parameters of interest, such as annual survival and emigration rates. In this paper, we propose a cost function for a study where all birds receive leg bands, a subset receives radio tags and all new releases occur at the start of the study. Using this cost function, we obtain the allocation of sample sizes to the two survey types that minimizes the standard error of survival rate estimates or, alternatively, the standard error of emigration rates. Given the proposed costs, we show that for a high resighting probability, e.g. 0.6, tagging roughly 10-40% of birds with radios gives survival estimates with standard errors within the minimum range. Lower resighting rates require a higher percentage of radioed birds. In addition, the proposed costs require tagging the maximum possible percentage of radioed birds to minimize the standard error of emigration estimates.
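The budget-allocation idea can be sketched with a deliberately simple toy model. The cost structure (every bird banded, a subset also radioed) follows the abstract, but the variance decomposition below is an assumption of the sketch, not the paper's formula: apparent-survival information comes from the effective band resightings, emigration information from the radioed birds, and the variance of true survival is taken proportional to the sum of the two reciprocals.

```python
def best_allocation(budget, c_band, c_radio, p_resight):
    """Toy model: n birds banded (cost c_band each), n_r of them also radioed
    (extra cost c_radio each). Variance proxy for true survival:
    1/(p_resight * band-only birds) + 1/(radioed birds). Enumerate integer n_r
    and return the allocation minimising this proxy within budget."""
    best = None
    for n_r in range(1, budget // (c_band + c_radio) + 1):
        n = (budget - c_radio * n_r) // c_band   # total birds affordable
        n_b = n - n_r                            # band-only birds
        if n_b < 1:
            break
        var = 1.0 / (p_resight * n_b) + 1.0 / n_r
        if best is None or var < best[0]:
            best = (var, n_r, n)
    return best

var_min, n_radio, n_total = best_allocation(budget=10_000, c_band=2,
                                            c_radio=150, p_resight=0.6)
```

Because radios are two orders of magnitude dearer than bands, the optimum radioes only a small fraction of the birds; the exact fraction depends entirely on the assumed variance model, so treat the numbers as illustrative.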

15.
In this paper, we consider sure independence feature screening for ultrahigh dimensional discriminant analysis. We propose a new method, named robust rank screening, based on the conditional expectation of the rank of a predictor's samples. We also establish the sure screening property for the proposed procedure under simple assumptions. The new procedure has some additional desirable characteristics. First, it is robust against heavy-tailed distributions, potential outliers and sample shortage in some categories. Second, it is model-free, requiring no specification of a regression model, and is directly applicable to situations with many categories. Third, it is simple in theoretical derivation owing to the boundedness of the resulting statistics. Fourth, it is relatively inexpensive in computational cost because of the simple structure of the screening index. Monte Carlo simulations and real data examples are used to demonstrate the finite sample performance.
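A simplified version of rank-based marginal screening can be written in a few lines. This sketch scores each predictor by the between-class spread of its class-wise mean ranks, which is one natural reading of "conditional expectation of the rank"; the paper's exact index and the data-generating setup below are assumptions of the illustration.

```python
import numpy as np

def rank_screen(X, y, d):
    """Rank each predictor's sample values, compute class-wise mean ranks, and
    score the predictor by the variance of those class means. Keep the d
    top-scoring predictors. Ranks make the score bounded and immune to
    monotone transformations and heavy tails."""
    n, p = X.shape
    classes = np.unique(y)
    scores = np.empty(p)
    for j in range(p):
        r = np.argsort(np.argsort(X[:, j])) / n          # scaled ranks in [0, 1)
        class_means = np.array([r[y == c].mean() for c in classes])
        scores[j] = np.var(class_means)                  # between-class spread
    return np.argsort(scores)[::-1][:d]

rng = np.random.default_rng(0)
n, p = 400, 50
y = rng.integers(0, 2, n)
X = rng.standard_t(df=1, size=(n, p))                    # Cauchy noise predictors
X[:, 3] += 3.0 * y                                       # predictor 3 is informative
selected = rank_screen(X, y, d=5)
```

Even with Cauchy-tailed noise, where moment-based screening statistics break down, the informative predictor is retained, illustrating the robustness claim.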

16.
Statistical simulation in survey statistics is usually based on repeatedly drawing samples from population data. Furthermore, population data may be used in courses on survey statistics to explain issues regarding, e.g., sampling designs. Since the availability of real population data is in general very limited, it is necessary to generate synthetic data for such applications. The simulated data need to be as realistic as possible, while at the same time ensuring data confidentiality. This paper proposes a method for generating close-to-reality population data for complex household surveys. The procedure consists of four steps for setting up the household structure, simulating categorical variables, simulating continuous variables and splitting continuous variables into different components. It is not required to perform all four steps so that the framework is applicable to a broad class of surveys. In addition, the proposed method is evaluated in an application to the European Union Statistics on Income and Living Conditions (EU-SILC).
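The four-step structure can be sketched end to end in miniature. Every distribution and parameter below is an invented placeholder (the actual method calibrates each step to real survey data, e.g. via multinomial models), so this shows only the shape of the pipeline, not its statistical content.

```python
import random

def simulate_population(n_households, rng=None):
    """Minimal sketch of the four-step framework: (1) household structure,
    (2) a categorical variable, (3) a continuous variable conditional on it,
    (4) splitting the continuous total into components."""
    rng = rng or random.Random(0)
    population = []
    for hh in range(n_households):
        size = rng.choices([1, 2, 3, 4], weights=[0.3, 0.35, 0.2, 0.15])[0]  # step 1
        for _ in range(size):
            econ = rng.choices(["employed", "other"], weights=[0.6, 0.4])[0]  # step 2
            base = 30_000 if econ == "employed" else 8_000
            income = rng.lognormvariate(0.0, 0.4) * base                      # step 3
            wage = income * (0.9 if econ == "employed" else 0.2)              # step 4
            population.append({"hh": hh, "econ": econ, "income": income,
                               "wage": wage, "other_income": income - wage})
    return population

pop = simulate_population(1000)
```

The split in step 4 keeps the components consistent with the total by construction, which is one reason the framework generates totals first and components afterwards.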

17.
In studies of sensitive characteristics, randomized response (RR) methods are useful for generating reliable data while protecting respondents' privacy. It is shown that all RR surveys for estimating a proportion can be encompassed in a common model, and some general results for statistical inference can be applied to any given survey. The concepts of design and scheme are introduced for characterizing RR surveys. Some consequences of comparing RR designs on statistical measures of efficiency and of respondent protection are discussed. In particular, such comparisons can lead to designs that are not suitable in practice. It is suggested that one should also consider other criteria and the scheme parameters when planning an RR survey.

18.
Serials Digest     
Abstract

A collection manager and an acquisition librarian discuss difficult decisions to be made regarding electronic resources and associated value-added services. Balancing budget constraints with patron demands for easy access to information requires librarians to reevaluate assumptions about the electronic products and associated services that have quickly become staples of library life, even as these staples become increasingly untenable. The authors scrutinize the cost/benefit of continuing value-added services, such as providing access to abstracting and indexing tools and full MARC cataloging records of journal titles, as well as considering the adoption of new services such as federated search engines and link resolvers.

19.
To collect sensitive data, survey statisticians have designed many strategies to reduce nonresponse rates and social desirability response bias. In recent years, the item count technique has gained considerable popularity and credibility as an alternative indirect questioning survey mode, and several variants of this technique have been proposed as new needs and challenges arise. The item sum technique (IST), introduced by Chaudhuri and Christofides (Indirect questioning in sample surveys, Springer-Verlag, Berlin, 2013) and Trappmann et al. (J Surv Stat Methodol 2:58–77, 2014), is one such variant, used to estimate the mean of a sensitive quantitative variable. In this approach, sampled units are asked to respond to one of two lists of items, the longer of which contains a sensitive question related to the study variable alongside various innocuous, nonsensitive questions. To the best of our knowledge, very few theoretical or applied papers have addressed the IST. In this article, therefore, we present certain methodological advances as a contribution to appraising the use of the IST in real-world surveys. In particular, we employ a generic sampling design to examine how to improve the estimates of the sensitive mean when auxiliary information on the population under study is available and is used at the design and estimation stages. A Horvitz–Thompson-type estimator and a calibration-type estimator are proposed and their efficiency is evaluated by means of an extensive simulation study. The simulation experiments show that estimates obtained by the IST are nearly equivalent to those obtained using "true data" and that in general they outperform the estimates provided by a competitive randomized response method; moreover, variance estimation may be considered satisfactory. These results open up new perspectives for academics, researchers and survey practitioners and could justify the use of the IST as a valid alternative to traditional direct questioning survey modes.
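The basic IST mechanics are easy to simulate. In this sketch (simple random sampling with two equal groups; the paper's generic designs and calibration estimators are not reproduced, and all items and distributions are invented), the long-list group reports the sum of a sensitive item and some innocuous items, the short-list group reports the sum of the innocuous items only, and the difference of group means estimates the sensitive mean.

```python
import random

def ist_estimate(long_totals, short_totals):
    """Item sum technique point estimate: difference of the two group means."""
    return (sum(long_totals) / len(long_totals)
            - sum(short_totals) / len(short_totals))

rng = random.Random(7)
n = 50_000

def innocuous_sum():                    # e.g. weekly TV hours + books read this year
    return rng.gauss(10.0, 2.0) + rng.gauss(5.0, 1.0)

def sensitive_item():                   # hidden quantitative variable, true mean 4
    return rng.gauss(4.0, 1.5)

long_totals = [innocuous_sum() + sensitive_item() for _ in range(n)]   # long-list group
short_totals = [innocuous_sum() for _ in range(n)]                     # short-list group
mu_hat = ist_estimate(long_totals, short_totals)
```

No individual response reveals the sensitive value, because each reported number is a sum; privacy protection improves with more variable innocuous items, at the cost of a larger variance for `mu_hat`.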

20.
"Uncertainty in statistics and demographic projections for aging and other policy purposes comes from four sources: differences in definitions, sampling error, nonsampling error, and scientific uncertainty. Some of these uncertainties can be reduced by proper planning and coordination, but most often decisions have to be made in the face of some remaining uncertainty. Although decision makers have a tendency to ignore uncertainty, doing so does not lead to good policy-making. Techniques for estimating and reporting on uncertainty include sampling theory, assessment of experts' subjective distributions, sensitivity analysis, and multiple independent estimates." The primary geographical focus is on the United States.  相似文献   
