Similar literature
1.
In 1991 Marsh and co-workers made the case for a sample of anonymized records (SAR) from the 1991 census of population. The case was accepted by the Office for National Statistics (then the Office of Population Censuses and Surveys) and a request was made by the Economic and Social Research Council to purchase the SARs. Two files were released for Great Britain: a 2% sample of individuals and a 1% sample of households. Subsequently similar samples were released for Northern Ireland. Since their release, the files have been heavily used for research and there has been no known breach of confidentiality. There is a considerable demand for similar files from the 2001 census, with specific requests for a larger sample size and lower population threshold for the individual SAR. This paper reassesses the analysis of Marsh and co-workers of the risk of identification of an individual or household in a sample of microdata from the 1991 census and also uses alternative ways of assessing risks with the 1991 SARs. The results of both the reassessment and the new analyses are reassuring and allow us to take the 1991 SARs as a base-line against which to assess proposals for changes to the size and structure of samples from the 2001 census.

2.
Summary. Top coding of extreme values of variables like income is a common method of statistical disclosure control, but it creates problems for the data analyst. The paper proposes two alternative methods to top coding for statistical disclosure control that are based on multiple imputation. We show in simulation studies that the multiple-imputation methods provide better inferences of the publicly released data than top coding, using straightforward multiple-imputation methods of analysis, while maintaining good statistical disclosure control properties. We illustrate the methods on data from the 1995 Chinese household income project.
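As a rough illustration of the top coding that this abstract argues against, the sketch below caps every value above a threshold. The function name `top_code`, the income figures and the threshold are made up for illustration and do not come from the paper; the paper's multiple-imputation alternatives are not reproduced here.

```python
from statistics import mean


def top_code(values, threshold):
    """Replace every value above `threshold` with the threshold itself."""
    return [min(v, threshold) for v in values]


# Hypothetical incomes (illustrative only, not data from the paper).
incomes = [12_000, 25_000, 40_000, 250_000]
released = top_code(incomes, 100_000)

# The released file hides the extreme value, but the analyst's mean is
# now biased downward relative to the confidential data.
print(released)
print(mean(incomes), mean(released))
```

The downward shift in the mean is one simple instance of the analytical problems that the abstract attributes to top coding.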

3.
Summary. Protection against disclosure is important for statistical agencies releasing microdata files from sample surveys. Simple measures of disclosure risk can provide useful evidence to support decisions about release. We propose a new measure of disclosure risk: the probability that a unique match between a microdata record and a population unit is correct. We argue that this measure has at least two advantages. First, we suggest that it may be a more realistic measure of risk than two measures that are currently used with census data. Second, we show that consistent inference (in a specified sense) may be made about this measure from sample data without strong modelling assumptions. This is a surprising finding, in its contrast with the properties of the two 'similar' established measures. As a result, this measure has potentially useful applications to sample surveys. In addition to obtaining a simple consistent predictor of the measure, we propose a simple variance estimator and show that it is consistent. We also consider the extension of inference to allow for certain complex sampling schemes. We present a numerical study based on 1991 census data for about 450 000 enumerated individuals in one area of Great Britain. We show that the theoretical results on the properties of the point predictor of the measure of risk and its variance estimator hold to a good approximation for these data.

4.
By using prior knowledge it may be possible to deduce pieces of individual information from a frequency distribution of a population. If the prior information is described by a stochastic model, an information-theoretic approach can be applied in order to judge the possibilities for disclosure. By specifying the stochastic model in various ways it is shown how the decrease in entropy caused by the publication of a frequency distribution can be determined and interpreted. The stochastic models are also used to derive formulae for disclosure risks and expected numbers of disclosures.

5.
Summary. Record linkage is a powerful tool to obtain individual follow-up information that is held in routinely collected databases. However, this method is potentially limited not only by the quality of the original data but also by the temporal and geographic coverage of the routine data. Migration in particular is a factor that might introduce systematic bias even in analyses of data covering relatively large geographical areas. We describe a linkage application where emigration bias might be an issue and use the sensitivity analysis approach that has been described by Molenberghs and co-workers and Kenward and co-workers to assess the extent of this bias.

6.
The statistics of linear models: back to basics (total citations: 2; self-citations: 0; citations by others: 2)

7.
For every discrete or continuous location-scale family having a square-integrable density, there is a unique continuous probability distribution on the unit interval that is determined by the density-quantile composition introduced by Parzen in 1979. These probability density quantiles (pdQs) only differ in shape, and can be usefully compared with the Hellinger distance or Kullback–Leibler divergences. Convergent empirical estimates of these pdQs are provided, which leads to a robust global fitting procedure of shape families to data. Asymmetry can be measured in terms of distance or divergence of pdQs from the symmetric class. Further, a precise classification of shapes by tail behaviour can be defined simply in terms of pdQ boundary derivatives.

8.
The use of Bayesian methods to support pharmaceutical product development has grown in recent years. In clinical statistics, the drive to provide faster access for patients to medical treatments has led to a heightened focus by industry and regulatory authorities on innovative clinical trial designs, including those that apply Bayesian methods. In nonclinical statistics, Bayesian applications have also made advances. However, they have been embraced far more slowly in the nonclinical area than in the clinical counterpart. In this article, we explore some of the reasons for this slower rate of adoption. We also present the results of a survey conducted for the purpose of understanding the current state of Bayesian application in nonclinical areas and for identifying areas of priority for the DIA/ASA-BIOP Nonclinical Bayesian Working Group. The survey explored current usage, hurdles, perceptions, and training needs for Bayesian methods among nonclinical statisticians. Based on the survey results, a set of recommendations is provided to help guide the future advancement of Bayesian applications in nonclinical pharmaceutical statistics.

9.
Control charts are widely used in industries to monitor a process for quality improvement. Evaluation of the average run length (ARL) or average time to signal (ATS) plays an important role in the design of control charts and performance comparison. In this paper, we review several basic and popular procedures, including the Markov chain and integral equation methods for computing ARL, ATS and associated run length distributions for cumulative sum charts, exponentially weighted moving average charts and combined control charts, respectively. Some important references and key formulations are provided for practitioners.

10.
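Although most industry-level studies conclude that FDI generates significant technology spillovers in developing countries, they overlook the spillovers that FDI generates in the spatial dimension. This paper therefore introduces cross-sectional location information for the sample, builds a spatial Durbin model, and empirically tests the spatial spillover effect of FDI on China's Malmquist productivity. The results show that FDI produces positive spillovers within regions and negative spillovers between regions, and that the total spatial spillover effect of FDI weakens total factor productivity. The spatial reach of FDI spillovers is limited: the effect is significant only for very close first-order neighbours.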

11.
We provide a comprehensive and critical review of Yates’ continuity correction for the normal approximation to the binomial distribution, in particular emphasizing its poor ability to approximate extreme tail probabilities. As an alternative method, we also review Cressie's finely tuned continuity correction. In addition, we demonstrate how Yates’ continuity correction is used to improve the coverage probability of binomial confidence limits, and propose new confidence limits by applying Cressie's continuity correction. These continuity correction methods are numerically compared and illustrated by data examples arising from industry and medicine.
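To make the comparison in this abstract concrete, the sketch below sets the exact binomial distribution function beside the normal approximation with and without the standard continuity correction of 0.5. The function names and the choice of n = 20, p = 0.5 are illustrative assumptions; Cressie's finely tuned correction is not implemented here.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def approx_cdf(k, n, p, correction=True):
    """Normal approximation to P(X <= k), optionally adding 0.5 to k."""
    mu = n * p
    sd = math.sqrt(n * p * (1 - p))
    shift = 0.5 if correction else 0.0
    return normal_cdf((k + shift - mu) / sd)


n, p = 20, 0.5
for k in (8, 1):  # a central value and an extreme lower-tail value
    exact = binom_cdf(k, n, p)
    print(k, exact, approx_cdf(k, n, p), approx_cdf(k, n, p, correction=False))
```

In this example the corrected approximation is close to the exact value near the centre, while at k = 1 it overstates the tiny exact tail probability several times over, which is the kind of tail behaviour the abstract criticises.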

12.
The aim of the paper is to characterize the factors that determine the transition from university to work as well as to evaluate the effectiveness of universities and course programmes with respect to the labour market outcomes of their graduates. The study is focused on the analysis of the time to obtain the first job, taking into account the graduates' characteristics and the effects pertaining to course programmes and universities. For this a three-level discrete time survival model is used, where the logit of the hazard, conditionally on the random effects at course programme and university level, is a linear function of the covariates. The analysis is accomplished by using a large data set from a survey on job opportunities for the 1992 Italian graduates.

13.
In recent years, several attempts have been made to characterize the generalized Pareto distributions (GPD) based on the properties of order statistics and record values. In the present article, we give some characterization results on GPD based on order statistics and generalized order statistics. Some characterizations of uniform distribution based on expectation of some functions of order statistics are also given.

14.
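The economic value of forestry carbon sinks is an important indicator that determines their production and trading. Using survey data on emission-controlled enterprises in Wenzhou, Zhejiang Province, this paper applies the contingent valuation method (CVM) and introduces the theory of planned behaviour to examine the economic value of forestry carbon sinks and its determinants from the perspective of these enterprises' willingness to pay. The results show that an enterprise's willingness to pay for forestry carbon sinks combines two decisions: "whether to pay" and "how much to pay". The individual characteristics of the responding manager, enterprise characteristics, awareness of climate change, behavioural attitude and subjective norms have a significant positive effect on whether an enterprise is willing to pay for forestry carbon sinks. The manager's individual characteristics, enterprise characteristics, awareness of forestry carbon sinks, behavioural attitude, perceived behavioural control and behavioural experience have a significant positive effect on the amount paid, whereas the intention to carry out the opposite behaviour has a significant negative effect. The mean payment computed with the PCE model, that is, the economic value of forestry carbon sinks, is 47.36 yuan per tonne of CO2.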

15.
The Australian Bureau of Statistics is creating a longitudinal sample, called the Australian Census Longitudinal Dataset (ACLD), by linking person records across its five-yearly Census of Population and Housing. This paper proposes a Multi-Panel framework for selecting and weighting records in the ACLD. This framework can be applied more generally to selecting longitudinal samples from a series of cross-sectional administrative files. The proposed framework avoids some significant limitations of the popular 'Top-Up' sampling approach to maintaining the cross-sectional and longitudinal representativeness of a sample over time.
