Similar documents
20 similar documents found.
1.
A difference-based variance estimator is proposed for nonparametric regression in complex surveys. Using a combined inference framework, the estimator is shown to be asymptotically normal and to converge to the true variance at a parametric rate. Simulation studies show that the proposed variance estimator works well for complex survey data and also reveal some finite-sample properties of the estimator.
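For context, the simplest difference-based variance estimator in ordered-design nonparametric regression (the classical first-order estimator attributed to Rice, not the survey-weighted construction proposed here) assumes design points $x_1 \le \dots \le x_n$ and takes
\[ \hat\sigma^2 = \frac{1}{2(n-1)} \sum_{i=2}^{n} \left( y_i - y_{i-1} \right)^2 , \]
which is consistent because differencing removes the smooth mean function while leaving the noise variance.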

2.
For monitoring systemic risk from regulators' point of view, this article proposes a relative risk measure that is sensitive to market comovement. The asymptotic normality of a nonparametric estimator and of its smoothed version is established when the observations are independent. To construct a confidence interval without complicated asymptotic variance estimation, a jackknife empirical likelihood inference procedure based on the smoothed nonparametric estimator is provided, with a Wilks-type result in the case of independent observations. When the data follow AR-GARCH models, the relative risk measure with respect to the errors becomes useful, and so we propose a corresponding nonparametric estimator. A simulation study and a real-data analysis show that the proposed relative risk measure is useful in monitoring systemic risk.
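The jackknife empirical likelihood step can be sketched in general terms (a standard recipe, not necessarily the paper's exact construction): given an estimator $\hat\theta_n$ of the risk measure, form jackknife pseudo-values and apply empirical likelihood to them as if they were i.i.d.,
\[ V_i = n\,\hat\theta_n - (n-1)\,\hat\theta_{n-1}^{(-i)}, \qquad R(\theta) = \max\Big\{ \prod_{i=1}^{n} n p_i \;:\; p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i V_i = \theta \Big\} , \]
where $\hat\theta_{n-1}^{(-i)}$ deletes the $i$-th observation; $-2\log R(\theta_0)$ is then asymptotically chi-squared, which is the Wilks-type result referred to above.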

3.
The main purpose of this paper is to draw inference about the parameter of a particular bilinear model estimated via the conditional least squares (CLS) method. We propose a new CLS estimator that is strongly consistent and for which the central limit theorem and the law of the iterated logarithm hold under milder moment conditions. Furthermore, we derive a closed-form expression for the asymptotic variance of this CLS estimator.
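For illustration, a first-order bilinear model and the CLS criterion take the generic form (the paper's particular model may differ):
\[ X_t = a X_{t-1} + b X_{t-1}\varepsilon_{t-1} + \varepsilon_t, \qquad \hat\theta_{\mathrm{CLS}} = \arg\min_{\theta} \sum_{t} \big( X_t - \mathrm{E}_{\theta}[X_t \mid \mathcal{F}_{t-1}] \big)^2 , \]
where $\{\varepsilon_t\}$ is an i.i.d. noise sequence and $\mathcal{F}_{t-1}$ denotes the past of the process.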

4.
Before releasing survey data, statistical agencies usually perturb the original data to keep each survey unit's information confidential. One significant concern in releasing survey microdata is identity disclosure, which occurs when an intruder correctly identifies the records of a survey unit by matching the values of some key (or pseudo-identifying) variables. We examine a recently developed post-randomization method for strict control of identification risks in releasing survey microdata. While that procedure preserves the observed frequencies well, and hence statistical estimates under simple random sampling, we show that in general surveys it may induce considerable bias in commonly used survey-weighted estimators. We propose a modified procedure that better preserves weighted estimates. The procedure is illustrated and empirically assessed with an application to a publicly available US Census Bureau data set.
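Post-randomization of a categorical key variable is conventionally described by a known transition matrix $P$ with entries $p_{k\ell} = \Pr(\text{released category} = \ell \mid \text{true category} = k)$. Writing $T$ for the vector of true cell frequencies and $T^*$ for the released ones, a standard PRAM identity (background for, not the content of, the modified procedure above) is
\[ \mathrm{E}[T^* \mid T] = P^{\top} T, \qquad \hat T = (P^{\top})^{-1} T^* , \]
so the original frequencies can be estimated unbiasedly whenever $P$ is invertible.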

5.
We propose model-free measures for Granger causality in mean between random variables. Unlike the existing measures, ours are able to detect and quantify nonlinear causal effects. The new measures are based on nonparametric regressions and defined as logarithmic functions of restricted and unrestricted mean square forecast errors. They are easily and consistently estimated by replacing the unknown mean square forecast errors with their nonparametric kernel estimates. We derive the asymptotic normality of the nonparametric estimators of the causality measures, which we use to build tests of their statistical significance. We establish the validity of a smoothed local bootstrap that can be used in finite-sample settings to perform the tests. Monte Carlo simulations reveal that the proposed test has good finite-sample size and power properties for a variety of data-generating processes and different sample sizes. Finally, the empirical importance of measuring nonlinear causality in mean is illustrated: we quantify the degree of nonlinear predictability of the equity risk premium using the variance risk premium. Our empirical results show that the variance risk premium is a very good predictor of the risk premium at horizons of less than 6 months. We also find a high degree of predictability at the 1-month horizon that can be attributed to a nonlinear causal effect. Supplementary materials for this article are available online.
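A measure of this logarithmic form, in the spirit of Geweke's linear causality measures but with nonparametric forecast errors (a sketch consistent with the abstract, not necessarily the paper's exact definition), is
\[ C_{Y \to X} = \ln\!\left( \frac{\mathrm{E}\big[ (X_t - \mathrm{E}[X_t \mid \mathcal{I}_{t-1} \setminus Y])^2 \big]}{\mathrm{E}\big[ (X_t - \mathrm{E}[X_t \mid \mathcal{I}_{t-1}])^2 \big]} \right) \ge 0 , \]
which equals zero exactly when past values of $Y$ do not improve the mean forecast of $X_t$; the conditional means are replaced by kernel regression estimates.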

6.
We give a formal definition of a representative sample; roughly speaking, it is a scaled-down version of the population, capturing its characteristics. New methods for selecting representative probability samples in the presence of auxiliary variables are introduced. Representative samples are needed for multipurpose surveys, when several target variables are of interest. Such samples also enable estimation of parameters in subspaces and improved estimation of target variable distributions. We describe how two recently proposed sampling designs can be used to produce representative samples. Both designs use distances between population units when producing a sample. We propose a distance function that can calculate distances between units in general auxiliary spaces. We also propose a variance estimator for the commonly used Horvitz–Thompson estimator. Real data as well as illustrative examples show that representative samples are obtained and that the variance of the Horvitz–Thompson estimator is reduced compared with simple random sampling.
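For reference, the Horvitz–Thompson estimator of a population total and its design variance are
\[ \hat Y_{\mathrm{HT}} = \sum_{i \in s} \frac{y_i}{\pi_i}, \qquad V(\hat Y_{\mathrm{HT}}) = \sum_{i=1}^{N} \sum_{j=1}^{N} (\pi_{ij} - \pi_i \pi_j) \frac{y_i}{\pi_i} \frac{y_j}{\pi_j} , \]
with first- and second-order inclusion probabilities $\pi_i$ and $\pi_{ij}$ (and $\pi_{ii} = \pi_i$). Distance-based designs of this kind often lack tractable $\pi_{ij}$, which is presumably what motivates a dedicated variance estimator.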

7.
The paper establishes a correspondence between statistical disclosure control and forensic statistics regarding their common use of the concept of 'probability of identification'. The paper then investigates what lessons for disclosure control can be learnt from the forensic identification literature. The main lesson considered is that disclosure risk assessment cannot, in general, ignore the search method employed by an intruder seeking to achieve disclosure. The effects of using several search methods are considered. Through consideration of the plausibility of assumptions and 'worst case' approaches, the paper suggests how the impact of the search method can be handled. The paper focuses on the foundations of disclosure risk assessment, providing justification for modelling assumptions underlying some existing record-level measures of disclosure risk. The paper illustrates the effects of various search methods in a numerical example based on microdata from a sample from the 2001 UK census.
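A standard record-level measure of the kind referred to here: if an intruder matches a target on key variables whose combination falls in a cell with population frequency $F_k$, and picks uniformly among the matching units, the probability of correct identification is
\[ \Pr(\text{identification}) = \frac{1}{F_k} , \]
so population-unique records ($F_k = 1$) carry the highest risk; the paper's point is that such baseline measures implicitly depend on the assumed search method.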

8.
In stratified sampling, methods for the allocation of effort among strata usually rely on some measure of within-stratum variance. If we do not have enough information about these variances, adaptive allocation can be used. In adaptive allocation designs, surveys are conducted in two phases; information from the first phase is used to allocate the remaining units among the strata in the second phase. Brown et al. [Adaptive two-stage sequential sampling, Popul. Ecol. 50 (2008), pp. 239–245] introduced an adaptive allocation sampling design, in which the final sample size is random, together with an unbiased estimator. Here, we derive an unbiased variance estimator for that design and consider a related design in which the final sample size is fixed; a fixed final sample size can make survey planning easier. We introduce a biased Horvitz–Thompson-type estimator and a biased sample-mean-type estimator for the sampling designs. We conduct two simulation studies, on honey producers in Kurdistan and on a synthetic zirconium distribution in a region of the moon. Results show that the introduced estimators are more efficient than the available estimators for both the variable and the fixed sample size designs, and than the conventional unbiased estimator of the stratified simple random sampling design. To evaluate the introduced designs and estimators further, we also review some well-known adaptive allocation designs and compare their estimators with ours; simulation results show that the introduced estimators are more efficient than the available estimators of these designs as well.
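For background, when the within-stratum standard deviations $S_h$ are known, the classical Neyman allocation of a total sample size $n$ across strata of sizes $N_h$ is
\[ n_h = n \, \frac{N_h S_h}{\sum_k N_k S_k} ; \]
adaptive allocation replaces the unknown $S_h$ with estimates from the first-phase sample when allocating the second-phase units.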

9.
In this paper we consider inference for parameters in time series regression models. In the traditional approach, heteroskedasticity and autocorrelation consistent (HAC) estimation is used to consistently estimate the asymptotic covariance matrix of the regression parameter estimator. Since the bandwidth parameter in HAC estimation is difficult to choose in practice, there has been a recent surge of interest in developing bandwidth-free inference methods. However, existing simulation studies show that these new methods suffer from severe size distortion in the presence of strong temporal dependence at medium sample sizes. To remedy the problem, we propose to apply prewhitening to the inconsistent long-run variance estimator in these methods to reduce the size distortion. The asymptotic distribution of the prewhitened Wald statistic is obtained, and the general effectiveness of prewhitening is shown through simulations.
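The bandwidth issue in question arises in the standard kernel HAC estimator of the long-run variance of a mean-zero score series $\{\hat v_t\}$,
\[ \hat\Omega = \sum_{j=-(n-1)}^{n-1} k\!\left( \frac{j}{b_n} \right) \hat\Gamma_j, \qquad \hat\Gamma_j = \frac{1}{n} \sum_{t=|j|+1}^{n} \hat v_t \hat v_{t-|j|}^{\top} , \]
where $k(\cdot)$ is a kernel and $b_n$ the bandwidth; bandwidth-free methods avoid choosing $b_n$ at the cost of an inconsistent (but asymptotically pivotal) variance estimate, which is the estimator being prewhitened here.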

10.
Nonresponse is a major source of estimation error in sample surveys. The response rate is widely used to measure the survey quality aspects associated with nonresponse, but it is inadequate as an indicator because of its limited relation to nonresponse bias. Schouten et al. (2009) proposed an alternative indicator, which they refer to as an indicator of representativeness, or R-indicator. This indicator measures the variability of the response probabilities of units in the population. This paper develops methods for estimating this R-indicator, assuming that the values of a set of auxiliary variables are observed for both respondents and nonrespondents. We propose bias adjustments to the point estimator proposed by Schouten et al. (2009) and demonstrate the effectiveness of this adjustment in a simulation study, where the method is shown to be valid especially for smaller sample sizes. We also propose linearization variance estimators, which avoid the need for computer-intensive replication methods and show good coverage in the simulation study even when the models are not fully specified. The use of the proposed procedures is also illustrated in an application to two business surveys at Statistics Netherlands.
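The R-indicator of Schouten et al. (2009) is defined from the response propensities $\rho_i$ of the $N$ population units as
\[ R(\rho) = 1 - 2 S(\rho), \qquad S(\rho) = \sqrt{ \frac{1}{N-1} \sum_{i=1}^{N} \big( \rho_i - \bar\rho \big)^2 } , \]
so $R = 1$ when response is fully representative (all $\rho_i$ equal) and smaller values signal more variable, hence less representative, response; estimation replaces the $\rho_i$ with propensities modelled on the auxiliary variables.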

11.
In this paper we discuss a new theoretical basis for perturbation methods. In developing this basis, we define ideal measures of data utility and disclosure risk. Maximum data utility is achieved when the statistical characteristics of the perturbed data are the same as those of the original data. Disclosure risk is minimized if providing users with microdata access does not give them any additional information. We show that when the perturbed values of the confidential variables are generated as independent realizations from the distribution of the confidential variables conditioned on the non-confidential variables, they satisfy both the data utility and the disclosure risk requirements. We also discuss the relationship between this theoretical basis and some commonly used methods for generating perturbed values of confidential numerical variables.
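The key property can be stated compactly. Let $X$ denote the confidential variables and $S$ the non-confidential ones, and generate the released values as
\[ \tilde X \sim f_{X \mid S}(\cdot \mid S), \qquad \tilde X \perp X \mid S . \]
Then $(\tilde X, S)$ has the same joint distribution as $(X, S)$, giving maximum utility in the above sense, while $\tilde X$ reveals nothing about $X$ beyond what $S$ already does.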

12.
In this paper, we propose a smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q-learning algorithm, in which nonregular inference is involved, we show that, under the assumptions adopted in this paper, the proposed smoothed Q-learning estimator is asymptotically normally distributed even when the Q-learning estimator is not, and that its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q-learning estimator is standard. We derive the optimal smoothing parameter and propose a data-driven method for estimating it. The finite-sample properties of the smoothed Q-learning estimator are studied and compared with several existing estimators, including the Q-learning estimator, via an extensive simulation study. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness–Alzheimer's Disease (CATIE-AD) study.
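The nonregularity arises from the max operator in the stage-1 pseudo-outcome. In a two-stage setting with a binary second-stage action and linear Q-function $\hat Q_2(h, a) = h^\top \hat\beta + a\, h^\top \hat\psi$, the pseudo-outcome involves
\[ \max_{a \in \{0,1\}} \hat Q_2(h, a) = h^\top \hat\beta + \max\big( 0,\ h^\top \hat\psi \big) , \]
which is nonsmooth in $\hat\psi$ wherever $h^\top \hat\psi = 0$. A smoothed version replaces $\max(0, u)$ with a smooth surrogate such as $u\,\Phi(u/h_n)$ for a smoothing parameter $h_n$ (an illustrative choice; the paper's smoother and its optimal tuning may differ).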

13.
In this article, we propose a kernel-based estimator for the finite-dimensional parameter of a partially additive linear quantile regression model. For dependent processes that are strictly stationary and absolutely regular, we establish a precise convergence rate and show that the estimator is root-n consistent and asymptotically normal. To facilitate inference, a consistent estimator of the asymptotic variance is also provided. In addition to a simulation experiment evaluating the finite-sample performance of the estimator, an application to US inflation is presented. We use the real-data example to show how partially additive linear quantile models can offer an alternative modeling option for time-series data.
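A partially additive linear quantile regression model for the $\tau$-th conditional quantile has the generic form (notation illustrative)
\[ Q_\tau(Y_t \mid X_t, Z_t) = X_t^\top \beta_0 + \sum_{j=1}^{d} g_j(Z_{tj}) , \]
where $\beta_0$ is the finite-dimensional parameter estimated at the root-n rate and the $g_j$ are unknown univariate functions handled nonparametrically.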

14.
Empirical research depends on data. Official aggregate data have increasingly become a public good that the research community and the general public can obtain through many channels. However, owing to technical, economic, legal, and even political constraints, channels for sharing and disseminating microdata are lacking, forcing research groups and individuals to collect data themselves, which causes large amounts of duplicated effort and wasted money and time. At the same time, existing statistical microdata are under-exploited, which lowers the return on data collection and severely constrains the improvement of statistical capacity. This paper compares the current state of microdata release in China and abroad, discusses the utility and risks of microdata release, and argues that the crucial problem is the tension between growing demand for data and the risk of statistical disclosure. It then introduces the methods commonly used internationally to control disclosure risk and, in light of China's actual situation, offers recommendations on microdata release in China.

15.
We consider large-sample inference in a semiparametric logistic/proportional-hazards mixture model. This model has been proposed for survival data in which a positive proportion of subjects in the population is not susceptible to the event under consideration. Previous studies of the logistic/proportional-hazards mixture model have focused on developing point estimation procedures for the unknown parameters. This paper studies large-sample inference based on the semiparametric maximum likelihood estimator. Specifically, we establish existence, consistency and asymptotic normality results for the semiparametric maximum likelihood estimator. We also derive consistent variance estimates for both the parametric and the non-parametric components. The results provide a theoretical foundation for large-sample inference under the logistic/proportional-hazards mixture model.
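The model (a standard mixture cure formulation) specifies the survival function of a subject with covariates $(x, z)$ as
\[ S(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \qquad \pi(z) = \frac{e^{\gamma^\top z}}{1 + e^{\gamma^\top z}}, \qquad S_u(t \mid x) = \exp\!\big\{ -\Lambda_0(t)\, e^{\beta^\top x} \big\} , \]
where $1 - \pi(z)$ is the probability of being nonsusceptible (cured), the susceptibility probability follows the logistic part, and susceptible subjects follow a proportional-hazards model with nonparametric cumulative baseline hazard $\Lambda_0$.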

16.
Missing covariate values are a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids use of the EM algorithm. It exploits the fact that the observed hazard function is multiplicative in the baseline hazard, the idea being to profile out this function before estimating the parameter of interest; in this step one uses a Breslow-type estimator of the cumulative baseline hazard function. We focus on the situation where the observed covariates are categorical, which allows us to calculate estimators without assuming anything about the distribution of the covariates. We show that the proposed estimator is consistent and asymptotically normal, and we derive a consistent estimator of the variance–covariance matrix that does not involve choosing a perturbation parameter. Moderate-sample-size performance of the estimators is investigated via simulation and by application to a real data example.
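For reference, the Breslow-type estimator of the cumulative baseline hazard for a fixed $\beta$ in the complete-data Cox model is
\[ \hat\Lambda_0(t; \beta) = \sum_{t_k \le t} \frac{d_k}{\sum_{j \in R(t_k)} \exp(\beta^\top x_j)} , \]
where $d_k$ is the number of events at time $t_k$ and $R(t_k)$ is the risk set at that time; the profiling step above uses an estimator of this type, adapted to the missing-covariate setting (an adaptation this sketch omits).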

17.
The authors develop empirical likelihood (EL) based methods of inference for a common mean using data from several independent but nonhomogeneous populations. For point estimation, they propose a maximum empirical likelihood (MEL) estimator and show that it is root-n consistent and asymptotically optimal. For confidence intervals, they consider two EL-based methods and show that both intervals have approximately correct coverage probabilities in large samples. Finite-sample performance of the MEL estimator and the EL-based confidence intervals is evaluated through a simulation study. The results indicate that, overall, the MEL estimator and the weighted EL confidence interval are superior alternatives to the existing methods.
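The empirical likelihood for a common mean $\mu$ across $k$ independent samples $\{X_{ij} : j = 1, \dots, n_i\}$ can be written (a standard formulation; the authors' weighted variant may differ) as
\[ L(\mu) = \max \Big\{ \prod_{i=1}^{k} \prod_{j=1}^{n_i} p_{ij} \;:\; p_{ij} \ge 0,\ \sum_{j} p_{ij} = 1 \text{ for each } i,\ \sum_{j} p_{ij} X_{ij} = \mu \text{ for each } i \Big\} , \]
with the MEL estimator maximizing $L(\mu)$ over $\mu$.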

18.
The purpose of this paper is to consider statistical inference about a hazard rate function specified as the product of a parametric regression part and a non-parametric baseline hazard. Unlike Cox's proportional hazards model, the baseline hazard depends not only on the duration variable but also on the starting date of the phenomenon of interest. We propose a new estimator of the regression parameter that allows for non-stationarity in the hazard rate. We show that it is asymptotically normal at the root-n rate and that its asymptotic variance attains the information bound for estimation of the regression coefficient. We also consider an estimator of the integrated baseline hazard and determine its asymptotic properties. The finite-sample performance of our estimators is studied.
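The model can be written generically as (notation illustrative)
\[ \lambda(t \mid x, t_0) = \lambda_0(t, t_0)\, \exp(x^\top \beta_0) , \]
where $t$ is the duration and $t_0$ the starting date; because $\lambda_0$ depends on both arguments, it cannot be reduced to a function of duration alone as in Cox's model, which is the non-stationarity the estimator must accommodate.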

19.
In this paper we discuss methodology for the safe release of business microdata. In particular, we extend the model-based protection procedure of Franconi and Stander (2002, The Statistician 51: 1–11) by allowing the model to take account of the spatial structure underlying the geographical information in the microdata. We discuss the use of the Gibbs sampler for performing the computations required by this spatial approach. We provide an empirical comparison of the non-spatial and spatial disclosure limitation methods based on the Italian sample from the Community Innovation Survey. We quantify the level of protection achieved for the released microdata and the error induced when various inferences are performed. We find that although the spatial method often induces higher inferential errors, it almost always provides more protection. Moreover, the aggregated areas from the spatial procedure can be somewhat more spatially smooth, and hence possibly more meaningful, than those from the non-spatial approach. We discuss possible applications of these model-based protection procedures to more spatially extensive data sets.

20.
The performance of Statistical Disclosure Control (SDC) methods for microdata (also called masking methods) is measured in terms of the utility and the disclosure risk associated with the protected microdata set. Empirical disclosure risk assessment based on record linkage stands out as a realistic and practical disclosure risk assessment methodology applicable to every conceivable masking method. The intruder is assumed to know an external data set whose records are to be linked to those in the protected data set; the percentage of correctly linked record pairs is a measure of disclosure risk. This paper reviews conventional record linkage, which assumes shared variables between the external and the protected data sets, and then shows that record linkage, and thus disclosure, is still possible without shared variables.
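Conventional distance-based record linkage, in the shared-variable setting reviewed here, links each external record $a$ to the protected record minimizing a distance over the shared variables $x$, and the empirical risk is the proportion of correct links:
\[ b^*(a) = \arg\min_{b} d\big( x_a, x_b \big), \qquad \text{risk} = \frac{1}{n} \,\#\{ a : b^*(a) \text{ is the true match of } a \} . \]
The paper's contribution is to show that an analogous attack remains feasible even when no variables are shared.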
