Similar Articles
20 similar articles retrieved
1.
Statistical offices try to match item models when measuring inflation between two periods. However, for product areas with a high turnover of differentiated models, the use of hedonic indexes is more appropriate, because these include the prices and quantities of unmatched new and old models. The two main approaches to hedonic indexes are hedonic imputation (HI) indexes and dummy time hedonic (DTH) indexes. This study provides a formal analysis of the difference between the two approaches for alternative implementations of the Törnqvist “superlative” index. It shows why the results of the HI and DTH indexes may differ and discusses the issue of choice between these two approaches.
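As an illustration of the two approaches, here is a minimal, unweighted sketch in Python. The data, the single characteristic z, and all coefficients are simulated assumptions; the Törnqvist versions analyzed in the paper would add expenditure-share weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n0, n1 = 40, 50
z0 = rng.uniform(1, 3, n0)            # characteristic, period 0 models
z1 = rng.uniform(1, 3, n1)            # characteristic, period 1 models (unmatched)
logp0 = 0.5 + 0.8 * z0 + rng.normal(0, 0.1, n0)
logp1 = 0.6 + 0.8 * z1 + rng.normal(0, 0.1, n1)   # ~10% quality-adjusted inflation

# Dummy time hedonic (DTH): pool both periods, regress log price on the
# characteristic plus a period dummy; exp(dummy coefficient) is the index.
X = np.column_stack([np.ones(n0 + n1),
                     np.concatenate([z0, z1]),
                     np.concatenate([np.zeros(n0), np.ones(n1)])])
y = np.concatenate([logp0, logp1])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
dth_index = np.exp(beta[2])

# Hedonic imputation (HI): fit a separate hedonic surface per period and
# impute each model's price in the other period.
def hedonic_fit(z, logp):
    A = np.column_stack([np.ones_like(z), z])
    b, *_ = np.linalg.lstsq(A, logp, rcond=None)
    return b

b0, b1 = hedonic_fit(z0, logp0), hedonic_fit(z1, logp1)
hi_laspeyres = np.exp(np.mean(b1[0] + b1[1] * z0 - logp0))   # period-0 models repriced
hi_paasche = np.exp(np.mean(logp1 - (b0[0] + b0[1] * z1)))   # period-1 models backcast
hi_index = np.sqrt(hi_laspeyres * hi_paasche)                # equal-weight geometric mean
print(f"DTH index: {dth_index:.3f}  HI index: {hi_index:.3f}")
```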

2.
We consider three sorts of diagnostics for random imputations: displays of the completed data, which are intended to reveal unusual patterns that might suggest problems with the imputations, comparisons of the distributions of observed and imputed data values and checks of the fit of observed data to the model that is used to create the imputations. We formulate these methods in terms of sequential regression multivariate imputation, which is an iterative procedure in which the missing values of each variable are randomly imputed conditionally on all the other variables in the completed data matrix. We also consider a recalibration procedure for sequential regression imputations. We apply these methods to the 2002 environmental sustainability index, which is a linear aggregation of 64 environmental variables on 142 countries.
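A sketch of the second diagnostic, comparing the distributions of observed and imputed values. The function name and the choice of a Kolmogorov-Smirnov comparison are mine, not the paper's; under MAR a discrepancy is a flag to investigate rather than proof of misfit.

```python
import numpy as np
from scipy import stats

def observed_vs_imputed(y_completed, was_missing):
    """Compare the distributions of observed and imputed values for one
    variable; large discrepancies flag imputations worth inspecting."""
    obs = y_completed[~was_missing]
    imp = y_completed[was_missing]
    ks = stats.ks_2samp(obs, imp)
    return {"obs_mean": float(obs.mean()), "imp_mean": float(imp.mean()),
            "ks_statistic": float(ks.statistic), "ks_pvalue": float(ks.pvalue)}
```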

3.
Sequential regression multiple imputation has emerged as a popular approach for handling incomplete data with complex features. In this approach, imputations for each missing variable are produced from a regression model that uses the other variables as predictors, in a cyclic manner. A normality assumption is frequently imposed on the error distributions of the conditional regression models for continuous variables, even though it rarely holds in real scenarios. We use a simulation study to investigate the performance of several sequential regression imputation methods when the error distribution is flat or heavy-tailed. The methods evaluated include sequential normal imputation and several extensions of it that adjust for non-normal error terms. The results show that all methods perform well for estimating the marginal mean and proportion, as well as the regression coefficient, when the error distribution is flat or moderately heavy-tailed. When the error distribution is strongly heavy-tailed, all methods retain their good performance for the mean, and the adjusted methods perform robustly for the proportion; but all methods can perform poorly for the regression coefficient because they cannot accommodate the extreme values well. We caution against mechanical use of sequential regression imputation without model checking and diagnostics.
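A minimal sketch of the baseline sequential normal imputation loop described here (hypothetical function; a proper multiple-imputation implementation would also draw the regression coefficients from their posterior rather than fixing them at the least-squares estimates):

```python
import numpy as np

def sequential_regression_impute(X, n_iter=10, seed=0):
    """Cycle over the columns of X (NaN = missing), imputing each from a
    normal linear regression on the current values of the other columns."""
    rng = np.random.default_rng(seed)
    X = np.array(X, dtype=float)
    miss = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    for j in range(X.shape[1]):            # initialize with column means
        X[miss[:, j], j] = col_means[j]
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not miss[:, j].any():
                continue
            obs = ~miss[:, j]
            A = np.column_stack([np.ones(X.shape[0]), np.delete(X, j, axis=1)])
            beta, *_ = np.linalg.lstsq(A[obs], X[obs, j], rcond=None)
            resid = X[obs, j] - A[obs] @ beta
            sigma = resid.std(ddof=A.shape[1])
            # Draw imputations from the fitted conditional normal
            X[miss[:, j], j] = A[miss[:, j]] @ beta + rng.normal(0.0, sigma,
                                                                 miss[:, j].sum())
    return X
```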

4.
陈立双, 祝丹. 《统计研究》 (Statistical Research), 2020, 37(4): 18-31
The innovative compilation of CPI indexes from big data sources is of great significance for tracking price movements in the new economy in a timely manner, identifying inflation crises, and forecasting macroeconomic turning points, thereby modernizing China's inflation governance and promoting stable, high-quality economic development. The GEKS multilateral index has been a focal point of international research on big-data price indexes in recent years, but its construction method remains controversial. Using supermarket scanner big data, we conduct theoretical and empirical research on open questions such as how to update the GEKS index series and how to choose the window length, and reach the following instructive conclusions: (1) updating methods 2 and 3 perform relatively poorly in application; (2) as the window length increases, the month-on-month GEKS price index tends toward unity, and the chained GEKS indexes under different updating methods show a degree of convergence; the GEKS index's ability to identify inflation trends is unaffected by this, but the choice of updating method does lead to different inflation-trend predictions; (3) updating method 4 exhibits stronger substitution bias as the window length increases, whereas method 1 shows no obvious substitution bias. Overall, updating method 1 combined with a 13-month window length appears to be the more reasonable combination for compiling a GEKS index series.
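For reference, a minimal sketch of the GEKS construction itself, assuming a fully matched item set over the window (real scanner data requires matched-sample handling); the updating methods 1-4 and window-length questions studied in the paper are not reproduced here.

```python
import numpy as np

def tornqvist(p0, p1, q0, q1):
    """Bilateral Törnqvist index between two periods (matched items)."""
    s0 = p0 * q0 / np.sum(p0 * q0)   # expenditure shares, first period
    s1 = p1 * q1 / np.sum(p1 * q1)   # expenditure shares, second period
    return np.exp(np.sum(0.5 * (s0 + s1) * np.log(p1 / p0)))

def geks(prices, quantities):
    """GEKS index of each window period relative to period 0.
    prices, quantities: (T, n_items) arrays, fully matched for simplicity."""
    T = prices.shape[0]
    P = np.ones((T, T))
    for a in range(T):
        for b in range(T):
            P[a, b] = tornqvist(prices[a], prices[b], quantities[a], quantities[b])
    # GEKS(0, t) = geometric mean over link periods l of P(0, l) * P(l, t)
    return np.array([np.exp(np.mean(np.log(P[0] * P[:, t]))) for t in range(T)])
```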

5.
A general nonparametric imputation procedure, based on kernel regression, is proposed to estimate points as well as set- and function-indexed parameters when the data are missing at random (MAR). The proposed method works by imputing a specific function of a missing value (and not the missing value itself), where the form of this specific function is dictated by the parameter of interest. Both single and multiple imputations are considered. The associated empirical processes provide the right tool to study the uniform convergence properties of the resulting estimators. Our estimators include, as special cases, the imputation estimator of the mean, the estimator of the distribution function proposed by Cheng and Chu [1996. Kernel estimation of distribution functions and quantiles with missing data. Statist. Sinica 6, 63–78], imputation estimators of a marginal density, and imputation estimators of regression functions.
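A sketch of the idea for the simplest target, the mean: each missing response is replaced by a Nadaraya-Watson kernel regression estimate evaluated at its observed covariate (single imputation with a Gaussian kernel; the paper's general version imputes a parameter-specific function of the missing value and also covers multiple imputation).

```python
import numpy as np

def kernel_impute_mean(x, y, observed, h=0.5):
    """Single imputation of the mean under MAR: each missing y is replaced
    by a Nadaraya-Watson kernel regression estimate m_hat(x)."""
    def m_hat(x0):
        w = np.exp(-0.5 * ((x[observed] - x0) / h) ** 2)  # Gaussian kernel weights
        return np.sum(w * y[observed]) / np.sum(w)
    completed = np.array([yi if obs else m_hat(xi)
                          for xi, yi, obs in zip(x, y, observed)])
    return completed.mean()
```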

6.
This paper computes a quality-adjusted price index for personal computer CPUs from 1996 to 2000. The index is based on the pure characteristics demand model. I first compute the quality-adjusted price index for the whole market and show that it is very comparable with the hedonic price index but more sensitive to changes in product quality. Two types of hedonic index are considered: the dummy variable index and the formulation in Pakes (2003). When I group consumers by their willingness to pay for attribute improvements, the index shows that consumer groups are affected differently by their product choices.

7.
In official statistics, when a file of microdata must be delivered to external users, it is very difficult to offer them a file in which missing values have been treated by multiple imputation. To overcome this difficulty, we propose a method of single imputation for qualitative data that respects numerous constraints. The imputation is balanced on previously estimated totals; editing rules can be respected; and the imputation is random, yet the totals are not affected by imputation variance.

8.
9.
In this article, we propose a new method of imputation that makes use of higher-order moments of an auxiliary variable while imputing missing values. The mean, ratio, and regression methods of imputation are shown to be special cases of, and less efficient than, the newly developed method. Analytical comparisons show that the first-order mean squared error approximation for the proposed method is always smaller than that for the regression method of imputation. Finally, the proposed higher-order-moments-based imputation method is applied to a real dataset.
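The three special cases nested by the proposal can be written in a few lines (a sketch with hypothetical names; x is the always-observed auxiliary variable, and the higher-order-moments estimator itself is not reproduced):

```python
import numpy as np

def classical_imputation_means(y, x, observed):
    """Mean, ratio, and regression imputation estimators of the mean of y,
    using the always-observed auxiliary variable x (NaN marks missing y)."""
    yo, xo = y[observed], x[observed]
    mean_imp = np.where(observed, y, yo.mean()).mean()
    ratio_imp = np.where(observed, y, yo.mean() / xo.mean() * x).mean()
    beta = np.cov(xo, yo)[0, 1] / xo.var(ddof=1)
    reg_imp = np.where(observed, y, yo.mean() + beta * (x - xo.mean())).mean()
    return mean_imp, ratio_imp, reg_imp
```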

10.
This paper presents missing-data methods for repeated measures data in small samples. Most methods currently available are for large samples. In particular, no studies have compared the performance of multiple imputation methods with that of non-imputation incomplete-data analysis methods. We first develop a strategy for multiple imputation of repeated measures data under a cell-means model that is applicable to any multivariate data with small samples. Multiple imputation inference procedures are then applied to the resulting multiply imputed complete data sets. Comparisons with other available non-imputation incomplete-data methods are made via simulation studies, which lead us to conclude that there is not much gain in using the computer-intensive multiple imputation methods for small-sample repeated measures data analysis in terms of the power of testing hypotheses about the parameters of interest.

11.
Multiple imputation has emerged as a popular approach to handling data sets with missing values. For incomplete continuous variables, imputations are usually produced using multivariate normal models. However, this approach can be problematic for variables with a strongly non-normal shape, as it generates imputations incoherent with the actual distributions and thus leads to incorrect inferences. For non-normal data, we consider a multivariate extension of Tukey's gh distribution/transformation [38] to accommodate skewness and/or kurtosis and to capture the correlation among the variables. We propose an algorithm to fit the incomplete data with the model and generate imputations. We apply the method to a national data set on hospital performance for several standard quality measures, which are highly skewed to the left and substantially correlated with each other. We use Monte Carlo studies to assess the performance of the proposed approach. We discuss possible generalizations and offer some advice to practitioners on how to handle non-normal incomplete data.
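The univariate g-and-h transform underlying the approach is compact (a sketch; the multivariate extension, parameter estimation, and the imputation algorithm are the paper's contribution):

```python
import numpy as np

def gh_transform(z, g=0.3, h=0.1):
    """Tukey g-and-h transform of standard normal draws:
    g controls skewness, h controls tail heaviness (kurtosis)."""
    gz = np.expm1(g * z) / g if g != 0 else z
    return gz * np.exp(h * z ** 2 / 2)

# Example: generate a skewed, heavy-tailed sample from normal draws.
rng = np.random.default_rng(1)
sample = gh_transform(rng.standard_normal(10_000))
```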

12.
Every hedonic price index is an estimate of an unknown economic parameter. It depends, in practice, on one or more random samples of prices and characteristics of a certain good. Bootstrap resampling methods provide a tool for quantifying sampling errors. Following some general reflections on hedonic elementary price indices, this paper proposes a case-based, a model-based, and a wild bootstrap approach for estimating confidence intervals for hedonic price indices. Empirical results are obtained for a data set on used cars in Switzerland. A simple and an enhanced adaptive semi-logarithmic model are fit to monthly samples, and bootstrap confidence intervals are estimated for Jevons-type hedonic elementary price indices.
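A sketch of the case-based variant for a time-dummy semi-logarithmic index: resample whole observations with replacement and recompute the index (hypothetical names; inputs are numpy arrays, and the model-based and wild bootstrap variants would resample residuals instead).

```python
import numpy as np

def dummy_index(z, t, logp):
    """Semi-log time-dummy hedonic index: exp of the period-dummy coefficient."""
    X = np.column_stack([np.ones_like(logp), z, t])
    beta, *_ = np.linalg.lstsq(X, logp, rcond=None)
    return np.exp(beta[2])

def case_bootstrap_ci(z, t, logp, B=999, alpha=0.05, seed=0):
    """Case-based bootstrap: resample (z, t, logp) triples with replacement
    and return a percentile confidence interval for the index."""
    rng = np.random.default_rng(seed)
    n = len(logp)
    stats = np.array([dummy_index(z[idx], t[idx], logp[idx])
                      for idx in rng.integers(0, n, size=(B, n))])
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])
```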

13.
In multiple imputation (MI), the resulting estimates are consistent if the imputation model is correct. To specify the imputation model, it is recommended to combine two sets of variables: those related to the incomplete variable and those related to the missingness mechanism. Several possibilities exist, but it is not clear how they perform in practice. We present a method that simply groups all variables together into the imputation model, along with four other methods based on propensity scores; two of the latter are new and have not previously been used in the context of MI. The performance of the methods is investigated by a simulation study under different missing-at-random mechanisms for different types of variables. We conclude that all methods, except one of those based on propensity scores, perform well. It turns out that, as long as the relevant variables are included in the imputation model, the form of the imputation model has only a minor effect on the quality of the imputations.

14.
Multiple imputation is a common approach for dealing with missing values in statistical databases. The imputer fills in missing values with draws from predictive models estimated from the observed data, resulting in multiple, completed versions of the database. Researchers have developed a variety of default routines to implement multiple imputation; however, there has been limited research comparing the performance of these methods, particularly for categorical data. We use simulation studies to compare repeated sampling properties of three default multiple imputation methods for categorical data, including chained equations using generalized linear models, chained equations using classification and regression trees, and a fully Bayesian joint distribution based on Dirichlet process mixture models. We base the simulations on categorical data from the American Community Survey. In the circumstances of this study, the results suggest that default chained equations approaches based on generalized linear models are dominated by the default regression tree and Bayesian mixture model approaches. They also suggest competing advantages for the regression tree and Bayesian mixture model approaches, making both reasonable default engines for multiple imputation of categorical data. Supplementary material for this article is available online.
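To make one of the engines concrete, here is a sketch of a single CART-based chained-equations update for a binary variable using scikit-learn (my simplification; production defaults handle multi-category variables, parameter uncertainty, and cycling over all incomplete variables):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def cart_impute_binary(X_others, y, miss, seed=0):
    """One CART step of chained equations for a binary variable: fit a
    classification tree on complete cases, then draw each missing value
    from the tree's leaf-level class probabilities (assumes both classes
    occur among the complete cases)."""
    rng = np.random.default_rng(seed)
    tree = DecisionTreeClassifier(min_samples_leaf=5, random_state=seed)
    tree.fit(X_others[~miss], y[~miss])
    p1 = tree.predict_proba(X_others[miss])[:, 1]  # P(y = 1) per missing case
    y = y.copy()
    y[miss] = (rng.uniform(size=miss.sum()) < p1).astype(y.dtype)
    return y
```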

15.
The diagnostic tools examined in this article are applicable to regressions estimated with panel data or cross-sectional data drawn from a population with grouped structure. The diagnostic tools considered include (a) tests for the existence of group effects under both fixed and random effects models, (b) checks for outlying groups, and (c) specification tests for comparing the fixed and random effects models. A group-specific counterpart to the studentized residual is introduced. The methods are illustrated using a hedonic housing price regression.

16.
Estimation of price indexes in the United States is generally based on complex rotating panel surveys. The sample for the Consumer Price Index, for example, is selected in three stages—geographic areas, establishments, and individual items—with 20% of the sample being replaced by rotation each year. At each period, a time series of data is available for use in estimation. This article examines how to best combine data for estimation of long-term and short-term changes and how to estimate the variances of the index estimators in the context of two-stage sampling. I extend the class of estimators, introduced by Valliant and Miller, of Laspeyres indexes formed using sample data collected from the current period back to a previous base period. Linearization estimators of variance for indexes of long-term and short-term change are derived. The theory is supported by an empirical simulation study using two-stage sampling of establishments and items from a population derived from U.S. Bureau of Labor Statistics data.
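For orientation, the fixed-base Laspeyres index in its textbook form (a sketch; the article's estimators add survey weighting, sample rotation, and linearization variance estimators):

```python
import numpy as np

def laspeyres(p_base, p_curr, q_base):
    """Fixed-base Laspeyres index: the base-period basket repriced at
    current prices, relative to its cost at base prices."""
    return np.sum(p_curr * q_base) / np.sum(p_base * q_base)

# Long-term change to period t: laspeyres(p0, pt, q0); short-term change
# from t-1 to t: laspeyres(p0, pt, q0) / laspeyres(p0, pt_minus_1, q0).
```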

17.

When using multiple imputation to form confidence intervals with missing data, Rubin and Schenker (1986) proposed using a t-distribution with approximate degrees of freedom that are a function of the number of multiple imputations and the within- and between-imputation variances. In this t-approximation, Rubin and Schenker assume a finite number of multiple imputations but an infinite number of observations in the sample. We propose a further degrees-of-freedom approximation that is a function of the within- and between-imputation variances, the number of multiple imputations, and the number of observations in the sample. When the number of observations in the sample is small, our approximate degrees of freedom may be more appropriate, as seen in our simulations.
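The combining rules and the Rubin-Schenker degrees of freedom in code (my function name; the small-sample correction proposed in the paper, which additionally involves the number of observations, is not reproduced):

```python
import numpy as np
from scipy import stats

def mi_interval(estimates, variances, alpha=0.05):
    """Rubin's rules with the Rubin-Schenker (1986) degrees of freedom."""
    m = len(estimates)
    qbar = np.mean(estimates)              # combined point estimate
    ubar = np.mean(variances)              # within-imputation variance
    b = np.var(estimates, ddof=1)          # between-imputation variance
    total = ubar + (1 + 1 / m) * b         # total variance
    r = (1 + 1 / m) * b / ubar             # relative increase in variance
    df = (m - 1) * (1 + 1 / r) ** 2 if r > 0 else np.inf  # n -> infinity d.f.
    half = stats.t.ppf(1 - alpha / 2, df) * np.sqrt(total)
    return qbar - half, qbar + half, df
```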

18.
In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementation of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluation of existing implementations have frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual – an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect.

19.
Imputation methods that assign a selection of respondents' values for missing item nonresponses give rise to an additional source of sampling variation, which we term imputation variance. We examine the effect of imputation variance on the precision of the mean, and propose four procedures for sampling the respondents that reduce this additional variance. Two of the procedures employ improved sample designs through selection of respondents by sampling without replacement and by stratified sampling. The other two increase the sample base by the use of multiple imputations.
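A sketch of the baseline procedure being improved upon: a random hot deck that draws donors with replacement, optionally repeated m times so that averaging estimates across the completed vectors shrinks the added imputation variance (hypothetical function name):

```python
import numpy as np

def random_hot_deck(y, observed, m=2, seed=0):
    """Random hot deck: fill each nonrespondent with a respondent's value
    drawn with replacement; returns m completed copies of y."""
    rng = np.random.default_rng(seed)
    donors = y[observed]
    completed = np.tile(np.asarray(y, dtype=float), (m, 1))
    for k in range(m):
        completed[k, ~observed] = rng.choice(donors, size=(~observed).sum())
    return completed
```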

20.
"This paper gives a brief introduction to multiple imputation for handling non-response in surveys. We then describe a recently completed project in which multiple imputation was used to recalibrate industry and occupation codes in 1970 U.S. census public use samples to the 1980 standard. Using analyses of data from the project, we examine the utility of analysing a large data set having imputed values compared with analysing a small data set having true values, and we provide examples of the amount by which variability is underestimated by using just one imputation rather than multiple imputations."  相似文献   
