Similar literature
 20 similar articles found
1.
This paper considers the estimation of coefficients in a linear regression model with missing observations in the independent variables and introduces a modification of the standard first order regression method for imputation of missing values. The modification provides stochastic values for imputation and, as an extension, makes use of the principle of weighted mixed regression. The proposed procedures are compared with two popular procedures: one that utilizes only the complete observations and one that employs the standard first order regression imputation method for missing values. A simulation experiment is conducted to evaluate the gain in efficiency and to examine issues such as the impact of varying degrees of multicollinearity in the explanatory variables. Some work on the case of discrete regressor variables is in progress and will be reported in a future article.

2.
In this paper, a new power transformation estimator of the population mean in the presence of non-response has been suggested. The estimator of the mean obtained from the proposed technique remains better than the estimators obtained from the ratio or mean methods of imputation. The mean squared error (MSE) of the resultant estimator is less than that of the estimator obtained from the ratio method of imputation for the optimum choice of parameters. An estimator of a parameter involved in the new method of imputation has been discussed. The MSE expressions for the proposed estimators have been derived analytically and compared empirically. A product method of imputation for negatively correlated variables has also been introduced. The work has been extended to the case of multi-auxiliary information to be used for imputation.

3.
This article examines methods to efficiently estimate the mean response in a linear model with an unknown error distribution under the assumption that the responses are missing at random. We show how the asymptotic variance is affected by the estimator of the regression parameter and by the imputation method. To estimate the regression parameter, ordinary least squares is efficient only if the error distribution happens to be normal. If the errors are not normal, then we propose a one step improvement estimator or a maximum empirical likelihood estimator to efficiently estimate the parameter. To investigate the imputation’s impact on the estimation of the mean response, we compare the listwise deletion method and the propensity score method (which do not use imputation at all), and two imputation methods. We demonstrate that listwise deletion and the propensity score method are inefficient. Partial imputation, where only the missing responses are imputed, is compared to full imputation, where both missing and non-missing responses are imputed. Our results reveal that, in general, full imputation is better than partial imputation. However, when the regression parameter is estimated very poorly, partial imputation will outperform full imputation. The efficient estimator for the mean response is the full imputation estimator that utilizes an efficient estimator of the parameter.

4.
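This paper studies parameter estimation for linear regression models with missing skewed data. To overcome the distortion of the sample distribution and to improve the estimation of the regression coefficients, the scale parameter and the skewness parameter, a modified regression imputation method suited to missing data in linear regression models with skewed data is proposed. Simulation and a real-data example, with comparisons against mean imputation, regression imputation and stochastic regression imputation, show that the proposed modified regression imputation method is effective and feasible.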

5.
Sarjinder Singh, Statistics, 2013, 47(5): 499-511
In this paper, an alternative estimator of the population mean in the presence of non-response has been suggested, which comes in the form of Walsh's estimator. The estimator of the mean obtained from the proposed technique remains better than the estimators obtained from the ratio or mean methods of imputation. The mean-squared error (MSE) of the resultant estimator is less than that of the estimator obtained from the ratio method of imputation for the optimum choice of parameters. An estimator of a parameter involved in the new method of imputation has been discussed. A way to form a ‘warm deck’ method of imputation has been suggested. The MSE expressions for the proposed estimators have been derived analytically and compared empirically. The work has been extended to the case of multi-auxiliary information to be used for imputation. Numerical illustrations are also provided.

6.
The recently developed rolling year GEKS procedure makes maximum use of all matches in the data to construct nonrevisable price indexes that are approximately free from chain drift. A potential weakness is that unmatched items are ignored. In this article we use imputation Törnqvist price indexes as inputs into the rolling year GEKS procedure. These indexes account for quality changes by imputing the “missing prices” associated with new and disappearing items. Three imputation methods are discussed. The first method makes explicit imputations using a hedonic regression model which is estimated for each time period. The other two methods make implicit imputations; they are based on time dummy hedonic and time-product dummy regression models and are estimated on bilateral pooled data. We present empirical evidence for New Zealand from scanner data on eight consumer electronics products and find that accounting for quality change can make a substantial difference.

7.
We propose a multiple imputation method based on principal component analysis (PCA) to deal with incomplete continuous data. To reflect the uncertainty of the parameters from one imputation to the next, we use a Bayesian treatment of the PCA model. Using a simulation study and real data sets, the method is compared to two classical approaches: multiple imputation based on joint modelling and on fully conditional modelling. Contrary to the others, the proposed method can be easily used on data sets where the number of individuals is less than the number of variables and when the variables are highly correlated. In addition, it provides unbiased point estimates of quantities of interest, such as an expectation, a regression coefficient or a correlation coefficient, with a smaller mean squared error. Furthermore, the widths of the confidence intervals built for the quantities of interest are often smaller whilst ensuring a valid coverage.

8.
The sensitivity of multiple imputation methods to deviations from their distributional assumptions is investigated using simulations, where the parameters of scientific interest are the coefficients of a linear regression model, and values in predictor variables are missing at random. The performance of a newly proposed imputation method based on generalized additive models for location, scale, and shape (GAMLSS) is investigated. Although imputation methods based on predictive mean matching are virtually unbiased, they suffer from mild to moderate under-coverage, even in the experiment where all variables are jointly normally distributed. The GAMLSS method features better coverage than currently available methods.

9.
In this paper, we introduce a fresh methodology for imputing missing values by making use of sensible constraints on both a study variable and auxiliary variables that are correlated with the variable of interest. The resultant estimator based on these imputed values is shown to lead to the regression type method of imputation in survey sampling. Furthermore, when the data are a hybrid of values missing at random and values missing completely at random, the resultant estimator is shown to be a consistent estimator that has asymptotic mean squared error equal to that of the linear regression method of imputation. A generalization to any type of method of imputation is possible and has been included at the end.

10.
Consider estimation of a population mean of a response variable when the observations are missing at random with respect to the covariate. Two common approaches to imputing the missing values are the nonparametric regression weighting method and the Horvitz-Thompson (HT) inverse weighting approach. The regression approach includes kernel regression imputation and nearest neighbor imputation. The HT approach, employing inverse kernel-estimated weights, includes the basic estimator, the ratio estimator and the estimator using inverse kernel-weighted residuals. Asymptotic normality of the nearest neighbor imputation estimators is derived and compared to that of the kernel regression imputation estimator under standard regularity conditions on the regression function and the missing pattern function. A comprehensive simulation study shows that the basic HT estimator is most sensitive to discontinuity in the missing data patterns, and the nearest neighbor estimators can be insensitive to missing data patterns unbalanced with respect to the distribution of the covariate. Empirical studies show that the nearest neighbor imputation method is most effective among these imputation methods for estimating a finite population mean and for classifying the species of the iris flower data.

11.
For the first time, we propose a five-parameter lifetime model called the McDonald Weibull distribution to extend the Weibull, exponentiated Weibull, beta Weibull and Kumaraswamy Weibull distributions, among several other models. We obtain explicit expressions for the ordinary moments, quantile and generating functions, mean deviations and moments of the order statistics. We use the method of maximum likelihood to fit the new distribution and determine the observed information matrix. We define the log-McDonald Weibull regression model for censored data. The potential of the new model is illustrated by means of two real data sets.

12.
A general nonparametric imputation procedure, based on kernel regression, is proposed to estimate points as well as set- and function-indexed parameters when the data are missing at random (MAR). The proposed method works by imputing a specific function of a missing value (and not the missing value itself), where the form of this specific function is dictated by the parameter of interest. Both single and multiple imputations are considered. The associated empirical processes provide the right tool to study the uniform convergence properties of the resulting estimators. Our estimators include, as special cases, the imputation estimator of the mean, the estimator of the distribution function proposed by Cheng and Chu [1996. Kernel estimation of distribution functions and quantiles with missing data. Statist. Sinica 6, 63–78], imputation estimators of a marginal density, and imputation estimators of regression functions.

13.
Sequential regression multiple imputation has emerged as a popular approach for handling incomplete data with complex features. In this approach, imputations for each missing variable are produced based on a regression model using other variables as predictors in a cyclic manner. A normality assumption is frequently imposed on the error distributions in the conditional regression models for continuous variables, even though it rarely holds in real scenarios. We use a simulation study to investigate the performance of several sequential regression imputation methods when the error distribution is flat or heavy tailed. The methods evaluated include sequential normal imputation and several of its extensions which adjust for non-normal error terms. The results show that all methods perform well for estimating the marginal mean and proportion, as well as the regression coefficient, when the error distribution is flat or moderately heavy tailed. When the error distribution is strongly heavy tailed, all methods retain their good performance for the mean and the adjusted methods perform robustly for the proportion; but all methods can perform poorly for the regression coefficient because they cannot accommodate the extreme values well. We caution against the mechanical use of sequential regression imputation without model checking and diagnostics.

14.
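To study joint location and scale models with missing skewed data, an imputation method suited to joint modelling with missing skewed data, the modified stochastic regression imputation method, is proposed based on the characteristics of the distribution itself. The method adjusts the skewness parameter of the model under missing data very markedly. Simulation and a real-data example, with comparisons against regression imputation and stochastic regression imputation, show that the proposed modified stochastic regression imputation method is useful and effective.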

15.
Conditional expectation imputation and local-likelihood methods are contrasted with a midpoint imputation method for bivariate regression involving interval-censored responses. Although the methods can be extended in principle to higher order polynomials, our focus is on the local constant case. Comparisons are based on simulations of data scattered about three target functions with normally distributed errors. Two censoring mechanisms are considered: the first is analogous to current-status data in which monitoring times occur according to a homogeneous Poisson process; the second is analogous to a coarsening mechanism such as would arise when the response values are binned. We find that, according to a pointwise MSE criterion, no method dominates any other when interval sizes are fixed, but when the intervals have a variable width, the local-likelihood method often performs better than the other methods, and midpoint imputation performs the worst. Several illustrative examples are presented.

16.
Xiong Wei et al., Statistical Research (《统计研究》), 2020, 37(5): 104-116
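With the rapid development of computer technology, high-dimensional compositional data keep emerging, accompanied by large numbers of near-zero values and missing values. The high dimensionality not only poses great challenges to traditional statistical methods; its heavy-tailed features and complex covariance structure also make theoretical analysis even harder. How to robustly impute the near-zero values of high-dimensional compositional data and uncover the underlying intrinsic structure has therefore become a focus of current research. To this end, this paper combines a modified EM algorithm with a Lasso-quantile regression imputation method based on R-type clustering (SubLQR) to address the near-zero value problem in high-dimensional compositional data. Compared with existing imputation methods for high-dimensional near-zero values, the proposed SubLQR has the following advantages. (1) Robustness and comprehensiveness: the Lasso-quantile regression method not only explores the whole conditional distribution of the response variable effectively but also provides a more realistic high-dimensional sparse pattern. (2) Efficiency and accuracy: imputing on the basis of R-type clustering reduces computational complexity and greatly improves imputation accuracy. Simulation studies confirm that the proposed SubLQR is efficient, flexible and accurate, with particular advantages when zeros and outliers are numerous. Finally, SubLQR is applied to a metabolomics study of rare diseases, further showing that the proposed method is widely applicable.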

17.
Two-phase sampling is a cost-effective method of data collection using outcome-dependent sampling for the second-phase sample. In order to make efficient use of auxiliary information and to improve domain estimation, mass imputation can be used in two-phase sampling. Rao and Sitter (1995) introduce mass imputation for two-phase sampling and its variance estimation under simple random sampling in both phases. In this paper, we extend the Rao–Sitter method to general sampling design. The proposed method is further extended to mass imputation for categorical data. A limited simulation study is performed to examine the performance of the proposed methods.

18.
Estimating higher‐order moments, particularly fourth‐order moments in linear mixed models is an important, but difficult issue. In this article, an orthogonality‐based estimation of moments is proposed. Under only moment conditions, this method can easily be used to estimate the model parameters and moments, particularly those of higher order than the second order, and in the estimators the random effects and errors do not affect each other. The asymptotic normality of all the estimators is provided. Moreover, the method is readily extended to handle non‐linear, semiparametric and non‐linear models. A simulation study is carried out to examine the performance of the new method.

19.
Missing covariate data are common in biomedical studies. In this article, by using the nonparametric kernel regression technique, a new imputation approach is developed for the Cox proportional hazards regression model with missing covariates. This method achieves the same efficiency as the fully augmented weighted estimators (Qi et al. 2005. Journal of the American Statistical Association, 100:1250) and has a simpler form. The asymptotic properties of the proposed estimator are derived and analyzed. The proposed imputation method is compared with several other existing methods via a number of simulation studies and a mouse leukemia data set.

20.
We have compared the efficacy of five imputation algorithms readily available in SAS for the quadratic discriminant function. Here, we have generated several different parametric-configuration training data with missing data, including monotone missing-at-random observations, and used a Monte Carlo simulation to examine the expected probabilities of misclassification for the two-class quadratic statistical discrimination problem under five different imputation methods. Specifically, we have compared the efficacy of the complete observation-only method and the mean substitution, regression, predictive mean matching, propensity score, and Markov Chain Monte Carlo (MCMC) imputation methods. We found that the MCMC and propensity score multiple imputation approaches are, in general, superior to the other imputation methods for the configurations and training-sample sizes we considered.
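
Several entries in this list (for example 1, 4 and 14) contrast plain regression imputation with stochastic regression imputation. The difference can be sketched roughly as follows. This is only an illustrative sketch under simplifying assumptions, not the method of any listed paper: a single fully observed predictor `xs`, gaps marked as `None` in `ys`, normally distributed noise, and hypothetical helper names `fit_line` and `regression_impute`.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    # Residual standard deviation with n - 2 degrees of freedom,
    # so at least three complete pairs are needed.
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return a, b, sd


def regression_impute(xs, ys, stochastic=False, rng=None):
    """Fill None entries of ys from a line fitted on the complete pairs.

    With stochastic=False the fitted value is used as-is (plain regression
    imputation). With stochastic=True a normal draw with the residual
    standard deviation is added (assumed noise model for this sketch).
    """
    rng = rng or random.Random(0)
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line([x for x, _ in complete], [y for _, y in complete])
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            y = a + b * x
            if stochastic:
                y += rng.gauss(0.0, sd)
        filled.append(y)
    return filled


xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, None, 8.2, None, 12.1]
print(regression_impute(xs, ys))                   # fitted values only
print(regression_impute(xs, ys, stochastic=True))  # fitted values plus noise
```

Plain regression imputation places every imputed value exactly on the fitted line, which tends to understate the variability of the completed data; adding a residual draw is one common way to restore it.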
