Similar literature
20 similar articles found (search time: 15 ms)
1.
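This article studies robust estimation for semiparametric models based on quantile regression when the response variable is missing at random. First, using B-spline basis function approximation, the problem of estimating the model's nonparametric function is converted into one of estimating a vector of spline coefficients. Second, a new imputation method is proposed to multiply impute the missing responses. Third, based on the imputed data set, a new quantile objective function is constructed, yielding robust estimators of the nonparametric function and the parameter vector. Finally, an efficient algorithm is given for computing the multiple-imputation estimator. Simulation studies verify the effectiveness and robustness of the proposed method.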

2.
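For quantile regression models with data missing at random, the induced smoothing idea is used to construct smooth estimating equations, yielding an induced smoothing estimator of the regression parameters and an estimator of its asymptotic covariance. The asymptotic normality of the induced smoothing estimator is then proved, and an algorithm is given for computing the estimator and its asymptotic covariance estimator. Simulation studies show that the new method performs well in finite samples.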

3.
A kernel regression imputation method for missing response data is developed. A class of bias-corrected empirical log-likelihood ratios for the response mean is defined. It is shown that any member of our class of ratios is asymptotically chi-squared, and the corresponding empirical likelihood confidence interval for the response mean is constructed. Our ratios share some of the desired features of the existing methods: they are self-scale invariant and no plug-in estimators for the adjustment factor and asymptotic variance are needed; when estimating the non-parametric function in the model, undersmoothing to ensure root-n consistency of the estimator for the parameter is avoided. Since the range of bandwidths contains the optimal bandwidth for estimating the regression function, the existing data-driven algorithm is valid for selecting an optimal bandwidth. We also study the normal approximation-based method. A simulation study is undertaken to compare the empirical likelihood with the normal approximation method in terms of coverage accuracies and average lengths of confidence intervals.

4.
In this article, we consider a partially linear single-index model $Y = g(Z^{\tau}\theta_0) + X^{\tau}\beta_0 + \varepsilon$ when the covariate $X$ may be missing at random. We propose weighted estimators for the unknown parametric and nonparametric parts by applying weighted estimating equations. We establish normality of the estimators of the parameters and an asymptotic expansion for the estimator of the nonparametric part when the selection probabilities are unknown. Simulation studies are also conducted to illustrate the finite sample properties of these estimators.

5.
Linear increments (LI) are used to analyse repeated outcome data with missing values. Previously, two LI methods have been proposed, one allowing non‐monotone missingness but not independent measurement error and one allowing independent measurement error but only monotone missingness. In both, it was suggested that the expected increment could depend on current outcome. We show that LI can allow non‐monotone missingness and either independent measurement error of unknown variance or dependence of expected increment on current outcome but not both. A popular alternative to LI is a multivariate normal model ignoring the missingness pattern. This gives consistent estimation when data are normally distributed and missing at random (MAR). We clarify the relation between MAR and the assumptions of LI and show that for continuous outcomes multivariate normal estimators are also consistent under (non‐MAR and non‐normal) assumptions not much stronger than those of LI. Moreover, when missingness is non‐monotone, they are typically more efficient.

6.
This article considers statistical inference for partially linear varying-coefficient models when the responses are missing at random. We propose a profile least-squares estimator for the parametric component with complete-case data and show that the resulting estimator is asymptotically normal. To avoid estimating the asymptotic covariance when constructing a confidence region for the parametric component by the normal-approximation method, we define an empirical likelihood based statistic and show that its limiting distribution is chi-squared. Confidence regions for the parametric component with asymptotically correct coverage probabilities can then be constructed from this result. To check the validity of linear constraints on the parametric component, we construct a modified generalized likelihood ratio test statistic and demonstrate that it asymptotically follows a chi-squared distribution under the null hypothesis. This extends the generalized likelihood ratio technique to the context of missing data. Finally, some simulations are conducted to illustrate the proposed methods.

7.
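Random coefficient autoregressive models describe well how model coefficients vary over time and are therefore widely used. This article discusses parameter estimation for random coefficient autoregressive models with missing data and gives four estimation methods for the missing-data case: conditional least squares without imputation, mean imputation, conditional mean imputation, and bridge imputation. Finally, random simulations illustrate the accuracy of these estimation methods, and an application example is given.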

8.
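Statistical inference for linear regression models with missing skewed data   Total citations: 1, self-citations: 2, citations by others: 1
This article studies parameter estimation for linear regression models with missing skewed data. To overcome the distortion of the sample distribution and to improve the estimation of the model's regression coefficients, scale parameter, and skewness parameter, a modified regression imputation method suited to missing data in linear regression models with skewed data is proposed. Random simulations and a real-data example, with comparisons against mean imputation, regression imputation, and random regression imputation, show that the proposed modified regression imputation method is effective and feasible.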
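The three baseline methods this abstract compares against (mean imputation, regression imputation, and random regression imputation) can be sketched for a single covariate as follows. This is only an illustration of those baselines, not the modified method the article proposes, which the abstract does not specify. The function names `fit_line` and `impute` are hypothetical, and drawing the random term from the observed residuals is an assumption made here for simplicity.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def impute(xs, ys, method, rng=None):
    """Fill None entries of ys by 'mean', 'regression', or 'random_regression'."""
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    ox = [x for x, _ in obs]
    oy = [y for _, y in obs]
    if method == "mean":
        # Every missing response gets the mean of the observed responses.
        fill = statistics.fmean(oy)
        return [fill if y is None else y for y in ys]
    a, b = fit_line(ox, oy)  # fitted on complete cases only
    if method == "regression":
        # Every missing response gets its fitted value.
        return [a + b * x if y is None else y for x, y in zip(xs, ys)]
    if method == "random_regression":
        # Fitted value plus a residual drawn from the observed residuals
        # (an assumption of this sketch; a normal draw is another common choice).
        rng = rng or random.Random(0)
        resid = [y - (a + b * x) for x, y in obs]
        return [a + b * x + rng.choice(resid) if y is None else y
                for x, y in zip(xs, ys)]
    raise ValueError(f"unknown method: {method}")
```

For example, with `xs = [1, 2, 3, 4, 5]` and `ys = [2, 4, 6, 8, None]`, mean imputation fills the gap with the observed mean, while regression imputation extrapolates along the fitted line, which is why the two can differ sharply when the missing case lies far from the centre of the covariate.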

9.
This article is concerned with partially nonlinear models when the response variables are missing at random. We examine the empirical likelihood (EL) ratio statistics for the unknown parameter in the nonlinear function based on complete-case data, semiparametric regression imputation, and bias-corrected imputation. All the proposed statistics are proven to be asymptotically chi-square distributed under some suitable conditions. Simulation experiments are conducted to compare the finite sample behaviors of the proposed approaches in terms of confidence intervals. The results show that the EL method has an advantage over the conventional method and, moreover, that the imputation techniques perform better than the complete-case approach.

10.
In this paper, a semi-parametric regression model is considered where responses are assumed to be missing at random. From the empirical likelihood function defined based on the rank-based estimating equation, robust confidence intervals/regions of the true regression coefficient are derived. Monte Carlo simulation experiments show that the proposed approach provides more accurate confidence intervals/regions than its normal approximation counterpart under different model error structures. The approach is also compared with the least squares approach, and its superiority is shown whenever the error distribution in the simulation study is heavy tailed or contaminated. Finally, a real data example is given to illustrate our proposed method.

11.
Empirical Likelihood-based Inference in Linear Models with Missing Data   Total citations: 18, self-citations: 0, citations by others: 18
The missing response problem in linear regression is studied. An adjusted empirical likelihood approach to inference on the mean of the response variable is developed. A non-parametric version of Wilks's theorem for the adjusted empirical likelihood is proved, and the corresponding empirical likelihood confidence interval for the mean is constructed. With auxiliary information, an empirical likelihood-based estimator with asymptotic normality is defined and an adjusted empirical log-likelihood function with an asymptotic $\chi^2$ distribution is derived. A simulation study is conducted to compare the adjusted empirical likelihood methods and the normal approximation methods in terms of coverage accuracies and average lengths of the confidence intervals. Based on biases and standard errors, a comparison is also made between the empirical likelihood-based estimator and related estimators by simulation. Our simulation indicates that the adjusted empirical likelihood methods perform competitively and the use of auxiliary information provides improved inferences.

12.
Quantile regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are subject to missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrate our proposed methodologies.
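The core IPW idea in this abstract can be illustrated in the simplest case, an intercept-only quantile model, where minimising the weighted check loss reduces to a weighted sample quantile. This is a sketch under strong assumptions: the response probabilities `probs` are taken as known (the article estimates them by maximum likelihood or semiparametric likelihood), there are no covariates, and the EL calibration step is not shown. The function name `ipw_quantile` is hypothetical.

```python
def ipw_quantile(ys, observed, probs, tau):
    """IPW estimate of the tau-quantile from partially observed responses.

    Minimises sum_i (observed_i / probs_i) * rho_tau(y_i - q) over q, where
    rho_tau is the quantile check loss. A minimiser is the weighted quantile
    of the observed y_i with weights 1 / probs_i.
    """
    # Keep only observed cases, each weighted by its inverse response probability.
    pairs = sorted((y, 1.0 / p) for y, d, p in zip(ys, observed, probs) if d)
    total = sum(w for _, w in pairs)
    cum = 0.0
    for y, w in pairs:
        cum += w
        # First point at which the cumulative weight reaches tau of the total.
        if cum >= tau * total:
            return y
    return pairs[-1][0]
```

With responses `[1, 2, 3, None]`, the last one missing, and response probabilities `[1, 1, 0.25, 0.25]`, the observed value 3 stands in for four units, so the weighted median moves from 2 (the unweighted median of the observed values) to 3. This is the bias correction that the weighting is meant to supply when large responses are less likely to be observed.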

13.
In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementations of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluations of existing implementations have frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual – an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect.

14.
Suppose that we have a nonparametric regression model $Y = m(X) + \varepsilon$ with $X \in \mathbb{R}^p$, where $X$ is a random design variable and is observed completely, and $Y$ is the response variable and some $Y$-values are missing at random. Based on the “complete” data sets for $Y$ after nonparametric regression imputation and inverse probability weighted imputation, two estimators of the regression function $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$ are proposed. Asymptotic normality of the two estimators is established, which is used to construct normal approximation-based confidence intervals for $m(x_0)$. We also construct an empirical likelihood (EL) statistic for $m(x_0)$ with limiting distribution $\chi^2_1$, which is used to construct an EL confidence interval for $m(x_0)$.

15.
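Missing value imputation for categorical data based on the random forest model   Total citations: 6, self-citations: 1, citations by others: 6
Missing data are an important factor affecting the quality of survey questionnaire data, and imputing the missing values in a questionnaire can markedly improve data quality. Questionnaire data are mostly categorical; classification algorithms from data mining are a common way to handle problems with categorical attributes, and the random forest model is among the more accurate of these algorithms. This article introduces the random forest model into the imputation of missing questionnaire data, proposes a random-forest-based method for imputing missing values in categorical data, and discusses the corresponding imputation steps for different missingness patterns. An empirical simulation comparison with other methods shows that the values imputed by the random forest method are more accurate and more reliable.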

16.
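Using the empirical likelihood method, this article discusses confidence regions for the parameters of generalized linear models with missing data and shows that the log empirical likelihood ratio statistic is asymptotically standard chi-squared. Some estimators of the parameters and their asymptotic distributions are given, and simulated data are used to illustrate the proposed method.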

17.
This paper investigates the estimation of regression parameters and the response mean in nonlinear regression models when responses are missing with probabilities that depend on covariates. We propose four empirical likelihood (EL)-based estimators for the regression parameters and the response mean. The resulting estimators are shown to be consistent and asymptotically normal under some general assumptions. To construct the confidence regions for the regression parameters as well as the response mean, we develop four EL ratio statistics, which are proven to have the $\chi^2$ distribution asymptotically. Simulation studies and an artificial data set are used to illustrate the proposed methodologies. Empirical results show that the EL method behaves better than the normal approximation method and that the coverage probabilities and average lengths depend on the selection probability function.

18.
This article addresses issues in creating public-use data files in the presence of missing ordinal responses and subsequent statistical analyses of the dataset by users. The authors propose a fully efficient fractional imputation (FI) procedure for ordinal responses with missing observations. The proposed imputation strategy retrieves the missing values through the full conditional distribution of the response given the covariates and results in a single imputed data file that can be analyzed by different data users with different scientific objectives. The two most critical aspects of statistical analyses based on the imputed data set, validity and efficiency, are examined through regression analysis involving the ordinal response and a selected set of covariates. It is shown through both theoretical development and simulation studies that, when the ordinal responses are missing at random, the proposed FI procedure leads to valid and highly efficient inferences as compared to existing methods. Variance estimation using the fractionally imputed data set is also discussed. The Canadian Journal of Statistics 48: 138–151; 2020 © 2019 Statistical Society of Canada

19.
In some randomized (drug versus placebo) clinical trials, the estimand of interest is the between‐treatment difference in population means of a clinical endpoint that is free from the confounding effects of “rescue” medication (e.g., HbA1c change from baseline at 24 weeks that would be observed without rescue medication regardless of whether or when the assigned treatment was discontinued). In such settings, a missing data problem arises if some patients prematurely discontinue from the trial or initiate rescue medication while in the trial, the latter necessitating the discarding of post‐rescue data. We caution that the commonly used mixed‐effects model repeated measures analysis with the embedded missing at random assumption can deliver an exaggerated estimate of the aforementioned estimand of interest. This happens, in part, due to implicit imputation of an overly optimistic mean for “dropouts” (i.e., patients with missing endpoint data of interest) in the drug arm. We propose an alternative approach in which the missing mean for the drug arm dropouts is explicitly replaced with either the estimated mean of the entire endpoint distribution under placebo (primary analysis) or a sequence of increasingly more conservative means within a tipping point framework (sensitivity analysis); patient‐level imputation is not required. A supplemental “dropout = failure” analysis is considered in which a common poor outcome is imputed for all dropouts followed by a between‐treatment comparison using quantile regression. All analyses address the same estimand and can adjust for baseline covariates. Three examples and simulation results are used to support our recommendations.
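One plausible reading of the mean-replacement idea in this abstract is a mixture of means: the drug-arm mean is a weighted average of the completers' mean and a mean assigned to the dropouts. The sketch below shows only that arithmetic. It assumes the placebo mean is already estimated, ignores covariate adjustment and variance estimation, and uses hypothetical function names and a hypothetical `shift` parameter for the tipping-point sequence; the article's actual estimators may differ.

```python
def drug_arm_mean(completer_mean, n_completers, n_dropouts, dropout_mean):
    """Weighted average of the completers' mean and the mean assigned to dropouts."""
    n = n_completers + n_dropouts
    return (n_completers * completer_mean + n_dropouts * dropout_mean) / n

def treatment_difference(completer_mean, n_completers, n_dropouts,
                         placebo_mean, shift=0.0):
    """Drug minus placebo when drug-arm dropouts are assigned placebo_mean + shift.

    shift = 0 corresponds to replacing the dropout mean with the placebo mean;
    increasing shifts give progressively more conservative assumed means
    (here a lower endpoint value, such as HbA1c change, is taken to be better).
    """
    assigned = placebo_mean + shift
    drug = drug_arm_mean(completer_mean, n_completers, n_dropouts, assigned)
    return drug - placebo_mean

def tipping_point_scan(completer_mean, n_completers, n_dropouts,
                       placebo_mean, shifts):
    """Treatment difference for each candidate shift, in the order given."""
    return [(s, treatment_difference(completer_mean, n_completers,
                                     n_dropouts, placebo_mean, s))
            for s in shifts]
```

For instance, with 80 completers whose mean change is -1.0, 20 dropouts, and a placebo mean of -0.2, the difference shrinks in magnitude from -0.8 (completers only) to about -0.64, which illustrates how an analysis that implicitly credits dropouts with the completers' mean can overstate the effect.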

20.