Similar literature
 20 similar articles found (search time: 15 ms)
1.
Asymptotic properties of M-estimators with complete data have been investigated extensively. In the presence of missing data, however, the standard inference procedures for complete data cannot be applied directly. In this article, the inverse probability weighted method is applied to the missing response problem to define M-estimators. The existence of M-estimators is established under very general regularity conditions. Consistency and asymptotic normality of the M-estimators are proved. An iterative algorithm is used to compute the M-estimators. It is shown that one iteration suffices: the resulting one-step M-estimator has the same limit distribution as the fully iterated M-estimator.
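The one-step idea in this abstract can be illustrated with a small sketch. The code below is not the authors' algorithm. It assumes a location parameter, a Huber score function with unit scale, known observation probabilities, and a user-supplied starting value; the names `huber_psi` and `ipw_one_step_location` are hypothetical. It performs a single Newton step on the inverse-probability-weighted estimating equation.

```python
from typing import Optional, Sequence


def huber_psi(r: float, c: float) -> float:
    """Huber score function: identity on [-c, c], clipped outside."""
    return max(-c, min(c, r))


def ipw_one_step_location(
    y: Sequence[Optional[float]],
    pi: Sequence[float],
    theta0: float,
    c: float = 1.345,
) -> float:
    """One Newton step for sum_i (delta_i / pi_i) * psi(y_i - theta) = 0.

    y[i] is None when the response is missing (delta_i = 0).
    pi[i] is the assumed-known probability that y[i] is observed.
    theta0 is an initial estimate (hypothetical input, e.g. a weighted median).
    """
    num = 0.0  # weighted sum of psi(residual)
    den = 0.0  # weighted sum of psi'(residual)
    for yi, pi_i in zip(y, pi):
        if yi is None:
            continue  # missing responses contribute nothing
        w = 1.0 / pi_i
        r = yi - theta0
        num += w * huber_psi(r, c)
        den += w * (1.0 if abs(r) <= c else 0.0)
    if den == 0.0:
        raise ValueError("no residual falls inside [-c, c]; choose another theta0 or c")
    return theta0 + num / den
```

With a very large `c` the Huber score reduces to the identity, and a single step from any starting value returns the weighted mean of the observed responses, which makes the sketch easy to check by hand.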

2.
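邰凌楠 et al., 《统计研究》 (Statistical Research), 2018, 35(9): 115-128
Missing data are pervasive in applied research. Under the missing at random assumption, this paper takes a model-inference perspective and proposes a new, efficient estimation method for linear quantile regression models with missing data: the inverse probability multiple weighting (IPMW) estimator. Building on the inverse probability weighting (IPW) estimator, the method combines propensity score matching with model averaging and obtains the final parameter estimates by weighting the results of repeated estimations. It applies when the responses are independent and identically distributed or independent but not identically distributed, and it suits the vast majority of missing-data scenarios. Theoretical derivation and simulation studies show that the IPMW estimator inherits the advantages of the IPW estimator while being more robust. Finally, the method is applied to micro-level survey data with missing values to study the factors that influence the consumption level of the middle-income group in economically more developed quasi-first-tier cities. A comparison of the estimates and confidence bands from the two methods shows that the IPMW estimator has a smaller standard deviation and gives more robust estimates.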

3.
We propose a new weighting (WT) method to handle missing categorical outcomes in longitudinal data analysis using generalized estimating equations (GEE). The proposed WT provides a valid GEE estimator when the data are missing at random (MAR). It has more stable weights and is more efficient than the inverse probability weighting method when some observation probabilities are small. The WT estimator is similar to the stabilized weighting (SWT) estimator under mild conditions, but it is more stable and efficient than SWT when the associations of the outcome with the observation probabilities and the covariate are strong.

4.
When data are missing, analyzing only the records that are completely observed may cause bias or inefficiency. Existing approaches to handling missing data include likelihood, imputation and inverse probability weighting. In this paper, we propose three estimators inspired by deleting some completely observed data in the regression setting. First, we generate artificial observation indicators that are independent of the outcome given the observed data and draw inferences conditional on the artificial observation indicators. Second, we propose a closely related weighting method. The proposed weighting method has more stable weights than those of the inverse probability weighting method (Zhao, L., Lipsitz, S., 1992. Designs and analysis of two-stage studies. Statistics in Medicine 11, 769–782). Third, we improve the efficiency of the proposed weighting estimator by subtracting the projection of the estimating function onto the nuisance tangent space. When data are missing completely at random, we show that the proposed estimators have asymptotic variances smaller than or equal to the variance of the estimator obtained from completely observed records only. Asymptotic relative efficiency computations and simulation studies indicate that the proposed weighting estimators are more efficient than the inverse probability weighting estimators in a wide range of practical situations, especially when the missingness proportion is large.

5.
Crossover designs are often used in clinical trials. It is not uncommon for subjects to discontinue before completing all treatment periods in a crossover study. Statistical methodologies and software that use all available data and give valid inferences under the assumption of missing at random (MAR) are available. Nevertheless, naïve approaches such as the complete case (CC) analysis, which is valid only under the strong assumption of missing completely at random, are still widely used in practice. In this article, we obtain the analytical form of the estimation bias of treatment effects with CC for linear mixed models. We use simulation studies to examine the inflation of Type I error and the efficiency loss in inferences with CC under MAR. Simulation studies also demonstrate the invalidity and inefficiency of two other commonly used approaches for defining the analyzed data in the presence of missing data: including data from at least two periods in a three-period crossover, and using the available cases for a specific comparison of interest.

6.
We propose a class of estimators for the population mean when there are missing data in the data set. Obtaining the mean square error equations of the proposed estimators, we show the conditions where the proposed estimators are more efficient than the sample mean, ratio-type estimators, and the estimators in Singh and Horn (2000. Compromised imputation in survey sampling. Metrika 51: 267–276) and Singh and Deo (2003. Imputation by power transformation. Statist. Pap. 44: 555–579) in the case of missing data. These conditions are also supported by a numerical example.

7.
Suppose that we have a nonparametric regression model $Y = m(X) + \varepsilon$ with $X \in \mathbb{R}^p$, where $X$ is a random design variable and is observed completely, and $Y$ is the response variable and some $Y$-values are missing at random. Based on the "complete" data sets for $Y$ after nonparametric regression imputation and inverse probability weighted imputation, two estimators of the regression function $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$ are proposed. Asymptotic normality of the two estimators is established, which is used to construct normal approximation-based confidence intervals for $m(x_0)$. We also construct an empirical likelihood (EL) statistic for $m(x_0)$ with limiting distribution $\chi^2_1$, which is used to construct an EL confidence interval for $m(x_0)$.

8.
In this article, we consider the order estimation of autoregressive models with incomplete data using the expectation–maximization (EM) algorithm-based information criteria. The criteria take the form of a penalization of the conditional expectation of the log-likelihood. The evaluation of the penalization term generally involves numerical differentiation and matrix inversion. We introduce a simplification of the penalization term for autoregressive model selection and we propose a penalty factor based on a resampling procedure in the criteria formula. The simulation results show the improvements yielded by the proposed method when compared with the classical information criteria for model selection with incomplete data.

9.
We propose inverse probability weighted estimators for the local average treatment effect (LATE) and the local average treatment effect for the treated (LATT) under instrumental variable assumptions with covariates. We show that these estimators are asymptotically normal and efficient. When the (binary) instrument satisfies one-sided noncompliance, we propose a Durbin–Wu–Hausman-type test of whether treatment assignment is unconfounded conditional on some observables. The test is based on the fact that under one-sided noncompliance LATT coincides with the average treatment effect for the treated (ATT). We conduct Monte Carlo simulations to demonstrate, among other things, that part of the theoretical efficiency gain afforded by unconfoundedness in estimating ATT survives pretesting. We illustrate the implementation of the test on data from training programs administered under the Job Training Partnership Act in the United States. This article has online supplementary material.

10.
Inverse Gaussian first hitting time regression models sometimes provide an attractive representation of lifetime data. Various authors comment that dependence of both parameters on the same covariate may imply multicollinearity. The frequent appearance of conflicting signs for the two coefficients of the same covariate may be related to this. We carry out simulation studies to examine the reality of this possible multicollinearity. Although there is some dependence between estimates, multicollinearity does not seem to be a major problem. Fitting this model to data generated by a Weibull regression suggests that conflicting signs of estimates may be due to model misspecification.

11.
Inverse probability weighting (IPW) and multiple imputation are two widely adopted approaches to dealing with missing data. The former models the selection probability, and the latter models the data distribution. Consistent estimation requires correct specification of the corresponding model. Although the augmented IPW method provides an extra layer of protection for consistency, it is usually not sufficient in practice because the true data-generating process is unknown. This paper proposes a method that combines the two approaches in the spirit of calibration in the survey sampling literature. Multiple models for both the selection probability and the data distribution can be accounted for simultaneously, and the resulting estimator is consistent if any one of the models is correctly specified. The proposed method is within the framework of estimating equations and is general enough to cover regression analysis with missing outcomes and/or missing covariates. Both theoretical and numerical results are provided.
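For orientation, the augmented IPW estimator mentioned in this abstract can be sketched for the simplest target, the mean of a partially missing outcome. This is the standard textbook form, not the calibration-based estimator the paper proposes. The function name `aipw_mean` is hypothetical, and the selection probabilities `pi` and outcome-model predictions `m` are assumed to have been fitted elsewhere.

```python
from typing import Optional, Sequence


def aipw_mean(
    y: Sequence[Optional[float]],
    pi: Sequence[float],
    m: Sequence[float],
) -> float:
    """Augmented IPW estimate of E[Y] with outcomes missing at random.

    y[i]  : observed outcome, or None if missing (delta_i = 0)
    pi[i] : modelled probability that y[i] is observed (selection model)
    m[i]  : modelled prediction of E[Y | X_i] (outcome model)

    Each term is delta*y/pi - (delta - pi)/pi * m, which stays consistent
    if either the selection model or the outcome model is correct.
    """
    n = len(y)
    if not (n == len(pi) == len(m)) or n == 0:
        raise ValueError("y, pi and m must be non-empty and of equal length")
    total = 0.0
    for yi, pi_i, mi in zip(y, pi, m):
        delta = 0.0 if yi is None else 1.0
        ipw_term = (yi / pi_i) if yi is not None else 0.0
        augmentation = (delta - pi_i) / pi_i * mi
        total += ipw_term - augmentation
    return total / n
```

The augmentation term is what gives protection against misspecifying one of the two models; the abstract's point is that a single pair of models may still both be wrong, which motivates accommodating several candidate models at once.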

12.
This article is concerned with the estimation of a varying-coefficient regression model when the response variable is sometimes missing and some of the covariates are measured with additive errors. We propose a class of estimators for the coefficient functions, as well as for the population mean and the error variance. The resulting estimators are shown to be asymptotically normal. Simulation studies are conducted to illustrate our approach.

13.

Let Y be a response and, given a covariate X, let Y have a conditional density f(y | x, θ), where θ is an unknown p-dimensional vector of parameters and the marginal distribution of X is unknown. When responses are missing at random, with auxiliary information and imputation, we define an adjusted empirical log-likelihood ratio for the mean of Y and obtain its asymptotic distribution. A simulation study is conducted to compare the adjusted empirical log-likelihood and the normal approximation method in terms of coverage accuracy.

14.
In two-stage randomization designs, patients are randomized to one or more available therapies upon entry into the study. Depending on the response to the initial treatment (such as complete remission or shrinkage of tumor), patients are then randomized to receive maintenance treatments to maintain the response or salvage treatment to induce response. One goal of such trials is to compare the combinations of initial and maintenance or salvage therapies in the form of treatment strategies. In cases where the endpoint is defined as overall survival, Lunceford et al. [2002. Estimation of survival distributions of treatment policies in two-stage randomization designs in clinical trials. Biometrics 58, 48–57] used mean survival time and pointwise survival probability to compare treatment strategies. However, mean survival time or survival probability at a specific time may not be a good summary of the overall distribution when the data are skewed or contain influential tail observations. In this article, we propose consistent and asymptotically normal estimators for percentiles of survival curves under various treatment strategies and demonstrate the use of percentiles for comparing treatment strategies. Small sample properties of these estimators are investigated using simulation. We demonstrate our methods by applying them to a leukemia clinical trial data set that motivated this research.

15.
This article considers statistical inference for the heteroscedastic partially linear varying coefficient models. We construct an efficient estimator for the parametric component by applying the weighted profile least-squares approach, and show that it is semiparametrically efficient in the sense that the inverse of the asymptotic variance of the estimator reaches the semiparametric efficiency bound. Simulation studies are conducted to illustrate the performance of the proposed method.

16.
Abstract. Theory on semi-parametric efficient estimation in missing data problems has been systematically developed by Robins and his coauthors. Except in relatively simple problems, semi-parametric efficient scores cannot be expressed in closed form. Instead, the efficient scores are often expressed as solutions to integral equations. In those situations, the Neumann series was proposed as a successive approximation to the efficient scores. Statistical properties of the estimator based on the Neumann series approximation are difficult to obtain and, as a result, have not been clearly studied. In this paper, we reformulate the successive approximation in a simple iterative form and study the statistical properties of the estimator based on the reformulation. We show that a doubly robust locally efficient estimator can be obtained by following the algorithm in robustifying the likelihood score. The results can be applied to, among others, parametric regression, marginal regression and Cox regression when data are subject to missing values and the data are missing at random. A simulation study is conducted to evaluate the performance of the approach and a real data example is analysed to demonstrate the use of the approach.

17.
Econometric Reviews, 2013, 32(3): 229-257
Abstract

We obtain semiparametric efficiency bounds for estimation of a location parameter in a time series model where the innovations are stationary and ergodic conditionally symmetric martingale differences but otherwise possess general dependence and distributions of unknown form. We then describe an iterative estimator that achieves this bound when the conditional density functions of the sample are known. Finally, we develop a "semi-adaptive" estimator that achieves the bound when these densities are unknown to the investigator. This estimator employs nonparametric kernel estimates of the densities. Monte Carlo results are reported.

18.
This article considers a discrete-time Markov chain for modeling transition probabilities when multiple successive observations are missing at random between two observed outcomes, using three methods: a naïve analog of complete-case analysis that uses the observed one-step transitions alone, a non-data-augmentation method (NL) that solves nonlinear equations, and a data-augmentation method, the Expectation-Maximization (EM) algorithm. The explicit form of the conditional log-likelihood given the observed information, as required by the E step, is provided, and the iterative formula in the M step is expressed in closed form. An empirical study was performed to examine the accuracy and precision of the estimates obtained by the three methods under the ignorable missing mechanisms of missing completely at random and missing at random. A dataset from the mental health arena was used for illustration. It was found that both the data-augmentation and non-augmentation methods provide accurate and precise point estimation, and that the naïve method resulted in estimates of the transition probabilities with similar bias but larger MSE. The NL method and the EM algorithm in general provide similar results, whereas the latter provides conditional expected row margins leading to smaller standard errors.
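The naïve method in this abstract, which uses only the observed one-step transitions, can be sketched as below. This is an illustration under assumptions, not the article's code: states are taken to be integers `0..n_states-1`, missing observations are encoded as `None`, and the function name `naive_transition_matrix` is hypothetical. The NL and EM methods, which also use information from pairs separated by gaps, are not shown.

```python
from typing import List, Optional, Sequence


def naive_transition_matrix(
    sequences: Sequence[Sequence[Optional[int]]],
    n_states: int,
) -> List[List[float]]:
    """Estimate one-step transition probabilities from adjacent observed pairs.

    Any pair (t, t+1) in which either observation is missing (None) is
    discarded, which mirrors a complete-case analysis of transitions.
    Rows with no observed outgoing transition are returned as NaN.
    """
    counts = [[0 for _ in range(n_states)] for _ in range(n_states)]
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            if current is None or following is None:
                continue  # skip transitions that touch a missing value
            counts[current][following] += 1

    matrix: List[List[float]] = []
    for row in counts:
        total = sum(row)
        if total == 0:
            matrix.append([float("nan")] * n_states)
        else:
            matrix.append([c / total for c in row])
    return matrix
```

For example, the sequence `[0, 1, None, 1, 0, 0]` contributes only the transitions 0→1, 1→0 and 0→0; the two pairs that touch the missing value are dropped, which is the information loss the abstract associates with the larger MSE of the naïve method.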

20.
Abstract. In the presence of missing covariates, standard model validation procedures may result in misleading conclusions. By building generalized score statistics on augmented inverse probability weighted complete-case estimating equations, we develop a new model validation procedure to assess the adequacy of a prescribed analysis model when covariate data are missing at random. The asymptotic distribution and local alternative efficiency for the test are investigated. Under certain conditions, our approach provides not only valid but also asymptotically optimal results. A simulation study for both linear and logistic regression illustrates the applicability and finite sample performance of the methodology. Our method is also employed to analyse a coronary artery disease diagnostic dataset.
