Similar Documents
20 similar documents found.
1.
This paper develops a novel weighted composite quantile regression (CQR) method for estimating a linear model when some covariates are missing at random and the missingness mechanism can be modelled parametrically. By incorporating the unbiased estimating equations of the incomplete data into an empirical likelihood (EL), we obtain EL-based weights and then re-adjust the inverse probability weighted CQR for estimating the vector of regression coefficients. Theoretical results show that the proposed method achieves semiparametric efficiency if the selection probability function is correctly specified, so the EL-weighted CQR is more efficient than the inverse probability weighted CQR. Moreover, the algorithm is computationally simple and easy to implement. Simulation studies examine the finite-sample performance of the proposed procedures. Finally, we apply the new method to analyse the US News college data.
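
A minimal sketch (not the authors' implementation) of the inverse probability weighted CQR step in Python: complete cases are weighted by the inverse of their observation probabilities, and one common slope is fitted jointly across a grid of quantile levels. The `ipw_cqr` helper and the toy data-generating choices are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def check_loss(u, tau):
    # quantile check loss rho_tau(u) = u * (tau - 1{u < 0})
    return u * (tau - (u < 0))

def ipw_cqr(X, y, pi, taus=(0.25, 0.5, 0.75)):
    """Weighted CQR: one slope vector, one intercept per quantile level."""
    n, p = X.shape
    K = len(taus)
    tau_row = np.asarray(taus)[None, :]
    def objective(theta):
        b, beta = theta[:K], theta[K:]
        resid = y[:, None] - X @ beta[:, None] - b[None, :]   # (n, K)
        return np.sum(check_loss(resid, tau_row) / pi[:, None])
    res = minimize(objective, np.zeros(K + p), method="Nelder-Mead",
                   options={"maxiter": 50000, "fatol": 1e-8})
    return res.x[K:]                                          # slope only

# toy example: true slope 2; the covariate is observed with a probability
# that depends on the (always observed) response, i.e. missing at random
n = 400
x = rng.normal(size=(n, 1))
y = 1.0 + 2.0 * x[:, 0] + rng.standard_t(df=3, size=n)
pi = 1.0 / (1.0 + np.exp(-(0.5 + 0.5 * y)))   # selection probability model
obs = rng.uniform(size=n) < pi                # covariate observed?
print("IPW-CQR slope:", ipw_cqr(x[obs], y[obs], pi[obs]))
```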

2.
Quantile regression (QR) is a popular approach to estimating functional relations between variables across all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics: regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve good efficiency properties, we develop an empirical likelihood (EL) approach to QR with auxiliary information from calibration constraints. The proposed methods are less sensitive to misspecified missingness mechanisms. Asymptotic properties of the proposed IPW estimators are established under general settings, and the efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitations of injured workers from Canada illustrate the proposed methodologies.

3.
We study nonparametric estimation of the illness-death model using left-truncated and right-censored data. The general aim is to estimate the multivariate distribution of a progressive multi-state process. Maximum likelihood estimation under censoring suffers from problems of uniqueness and consistency, so instead we review and extend methods based on inverse probability weighting. For univariate left-truncated and right-censored data, nonparametric maximum likelihood estimation can be improved considerably by exploiting knowledge of the truncation distribution, and we examine the gain from using such knowledge for inverse probability weighting estimators in the illness-death framework. Additionally, we compare weights that use the truncation variables with weights that integrate them out, showing by simulation that the latter perform more stably and efficiently. We apply the methods to intensive care unit data collected under a cross-sectional design, and discuss how the estimators can easily be modified for more general multi-state models.

4.
We propose a new weighting (WT) method to handle missing categorical outcomes in longitudinal data analysis using generalized estimating equations (GEE). The proposed WT provides a valid GEE estimator when the data are missing at random (MAR); it has more stable weights and shows an efficiency advantage over the inverse probability weighting method in the presence of small observation probabilities. The WT estimator is similar to the stabilized weighting (SWT) estimator under mild conditions, but it is more stable and efficient than SWT when the outcome is strongly associated with the observation probabilities and the covariate.

5.
Assessing dose response from flexible-dose clinical trials is problematic. The true dose effect may be obscured, and even reversed, in observed data because dose is related to both previous and subsequent outcomes. To remove this selection bias, we propose a marginal structural model (MSM) estimated by inverse probability of treatment weighting (IPTW). Potential clinical outcomes are compared across dose groups using an MSM based on a weighted pooled repeated-measures analysis (generalized estimating equations with robust standard errors), with the dose effect represented by current dose and recent dose history, and with weights estimated from the data (via logistic regression) as products of (i) the inverse probability of receiving the dose assignments actually received and (ii) the inverse probability of remaining on treatment by this time. In simulations, this method led to almost unbiased estimates of the true dose effect under various scenarios. Results were compared with those obtained by unweighted analyses and by weighted analyses under various model specifications. The simulation also showed that the IPTW MSM methodology is highly sensitive to model misspecification even when the weights are known, so practitioners should be cautious about the challenges of implementing MSMs with real clinical data. Clinical trial data are used to illustrate the methodology. Copyright © 2012 John Wiley & Sons, Ltd.
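
As a concrete illustration of this weight construction, the hedged sketch below builds per-visit IPTW weights as cumulative products of (i) the inverse probability of the dose actually received and (ii) the inverse probability of remaining on treatment, both from logistic regressions. The column names and the single-covariate treatment model are assumptions for illustration, not the trial's actual specification.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def iptw_weights(df):
    """df: one row per subject-visit, sorted by (id, visit), with columns
    id, visit, high_dose (0/1 dose actually received), stayed (0/1 still
    on treatment at the next visit) and prev_outcome (previous response)."""
    X = sm.add_constant(df[["prev_outcome", "visit"]])
    p_dose = sm.Logit(df["high_dose"], X).fit(disp=0).predict(X)
    p_stay = sm.Logit(df["stayed"], X).fit(disp=0).predict(X)
    # (i) probability of the dose assignment actually received
    pr_received = np.where(df["high_dose"] == 1, p_dose, 1.0 - p_dose)
    # (ii) probability of remaining on treatment at this visit
    df = df.assign(w_visit=1.0 / (pr_received * p_stay))
    # final weight: product over the subject's dose/retention history
    df["w"] = df.groupby("id")["w_visit"].cumprod()
    return df
```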

6.
We consider logistic regression with covariate measurement error. Most existing approaches require replicates of the error-contaminated covariates, which may not be available in the data. We propose generalized method of moments (GMM) nonparametric correction approaches that use instrumental variables observed in a calibration subsample. The instrumental variable is related to the underlying true covariates through a general nonparametric model, and the probability of being in the calibration subsample may depend on the observed variables. We first take a simple approach adopting the inverse selection probability weighting technique on the calibration subsample, and then improve it with a GMM approach that uses the whole sample. Asymptotic properties are derived, and the finite-sample performance is evaluated through simulation studies and an application to a real data set.

7.
Inverse probability weighting (IPW) and multiple imputation are two widely adopted approaches to missing data: the former models the selection probability, the latter the data distribution, and consistent estimation requires correct specification of the corresponding model. Although the augmented IPW method provides an extra layer of protection for consistency, it is usually not sufficient in practice because the true data-generating process is unknown. This paper proposes a method combining the two approaches in the spirit of calibration from the survey sampling literature. Multiple models for both the selection probability and the data distribution can be accounted for simultaneously, and the resulting estimator is consistent if any one model is correctly specified. The proposed method lies within the framework of estimating equations and is general enough to cover regression analysis with missing outcomes and/or missing covariates. Both theoretical and numerical results are provided.

8.
When data are missing, analyzing only the completely observed records may cause bias or inefficiency. Existing approaches to handling missing data include likelihood, imputation and inverse probability weighting. In this paper, we propose three estimators inspired by deleting some completely observed data in the regression setting. First, we generate artificial observation indicators that are independent of the outcome given the observed data, and draw inferences conditioning on these artificial indicators. Second, we propose a closely related weighting method whose weights are more stable than those of the inverse probability weighting method (Zhao, L., Lipsitz, S., 1992. Designs and analysis of two-stage studies. Statistics in Medicine 11, 769–782). Third, we improve the efficiency of the proposed weighting estimator by subtracting the projection of the estimating function onto the nuisance tangent space. When data are missing completely at random, we show that the proposed estimators have asymptotic variances smaller than or equal to that of the estimator based on completely observed records only. Asymptotic relative efficiency computations and simulation studies indicate that the proposed weighting estimators are more efficient than the inverse probability weighting estimators under a wide range of practical situations, especially when the missingness proportion is large.

9.
To estimate parameters defined by estimating equations with covariates missing at random, we consider three bias-corrected nonparametric approaches based on inverse probability weighting, regression, and augmented inverse probability weighting. When the dimension of the covariates is not low, however, estimation efficiency suffers from the curse of dimensionality. To address this issue, we propose a two-stage estimation procedure that uses dimension-reduced kernel estimation in conjunction with bias-corrected estimating equations. We show that the three resulting estimators are asymptotically equivalent and achieve the desired properties. The impact of dimension reduction on nonparametric estimation of the parameters is also investigated. The finite-sample performance of the proposed estimators is studied through simulation, and an application to an automobile data set is presented.

10.
Clustered longitudinal data feature cross-sectional associations within clusters, serial dependence within subjects, and associations between responses at different time points from different subjects within the same cluster. Generalized estimating equations are often used for inference with data of this sort since they do not require full specification of the response model. When data are incomplete, however, they require the data to be missing completely at random unless inverse probability weights are introduced based on a model for the missing-data process. The authors propose a robust approach for incomplete clustered longitudinal data using composite likelihood; specifically, pairwise likelihood methods are described for conducting robust estimation with minimal model assumptions. The authors also show that the resulting estimates remain valid for a wide variety of missing-data problems, including missing-at-random mechanisms, in which case there is no need to model the missing-data process. In addition to describing the asymptotic properties of the resulting estimators, the method is shown empirically, through simulation studies, to perform well for complete and incomplete data. Pairwise likelihood estimators are also compared with estimators obtained from inverse probability weighted alternating logistic regression. An application to data from the Waterloo Smoking Prevention Project is provided for illustration. The Canadian Journal of Statistics 39: 34–51; 2011 © 2010 Statistical Society of Canada

11.
Survival functions are often estimated by nonparametric estimators such as the Kaplan-Meier estimator. For valid estimation, proper adjustment for confounding factors is needed when treatment assignment may depend on them. Inverse probability weighting is a commonly used approach, especially when there is a large number of potential confounders to adjust for; direct adjustment may also be used if the relationship between the time-to-event and all confounders can be modeled. Either approach, however, requires a correctly specified model, for the relationship between the confounders and treatment allocation or between the confounders and the time-to-event. We propose a pseudo-observation-based doubly robust estimator, which is valid when either the treatment allocation model or the time-to-event model is correctly specified, and which is generally more efficient than the inverse probability weighting approach. The approach can be implemented easily using standard software. A simulation study evaluating the approach under a number of scenarios confirms its robustness and efficiency, and a real data example is provided for illustration.
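
To make the pseudo-observation idea concrete, the generic sketch below computes jackknife pseudo-values of the survival probability at a fixed time from the Kaplan-Meier estimator (the `lifelines` package is assumed available); in a doubly robust analysis these pseudo-values would then be combined with a fitted treatment-allocation model. This is a standard textbook construction, not the authors' implementation.

```python
import numpy as np
from lifelines import KaplanMeierFitter

def km_pseudo_obs(time, event, t_star):
    """Jackknife pseudo-observations of S(t_star):
    n * S_hat(t_star) - (n - 1) * S_hat^(-i)(t_star).
    time, event: numpy arrays of follow-up times and event indicators."""
    n = len(time)
    s_full = KaplanMeierFitter().fit(time, event).predict(t_star)
    pseudo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # leave subject i out
        s_loo = KaplanMeierFitter().fit(time[keep], event[keep]).predict(t_star)
        pseudo[i] = n * s_full - (n - 1) * s_loo
    return pseudo
```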

12.
The main purpose of this paper is to introduce a new family of empirical test statistics for testing a simple null hypothesis when the vector of parameters of interest is defined through a specific set of unbiased estimating functions. This family of test statistics is based on a distance between two probability vectors: the first obtained by maximizing the empirical likelihood (EL) over the vector of parameters, and the second defined from the fixed vector of parameters under the simple null hypothesis. The distance considered for this purpose is the phi-divergence measure. The asymptotic distribution is derived for this family of test statistics. The proposed methodology is illustrated through Newcomb's well-known measurements of the passage time of light. A simulation study compares its performance with that of the EL ratio test when confidence intervals are constructed from the respective statistics for small sample sizes. The results suggest that the 'empirical modified likelihood ratio test statistic' provides a competitive alternative to the EL ratio test statistic and is more robust in the presence of contamination in the data. Finally, we propose empirical phi-divergence test statistics for testing a composite null hypothesis and present asymptotic as well as simulation results evaluating the performance of these test procedures.

13.
EMPIRICAL LIKELIHOOD-BASED KERNEL DENSITY ESTIMATION
This paper considers the estimation of a probability density function when extra distributional information is available (e.g. the mean of the distribution is known, or the variance is a known function of the mean). The standard kernel method cannot exploit such extra information systematically, as it uses an equal probability weight of 1/n at each data point. The paper suggests using empirical likelihood to choose the probability weights under constraints formulated from the extra distributional information. An empirical likelihood-based kernel density estimator is obtained by replacing 1/n with the empirical likelihood weights, and has these advantages: it makes systematic use of the extra information, it reflects the extra characteristics of the density function, and its variance is smaller than that of the standard kernel density estimator.
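
A minimal sketch of this construction for the known-mean case (an illustrative assumption): the EL weights maximise the product of the w_i subject to sum w_i = 1 and sum w_i (x_i - mu) = 0, which gives w_i = 1 / {n (1 + lambda (x_i - mu))} with lambda solved from the score equation below; the kernel estimator's equal weight 1/n is then replaced by w_i. The sketch assumes mu lies strictly inside the sample range.

```python
import numpy as np
from scipy.optimize import brentq

def el_weights(x, mu):
    """Empirical likelihood weights under the known-mean constraint."""
    g = x - mu                                   # estimating function values
    score = lambda lam: np.sum(g / (1.0 + lam * g))
    lo = (-1.0 + 1e-10) / g.max()                # keep all 1 + lam*g_i > 0
    hi = (-1.0 + 1e-10) / g.min()
    lam = brentq(score, lo, hi)
    return 1.0 / (len(x) * (1.0 + lam * g))

def el_kde(x, mu, grid, h):
    w = el_weights(x, mu)                        # replaces the equal weights 1/n
    u = (grid[:, None] - x[None, :]) / h         # Gaussian kernel below
    return (w[None, :] * np.exp(-0.5 * u ** 2)).sum(axis=1) / (h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
x = rng.exponential(size=200)                    # distribution with known mean 1
f_hat = el_kde(x, mu=1.0, grid=np.linspace(0, 5, 101), h=0.3)
```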

14.
We propose a new model for regression and dependence analysis for spatial data with possibly heavy tails and an asymmetric marginal distribution. We first propose a stationary process with t marginals, obtained through scale mixing of a Gaussian process with an inverse square root process with Gamma marginals. We then generalize this construction by considering a skew-Gaussian process, obtaining a process with skew-t marginal distributions. For the proposed (skew) t process we study the second-order and geometrical properties, and in the t case we provide analytic expressions for the bivariate distribution. In an extensive simulation study, we investigate the use of the weighted pairwise likelihood as a method of estimation for the t process, and we compare the performance of the optimal linear predictor of the t process with that of the optimal Gaussian predictor. Finally, the effectiveness of our methodology is illustrated by analyzing a georeferenced dataset of maximum temperatures in Australia.
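
The scale-mixing construction is easy to sketch: a Gaussian process realisation is divided by the square root of an independent Gamma variable scaled so the result has Student-t marginals. For brevity the sketch below uses a single mixing variable per realisation, whereas the paper mixes with an inverse square root *process*, so treat this only as an illustration of how t marginals arise; the covariance choice is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
nu = 5                                              # degrees of freedom
s = np.linspace(0.0, 10.0, 200)                     # 1-D spatial sites
C = np.exp(-np.abs(s[:, None] - s[None, :]))        # exponential covariance
L = np.linalg.cholesky(C + 1e-10 * np.eye(len(s)))  # jitter for stability
z = L @ rng.standard_normal(len(s))                 # Gaussian process draw
g = rng.gamma(shape=nu / 2.0, scale=2.0 / nu)       # nu * g ~ chi-squared_nu
t_field = z / np.sqrt(g)                            # Student-t_nu marginals
```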

15.
The authors consider the empirical likelihood method for the regression model of mean quality-adjusted lifetime with right censoring. They show that an empirical log-likelihood ratio for the vector of regression parameters is asymptotically a weighted sum of independent chi-squared random variables. They adjust this empirical log-likelihood ratio so that the limiting distribution is a standard chi-square, and construct corresponding confidence regions. Simulation studies lead them to conclude that empirical likelihood methods outperform normal approximation methods in terms of coverage probability. They illustrate their methods with data from a breast cancer clinical trial study.

16.
Often in observational studies of time to an event, the study population is a biased (i.e., unrepresentative) sample of the target population. In the presence of biased samples, it is common to weight subjects by the inverse of their respective selection probabilities. Pan and Schaubel (Can J Stat 36:111–127, 2008) recently proposed inference procedures for an inverse selection probability weighted (ISPW) Cox model, applicable when selection probabilities are not treated as fixed but estimated empirically. The weighting procedure requires auxiliary data to estimate the weights and is computationally more intensive than unweighted estimation. The ignorability of the sample selection process for parameter estimators and predictions is therefore often of interest, from several perspectives: for example, to determine whether weighting makes a significant difference to the analysis at hand, which would in turn indicate whether the collection of auxiliary data is required in future studies; or to evaluate previous studies that did not correct for selection bias. In this article, we propose methods to quantify the degree of bias corrected by the weighting procedure in the partial likelihood and Breslow-Aalen estimators. Asymptotic properties of the proposed test statistics are derived, and their finite-sample significance level and power are evaluated through simulation. The proposed methods are then applied to data from a national organ failure registry to evaluate the bias in a post-kidney-transplant survival model.

17.
Stress testing a correlation matrix is a challenging exercise in portfolio risk management. Most existing methods directly modify the estimated correlation matrix to satisfy stress conditions while maintaining positive semidefiniteness; the focus lies on technical optimization issues, and the resulting stressed correlation matrices usually lack statistical interpretation. In this article, we suggest a novel approach using the empirical likelihood method to modify the probability weights of the sample observations so as to construct a stressed correlation matrix. The resulting correlations correspond to the stress scenario nearest to the observed scenario in the Kullback-Leibler divergence sense. Besides offering a clearer statistical interpretation, the proposed method is nonparametric in distribution, simple in computation, and free from subjective tuning. We illustrate the method through an application to a portfolio of international assets.
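
A hedged sketch of the reweighting idea: choose observation weights nearest to uniform in Kullback-Leibler divergence subject to the weighted correlation hitting a stressed target. The bivariate setting, the `stressed_weights` helper and the general-purpose SLSQP solver are illustrative choices, not the article's algorithm.

```python
import numpy as np
from scipy.optimize import minimize

def stressed_weights(x, y, rho_target):
    """Weights minimising KL(w || uniform) s.t. the weighted correlation
    of (x, y) equals rho_target and the weights sum to one."""
    n = len(x)
    def kl(w):                                   # sum w_i log(n w_i)
        return np.sum(w * np.log(n * w))
    def w_corr(w):
        mx, my = w @ x, w @ y
        cov = w @ ((x - mx) * (y - my))
        return cov / np.sqrt((w @ (x - mx) ** 2) * (w @ (y - my) ** 2))
    cons = ({"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
            {"type": "eq", "fun": lambda w: w_corr(w) - rho_target})
    res = minimize(kl, np.full(n, 1.0 / n), method="SLSQP",
                   bounds=[(1e-8, 1.0)] * n, constraints=cons)
    return res.x

rng = np.random.default_rng(3)
x = rng.standard_normal(150)
y = 0.3 * x + rng.standard_normal(150)       # sample correlation near 0.3
w = stressed_weights(x, y, rho_target=0.5)   # stress the correlation to 0.5
```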

18.
Efficient statistical inference with nonignorable missing data is a challenging problem. This paper proposes a new estimation procedure based on composite quantile regression (CQR) for linear regression models with nonignorable missing data that is applicable even with high-dimensional covariates. A parametric model is assumed for the response probability, which is estimated by the empirical likelihood approach, and local identifiability of the proposed strategy is guaranteed via an instrumental variable approach. A set of data-based adaptive weights constructed by the empirical likelihood method is used to weight the CQR functions, making the proposed method resistant to heavy-tailed errors and outliers in the response. An adaptive penalisation method for variable selection is proposed to achieve sparsity with high-dimensional covariates. Limiting distributions of the proposed estimators are derived, simulation studies investigate the finite-sample performance of the proposed methodologies, and an application to the ACTG 175 data is presented.

19.
Multiple imputation is now a well-established technique for analysing data sets in which some units have incomplete observations. Provided that the imputation model is correct, the resulting estimates are consistent. An alternative, weighting by the inverse probability of observing complete data on a unit, is conceptually simple and involves fewer modelling assumptions, but it is known to be both inefficient (relative to a fully parametric approach) and sensitive to the choice of weighting model. Over the last decade, a considerable body of theoretical work has sought to improve the performance of inverse probability weighting, leading to the development of 'doubly robust' or 'doubly protected' estimators. We present an intuitive review of these developments and contrast these estimators with multiple imputation from both a theoretical and a practical viewpoint.
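
For a mean with missing responses, the doubly robust construction can be written in a few lines. The sketch below uses illustrative working models (a logistic missingness model and a linear outcome model, each in a single covariate); the resulting augmented IPW estimator is consistent if either working model is correct.

```python
import numpy as np
import statsmodels.api as sm

def aipw_mean(x, y, r):
    """Doubly robust (AIPW) estimate of E[y]; y may be NaN where r == 0."""
    X = sm.add_constant(x)
    pi = sm.Logit(r, X).fit(disp=0).predict(X)          # P(observed | x)
    m = sm.OLS(y[r == 1], X[r == 1]).fit().predict(X)   # outcome model E[y | x]
    y0 = np.where(r == 1, y, 0.0)                       # NaNs never used
    return np.mean(r * y0 / pi + (1.0 - r / pi) * m)

rng = np.random.default_rng(4)
n = 2000
x = rng.standard_normal(n)
y = 1.0 + x + rng.standard_normal(n)
r = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-(0.3 + x)))).astype(float)
y_mis = np.where(r == 1, y, np.nan)                     # responses go missing
print("AIPW estimate of E[y]:", aipw_mean(x, y_mis, r)) # near the true value 1
```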

20.
For estimation with missing data, a crucial step is to determine whether the data are missing completely at random (MCAR), in which case a complete-case analysis would suffice. Most existing tests for MCAR do not provide a method for subsequent estimation once MCAR is rejected. In the setting of estimating means, we propose a unified approach for testing MCAR and for the subsequent estimation: upon rejecting MCAR, the same set of weights used for testing can be used for estimation. The resulting estimators are consistent if the missingness of each response variable depends only on a set of fully observed auxiliary variables and the true outcome regression model is among the user-specified functions for deriving the weights. The proposed method is based on the calibration idea from the survey sampling literature and the empirical likelihood theory.
