期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Improved kth power expectile regression with nonignorable dropouts

Dongyu Li Lei Wang 《Journal of applied statistics》2022,49(11):2767

The kth (

1 < k \leq 2

) power expectile regression (ER) can balance robustness and effectiveness between the ordinary quantile regression and ER simultaneously. Motivated by a longitudinal ACTG 193A data with nonignorable dropouts, we propose a two-stage estimation procedure and statistical inference methods based on the kth power ER and empirical likelihood to accommodate both the within-subject correlations and nonignorable dropouts. Firstly, we construct the bias-corrected generalized estimating equations by combining the kth power ER and inverse probability weighting approaches. Subsequently, the generalized method of moments is utilized to estimate the parameters in the nonignorable dropout propensity based on sufficient instrumental estimating equations. Secondly, in order to incorporate the within-subject correlations under an informative working correlation structure, we borrow the idea of quadratic inference function to obtain the improved empirical likelihood procedures. The asymptotic properties of the corresponding estimators and their confidence regions are derived. The finite-sample performance of the proposed estimators is studied through simulation and an application to the ACTG 193A data is also presented. 相似文献

2.

Efficient inverse probability weighting method for quantile regression with nonignorable missing data

Pu-Ying Zhao De-Peng Jiang 《Statistics》2017,51(2):363-386

Quantitle regression (QR) is a popular approach to estimate functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are subject to missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve nice efficiency properties, we develop an empirical likelihood (EL) approach to QR with the auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrated our proposed methodologies. 相似文献

3.

Xiao Song Ching‐Yun Wang 《Scandinavian Journal of Statistics》2019,46(3):898-919

We consider logistic regression with covariate measurement error. Most existing approaches require certain replicates of the error‐contaminated covariates, which may not be available in the data. We propose generalized method of moments (GMM) nonparametric correction approaches that use instrumental variables observed in a calibration subsample. The instrumental variable is related to the underlying true covariates through a general nonparametric model, and the probability of being in the calibration subsample may depend on the observed variables. We first take a simple approach adopting the inverse selection probability weighting technique using the calibration subsample. We then improve the approach based on the GMM using the whole sample. The asymptotic properties are derived, and the finite sample performance is evaluated through simulation studies and an application to a real data set. 相似文献

4.

A Model Validation Procedure when Covariate Data are Missing at Random

LEI JIN SUOJIN WANG 《Scandinavian Journal of Statistics》2010,37(3):403-421

Abstract. In the presence of missing covariates, standard model validation procedures may result in misleading conclusions. By building generalized score statistics on augmented inverse probability weighted complete‐case estimating equations, we develop a new model validation procedure to assess the adequacy of a prescribed analysis model when covariate data are missing at random. The asymptotic distribution and local alternative efficiency for the test are investigated. Under certain conditions, our approach provides not only valid but also asymptotically optimal results. A simulation study for both linear and logistic regression illustrates the applicability and finite sample performance of the methodology. Our method is also employed to analyse a coronary artery disease diagnostic dataset. 相似文献

5.

Biao Zhang 《Statistics》2016,50(5):1173-1194

Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Am Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study. 相似文献

6.

《Journal of nonparametric statistics》2012,24(4):990-1002

相似文献

7.

《Scandinavian Journal of Statistics》2018,45(1):87-109

In survival analysis, covariate measurements often contain missing observations; ignoring this feature can lead to invalid inference. We propose a class of weighted estimating equations for right‐censored data with missing covariates under semiparametric transformation models. Time‐specific and subject‐specific weights are accommodated in the formulation of the weighted estimating equations. We establish unified results for estimating missingness probabilities that cover both parametric and non‐parametric modelling schemes. To improve estimation efficiency, the weighted estimating equations are augmented by a new set of unbiased estimating equations. The resultant estimator has the so‐called ‘double robustness’ property and is optimal within a class of consistent estimators. 相似文献

8.

An R package for model fitting,model selection and the simulation for longitudinal data with dropout missingness

Cong Xu Zheng Li Yuan Xue Lijun Zhang 《统计学通讯:模拟与计算》2013,42(9):2812-2829

Abstract

Missing data arise frequently in clinical and epidemiological fields, in particular in longitudinal studies. This paper describes the core features of an R package wgeesel, which implements marginal model fitting (i.e., weighted generalized estimating equations, WGEE; doubly robust GEE) for longitudinal data with dropouts under the assumption of missing at random. More importantly, this package comprehensively provide existing information criteria for WGEE model selection on marginal mean or correlation structures. Also, it can serve as a valuable tool for simulating longitudinal data with missing outcomes. Lastly, a real data example and simulations are presented to illustrate and validate our package. 相似文献

9.

Semiparametric inference for estimating equations with nonignorably missing covariates

Ji Chen Fang Fang 《Journal of nonparametric statistics》2018,30(3):796-812

We consider statistical inference of unknown parameters in estimating equations (EEs) when some covariates have nonignorably missing values, which is quite common in practice but has rarely been discussed in the literature. When an instrument, a fully observed covariate vector that helps identifying parameters under nonignorable missingness, is available, the conditional distribution of the missing covariates given other covariates can be estimated by the pseudolikelihood method of Zhao and Shao [(2015), ‘Semiparametric pseudo likelihoods in generalised linear models with nonignorable missing data’, Journal of the American Statistical Association, 110, 1577–1590)] and be used to construct unbiased EEs. These modified EEs then constitute a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of EEs used in practice. It is semiparametric since no parametric model for the propensity of missing covariate data is assumed. Asymptotic properties of the proposed estimator and the empirical likelihood ratio test statistic are derived. Some simulation results and a real data analysis are presented for illustration. 相似文献

10.

Peisong Han 《Scandinavian Journal of Statistics》2016,43(1):246-260

Inverse probability weighting (IPW) and multiple imputation are two widely adopted approaches dealing with missing data. The former models the selection probability, and the latter models data distribution. Consistent estimation requires correct specification of corresponding models. Although the augmented IPW method provides an extra layer of protection on consistency, it is usually not sufficient in practice as the true data‐generating process is unknown. This paper proposes a method combining the two approaches in the same spirit of calibration in sampling survey literature. Multiple models for both the selection probability and data distribution can be simultaneously accounted for, and the resulting estimator is consistent if any model is correctly specified. The proposed method is within the framework of estimating equations and is general enough to cover regression analysis with missing outcomes and/or missing covariates. Results on both theoretical and numerical investigation are provided. 相似文献

11.

A comparison of various software tools for dealing with missing data via imputation

《Journal of Statistical Computation and Simulation》2012,82(11):1653-1675

In real-life situations, we often encounter data sets containing missing observations. Statistical methods that address missingness have been extensively studied in recent years. One of the more popular approaches involves imputation of the missing values prior to the analysis, thereby rendering the data complete. Imputation broadly encompasses an entire scope of techniques that have been developed to make inferences about incomplete data, ranging from very simple strategies (e.g. mean imputation) to more advanced approaches that require estimation, for instance, of posterior distributions using Markov chain Monte Carlo methods. Additional complexity arises when the number of missingness patterns increases and/or when both categorical and continuous random variables are involved. Implementation of routines, procedures, or packages capable of generating imputations for incomplete data are now widely available. We review some of these in the context of a motivating example, as well as in a simulation study, under two missingness mechanisms (missing at random and missing not at random). Thus far, evaluation of existing implementations have frequently centred on the resulting parameter estimates of the prescribed model of interest after imputing the missing data. In some situations, however, interest may very well be on the quality of the imputed values at the level of the individual – an issue that has received relatively little attention. In this paper, we focus on the latter to provide further insight about the performance of the different routines, procedures, and packages in this respect. 相似文献

12.

Donglin Guo Liugen Xue 《统计学通讯:理论与方法》2018,47(17):4160-4169

In this article, based on the covariate balancing propensity score (CBPS), estimators for the regression coefficients and the population mean are obtained, when the responses of linear models are missing at random. It is proved that the proposed estimators are asymptotically normal. In simulation studies and real example, the proposed estimators show improved performance relative to usual augmented inverse probability weighted estimators. 相似文献

13.

On inference for Kendall's τ within a longitudinal data setting

Yan Ma 《Journal of applied statistics》2012,39(11):2441-2452

Kendall's τ is a non-parametric measure of correlation based on ranks and is used in a wide range of research disciplines. Although methods are available for making inference about Kendall's τ, none has been extended to modeling multiple Kendall's τs arising in longitudinal data analysis. Compounding this problem is the pervasive issue of missing data in such study designs. In this article, we develop a novel approach to provide inference about Kendall's τ within a longitudinal study setting under both complete and missing data. The proposed approach is illustrated with simulated data and applied to an HIV prevention study. 相似文献

14.

Evaluating Statistical Hypotheses Using Weakly‐Identifiable Estimating Functions

GUANQUN CAO DAVID TODEM LIJIAN YANG JASON P. FINE 《Scandinavian Journal of Statistics》2013,40(2):256-273

Abstract. Many statistical models arising in applications contain non‐ and weakly‐identified parameters. Due to identifiability concerns, tests concerning the parameters of interest may not be able to use conventional theories and it may not be clear how to assess statistical significance. This paper extends the literature by developing a testing procedure that can be used to evaluate hypotheses under non‐ and weakly‐identifiable semiparametric models. The test statistic is constructed from a general estimating function of a finite dimensional parameter model representing the population characteristics of interest, but other characteristics which may be described by infinite dimensional parameters, and viewed as nuisance, are left completely unspecified. We derive the limiting distribution of this statistic and propose theoretically justified resampling approaches to approximate its asymptotic distribution. The methodology's practical utility is illustrated in simulations and an analysis of quality‐of‐life outcomes from a longitudinal study on breast cancer. 相似文献

15.

Bias from the use of generalized estimating equations to analyze incomplete longitudinal binary data

Andrew J. Copas Shaun R. Seaman 《Journal of applied statistics》2010,37(6):911-922

Patient dropout is a common problem in studies that collect repeated binary measurements. Generalized estimating equations (GEE) are often used to analyze such data. The dropout mechanism may be plausibly missing at random (MAR), i.e. unrelated to future measurements given covariates and past measurements. In this case, various authors have recommended weighted GEE with weights based on an assumed dropout model, or an imputation approach, or a doubly robust approach based on weighting and imputation. These approaches provide asymptotically unbiased inference, provided the dropout or imputation model (as appropriate) is correctly specified. Other authors have suggested that, provided the working correlation structure is correctly specified, GEE using an improved estimator of the correlation parameters (‘modified GEE’) show minimal bias. These modified GEE have not been thoroughly examined. In this paper, we study the asymptotic bias under MAR dropout of these modified GEE, the standard GEE, and also GEE using the true correlation. We demonstrate that all three methods are biased in general. The modified GEE may be preferred to the standard GEE and are subject to only minimal bias in many MAR scenarios but in others are substantially biased. Hence, we recommend the modified GEE be used with caution. 相似文献

16.

Abdul-Karim Iddrisu Freedom Gumedze 《Journal of applied statistics》2019,46(4):754-769

In this paper, we investigate the effect of tuberculosis pericarditis (TBP) treatment on CD4 count changes over time and draw inferences in the presence of missing data. We accounted for missing data and conducted sensitivity analyses to assess whether inferences under missing at random (MAR) assumption are sensitive to not missing at random (NMAR) assumptions using the selection model (SeM) framework. We conducted sensitivity analysis using the local influence approach and stress-testing analysis. Our analyses showed that the inferences from the MAR are robust to the NMAR assumption and influential subjects do not overturn the study conclusions about treatment effects and the dropout mechanism. Therefore, the missing CD4 count measurements are likely to be MAR. The results also revealed that TBP treatment does not interact with HIV/AIDS treatment and that TBP treatment has no significant effect on CD4 count changes over time. Although the methods considered were applied to data in the IMPI trial setting, the methods can also be applied to clinical trials with similar settings. 相似文献

17.

Haocheng Li Yukun Zhang 《Australian & New Zealand Journal of Statistics》2018,60(3):394-404

In longitudinal data, missing observations occur commonly with incomplete responses and covariates. Missing data can have a ‘missing not at random’ mechanism, a non‐monotone missing pattern, and moreover response and covariates can be missing not simultaneously. To avoid complexities in both modelling and computation, a two‐stage estimation method and a pairwise‐likelihood method are proposed. The two‐stage estimation method enjoys simplicities in computation, but incurs more severe efficiency loss. On the other hand, the pairwise approach leads to estimators with better efficiency, but can be cumbersome in computation. In this paper, we develop a compromise method using a hybrid pairwise‐likelihood framework. Our proposed approach has better efficiency than the two‐stage method, but its computational cost is still reasonable compared to the pairwise approach. The performance of the methods is evaluated empirically by means of simulation studies. Our methods are used to analyse longitudinal data obtained from the National Population Health Study. 相似文献

18.

Ian R. White 《Research Synthesis Methods》2023,14(1):137-139

The paper by Lee and Beretvas (doi: 10.1002/jrsm.1585 ) described a well-executed simulation study comparing ‘modern’ with ‘ad hoc’ methods for performing meta-regression when some covariates are incomplete. However, they drew practical conclusions after simulating data under a single missing data mechanism which favoured the ‘modern’ methods, while other missing data mechanisms would have favoured the ‘ad hoc’ methods. Broad recommendations about methods to use in practice should instead be based on simulation studies using a range of plausible data-generating mechanisms. This range must represent what is believed likely to occur in practice, and not what is convenient for statistical analysis. 相似文献

19.

Empirical likelihood for single index model with missing covariates at random

Xu Guo Yiping Yang Wangli Xu 《Statistics》2015,49(3):588-601

In this paper, we investigate the empirical-likelihood-based inference for the construction of confidence intervals and regions of the parameters of interest in single index models with missing covariates at random. An augmented inverse probability weighted-type empirical likelihood ratio for the parameters of interest is defined such that this ratio is asymptotically standard chi-squared. Our approach is to directly calibrate the empirical log-likelihood ratio, and does not need multiplication by an adjustment factor for the original ratio. Our bias-corrected empirical likelihood is self-scale invariant and no plug-in estimator for the limiting variance is needed. Some simulation studies are carried out to assess the performance of our proposed method. 相似文献

20.

Augmented inverse probability weighted fractional imputation in quantile regression

Hao Cheng 《Pharmaceutical statistics》2021,20(1):25-38

By employing all the observed information and the optimal augmentation term, we propose an augmented inverse probability weighted fractional imputation method (AFI) to handle covariates missing at random in quantile regression. Compared with the existing completely case analysis, inverse probability weighting, multiple imputation and fractional imputation based on quantile regression model with missing covarites, we carry out simulation study to investigate its performance in estimation accuracy and efficiency, computational efficiency and estimation robustness. We also talk about the influence of imputation replicates in our AFI. Finally, we apply our methodology to part of the National Health and Nutrition Examination Survey data. 相似文献