Similar literature
 20 similar documents found (search time: 359 ms)
1.
Analyses of randomised trials are often based on regression models which adjust for baseline covariates, in addition to randomised group. Based on such models, one can obtain estimates of the marginal mean outcome for the population under assignment to each treatment, by averaging the model‐based predictions across the empirical distribution of the baseline covariates in the trial. We identify under what conditions such estimates are consistent, and in particular show that for canonical generalised linear models, the resulting estimates are always consistent. We show that a recently proposed variance estimator underestimates the variance of the estimator around the true marginal population mean when the baseline covariates are not fixed in repeated sampling and provide a simple adjustment to remedy this. We also describe an alternative semiparametric estimator, which is consistent even when the outcome regression model used is misspecified. The different estimators are compared through simulations and application to a recently conducted trial in asthma.
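
The averaging step described in this abstract can be illustrated with a minimal sketch. It assumes a linear model with identity link (one canonical generalised linear model), a single baseline covariate, and noise-free toy data. The helper names `solve`, `fit_ols` and `marginal_mean` are hypothetical and are not taken from the paper, and the paper's variance adjustment is not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least squares fit via the normal equations (X'X) beta = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def marginal_mean(beta, covariates, arm):
    """Average the model predictions over the observed covariates,
    setting every subject's treatment to `arm`."""
    preds = [beta[0] + beta[1] * arm + beta[2] * x for x in covariates]
    return sum(preds) / len(preds)


# Toy trial (assumed data): outcome = 1 + 2 * arm + 3 * x, with no noise.
arm = [0, 1, 0, 1, 0, 1, 0, 1]
x = [0.2, 0.5, 1.1, 1.4, 2.0, 2.3, 3.1, 3.4]
y = [1.0 + 2.0 * a + 3.0 * xi for a, xi in zip(arm, x)]

beta = fit_ols([[1.0, a, xi] for a, xi in zip(arm, x)], y)
mu0 = marginal_mean(beta, x, 0)  # estimated marginal mean under control
mu1 = marginal_mean(beta, x, 1)  # estimated marginal mean under treatment
print(mu0, mu1, mu1 - mu0)
```

Both marginal means are averaged over the covariates of all subjects, not only those randomised to the arm in question; this is what makes them estimates of the mean outcome for the whole trial population under each assignment.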

2.
Li G, Wu TT 《Statistica Sinica》2010,20(4):1581-1607
In this article we study a semiparametric additive risks model (McKeague and Sasieni (1994)) for two-stage design survival data where accurate information is available only on second stage subjects, a subset of the first stage study. We derive two-stage estimators by combining data from both stages. Large sample inferences are developed. As a by-product, we also obtain asymptotic properties of the single stage estimators of McKeague and Sasieni (1994) when the semiparametric additive risks model is misspecified. The proposed two-stage estimators are shown to be asymptotically more efficient than the second stage estimators. They also demonstrate smaller bias and variance for finite samples. The developed methods are illustrated using small intestine cancer data from the SEER (Surveillance, Epidemiology, and End Results) Program.

3.
Motivated by a recent tuberculosis (TB) study, this paper is concerned with covariates missing not at random (MNAR) and models the potential intracluster correlation by a frailty. We consider the regression analysis of right‐censored event times from clustered subjects under a Cox proportional hazards frailty model and present the semiparametric maximum likelihood estimator (SPMLE) of the model parameters. An easy‐to‐implement pseudo‐SPMLE is then proposed to accommodate more realistic situations using readily available supplementary information on the missing covariates. Algorithms are provided to compute the estimators and their consistent variance estimators. We demonstrate that both the SPMLE and the pseudo‐SPMLE are consistent and asymptotically normal, using arguments based on the theory of modern empirical processes. The proposed approach is examined numerically via simulation and illustrated with an analysis of the motivating TB study data.

4.
We propose correcting for non-compliance in randomized trials by estimating the parameters of a class of semi-parametric failure time models, the rank preserving structural failure time models, using a class of rank estimators. These models are the structural or strong version of the “accelerated failure time model with time-dependent covariates” of Cox and Oakes (1984). In this paper we develop a large sample theory for these estimators, derive the optimal estimator within this class, and briefly consider the construction of “partially adaptive” estimators whose efficiency may approach that of the optimal estimator. We show that in the absence of censoring the optimal estimator attains the semiparametric efficiency bound for the model.

5.
The authors consider semiparametric efficient estimation of parameters in the conditional mean model for a simple incomplete data structure in which the outcome of interest is observed only for a random subset of subjects but covariates and surrogate (auxiliary) outcomes are observed for all. They use optimal estimating function theory to derive the semiparametric efficient score in closed form. They show that when covariates and auxiliary outcomes are discrete, a Horvitz‐Thompson type estimator with empirically estimated weights is semiparametric efficient. The authors give simulation studies validating the finite‐sample behaviour of the semiparametric efficient estimator and its asymptotic variance; they demonstrate the efficiency of the estimator in realistic settings.
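
A Horvitz‐Thompson type estimator with empirically estimated weights can be sketched as follows, under simplifying assumptions. The target here is a plain population mean (not the regression parameters of the paper's conditional mean model), there is a single discrete covariate, and every covariate stratum contains at least one observed outcome. The function name `ht_mean` and the toy data are hypothetical.

```python
from collections import defaultdict


def ht_mean(x, observed, y):
    """Horvitz-Thompson type estimate of the mean of y, where the selection
    probability in each stratum of the discrete covariate x is estimated by
    the observed fraction in that stratum."""
    n = len(x)
    n_x = defaultdict(int)  # number of subjects per stratum
    m_x = defaultdict(int)  # number of subjects with observed outcome per stratum
    for xi, d in zip(x, observed):
        n_x[xi] += 1
        m_x[xi] += d
    total = 0.0
    for xi, d, yi in zip(x, observed, y):
        if d:
            pi_hat = m_x[xi] / n_x[xi]  # empirically estimated weight
            total += yi / pi_hat
    return total / n


# Toy data (assumed): outcomes are observed more often in stratum "a".
x = ["a", "a", "a", "a", "b", "b", "b", "b", "b", "b"]
observed = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
y = [2.0, 4.0, 3.0, None, 10.0, 14.0, None, None, None, None]

estimate = ht_mean(x, observed, y)
complete_case = sum(v for v in y if v is not None) / sum(observed)
print(estimate, complete_case)
```

With estimated stratum weights the estimator reduces to a stratum-size-weighted average of the within-stratum means of the observed outcomes, which is why it differs from the complete-case mean whenever the observed fractions differ across strata.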

6.
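Wang Yafeng 《统计研究》 (Statistical Research) 2012,29(2):88-93
This paper develops a two-stage semiparametric estimator for sample selection models. In the first stage, the discrete choice probabilities are estimated based on the log Euclidean distribution discrepancy measure; in the second stage, a nonparametric sieve method is used to estimate a partially linear model containing parametric and nonparametric components, which yields estimates of the model parameters. Compared with existing semiparametric estimators in the literature, this estimator is simpler to compute and carries a relatively small computational burden. We establish the consistency and asymptotic normality of the estimator and give a formula for its asymptotic variance. Monte Carlo simulation results agree with our theoretical conclusions.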

7.
We examine the asymptotic and small sample properties of model-based and robust tests of the null hypothesis of no randomized treatment effect based on the partial likelihood arising from an arbitrarily misspecified Cox proportional hazards model. When the distribution of the censoring variable is either conditionally independent of the treatment group given covariates or conditionally independent of covariates given the treatment group, the numerators of the partial likelihood treatment score and Wald tests have asymptotic mean equal to 0 under the null hypothesis, regardless of whether or how the Cox model is misspecified. We show that the model-based variance estimators used in the calculation of the model-based tests are not, in general, consistent under model misspecification, yet using analytic considerations and simulations we show that their true sizes can be as close to the nominal value as tests calculated with robust variance estimators. As a special case, we show that the model-based log-rank test is asymptotically valid. When the Cox model is misspecified and the distribution of censoring depends on both treatment group and covariates, the asymptotic distributions of the resulting partial likelihood treatment score statistic and maximum partial likelihood estimator do not, in general, have a zero mean under the null hypothesis. Here neither the fully model-based tests, including the log-rank test, nor the robust tests will be asymptotically valid, and we show through simulations that the distortion to test size can be substantial.

8.
Efficient inference for regression models requires that the heteroscedasticity be taken into account. We consider statistical inference under heteroscedasticity in a semiparametric measurement error regression model, in which some covariates are measured with errors. This paper has multiple components. First, we propose a new method for testing the heteroscedasticity. The advantages of the proposed method over the existing ones are that it does not need any nonparametric estimation and does not involve any mismeasured variables. Second, we propose a new two-step estimator for the error variances if there is heteroscedasticity. Finally, we propose a weighted estimating equation-based estimator (WEEBE) for the regression coefficients and establish its asymptotic properties. Compared with existing estimators, the proposed WEEBE is asymptotically more efficient, avoids undersmoothing the regressor functions and requires fewer restrictions on the observed regressors. Simulation studies show that the proposed test procedure and estimators have good finite sample performance. A real data set is used to illustrate the utility of our proposed methods.

9.
We propose a new class of semiparametric estimators for proportional hazards models in the presence of measurement error in the covariates, where the baseline hazard function, the hazard function for the censoring time, and the distribution of the true covariates are considered as unknown infinite dimensional parameters. We estimate the model components by solving estimating equations based on the semiparametric efficient scores under a sequence of restricted models where the logarithms of the hazard functions are approximated by reduced rank regression splines. The proposed estimators are locally efficient in the sense that the estimators are semiparametrically efficient if the distribution of the error‐prone covariates is specified correctly and are still consistent and asymptotically normal if the distribution is misspecified. Our simulation studies show that the proposed estimators have smaller biases and variances than competing methods. We further illustrate the new method with a real application in an HIV clinical trial.

10.
This paper addresses the problem of probability density estimation in the presence of covariates when data are missing at random (MAR). The inverse probability weighted method is used to define nonparametric and semiparametric weighted probability density estimators. A regression calibration technique is also used to define an imputed estimator. It is shown that all the estimators are asymptotically normal with the same asymptotic variance as that of the inverse probability weighted estimator with known selection probability function and weights. Also, we establish the mean squared error (MSE) bounds and obtain the MSE convergence rates. A simulation is carried out to assess the proposed estimators in terms of the bias and standard error.
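
The inverse probability weighted idea behind these density estimators can be sketched as below. This is an illustration under assumptions, not the paper's estimator: the selection probabilities are treated as known, a Gaussian kernel with a fixed bandwidth is used, and the weights are normalised by their sum so that the estimate integrates to one. The function name `ipw_kde` and the data are hypothetical.

```python
import math


def ipw_kde(grid, y, observed, prob, bandwidth):
    """Inverse probability weighted Gaussian kernel density estimate.

    Each observed value gets weight 1 / prob[i]; missing values get weight 0.
    Weights are normalised by their sum (an assumption of this sketch)."""
    weights = [d / p for d, p in zip(observed, prob)]
    total = sum(weights)
    norm = math.sqrt(2.0 * math.pi)
    density = []
    for t in grid:
        s = 0.0
        for yi, w in zip(y, weights):
            if w > 0:  # skip missing outcomes
                u = (t - yi) / bandwidth
                s += w * math.exp(-0.5 * u * u) / norm
        density.append(s / (bandwidth * total))
    return density


# Toy data (assumed): None marks a missing value; prob holds the known
# selection probabilities, which under MAR depend only on covariates.
y = [-1.0, None, 0.5, 1.2, None, 3.0]
observed = [1, 0, 1, 1, 0, 1]
prob = [0.8, 0.5, 0.5, 0.8, 0.5, 0.8]

step = 0.05
grid = [-10.0 + step * k for k in range(401)]
density = ipw_kde(grid, y, observed, prob, bandwidth=0.5)
print(max(density))
```

Observations that were less likely to be selected receive larger weights, so they stand in for the similar subjects whose values are missing.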

11.
Biao Zhang 《Statistics》2016,50(5):1173-1194
Missing covariate data occurs often in regression analysis. We study methods for estimating the regression coefficients in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866] on regression analyses with missing covariates, in which they pioneered the use of two working models, the working propensity score model and the working conditional score model. A recent approach to missing covariate data analysis is the empirical likelihood method of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503], which effectively combines unbiased estimating equations. In this paper, we consider an alternative likelihood approach based on the full likelihood of the observed data. This full likelihood-based method enables us to generate estimators for the vector of the regression coefficients that are (a) asymptotically equivalent to those of Qin et al. [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the working propensity score model is correctly specified, and (b) doubly robust, like the augmented inverse probability weighting (AIPW) estimators of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Thus, the proposed full likelihood-based estimators improve on the efficiency of the AIPW estimators when the working propensity score model is correct but the working conditional score model is possibly incorrect, and also improve on the empirical likelihood estimators of Qin, Zhang and Leung [Empirical likelihood in missing data problems. J Amer Statist Assoc. 2009;104:1492–1503] when the reverse is true, that is, the working conditional score model is correct but the working propensity score model is possibly incorrect. In addition, we consider a regression method for estimation of the regression coefficients when the working conditional score model is correctly specified; the asymptotic variance of the resulting estimator is no greater than the semiparametric variance bound characterized by the theory of Robins et al. [Estimation of regression coefficients when some regressors are not always observed. J Amer Statist Assoc. 1994;89:846–866]. Finally, we compare the finite-sample performance of various estimators in a simulation study.

12.
In an attempt to provide a statistical tool for disease screening and prediction, we propose a semiparametric approach to analysis of the Cox proportional hazards cure model in situations where the observations on the event time are subject to right censoring and some covariates are missing not at random. To facilitate the methodological development, we begin with semiparametric maximum likelihood estimation (SPMLE) assuming that the (conditional) distribution of the missing covariates is known. A variant of the EM algorithm is used to compute the estimator. We then adapt the SPMLE to a more practical situation where the distribution is unknown and there is a consistent estimator based on available information. We establish the consistency and weak convergence of the resulting pseudo-SPMLE, and identify a suitable variance estimator. The application of our inference procedure to disease screening and prediction is illustrated via empirical studies. The proposed approach is used to analyze the tuberculosis screening study data that motivated this research. Its finite-sample performance is examined by simulation.

13.
In this paper we discuss semiparametric additive isotonic regression models. We discuss the efficiency bound of the model and the least squares estimator under this model. We show that the ordinary least squares estimator studied by Huang (2002) and Cheng (2009) for the semiparametric isotonic regression achieves the efficiency bound for the regular estimator when the true parameter belongs to the interior of the parameter space. We also show that the result by Cheng (2009) can be generalized to the case in which the covariates are dependent on each other.

14.
Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.

15.
Binary logistic regression is a commonly used statistical method when the outcome variable is dichotomous or binary. The explanatory variables are correlated in some situations of the logit model. This problem is called multicollinearity. It is known that the variance of the maximum likelihood estimator (MLE) is inflated in the presence of multicollinearity. Therefore, in this study, we define a new two-parameter ridge estimator for the logistic regression model to decrease the variance and overcome the multicollinearity problem. We compare the new estimator to the other well-known estimators by studying their mean squared error (MSE) properties. Moreover, a Monte Carlo simulation is designed to evaluate the performances of the estimators. Finally, a real data application is illustrated to show the applicability of the new method. According to the results of the simulation and real application, the new estimator outperforms the other estimators for all of the situations considered.

16.
We consider variance estimation when a stratified single stage cluster sample is selected in the first phase and a stratified simple random element sample is selected in the second phase. We propose explicit formulas for (asymptotically) unbiased variance estimators for the double expansion estimator and regression estimator. We perform a small simulation study to investigate the performance of the proposed variance estimators. In our simulation study, the proposed variance estimator showed better or comparable performance to the Jackknife variance estimator. We also extend the results to a two-phase sampling design in which a stratified pps with replacement cluster sample is selected in the first phase.

17.
This study concerns semiparametric approaches to estimate discrete multivariate count regression functions. The semiparametric approaches investigated consist of combining discrete multivariate nonparametric kernel and parametric estimations such that (i) prior knowledge of the conditional distribution of the model response may be incorporated and (ii) the bias of the traditional nonparametric kernel regression estimator of Nadaraya-Watson may be reduced. We are specifically interested in the combination of the two estimation approaches and in some asymptotic properties of the resulting estimators. Asymptotic normality results are shown for the nonparametric correction terms of the parametric start function of the estimators. The performance of the discrete semiparametric multivariate kernel estimators studied is illustrated using simulations and real count data. In addition, diagnostic checks are performed to test the adequacy of the parametric start model to the true discrete regression model. Finally, using discrete semiparametric multivariate kernel estimators provides a bias reduction when the parametric multivariate regression model used as start regression function belongs to a neighborhood of the true regression model.

18.
In many randomized clinical trials, the primary response variable, for example, the survival time, is not observed directly after the patients enroll in the study but rather observed after some period of time (lag time). It is often the case that such a response variable is missing for some patients due to censoring that occurs when the study ends before the patient’s response is observed or when the patients drop out of the study. It is often assumed that censoring occurs at random, which is referred to as noninformative censoring; however, in many cases such an assumption may not be reasonable. If the missing data are not analyzed properly, the estimator or test for the treatment effect may be biased. In this paper, we use semiparametric theory to derive a class of consistent and asymptotically normal estimators for the treatment effect parameter which are applicable when the response variable is right censored. The baseline auxiliary covariates and post-treatment auxiliary covariates, which may be time-dependent, are also considered in our semiparametric model. These auxiliary covariates are used to derive estimators that both account for informative censoring and are more efficient than the estimators which do not consider the auxiliary covariates.

19.
We consider the variance estimation of the weighted likelihood estimator (WLE) under two‐phase stratified sampling without replacement. The asymptotic variance of the WLE in many semiparametric models contains unknown functions or does not have a closed form. The standard method of the inverse probability weighted (IPW) sample variances of an estimated influence function is then not available in these models. To address this issue, we develop the variance estimation procedure for the WLE in a general semiparametric model. The phase I variance is estimated by taking a numerical derivative of the IPW log likelihood. The phase II variance is estimated based on the bootstrap for a stratified sample in a finite population. Despite a theoretical difficulty of dependent observations due to sampling without replacement, we establish the (bootstrap) consistency of our estimators. Finite sample properties of our method are illustrated in a simulation study.

20.
In biostatistical applications interest often focuses on the estimation of the distribution of time T between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed monitoring time C, then the data conforms to the well understood singly-censored current status model, also known as interval censored data, case I. Additional covariates can be used to allow for dependent censoring and to improve estimation of the marginal distribution of T. Assuming a wrong model for the conditional distribution of T, given the covariates, will lead to an inconsistent estimator of the marginal distribution. On the other hand, the nonparametric maximum likelihood estimator of F_T requires splitting up the sample in several subsamples corresponding with a particular value of the covariates, computing the NPMLE for every subsample and then taking an average. With a few continuous covariates the performance of the resulting estimator is typically miserable. In van der Laan and Robins (1996) a locally efficient one-step estimator is proposed for smooth functionals of the distribution of T, assuming nothing about the conditional distribution of T, given the covariates, but assuming a model for censoring, given the covariates. The estimators are asymptotically linear if the censoring mechanism is estimated correctly. The estimator also uses an estimator of the conditional distribution of T, given the covariates. If this estimate is consistent, then the estimator is efficient and if it is inconsistent, then the estimator is still consistent and asymptotically normal. In this paper we show that the estimators can also be used to estimate the distribution function in a locally optimal way. Moreover, we show that the proposed estimator can be used to estimate the distribution based on interval censored data (T is now known to lie between two observed points) in the presence of covariates. The resulting estimator also has a known influence curve so that asymptotic confidence intervals are directly available. In particular, one can apply our proposal to the interval censored data without covariates. In Geskus (1992) the information bound for interval censored data with two uniformly distributed monitoring times at the uniform distribution (for T) has been computed. We show that the relative efficiency of our proposal w.r.t. this optimal bound equals 0.994, which is also reflected in finite sample simulations. Finally, the good practical performance of the estimator is shown in a simulation study. This revised version was published online in July 2006 with corrections to the Cover Date.
