Similar Documents
20 similar documents found.
1.
Under the case-cohort design introduced by Prentice (Biometrika 73:1–11, 1986), the covariate histories are ascertained only for the subjects who experience the event of interest (i.e., the cases) during the follow-up period and for a relatively small random sample from the original cohort (i.e., the subcohort). The case-cohort design has been widely used in clinical and epidemiological studies to assess the effects of covariates on failure times. Most statistical methods developed for the case-cohort design use the proportional hazards model, and few methods allow for time-varying regression coefficients. In addition, most methods disregard data from subjects outside of the subcohort, which can result in inefficient inference. Addressing these issues, this paper proposes an estimation procedure for the semiparametric additive hazards model with case-cohort/two-phase sampling data, allowing the covariates of interest to be missing for cases as well as for non-cases. A more flexible form of the additive model is considered that allows the effects of some covariates to be time-varying while specifying the effects of others to be constant. An augmented inverse probability weighted estimation procedure is proposed. The proposed method allows utilizing auxiliary information that correlates with the phase-two covariates to improve efficiency. The asymptotic properties of the proposed estimators are established. An extensive simulation study shows that the augmented inverse probability weighted estimation is more efficient than the widely adopted inverse probability weighted complete-case estimation method. The method is applied to analyze data from a preventive HIV vaccine efficacy trial.
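The augmentation idea in this abstract can be illustrated on the simplest possible target, the mean of an expensive phase-two covariate. The following is a minimal numpy sketch, not the paper's additive-hazards procedure: all variable names and the linear working model m(W) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
w = rng.normal(size=n)                      # cheap auxiliary variable, observed for all
x = w + rng.normal(scale=0.5, size=n)       # expensive phase-two covariate
pi = 0.2                                    # known subcohort sampling fraction
r = (rng.random(n) < pi).astype(float)      # phase-two (complete-data) indicator

# plain IPW complete-case estimator of E[X]
ipw = np.mean(r * x / pi)

# augmented IPW: add back a regression prediction m(W), fitted on complete cases
b = np.polyfit(w[r == 1.0], x[r == 1.0], 1)
m = np.polyval(b, w)
aipw = np.mean(r * x / pi - (r - pi) / pi * m)
```

Both estimators are unbiased for E[X] = 0 here; the augmented version has smaller variance because the auxiliary W is strongly correlated with X, which is the efficiency gain the abstract describes.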

2.
The case-cohort study design is widely used to reduce cost when collecting expensive covariates in large cohort studies with survival or competing risks outcomes. A case-cohort study dataset consists of two parts: (a) a random sample and (b) all cases or failures from a specific cause of interest. Clinicians often assess covariate effects on competing risks outcomes. The proportional subdistribution hazards model directly evaluates the effect of a covariate on the cumulative incidence function under the non-covariate-dependent censoring assumption for the full cohort study. However, the non-covariate-dependent censoring assumption is often violated in many biomedical studies. In this article, we propose a proportional subdistribution hazards model for case-cohort studies with stratified data with covariate-adjusted censoring weight. We further propose an efficient estimator when extra information from the other causes is available under case-cohort studies. The proposed estimators are shown to be consistent and asymptotically normal. Simulation studies show (a) the proposed estimator is unbiased when the censoring distribution depends on covariates and (b) the proposed efficient estimator gains estimation efficiency when using extra information from the other causes. We analyze a bone marrow transplant dataset and a coronary heart disease dataset using the proposed method.

3.
Computing the Cox Model for Case Cohort Designs
Prentice (1986) proposed the case-cohort design as an efficient subsampling mechanism for survival studies. Several other authors have expanded on these ideas to create a family of related sampling plans, along with estimators for the covariate effects. We describe how to obtain the proposed parameter estimates and their variance estimates using standard software packages, with SAS and S-PLUS as particular examples.

4.
We consider variance estimation when a stratified single-stage cluster sample is selected in the first phase and a stratified simple random element sample is selected in the second phase. We propose explicit formulas for (asymptotically) unbiased variance estimators for the double expansion estimator and the regression estimator. We perform a small simulation study to investigate the performance of the proposed variance estimators. In our simulation study, the proposed variance estimator showed better or comparable performance relative to the jackknife variance estimator. We also extend the results to a two-phase sampling design in which a stratified pps-with-replacement cluster sample is selected in the first phase.

5.
In this article, we study variable selection and estimation for linear regression models with missing covariates. The proposed estimation method is almost as efficient as the popular least-squares-based estimation method for normal random errors and is empirically shown to be much more efficient and robust with respect to heavy-tailed errors or outliers in the responses and covariates. To achieve sparsity, a variable selection procedure based on SCAD is proposed to conduct estimation and variable selection simultaneously. The procedure is shown to possess the oracle property. To handle the missing covariates, we consider inverse probability weighted estimators for the linear model when the selection probability is known or unknown. It is shown that the estimator using the estimated selection probability has a smaller asymptotic variance than the one using the true selection probability, and is thus more efficient. Therefore, the important Horvitz-Thompson property is verified for the penalized rank estimator with missing covariates in the linear model. Some numerical examples are provided to demonstrate the performance of the estimators.
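The basic inverse probability weighting step for a linear model with MAR covariates can be sketched as a weighted least-squares fit on the complete cases. This is only the IPW building block, not the paper's penalized rank procedure; the data-generating model and the logistic form of the selection probability are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# covariate x is MAR: its observation probability depends on the observed response y
pi = 1.0 / (1.0 + np.exp(-(0.5 + 0.5 * y)))
r = rng.random(n) < pi                       # True when x is observed

# IPW least squares: complete cases weighted by 1/pi to correct selection bias
X = np.column_stack([np.ones(r.sum()), x[r]])
wts = 1.0 / pi[r]
XtW = X.T * wts                              # X' W with W = diag(1/pi)
beta = np.linalg.solve(XtW @ X, XtW @ y[r])  # solves the weighted normal equations
```

Unweighted least squares on the complete cases would be biased here because missingness depends on y; the 1/pi weights restore consistency for the intercept 1 and slope 2.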

6.
This paper addresses the problem of probability density estimation in the presence of covariates when data are missing at random (MAR). The inverse probability weighted method is used to define nonparametric and semiparametric weighted probability density estimators. A regression calibration technique is also used to define an imputed estimator. It is shown that all the estimators are asymptotically normal with the same asymptotic variance as that of the inverse probability weighted estimator with known selection probability function and weights. We also establish mean squared error (MSE) bounds and obtain the MSE convergence rates. A simulation is carried out to assess the proposed estimators in terms of bias and standard error.
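The weighted density estimator with a known selection probability, the benchmark the abstract compares against, can be sketched as a Horvitz-Thompson weighted kernel density estimate. A hedged numpy illustration with invented names and a Gaussian kernel:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10000
x = rng.normal(size=n)                 # always-observed covariate
y = x + rng.normal(size=n)             # variable whose density we want; MAR given x
p = 1.0 / (1.0 + np.exp(-x))           # known selection probability function
r = rng.random(n) < p                  # True when y is observed

def ipw_kde(t, y, r, p, h=0.3):
    """Inverse probability weighted Gaussian kernel density estimate."""
    u = (t[None, :] - y[r][:, None]) / h
    k = np.exp(-0.5 * u**2) / (h * np.sqrt(2.0 * np.pi))
    # each observed point is weighted by 1/p to stand in for its missing peers
    return np.sum(k / p[r][:, None], axis=0) / len(y)

est = ipw_kde(np.array([0.0]), y, r, p)[0]   # true N(0, 2) density at 0 is ~0.282
```

A naive complete-case KDE would be skewed toward large y here, since selection favors large x; the 1/p weights undo that distortion.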

7.
The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.
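The basic delete-one jackknife recipe that this abstract generalizes can be sketched for a nonlinear statistic, the ratio estimator, under simple random sampling. This is the textbook version only, not the unequal-probability extension the paper proposes; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(1.0, 2.0, size=n)
y = 3.0 * x + rng.normal(scale=0.1, size=n)

theta = y.sum() / x.sum()                  # ratio estimator, a nonlinear statistic
# delete-one jackknife: recompute the ratio with each unit left out (vectorized)
loo = (y.sum() - y) / (x.sum() - x)
v_jack = (n - 1) / n * np.sum((loo - loo.mean()) ** 2)
```

The jackknife needs no analytic linearization of the ratio; the spread of the leave-one-out replicates estimates the variance directly, which is why the method is attractive for broad classes of point estimators.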

8.
The estimation of the variance for the GREG (general regression) estimator by weighted residuals is widely accepted as a method which yields estimators with good conditional properties. Since the optimal (regression) estimator shares the properties of GREG estimators which are used in the construction of weighted variance estimators, we introduce the weighting procedure also for estimating the variance of the optimal estimator. This method of variance estimation was originally presented in a seemingly ad hoc manner, and we shall discuss it from a conditional point of view and also look at an alternative way of utilizing the weights. Examples that stress conditional behaviour of estimators are then given for elementary sampling designs such as simple random sampling, stratified simple random sampling and Poisson sampling, where for the latter design we have conducted a small simulation study.

9.
Semiparametric accelerated failure time (AFT) models directly relate the expected failure times to covariates and are a useful alternative to models that work on the hazard function or the survival function. For case-cohort data, much less development has been done with AFT models. In addition to the covariates missing for controls outside of the subcohort, the challenges of AFT model inference with the full cohort remain. The regression parameter estimator is hard to compute because the most widely used rank-based estimating equations are not smooth. Further, its variance depends on the unspecified error distribution, and most methods rely on a computationally intensive bootstrap to estimate it. We propose fast rank-based inference procedures for AFT models, applying recent methodological advances to the context of case-cohort data. Parameters are estimated with an induced smoothing approach that smooths the estimating functions and facilitates the numerical solution. Variance estimators are obtained through efficient resampling methods for nonsmooth estimating functions that avoid a full-blown bootstrap. Simulation studies suggest that the recommended procedure provides fast and valid inference among several competing procedures. Application to a tumor study demonstrates the utility of the proposed method in routine data analysis.

10.
Missing covariate values are a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids use of the EM algorithm. It exploits the fact that the observed hazard function is multiplicative in the baseline hazard function, the idea being to profile out this function before carrying out the estimation of the parameter of interest. In this step one uses a Breslow-type estimator to estimate the cumulative baseline hazard function. We focus on the situation where the observed covariates are categorical, which allows us to calculate estimators without having to assume anything about the distribution of the covariates. We show that the proposed estimator is consistent and asymptotically normal, and derive a consistent estimator of the variance–covariance matrix that does not involve any choice of a perturbation parameter. Moderate sample size performance of the estimators is investigated via simulation and by application to a real data example.
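The Breslow-type step mentioned in this abstract, estimating the cumulative baseline hazard, can be sketched in a few lines when the regression coefficient is treated as known. This sidesteps the profiling over the coefficient that the paper actually performs; the simulated data and the known-beta shortcut are both simplifications.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4000
z = rng.binomial(1, 0.5, size=n).astype(float)   # a categorical covariate
beta = 0.7
t = rng.exponential(1.0 / np.exp(beta * z))      # baseline hazard 1, so Lambda0(t) = t
c = rng.exponential(2.0, size=n)                 # independent censoring
time, event = np.minimum(t, c), t <= c

def breslow(time, event, z, beta):
    """Breslow estimator of the cumulative baseline hazard at observed event times."""
    order = np.argsort(time)
    time, event, z = time[order], event[order], z[order]
    # risk-set totals: sum_{j: time_j >= time_i} exp(beta * z_j), via reverse cumsum
    denom = np.cumsum(np.exp(beta * z)[::-1])[::-1]
    return time[event], np.cumsum(1.0 / denom[event])

et, ch = breslow(time, event, z, beta)
lam_at_1 = ch[np.searchsorted(et, 1.0) - 1]      # should estimate Lambda0(1) = 1
```

Each event contributes a jump of 1 over the risk-set total of exp(beta * z), so the estimator needs no smoothing or tuning parameter, which is in the spirit of the perturbation-free variance estimator the paper derives.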

11.
Quantile regression (QR) is a popular approach to estimating functional relations between variables for all portions of a probability distribution. Parameter estimation in QR with missing data is one of the most challenging issues in statistics. Regression quantiles can be substantially biased when observations are subject to missingness. We study several inverse probability weighting (IPW) estimators for parameters in QR when covariates or responses are missing not at random. Maximum likelihood and semiparametric likelihood methods are employed to estimate the respondent probability function. To achieve good efficiency properties, we develop an empirical likelihood (EL) approach to QR with auxiliary information from the calibration constraints. The proposed methods are less sensitive to misspecified missing mechanisms. Asymptotic properties of the proposed IPW estimators are shown under general settings. The efficiency gain of the EL-based IPW estimator is quantified theoretically. Simulation studies and a data set on the work limitation of injured workers from Canada are used to illustrate our proposed methodologies.

12.
In this paper we propose a smooth nonparametric estimator for the conditional probability density function based on a Bernstein polynomial representation. Our estimator can be written as a finite mixture of beta densities with data-driven weights. Using the Bernstein estimator of the conditional density function, we derive new estimators for the distribution function and conditional mean. We establish the asymptotic properties of the proposed estimators, by proving their asymptotic normality and by providing their asymptotic bias and variance. Simulation results suggest that the proposed estimators can outperform the Nadaraya–Watson estimator and, in some specific setups, the local linear kernel estimators. Finally, we use our estimators for modeling the income in Italy, conditional on year from 1951 to 1998, and have another look at the well-known Old Faithful Geyser data.
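The beta-mixture form mentioned in this abstract is easiest to see in the unconditional case. Below is a hedged sketch of the classical Bernstein polynomial density estimator on [0, 1], with weights given by empirical-CDF increments; the conditional version in the paper builds on the same representation. Function names and the choice of degree m are illustrative.

```python
import numpy as np
from math import comb

def bernstein_density(x, sample, m=20):
    """Bernstein polynomial density estimate: a mixture of Beta(k+1, m-k) densities."""
    grid = np.arange(m + 1) / m
    Fn = np.searchsorted(np.sort(sample), grid, side="right") / len(sample)
    w = np.diff(Fn)                          # empirical-CDF increments = mixture weights
    dens = np.zeros_like(x, dtype=float)
    for k in range(m):
        # Beta(k+1, m-k) density written through the Bernstein basis:
        # 1/B(k+1, m-k) = m * C(m-1, k)
        dens += w[k] * m * comb(m - 1, k) * x**k * (1.0 - x) ** (m - 1 - k)
    return dens

rng = np.random.default_rng(5)
sample = rng.beta(2.0, 2.0, size=5000)       # true density 6x(1-x); value 1.5 at x = 0.5
est = bernstein_density(np.array([0.5]), sample)[0]
```

Because every mixture component is a smooth density on [0, 1], the estimate is automatically smooth and boundary-respecting, which is the practical appeal over kernel estimators near the edges of the support.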

13.
This paper deals with estimation of the parameters and the mean life of a mixed failure time distribution that has a discrete probability mass at zero and an exponential distribution with mean θ for positive values. A new sampling scheme similar to that of Jayade and Prasad (1990) is proposed for estimation of the parameters. We derive expressions for the biases and mean square errors (MSEs) of the maximum likelihood estimators (MLEs). We also obtain the uniformly minimum variance unbiased estimators (UMVUEs) of the parameters. We compare the estimators of θ and the mean life based on the proposed sampling scheme with the estimators obtained by using the sampling scheme of Jayade and Prasad (1990).

14.
The calibration method adjusts the original design weights, using auxiliary information, to improve the estimates. In this article we propose new calibration estimators under a stratified ranked set sampling design and derive the estimator of the variance of the calibration estimator. A simulation study is carried out to assess the performance of the proposed estimators.
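The weight-adjustment step that calibration estimators share can be sketched for the simplest case: one auxiliary variable, chi-square distance, simple random sampling. This is a generic illustration, not the stratified ranked set sampling estimator of the article; population sizes and totals are invented.

```python
import numpy as np

rng = np.random.default_rng(7)
n, N = 100, 10000
tx = N * 5.0                                # known population total of the auxiliary x
x = rng.normal(5.0, 1.0, size=n)            # auxiliary values observed in the sample
d = np.full(n, N / n)                       # original design weights (SRS)

# chi-square-distance calibration: w_i = d_i * (1 + lam * x_i), with lam chosen so
# that the calibrated weights reproduce the known auxiliary total exactly
lam = (tx - d @ x) / (d @ (x * x))
w = d * (1.0 + lam * x)
```

By construction w @ x equals the known total tx exactly; applying the same weights w to a study variable correlated with x is what transfers the precision gain to the estimate of interest.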

15.
For nonparametric regression models with fixed and random design, two classes of estimators for the error variance have been introduced: second sample moments based on residuals from a nonparametric fit, and difference-based estimators. The former are asymptotically optimal but require estimating the regression function; the latter are simple but have larger asymptotic variance. For nonparametric regression models with random covariates, we introduce a class of estimators for the error variance that are related to difference-based estimators: covariate-matched U-statistics. We give conditions on the random weights involved that lead to asymptotically optimal estimators of the error variance. Our explicit construction of the weights uses a kernel estimator for the covariate density.
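The simple difference-based baseline that this abstract builds on can be shown in two lines. This is the classic first-order difference estimator for ordered design points, not the covariate-matched U-statistics the paper introduces; the trend function and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = np.sort(rng.uniform(size=n))
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.5, size=n)   # true error variance 0.25

# first-order difference estimator: differences of neighbouring responses cancel
# the smooth trend m(x), leaving (approximately) twice the error variance
sigma2_hat = np.sum(np.diff(y) ** 2) / (2.0 * (n - 1))
```

No estimate of the regression function is needed, which is the simplicity the abstract refers to; the price is the larger asymptotic variance that the covariate-matched construction is designed to remove.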

16.
A general nonparametric imputation procedure, based on kernel regression, is proposed to estimate points as well as set- and function-indexed parameters when the data are missing at random (MAR). The proposed method works by imputing a specific function of a missing value (and not the missing value itself), where the form of this specific function is dictated by the parameter of interest. Both single and multiple imputations are considered. The associated empirical processes provide the right tool to study the uniform convergence properties of the resulting estimators. Our estimators include, as special cases, the imputation estimator of the mean, the estimator of the distribution function proposed by Cheng and Chu [1996. Kernel estimation of distribution functions and quantiles with missing data. Statist. Sinica 6, 63–78], imputation estimators of a marginal density, and imputation estimators of regression functions.

17.
In stratified sampling, methods for the allocation of effort among strata usually rely on some measure of within-stratum variance. If we do not have enough information about these variances, adaptive allocation can be used. In adaptive allocation designs, surveys are conducted in two phases. Information from the first phase is used to allocate the remaining units among the strata in the second phase. Brown et al. [Adaptive two-stage sequential sampling, Popul. Ecol. 50 (2008), pp. 239–245] introduced an adaptive allocation sampling design, in which the final sample size was random, along with an unbiased estimator. Here, we derive an unbiased variance estimator for the design, and consider a related design in which the final sample size is fixed. Having a fixed final sample size can make survey planning easier. We introduce a biased Horvitz–Thompson type estimator and a biased sample mean type estimator for the sampling designs. We conduct two simulation studies, on honey producers in Kurdistan and on synthetic zirconium distribution in a region on the moon. Results show that the introduced estimators are more efficient than the available estimators for both the variable and the fixed sample size designs, and than the conventional unbiased estimator of the stratified simple random sampling design. To evaluate the efficiency of the introduced designs and their estimators further, we also review some well-known adaptive allocation designs and compare their estimators with the introduced ones. Simulation results show that the introduced estimators are more efficient than the available estimators of these well-known adaptive allocation designs.
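The core allocation step in such two-phase designs, spending the remaining budget where pilot variability is highest, can be sketched with plain Neyman-style allocation. This is the textbook rule only, not the specific design of Brown et al. or of this paper; the stratum sizes and pilot standard deviations are invented.

```python
import numpy as np

# pilot (first-phase) information per stratum -- hypothetical numbers
N_h = np.array([500.0, 300.0, 200.0])   # stratum sizes
s_h = np.array([4.0, 10.0, 2.0])        # estimated within-stratum std deviations
n2 = 100                                # fixed second-phase sample size to allocate

# Neyman-style allocation: n_h proportional to N_h * s_h
share = N_h * s_h / np.sum(N_h * s_h)
n_h = np.round(n2 * share).astype(int)
```

The middle stratum, small but highly variable, receives more than half the budget; fixing n2 in advance is what makes the planning easier, at the cost of the bias the abstract discusses.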

18.
In many randomized clinical trials, the primary response variable, for example the survival time, is not observed directly after the patients enroll in the study but rather after some period of time (lag time). It is often the case that such a response variable is missing for some patients due to censoring, which occurs when the study ends before the patient's response is observed or when the patient drops out of the study. It is often assumed that censoring occurs at random, which is referred to as noninformative censoring; however, in many cases such an assumption may not be reasonable. If the missing data are not analyzed properly, the estimator or test for the treatment effect may be biased. In this paper, we use semiparametric theory to derive a class of consistent and asymptotically normal estimators for the treatment effect parameter which are applicable when the response variable is right censored. Baseline auxiliary covariates and post-treatment auxiliary covariates, which may be time-dependent, are also considered in our semiparametric model. These auxiliary covariates are used to derive estimators that both account for informative censoring and are more efficient than the estimators which do not consider the auxiliary covariates.

19.
Motivated by a recent tuberculosis (TB) study, this paper is concerned with covariates missing not at random (MNAR) and models the potential intracluster correlation by a frailty. We consider the regression analysis of right-censored event times from clustered subjects under a Cox proportional hazards frailty model and present the semiparametric maximum likelihood estimator (SPMLE) of the model parameters. An easy-to-implement pseudo-SPMLE is then proposed to accommodate more realistic situations using readily available supplementary information on the missing covariates. Algorithms are provided to compute the estimators and their consistent variance estimators. We demonstrate that both the SPMLE and the pseudo-SPMLE are consistent and asymptotically normal by the arguments based on the theory of modern empirical processes. The proposed approach is examined numerically via simulation and illustrated with an analysis of the motivating TB study data.

20.
黄莺, 李金昌. 统计研究 (Statistical Research), 2008, 25(7): 66–69.
Calibration estimation has been widely used in sample surveys: calibration weights constructed from auxiliary information improve the precision of estimates of the population total (or mean). This paper proposes a calibrated combined ratio estimator under stratified sampling and extends it to stratified double sampling. Approximate variance expressions for the new estimators are given. Finally, computer simulation is used to verify the improvement in estimation precision achieved by the calibration estimators.
