Similar literature
 20 similar documents found (search time: 15 ms)
1.
Investigators often gather longitudinal data to assess changes in responses over time within subjects and to relate these changes to within‐subject changes in predictors. Missing data are common in such studies and predictors can be correlated with subject‐specific effects. Maximum likelihood methods for generalized linear mixed models provide consistent estimates when the data are ‘missing at random’ (MAR) but can produce inconsistent estimates in settings where the random effects are correlated with one of the predictors. On the other hand, conditional maximum likelihood methods (and closely related maximum likelihood methods that partition covariates into between‐ and within‐cluster components) provide consistent estimation when random effects are correlated with predictors but can produce inconsistent covariate effect estimates when data are MAR. Using theory, simulation studies, and fits to example data, this paper shows that decomposition methods using complete covariate information produce consistent estimates. In some practical cases these methods, which ostensibly require complete covariate information, actually involve only the observed covariates. These results offer an easy‐to‐use approach to simultaneously protect against bias from both cluster‐level confounding and MAR missingness in assessments of change.
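
As a rough illustration of the between‐/within‐cluster partition mentioned in this abstract, the following minimal Python sketch splits a covariate into its cluster mean and the deviation from that mean. The function name `decompose_covariate` and the toy data are hypothetical; this shows only the decomposition step, not the paper's estimator or its handling of missing data.

```python
from collections import defaultdict


def decompose_covariate(cluster_ids, x):
    """Split x into a between-cluster part (the cluster mean) and a
    within-cluster part (the deviation from that mean).

    Hypothetical helper for illustration only; it uses whatever values
    are passed in and does nothing special about missing observations.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cid, value in zip(cluster_ids, x):
        sums[cid] += value
        counts[cid] += 1
    means = {cid: sums[cid] / counts[cid] for cid in sums}
    between = [means[cid] for cid in cluster_ids]
    within = [value - means[cid] for cid, value in zip(cluster_ids, x)]
    return between, within


# Two subjects ("a" and "b"), two measurements each.
between, within = decompose_covariate(["a", "a", "b", "b"], [1.0, 3.0, 10.0, 14.0])
```

In a regression, the two parts would typically enter as separate predictors, so that the within‐cluster coefficient is not confounded by cluster‐level differences.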

2.
We propose methods for Bayesian inference for missing covariate data with a novel class of semi-parametric survival models with a cure fraction. We allow the missing covariates to be either categorical or continuous and specify a parametric distribution for the covariates that is written as a sequence of one-dimensional conditional distributions. We assume that the missing covariates are missing at random (MAR) throughout. We propose an informative class of joint prior distributions for the regression coefficients and the parameters arising from the covariate distributions. The proposed class of priors is shown to be useful in recovering information on the missing covariates, especially in situations where the missing data fraction is large. Properties of the proposed prior and resulting posterior distributions are examined. Also, model checking techniques are proposed for sensitivity analyses and for checking the goodness of fit of a particular model. Specifically, we extend the Conditional Predictive Ordinate (CPO) statistic to assess goodness of fit in the presence of missing covariate data. Computational techniques using the Gibbs sampler are implemented. A real data set involving a melanoma cancer clinical trial is examined to demonstrate the methodology.

3.
With competing risks data, one often needs to assess the treatment and covariate effects on the cumulative incidence function. Fine and Gray proposed a proportional hazards regression model for the subdistribution of a competing risk with the assumption that the censoring distribution and the covariates are independent. Covariate‐dependent censoring sometimes occurs in medical studies. In this paper, we study the proportional hazards regression model for the subdistribution of a competing risk with proper adjustments for covariate‐dependent censoring. We consider a covariate‐adjusted weight function by fitting the Cox model for the censoring distribution and using the predictive probability for each individual. Our simulation study shows that the covariate‐adjusted weight estimator is basically unbiased when the censoring time depends on the covariates, and the covariate‐adjusted weight approach works well for the variance estimator as well. We illustrate our methods with bone marrow transplant data from the Center for International Blood and Marrow Transplant Research. Here, cancer relapse and death in complete remission are two competing risks.  相似文献   

4.
Current methods of testing the equality of conditional correlations of bivariate data on a third variable of interest (covariate) are limited due to discretizing of the covariate when it is continuous. In this study, we propose a linear model approach for estimation and hypothesis testing of the Pearson correlation coefficient, where the correlation itself can be modeled as a function of continuous covariates. The restricted maximum likelihood method is applied for parameter estimation, and the corrected likelihood ratio test is performed for hypothesis testing. This approach allows for flexible and robust inference and prediction of the conditional correlations based on the linear model. Simulation studies show that the proposed method is statistically more powerful and more flexible in accommodating complex covariate patterns than the existing methods. In addition, we illustrate the approach by analyzing the correlation between the physical component summary and the mental component summary of the MOS SF-36 form across a fair number of covariates in the national survey data.  相似文献   

5.
Functional linear models are useful in longitudinal data analysis. They include many classical and recently proposed statistical models for longitudinal data and other functional data. Recently, smoothing spline and kernel methods have been proposed for estimating their coefficient functions nonparametrically but these methods are either intensive in computation or inefficient in performance. To overcome these drawbacks, in this paper, a simple and powerful two-step alternative is proposed. In particular, the implementation of the proposed approach via local polynomial smoothing is discussed. Methods for estimating standard deviations of estimated coefficient functions are also proposed. Some asymptotic results for the local polynomial estimators are established. Two longitudinal data sets, one of which involves time-dependent covariates, are used to demonstrate the approach proposed. Simulation studies show that our two-step approach improves the kernel method proposed by Hoover and co-workers in several aspects such as accuracy, computational time and visual appeal of the estimators.  相似文献   

6.
Matched case–control designs are commonly used in epidemiological studies for estimating the effect of exposure variables on the risk of a disease by controlling the effect of confounding variables. Due to the retrospective nature of the study, information on a covariate could be missing for some subjects. A straightforward application of the conditional logistic likelihood for analyzing matched case–control data with the partially missing covariate may yield inefficient estimators of the parameters. A robust method has been proposed to handle this problem using an estimated conditional score approach when the missingness mechanism does not depend on the disease status. Within the conditional logistic likelihood framework, an empirical procedure is used to estimate the odds of the disease for the subjects with missing covariate values. The asymptotic distribution and the asymptotic variance of the estimator are derived when the matching variables and the completely observed covariates are categorical. The finite sample performance of the proposed estimator is assessed through a simulation study. Finally, the proposed method has been applied to analyze two matched case–control studies. The Canadian Journal of Statistics 38: 680–697; 2010 © 2010 Statistical Society of Canada

7.
As a flexible alternative to the Cox model, the accelerated failure time (AFT) model assumes that the event time of interest depends on the covariates through a regression function. The AFT model with non‐parametric covariate effects is investigated, when variable selection is desired along with estimation. Formulated in the framework of the smoothing spline analysis of variance model, the proposed method based on the Stute estimate ( Stute, 1993 [Consistent estimation under random censorship when covariables are present, J. Multivariate Anal. 45 , 89–103]) can achieve a sparse representation of the functional decomposition, by utilizing a reproducing kernel Hilbert norm penalty. Computational algorithms and theoretical properties of the proposed method are investigated. The finite sample size performance of the proposed approach is assessed via simulation studies. The primary biliary cirrhosis data is analyzed for demonstration.  相似文献   

8.
Various methods have been suggested in the literature to handle a missing covariate in the presence of surrogate covariates. These methods belong to one of two paradigms. In the imputation paradigm, Pepe and Fleming (1991) and Reilly and Pepe (1995) suggested filling in missing covariates using the empirical distribution of the covariate obtained from the observed data. We can proceed one step further by imputing the missing covariate using nonparametric maximum likelihood estimates (NPMLE) of the density of the covariate. Recently Murphy and Van der Vaart (1998a) showed that such an approach yields a consistent, asymptotically normal, and semiparametric efficient estimate for the logistic regression coefficient. In the weighting paradigm, Zhao and Lipsitz (1992) suggested an estimating function using completely observed records after weighting inversely by the probability of observation. An extension of this weighting approach designed to achieve semiparametric efficient bound is considered by Robins, Hsieh and Newey (RHN) (1995). The two ends of each paradigm (NPMLE and RHN) attain the efficiency bound and are asymptotically equivalent. However, both require a substantial amount of computation. A question arises whether and when, in practical situations, this extensive computation is worthwhile. In this paper we investigate the performance of single and multiple imputation estimates, weighting estimates, semiparametric efficient estimates, and two new imputation estimates. Simulation studies suggest that the sample size should be substantially large (e.g. n=2000) for NPMLE and RHN to be more efficient than simpler imputation estimates. When the sample size is moderately large (n≤ 1500), simpler imputation estimates have as small a variance as semiparametric efficient estimates.  相似文献   
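
The weighting paradigm described in this abstract can be sketched in a much simpler setting than logistic regression: estimating a mean from complete records only, with each record weighted by the inverse of its probability of being observed. The Python sketch below is an assumption-laden illustration; the function name `ipw_mean`, the known observation probabilities, and the toy data are hypothetical, and it is not the estimating function of Zhao and Lipsitz (1992).

```python
def ipw_mean(values, observed, prob_observed):
    """Inverse-probability-weighted (normalized) mean of the observed values.

    values        : list of numbers (entries for unobserved records are ignored)
    observed      : list of booleans, True where the record is complete
    prob_observed : list of probabilities of observation, assumed known here
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value, is_observed, prob in zip(values, observed, prob_observed):
        if not is_observed:
            continue  # incomplete records contribute nothing directly
        weight = 1.0 / prob  # rarer observations stand in for more records
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no observed records to average")
    return weighted_sum / weight_total


# The second record is missing; records observed with probability 0.5
# receive twice the weight of those that are always observed.
estimate = ipw_mean([1.0, None, 3.0, 4.0], [True, False, True, True], [0.5, 0.5, 1.0, 1.0])
```

In practice the observation probabilities would usually have to be estimated, for example from the surrogate covariates, which is part of what makes the efficient versions computationally demanding.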

9.
Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.

10.
The case-cohort study design is widely used to reduce cost when collecting expensive covariates in large cohort studies with survival or competing risks outcomes. A case-cohort study dataset consists of two parts: (a) a random sample and (b) all cases or failures from a specific cause of interest. Clinicians often assess covariate effects on competing risks outcomes. The proportional subdistribution hazards model directly evaluates the effect of a covariate on the cumulative incidence function under the non-covariate-dependent censoring assumption for the full cohort study. However, the non-covariate-dependent censoring assumption is often violated in many biomedical studies. In this article, we propose a proportional subdistribution hazards model for case-cohort studies with stratified data with covariate-adjusted censoring weight. We further propose an efficient estimator when extra information from the other causes is available under case-cohort studies. The proposed estimators are shown to be consistent and asymptotically normal. Simulation studies show (a) the proposed estimator is unbiased when the censoring distribution depends on covariates and (b) the proposed efficient estimator gains estimation efficiency when using extra information from the other causes. We analyze a bone marrow transplant dataset and a coronary heart disease dataset using the proposed method.  相似文献   

11.
12.
In many biomedical studies, budget constraints mean that the primary covariate is collected only in a randomly selected subset of the full study cohort. Often, there is an inexpensive auxiliary covariate for the primary exposure variable that is readily available for all the cohort subjects. Valid statistical methods that make use of the auxiliary information to improve study efficiency need to be developed. To this end, we develop an estimated partial likelihood approach for correlated failure time data with auxiliary information. We assume a marginal hazard model with a common baseline hazard function. The asymptotic properties of the proposed estimators are developed. The proof of the asymptotic results for the proposed estimators is nontrivial since the moments used in the estimating equation are not martingale-based and the classical martingale theory is not sufficient. Instead, our proofs rely on modern empirical process theory. The proposed estimator is evaluated through simulation studies and is shown to have increased efficiency compared to existing methods. The proposed method is illustrated with a data set from the Framingham study.

13.
Nested case-control and case-cohort studies are useful for studying associations between covariates and time-to-event when some covariates are expensive to measure. Full covariate information is collected in the nested case-control or case-cohort sample only, while cheaply measured covariates are often observed for the full cohort. Standard analysis of such case-control samples ignores any full cohort data. Previous work has shown how data for the full cohort can be used efficiently by multiple imputation of the expensive covariate(s), followed by a full-cohort analysis. For large cohorts this is computationally expensive or even infeasible. An alternative is to supplement the case-control samples with additional controls on which cheaply measured covariates are observed. We show how multiple imputation can be used for analysis of such supersampled data. Simulations show that this brings efficiency gains relative to a traditional analysis and that the efficiency loss relative to using the full cohort data is not substantial.  相似文献   

14.
In this paper, we consider a semiparametric time-varying coefficients regression model where the influences of some covariates vary non-parametrically with time while the effects of the remaining covariates follow certain parametric functions of time. The weighted least squares type estimators for the unknown parameters of the parametric coefficient functions as well as the estimators for the non-parametric coefficient functions are developed. We show that the kernel smoothing that avoids modelling of the sampling times is asymptotically more efficient than a single nearest neighbour smoothing that depends on the estimation of the sampling model. The asymptotic optimal bandwidth is also derived. A hypothesis testing procedure is proposed to test whether some covariate effects follow certain parametric forms. Simulation studies are conducted to compare the finite sample performances of the kernel neighbourhood smoothing and the single nearest neighbour smoothing and to check the empirical sizes and powers of the proposed testing procedures. An application to a data set from an AIDS clinical trial study is provided for illustration.

15.
Varying covariate effects often manifest meaningful heterogeneity in covariate-response associations. In this paper, we adopt a quantile regression model that assumes linearity at a continuous range of quantile levels as a tool to explore such data dynamics. The consideration of potential non-constancy of covariate effects necessitates a new perspective for variable selection, which, under the assumed quantile regression model, is to retain variables that have effects on all quantiles of interest as well as those that influence only some of the quantiles considered. Current work on ℓ1-penalized quantile regression either does not concern varying covariate effects or may not produce consistent variable selection in the presence of covariates with partial effects, a practical scenario of interest. In this work, we propose a shrinkage approach by adopting a novel uniform adaptive LASSO penalty. The new approach enjoys easy implementation without requiring smoothing. Moreover, it can consistently identify the true model (uniformly across quantiles) and achieve the oracle estimation efficiency. We further extend the proposed shrinkage method to the case where responses are subject to random right censoring. Numerical studies confirm the theoretical results and support the utility of our proposals.

16.
The authors propose methods for Bayesian inference for generalized linear models with missing covariate data. They specify a parametric distribution for the covariates that is written as a sequence of one‐dimensional conditional distributions. They propose an informative class of joint prior distributions for the regression coefficients and the parameters arising from the covariate distributions. They examine the properties of the proposed prior and resulting posterior distributions. They also present a Bayesian criterion for comparing various models, and a calibration is derived for it. A detailed simulation is conducted and two real data sets are examined to demonstrate the methodology.  相似文献   

17.
Missing covariate data are common in biomedical studies. In this article, by using the nonparametric kernel regression technique, a new imputation approach is developed for the Cox proportional hazards regression model with missing covariates. This method achieves the same efficiency as the fully augmented weighted estimators (Qi et al. 2005. Journal of the American Statistical Association, 100:1250) and has a simpler form. The asymptotic properties of the proposed estimator are derived and analyzed. The proposed imputation method is compared with several other existing methods in a number of simulation studies and on a mouse leukemia data set.

18.
This paper considers inference about the individual level relationship between two dichotomous variables based on aggregated data. It is known that such analyses suffer from 'ecological bias', caused by the lack of homogeneity of this relationship across the groups over which the aggregation occurs. Two new methods for overcoming this bias, one based on local smoothing and the other a simple semiparametric approach, are developed and evaluated. The local smoothing approach performs best when it is used with a covariate which accounts for some of the variation in the relationships across groups. The semiparametric approach performed well in our evaluation even without such auxiliary information.

19.
In longitudinal observational studies, repeated measures are often correlated with observation times as well as censoring time. This article proposes joint modeling and analysis of longitudinal data with time-dependent covariates in the presence of informative observation and censoring times via a latent variable. Estimating equation approaches are developed for parameter estimation and asymptotic properties of the proposed estimators are established. In addition, a generalization of the semiparametric model with time-varying coefficients for the longitudinal response is considered. Furthermore, a lack-of-fit test is provided for assessing the adequacy of the model, and some tests are presented for investigating whether or not covariate effects vary with time. The finite-sample behavior of the proposed methods is examined in simulation studies, and an application to a bladder cancer study is illustrated.  相似文献   

20.
We propose an exploratory data analysis approach when data are observed as intervals in a nonparametric regression setting. The interval-valued data contain richer information than single-valued data in the sense that they provide both center and range information of the underlying structure. Conventionally, these two attributes have been studied separately as traditional tools can be readily used for single-valued data analysis. We propose a unified data analysis tool that attempts to capture the relationship between response and covariate by simultaneously accounting for variability present in the data. It utilizes a kernel smoothing approach, which is conducted in scale-space so that it considers a wide range of smoothing parameters rather than selecting an optimal value. It also visually summarizes the significance of trends in the data as a color map across multiple locations and scales. We demonstrate its effectiveness as an exploratory data analysis tool for interval-valued data using simulated and real examples.  相似文献   
