Similar literature
 Found 20 similar documents; search took 15 ms
1.
When variable selection with stepwise regression and model fitting are conducted on the same data set, competition for inclusion in the model induces a selection bias in coefficient estimators away from zero. In proportional hazards regression with right-censored data, selection bias inflates the absolute value of the parameter estimates for selected variables, while the omission of other variables may shrink coefficients toward zero. This paper explores the extent of the bias in parameter estimates from stepwise proportional hazards regression and proposes a bootstrap method, similar to those proposed by Miller (Subset Selection in Regression, 2nd edn. Chapman & Hall/CRC, 2002) for linear regression, to correct for selection bias. We also use bootstrap methods to estimate the standard error of the adjusted estimators. Simulation results show that substantial biases could be present in uncorrected stepwise estimators and, for binary covariates, could exceed 250% of the true parameter value. The simulations also show that the conditional mean of the proposed bootstrap bias-corrected parameter estimator, given that a variable is selected, is moved closer to the unconditional mean of the standard partial likelihood estimator in the chosen model, and to the population value of the parameter. We also explore the effect of the adjustment on estimates of log relative risk, given the values of the covariates in a selected model. The proposed method is illustrated with data sets in primary biliary cirrhosis and in multiple myeloma from the Eastern Cooperative Oncology Group.
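The abstract does not spell out its algorithm, so the following is only a generic sketch of bootstrap bias correction, not the authors' selection-aware procedure. It assumes an arbitrary estimator function and uses the plug-in variance (which is biased downward) as a stand-in; the names `plug_in_variance` and `bootstrap_bias_corrected` are hypothetical. A method of the kind described would presumably also repeat the stepwise selection inside every resample.

```python
import random


def plug_in_variance(xs):
    """Variance with divisor n (biased downward); used only as a toy estimator."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def bootstrap_bias_corrected(data, estimator, n_boot=500, seed=0):
    """Return (bias-corrected estimate, estimated bias) via the ordinary bootstrap."""
    rng = random.Random(seed)
    n = len(data)
    estimate = estimator(data)
    # Re-estimate on n_boot resamples drawn with replacement from the data.
    boot = [
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
    bias = sum(boot) / n_boot - estimate
    # Subtracting the estimated bias is the same as 2 * estimate - mean(boot).
    return estimate - bias, bias


sample = list(range(1, 11))
corrected, bias = bootstrap_bias_corrected(sample, plug_in_variance)
print(plug_in_variance(sample), bias, corrected)
```

For the plug-in variance the estimated bias is negative, so the corrected value is larger than the raw estimate.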

2.
The added variable plot is useful for examining the effect of a covariate in regression models. The plot provides information regarding the inclusion of a covariate, and is useful in identifying influential observations on the parameter estimates. Hall et al. (1996) proposed a plot for Cox's proportional hazards model derived by regarding the Cox model as a generalized linear model. This paper proves and discusses properties of this plot. These properties make the plot a valuable tool in model evaluation. Quantities considered include parameter estimates, residuals, leverage, case influence measures and correspondence to previously proposed residuals and diagnostics.

3.
Panel count data often occur in a long-term study where the primary end point is the time to a specific event and each subject may experience multiple recurrences of this event. Furthermore, suppose that it is not feasible to keep subjects under observation continuously and the numbers of recurrences for each subject are only recorded at several distinct time points over the study period. Moreover, the set of observation times may vary from subject to subject. In this paper, regression methods, which are derived under simple semiparametric models, are proposed for the analysis of such longitudinal count data. In particular, we consider the situation when both observation and censoring times may depend on covariates. The new procedures are illustrated with data from a well-known cancer study.

4.
Data‐analytic tools for models other than the normal linear regression model are relatively rare. Here we develop plots and diagnostic statistics for nonconstant variance for the random‐effects model (REM). REMs for longitudinal data include both within‐ and between‐subject variances. A basic assumption is that the two variance terms are constant across subjects. However, we often find that these variances are functions of covariates, and the data set has what we call explainable heterogeneity, which needs to be allowed for in the model. We characterize several types of heterogeneity of variance in REMs and develop three diagnostic tests using the score statistic: one for each of the two variance terms, and the third for a form of multivariate nonconstant variance. For each test we present an adjusted residual plot which can identify cases that are unusually influential on the outcome of the test.

5.
With competing risks data, one often needs to assess the treatment and covariate effects on the cumulative incidence function. Fine and Gray proposed a proportional hazards regression model for the subdistribution of a competing risk with the assumption that the censoring distribution and the covariates are independent. Covariate‐dependent censoring sometimes occurs in medical studies. In this paper, we study the proportional hazards regression model for the subdistribution of a competing risk with proper adjustments for covariate‐dependent censoring. We consider a covariate‐adjusted weight function by fitting the Cox model for the censoring distribution and using the predictive probability for each individual. Our simulation study shows that the covariate‐adjusted weight estimator is essentially unbiased when the censoring time depends on the covariates, and the covariate‐adjusted weight approach works well for the variance estimator as well. We illustrate our methods with bone marrow transplant data from the Center for International Blood and Marrow Transplant Research. Here, cancer relapse and death in complete remission are two competing risks.

6.
We propose a method for assessing an individual patient's risk of a future clinical event using clinical trial or cohort data and Cox proportional hazards regression, combining the information from several studies using meta-analysis techniques. The method combines patient-specific estimates of the log cumulative hazard across studies, weighting by the relative precision of the estimates, using either fixed- or random-effects meta-analysis calculations. Risk assessment can be done for any future patient using a few key summary statistics determined once and for all from each study. Generalizations of the method to logistic regression and linear models are immediate. We evaluate the methods using simulation studies and illustrate their application using real data.
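The precision-weighted combination described in this abstract can be illustrated with the standard fixed-effect (inverse-variance) pooling formula. This is a minimal sketch, not the authors' implementation: the function names and the per-study numbers are hypothetical, and the random-effects variant is not shown.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted average of per-study estimates and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def risk_from_log_cum_hazard(log_cum_hazard):
    """Event probability 1 - exp(-H) implied by a log cumulative hazard log(H)."""
    return 1.0 - math.exp(-math.exp(log_cum_hazard))


# Hypothetical per-study estimates of one patient's log cumulative hazard.
study_estimates = [-1.20, -0.90, -1.05]
study_std_errors = [0.20, 0.30, 0.25]

pooled, pooled_se = pool_fixed_effect(study_estimates, study_std_errors)
print(pooled, pooled_se, risk_from_log_cum_hazard(pooled))
```

Pooling on the log cumulative hazard scale and transforming back at the end keeps the pooled risk inside the interval (0, 1).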

7.
The analysis of covariance is a technique that is used to improve the power of a k-sample test by adjusting for concomitant variables. If the end point is the time of survival, and some observations are right censored, the score statistic from the Cox proportional hazards model is the method that is most commonly used to test the equality of conditional hazard functions. In many situations, however, the proportional hazards model assumptions are not satisfied. Specifically, the relative risk function may not be time invariant or may not be representable as a log-linear function of the covariates. We propose an asymptotically valid k-sample test statistic to compare conditional hazard functions which does not require the assumption of proportional hazards, a parametric specification of the relative risk function or randomization of group assignment. Simulation results indicate that the performance of this statistic is satisfactory. The methodology is demonstrated on a data set in prostate cancer.

8.
Circular data are observations that are represented as points on a unit circle. Times of day and directions of wind are two such examples. In this work, we present a Bayesian approach to regress a circular variable on a linear predictor. The regression coefficients are assumed to have a nonparametric distribution with a Dirichlet process prior. The semiparametric Bayesian approach gives added flexibility to the model and is useful especially when the likelihood surface is ill behaved. Markov chain Monte Carlo techniques are used to fit the proposed model and to generate predictions. The method is illustrated using an environmental data set.

9.
This paper provides a statistically unified method for modelling trends in groundwater levels for a national project that aims to predict areas at risk from salinity in 2020. It was necessary to characterize the trends in groundwater levels in thousands of boreholes that have been monitored by Agriculture Western Australia throughout the south-west of Western Australia over the last 10 years. The approach investigated in the present paper uses segmented regression with constraints when the number of change points is unknown. For each segment defined by change points, the trend can be described by a linear trend possibly superimposed on a periodic response. Four different types of change point are defined by constraints on the model parameters to cope with different patterns of change in groundwater levels. For a set of candidate change points provided by the user, a modified Akaike information criterion is used for model selection. Model parameters can be estimated by multiple linear regression. Some typical examples are presented to demonstrate the performance of the approach.

10.
We propose the use of hyperbolas as covariates in piecewise linear regression splines to fit data exhibiting a multi-phase linear response with smooth transitions between phases. The hyperbolic regression spline model, fitted by non-linear regression, provides an intuitive and easy way to extend to multiple phases the two-phase hyperbolic response model previously proposed by others. The small additional effort required to fit non-linear, as opposed to linear, regression models is particularly worthwhile when investigators are unwilling to assume that the slope of the response changes abruptly at the join points. Furthermore, undue influence on the join point and slope estimates, resulting from points in the transition region, may be avoided by using the hyperbolic regression spline. Two examples illustrate the use of this method.
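One way to picture a hyperbola used as a spline covariate is the term sqrt((x - c)^2 + g^2), which tends to |x - c| as g goes to 0 and therefore rounds off the corner of a piecewise linear fit. The parameterization below is an assumption chosen for illustration and may differ from the one used in the paper; the function name and the tuple layout of `knots` are hypothetical.

```python
import math


def hyperbolic_spline(x, intercept, slope, knots):
    """Evaluate a multi-phase response with smooth joins.

    knots is a list of (location, half_slope_change, smoothness) tuples.
    Each knot contributes half_slope_change * sqrt((x - location)^2 + smoothness^2),
    so the slope changes by 2 * half_slope_change across the join, and
    smoothness = 0 recovers an abrupt piecewise linear change.
    """
    value = intercept + slope * x
    for location, half_slope_change, smoothness in knots:
        value += half_slope_change * math.sqrt((x - location) ** 2 + smoothness ** 2)
    return value


# Three phases with joins near x = 2 and x = 5 (illustrative values only).
knots = [(2.0, 0.5, 0.3), (5.0, -0.75, 0.3)]
for x in [0.0, 2.0, 3.5, 5.0, 8.0]:
    print(x, hyperbolic_spline(x, 0.0, 1.0, knots))
```

In a fit, the knot locations and smoothness values enter non-linearly, which is why the abstract refers to non-linear regression.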

11.
This article develops a local partial likelihood technique to estimate the time-dependent coefficients in Cox's regression model. The basic idea is a simple extension of the local linear fitting technique used in scatterplot smoothing. The coefficients are estimated locally based on the partial likelihood in a window around each time point. Multiple time-dependent covariates are incorporated in the local partial likelihood procedure. The procedure is useful as a diagnostic tool and can be used in uncovering time-dependencies or departure from the proportional hazards model. The programming involved in the local partial likelihood estimation is relatively simple and it can be adapted with little effort from existing programs for the proportional hazards model. The asymptotic properties of the resulting estimator are established and compared with those from the local constant fitting. A consistent estimator of the asymptotic variance is also proposed. The approach is illustrated by a real data set from the study of gastric cancer patients and a simulation study is also presented.

12.
Yu, Tingting; Wu, Lang; Gilbert, Peter. Lifetime Data Analysis, 2019, 25(2): 229-258

In HIV vaccine studies, longitudinal immune response biomarker data are often left-censored due to lower limits of quantification of the employed immunological assays. The censoring information is important for predicting HIV infection, the failure event of interest. We propose two approaches to addressing left censoring in longitudinal data: one that makes no distributional assumptions for the censored data—treating left censored values as a “point mass” subgroup—and the other makes a distributional assumption for a subset of the censored data but not for the remaining subset. We develop these two approaches to handling censoring for joint modelling of longitudinal and survival data via a Cox proportional hazards model fit by h-likelihood. We evaluate the new methods via simulation and analyze an HIV vaccine trial data set, finding that longitudinal characteristics of the immune response biomarkers are highly associated with the risk of HIV infection.


13.
Longitudinal studies of neurological disorders suffer almost inevitably from non-compliance, which is likely to be non-ignorable. It is important in these cases to model the response variable and the dropout mechanism jointly. In this article we propose a Monte Carlo version of the EM algorithm that can be used to fit random-coefficient-based dropout models. A linear mixed model is assumed for the response variable and a discrete-time proportional hazards model for the dropout mechanism; these share a common set of random coefficients. The ideas are illustrated using data from a five-year trial assessing the efficacy of two drugs in the treatment of patients in the early stages of Parkinson's disease.

14.
We study a binary regression model using the complementary log–log link, where the response variable Δ is the indicator of an event of interest (for example, the incidence of cancer, or the detection of a tumour) and the set of covariates can be partitioned as (X, Z) where Z (real valued) is the primary covariate and X (vector valued) denotes a set of control variables. The conditional probability of the event of interest is assumed to be monotonic in Z, for every fixed X. A finite-dimensional (regression) parameter β describes the effect of X. We show that the baseline conditional probability function (corresponding to X = 0) can be estimated by isotonic regression procedures and develop an asymptotically pivotal likelihood-ratio-based method for constructing (asymptotic) confidence sets for the regression function. We also show how likelihood-ratio-based confidence intervals for the regression parameter can be constructed using the chi-square distribution. An interesting connection to the Cox proportional hazards model under current status censoring emerges. We present simulation results to illustrate the theory and apply our results to a data set involving lung tumour incidence in mice.
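The connection to the Cox model mentioned in this abstract can be sketched as follows. The notation is an assumption introduced here, not taken from the abstract: T is a failure time following a proportional hazards model with baseline cumulative hazard Λ0, the inspection time plays the role of Z, and Δ = 1{T ≤ Z} is the current status indicator.

```latex
\begin{align*}
P(\Delta = 1 \mid X = x, Z = z)
  &= 1 - \exp\bigl\{-\Lambda_0(z)\, e^{\beta^\top x}\bigr\}, \\
\log\bigl[-\log\{1 - P(\Delta = 1 \mid X = x, Z = z)\}\bigr]
  &= \log \Lambda_0(z) + \beta^\top x .
\end{align*}
```

Under these assumptions the complementary log–log transform of the event probability is linear in x, and because Λ0 is nondecreasing the probability is monotone in z for every fixed x, which matches the monotonicity assumption stated in the abstract.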

15.
Fong, Daniel Y.T.; Lam, K.F.; Lawless, J.F.; Lee, Y.W. Lifetime Data Analysis, 2001, 7(4): 345-362
We consider recurrent event data when the duration or gap times between successive event occurrences are of intrinsic interest. Subject heterogeneity not attributed to observed covariates is usually handled by random effects which result in an exchangeable correlation structure for the gap times of a subject. Recently, efforts have been put into relaxing this restriction to allow non-exchangeable correlation. Here we consider dynamic models where random effects can vary stochastically over the gap times. We extend the traditional Gaussian variance components models and evaluate a previously proposed proportional hazards model through a simulation study and some examples. In addition, semiparametric estimation of the proportional hazards models is considered. Both models are easily used. The Gaussian models are easily interpreted in terms of the variance structure. On the other hand, the proportional hazards models would be more appropriate in the context of survival analysis, particularly in the interpretation of the regression parameters. They can be sensitive to the choice of model for random effects but not to the choice of the baseline hazard function.

16.
The authors define a class of “partially linear single‐index” survival models that are more flexible than the classical proportional hazards regression models in their treatment of covariates. The latter enter the proposed model either via a parametric linear form or a nonparametric single‐index form. It is then possible to model both linear and functional effects of covariates on the logarithm of the hazard function and if necessary, to reduce the dimensionality of multiple covariates via the single‐index component. The partially linear hazards model and the single‐index hazards model are special cases of the proposed model. The authors develop a likelihood‐based inference to estimate the model components via an iterative algorithm. They establish an asymptotic distribution theory for the proposed estimators, examine their finite‐sample behaviour through simulation, and use a set of real data to illustrate their approach.

17.
Proportional hazards frailty models use a random effect, the so-called frailty, to construct association for clustered failure time data. It is customary to assume that the random frailty follows a gamma distribution. In this paper, we propose a graphical method for assessing adequacy of the proportional hazards frailty models. In particular, we focus on the assessment of the gamma distribution assumption for the frailties. We calculate the average of the posterior expected frailties at several follow-up time points and compare it at these time points to 1, the known mean frailty. Large discrepancies indicate lack of fit. To aid in assessing the goodness of fit, we derive and estimate the standard error of the mean of the posterior expected frailties at each time point examined. We give an example to illustrate the proposed methodology and perform sensitivity analysis by simulations.

18.
Many clinical research studies evaluate a time‐to‐event outcome, illustrate survival functions, and conventionally report estimated hazard ratios to express the magnitude of the treatment effect when comparing between groups. However, it may not be straightforward to interpret the hazard ratio clinically and statistically when the proportional hazards assumption is invalid. In some recent papers published in clinical journals, the use of restricted mean survival time (RMST) or τ‐year mean survival time is discussed as one of the alternative summary measures for the time‐to‐event outcome. The RMST is defined as the expected value of time to event limited to a specific time point corresponding to the area under the survival curve up to the specific time point. This article summarizes the necessary information to conduct statistical analysis using the RMST, including the definition and statistical properties of the RMST, adjusted analysis methods, sample size calculation, information fraction for the RMST difference, and clinical and statistical meaning and interpretation. Additionally, we discuss how to set the specific time point to define the RMST from two main points of view. We also provide developed SAS codes to determine the sample size required to detect an expected RMST difference with appropriate power and reconstruct individual survival data to estimate an RMST reference value from a reported survival curve.
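The definition of the RMST as the area under the survival curve up to a chosen time point can be made concrete with a small sketch. It assumes the survival curve is estimated by the Kaplan-Meier method; the function names are hypothetical, and this is unrelated to the SAS code the article provides.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the event was observed at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last rectangle runs from the final step before tau up to tau itself.
    area += prev_s * (tau - prev_t)
    return area


print(rmst([1, 2, 3, 4], [1, 1, 1, 1], tau=4))
```

With no censoring and tau at or beyond the largest time, the result equals the ordinary sample mean of the event times; a smaller tau restricts the mean to the window [0, tau].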

19.
An added variable plot is a commonly used plot in regression diagnostics. The rationale for this plot is to provide information about the addition of a further explanatory variable to the model. In addition, an added variable plot is most often used for detecting high leverage points and influential data. So far as we know, this type of plot involves the least squares residuals which, we suspect, could produce a confusing picture when a group of unusual cases are present in the data. In this situation, added variable plots may not only fail to detect the unusual cases but also may fail to focus on the need for adding a further regressor to the model. We suggest that residuals from deletion should be more convincing and reliable in this type of plot. The usefulness of an added variable plot based on residuals from deletion is investigated through a few examples and a Monte Carlo simulation experiment in a variety of situations.
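For orientation, the classical least-squares construction that this abstract takes as its starting point can be sketched as below. It covers only the simplest case of one regressor already in the model plus an intercept, it is not the deletion-residual version the authors propose, and all function names are hypothetical.

```python
def simple_ols_residuals(y, x):
    """Residuals from the least-squares regression of y on an intercept and x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def added_variable_points(y, x_current, x_candidate):
    """Coordinates of the added variable plot for x_candidate.

    Horizontal axis: residuals of x_candidate regressed on x_current.
    Vertical axis: residuals of y regressed on x_current.
    """
    e_x = simple_ols_residuals(x_candidate, x_current)
    e_y = simple_ols_residuals(y, x_current)
    return e_x, e_y


def slope_through_origin(e_x, e_y):
    """Least-squares slope of e_y on e_x with no intercept."""
    return sum(a * b for a, b in zip(e_x, e_y)) / sum(a * a for a in e_x)


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]
e_x, e_y = added_variable_points(y, x1, x2)
print(slope_through_origin(e_x, e_y))
```

The slope of the plotted points equals the coefficient the candidate variable would receive in the enlarged model, which is why the plot is informative about adding a regressor.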

20.
A test for lack of fit in regression is presented. Unlike other methods, this one doesn't require replicates or a prior estimate of variance. It can be used for linear or multiple regression, and would be easy to add to existing computer packages. It is based on comparing a fit over low leverage points with a fit over the entire set of data. Distribution theory results are presented, with examples of power. A discussion of its use for detecting violations of other regression assumptions is also given.
