Similar literature
20 similar documents found
1.
Using several variables known to be related to prostate cancer, a multivariate classification method is developed to predict the onset of clinical prostate cancer. A multivariate mixed-effects model is used to describe longitudinal changes in prostate specific antigen (PSA), a free testosterone index (FTI), and body mass index (BMI) before any clinical evidence of prostate cancer. The patterns of change in these three variables are allowed to vary depending on whether the subject develops prostate cancer or not and the severity of the prostate cancer at diagnosis. An application of Bayes' theorem provides posterior probabilities that we use to predict whether an individual will develop prostate cancer and, if so, whether it is a high-risk or a low-risk cancer. The classification rule is applied sequentially, one multivariate observation at a time, until the subject is classified as a cancer case or until the last observation has been used. We perform the analyses using each of the three variables individually, combined together in pairs, and all three variables together in one analysis. We compare the classification results among the various analyses, and a simulation study demonstrates how the sensitivity of prediction changes with respect to the number and type of variables used in the prediction process.
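The sequential rule described in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, a single marker with independent normal densities per group instead of the multivariate mixed-effects model the authors use; the group names, parameters, priors, and the 0.5 threshold are all hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution (stand-in for the model likelihood)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_posterior(prior, likelihoods):
    """One application of Bayes' theorem over a finite set of groups."""
    unnormalized = {g: prior[g] * likelihoods[g] for g in prior}
    total = sum(unnormalized.values())
    return {g: v / total for g, v in unnormalized.items()}


def classify_sequentially(observations, prior, params, threshold=0.5):
    """Update the posterior one observation at a time.

    Stops as soon as the posterior probability of cancer (low-risk plus
    high-risk) exceeds the threshold; otherwise returns "no_cancer" after
    the last observation. Returns (label, observations used, posterior).
    """
    posterior = dict(prior)
    for step, x in enumerate(observations, start=1):
        likelihoods = {g: normal_pdf(x, *params[g]) for g in posterior}
        posterior = update_posterior(posterior, likelihoods)
        p_cancer = posterior["low_risk"] + posterior["high_risk"]
        if p_cancer > threshold:
            label = max(("low_risk", "high_risk"), key=posterior.get)
            return label, step, posterior
    return "no_cancer", len(observations), posterior


# Hypothetical (mean, sd) of the marker in each group, and prior probabilities.
PARAMS = {"no_cancer": (1.0, 0.5), "low_risk": (2.0, 0.5), "high_risk": (4.0, 0.5)}
PRIOR = {"no_cancer": 0.8, "low_risk": 0.1, "high_risk": 0.1}

label, n_used, posterior = classify_sequentially([1.2, 2.5, 3.9], PRIOR, PARAMS)
```

Because the posterior from one step becomes the prior for the next, the rule can stop early, which is the property the abstract relies on when it classifies "one multivariate observation at a time".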

2.
Accurate diagnosis of a molecularly defined subtype of cancer is often an important step toward its effective control and treatment. For the diagnosis of some subtypes of a cancer, a gold standard with perfect sensitivity and specificity may be unavailable. In those scenarios, tumor subtype status is commonly measured by multiple imperfect diagnostic markers. Additionally, in many such studies, some subjects are only measured by a subset of diagnostic tests and the missing probabilities may depend on the unknown disease status. In this paper, we present statistical methods based on the EM algorithm to evaluate incomplete multiple imperfect diagnostic tests under a missing-at-random assumption and one missing-not-at-random scenario. We apply the proposed methods to a real data set from the National Cancer Institute (NCI) colon cancer family registry on diagnosing microsatellite instability for hereditary non-polyposis colorectal cancer to estimate diagnostic accuracy parameters (i.e. sensitivities and specificities), prevalence, and potential differential missing probabilities for 11 biomarker tests. Simulations are also conducted to evaluate the small-sample performance of our methods.

3.
In biomedical research, two or more biomarkers may be available for diagnosis of a particular disease. Selecting the single biomarker that best discriminates a diseased group from a healthy group is a problem commonly confronted in the diagnostic process. Frequently, the area under the receiver operating characteristic (ROC) curve is used as the accuracy measure for choosing the best diagnostic marker among those available. Some authors have tried to combine multiple markers through an optimal linear combination to increase the discriminatory power. In this paper, we propose an alternative method that combines two continuous biomarkers by direct bivariate modeling of the ROC curve under a log-normality assumption. The proposed method is applied to a simulated data set and a prostate cancer diagnostic biomarker data set.
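As a minimal sketch of the accuracy measure mentioned in this abstract, the empirical AUC can be computed as the proportion of diseased/healthy pairs in which the diseased subject has the higher marker value, with ties counted as one half. This is the standard Mann-Whitney form, not the bivariate log-normal ROC model the abstract proposes; the function name and the toy marker values are hypothetical.

```python
def empirical_auc(diseased, healthy):
    """Empirical area under the ROC curve for one continuous marker.

    Assumes higher marker values indicate disease. Each (diseased, healthy)
    pair scores 1 if the diseased value is larger, 0.5 for a tie, else 0.
    """
    if not diseased or not healthy:
        raise ValueError("both groups must be non-empty")
    score = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                score += 1.0
            elif d == h:
                score += 0.5
    return score / (len(diseased) * len(healthy))


# Hypothetical values of two candidate markers in the same subjects.
marker_a = {"diseased": [3.1, 4.0, 5.2], "healthy": [1.0, 2.2, 3.5]}
marker_b = {"diseased": [2.0, 2.1, 2.9], "healthy": [1.9, 2.4, 3.0]}

auc_a = empirical_auc(marker_a["diseased"], marker_a["healthy"])
auc_b = empirical_auc(marker_b["diseased"], marker_b["healthy"])
best_marker = "a" if auc_a >= auc_b else "b"
```

Choosing the marker with the larger empirical AUC is the single-marker selection strategy the abstract contrasts with combining markers.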

4.
Array-based comparative genomic hybridization (aCGH) is a high-resolution, high-throughput technique for studying the genetic basis of cancer. The resulting data consist of log fluorescence ratios as a function of the genomic DNA location and provide a cytogenetic representation of the relative DNA copy number variation. Analysis of such data typically involves estimation of the underlying copy number state at each location and segmenting regions of DNA with similar copy number states. Most current methods proceed by modeling a single sample/array at a time, and thus fail to borrow strength across multiple samples to infer shared regions of copy number aberrations. We propose a hierarchical Bayesian random segmentation approach for modeling aCGH data that utilizes information across arrays from a common population to yield segments of shared copy number changes. These changes characterize the underlying population and allow us to compare different population aCGH profiles to assess which regions of the genome have differential alterations. Our method, referred to as BDSAcgh (Bayesian Detection of Shared Aberrations in aCGH), is based on a unified Bayesian hierarchical model that allows us to obtain probabilities of alteration states as well as probabilities of differential alteration that correspond to local false discovery rates. We evaluate the operating characteristics of our method via simulations and an application using a lung cancer aCGH data set.

5.
In this article, for the first time, we propose the negative binomial–beta Weibull (BW) regression model to study the recurrence of prostate cancer and to predict the cure fraction for patients with clinically localized prostate cancer treated by open radical prostatectomy. The cure model considers that a fraction of the survivors are cured of the disease. The survival function for the population of patients can be modeled by a cure parametric model using the BW distribution. We derive an explicit expansion for the moments of the recurrence time distribution for the uncured individuals. The proposed distribution can be used to model survival data when the hazard rate function is increasing, decreasing, unimodal or bathtub-shaped. Another advantage is that the proposed model includes as special sub-models some of the well-known cure rate models discussed in the literature. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes. We analyze a real data set for localized prostate cancer patients after open radical prostatectomy.

6.
We study the properties of the so-called log-beta Weibull distribution, defined by the logarithm of the beta Weibull random variable (Famoye et al. in J Stat Theory Appl 4:121–136, 2005; Lee et al. in J Mod Appl Stat Methods 6:173–186, 2007). An advantage of the new distribution is that it includes as special sub-models classical distributions reported in the lifetime literature. We obtain formal expressions for the moments, moment generating function, quantile function and mean deviations. We construct a regression model based on the new distribution to predict recurrence of prostate cancer for patients with clinically localized prostate cancer treated by open radical prostatectomy. It can be applied to censored data since it represents a parametric family of models that includes as special sub-models several widely known regression models. The regression model was fitted to a data set of 1,324 eligible prostate cancer patients. We can predict the recurrence-free probability after radical prostatectomy in terms of highly significant clinical and pathological explanatory variables associated with the recurrence of the disease. The predicted probabilities of remaining free of cancer progression are calculated under two nested models.

7.
Historically, the cure rate model has been used for modeling time-to-event data within which a significant proportion of patients are assumed to be cured of illnesses, including breast cancer, non-Hodgkin lymphoma, leukemia, prostate cancer, melanoma, and head and neck cancer. Perhaps the most popular type of cure rate model is the mixture model introduced by Berkson and Gage [1]. In this model, it is assumed that a certain proportion of the patients are cured, in the sense that they do not present the event of interest during a long period of time and can be considered immune to the cause of failure under study. In this paper, we propose a general hazard model which accommodates comprehensive families of cure rate models as particular cases, including the model proposed by Berkson and Gage. The maximum-likelihood-estimation procedure is discussed. A simulation study analyzes the coverage probabilities of the asymptotic confidence intervals for the parameters. A real data set on children exposed to HIV by vertical transmission illustrates the methodology.
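The mixture cure model mentioned in this abstract has a simple population survival function: a cured fraction that never experiences the event, plus the survival of the uncured patients weighted by the remaining fraction. The sketch below assumes a Weibull distribution for the uncured group purely for illustration (the abstract does not specify one), and all parameter values are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Survival function of a Weibull distribution (assumed latency model)."""
    return math.exp(-((t / scale) ** shape))


def mixture_cure_survival(t, cure_fraction, shape, scale):
    """Population survival under a mixture cure model.

    S_pop(t) = pi + (1 - pi) * S_u(t), where pi is the cured proportion and
    S_u is the survival function of the uncured patients.
    """
    if not 0.0 <= cure_fraction <= 1.0:
        raise ValueError("cure_fraction must lie in [0, 1]")
    uncured = weibull_survival(t, shape, scale)
    return cure_fraction + (1.0 - cure_fraction) * uncured


# Hypothetical parameters: 30% cured, Weibull shape 1.5 and scale 2 (years).
curve = [mixture_cure_survival(t, 0.3, 1.5, 2.0) for t in (0.0, 1.0, 5.0, 50.0)]
```

The curve starts at 1 and levels off at the cure fraction rather than at 0, which is the plateau that motivates cure rate models for long-term survivors.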

8.
We present new statistical analyses of data arising from a clinical trial designed to compare two-stage dynamic treatment regimes (DTRs) for advanced prostate cancer. The trial protocol mandated that patients were to be initially randomized among four chemotherapies, and that those who responded poorly were to be rerandomized to one of the remaining candidate therapies. The primary aim was to compare the DTRs' overall success rates, with success defined by the occurrence of successful responses in each of two consecutive courses of the patient's therapy. Of the one hundred and fifty study participants, forty-seven did not complete their therapy per the algorithm. However, thirty-five of them did so for reasons that precluded further chemotherapy, i.e. toxicity and/or progressive disease. Consequently, rather than comparing the overall success rates of the DTRs in the unrealistic event that these patients had remained on their assigned chemotherapies, we conducted an analysis that compared viable switch rules defined by the per-protocol rules but with the additional provision that patients who developed toxicity or progressive disease switch to a non-prespecified therapeutic or palliative strategy. This modification involved consideration of bivariate per-course outcomes encoding both efficacy and toxicity. We used numerical scores elicited from the trial's Principal Investigator to quantify the clinical desirability of each bivariate per-course outcome, and defined one endpoint as their average over all courses of treatment. Two other simpler sets of scores as well as log survival time were also used as endpoints. Estimation of each DTR-specific mean score was conducted using inverse probability weighted methods that assumed that missingness in the twelve remaining drop-outs was informative but explainable in that it only depended on past recorded data. We conducted additional worst-best case analyses to evaluate sensitivity of our findings to extreme departures from the explainable drop-out assumption.

9.
In biomedical studies, it is of substantial interest to develop risk prediction scores using high-dimensional data such as gene expression data for clinical endpoints that are subject to censoring. In the presence of well-established clinical risk factors, investigators often prefer a procedure that also adjusts for these clinical variables. While accelerated failure time (AFT) models are a useful tool for the analysis of censored outcome data, they assume that covariate effects on the logarithm of time-to-event are linear, which is often unrealistic in practice. We propose to build risk prediction scores through regularized rank estimation in partly linear AFT models, where high-dimensional data such as gene expression data are modeled linearly and important clinical variables are modeled nonlinearly using penalized regression splines. We show through simulation studies that our model has better operating characteristics compared to several existing models. In particular, we show that there is a non-negligible effect on prediction as well as feature selection when nonlinear clinical effects are misspecified as linear. This work is motivated by a recent prostate cancer study, where investigators collected gene expression data along with established prognostic clinical variables and the primary endpoint is time to prostate cancer recurrence. We analyzed the prostate cancer data and evaluated prediction performance of several models based on the extended c statistic for censored data, showing that 1) the relationship between the clinical variable, prostate specific antigen (PSA), and prostate cancer recurrence is likely nonlinear, i.e., the time to recurrence decreases as PSA increases and starts to level off when PSA becomes greater than 11; 2) correct specification of this nonlinear effect improves performance in prediction and feature selection; and 3) addition of gene expression data does not seem to further improve the performance of the resultant risk prediction scores.

10.
New statistical procedures are introduced to analyse typical microRNA expression data sets. For each separate microRNA expression, the null hypothesis to be tested is that there is no difference between the distributions of the expression in different groups. The test statistics are then constructed with certain types of alternatives in mind. To avoid strong (parametric) distributional assumptions, the alternatives are formulated using probabilities of different orders of pairs or triples of observations coming from different groups, and the test statistics are then constructed using the corresponding several‐sample U‐statistics, which are natural estimates of these probabilities. Classical several‐sample rank test statistics, such as the Kruskal–Wallis and Jonckheere–Terpstra tests, are special cases in our approach. Also, as the number of variables (microRNAs) is huge, we confront a serious simultaneous testing problem. Different approaches to control the family‐wise error rate or the false discovery rate are briefly discussed, and it is shown how the Chen–Stein theorem can be used to show that the family‐wise error rate can be controlled for cluster‐dependent microRNAs under weak assumptions. The theory is illustrated with an analysis of real data, a microRNA expression data set on Finnish (aggressive and non‐aggressive) prostate cancer patients and their controls.

11.
In many diagnostic studies, multiple diagnostic tests are performed on each subject or multiple disease markers are available. Commonly, the information should be combined to improve the diagnostic accuracy. We consider the problem of comparing the discriminatory abilities between two groups of biomarkers. Specifically, this article focuses on confidence interval estimation of the difference between paired AUCs based on optimally combined markers under the assumption of multivariate normality. Simulation studies demonstrate that the proposed generalized variable approach provides confidence intervals with satisfactory coverage probabilities at finite sample sizes. The proposed method can also easily provide P-values for hypothesis testing. Application to analysis of a subset of data from a study on coronary heart disease illustrates the utility of the method in practice.

12.
A full likelihood method is proposed to analyse continuous longitudinal data with non-ignorable (informative) missing values and non-monotone patterns. The problem arose in a breast cancer clinical trial where repeated assessments of quality of life were collected: patients rated their coping ability during and after treatment. We allow the missingness probabilities to depend on unobserved responses, and we use a multivariate normal model for the outcomes. A first-order Markov dependence structure for the responses is a natural choice and facilitates the construction of the likelihood; estimates are obtained via the Nelder–Mead simplex algorithm. Computations are difficult and become intractable with more than three or four assessments. Applying the method to the quality-of-life data results in easily interpretable estimates, confirms the suspicion that the data are non-ignorably missing and highlights the likely bias of standard methods. Although treatment comparisons are not affected here, the methods are useful for obtaining unbiased means and estimating trends over time.

13.
A model for survival analysis is studied that is relevant for samples which are subject to multiple types of failure. In comparison with a more standard approach, through the appropriate use of hazard functions and transition probabilities, the model allows for a more accurate study of cause-specific failure with regard to both the timing and type of failure. A semiparametric specification of a mixture model is employed that is able to adjust for concomitant variables and allows for the assessment of their effects on the probabilities of eventual causes of failure through a generalized logistic model, and their effects on the corresponding conditional hazard functions by employing the Cox proportional hazards model. A carefully formulated estimation procedure is presented that uses an EM algorithm based on a profile likelihood construction. The methods discussed, which could also be used for reliability analysis, are applied to a prostate cancer data set.

14.
Prostate cancer is the most common cancer diagnosed in American men and the second leading cause of death from malignancies. There are large geographical variations and racial disparities in the survival rate of prostate cancer. Much work on the spatial survival model is based on the proportional hazards model, but few studies have focused on the accelerated failure time model. In this paper, we investigate the prostate cancer data of Louisiana from the SEER program; the violation of the proportional hazards assumption suggests that a spatial survival model based on the accelerated failure time model is more appropriate for this data set. To account for the possible extra-variation, we consider spatially referenced independent or dependent spatial structures. The deviance information criterion (DIC) is used to select the best-fitting model within the Bayesian framework. The results from our study indicate that age, race, stage and geographical distribution are significant in evaluating prostate cancer survival.

15.
Suppose that the conditional density of a response variable given a vector of explanatory variables is parametrically modelled, and that data are collected by a two-phase sampling design. First, a simple random sample is drawn from the population. The stratum membership in a finite number of strata of the response and explanatory variables is recorded for each unit. Second, a subsample is drawn from the phase-one sample such that the selection probability is determined by the stratum membership. The response and explanatory variables are fully measured at this phase. We synthesize existing results on nonparametric likelihood estimation and present a streamlined approach for the computation and the large sample theory of profile likelihood in four different situations. The amount of information in terms of data and assumptions varies depending on whether the phase-one data are retained, the selection probabilities are known, and/or the stratum probabilities are known. We establish and illustrate numerically the order of efficiency among the maximum likelihood estimators, according to the amount of information utilized, in the four situations.

16.
Most clinical studies that investigate the impact of therapy simultaneously record the frequency of adverse events in order to monitor the safety of the intervention. Study reports typically summarise adverse event data by tabulating the frequencies of the worst grade experienced but provide no details of the temporal profiles of specific types of adverse events. Such 'toxicity profiles' are potentially important tools in disease management and in the assessment of newer therapies, including targeted treatments and immunotherapy, where different types of toxicity may be more common at various times during long-term drug exposure. Toxicity profiles of commonly experienced adverse events occurring due to exposure to long-term treatment could assist in evaluating the costs of the health care benefits of therapy. We show how to generate toxicity profiles using an adaptation of the ordinal time-to-event model comprising a two-step process: estimating the multinomial response probabilities using multinomial logistic regression, and combining these with recurrent time-to-event hazard estimates to produce cumulative event probabilities for each of the multinomial adverse event response categories. Such a model permits the simultaneous assessment of the risk of events over time and provides cumulative risk probabilities for each type of adverse event response. The method can be applied more generally by using different models to estimate outcome/response probabilities. The method is illustrated by developing toxicity profiles for three distinct types of adverse events associated with two treatment regimens for patients with advanced breast cancer.

17.
Plasma HIV viral load (VL) is the clinical indicator used to evaluate disease burden for HIV-infected patients. We developed a covariate-adjusted, three-state, homogeneous continuous-time Markov chain model for HIV/AIDS disease burden among subgroups. We defined Detectable and Undetectable HIV VL levels as two transient states and Death as the third, absorbing state. We implemented the exact maximum likelihood method to estimate the parameters, with the related asymptotic distribution used to conduct hypothesis testing. We evaluated the proposed model using HIV-infected individuals from South Carolina (SC) HIV surveillance data. Using the developed model, we estimated and compared the transition hazards, transition probabilities, and the state-specific duration for HIV-infected individuals. We examined gender, race/ethnicity, age, CD4 count, place of residence, and antiretroviral treatment regimen prescribed at the beginning of the study period. We found that a higher CD4 count, increased age, heterosexual orientation, white race, and use of a single-tablet regimen were associated with reduced risk of transitioning to a Detectable VL from an Undetectable VL, whereas shorter time since diagnosis, being male, and injection drug use increased the risk of the same transition.
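For a homogeneous continuous-time Markov chain such as the three-state model in this abstract, the transition probabilities over a time span t follow from the matrix of transition hazards Q as P(t) = exp(Qt). The sketch below computes that matrix exponential with a scaled Taylor series in plain Python; the hazard values in Q are hypothetical and are not estimates from the study, and covariate adjustment is omitted.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def transition_matrix(q, t, terms=20, squarings=10):
    """Approximate P(t) = exp(Q t) by scaling and squaring.

    The Taylor series is evaluated for Q * t / 2**squarings, where it
    converges quickly, and the result is squared back up to time t.
    """
    n = len(q)
    step = t / (2 ** squarings)
    a = [[q[i][j] * step for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # term holds a**k / k! after this update
        term = [[x / k for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# States: 0 = Detectable VL, 1 = Undetectable VL, 2 = Death (absorbing).
# Hypothetical hazards per year; each row sums to zero and the Death row
# is all zeros because no transition leaves an absorbing state.
Q = [
    [-0.50, 0.40, 0.10],
    [0.20, -0.25, 0.05],
    [0.00, 0.00, 0.00],
]

P_two_years = transition_matrix(Q, 2.0)
```

Each row of the resulting matrix is a probability distribution over the state occupied after t years, given the starting state; the mean time spent in a transient state before leaving it is the reciprocal of the negated diagonal hazard (for example 1 / 0.25 = 4 years in the Undetectable state under these made-up hazards).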

18.
Previous research on prostate cancer survival trends in the United States National Cancer Institute's Surveillance Epidemiology and End Results database has indicated a potential change-point in the age of diagnosis of prostate cancer around age 50. Identifying a change-point value in prostate cancer survival and cure could have important policy and health care management implications. Statistical analysis of these data has to address two complicating features: (1) change-point models are not smooth functions and so present computational and theoretical difficulties; and (2) models for prostate cancer survival need to account for the fact that many men diagnosed with prostate cancer can be effectively cured of their disease with early treatment. We develop a cure survival model that allows for change-point effects in covariates to investigate a potential change-point in the age of diagnosis of prostate cancer. Our results do not indicate that age under 50 is associated with increased hazard of death from prostate cancer.

19.
Many diseases, especially cancer, are not static, but rather can be summarized by a series of events or stages (e.g. diagnosis, remission, recurrence, metastasis, death). Most available methods to analyze multi-stage data ignore intermediate events and focus on the terminal event or consider (time to) multiple events as independent. Competing-risk or semi-competing-risk models are often deficient in describing the complex relationship between disease progression events which are driven by a shared progression stochastic process. A multi-stage model can only examine two stages at a time and thus fails to capture the effect of one stage on the time spent between other stages. Moreover, most models do not account for latent stages. We propose a semi-parametric joint model of diagnosis, latent metastasis, and cancer death and use nonparametric maximum likelihood to estimate covariate effects on the risks of intermediate events and death and the dependence between them. We illustrate the model with Monte Carlo simulations and analysis of real data on prostate cancer from the SEER database.

20.
Many disease processes are characterized by two or more successive health states, and it is often of interest and importance to assess state-specific covariate effects. However, with incomplete follow-up data such inference has not been satisfactorily addressed in the literature. We model the logarithm-transformed sojourn time in each state as linearly related to the covariates; however, neither the distributional form of the error term nor the dependence structure of the states needs to be specified. We propose a regression procedure to accommodate incomplete follow-up data. Asymptotic theory is presented, along with some tools for goodness-of-fit diagnostics. Simulation studies show that the proposal is reliable for practical use. We illustrate it by application to a cancer clinical trial.

