Differential analysis techniques are commonly used to offer scientists a dimension reduction procedure and an interpretable gateway to variable selection, especially when confronting high-dimensional genomic data. Huang et al. used a gene expression profile of breast cancer cell lines to identify genomic markers that are highly correlated with in vitro sensitivity to the drug Dasatinib. They considered three statistical methods to identify differentially expressed genes and used the intersection of the results. However, the statistical methods used in that paper are not sufficient to select the genomic markers. In this paper, we used three alternative statistical methods to select a combined list of genomic markers and compared it with the genes proposed by Huang et al. We then proposed using sparse principal component analysis (Sparse PCA) to identify a final list of genomic markers. Sparse PCA takes the correlation among the genes into account and supports successful discovery of genomic markers. We present a new, small set of genomic markers that effectively separates out the group of patients who are sensitive to Dasatinib. The analysis procedure should also help scientists identify genomic markers that separate two groups.

In studies with three ordinal diagnostic groups, the important measures of diagnostic accuracy are the volume under the surface (VUS) and the partial volume under the surface (PVUS), which extend the area under the curve (AUC) and the partial area under the curve (PAUC). This article addresses confidence interval estimation for the difference in paired VUSs and the difference in paired PVUSs. To focus on studies with small to moderate sample sizes, we propose an approach based on the concepts of generalized inference. A Monte Carlo study demonstrates that the proposed approach generally provides confidence intervals with reasonable coverage probabilities, even at small sample sizes. The proposed approach is compared with a parametric bootstrap approach and a large-sample approach through simulation. Finally, the proposed approach is illustrated with an application to a data set of blood test results from anemia patients.
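As a point of reference for the VUS mentioned above, the following is a minimal sketch of its nonparametric point estimate: the fraction of triples, one observation from each ordered group, that are ranked correctly. It is not the generalized-inference interval proposed in the abstract. The function name `empirical_vus` and the convention that larger values indicate a more severe group are assumptions made for illustration.

```python
from itertools import product


def empirical_vus(x1, x2, x3):
    """Fraction of triples (a, b, c) with a < b < c, one value per group.

    x1, x2, x3 are measurements from the three ordinal groups, listed from
    least to most severe (an assumed convention). Ties count as incorrect.
    """
    total = len(x1) * len(x2) * len(x3)
    if total == 0:
        raise ValueError("each group needs at least one observation")
    correct = sum(1 for a, b, c in product(x1, x2, x3) if a < b < c)
    return correct / total
```

A marker that orders the three groups at random gives a VUS near 1/6, while perfect ordering gives 1.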

In biomedical research, two or more biomarkers may be available for the diagnosis of a particular disease. Selecting a single biomarker that ideally discriminates a diseased group from a healthy group is a challenge in the diagnostic process. Most often, the area under the receiver operating characteristic (ROC) curve is used as the accuracy measure to choose the best diagnostic marker among those available. Some authors have tried to combine multiple markers through an optimal linear combination to increase the discriminatory power. In this paper, we propose an alternative method that combines two continuous biomarkers by direct bivariate modeling of the ROC curve under a log-normality assumption. The proposed method is applied to a simulated data set and a prostate cancer diagnostic biomarker data set.

Combining data from several tests or markers to classify patients according to their health status, and so assign better treatments, is a major issue in the study of diseases such as cancer. Several approaches to this problem have been proposed in the literature. In this paper, a step-by-step algorithm for estimating the parameters of a linear classifier that combines several measures is considered. The optimization criterion is to maximize the area under the receiver operating characteristic curve. The algorithm is applied to different simulated data sets and its performance is evaluated. Finally, the method is illustrated with a prostate cancer staging database.
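To make the optimization criterion above concrete, here is a minimal sketch that scores a linear combination of two markers by its empirical AUC and picks the best direction by a plain grid search over angles. The grid search is a stand-in chosen for brevity, not the step-by-step algorithm the abstract refers to, and the names `auc` and `best_direction` are hypothetical.

```python
import math


def auc(pos, neg):
    """Mann-Whitney estimate of P(pos > neg); ties count as 1/2."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_direction(pos, neg, steps=180):
    """Grid search for the angle theta maximizing the AUC of the score
    cos(theta) * marker1 + sin(theta) * marker2.

    pos and neg are lists of (marker1, marker2) pairs for diseased and
    healthy subjects. Returns (best_auc, best_theta); on ties the first
    angle found is kept.
    """
    best_auc, best_theta = -1.0, 0.0
    for k in range(steps):
        theta = 2 * math.pi * k / steps
        w1, w2 = math.cos(theta), math.sin(theta)
        value = auc(
            [w1 * x + w2 * y for x, y in pos],
            [w1 * x + w2 * y for x, y in neg],
        )
        if value > best_auc:
            best_auc, best_theta = value, theta
    return best_auc, best_theta
```

Because the AUC depends only on the ranking of scores, the weights matter only up to a positive scale factor, which is why searching over a single angle suffices for two markers.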

In many clinical studies, longitudinal biomarkers are used to monitor the progression of a disease. For example, in a kidney transplant study, the glomerular filtration rate (GFR) is used as a longitudinal biomarker to monitor the progression of kidney function and the patient's state of survival, which is characterized by multiple time-to-event outcomes such as kidney transplant failure and death. It is known that joint modelling of longitudinal and survival data leads to a more accurate and comprehensive estimation of the covariates' effect. While most joint models use the longitudinal outcome as a covariate for predicting survival, very few consider the further decomposition of the variation within the longitudinal trajectories and its effect on survival. We develop a joint model that uses functional principal component analysis (FPCA) to extract useful features from the longitudinal trajectories and adopts a competing risks model to handle multiple time-to-event outcomes. The longitudinal trajectories and the multiple time-to-event outcomes are linked via the shared functional features. Applying our model to a real kidney transplant data set reveals the significance of these functional features, and a simulation study is carried out to validate the accuracy of the estimation method.

A cure rate model is a survival model that incorporates the cure rate under the assumption that the population contains both uncured and cured individuals. It is a powerful statistical tool for prognostic studies, especially in cancer. The cure rate is important for making treatment decisions in clinical practice. The proportional hazards (PH) cure model can predict the cure rate for each patient. It contains a logistic regression component for the cure rate and a Cox regression component to estimate the hazard for uncured patients. A measure for quantifying the predictive accuracy of the cure rate estimated by the Cox PH cure model is required, as there has been a lack of previous research in this area. We used the Cox PH cure model for the breast cancer data; however, the area under the receiver operating characteristic curve (AUC) could not be estimated because many patients were censored. In this study, we used imputation‐based AUCs to assess the predictive accuracy of the cure rate from the PH cure model. We examined the precision of these AUCs using simulation studies. The results demonstrated that the imputation‐based AUCs were estimable and their biases were negligibly small in many cases, although the ordinary AUC could not be estimated. Additionally, we introduced a bias‐correction method for imputation‐based AUCs and found that the bias‐corrected estimate successfully compensated for the overestimation in the simulation studies. We also illustrated the estimation of the imputation‐based AUCs using breast cancer data. Copyright © 2014 John Wiley & Sons, Ltd.

Omid Khademnoe 《Statistics》2016,50(5):974-990
There has been substantial recent attention on problems involving a functional linear regression model with a scalar response. Few of these works deal with the asymptotic distribution of prediction in functional linear regression models. In the recent literature, the central limit theorem for prediction has been discussed, but the proof and the conditions under which the random bias terms for a fixed predictor converge to zero have been ignored, so the impact of these terms on the convergence of the prediction has not been well understood. By clarifying the proof and the conditions under which the bias terms converge to zero, we show that the asymptotic distribution of the prediction is normal. Furthermore, we derive the results for the other terms, already obtained by others, under milder conditions. Finally, we conduct a simulation study to investigate the performance of the asymptotic distribution under various parameter settings.

We propose to use the group lasso algorithm for logistic regression to construct a risk scoring system for predicting disease in swine. This work is motivated by the need to develop a risk scoring system from survey data on risk factors for porcine reproductive and respiratory syndrome (PRRS), which is a major health, production and financial problem for swine producers in nearly every country. Group lasso provides an attractive solution to this research question because it achieves group variable selection and stabilizes parameter estimates at the same time. We propose to choose the penalty parameter for group lasso through leave-one-out cross-validation, using the area under the receiver operating characteristic curve as the criterion. Survey data for 896 swine breeding herd sites in the USA and Canada, completed between March 2005 and March 2009, are used to construct the risk scoring system for predicting PRRS outbreaks in swine. We show that our scoring system for PRRS significantly improves on the current scoring system, which is based on expert opinion. We also show that our proposed scoring system is superior, in terms of area under the curve, to one developed using a multiple logistic regression model selected on the basis of variable significance.

Receiver operating characteristic (ROC) curves are useful for studying the performance of diagnostic tests. ROC curves occur in many fields of application, including psychophysics, quality control and medical diagnostics. In practical situations, the responses to a diagnostic test are often classified into a number of ordered categories. Such data are referred to as ratings data. It is typically assumed that the underlying model is based on a continuous probability distribution. The ROC curve is then constructed from such data using this probability model, and its properties are inherited from the model. Therefore, understanding the role of different probability distributions in ROC modeling is an interesting and important area of research. In this paper, the Lomax distribution is considered as a model for ratings data and the corresponding ROC curve is derived. The maximum likelihood estimation procedure for the related parameters is discussed. This procedure is then illustrated in the analysis of a neurological data example.
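To illustrate how an ROC curve is inherited from a probability model, the sketch below assumes that the healthy and diseased test values each follow a Lomax distribution with survival function S(x) = (1 + x/scale)^(-shape), and that a subject is called positive when the value exceeds a threshold. This parameterization and the function names are assumptions for illustration and may differ from the paper's formulation. Under these assumptions, ROC(t) = S1(S0^{-1}(t)); with a common scale this reduces to t^(shape1/shape0), whose area is shape0 / (shape0 + shape1).

```python
def lomax_survival(x, shape, scale):
    """P(X > x) for a Lomax variable: (1 + x / scale) ** (-shape)."""
    return (1.0 + x / scale) ** (-shape)


def lomax_roc(fpr, shape0, scale0, shape1, scale1):
    """True positive rate at a given false positive rate.

    Healthy values ~ Lomax(shape0, scale0), diseased ~ Lomax(shape1, scale1);
    both are assumed parameterizations, not taken from the paper.
    """
    if fpr <= 0.0:
        return 0.0
    # Threshold at which the healthy group exceeds it with probability fpr.
    threshold = scale0 * (fpr ** (-1.0 / shape0) - 1.0)
    return lomax_survival(threshold, shape1, scale1)


def auc_trapezoid(roc, n=10000):
    """Trapezoidal approximation of the area under roc(t), t in [0, 1]."""
    h = 1.0 / n
    area = 0.0
    prev = roc(0.0)
    for i in range(1, n + 1):
        cur = roc(i * h)
        area += 0.5 * (prev + cur) * h
        prev = cur
    return area
```

For example, `auc_trapezoid(lambda t: lomax_roc(t, 2.0, 1.0, 1.0, 1.0))` can be compared against the closed form 2 / (2 + 1) as a sanity check on the derivation.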

This paper presents a study on the symmetry of repeated bi-phased data signals, in particular on quantifying the deviation between the two parts of the signal. Three symmetry scores are defined using functional data techniques such as smoothing and registration. One score is related to the L2-distance between the two parts of the signal, whereas the other two are constructed to specifically measure differences in amplitude and phase. Moreover, symmetry scores based on functional principal component analysis (PCA) are examined. The scores are applied to acceleration signals from a study on equine gait. The scores turn out to be highly associated with lameness, and their applicability for lameness quantification and detection is investigated. Four classification approaches give similar results. The scores describing amplitude and phase variation outperform the PCA scores in the classification of lameness.

In disease screening, a biomarker combination developed by combining multiple markers tends to have a higher sensitivity than an individual marker. Parametric methods for marker combination rely on the inverse of covariance matrices, whose computation is often a non-trivial problem for high-dimensional data generated by modern high-throughput technologies. Another common problem in disease diagnosis is the existence of a limit of detection (LOD) for an instrument: when a biomarker's value falls below the limit, it cannot be observed and is assigned an NA value. To handle these two challenges in combining high-dimensional biomarkers in the presence of an LOD, we propose a resample-replace lasso procedure. We first impute the values below the LOD and then use the graphical lasso method to estimate the means and precision matrices for the high-dimensional biomarkers. The simulation results show that our method outperforms alternative methods such as substituting NA values with LOD values or removing observations that have NA values. A real case analysis of a protein profiling study of glioblastoma patients and their survival status indicates that the biomarker combination obtained through the proposed method is more accurate in distinguishing between the two groups.
