Similar literature
Found 20 similar documents (search time: 31 ms)
1.
The comparison of the accuracy of two binary diagnostic tests has traditionally required knowledge of the disease status in all of the patients in the sample via the application of a gold standard. In practice, the gold standard is not always applied to all patients in a sample, and the problem of partial verification of the disease arises. The accuracy of a binary diagnostic test can be measured in terms of positive and negative predictive values, which represent the accuracy of a diagnostic test when it is applied to a cohort of patients. In this paper, we deduce the maximum likelihood estimators of predictive values (PVs) of two binary diagnostic tests, and the hypothesis tests to compare these measures when, in the presence of partial disease verification, the verification process only depends on the results of the two diagnostic tests. The effect of verification bias on the naïve estimators of PVs of two diagnostic tests is studied, and simulation experiments are performed in order to investigate the small sample behaviour of hypothesis tests. The hypothesis tests which we have deduced can be applied when all of the patients are verified with the gold standard. The results obtained have been applied to the diagnosis of coronary stenosis.

2.
For evaluating diagnostic accuracy of inherently continuous diagnostic tests/biomarkers, sensitivity and specificity are well-known measures both of which depend on a diagnostic cut-off, which is usually estimated. Sensitivity (specificity) is the conditional probability of testing positive (negative) given the true disease status. However, a more relevant question is “what is the probability of having (not having) a disease if a test is positive (negative)?”. Such post-test probabilities are denoted as positive predictive value (PPV) and negative predictive value (NPV). The PPV and NPV at the same estimated cut-off are correlated, hence it is desirable to make the joint inference on PPV and NPV to account for such correlation. Existing inference methods for PPV and NPV focus on the individual confidence intervals and they were developed under binomial distribution assuming binary instead of continuous test results. Several approaches are proposed to estimate the joint confidence region as well as the individual confidence intervals of PPV and NPV. Simulation results indicate the proposed approaches perform well with satisfactory coverage probabilities for normal and non-normal data and, additionally, outperform existing methods with improved coverage as well as narrower confidence intervals for PPV and NPV. The Alzheimer's Disease Neuroimaging Initiative (ADNI) data set is used to illustrate the proposed approaches and compare them with the existing methods.
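
The relationship between the post-test probabilities and sensitivity and specificity described above can be sketched with Bayes' theorem. This is only an illustration, not the joint-inference method of the abstract: the function name `predictive_values` and the explicit `prevalence` argument are assumptions introduced here.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    Hypothetical helper for illustration; uses Bayes' theorem only.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)   # P(disease | test positive)
    npv = true_neg / (true_neg + false_neg)   # P(no disease | test negative)
    return ppv, npv


if __name__ == "__main__":
    # Illustrative inputs, not taken from any data set in the abstract.
    print(predictive_values(0.9, 0.8, 0.1))
```

Because both values are computed from the same sensitivity and specificity at a single cut-off, changing the cut-off moves PPV and NPV together, which is one way to see why the two estimates are correlated.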

3.
The accuracy of a binary diagnostic test is usually measured in terms of its sensitivity and its specificity. Other measures of the performance of a diagnostic test are the positive and negative likelihood ratios, which quantify the increase in knowledge about the presence of the disease through the application of a diagnostic test, and which depend on the sensitivity and specificity of the diagnostic test. In this article, we construct an asymptotic hypothesis test to simultaneously compare the positive and negative likelihood ratios of two or more diagnostic tests in unpaired designs. The hypothesis test is based on the logarithmic transformation of the likelihood ratios and on the chi-square distribution. Simulation experiments have been carried out to study the type I error and the power of the constructed hypothesis test when comparing two and three binary diagnostic tests. The method has been extended to the case of multiple multi-level diagnostic tests.
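
As a minimal sketch of how the likelihood ratios depend on sensitivity and specificity, the standard definitions LR+ = Se / (1 - Sp) and LR- = (1 - Se) / Sp can be computed as follows. The function name is an assumption made for illustration, and the hypothesis test itself is not reproduced here; the logarithms are shown only because the abstract states that the test works on the log scale.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) of a binary diagnostic test (hypothetical helper)."""
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return lr_pos, lr_neg


if __name__ == "__main__":
    lr_pos, lr_neg = likelihood_ratios(0.9, 0.8)
    # Log-transformed values, the scale on which the comparison is made.
    print(math.log(lr_pos), math.log(lr_neg))
```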

4.
Receiver operating characteristic (ROC) curves can be used to assess the accuracy of tests measured on ordinal or continuous scales. The most commonly used measure for the overall diagnostic accuracy of diagnostic tests is the area under the ROC curve (AUC). A gold standard (GS) test on the true disease status is required to estimate the AUC. However, a GS test may be too expensive or infeasible. In many medical studies, the true disease status of the subjects may remain unknown. Under the normality assumption on test results from each disease group of subjects, we propose a heuristic method of estimating confidence intervals for the difference in paired AUCs of two diagnostic tests in the absence of a GS reference. This heuristic is a three-stage method that combines the expectation-maximization (EM) algorithm, the bootstrap method, and an estimation based on asymptotic generalized pivotal quantities (GPQs) to construct generalized confidence intervals for the difference in paired AUCs in the absence of a GS. Simulation results show that the proposed interval estimation procedure yields satisfactory coverage probabilities and expected interval lengths. The numerical example using a published dataset illustrates the proposed method.
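
Under the normality assumption mentioned above, the AUC has the well-known closed form AUC = Φ((μ_D − μ_H) / √(σ_D² + σ_H²)), where higher test values are assumed to indicate disease. The sketch below evaluates only this formula with known group parameters; it is not the three-stage EM, bootstrap and GPQ procedure of the abstract, and the function and parameter names are assumptions.

```python
import math
from statistics import NormalDist


def binormal_auc(mean_healthy, sd_healthy, mean_diseased, sd_diseased):
    """AUC when test results are normal in each group (hypothetical helper).

    Assumes larger test values indicate disease.
    """
    z = (mean_diseased - mean_healthy) / math.sqrt(sd_healthy**2 + sd_diseased**2)
    return NormalDist().cdf(z)   # standard normal CDF


if __name__ == "__main__":
    # Illustrative parameters only.
    print(binormal_auc(0.0, 1.0, 1.0, 1.0))
```

When the two group means coincide the function returns 0.5, the AUC of an uninformative test.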

5.
In this article, we use a latent class model (LCM) with prevalence modeled as a function of covariates to assess diagnostic test accuracy in situations where the true disease status is not observed, but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo expectation–maximization (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate parameters of interest; namely, sensitivity, specificity, and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction of estimated parameters, the missing information principle is applied to adjust information matrix estimates. We compare the adjusted information matrix-based standard error estimates with the bootstrap standard error estimates, both obtained using the fast MCEM algorithm, through an extensive Monte Carlo study. Simulation demonstrates that the adjusted information matrix approach estimates the standard error similarly to the bootstrap methods under certain scenarios. The bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group study of significant cervical lesion diagnosis in women with atypical glandular cells of undetermined significance to compare the diagnostic accuracy of a histology-based evaluation, a carbonic anhydrase-IX biomarker-based test and a human papillomavirus DNA test.

6.
As new diagnostic tests are developed and marketed, it is very important to be able to compare the accuracy of two given continuous‐scale diagnostic tests. An effective method to evaluate the difference between the diagnostic accuracy of two tests is to compare partial areas under the receiver operating characteristic curves (AUCs). In this paper, we review existing parametric methods. Then, we propose a new semiparametric method and a new nonparametric method to investigate the difference between two partial AUCs. For the difference between two partial AUCs under each method, we derive a normal approximation, define an empirical log‐likelihood ratio, and show that the empirical log‐likelihood ratio follows a scaled chi‐square distribution. We construct five confidence intervals for the difference based on normal approximation, bootstrap, and empirical likelihood methods. Finally, extensive simulation studies are conducted to compare the finite‐sample performances of these intervals, and a real example is used as an application of our recommended intervals. The simulation results indicate that the proposed hybrid bootstrap and empirical likelihood intervals outperform other existing intervals in most cases.

7.
In many diagnostic studies, multiple diagnostic tests are performed on each subject or multiple disease markers are available. Commonly, the information should be combined to improve the diagnostic accuracy. We consider the problem of comparing the discriminatory abilities between two groups of biomarkers. Specifically, this article focuses on confidence interval estimation of the difference between paired AUCs based on optimally combined markers under the assumption of multivariate normality. Simulation studies demonstrate that the proposed generalized variable approach provides confidence intervals with satisfactory coverage probabilities at finite sample sizes. The proposed method can also easily provide P-values for hypothesis testing. Application to analysis of a subset of data from a study on coronary heart disease illustrates the utility of the method in practice.

8.
9.
The case–control design is very frequently used in clinical practice to assess the accuracy of a binary diagnostic test (BDT). This design consists of applying the diagnostic test to all of the individuals in a sample of those who have the disease and in another sample of those who do not have the disease. The sensitivity of the diagnostic test is estimated from the case sample and the specificity is estimated from the control sample. Another parameter which is used to assess the performance of a BDT is the weighted kappa coefficient. The weighted kappa coefficient depends on the sensitivity and specificity of the diagnostic test, on the disease prevalence and on the weighting index. In this article, confidence intervals are studied for the weighted kappa coefficient subject to a case–control design and a method is proposed to calculate the sample sizes to estimate this parameter. The results obtained were applied to a real example.

10.
In many engineering problems it is necessary to draw statistical inferences on the mean of a lognormal distribution based on a complete sample of observations. Statistical demonstration of mean time to repair (MTTR) is one example. Although optimum confidence intervals and hypothesis tests for the lognormal mean have been developed, they are difficult to use, requiring extensive tables and/or a computer. In this paper, simplified conservative methods for calculating confidence intervals or hypothesis tests for the lognormal mean are presented. In this paper, “conservative” refers to confidence intervals (hypothesis tests) whose infimum coverage probability (supremum probability of rejecting the null hypothesis taken over parameter values under the null hypothesis) equals the nominal level. The term “conservative” has obvious implications for confidence intervals (they are “wider” in some sense than their optimum or exact counterparts). Applying the term “conservative” to hypothesis tests should not be confusing if it is remembered that this implies that their equivalent confidence intervals are conservative. No implication of optimality is intended for these conservative procedures. It is emphasized that these are direct statistical inference methods for the lognormal mean, as opposed to the already well-known methods for the parameters of the underlying normal distribution. The method currently employed in MIL-STD-471A for statistical demonstration of MTTR is analyzed and compared to the new method in terms of asymptotic relative efficiency. The new methods are also compared to the optimum methods derived by Land (1971, 1973).

11.
When analyzing a response variable in the presence of both factors and covariates, with potentially correlated responses and violated assumptions of normal residuals or of a linear relationship between the response and the covariates, rank-based tests can be an option for inferential procedures instead of the parametric repeated measures analysis of covariance (ANCOVA) models. This article derives a rank-based method for multi-way ANCOVA models with correlated responses. The generalized estimating equations (GEE) technique is employed to construct the proposed rank tests. Asymptotic properties of the proposed tests are derived. Simulation studies confirmed the performance of the proposed tests.

12.
Many diagnostic tests may be available to identify a particular disease. Diagnostic performance can potentially be improved by combining them. “Either” and “both” positive strategies for combining tests have been discussed in the literature, where a gain in diagnostic performance is measured by a ratio of the positive (negative) likelihood ratio of the combined test to that of an individual test. Normal theory and bootstrap confidence intervals are constructed for gains in likelihood ratios. The performance (coverage probability, width) of the two methods is compared via simulation. All confidence intervals perform satisfactorily for large samples, while bootstrap performs better in smaller samples in terms of coverage and width.

13.
In the presence of partial disease verification, the comparison of the accuracy of binary diagnostic tests cannot be carried out through the paired comparison of the diagnostic tests applying McNemar's test, since for a subsample of patients the disease status is unknown. In this study, we have deduced the maximum likelihood estimators for the sensitivities and specificities of multiple binary diagnostic tests and we have studied various joint hypothesis tests based on the chi-square distribution to compare simultaneously the accuracy of these binary diagnostic tests when for some patients in the sample the disease status is unknown. Simulation experiments were carried out to study the type I error and the power of each hypothesis test deduced. The results obtained were applied to the diagnosis of coronary stenosis.

14.
Because it relates directly to sensitivity and specificity and provides an optimal cut-point that maximizes overall classification effectiveness for diagnostic purposes, the Youden index has been frequently utilized in biomedical diagnostic practice. Current application of the Youden index is limited to two diagnostic groups. However, there usually exists a transitional intermediate stage in many disease processes. Early recognition of this intermediate stage is vital to open an optimal window for therapeutic intervention. In this article, we extend the Youden index to assess diagnostic accuracy when there are three ordinal diagnostic groups. Parametric and nonparametric methods are presented to estimate the optimal Youden index, the underlying optimal cut-points, and the associated confidence intervals. Extensive simulation studies covering representative distributional assumptions are reported to compare performance of the proposed methods. A real example illustrates the usefulness of the Youden index in evaluating discriminating ability of diagnostic tests.
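
For the classical two-group case that the abstract starts from, the Youden index is J = max over cut-points c of Se(c) + Sp(c) − 1. A minimal empirical sketch follows; it assumes higher values indicate disease, searches only the observed values as candidate cut-points, and does not implement the three-group extension or the confidence intervals. The function name is hypothetical.

```python
def youden_index(healthy, diseased):
    """Empirical two-group Youden index and its cut-point (hypothetical helper).

    A subject is classified as diseased when its value is >= the cut-point.
    """
    best_j, best_cut = -1.0, None
    for cut in sorted(set(healthy) | set(diseased)):
        sens = sum(x >= cut for x in diseased) / len(diseased)
        spec = sum(x < cut for x in healthy) / len(healthy)
        j = sens + spec - 1.0
        if j > best_j:          # keep the first cut-point reaching the maximum
            best_j, best_cut = j, cut
    return best_j, best_cut


if __name__ == "__main__":
    # Toy values for illustration only.
    print(youden_index([1, 2, 3, 4], [3, 5, 6, 7]))
```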

15.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve and surface are useful tools to assess the ability of diagnostic tests to discriminate between ordered classes or groups. To define these diagnostic tests, selecting the optimal thresholds that maximize the accuracy of these tests is required. One procedure that is commonly used to find the optimal thresholds is by maximizing what is known as Youden’s index. This article presents nonparametric predictive inference (NPI) for selecting the optimal thresholds of a diagnostic test. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. Based on multiple future observations, the NPI approach is presented for selecting the optimal thresholds for two-group and three-group scenarios. In addition, a pairwise approach has also been presented for the three-group scenario. The article ends with an example to illustrate the proposed methods and a simulation study of the predictive performance of the proposed methods along with some classical methods such as Youden index. The NPI-based methods show some interesting results that overcome some of the issues concerning the predictive performance of Youden’s index.

16.
Accurate diagnosis of disease is a critical part of health care. New diagnostic and screening tests must be evaluated based on their abilities to discriminate diseased conditions from non‐diseased conditions. For a continuous‐scale diagnostic test, a popular summary index of the receiver operating characteristic (ROC) curve is the area under the curve (AUC). However, when our focus is on a certain region of false positive rates, we often use the partial AUC instead. In this paper we have derived the asymptotic normal distribution for the non‐parametric estimator of the partial AUC with an explicit variance formula. The empirical likelihood (EL) ratio for the partial AUC is defined and it is shown that its limiting distribution is a scaled chi‐square distribution. Hybrid bootstrap and EL confidence intervals for the partial AUC are proposed by using the newly developed EL theory. We also conduct extensive simulation studies to compare the relative performance of the proposed intervals and existing intervals for the partial AUC. A real example is used to illustrate the application of the recommended intervals. The Canadian Journal of Statistics 39: 17–33; 2011 © 2011 Statistical Society of Canada

17.
The (continuous) data are n observations that are believed to be a random sample from a symmetrical population. Confidence intervals and significance tests for the population mean are desired. There is, however, the possibility that either the smallest observation or the largest observation is an outlier. That is, the population providing this observation differs from the symmetrical population providing the other n - 1 observations. If this occurs, intervals and tests are desired for the mean of the population providing the other n - 1 observations. Some investigation difficulties can be overcome if intervals and tests can be developed that are simultaneously usable for all of these three situations (a confidence coefficient, or significance level, has the same value for all three situations). Two kinds of intervals and tests with this property are developed. These results always involve both the next to smallest observations and should have at least moderately high efficiencies. Also, some extensions are considered, such as allowing each observation to be from a different population.

18.
In medicine, there are often two diagnostic tests that serve the same purpose. Typically, one of the tests will have a lower diagnostic performance but be less invasive, easier to perform, or cheaper. Clinicians must assess the agreement between the tests while accounting for test–retest variation in both techniques. In this paper, we investigate a specific example from interventional cardiology, studying the agreement between the fractional flow reserve and the instantaneous wave-free ratio. We analyze potential definitions of the agreement (accuracy) between the two tests and compare five families of statistical estimators. We contrast their statistical behavior both theoretically and using numerical simulations. Surprisingly for clinicians, seemingly natural and equivalent definitions of the concept of agreement can lead to discordant and even nonsensical estimates.

19.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning, and credit scoring. The receiver operating characteristic (ROC) surface is a useful tool to assess the ability of a diagnostic test to discriminate among three-ordered classes or groups. In this article, nonparametric predictive inference (NPI) for three-group ROC analysis for ordinal outcomes is presented. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. This article also includes results on the volumes under the ROC surfaces and consideration of the choice of decision thresholds for the diagnosis. Two examples are provided to illustrate our method.

20.
The receiver operating characteristic (ROC) curve is one of the most commonly used methods to compare the diagnostic performance of two or more laboratory or diagnostic tests. In this paper, we propose semi-empirical likelihood based confidence intervals for ROC curves of two populations, where one population is parametric and the other one is non-parametric and both have missing data. After imputing missing values, we derive the semi-empirical likelihood ratio statistic and the corresponding likelihood equations. It is shown that the log-semi-empirical likelihood ratio statistic is asymptotically scaled chi-squared. The estimating equations are solved simultaneously to obtain the estimated lower and upper bounds of semi-empirical likelihood confidence intervals. We conduct extensive simulation studies to evaluate the finite sample performance of the proposed empirical likelihood confidence intervals with various sample sizes and different missing probabilities.
