Similar literature
A total of 20 similar articles were found.
1.
In the presence of partial disease verification, the comparison of the accuracy of binary diagnostic tests cannot be carried out through the paired comparison of the diagnostic tests applying McNemar's test, since for a subsample of patients the disease status is unknown. In this study, we have deduced the maximum likelihood estimators for the sensitivities and specificities of multiple binary diagnostic tests and we have studied various joint hypothesis tests based on the chi-square distribution to compare simultaneously the accuracy of these binary diagnostic tests when for some patients in the sample the disease status is unknown. Simulation experiments were carried out to study the type I error and the power of each hypothesis test deduced. The results obtained were applied to the diagnosis of coronary stenosis.
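As a rough companion to entry 1, the sketch below shows a standard correction for verification bias for a single binary test under a missing-at-random verification mechanism (in the spirit of the Begg–Greenes correction); it is not the authors' joint chi-square procedure, and all counts are invented for illustration.

```python
# Hypothetical counts for one binary test under partial verification.
# Among test-positives: n1 subjects, v1 verified, d1 of the verified diseased.
# Among test-negatives: n0 subjects, v0 verified, d0 of the verified diseased.
n1, v1, d1 = 300, 200, 150   # invented numbers
n0, v0, d0 = 700, 100, 10

# Under MAR (verification depends only on the test result),
# P(D=1 | T=t) is consistently estimated from the verified subjects,
# while P(T=t) is estimated from the full sample.
p_t1 = n1 / (n1 + n0)
p_t0 = 1 - p_t1
p_d_given_t1 = d1 / v1
p_d_given_t0 = d0 / v0

# Bias-corrected sensitivity and specificity via Bayes' theorem.
se = p_d_given_t1 * p_t1 / (p_d_given_t1 * p_t1 + p_d_given_t0 * p_t0)
sp = ((1 - p_d_given_t0) * p_t0
      / ((1 - p_d_given_t0) * p_t0 + (1 - p_d_given_t1) * p_t1))

# Naive estimates that simply discard the unverified subjects, for contrast.
se_naive = d1 / (d1 + d0)
sp_naive = (v0 - d0) / ((v0 - d0) + (v1 - d1))
print(f"corrected Se={se:.3f}, Sp={sp:.3f}; naive Se={se_naive:.3f}, Sp={sp_naive:.3f}")
```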

2.
Likelihood ratios (LRs) are used to characterize the efficiency of diagnostic tests. In this paper, we use the classical weighted least squares (CWLS) test procedure, which was originally used for testing the homogeneity of relative risks, for comparing the LRs of two or more binary diagnostic tests. We compare the performance of this method with the relative diagnostic likelihood ratio (rDLR) method and the diagnostic likelihood ratio regression (DLRReg) approach in terms of size and power, and we observe that the performances of CWLS and rDLR are the same when used to compare two diagnostic tests, while the DLRReg method has higher type I error rates and powers. We also examine the performances of the CWLS and DLRReg methods for comparing three diagnostic tests in various sample size and prevalence combinations. On the basis of Monte Carlo simulations, we conclude that all of the tests are generally conservative and have low power, especially in settings of small sample size and low prevalence.
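For orientation, the generic form of a weighted least squares homogeneity statistic on the log scale is sketched below; the weights actually used by the authors for likelihood ratios may differ in detail.

```latex
\hat\theta_i = \ln \widehat{LR}_i, \qquad
w_i = \frac{1}{\widehat{\mathrm{Var}}(\hat\theta_i)}, \qquad
\bar\theta = \frac{\sum_{i=1}^{k} w_i \hat\theta_i}{\sum_{i=1}^{k} w_i}, \qquad
Q = \sum_{i=1}^{k} w_i \bigl(\hat\theta_i - \bar\theta\bigr)^2
\;\overset{\text{approx.}}{\sim}\; \chi^2_{k-1}
\quad \text{under } H_0 : LR_1 = \cdots = LR_k .
```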

3.
Binocular data typically arise in ophthalmology where pairs of eyes are evaluated, through some diagnostic procedure, for the presence of certain diseases or pathologies. Treating eyes as independent and adopting the usual approach in estimating the sensitivity and specificity of a diagnostic test ignores the correlation between fellow eyes. This may consequently yield incorrect estimates, especially of the standard errors. The paper is concerned with diagnostic studies wherein several diagnostic tests, or the same test read by several readers, are administered to identify one or more diseases. A likelihood-based method of estimating disease-specific sensitivities and specificities via hierarchical generalized linear mixed models is proposed to meaningfully delineate the various correlations in the data. The efficiency of the estimates is assessed in a simulation study. Data from a study on diabetic retinopathy are analyzed to illustrate the methodology.

4.
The accuracy of a binary diagnostic test is usually measured in terms of its sensitivity and its specificity. Other measures of the performance of a diagnostic test are the positive and negative likelihood ratios, which quantify the increase in knowledge about the presence of the disease through the application of a diagnostic test, and which depend on the sensitivity and specificity of the diagnostic test. In this article, we construct an asymptotic hypothesis test to simultaneously compare the positive and negative likelihood ratios of two or more diagnostic tests in unpaired designs. The hypothesis test is based on the logarithmic transformation of the likelihood ratios and on the chi-square distribution. Simulation experiments have been carried out to study the type I error and the power of the constructed hypothesis test when comparing two and three binary diagnostic tests. The method has been extended to the case of multiple multi-level diagnostic tests.
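A minimal delta-method calculation in the spirit of entry 4 is sketched below for two tests in an unpaired design, using invented counts; it illustrates the log-likelihood-ratio construction with a simple z-test on the positive likelihood ratios rather than the authors' joint chi-square test.

```python
import numpy as np
from scipy import stats

def log_lr_stats(tp, n_dis, fp, n_nondis):
    """Point estimates and delta-method variances of log LR+ and log LR-."""
    se, sp = tp / n_dis, 1 - fp / n_nondis
    log_lr_pos = np.log(se / (1 - sp))
    log_lr_neg = np.log((1 - se) / sp)
    var_pos = (1 - se) / (n_dis * se) + sp / (n_nondis * (1 - sp))
    var_neg = se / (n_dis * (1 - se)) + (1 - sp) / (n_nondis * sp)
    return log_lr_pos, var_pos, log_lr_neg, var_neg

# Invented 2x2 counts for two tests applied to independent samples.
lp1, vp1, ln1, vn1 = log_lr_stats(tp=90, n_dis=100, fp=20, n_nondis=200)
lp2, vp2, ln2, vn2 = log_lr_stats(tp=80, n_dis=120, fp=15, n_nondis=180)

# Unpaired design: the two tests' estimates are independent,
# so variances of the log differences simply add.
z_pos = (lp1 - lp2) / np.sqrt(vp1 + vp2)
p_pos = 2 * stats.norm.sf(abs(z_pos))
print(f"LR+ test 1 = {np.exp(lp1):.2f}, test 2 = {np.exp(lp2):.2f}, "
      f"z = {z_pos:.2f}, p = {p_pos:.3f}")
```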

5.
Cost and burden of diagnostic testing may be reduced if fewer tests can be applied. Sequential testing involves selecting a sequence of tests, but only administering subsequent tests dependent on results of previous tests. This research provides guidance for choosing between single tests and the believe the positive (BP) and believe the negative (BN) sequential testing strategies, using accuracy (as measured by the Youden Index) as the primary determinant. Approximately 75% of the parameter combinations examined resulted in either BP or BN being recommended based on a higher accuracy at the optimal point. In about half of the scenarios BP was preferred, and in the other half BN, with the choice often a function of the value of the ratio of standard deviations of those without and with disease (b). Large values of b for the first test of the sequence tended to be associated with preference for BN as opposed to BP, while small values of b appear to favor BP. When there was no preference between sequences and/or single tests based on the Youden Index, cost of the sequence was considered. In this case, disease prevalence plays a large role in the selection of strategies, with lower values favoring BN and sometimes higher values favoring BP. The cost threshold for the sequential strategy to be preferred over a single, more accurate test was often quite high. It appears that while sequential strategies most often increase diagnostic accuracy over a single test, sequential strategies are not always preferred.
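To make the two strategies in entry 5 concrete, the sketch below computes the combined sensitivity, specificity, and Youden index of the believe-the-positive and believe-the-negative sequences for two tests, assuming conditional independence given disease status; the single-test accuracies are invented, and this is not the authors' full cost analysis.

```python
def bp_accuracy(se1, sp1, se2, sp2):
    """Believe the positive: stop at a positive first test, otherwise apply
    the second test; the sequence is positive if either test is positive."""
    se = se1 + (1 - se1) * se2
    sp = sp1 * sp2
    return se, sp

def bn_accuracy(se1, sp1, se2, sp2):
    """Believe the negative: stop at a negative first test, otherwise apply
    the second test; the sequence is positive only if both tests are positive."""
    se = se1 * se2
    sp = sp1 + (1 - sp1) * sp2
    return se, sp

# Hypothetical single-test accuracies.
se1, sp1 = 0.80, 0.90
se2, sp2 = 0.85, 0.75

for name, (se, sp) in [("BP", bp_accuracy(se1, sp1, se2, sp2)),
                       ("BN", bn_accuracy(se1, sp1, se2, sp2))]:
    print(f"{name}: Se={se:.3f}, Sp={sp:.3f}, Youden={se + sp - 1:.3f}")
print(f"single tests: Youden 1 = {se1 + sp1 - 1:.3f}, Youden 2 = {se2 + sp2 - 1:.3f}")
```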

6.
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. Good methods for determining diagnostic accuracy provide useful guidance on selection of patient treatment, and the ability to compare different diagnostic tests has a direct impact on quality of care. In this paper Nonparametric Predictive Inference (NPI) methods for accuracy of diagnostic tests with continuous test results are presented and discussed. For such tests, Receiver Operating Characteristic (ROC) curves have become popular tools for describing the performance of diagnostic tests. We present the NPI approach to ROC curves, and some important summaries of these curves. As NPI does not aim at inference for an entire population but instead explicitly considers a future observation, this provides an attractive alternative to standard methods. We show how NPI can be used to compare two continuous diagnostic tests.
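The NPI machinery itself is beyond a short sketch, but the empirical ROC curve and AUC that such methods are usually contrasted with can be computed as below, using the Mann-Whitney form of the empirical AUC on simulated scores.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated continuous test results for non-diseased (x) and diseased (y) subjects.
x = rng.normal(0.0, 1.0, size=80)
y = rng.normal(1.2, 1.0, size=60)

# Empirical AUC: proportion of (non-diseased, diseased) pairs correctly ordered,
# counting ties as one half (Mann-Whitney statistic divided by n_x * n_y).
diff = y[:, None] - x[None, :]
auc = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)
print(f"empirical AUC = {auc:.3f}")

# One point on the empirical ROC curve, at threshold c: (FPR(c), TPR(c)).
c = 0.5
fpr, tpr = np.mean(x > c), np.mean(y > c)
print(f"threshold {c}: FPR={fpr:.3f}, TPR={tpr:.3f}")
```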

7.
徐凤  黎实 《统计研究》2014,31(9):91-98
For the fixed-effects panel data model, this paper proposes a new poolability test based on the Lagrange multiplier (LM) principle. Unlike existing LM-type poolability tests, the proposed statistic is constructed from the individual LM statistics of the cross-sectional units. Theoretical analysis shows that the proposed method is asymptotically normal, robust to heteroscedasticity and non-normality of the disturbances, and asymptotically equivalent to the PY test (Pesaran & Yamagata, 2008). Monte Carlo simulations show that, compared with the PY test and two other LM-type poolability tests, and for different magnitudes of , the proposed method has good size performance and superior power.

8.
Epidemiology studies increasingly examine multiple exposures in relation to disease by selecting the exposures of interest in a thematic manner. For example, sun exposure, sunburn, and sun protection behavior could be themes for an investigation of sun-related exposures. Several studies now use pre-defined linear combinations of the exposures pertaining to the themes to estimate the effects of the individual exposures. Such analyses may improve the precision of the exposure effects, but they can lead to inflated bias and type I errors when the linear combinations are inaccurate. We investigate preliminary test estimators and empirical Bayes type shrinkage estimators as alternative approaches when it is desirable to exploit the thematic choice of exposures, but the accuracy of the pre-defined linear combinations is unknown. We show that the two types of estimator are intimately related under certain assumptions. The shrinkage estimator derived under the assumption of an exchangeable prior distribution gives precise estimates and is robust to misspecifications of the user-defined linear combinations. The precision gains and robustness of the shrinkage estimation approach are illustrated using data from the SONIC study, where the exposures are the individual questionnaire items and the outcome is (log) total back nevus count.

9.
Receiver operating characteristic (ROC) curves can be used to assess the accuracy of tests measured on ordinal or continuous scales. The most commonly used measure for the overall diagnostic accuracy of diagnostic tests is the area under the ROC curve (AUC). A gold standard (GS) test on the true disease status is required to estimate the AUC. However, a GS test may be too expensive or infeasible. In many medical research studies, the true disease status of the subjects may remain unknown. Under the normality assumption on test results from each disease group of subjects, we propose a heuristic method of estimating confidence intervals for the difference in paired AUCs of two diagnostic tests in the absence of a GS reference. This heuristic method is a three-stage procedure combining the expectation-maximization (EM) algorithm, the bootstrap method, and an estimation based on asymptotic generalized pivotal quantities (GPQs) to construct generalized confidence intervals for the difference in paired AUCs in the absence of a GS. Simulation results show that the proposed interval estimation procedure yields satisfactory coverage probabilities and expected interval lengths. A numerical example using a published dataset illustrates the proposed method.

10.
Accurate diagnosis of a molecularly defined subtype of cancer is often an important step toward its effective control and treatment. For the diagnosis of some subtypes of a cancer, a gold standard with perfect sensitivity and specificity may be unavailable. In those scenarios, tumor subtype status is commonly measured by multiple imperfect diagnostic markers. Additionally, in many such studies, some subjects are only measured by a subset of diagnostic tests and the missing probabilities may depend on the unknown disease status. In this paper, we present statistical methods based on the EM algorithm to evaluate incomplete multiple imperfect diagnostic tests under a missing at random assumption and one missing not at random scenario. We apply the proposed methods to a real data set from the National Cancer Institute (NCI) colon cancer family registry on diagnosing microsatellite instability for hereditary non-polyposis colorectal cancer to estimate diagnostic accuracy parameters (i.e. sensitivities and specificities), prevalence, and potential differential missing probabilities for 11 biomarker tests. Simulations are also conducted to evaluate the small-sample performance of our methods.

11.
In assessing the area under the ROC curve for the accuracy of a diagnostic test, it is imperative to detect and locate multiple abnormalities per image. The approach described here takes this into account by adopting a statistical model that allows for correlation between the reader scores of several regions of interest (ROIs).

The ROI method of partitioning the image is taken. The readers assign a score to each ROI in the image, and the statistical model accounts for the correlation between the scores of the ROIs of an image when estimating test accuracy. Test accuracy is given by Pr[Y > Z] + (1/2)Pr[Y = Z], where Y is the ordinal diagnostic measurement of an affected ROI and Z is the diagnostic measurement of an unaffected ROI; this measure of test accuracy is equivalent to the area under the ROC curve. The scores are modeled with a multinomial distribution, and a Bayesian method of inference based on the multinomial distribution is adopted for estimating the test accuracy.

Using a multinomial model for the test results, a Bayesian method based on the predictive distribution of future diagnostic scores is employed to find the test accuracy. By resampling from the posterior distribution of the model parameters, samples from the posterior distribution of test accuracy are also generated. Using these samples, the posterior mean, standard deviation, and credible intervals are calculated in order to estimate the area under the ROC curve. This approach is illustrated by estimating the area under the ROC curve for a study of the diagnostic accuracy of magnetic resonance angiography for diagnosis of arterial atherosclerotic stenosis. A generalization to multiple readers and/or modalities is proposed.

A Bayesian approach to estimating test accuracy is easy to perform with standard software packages and has the advantage of efficiently incorporating information from prior related imaging studies.
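As a small illustration of the quantity Pr[Y > Z] + (1/2)Pr[Y = Z] under a multinomial model, the sketch below draws Dirichlet posterior samples (uniform prior) for the ordinal score distributions of affected and unaffected ROIs and summarizes the posterior of the accuracy; the counts are invented and the sketch ignores the within-image correlation that the paper handles.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented counts of ordinal scores 1..5 for affected (y) and unaffected (z) ROIs.
counts_y = np.array([5, 10, 20, 40, 25])
counts_z = np.array([30, 35, 20, 10, 5])

def accuracy(p_y, p_z):
    """Pr[Y > Z] + 0.5 * Pr[Y = Z] for independent categorical Y and Z."""
    greater = np.sum(np.tril(np.outer(p_y, p_z), k=-1))   # cells with Y > Z
    equal = np.sum(p_y * p_z)                              # cells with Y = Z
    return greater + 0.5 * equal

# A Dirichlet(1,...,1) prior gives a Dirichlet(counts + 1) posterior.
draws = np.array([accuracy(rng.dirichlet(counts_y + 1), rng.dirichlet(counts_z + 1))
                  for _ in range(4000)])
lo, hi = np.percentile(draws, [2.5, 97.5])
print(f"posterior mean accuracy = {draws.mean():.3f}, "
      f"95% credible interval = ({lo:.3f}, {hi:.3f})")
```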

12.
The study of the dependence between two medical diagnostic tests is an important issue in health research since it can modify the diagnosis and, therefore, the decision regarding a therapeutic treatment for an individual. In many practical situations, the diagnostic procedure includes the use of two tests, with outcomes on a continuous scale. For final classification, usually there is an additional “gold standard” or reference test. Considering binary test responses, we usually assume independence between tests or a joint binary structure for dependence. In this article, we introduce a simulation study assuming two dependent dichotomized tests using two copula function dependence structures in the presence or absence of verification bias. We compare the test parameter estimators obtained under copula structure dependence with those obtained assuming binary dependence or assuming independent tests.
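A minimal version of the kind of simulation described in entry 12 is sketched below, using a Gaussian copula (one possible choice; the paper studies specific copula families) to generate dependent continuous test scores that are then dichotomized; all parameter values are invented.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, prevalence, rho = 5000, 0.3, 0.6      # invented settings

diseased = rng.random(n) < prevalence
cov = [[1.0, rho], [rho, 1.0]]

# Gaussian copula: correlated standard normals mapped to uniforms.
z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
u = norm.cdf(z)

# Marginal score distributions: N(0, 1) if non-diseased, N(1.5, 1) if diseased.
loc = np.where(diseased, 1.5, 0.0)[:, None]
scores = norm.ppf(u, loc=loc, scale=1.0)

# Dichotomize both tests at an invented cut-off and estimate Se and Sp.
positive = scores > 0.75
for j in range(2):
    se = positive[diseased, j].mean()
    sp = (~positive[~diseased, j]).mean()
    print(f"test {j + 1}: Se={se:.3f}, Sp={sp:.3f}")
```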

13.
This paper reexamines the predictability of stock returns with a nonparametric model. We first identify, through a set of diagnostic tests, five lagged predictive factors from a linear model. Using these factors, we predict one-month-ahead stock index returns with a nonparametric approach. We find that our nonparametric model can correctly predict about 74% of stock index return signs. With various ex ante trading rules based on nonparametric predictions and transaction cost schedules, we then compare the performance of "managed" portfolios with that of buy-and-hold portfolios. We find that the managed portfolios are mean-variance dominant over the buy-and-hold strategies when no or low transaction costs are assumed. When high transaction costs are assumed instead, the mean-variance dominance diminishes. However, the Sharpe index of risk-adjusted portfolio performance indicates that the managed portfolios significantly outperform the buy-and-hold strategies even in the high-transaction-cost scenario. We show that the difference in performance between the managed portfolios and the buy-and-hold strategies can be partially explained by the January effect or the small firm effect. In sum, this paper demonstrates the merits of using a nonparametric approach for predicting stock returns and testing market efficiency.

14.
Implementation of the Gibbs sampler for estimating the accuracy of multiple binary diagnostic tests in one population has been investigated. This method, proposed by Joseph, Gyorkos and Coupal, makes use of a Bayesian approach and is used in the absence of a gold standard to estimate the prevalence, the sensitivity and specificity of medical diagnostic tests. The expressions that allow this method to be implemented for an arbitrary number of tests are given. By using the convergence diagnostics procedure of Raftery and Lewis, the relation between the number of iterations of Gibbs sampling and the precision of the estimated quantiles of the posterior distributions is derived. An example concerning a data set of gastro-esophageal reflux disease patients collected to evaluate the accuracy of the water siphon test compared with 24 h pH-monitoring, endoscopy and histology tests is presented. The main message that emerges from our analysis is that implementation of the Gibbs sampler to estimate the parameters of multiple binary diagnostic tests can be critical and convergence diagnostics are advised for this method. The factors which affect the convergence of the chains to the posterior distributions and those that influence the precision of their quantiles are analyzed.
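A minimal sketch of the kind of sampler described in entry 14 is given below for the simplest case of two conditionally independent binary tests with Beta priors, following the general structure of the Joseph, Gyorkos and Coupal approach; the counts and prior parameters are invented, and the analyses in the entry involve more tests and careful convergence diagnostics.

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented cross-classified counts n[j, k]: test 1 result j, test 2 result k (0/1).
n = np.array([[400, 60],
              [50, 90]])

# Beta prior parameters (alpha, beta) for prevalence, Se1, Sp1, Se2, Sp2 (invented).
priors = {"p": (1, 1), "se1": (5, 2), "sp1": (5, 2), "se2": (5, 2), "sp2": (5, 2)}
p, se1, sp1, se2, sp2 = 0.5, 0.8, 0.8, 0.8, 0.8   # starting values
draws = []

for it in range(5000):
    # 1. Sample the latent number of diseased subjects in each observed cell.
    y = np.zeros((2, 2))
    for j in (0, 1):
        for k in (0, 1):
            num = p * se1**j * (1 - se1)**(1 - j) * se2**k * (1 - se2)**(1 - k)
            den = num + (1 - p) * (1 - sp1)**j * sp1**(1 - j) * (1 - sp2)**k * sp2**(1 - k)
            y[j, k] = rng.binomial(n[j, k], num / den)

    # 2. Conjugate Beta updates given the latent disease indicators.
    d = y.sum()
    p = rng.beta(priors["p"][0] + d, priors["p"][1] + n.sum() - d)
    se1 = rng.beta(priors["se1"][0] + y[1, :].sum(), priors["se1"][1] + y[0, :].sum())
    sp1 = rng.beta(priors["sp1"][0] + (n - y)[0, :].sum(), priors["sp1"][1] + (n - y)[1, :].sum())
    se2 = rng.beta(priors["se2"][0] + y[:, 1].sum(), priors["se2"][1] + y[:, 0].sum())
    sp2 = rng.beta(priors["sp2"][0] + (n - y)[:, 0].sum(), priors["sp2"][1] + (n - y)[:, 1].sum())
    if it >= 1000:          # discard burn-in
        draws.append((p, se1, sp1, se2, sp2))

print("posterior means (p, Se1, Sp1, Se2, Sp2):", np.round(np.mean(draws, axis=0), 3))
```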

15.
The comparison of the accuracy of two binary diagnostic tests has traditionally required knowledge of the disease status in all of the patients in the sample via the application of a gold standard. In practice, the gold standard is not always applied to all patients in a sample, and the problem of partial verification of the disease arises. The accuracy of a binary diagnostic test can be measured in terms of positive and negative predictive values, which represent the accuracy of a diagnostic test when it is applied to a cohort of patients. In this paper, we deduce the maximum likelihood estimators of predictive values (PVs) of two binary diagnostic tests, and the hypothesis tests to compare these measures when, in the presence of partial disease verification, the verification process only depends on the results of the two diagnostic tests. The effect of verification bias on the naïve estimators of PVs of two diagnostic tests is studied, and simulation experiments are performed in order to investigate the small sample behaviour of hypothesis tests. The hypothesis tests which we have deduced can be applied when all of the patients are verified with the gold standard. The results obtained have been applied to the diagnosis of coronary stenosis.
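For reference, the predictive values mentioned in entry 15 relate to sensitivity, specificity and the disease prevalence p through Bayes' theorem, as in the standard expressions below (a general fact, not specific to the verification-bias setting of the paper).

```latex
\mathrm{PPV} = \frac{p\,Se}{p\,Se + (1 - p)(1 - Sp)}, \qquad
\mathrm{NPV} = \frac{(1 - p)\,Sp}{(1 - p)\,Sp + p\,(1 - Se)},
\qquad p = \text{disease prevalence}.
```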

16.
In this article, we use a latent class model (LCM) with prevalence modeled as a function of covariates to assess diagnostic test accuracy in situations where the true disease status is not observed, but observations on three or more conditionally independent diagnostic tests are available. A fast Monte Carlo expectation–maximization (MCEM) algorithm with binary (disease) diagnostic data is implemented to estimate parameters of interest; namely, sensitivity, specificity, and prevalence of the disease as a function of covariates. To obtain standard errors for confidence interval construction of estimated parameters, the missing information principle is applied to adjust information matrix estimates. We compare the adjusted information matrix-based standard error estimates with the bootstrap standard error estimates, both obtained using the fast MCEM algorithm, through an extensive Monte Carlo study. Simulation demonstrates that the adjusted information matrix approach estimates the standard error similarly to the bootstrap method under certain scenarios. The bootstrap percentile intervals have satisfactory coverage probabilities. We then apply the LCM analysis to a real data set of 122 subjects from a Gynecologic Oncology Group study of significant cervical lesion diagnosis in women with atypical glandular cells of undetermined significance to compare the diagnostic accuracy of a histology-based evaluation, a carbonic anhydrase-IX biomarker-based test and a human papillomavirus DNA test.

17.
In this paper, we propose several tests for simultaneously detecting differences in means and variances between two populations under normality. First of all, we propose a likelihood ratio test. Then we obtain an expression of the likelihood ratio statistic as a product of two functions of random quantities, which can be used to test the two individual partial hypotheses for differences in means and variances. With those individual partial tests, we propose a union-intersection test. We also consider two optimal tests formed by combining the p-values of the two individual partial tests. For obtaining null distributions, we apply the permutation principle with a Monte Carlo approach. We then compare the efficiency of the proposed tests with well-known ones through a simulation study. Finally, we discuss some interesting features related to the simultaneous tests and resampling methods as concluding remarks.

18.
In diagnostic medicine, the receiver operating characteristic (ROC) surface is one of the established tools for assessing the accuracy of a diagnostic test in discriminating three disease states, and the volume under the ROC surface has served as a summary index for diagnostic accuracy. In practice, the selection for definitive disease examination may be based on initial test measurements and induces verification bias in the assessment. We propose a non-parametric likelihood-based approach to construct the empirical ROC surface in the presence of differential verification, and to estimate the volume under the ROC surface. Estimators of the standard deviation are derived by both the Fisher information and the jackknife method, and their relative accuracy is evaluated in an extensive simulation study. The methodology is further extended to incorporate discrete baseline covariates in the selection process, and to compare the accuracy of a pair of diagnostic tests. We apply the proposed method to compare the diagnostic accuracy between mini-mental state examination and clinical evaluation of dementia, in discriminating between three disease states of Alzheimer's disease.
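The volume under the ROC surface referred to in entry 18 has a simple empirical counterpart when full verification is available: the proportion of correctly ordered triples across the three disease states. The sketch below computes it for simulated data; it does not implement the verification-bias correction that the paper develops.

```python
import numpy as np

rng = np.random.default_rng(4)
# Simulated test results for three ordered disease states.
x1 = rng.normal(0.0, 1.0, size=40)   # state 1 (e.g. healthy)
x2 = rng.normal(1.0, 1.0, size=35)   # state 2 (intermediate)
x3 = rng.normal(2.0, 1.0, size=30)   # state 3 (diseased)

# Empirical VUS: proportion of triples (a from state 1, b from state 2, c from state 3)
# correctly ordered a < b < c (ties, absent here, would need fractional weights).
a = x1[:, None, None]
b = x2[None, :, None]
c = x3[None, None, :]
vus = np.mean((a < b) & (b < c))
print(f"empirical VUS = {vus:.3f}  (1/6 corresponds to no discrimination)")
```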

19.
In medicine, there are often two diagnostic tests that serve the same purpose. Typically, one of the tests will have a lower diagnostic performance but be less invasive, easier to perform, or cheaper. Clinicians must assess the agreement between the tests while accounting for test–retest variation in both techniques. In this paper, we investigate a specific example from interventional cardiology, studying the agreement between the fractional flow reserve and the instantaneous wave-free ratio. We analyze potential definitions of the agreement (accuracy) between the two tests and compare five families of statistical estimators. We contrast their statistical behavior both theoretically and using numerical simulations. Surprisingly for clinicians, seemingly natural and equivalent definitions of the concept of agreement can lead to discordant and even nonsensical estimates.

20.
With data collection in environmental science and bioassay, left censoring because of nondetects is a problem. Similarly, in reliability and life data analysis, right censoring frequently occurs. There is a need for goodness of fit tests that can adapt to left or right censored data and be used to check important distributional assumptions without becoming too difficult to regularly implement in practice. A new test statistic is derived from a plot of the standardized spacings between the order statistics versus their ranks. Any linear or curvilinear pattern is evidence against the null distribution. When testing the Weibull or extreme value null hypothesis this statistic has a null distribution that is approximately F for most combinations of sample size and censoring of practical interest. Our statistic is compared to the Mann-Scheuer-Fertig statistic which also uses the standardized spacings between the order statistics. The results of a simulation study show the two tests are competitive in terms of power. Although the Mann-Scheuer-Fertig statistic is somewhat easier to compute, our test enjoys advantages in the accuracy of the F approximation and the availability of a graphical diagnostic.

