
1.

Nonparametric estimation of the ROC curve based on smoothed empirical distribution functions

Alicja Jokiel-Rokita Michał Pulit 《Statistics and Computing》2013,23(6):703-712

The receiver operating characteristic (ROC) curve is a graphical representation of the relationship between false positive and true positive rates. It is a widely used statistical tool for describing the accuracy of a diagnostic test. In this paper we propose a new nonparametric ROC curve estimator based on the smoothed empirical distribution functions. We prove its strong consistency and perform a simulation study to compare it with some other popular nonparametric estimators of the ROC curve. We also apply the proposed method to a real data set.
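As a point of reference for the definition above, here is a minimal sketch of the plain (unsmoothed) empirical ROC curve, not the smoothed estimator the paper proposes. The function name `empirical_roc`, the convention that larger scores indicate disease, and the sample scores are all assumptions made for illustration.

```python
def empirical_roc(healthy, diseased):
    """Return the (false positive rate, true positive rate) points of the
    empirical ROC curve. Assumption: a subject is called positive when its
    score is at least the threshold, i.e. larger scores indicate disease."""
    # Sweep thresholds from the largest observed score down to the smallest.
    thresholds = sorted(set(healthy) | set(diseased), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nobody is positive
    for c in thresholds:
        fpr = sum(x >= c for x in healthy) / len(healthy)
        tpr = sum(y >= c for y in diseased) / len(diseased)
        points.append((fpr, tpr))
    return points


# Hypothetical test scores, chosen only to show the call.
roc_points = empirical_roc([1, 2, 3], [2, 3, 4])
```

The curve starts at (0, 0), ends at (1, 1), and is a step function between observed scores; smoothing the two empirical distribution functions, as the abstract describes, is one way to replace those steps with a continuous curve.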

2.

Uniform Design for Experiments with Mixtures

Jian-Hui Ning Kai-Tai Fang Yong-Dao Zhou 《Communications in Statistics - Theory and Methods》2013,42(10):1734-1742

The goal of uniform mixture design is to scatter the design points uniformly over the experimental region. The commonly used criteria, such as mean square distance, are based on the Euclidean distance. In this article, a new criterion based on the Lee distance is proposed. An algorithm called NTLBG is also proposed to refine randomly generated designs for experiments with mixtures. Some examples show that the design generated by the NTLBG algorithm has a lower criterion value.

3.

Adjusting ROC curves for covariates in the presence of verification bias

Ronen Fluss Benjamin Reiser David Faraggi 《Journal of statistical planning and inference》2012,142(1):1-11

The ROC (receiver operating characteristic) curve is frequently used for describing effectiveness of a diagnostic marker or test. Classical estimation of the ROC curve uses independent identically distributed samples taken randomly from the healthy and diseased populations. Frequently not all subjects undergo a definitive gold standard assessment of disease status (verification). Estimation of the ROC curve based on data only from subjects with verified disease status may be badly biased (verification bias). In this work we investigate the properties of the doubly robust (DR) method for estimating the ROC curve adjusted for covariates (ROC regression) under verification bias. We develop the estimator's asymptotic distribution and examine its finite sample size properties via a simulation study. We apply this procedure to fingerstick postprandial blood glucose measurement data adjusting for age.

4.

Confidence bands for ROC curves

Lajos Horváth Zsuzsanna Horváth Wang Zhou 《Journal of statistical planning and inference》2008

We develop two methods to construct confidence bands for the receiver operating characteristic (ROC) curve without estimating the densities of the underlying distributions. The first method is based on the smoothed bootstrap while the second method uses the Bonferroni inequality. As an illustration, we provide confidence bands for the ROC curve using data on Duchenne muscular dystrophy.

5.

Compare diagnostic tests using transformation-invariant smoothed ROC curves

Tang L Du P Wu C 《Journal of statistical planning and inference》2010,140(11):3540-3551

The receiver operating characteristic (ROC) curve, which plots true positive rates against false positive rates as the threshold varies, is an important tool for evaluating biomarkers in diagnostic medicine studies. By definition, the ROC curve is monotone increasing from 0 to 1 and is invariant to any monotone transformation of test results. It is also often a curve with a certain level of smoothness when test results from the diseased and non-diseased subjects follow continuous distributions. Most existing ROC curve estimation methods do not guarantee all of these properties. One of the exceptions is Du and Tang (2009), which applies a monotone spline regression procedure to empirical ROC estimates. However, their method does not consider the inherent correlations between empirical ROC estimates. This makes the derivation of the asymptotic properties very difficult. In this paper we propose a penalized weighted least squares estimation method, which incorporates the covariance between empirical ROC estimates as a weight matrix. The resulting estimator satisfies all the aforementioned properties, and we show that it is also consistent. Then a resampling approach is used to extend our method for comparisons of two or more diagnostic tests. Our simulations show a significantly improved performance over the existing method, especially for steep ROC curves. We then apply the proposed method to a cancer diagnostic study that compares several newly developed diagnostic biomarkers to a traditional one.

6.

Fitting ROC Curves Using Non-linear Binomial Regression

Chris J. Lloyd 《Australian & New Zealand Journal of Statistics》2000,42(2):193-204

The performance of a diagnostic test is summarized by its receiver operating characteristic (ROC) curve. Empirical data on a test's performance often come in the form of observed true positive and false positive relative frequencies, under varying conditions. This paper describes a family of models for analysing such data. The underlying ROC curves are specified by a shift parameter, a shape parameter and a link function. Both the position along the ROC curve and the shift parameter are modelled linearly. The shape parameter enters the model non-linearly but in a very simple manner. One simple application is to the meta-analysis of independent studies of the same diagnostic test, illustrated on some data of Moses, Shapiro & Littenberg (1993). A second application to so-called vigilance data is given, where ROC curves differ across subjects, and modelling of the position along the ROC curve is of primary interest.

7.

Two transformation models for estimating an ROC curve derived from continuous data

Kelly H. Zou W. J. Hall 《Journal of applied statistics》2000,27(5):621-631

A receiver operating characteristic (ROC) curve is a plot of two survival functions, derived separately from the diseased and healthy samples. A special feature is that the ROC curve is invariant to any monotone transformation of the measurement scale. We propose and analyse semiparametric and parametric transformation models for this two-sample problem. Following an unspecified or specified monotone transformation, we assume that the healthy and diseased measurements have two normal distributions with different means and variances. Maximum likelihood algorithms for estimating ROC curve parameters are developed. The proposed methods are illustrated on the marker CA125 in the diagnosis of gastric cancer.
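For readers unfamiliar with the two-normal assumption, the following is the standard binormal form of the ROC curve that it implies. This is a sketch, not taken from the paper. The symbols $\mu_H, \sigma_H, \mu_D, \sigma_D$ (means and standard deviations of the transformed healthy and diseased measurements) and the convention that larger values indicate disease are assumptions introduced here.

```latex
% Threshold c, false positive rate t, standard normal cdf \Phi.
\begin{align*}
t &= P(X_H > c) = 1 - \Phi\!\left(\frac{c - \mu_H}{\sigma_H}\right)
  \quad\Longrightarrow\quad c = \mu_H - \sigma_H\,\Phi^{-1}(t), \\
\mathrm{ROC}(t) &= P(X_D > c) = \Phi\!\left(\frac{\mu_D - c}{\sigma_D}\right)
  = \Phi\!\left(a + b\,\Phi^{-1}(t)\right), \\
a &= \frac{\mu_D - \mu_H}{\sigma_D}, \qquad b = \frac{\sigma_H}{\sigma_D}.
\end{align*}
```

Because the curve depends on the data only through $a$ and $b$, applying the same increasing transformation to both samples leaves it unchanged, which is the invariance property the abstract mentions.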

8.

An alternative procedure for performing a power analysis of Mantel's test

A.R. Silva C.T.S. Dias P.R. Cecon E.R. Rêgo 《Journal of applied statistics》2015,42(9):1984-1992

This study proposes a simple way to perform a power analysis of Mantel's test applied to squared Euclidean distance matrices. The general statistical aspects of the simple Mantel's test are reviewed. The Monte Carlo method is used to generate bivariate Gaussian variables in order to create squared Euclidean distance matrices. The power of the parametric correlation t-test applied to raw data is also evaluated and compared with that of Mantel's test. The standard procedure for calculating punctual power levels is used for validation. The proposed procedure allows one to draw the power curve by running the test only once, dispensing with the time demanding standard procedure of Monte Carlo simulations. Unlike the standard procedure, it does not depend on a knowledge of the distribution of the raw data. The simulated power function has all the properties of the power analysis theory and is in agreement with the results of the standard procedure.

9.

Chernoff distance for doubly truncated distributions

Chanchal Kundu 《Communications in Statistics - Theory and Methods》2017,46(21):10594-10606

In a recent paper, Nair et al. [Stat Pap 52:893–909, 2011] proposed Chernoff distance measure for left/right-truncated random variables and studied their properties in the context of reliability analysis. Here we extend the definition of Chernoff distance for doubly truncated distributions. This measure may help the information theorists and reliability analysts to study the various characteristics of a system/component when it fails between two time points. We study some properties of this measure and obtain its upper and lower bounds. We also study the interval Chernoff distance between the original and weighted distributions. These results generalize and enhance the related existing results that are developed based on Chernoff distance for one-sided truncated random variables.

10.

Semi-empirical likelihood inference for the ROC curve with missing data

Xiaoxia Liu Yichuan Zhao 《Journal of statistical planning and inference》2012

The receiver operating characteristic (ROC) curve is one of the most commonly used methods to compare the diagnostic performance of two or more laboratory or diagnostic tests. In this paper, we propose semi-empirical likelihood based confidence intervals for ROC curves of two populations, where one population is parametric and the other one is non-parametric and both have missing data. After imputing missing values, we derive the semi-empirical likelihood ratio statistic and the corresponding likelihood equations. It is shown that the log-semi-empirical likelihood ratio statistic is asymptotically scaled chi-squared. The estimating equations are solved simultaneously to obtain the estimated lower and upper bounds of semi-empirical likelihood confidence intervals. We conduct extensive simulation studies to evaluate the finite sample performance of the proposed empirical likelihood confidence intervals with various sample sizes and different missing probabilities.

11.

Equivalence theorem for Schur optimality of experimental designs

Radoslav Harman 《Journal of statistical planning and inference》2008

An experimental design is said to be Schur optimal if it is optimal with respect to the class of all Schur isotonic criteria, which includes Kiefer's criteria of $\Phi_p$-optimality, distance optimality criteria and many others. In the paper we formulate an easily verifiable necessary and sufficient condition for Schur optimality in the set of all approximate designs of a linear regression experiment with uncorrelated errors. We also show that several common models admit a Schur optimal design, for example the trigonometric model, the first-degree model on the Euclidean ball, and Berman's model.

12.

On the confusion matrix in credit scoring and its analytical properties

Guoping Zeng 《Communications in Statistics - Theory and Methods》2020,49(9):2080-2093

The Confusion Matrix is an important measure for evaluating the accuracy of credit scoring models. However, the literature about the Confusion Matrix is limited. The analytical properties of the Confusion Matrix are ignored. Moreover, the concept of the Confusion Matrix is confusing. In this article, we systematically study the Confusion Matrix and its analytical properties. We enumerate 16 possible variants of the Confusion Matrix and show that only 8 are reasonable. We study the relationship between the Confusion Matrix and two other performance measures: the receiver operating characteristic curve (ROC) and the Kolmogorov-Smirnov statistic (KS). We show that an optimal cutoff score can be attained by KS.

13.

Testing the difference between two Kolmogorov–Smirnov values in the context of receiver operating characteristic curves

Wojtek J. Krzanowski David J. Hand 《Journal of applied statistics》2011,38(3):437-450

The maximum vertical distance between a receiver operating characteristic (ROC) curve and its chance diagonal is a common measure of effectiveness of the classifier that gives rise to this curve. This measure is known to be equivalent to a two-sample Kolmogorov–Smirnov statistic; so the absolute difference D between two such statistics is often used informally as a measure of difference between the corresponding classifiers. A significance test of D is of great practical interest, but the available Kolmogorov–Smirnov distribution theory precludes easy analytical construction of such a significance test. We, therefore, propose a Monte Carlo procedure for conducting the test, using the binormal model for the underlying ROC curves. We provide Splus/R routines for the computation, tabulate the results for a number of illustrative cases, apply the methods to some practical examples and discuss some implications.
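The equivalence stated in the first two sentences can be illustrated with a short sketch. It computes the largest vertical gap between the empirical ROC curve and the chance diagonal, which is the largest absolute difference between true and false positive rates over all thresholds. The function name `ks_from_scores` and the sample scores are hypothetical; this is not the authors' Splus/R code, and it does not implement their Monte Carlo test.

```python
def ks_from_scores(negatives, positives):
    """Largest vertical gap between the empirical ROC curve and the diagonal.

    At threshold c the ROC point is (FPR(c), TPR(c)), and the diagonal at the
    same abscissa has height FPR(c), so the gap is |TPR(c) - FPR(c)|. This is
    the two-sample Kolmogorov-Smirnov distance between the score samples.
    """
    best = 0.0
    for c in sorted(set(negatives) | set(positives)):
        fpr = sum(x >= c for x in negatives) / len(negatives)
        tpr = sum(y >= c for y in positives) / len(positives)
        best = max(best, abs(tpr - fpr))
    return best


# Hypothetical classifier scores for two classifiers on the same cases.
ks_a = ks_from_scores([1, 2, 3], [4, 5, 6])  # perfectly separated classes
ks_b = ks_from_scores([1, 2, 3], [2, 3, 4])  # overlapping classes
d = abs(ks_a - ks_b)  # the informal difference D discussed in the abstract
```

The value `d` is only the descriptive statistic; judging whether it is significant is the problem the paper addresses.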

14.

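A curve clustering method based on B-spline basis expansion

Huang Hengjun 《Statistics & Information Forum》2013,28(9):3-8

With the arrival of the big-data era, functional data analysis has become an active research topic in recent years, and clustering methods for curves have attracted scholarly attention. This paper presents a curve clustering method: taking the L2 distance as the measure of similarity and representing the curves by B-spline basis expansions, it builds both the information in the curves themselves and the information on how the curves change into the clustering algorithm, and it links curve clustering to traditional multivariate clustering methods. As an application, the method is validated on an example that clusters urban and rural income functions. The results show that including the curve-change information yields better clustering than using the information in the curves alone.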

15.

A Flexible Method for Estimating the ROC Curve

Haobo Ren Xiao-Hua Zhou Hua Liang 《Journal of applied statistics》2004,31(7):773-784

In this paper we propose a flexible method for estimating a receiver operating characteristic (ROC) curve that is based on a continuous-scale test. The approach is easily understood and efficiently computed, and robust to the smooth parameter selection, which needs intensive computation when using local polynomial and smoothing spline techniques. The results from our simulation experiment indicate that the moderate-sample numerical performance of our estimator is better than the empirical ROC curve estimator and comparable to the local linear estimator. The availability of easy implementation is also illustrated by our simulation. We apply the proposed method to two real data sets.

16.

Asymmetric matrix-valued covariances for multivariate random fields on spheres

Alfredo Alegría Emilio Porcu Reinhard Furrer 《Journal of Statistical Computation and Simulation》2018,88(10):1850-1862

Matrix-valued covariance functions are crucial to geostatistical modelling of multivariate spatial data. The classical assumption of symmetry of a multivariate covariance function is overly restrictive and has been considered unrealistic for most real data applications. Despite that, the literature on asymmetric covariance functions has been very sparse. In particular, there is some work related to asymmetric covariances on Euclidean spaces, depending on the Euclidean distance. However, for data collected over large portions of planet Earth, the most natural spatial domain is a sphere, with the corresponding geodesic distance being the natural metric. In this work, we propose a strategy based on spatial rotations to generate asymmetric covariances for multivariate random fields on the d-dimensional unit sphere. We illustrate through simulations as well as real data analysis that our proposal improves predictive performance in comparison to the symmetric counterpart.

17.

Extended moment results for improving inferences based on MRPP

D. S. Tracy I. H. Tajuddin 《Communications in Statistics - Theory and Methods》2013,42(6):1485-1496

The MRPP test statistic studied by Mielke and others is the weighted average distance between pairs of observations within a group. They defined 12 symmetric functions to obtain its first three moments. We define 23 additional symmetric functions to obtain the fourth moment. This can be useful in studying further approximations to its sampling distribution. We also study the special case when the distance function is the Euclidean distance between ranks of observations.

18.

Statistical frameworks for setting cutoff points of metabolic syndrome criteria

Tien-Mu Hsiao Huifen Chen 《Communications in Statistics - Simulation and Computation》2017,46(7):5666-5681

We propose three statistical frameworks for determining the cutoff points of metabolic syndrome (MetS) criteria, consisting of six components that are the same as in widely used MetS definitions, e.g., the 2004 updated NCEP-ATPIII criteria. Several international organizations have proposed MetS definitions; no literature indicates that any of these definitions is based on statistical frameworks. For all the three frameworks, the cutoff points are set to maximize the observed prevalence rate of stroke and DM. The three frameworks differ in assumptions on the joint distribution of the six components. Using the cohort data from a regional hospital in Taiwan, we illustrate applications of the three frameworks and compare them with the updated NCEP-ATPIII definition and the 2009 consensus definition of IDF and AHA/NHLBI. The performance measure is the odds ratio, the odds of getting stroke or DM within subjects with MetS divided by the analogous odds for subjects without MetS. Our numerical results show that the odds ratios of the three frameworks are higher than those of the updated-NCEP and consensus definitions, showing that the proposed frameworks seem to provide a better association of MetS with stroke and DM.
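The performance measure defined in the abstract is a ratio of two odds, and a small sketch may make the arithmetic concrete. The function name `odds_ratio`, its parameter names, and the counts below are hypothetical and do not come from the study's cohort data.

```python
def odds_ratio(events_with, total_with, events_without, total_without):
    """Odds of the outcome among subjects with the condition, divided by the
    odds among subjects without it. The odds of an event are
    (number with the event) / (number without the event)."""
    odds_with = events_with / (total_with - events_with)
    odds_without = events_without / (total_without - events_without)
    return odds_with / odds_without


# Made-up counts: 30 of 100 subjects with MetS and 10 of 100 subjects without
# MetS develop stroke or DM, so the odds are 30/70 and 10/90.
example = odds_ratio(30, 100, 10, 100)
```

A value above 1 means the outcome is more strongly associated with the group classified as having MetS, which is the sense in which the abstract compares cutoff definitions.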

19.

The average area under correlated receiver operating characteristic curves: a nonparametric approach based on generalized two-sample Wilcoxon statistics

Mei-Ling Ting Lee & Bernard A. Rosner 《Journal of the Royal Statistical Society. Series C, Applied statistics》2001,50(3):337-344

It is well known that, when sample observations are independent, the area under the receiver operating characteristic (ROC) curve corresponds to the Wilcoxon statistics if the area is calculated by the trapezoidal rule. Correlated ROC curves arise often in medical research and have been studied by various parametric methods. On the basis of the Mann–Whitney U-statistics for clustered data proposed by Rosner and Grove, we construct an average ROC curve and derive nonparametric methods to estimate the area under the average curve for correlated ROC curves obtained from multiple readers. For the more complicated case where, in addition to multiple readers examining results on the same set of individuals, two or more diagnostic tests are involved, we derive analytic methods to compare the areas under correlated average ROC curves for these diagnostic tests. We demonstrate our methods in an example and compare our results with those obtained by other methods. The nonparametric average ROC curve and the analytic methods that we propose are easy to explain and simple to implement.
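The well-known fact in the opening sentence can be checked numerically. The sketch below covers only the independent-sample case, not the clustered or multi-reader methods of the paper. It computes the area under the empirical ROC curve by the trapezoidal rule and compares it with the Mann-Whitney form, in which ties count one half. Both function names and the scores are assumptions made for illustration.

```python
def auc_mann_whitney(healthy, diseased):
    """Share of (healthy, diseased) pairs in which the diseased score is
    larger, with tied pairs counted as one half."""
    wins = sum((y > x) + 0.5 * (y == x) for x in healthy for y in diseased)
    return wins / (len(healthy) * len(diseased))


def auc_trapezoid(healthy, diseased):
    """Area under the empirical ROC curve by the trapezoidal rule."""
    thresholds = sorted(set(healthy) | set(diseased), reverse=True)
    pts = [(0.0, 0.0)]
    for c in thresholds:
        fpr = sum(x >= c for x in healthy) / len(healthy)
        tpr = sum(y >= c for y in diseased) / len(diseased)
        pts.append((fpr, tpr))
    # Sum the trapezoids between consecutive ROC points.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Hypothetical scores containing ties between the two groups.
healthy, diseased = [1, 2, 3], [2, 3, 4]
u_area = auc_mann_whitney(healthy, diseased)
t_area = auc_trapezoid(healthy, diseased)
```

For these scores both routes give 7/9. The trapezoidal rule matters precisely because of the ties: each tied pair produces a diagonal segment in the ROC curve, and the trapezoid under it contributes the one-half weight used in the Mann-Whitney count.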

20.

Confidence Bands for ROC Curves With Serially Dependent Data

Kajal Lahiri Liu Yang 《Journal of Business & Economic Statistics》2018,36(1):115-130

We propose serial correlation-robust asymptotic confidence bands for the receiver operating characteristic (ROC) curve and its functional, viz., the area under ROC curve (AUC), estimated by quasi-maximum likelihood in the binormal model. Our simulation experiments confirm that this new method performs fairly well in finite samples, and confers an additional measure of robustness to nonnormality. The conventional procedure is found to be markedly undersized in terms of yielding empirical coverage probabilities lower than the nominal level, especially when the serial correlation is strong. An example from macroeconomic forecasting demonstrates the importance of accounting for serial correlation when the probability forecasts for real GDP declines are evaluated using ROC. Supplementary materials for this article are available online.