Having constructed a rule for classifying objects into classes, one will need to evaluate the performance of that rule both in absolute terms (is it good enough?) and in relative terms (is it better than an alternative?). In this paper, we discuss such evaluation, focusing primarily on the first question, and covering discriminability (how effective the rule is in classifying new objects to the correct class) and reliability (how accurately it estimates probabilities of class membership). Measures based on percentages correct, measures based on probabilities of being correct, and distance-based measures are outlined, and their attractive and problematic properties are discussed.
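
As a rough illustration of the two kinds of measure named in this abstract, the sketch below computes a percentage-correct score (discriminability) and a squared-distance score on predicted probabilities (reliability). The Brier score is our choice of distance-based measure for illustration; the abstract does not say which measures the paper covers in detail, and the data and function names are hypothetical.

```python
def percent_correct(true_labels, predicted_labels):
    """Fraction of objects assigned to their true class."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def brier_score(true_labels, predicted_probs):
    """Mean squared distance between the predicted P(class 1) and the 0/1 outcome.

    Assumption: a two-class problem with labels coded 0 and 1.
    """
    squared = [(p - y) ** 2 for y, p in zip(true_labels, predicted_probs)]
    return sum(squared) / len(true_labels)


# Hypothetical data: true classes and a rule's estimated P(class 1).
y_true = [1, 0, 1, 1, 0]
p_hat = [0.9, 0.2, 0.6, 0.4, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in p_hat]  # classify at a 0.5 threshold

accuracy = percent_correct(y_true, y_pred)
brier = brier_score(y_true, p_hat)
print(accuracy, brier)
```

A rule can score well on one measure and poorly on the other, which is why the two are assessed separately.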

Idiopathic scoliosis is the most common spinal deformity, affecting perhaps as many as 5% of children. Early recognition of the condition is essential for optimal treatment. A widely used technique for identification is based on a somewhat crude angle measurement from a frontal spinal X-ray. Here, we provide a technique and new summary statistical measures for classifying spinal shape, and present results obtained from clinical X-rays.

For the clinical development of a new drug, the determination of dose-proportionality is an essential part of the pharmacokinetic evaluations, which may provide early indications of non-linear pharmacokinetics and may help to identify sub-populations with divergent clearances. Prior to making any conclusions regarding dose-proportionality, the goodness-of-fit of the model must be assessed to evaluate the model performance. We propose the use of simulation-based visual predictive checks to improve the validity of dose-proportionality conclusions for complex designs. We provide an illustrative example and include a table to facilitate review by regulatory authorities.

The periodic monitoring of drug treatments often involves the collection of biological specimens (e.g. blood, urine, synovial fluid) for the purpose of clinical laboratory assessment. The analysis of a particular specimen yields a vector of measurements from which judgments are made concerning the status of a subject and the effect of the drug. Typically, an observation vector is compared to “normal values” which may be conditioned on covariates such as age, gender, or other relevant characteristics. Under an assumption of multivariate normality of the data available, a method is presented for deciding whether a particular observed vector looks “normal”. The method, based on a predictive approach, is compared to other proposals and is shown to have optimality properties not possessed by standard procedures. Three different approaches are used in the discussion of optimality within the class of invariant methods. The first involves tolerance regions with smallest normalized expected volume, the second involves a decision theoretic comparison of predictive distributions, while the third involves the foundational notions of incoherence (Dutch book) and strong inconsistency.

This paper studies influential observations on the spectrum of a stationary stochastic process. We introduce a leave-one-out procedure in spectral density estimation to identify influential points. A simulated envelope is proposed to assess the magnitude of influence when the data follow an autoregressive integrated moving average model. Practical illustrations are discussed in two examples.

This paper develops a method for assessing the risk for rare events based on the following scenario. There exists a large population with an unknown percentage p of defects. A sample of size N is drawn from the population and, in the sample, 0 defects are drawn. Given these data, we want to determine the probability that no more than n defects will be found in another random sample of N drawn from the population. Estimates on the range of p and n are calculated from a derived joint distribution which depends on p, n and N. Asymptotic risk results based on an infinite sample are then developed. It is shown that these results are applicable even with relatively small sample spaces.
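
The scenario in this abstract can be made concrete with a small calculation. The sketch below is not the paper's derivation, which the abstract does not give. It assumes a uniform prior on p and an effectively infinite population, so that after 0 defects in N draws the posterior is Beta(1, N+1) and the number of defects in a second sample of N follows a beta-binomial distribution. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def prob_at_most(n, N):
    """P(at most n defects in a new sample of N | 0 defects in a first sample of N).

    Assumptions (ours, not the paper's): uniform prior on p, binomial sampling.
    The predictive mass at k defects is
        C(N, k) * (N + 1) * k! * (2N - k)! / (2N + 1)!
    """
    total = Fraction(0)
    for k in range(n + 1):
        numerator = comb(N, k) * (N + 1) * factorial(k) * factorial(2 * N - k)
        total += Fraction(numerator, factorial(2 * N + 1))
    return total


# Probability of again seeing zero defects after a clean sample of 10.
p_zero_again = prob_at_most(0, 10)
print(p_zero_again, float(p_zero_again))
```

Under these assumptions the chance of a second clean sample is (N+1)/(2N+1), which stays just above one half however large N is. A clean sample therefore gives limited assurance about the next one.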

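How to evaluate the quality of China's foreign trade scientifically is one of the prerequisites for formulating effective policies to promote the transformation of China's foreign trade development mode. Combining economic theory with statistical analysis, this paper selects and constructs, from the World Bank WDI database, 6 dependent variables and 9 independent variables reflecting the state of China's foreign trade over 1980-2010. Through empirical analysis, it describes, in both static and dynamic dimensions, the structural changes in the overall quality of China's foreign trade during the more than 30 years since reform and opening up, and proposes related policy recommendations and directions for further research.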

The ICH harmonized tripartite guideline 'Statistical Principles for Clinical Trials', more commonly referred to as ICH E9, was adopted by the regulatory bodies of the European Union, Japan and the USA in 1998. This document united related guidance documents on statistical methodology from each of the three ICH regions, and meant that for the first time clear consistent guidance on statistical principles was available to those conducting and reviewing clinical trials. At the 10th anniversary of the guideline's adoption, this paper discusses the influence of ICH E9 by presenting a perspective on how approaches to some aspects of clinical trial design, conduct and analysis have changed in that time in the context of regulatory submissions in the European Union.

Various aspects of assessing multivariate normality are discussed. Practical recommendations are given, and areas of further research interest are noted.

Nonparametric approaches to the analysis of multiple endpoints in clinical studies can be of particular value when the endpoints are heterogeneous or distributional assumptions are suspect. We describe a multivariate Terpstra-Jonckheere U-statistic for assessing multiple endpoints with ordered alternatives, and illustrate its use with data arising from a recent clinical study.

We examine, via a Bayesian analysis, in a general experimental situation, whether it is worthwhile to obtain additional prior information before performing an experiment. Examples illustrate the application of the techniques developed.

Parast, Layla; Tian, Lu; Cai, Tianxi. Lifetime Data Analysis, 2020, 26(2): 245-265
Assessing the potential of surrogate markers and surrogate outcomes for replacing a long term outcome is an active area of research. The interest in this topic is partly...

In some crossover experiments, particularly in medical applications, subjects may fail to complete their sequences of treatments for reasons unconnected with the treatments received. A method is described of assessing the robustness of a planned crossover design, with more than two periods, to subjects leaving the study prematurely. The method involves computing measures of efficiency for every possible design that can result, and is therefore very computationally intensive. Summaries of these measures are used to choose between competing designs. The computational problem is reduced to a manageable size by a software implementation of Polya theory. The method is applied to comparing designs for crossover studies involving four treatments and four periods. Designs are identified that are more robust to subjects dropping out in the final period than those currently favoured in medical and clinical trials.

This article introduces graphical procedures for assessing the fit of the gamma distribution. The procedures are based on a standardized version of the cumulant generating function. Plots with bands of 95% simultaneous confidence level are developed by utilizing asymptotic and finite-sample results. The plots have linear scales and do not rely on the use of tables or values of special functions. Further, it is found through simulation that the goodness-of-fit test implied by these plots compares favorably, with respect to power, to other known tests for the gamma distribution in samples drawn from lognormal and inverse Gaussian distributions.

We evaluate the performance of various bootstrap methods for constructing confidence intervals for the mean and median of several common distributions. Using Monte Carlo simulation, we assess performance by looking at coverage percentages and average confidence interval lengths. Poor performance is characterized by coverage deviating from 0.95 and large confidence interval lengths. Undercoverage is of greater concern than overcoverage. We also assess the performance of bootstrap methods in estimating the parameters of the Cox Proportional Hazard model and Accelerated Failure Time model.
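
For readers unfamiliar with the technique, the sketch below shows one bootstrap method of the kind this abstract compares: the percentile interval, applied to the mean and the median. The abstract does not list which bootstrap variants the paper evaluates, so the choice of the percentile method, the function name and the data are all assumptions made for illustration.

```python
import random
import statistics


def percentile_bootstrap_ci(data, stat, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Resamples the data with replacement, recomputes the statistic each time,
    and returns the empirical quantiles of those bootstrap estimates.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower_index = round(alpha * n_resamples)
    upper_index = round((1 - alpha) * n_resamples) - 1
    return estimates[lower_index], estimates[upper_index]


# Hypothetical sample.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9, 2.5, 3.9]

mean_ci = percentile_bootstrap_ci(sample, statistics.mean)
median_ci = percentile_bootstrap_ci(sample, statistics.median)
print("mean CI:", mean_ci)
print("median CI:", median_ci)
```

A coverage study of the kind described would repeat this over many simulated samples from a known distribution and count how often the interval contains the true mean or median.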

For banks using the Advanced Internal Ratings-Based Approach in accordance with Basel III requirements, the amount of required regulatory capital relies on the banks' estimates of the probability of default, the loss given default and the conversion factor for their credit risk portfolio. Therefore, for both model development and validation, assessing the models' predictive and discriminatory abilities is of key importance in order to ensure an adequate quantification of risk. This paper compares different measures of discriminatory power suitable for multi-class target variables such as in loss given default (LGD) models, which are currently used among banks and supervisory authorities. This analysis highlights the disadvantages of using measures that solely rely on pairwise comparisons when applied in a multi-class setting. Thus, for multi-class classification problems, we suggest using a generalisation of the well-known area under the receiver operating characteristic (ROC) curve known as the volume under the ROC surface (VUS). Furthermore, we present the R-package VUROCS, which allows for a time-efficient computation of the VUS as well as associated (co)variance estimates and illustrate its usage based on real-world loss data and validation principles.
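
To illustrate why the VUS is not a pairwise measure, the sketch below estimates it for three ordered classes by brute force: it is the share of triples, one observation per class, whose scores fall in the correct order. This is a plain Python illustration written for this summary and does not use or reproduce the VUROCS package; the function name, the strict treatment of ties and the data are assumptions.

```python
from itertools import product


def vus_three_classes(scores_low, scores_mid, scores_high):
    """Empirical volume under the ROC surface for three ordered classes.

    Counts triples (a, b, c), one score from each class, with a < b < c.
    Assumption: ties count as incorrectly ordered.
    """
    triples = list(product(scores_low, scores_mid, scores_high))
    correctly_ordered = sum(1 for a, b, c in triples if a < b < c)
    return correctly_ordered / len(triples)


# Hypothetical model scores for observations in low, medium and high loss classes.
low = [0.1, 0.2]
mid = [0.3, 0.5]
high = [0.4, 0.9]

vus = vus_three_classes(low, mid, high)
print(vus)
```

A model that scores at random gets about 1/6 for three classes, since only one of the six possible orderings is correct. The brute-force count grows with the product of the class sizes, which is the cost a time-efficient implementation avoids.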
