1.

On the three-way equivalence of AUC in credit scoring with tied scores

Guoping Zeng Edward Zeng 《Communications in Statistics - Theory and Methods》2019,48(7):1635-1650

In credit scoring, it is well known that AUC (the area under the curve) can be calculated geometrically, by the probability of a correct ranking of a good and bad pair, and by the Wilcoxon Rank-Sum statistic. This three-way equivalence was first presented by Hanley and McNeil in 1982 without considering tied scores and without giving analytical proofs. In this paper, we extend the three-way equivalence to the case with tied scores and provide analytic proofs for the three-way equivalence.
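
The three routes to AUC named in this abstract can be checked numerically on a small example with ties. The sketch below is only an illustration, not the authors' proof: the function names and the toy scores are hypothetical, a tied good/bad pair is assumed to count one half, and tied scores are assumed to receive midranks.

```python
from itertools import groupby

def auc_pairs(good, bad):
    """Share of good/bad pairs ranked correctly; a tied pair counts 0.5."""
    wins = sum((g > b) + 0.5 * (g == b) for g in good for b in bad)
    return wins / (len(good) * len(bad))

def auc_rank_sum(good, bad):
    """AUC from the Wilcoxon rank-sum of the goods, using midranks for ties."""
    midrank = {}
    pos = 0
    for score, grp in groupby(sorted(good + bad)):
        n = len(list(grp))
        midrank[score] = pos + (n + 1) / 2  # average of ranks pos+1 .. pos+n
        pos += n
    w = sum(midrank[g] for g in good)
    n_g, n_b = len(good), len(bad)
    return (w - n_g * (n_g + 1) / 2) / (n_g * n_b)

def auc_trapezoid(good, bad):
    """Area under the ROC curve (trapezoidal rule), one point per distinct score."""
    n_g, n_b = len(good), len(bad)
    area, tpr, fpr = 0.0, 0.0, 0.0
    for t in sorted(set(good + bad), reverse=True):
        new_tpr = tpr + good.count(t) / n_g
        new_fpr = fpr + bad.count(t) / n_b
        area += (new_fpr - fpr) * (tpr + new_tpr) / 2
        tpr, fpr = new_tpr, new_fpr
    return area

# Hypothetical scores; 0.8 and 0.6 each appear in both groups (ties).
good = [0.9, 0.8, 0.8, 0.6]
bad = [0.8, 0.5, 0.6]
print(auc_pairs(good, bad), auc_rank_sum(good, bad), auc_trapezoid(good, bad))
```

On this toy data the three functions return the same value up to floating-point rounding, which is the equivalence the abstract describes.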

2.

On a characterization of the dispersion matrix based on the properties of regression

Bimal Kumar Sinha Bikas Kumar Sinha 《Communications in Statistics - Theory and Methods》2013,42(13):1215-1224

This paper provides a partial solution to a problem posed by J. Neyman (1965) regarding the characterization of multivariate negative binomial distribution based on the properties of regression. It is shown that some of the properties of regression characterize the form of the nonsingular dispersion matrix of the parent distribution, which, interestingly enough, corresponds to only two types viz. those of positive and negative multivariate binomial distributions.

3.

Principled leveraging of external data in the evaluation of diagnostic devices via the propensity score-integrated composite likelihood approach

Changhong Song Heng Li Wei-Chen Chen Nelson Lu Ram Tiwari Chenguang Wang Yunling Xu Lilly Q. Yue 《Pharmaceutical statistics》2023,22(3):547-569

In the area of diagnostics, it is common practice to leverage external data to augment a traditional study of diagnostic accuracy consisting of prospectively enrolled subjects to potentially reduce the time and/or cost needed for the performance evaluation of an investigational diagnostic device. However, the statistical methods currently being used for such leveraging may not clearly separate study design and outcome data analysis, and they may not adequately address possible bias due to differences in clinically relevant characteristics between the subjects constituting the traditional study and those constituting the external data. This paper is intended to draw attention in the field of diagnostics to the recently developed propensity score-integrated composite likelihood approach, which originally focused on therapeutic medical products. This approach applies the outcome-free principle to separate study design and outcome data analysis and can mitigate bias due to imbalance in covariates, thereby increasing the interpretability of study results. While this approach was conceived as a statistical tool for the design and analysis of clinical studies for therapeutic medical products, here, we will show how it can also be applied to the evaluation of sensitivity and specificity of an investigational diagnostic device leveraging external data. We consider two common scenarios for the design of a traditional diagnostic device study consisting of prospectively enrolled subjects, which is to be augmented by external data. The reader will be taken through the process of implementing this approach step-by-step following the outcome-free principle that preserves study integrity.

4.

Medical diagnostic test based on the potential test result approach: bounds and identification

Akiko Kada Zhihong Cai 《Journal of applied statistics》2013,40(8):1659-1672

5.

Between a ROC and a hard place: Teaching prevalence plots to understand real world biomarker performance in the clinic

B. Clare Lendrem Dennis W. Lendrem Arthur G. Pratt Najib Naamane Peter McMeekin Wan‐Fai Ng A. Joy Allen Michael Power John Dudley Isaacs 《Pharmaceutical statistics》2019,18(6):632-635

The Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) of the ROC curve are widely used in discovery to compare the performance of diagnostic and prognostic assays. The ROC curve has the advantage that it is independent of disease prevalence. However, in this note, we remind scientists and clinicians that the performance of an assay upon translation to the clinic is critically dependent upon that very same prevalence. Without an understanding of prevalence in the test population, even robust bioassays with excellent ROC characteristics may perform poorly in the clinic. While the exact prevalence in the target population is not always known, simple plots of candidate assay performance as a function of prevalence rate give a better understanding of the likely real‐world performance and a greater understanding of the likely impact of variation in that prevalence on translation to the clinic.
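
The dependence on prevalence described in this abstract follows from Bayes' theorem. As a rough sketch (the function name and the 90% sensitivity and specificity are assumptions chosen for illustration, and this tabulates values rather than reproducing the article's plots), positive and negative predictive values can be computed across a range of prevalences:

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test at a given disease prevalence."""
    tp = sensitivity * prevalence              # true positives per unit population
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)

# Hypothetical assay: sensitivity and specificity fixed, prevalence varied.
for prev in (0.01, 0.05, 0.20, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed figures the positive predictive value falls sharply as prevalence drops, even though sensitivity and specificity (and hence the ROC curve) are unchanged.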

6.

Comparisons of estimators of the number of true null hypotheses and adaptive FDR procedures in multiplicity testing

《Journal of Statistical Computation and Simulation》2012,82(2):207-220

Many exploratory studies such as microarray experiments require the simultaneous comparison of hundreds or thousands of genes. It is common to see that most genes in many microarray experiments are not expected to be differentially expressed. Under such a setting, a procedure that is designed to control the false discovery rate (FDR) is aimed at identifying as many potential differentially expressed genes as possible. The usual FDR controlling procedure is constructed based on the number of hypotheses. However, it can become very conservative when some of the alternative hypotheses are expected to be true. The power of a controlling procedure can be improved if the number of true null hypotheses (m₀) instead of the number of hypotheses is incorporated in the procedure [Y. Benjamini and Y. Hochberg, On the adaptive control of the false discovery rate in multiple testing with independent statistics, J. Edu. Behav. Statist. 25 (2000), pp. 60–83]. Nevertheless, m₀ is unknown, and has to be estimated. The objective of this article is to evaluate some existing estimators of m₀ and discuss the feasibility of these estimators in incorporating into FDR controlling procedures under various experimental settings. The results of simulations can help the investigator to choose an appropriate procedure to meet the requirement of the study.
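
The idea of replacing the number of hypotheses m by an estimate of m₀ can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy p-values, and the Storey-type estimator with threshold `lam` are choices made here and are not necessarily among the estimators the article compares.

```python
def bh_reject(pvalues, alpha=0.05, m0=None):
    """Benjamini-Hochberg step-up; if m0 is given it replaces m (adaptive form)."""
    m = len(pvalues)
    denom = m if m0 is None else max(m0, 1.0)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose ordered p-value is under its threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / denom:
            k = rank
    rejected = set(order[:k])
    return [i in rejected for i in range(m)]

def storey_m0(pvalues, lam=0.5):
    """Storey-type estimate of m0: p-values above lam are treated as mostly nulls."""
    return min(len(pvalues), sum(p > lam for p in pvalues) / (1 - lam))

# Hypothetical p-values for eight hypotheses.
p = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.6, 0.9]
print(sum(bh_reject(p)))                   # rejections using m
print(sum(bh_reject(p, m0=storey_m0(p))))  # rejections using the estimate of m0
```

On this toy input the adaptive version rejects more hypotheses than the standard one, which illustrates the gain in power the abstract refers to.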

7.

Diagonalization matrix and its application in distribution theory

Francisco J. Caro-Lopera 《Statistics》2016,50(4):870-880

Some matrix representations of diverse diagonal arrays are studied in this work; the results allow new definitions of classes of elliptical distributions indexed by kernels mixing Hadamard and usual products. A number of applications are derived in the setting of prior densities from the Bayesian multivariate regression model and families of non-elliptical distributions, such as the matrix multivariate generalized Birnbaum–Saunders density. The philosophy of the research about matrix representations of quadratic and inverse quadratic forms can be extended as a methodology for exploring possible new applications in non-standard distributions, matrix transformations and inference.

8.

On the relationship between the matrix operators, vech and vecd

Daisuke Nagakura 《Communications in Statistics - Theory and Methods》2018,47(13):3252-3268

We introduce a matrix operator, which we call “vecd” operator. This operator stacks up “diagonals” of a symmetric matrix. This operator is more convenient for some statistical analyses than the commonly used “vech” operator. We show an explicit relationship between the vecd and vech operators. Using this relationship, various properties of the vecd operator are derived. As applications of the vecd operator, we derive concise and explicit expressions of the Wald and score tests for equal variances of a multivariate normal distribution and for the diagonality of variance coefficient matrices in a multivariate generalized autoregressive conditional heteroscedastic (GARCH) model, respectively.
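
To make the two operators concrete, here is a small sketch. The vech definition (lower triangle stacked column by column) is the usual one; the ordering used for vecd (main diagonal first, then each successive off-diagonal) is an assumption made for illustration and may differ from the convention in the article.

```python
def vech(a):
    """Stack the lower-triangular part of a square matrix column by column."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j, n)]

def vecd(a):
    """Stack the diagonals: main diagonal first, then each lower off-diagonal."""
    n = len(a)
    return [a[i + d][i] for d in range(n) for i in range(n - d)]

# Hypothetical 3x3 symmetric matrix.
a = [[1, 2, 3],
     [2, 4, 5],
     [3, 5, 6]]
print(vech(a))  # [1, 2, 3, 4, 5, 6]
print(vecd(a))  # [1, 4, 6, 2, 5, 3]
```

Both vectors hold the same distinct elements in a different order, so under this assumed ordering one is a permutation of the other.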

9.

Bayesian Interval Estimation for the Difference in TPRs and FPRs of Two Diagnostic Tests with Unverified Negatives

Eileen M. Stock James D. Stamey Dean M. Young 《Communications in Statistics - Simulation and Computation》2015,44(2):505-524

We derive Bayesian interval estimators for the differences in the true positive rates and false positive rates of two dichotomous diagnostic tests applied to the members of two distinct populations. The populations have varying disease prevalences with unverified negatives. We compare the performance of the Bayesian credible interval to the Wald interval using Monte Carlo simulation for a spectrum of different TPRs, FPRs, and sample sizes. For the case of a low TPR and low FPR, we found that a Bayesian credible interval with relatively noninformative priors performed well. We obtain similar interval comparison results for the cases of a high TPR and high FPR, a high TPR and low FPR, and of a high TPR and mixed FPR after incorporating mildly informative priors.

10.

On the use of the selection matrix in the maximum likelihood estimation of normal distribution models with missing data

Keiji Takai 《Communications in Statistics - Theory and Methods》2018,47(14):3392-3407

In this article, by using the constant and random selection matrices, several properties of the maximum likelihood (ML) estimates and the ML estimator of a normal distribution with missing data are derived. The constant selection matrix allows us to obtain an explicit form of the ML estimates and the exact relationship between the EM algorithm and the score function. The random selection matrix allows us to clarify how the missing-data mechanism works in the proof of the consistency of the ML estimator, to derive the asymptotic properties of the sequence by the EM algorithm, and to derive the information matrix.

11.

On minimaxity of the normal precision matrix estimator of Krishnamoorthy and Gupta

Yo Sheena 《Statistics》2013,47(5):387-399

We consider the orthogonally invariant estimation problem of the inverse of the scale matrix of the Wishart distribution using Stein's loss (entropy loss). In this problem, Krishnamoorthy and Gupta [Improved minimax estimation of a normal precision matrix, Canad. J. Statist. 17 (1989), pp. 91–102] proposed an estimator and showed its good performance in a Monte Carlo simulation. They conjectured that their estimator is minimax. Perron [On a conjecture of Krishnamoorthy and Gupta, J. Multivariate Anal. 62 (1997), pp. 110–120] proved its minimaxity for p = 2. In this paper we prove it for p = 3 by using a new method.

12.

Risk of Error and the Kappa Coefficient of a Binary Diagnostic Test in the Presence of Partial Verification

J. A. Roldán Nofuentes J. D. Luna Del Castillo 《Journal of applied statistics》2007,34(8):887-898

The accuracy of a binary diagnostic test is usually measured in terms of its sensitivity and its specificity, or through positive and negative predictive values. Another way to describe the validity of a binary diagnostic test is the risk of error and the kappa coefficient of the risk of error. The risk of error is the average loss that is caused when incorrectly classifying a non-diseased or a diseased patient, and the kappa coefficient of the risk of error is a measure of the agreement between the diagnostic test and the gold standard. In the presence of partial verification of the disease, the disease status of some patients is unknown, and therefore the evaluation of a diagnostic test cannot be carried out through the traditional method. In this paper, we have deduced the maximum likelihood estimators and variances of the risk of error and of the kappa coefficient of the risk of error in the presence of partial verification of the disease. Simulation experiments have been carried out to study the effect of the verification probabilities on the coverage of the confidence interval of the kappa coefficient.

13.

On upper bounds for the characteristic values of the covariance matrix for multinomial, Dirichlet and multivariate hypergeometric distributions

S. Huschens 《Statistical Papers》1990,31(1):155-159

For the characteristic values τᵢ of the matrix V := Diag(p) − ppᵀ with p = (p₁, ..., pₖ), p₁ ≥ p₂ ≥ ... ≥ pₖ ≥ pₖ₊₁ > 0 and p₁ + p₂ + ... + pₖ + pₖ₊₁ = 1, the inequalities p₁ ≥ τ₁ ≥ p₂ ≥ τ₂ ≥ ... ≥ pₖ ≥ τₖ > 0 are given by Ronning (1982). These inequalities give, if p and pₖ₊₁ are unknown, the upper bound 1 ≥ τ₁. However, in this note the bound 1/2 ≥ τ₁ is derived. V is proportional to the covariance matrix for multinomial, Dirichlet and multivariate hypergeometric distributions. A statistical application for the multinomial distribution is given.
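
The bounds quoted in this abstract can be checked numerically for a particular probability vector. The sketch below is an illustration only: it approximates the largest characteristic value of V = Diag(p) − ppᵀ by power iteration (a technique chosen here, not taken from the note), and the function name, starting vector, iteration count, and example vector are all assumptions.

```python
def largest_eigenvalue(p, iterations=500):
    """Approximate the largest eigenvalue of Diag(p) - p p^T by power iteration."""
    k = len(p)
    v = [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(k)]
         for i in range(k)]
    x = [float(i + 1) for i in range(k)]  # arbitrary nonzero starting vector
    lam = 0.0
    for _ in range(iterations):
        y = [sum(v[i][j] * x[j] for j in range(k)) for i in range(k)]
        norm = sum(t * t for t in y) ** 0.5
        if norm == 0.0:
            return 0.0
        x = [t / norm for t in y]
        # Rayleigh quotient x^T V x for the unit vector x
        lam = sum(x[i] * sum(v[i][j] * x[j] for j in range(k)) for i in range(k))
    return lam

# Hypothetical example with k = 3; the remaining mass 0.1 is the (k+1)-th probability.
p = [0.5, 0.3, 0.1]
tau1 = largest_eigenvalue(p)
print(tau1, p[1] <= tau1 <= p[0], tau1 <= 0.5)
```

Because V is positive semidefinite, the Rayleigh quotient never exceeds the largest characteristic value, so the printed estimate is a lower approximation that should lie between the two largest entries of p and below 1/2.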

14.

On the information-based measure of covariance complexity and its application to the evaluation of multivariate linear models

Hamparsum Bozdogan 《Communications in Statistics - Theory and Methods》2013,42(1):221-278

This paper introduces a new information-theoretic measure of complexity called ICOMP as a decision rule for model selection and evaluation for multivariate linear models. The development of ICOMP is based on the generalization and utilization of the covariance complexity index of van Emden (1971) in estimation of the multivariate linear model. ICOMP is motivated by Akaike's (1973) Information Criterion (AIC), but it is a different procedure than AIC. In linear or nonlinear statistical models ICOMP uses an information-based characterization of: (i) the covariance matrix properties of the parameter estimates of a model starting from their finite sampling distributions, and (ii) the complexity of the inverse-Fisher information matrix (i-FIM) as a new criterion of achievable accuracy of the model. As a result, it provides a trade-off between the accuracy of the parameter estimates and the interaction of the residuals of a model via the measure of complexity of their respective covariances. It controls the risks of both insufficient and overparameterized models, and incorporates the assumption of dependence and the independence of the residuals in one criterion function. A model with minimum ICOMP is chosen to be the best model among all possible competing alternative models. ICOMP relieves the researcher of any need to consider the parameter dimension of a model explicitly. A real numerical example is shown in subset selection of variables in multivariate regression analysis to demonstrate the utility and versatility of the new approach.

15.

Some properties of the Dirichlet-multinomial distribution and its use in prior elicitation

Kathryn Chaloner George T. Duncan 《Communications in Statistics - Theory and Methods》2013,42(2):511-523

Two results on the unimodality of the Dirichlet-multinomial distribution are proved, and a further result is also proved on the identifiability of mixtures of multinomial distributions. These properties are used in developing a method for eliciting a Dirichlet prior distribution. The elicitation method is based on the mode, and region around the mode, of the Dirichlet-multinomial predictive distribution.

16.

Plug-in L₂-upper error bounds in deconvolution, for a mixing density estimate in Rᵈ and for its derivatives, via the L₁-error for the mixture

《Statistics》2012,46(6):1251-1268

17.

On the accuracy in high‐dimensional linear models and its application to genomic selection

Charles‐Elie Rabier Brigitte Mangin Simona Grusea 《Scandinavian Journal of Statistics》2019,46(1):289-313

Genomic selection is today a hot topic in genetics. It consists in predicting breeding values of selection candidates, using the large number of genetic markers now available owing to the recent progress in molecular biology. One of the most popular methods chosen by geneticists is ridge regression. We focus on some predictive aspects of ridge regression and present theoretical results regarding the accuracy criteria, that is, the correlation between predicted value and true value. We show the influence of singular values, the regularization parameter, and the projection of the signal on the space spanned by the rows of the design matrix. Asymptotic results in a high‐dimensional framework are given; in particular, we prove that the convergence to optimal accuracy highly depends on a weighted projection of the signal on each subspace. We discuss how to improve the prediction. Last, illustrations on simulated and real data are presented.

18.

On the robustness properties for maximum likelihood estimators of parameters in exponential power and generalized t distributions

《Communications in Statistics - Theory and Methods》2012,41(3):607-630

The robustness properties of maximum likelihood (ML) estimators of parameters in exponential power and generalized t distributions are examined together. The well-known asymptotic properties of ML estimators of location, scale and added skewness parameters in these distributions are studied. The ML estimators for location, scale and scale variant (skewness) parameters are represented as an iterative reweighting algorithm (IRA) to compute the estimates of these parameters simultaneously. Artificial data are generated to examine the performance of IRA for computing the ML estimates of the parameters simultaneously. We make a comparison between these two distributions to test the fitting performance on real data sets. The goodness of fit test and information criteria confirm that robustness and fitting performance should be considered together as a key modeling issue in order to obtain the best information from real data sets.

19.

On the almost sure convergence for sums of negatively superadditive dependent random vectors in Hilbert spaces and its application

Son Cong Ta Cuong Manh Tran Dung Van Le 《Communications in Statistics - Theory and Methods》2020,49(11):2770-2786

This paper develops almost sure convergence for sums of negatively superadditive dependent random vectors in Hilbert spaces. We obtain the Chung-type SLLN and the Jajte-type SLLN for sequences of negatively superadditive dependent random vectors in Hilbert spaces. The rate of convergence is studied by considering almost sure convergence to 0 of tail series. As an application, the almost sure convergence of degenerate von Mises statistics is investigated.

20.

On the correct regression function (in L₂) and its applications when the dimension of the covariate vector is random

Majid Mojirsheibani 《Journal of statistical planning and inference》2012

We derive the optimal regression function (i.e., the best approximation in the L₂ sense) when the vector of covariates has a random dimension. Furthermore, we consider applications of these results to problems in statistical regression and classification with missing covariates. It will be seen, perhaps surprisingly, that the correct regression function for the case with missing covariates can sometimes perform better than the usual regression function corresponding to the case with no missing covariates. This is because even if some of the covariates are missing, an indicator random variable δ, which is always observable, and is equal to 1 if there are no missing values (and 0 otherwise), may have far more information and predictive power about the response variable Y than the missing covariates do. We also propose kernel-based procedures for estimating the correct regression function nonparametrically. As an alternative estimation procedure, we also consider the least-squares method.