A Bayesian discovery procedure
Michele Guindani, Peter Müller, Song Zhang 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(5):905-925
Summary. We discuss a Bayesian discovery procedure for multiple-comparison problems. We show that, under a coherent decision theoretic framework, a loss function combining true positive and false positive counts leads to a decision rule that is based on a threshold of the posterior probability of the alternative. Under a semiparametric model for the data, we show that the Bayes rule can be approximated by the optimal discovery procedure, which was recently introduced by Storey. Improving the approximation leads us to a Bayesian discovery procedure, which exploits the multiple shrinkage in clusters that are implied by the assumed non-parametric model. We compare the Bayesian discovery procedure and the optimal discovery procedure estimates in a simple simulation study and in an assessment of differential gene expression based on microarray data from tumour samples. We extend the setting of the optimal discovery procedure by discussing modifications of the loss function that lead to different single-thresholding statistics. Finally, we provide an application of the previous arguments to dependent (spatial) data.
Similar articles:
This paper considers statistical reliability for discrete failure data and the selection of the best geometric distribution, that is, the one with the smallest failure probability, from among several competitors. Using the Bayesian approach, a Bayes selection rule based on type-I censored data is derived and its associated monotonicity is also obtained. An early selection rule, which allows a selection to be made before the censoring time of the life testing experiment, is proposed. This early selection rule can be shown to be equivalent to the Bayes selection rule. An illustrative example is given to demonstrate the use and the performance of the early selection rule.
Miao-Yu Tsai 《Communications in Statistics - Theory and Methods》2013,42(16):2849-2864
The test of variance components of possibly correlated random effects in generalized linear mixed models (GLMMs) can be used to examine whether heterogeneous effects exist. The Bayesian test with Bayes factors offers a flexible method. In this article, we focus on the performance of Bayesian tests under three reference priors and a conjugate prior: an approximate uniform shrinkage prior, modified approximate Jeffreys' prior, half-normal unit information prior and Wishart prior. To compute Bayes factors, we propose a hybrid approximation approach combining a simulated version of Laplace's method and importance sampling techniques to test the variance components in GLMMs.
This paper contributes to the emerging Bayesian literature on treatment effects. It derives treatment parameters in the framework of a potential outcomes model with a treatment choice equation, where the correlation between the unobservable components of the model is driven by a low-dimensional vector of latent factors. The analyst is assumed to have access to a set of measurements generated by the latent factors. This approach has attractive features from both theoretical and practical points of view. Not only does it address the fundamental identification problem arising from the inability to observe the same person in both the treated and untreated states, but it also turns out to be straightforward to implement. Formulae are provided to compute mean treatment effects as well as their distributional versions. A Monte Carlo simulation study is carried out to illustrate how the methodology can easily be applied.
Higher dimensional surfaces are used to examine the diagnostic performance of multiclass classification systems. These surfaces are extensions of the ROC curve and are known as ROC surfaces or manifolds. Manifolds may be constructed from either the correct classifications or from the misclassifications of the diagnostic system. Comparisons of the usefulness of each of these ROC manifolds with respect to the performance of the diagnostic system are made with emphasis on inferences from volume under the surface and optimal operating points (thresholds) of the system. Recommendations for when to use each type of ROC manifold and performance measure are discussed.
This paper explores the estimation of the area under the ROC curve when test scores are subject to errors. The naive approach that ignores measurement errors generally yields inconsistent estimates. Finding the asymptotic bias of the naive estimator, Coffin and Sukhatme (1995, 1997) proposed bias-corrected estimators for parametric and nonparametric cases. However, the asymptotic distributions of these estimators have not been developed because of their complexity. We propose several alternative approaches, including the SIMEX procedure of Cook and Stefanski (1994). We also provide the asymptotic distributions of the SIMEX estimators for use in statistical inference. Small simulation studies illustrate that the SIMEX estimators perform reasonably well when compared to the bias-corrected estimators.
Various mathematical and statistical models for estimation of automobile insurance pricing are reviewed. The methods are compared on their predictive ability based on two sets of automobile insurance data for two different states collected over two different periods. The issue of model complexity versus data availability is resolved through a comparison of the accuracy of prediction. The models reviewed range from the use of simple cell means to various multiplicative-additive schemes to the empirical-Bayes approach. The empirical-Bayes approach, with prediction based on both model-based and individual cell estimates, seems to yield the best forecast.
《Journal of Statistical Computation and Simulation》2012,82(2):415-427
Dynamic regression models are widely used because they express and model the behaviour of a system over time. In this article, two dynamic regression models, the distributed lag (DL) model and the autoregressive distributed lag model, are evaluated focusing on their lag lengths. From a classical statistics point of view, there are various methods to determine the number of lags, but none of them is best in all situations. This is a serious issue, since wrong choices will provide bad estimates for the effects of the regressors on the response variable. We present an alternative for the aforementioned problems by considering a Bayesian approach. The posterior distributions of the numbers of lags are derived under an improper prior for the model parameters. The fractional Bayes factor technique [A. O'Hagan, Fractional Bayes factors for model comparison (with discussion), J. R. Statist. Soc. B 57 (1995), pp. 99–138] is used to handle the indeterminacy in the likelihood function caused by the improper prior. The zero-one loss function is used to penalize wrong decisions. A naive method using the specified maximum number of DLs is also presented. The proposed and the naive methods are verified using simulation data. The results are promising for the proposed method. An illustrative example with a real data set is provided.
Hiroyuki Kashima 《Statistical Papers》2005,46(4):523-540
This paper shows that a minimax Bayes rule and shrinkage estimators can be effectively applied to portfolio selection under the Bayesian approach. Specifically, it is shown that the portfolio selection problem can result in a statistical decision problem in some situations. Following that, we present a method for solving a problem involved in portfolio selection under the Bayesian approach.
Lili Tian, Chengjie Xiong, Chin-Ying Lai, Albert Vexler 《Journal of statistical planning and inference》2011,141(1):549-558
In cases with three ordinal diagnostic groups, the important measures of diagnostic accuracy are the volume under surface (VUS) and the partial volume under surface (PVUS), which are the extended forms of the area under curve (AUC) and the partial area under curve (PAUC). This article addresses confidence interval estimation of the difference in paired VUSs and the difference in paired PVUSs. To focus especially on studies with small to moderate sample sizes, we propose an approach based on the concepts of generalized inference. A Monte Carlo study demonstrates that the proposed approach generally can provide confidence intervals with reasonable coverage probabilities even at small sample sizes. The proposed approach is compared to a parametric bootstrap approach and a large sample approach through simulation. Finally, the proposed approach is illustrated via an application to a data set of blood test results of anemia patients.
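As a rough illustration of the VUS mentioned above, the sketch below computes a plain empirical VUS as the share of triples, one score from each of three ordered groups, that fall in the expected order. This is not the generalized-inference method of the article; the function name `empirical_vus`, the toy data, and the choice to count ties as misordered are all assumptions made for illustration.

```python
from itertools import product


def empirical_vus(group1, group2, group3):
    """Share of triples (x1, x2, x3), one score per group, with x1 < x2 < x3.

    Assumption: ties are counted as misordered, which is only one of several
    possible conventions.
    """
    ordered = sum(
        1 for x1, x2, x3 in product(group1, group2, group3) if x1 < x2 < x3
    )
    return ordered / (len(group1) * len(group2) * len(group3))


# Hypothetical scores for three ordinal diagnostic groups.
mild = [1.0, 2.0]
moderate = [3.0, 4.0]
severe = [5.0, 3.5]
vus = empirical_vus(mild, moderate, severe)
```

For a marker that carries no information, the VUS is about 1/6, compared with 1/2 for the AUC in the two-group case.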
Gerda Claeskens, Bing‐Yi Jing, Liang Peng, Wang Zhou 《Revue canadienne de statistique》2003,31(2):173-190
Abstract: The authors derive empirical likelihood confidence regions for the comparison distribution of two populations whose distributions are to be tested for equality using random samples. Another application they consider is to ROC curves, which are used to compare measurements of a diagnostic test from two populations. The authors investigate the smoothed empirical likelihood method for estimation in this context, and empirical likelihood based confidence intervals are obtained by means of the Wilks theorem. A bootstrap approach allows for the construction of confidence bands. The method is illustrated with data analysis and a simulation study.
Dependent masking and system life data analysis: Bayesian inference for two-component systems
Data from field operations of a system is often used to estimate the reliability of components. Under ideal circumstances, this system field data contains the time to failure along with information on the exact component responsible for the system failure. However, in many cases, the exact component causing the failure of the system cannot be identified, and is considered to be masked. Previously developed models for estimation of component reliability from masked system life data have been based upon the assumption that masking occurs independently of the true cause of system failure. In this paper we develop a Bayesian methodology for estimating component reliabilities from masked system life data when the probability of masking is dependent upon the true cause of system failure. The Bayesian approach is illustrated for the case of a two-component system of exponentially distributed components.
It is common practice to use a hierarchical Bayesian model to inform a pediatric randomized controlled trial (RCT) with adult data, using a prespecified borrowing fraction parameter (BFP). This implicitly assumes that the BFP is intuitive and corresponds to the degree of similarity between the populations. Generalizing this model to any historical studies naturally leads to empirical Bayes meta-analysis. In this paper we calculate the Bayesian BFPs and study the factors that drive them. We prove that simultaneous mean squared error reduction relative to an uninformed model is always achievable through application of this model. Power and sample size calculations for a future RCT, designed to be informed by multiple external RCTs, are also provided. Potential applications include inference on treatment efficacy from independent trials involving either heterogeneous patient populations or different therapies from a common class.
Abstract. This paper considers simultaneous estimation of means from several strata. A model-based approach is taken, where the covariates in the superpopulation model are subject to measurement errors. Empirical Bayes (EB) and Hierarchical Bayes estimators of the strata means are developed and asymptotic optimality of EB estimators is proved. Their performances are examined and compared with that of the sample mean in a simulation study as well as in data analysis.
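Methods for assessing the accuracy of statistical data can be classified by examining the source and form of the auxiliary information, the way the reference standard containing key information about the true value of the target characteristic is constructed, and the logic used to compare the actual statistical figures with that reference standard. Accuracy assessment of aggregate statistics is carried out mainly along the longitudinal time dimension, judging by how far the trend of a statistical indicator deviates from the trends of its related indicators; accuracy assessment of individual or categorical statistics is carried out mainly along the cross-sectional spatial dimension, by testing the statistical distribution of the data or by using repeated surveys or randomized experiments to estimate pre-specified error parameters. Different methods have different conditions of applicability. In practice, the assessment method should be chosen according to how detailed the available auxiliary information is, and the factors that could lead the method to a wrong judgment should be carefully analyzed and ruled out, so that the assessment conclusions are persuasive and credible.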
《Journal of the Korean Statistical Society》2014,43(2):161-175
The area under the ROC curve (AUC) can be interpreted as the probability that the classification score of a diseased subject is larger than that of a non-diseased subject for a randomly sampled pair of subjects. From the perspective of classification, we want to find a way to separate two groups as distinctly as possible via AUC. When the difference of the scores of a marker is small, its impact on classification is less important. Thus, a new diagnostic/classification measure based on a modified area under the ROC curve (mAUC) is proposed, which is defined as a weighted sum of two AUCs, where the AUC with the smaller difference is assigned a lower weight, and vice versa. Using mAUC is robust in the sense that mAUC gets larger as AUC gets larger as long as they are not equal. Moreover, in many diagnostic situations, only a specific range of specificity is of interest. Under normal distributions, we show that if the AUCs of two markers are within similar ranges, the larger mAUC implies the larger partial AUC for a given specificity. This property of mAUC will help to identify the marker with the higher partial AUC, even when the AUCs are similar. Two nonparametric estimates of an mAUC and their variances are given. We also suggest the use of mAUC as the objective function for classification, and the use of the gradient Lasso algorithm for classifier construction and marker selection. Application to simulation datasets and real microarray gene expression datasets shows that our method finds a linear classifier with a higher ROC curve than some other existing linear classifiers, especially in the range of low false positive rates.
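The probabilistic reading of the AUC given at the start of this abstract can be checked directly by comparing every diseased score with every non-diseased score. The sketch below shows only that ordinary empirical AUC, not the mAUC, whose weights the abstract does not spell out. The function name `empirical_auc`, the toy scores, and the convention of counting a tie as half a success are assumptions.

```python
from itertools import product


def empirical_auc(diseased, non_diseased):
    """Fraction of (diseased, non-diseased) pairs where the diseased score is larger.

    Assumption: a tied pair contributes 1/2, a common but not universal convention.
    """
    total = 0.0
    for d, h in product(diseased, non_diseased):
        if d > h:
            total += 1.0
        elif d == h:
            total += 0.5
    return total / (len(diseased) * len(non_diseased))


# Hypothetical marker scores.
diseased_scores = [0.9, 0.8, 0.4]
non_diseased_scores = [0.7, 0.3, 0.2]
auc = empirical_auc(diseased_scores, non_diseased_scores)  # 8 of the 9 pairs are ordered correctly
```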
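It is well known that the ARCH test of Engle (1982) is not robust to misspecification of the conditional mean model. In particular, when the conditional mean is a nonlinear process but only a linear model is fitted, the test over-rejects a true null hypothesis, producing severe size distortion. This paper is therefore the first in the literature to use the Yeo-Johnson transformation to transform the dependent variable of the mean model so as to remove nonlinearity from the mean part of the ARCH process, and on that basis proposes a new robust ARCH test and a new GARCH model, the Yeo-Johnson (YJ) GARCH model. Monte Carlo simulations show that the robust ARCH test performs significantly better than Engle's (1982) ARCH test in terms of size and power. An empirical study of Shanghai Composite Index returns shows that the YJ-GARCH model fits significantly better than the linear GARCH model.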
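For readers unfamiliar with the transformation named in this abstract, the following is a minimal sketch of the standard Yeo-Johnson power transformation applied to a single value. It shows only the transformation itself. How the paper estimates the parameter, builds the robust ARCH test, or specifies the YJ-GARCH model is not given in the abstract and is not reproduced here. The function name `yeo_johnson` and the tolerance used to detect the limiting cases are assumptions.

```python
import math


def yeo_johnson(y, lam, tol=1e-12):
    """Standard Yeo-Johnson transform of a scalar y with power parameter lam.

    Unlike the Box-Cox transform, it is defined for negative y as well,
    which matters for series such as returns.
    """
    if y >= 0:
        if abs(lam) < tol:                       # limiting case lam = 0
            return math.log1p(y)
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < tol:                     # limiting case lam = 2
        return -math.log1p(-y)
    return -((1.0 - y) ** (2.0 - lam) - 1.0) / (2.0 - lam)


# Hypothetical return series; lam = 1 leaves the data unchanged.
returns = [-0.02, 0.0, 0.015]
unchanged = [yeo_johnson(r, 1.0) for r in returns]
compressed = [yeo_johnson(r, 0.5) for r in returns]
```

With `lam = 1` the transform is the identity, so a fitted value of the parameter away from 1 is what signals nonlinearity in the transformed variable.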
Szu-Peng Yang 《Communications in Statistics - Simulation and Computation》2017,46(8):6083-6105
This paper adopts a Bayesian strategy for generalized ridge estimation for high-dimensional regression. We also consider significance testing based on the proposed estimator, which is useful for selecting regressors. Both theoretical and simulation studies show that the proposed estimator can simultaneously outperform the ordinary ridge estimator and the LSE in terms of the mean square error (MSE) criterion. The simulation study also demonstrates the competitive MSE performance of our proposal with the Lasso under sparse models. We demonstrate the method using the lung cancer data involving high-dimensional microarrays.
When studying a regression model, measures of explained variation are used to assess the degree to which the covariates determine the outcome of interest. Measures of predictive accuracy are used to assess the accuracy of the predictions based on the covariates and the regression model. We give a detailed and general introduction to the two measures and the estimation procedures. The framework we set up allows for a study of the effect of misspecification on the quantities estimated. We also introduce a generalization to survival analysis.