1.

Sparse discriminant analysis based on estimation of posterior probabilities

Akinori Hidaka Kenji Watanabe Takio Kurita 《Journal of applied statistics》2019,46(15):2761-2785

ABSTRACT

Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory of extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA indicates that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated by using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to modify FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis. By applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using last_exam17 standard datasets and a face classification experiment.
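The posterior-estimation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it only fits a multinomial logistic regression by plain gradient descent and returns estimated class posteriors, and it does not build the ONDA discriminant map. The function names (`fit_mlr`, `posterior`), the toy data, the step size and the penalty weight are all assumptions made for illustration. An L2 penalty is used because it keeps the update simple; an L1 penalty, which is what actually produces sparse weights, would need a proximal or subgradient step instead.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum so the exponentials cannot overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_mlr(X, y, n_classes, l2=0.01, lr=0.1, n_iter=500):
    """Fit multinomial logistic regression by gradient descent (illustrative)."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])   # append an intercept column
    Y = np.eye(n_classes)[y]               # one-hot class targets
    W = np.zeros((d + 1, n_classes))
    for _ in range(n_iter):
        P = softmax(Xb @ W)                # current posterior estimates
        grad = Xb.T @ (P - Y) / n          # gradient of the mean cross-entropy
        grad[:-1] += l2 * W[:-1]           # L2 penalty, intercept left unpenalised
        W -= lr * grad
    return W

def posterior(X, W):
    """Estimated class posterior probabilities for the rows of X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return softmax(Xb @ W)

# Hypothetical toy data: three Gaussian classes in two dimensions.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y = np.repeat(np.arange(3), 50)
X = means[y] + rng.normal(size=(150, 2))

W = fit_mlr(X, y, n_classes=3)
P = posterior(X, W)
train_acc = float((P.argmax(axis=1) == y).mean())
```

Each row of `P` sums to one, so the rows can be read as estimated posterior probabilities for the three classes.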

2.

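A simulation comparison of discriminant analysis and logistic regression (total citations: 3; self-citations: 2; citations by others: 3)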

Zhang Chubing, Gao Kang, Yang Guijun 《Statistics & Information Forum》2010,25(1):19-25

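Using random simulation, this study examines the resubstitution (back-classification) accuracy of discriminant analysis and logistic regression classification. The simulation results show that logistic regression achieves higher resubstitution accuracy than discriminant analysis. As the random error increases, the difference in accuracy between the two methods gradually shrinks. Once the random error exceeds a certain threshold, the resubstitution accuracy of logistic regression falls below that of discriminant analysis. Building on the simulation, a modified logistic regression classification is introduced; the simulation results show that it slightly outperforms logistic regression.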

3.

A table of predictive success probabilities for logistic regression

S. G. Meester D. Eaves 《Communications in Statistics - Simulation and Computation》2013,42(4):1137-1139

A table of expected success rates under normally distributed success logit, used in conjunction with logistic regression analysis, enables easy calculation of expected win for betting on success of a future dichotomous trial.
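The quantity tabulated in such a table can be sketched numerically. Assuming the success logit Z is normal with mean `mu` and standard deviation `sigma`, the expected success rate is the mean of the logistic function of Z. The sketch below approximates it with Gauss-Hermite quadrature; the function name, the parameterisation and the number of nodes are assumptions, and this is not the authors' tabulation method.

```python
import numpy as np

def expected_success(mu, sigma, n_nodes=40):
    """Approximate E[1 / (1 + exp(-Z))] for Z ~ Normal(mu, sigma**2)."""
    # Nodes and weights for integrals of the form  int exp(-x**2) f(x) dx.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    z = mu + np.sqrt(2.0) * sigma * x      # change of variable to Normal(mu, sigma**2)
    return float(np.sum(w / (1.0 + np.exp(-z))) / np.sqrt(np.pi))

p_symmetric = expected_success(0.0, 1.0)   # logit centred at zero
p_certain = expected_success(2.0, 0.0)     # no uncertainty in the logit
p_shrunk = expected_success(1.0, 1.0)      # uncertainty pulls the rate towards 1/2
```

With `sigma = 0` the result reduces to the plain logistic function of `mu`; for a positive `mu`, increasing `sigma` moves the expected success rate towards one half.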

4.

Asymptotic Optimality of Sparse Linear Discriminant Analysis with Arbitrary Number of Classes

Ruiyan Luo Xin Qi 《Scandinavian Journal of Statistics》2017,44(3):598-616

Many sparse linear discriminant analysis (LDA) methods have been proposed to overcome the major problems of the classic LDA in high‐dimensional settings. However, the asymptotic optimality results are limited to the case with only two classes. When there are more than two classes, the classification boundary is complicated and no explicit formulas for the classification errors exist. We consider the asymptotic optimality in the high‐dimensional settings for a large family of linear classification rules with arbitrary number of classes. Our main theorem provides easy‐to‐check criteria for the asymptotic optimality of a general classification rule in this family as dimensionality and sample size both go to infinity and the number of classes is arbitrary. We establish the corresponding convergence rates. The general theory is applied to the classic LDA and the extensions of two recently proposed sparse LDA methods to obtain the asymptotic optimality.

5.

QDF misclassification probabilities for known population parameters

C. K. Bayne W. Y. Tan 《Communications in Statistics - Theory and Methods》2013,42(22):2315-2326

Approximated QDF misclassification probabilities have been derived for bivariate normal populations with known parameter values. The effects of unequal covariances and population distance on the misclassification probabilities are examined.

6.

Minimum Sample Size Considerations for Two-Group Linear and Quadratic Discriminant Analysis with Rare Populations

Shannon Zavorka Jamis J. Perrett 《Communications in Statistics - Simulation and Computation》2013,42(7):1726-1739

Linear discriminant analysis and quadratic discriminant analysis are used to predict group membership. Rare populations present situations in which group sizes differ drastically. This article examined k = 2 and k = 4 predictor variables for groups with different levels of rarity and different levels of sensitivity and specificity. Sample size recommendations were generated for both minimum and maximum group overlap using the leave-one-out (L-O-O) method of estimation. Minimum sample size recommendations are provided in tables for immediate implementation by applied researchers.

7.

Selection without (unfair) discrimination

Thomes Johnson 《Communications in Statistics - Theory and Methods》2013,42(11):1079-1098

A method is devised for performing multiple discriminant analysis subject to inequality constraints on the probabilities of misassignment of different subpopulations. This procedure is motivated by attempts to devise fair means of selection of applicants for schools, jobs, and credit. An algorithm is developed and sample calculations are given.

8.

Discriminant coordinates analysis for multivariate functional data

Zofia Hanusz Mirosław Krzyśko Rafał Nadulski Łukasz Waszak 《Communications in Statistics - Theory and Methods》2020,49(18):4506-4519

Abstract

One of the basic statistical methods of dimensionality reduction is the analysis of discriminant coordinates given by Fisher (1936) and Rao (1948). The space of discriminant coordinates is a space convenient for presenting multidimensional data originating from multiple groups and for the use of various classification methods (methods of discriminant analysis). In the present paper, we adapt the classical discriminant coordinates analysis to multivariate functional data. The theory has been applied to the analysis of textural properties of apples of six varieties, measured over a period of 180 days, stored in two types of refrigeration chamber.

9.

On the level probabilities for useful partially ordered alternatives in the analysis of variance

Hoshino Naoto Miyazaki Haruo Seki Yoichi 《Communications in Statistics - Theory and Methods》2013,42(8):2059-2071

In the analysis of variance, we often encounter situations in which we want to test the null hypothesis of homogeneity of the normal means against various partially ordered alternative hypotheses. We study likelihood ratio tests for three useful types of alternatives: d-star, bipartite and broom tree. In particular, we give computational formulas for the level probabilities of these alternative types. The results permit us to obtain critical values for practical use.

10.

Sensitivity analysis on ruin probabilities with heavy-tailed claims

Gary K.C. Chan Hailiang Yang 《Statistical Methodology》2005,2(1):59-63

In this note, we consider the classical insurance risk model with heavy-tailed claim distributions. By using the Pollaczek–Khinchin Formula, we provide some sensitivity analysis on the ruin probability.
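For reference, the Pollaczek–Khinchin formula for the classical (Cramér–Lundberg) risk model is usually written as below. The notation is an assumption and may differ from the note's own: λ is the Poisson claim arrival rate, μ the mean claim size, c the premium rate, F the claim size distribution, F_I its integrated tail distribution, and F_I^{*n} the n-fold convolution of F_I.

```latex
\psi(u) = (1-\rho)\sum_{n=1}^{\infty} \rho^{n}\,\bigl(1 - F_I^{*n}(u)\bigr),
\qquad \rho = \frac{\lambda\mu}{c} < 1,
\qquad F_I(x) = \frac{1}{\mu}\int_0^x \bigl(1 - F(y)\bigr)\,dy .
```

Here ψ(u) is the probability of ruin starting from initial capital u, so a sensitivity analysis of this kind studies how ψ(u) responds to changes in quantities such as ρ or the parameters of F.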

11.

M-estimation Under a Two-Sample Semiparametric Model

Biao Zhang 《Scandinavian Journal of Statistics》2000,27(2):263-280

We consider M-estimation under a two-sample semiparametric model in which the log ratio of two unknown density functions has a known parametric form. This two-sample semiparametric model, arising naturally from case-control studies and logistic discriminant analysis, can be regarded as a biased sampling model. A new class of M-estimators is constructed on the basis of the maximum semiparametric likelihood estimator of the underlying distribution function. It is shown that the proposed M-estimators are consistent and asymptotically normally distributed. A simulation study is presented to demonstrate the performance of the proposed M-estimators.

12.

Bayesian approach to estimation of intraclass correlation using reference prior

Younshik Chung Dipak K. Dey 《Communications in Statistics - Theory and Methods》2013,42(9):2241-2255

For the balanced variance component model when the intraclass correlation coefficient is of interest, Bayesian analysis is often appropriate. Berger and Bernardo’s (1992a) grouped ordering reference prior approach is used to analyze this model. The reference priors are developed and compared for the posterior inference with real and simulated data. We examine whether the reference priors satisfy the probability-matching criterion. Further, the reference prior is shown to be good in the sense of correct frequentist coverage probability of the posterior quantile.

13.

A ridge-type estimator and good prior means

Jeffrey L Pliskin 《Communications in Statistics - Theory and Methods》2013,42(12):3429-3437

Swindel (1976) introduced a modified ridge regression estimator based on prior information. A necessary and sufficient condition is derived for Swindel's proposed estimator to have lower risk than the conventional ordinary ridge regression estimator when both estimators are computed using the same value of k.
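A ridge estimator that shrinks towards a prior mean can be sketched as follows. The form used here, (X'X + kI)^{-1}(X'y + k b0), is the one commonly attributed to Swindel (1976); the abstract does not state the formula, so treat it as an assumption. The function name and the toy data are also hypothetical.

```python
import numpy as np

def ridge_with_prior_mean(X, y, k, b0):
    """Ridge-type estimate that shrinks towards the prior mean b0 (illustrative)."""
    p = X.shape[1]
    # Solve (X'X + k I) beta = X'y + k b0 without forming an explicit inverse.
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y + k * b0)

# Hypothetical toy regression problem.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=100)
b0 = np.array([1.0, -1.0, 0.0])                          # assumed prior mean

ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_k0 = ridge_with_prior_mean(X, y, 0.0, b0)           # k = 0: least squares
beta_big = ridge_with_prior_mean(X, y, 1e9, b0)          # very large k: close to b0
beta_ridge = ridge_with_prior_mean(X, y, 5.0, np.zeros(3))  # b0 = 0: ordinary ridge
```

Setting `b0` to zero recovers the ordinary ridge estimator, which is the comparison the abstract makes at a common value of k.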

14.

Comparison of proportions with asymmetric prior information

Julia Mortera Aristide San Martini 《Communications in Statistics - Theory and Methods》2013,42(9):3205-3222

In this paper a Bayesian model is developed for comparing two binomial proportions. A two stage hierarchical prior distribution is used to represent prior dependence. Prior exchangeability and independence are shown to be but special cases. The relevant distributions have to be computed numerically and some examples are presented.
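The independence special case mentioned in this abstract is easy to sketch numerically. The example below assumes independent Beta(a, b) priors on the two proportions, which is not the paper's two-stage hierarchical prior, and estimates the posterior probability that the first proportion exceeds the second by Monte Carlo. The function name, the default prior parameters and the example counts are hypothetical.

```python
import numpy as np

def prob_p1_greater(s1, n1, s2, n2, a=1.0, b=1.0, draws=200_000, seed=0):
    """Monte Carlo estimate of P(p1 > p2 | data) under independent Beta(a, b) priors."""
    rng = np.random.default_rng(seed)
    # Beta priors are conjugate to binomial data: the posterior is Beta(a + s, b + n - s).
    p1 = rng.beta(a + s1, b + n1 - s1, size=draws)
    p2 = rng.beta(a + s2, b + n2 - s2, size=draws)
    return float(np.mean(p1 > p2))

same_data = prob_p1_greater(10, 20, 10, 20)   # identical samples
clear_gap = prob_p1_greater(18, 20, 5, 20)    # first group succeeds far more often
```

With identical data in both groups the estimate is close to one half; a dependent prior, as in the paper, would change how strongly the two posteriors inform each other.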

15.

Bayesian optimization analysis with ML-II ε-contaminated prior

Pankaj Sinha Ashok K. Bansal 《Journal of applied statistics》2008,35(2):203-211

In this paper we derive the predictive density function of a future observation when the prior distribution for the unknown mean of a normal population is a Type-II maximum likelihood ε-contaminated prior. The derived predictive distribution is applied to the problem of optimization of a regression nature in the decisive prediction framework.

16.

Discriminant analyses of peanut allergy severity scores

O. Collignon J.-M. Monnez P. Vallois F. Codreanu J.-M. Renaudin G. Kanny 《Journal of applied statistics》2011,38(9):1783-1799

Peanut allergy is one of the most prevalent food allergies. The possibility of a lethal accidental exposure and the persistence of the disease make it a public health problem. Evaluating the intensity of symptoms is accomplished with a double blind placebo-controlled food challenge (DBPCFC), which scores the severity of reactions and measures the dose of peanut that elicits the first reaction. Since DBPCFC can result in life-threatening responses, we propose an alternate procedure with the long-term goal of replacing invasive allergy tests. Discriminant analyses of DBPCFC score, the eliciting dose and the first accidental exposure score were performed in 76 allergic patients using 6 immunoassays and 28 skin prick tests. A multiple factorial analysis was performed to assign equal weights to both groups of variables, and predictive models were built by cross-validation with linear discriminant analysis, k-nearest neighbours, classification and regression trees, penalized support vector machine, stepwise logistic regression and AdaBoost methods. We developed an algorithm for simultaneously clustering eliciting dose values and selecting discriminant variables. Our main conclusion is that antibody measurements offer information on the allergy severity, especially those directed against rAra-h1 and rAra-h3. Further independent validation of these results and the use of new predictors will help extend this study to clinical practices.

17.

Bayesian analysis for confirmatory factor model with finite-dimensional Dirichlet prior mixing

Xia Yemao Pan Maolin 《Communications in Statistics - Theory and Methods》2017,46(9):4599-4619

The confirmatory factor analysis (CFA) model is a useful multivariate statistical tool for interpreting relationships between latent variables and manifest variables. Statistical results based on a single CFA are often seriously distorted when the data set is heterogeneous. To address the heterogeneity resulting from the multivariate responses, we propose a Bayesian semiparametric model for CFA. The approach relies on using a prior over the space of mixing distributions with finite components. A blocked Gibbs sampler is implemented to carry out the posterior analysis. Results obtained from a simulation study and a real data set are presented to illustrate the methodology.

18.

On the ROC surface for multi-population discriminant analysis and some of its properties

Lin Changhao 《Statistics & Information Forum》2008,23(5):15-18

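To extend the ROC curve of two-population discriminant analysis to the multi-population case, the ROC surface for multi-population discriminant analysis is obtained from a transformation of the two-population ROC curve, and some of its properties are studied.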

19.

Simulation‐based sample‐sizing and power calculations in logistic regression with partial prior information

Andrew P. Grieve Shah‐Jalal Sarker 《Pharmaceutical statistics》2016,15(6):507-516

There have been many approximations developed for sample sizing of a logistic regression model with a single normally‐distributed stimulus. Despite this, it has been recognised that there is no consensus as to the best method. In pharmaceutical drug development, simulation provides a powerful tool to characterise the operating characteristics of complex adaptive designs and is an ideal method for determining the sample size for such a problem. In this paper, we address some issues associated with applying simulation to determine the sample size for a given power in the context of logistic regression. These include efficient methods for evaluating the convolution of a logistic function and a normal density and an efficient heuristic approach to searching for the appropriate sample size. We illustrate our approach with three case studies. Copyright © 2016 John Wiley & Sons, Ltd.

20.

Bayesian multivariate normal analysis with a Wishart prior

A. Bekker J.J.J. Roux 《Communications in Statistics - Theory and Methods》2013,42(10):2485-2497

This paper considers the Bayesian analysis of the multivariate normal distribution when its covariance matrix has a Wishart prior density under the assumption of a multivariate quadratic loss function. New flexible marginal posterior distributions of the mean μ and of the covariance matrix Σ are developed and univariate cases with graphical representations are given.