Similar Documents
20 similar documents found.
1.
Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Among numerous classifiers, some are hard classifiers while others are soft ones. Soft classifiers explicitly estimate the class conditional probabilities and then perform classification based on the estimated probabilities. In contrast, hard classifiers directly target the classification decision boundary without producing probability estimates. These two types of classifiers are based on different philosophies, and each has its own merits. In this paper, we propose a novel family of large-margin classifiers, namely large-margin unified machines (LUMs), which covers a broad range of margin-based classifiers including both hard and soft ones. By offering a natural bridge from soft to hard classification, the LUM provides a unified algorithm to fit various classifiers and hence a convenient platform for comparing hard and soft classification. Both the theoretical consistency and the numerical performance of LUMs are explored. Our numerical study sheds some light on the choice between hard and soft classifiers in various classification problems.
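For context, the LUM loss has a simple closed form indexed by a > 0 and c ≥ 0, with c bridging soft (c = 0) and hard (c → ∞, hinge-like) classification. The sketch below is our own illustrative implementation of that loss, not the authors' code; the defaults and the evaluation grid are ours.

```python
import numpy as np

def lum_loss(u, a=1.0, c=0.0):
    """LUM loss V(u) of the functional margin u = y*f(x), with a > 0, c >= 0.

    c interpolates between soft classification (c = 0) and hard
    classification (c -> infinity, where V approaches the SVM hinge loss).
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))
    out = 1.0 - u                              # linear branch, u < c/(1+c)
    tail = u >= c / (1.0 + c)
    denom = (1.0 + c) * u[tail] - c + a        # > 0 on this branch since a > 0
    out[tail] = (1.0 / (1.0 + c)) * (a / denom) ** a
    return out

margins = np.linspace(-1.0, 2.0, 7)
print(lum_loss(margins, a=1.0, c=0.0))   # a soft, DWD-like member of the family
print(lum_loss(margins, a=1.0, c=1e6))   # numerically close to hinge max(0, 1-u)
```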

2.
Many large-margin classifiers such as the Support Vector Machine (SVM) sidestep estimating conditional class probabilities and target the discovery of classification boundaries directly. However, estimates of conditional class probabilities can be useful in many applications. Wang, Shen, and Liu (2008) bridged the gap by providing an interval estimator of the conditional class probability via bracketing. The interval estimator was achieved by applying different weights to the positive and negative classes and training the corresponding weighted large-margin classifiers. They proposed estimating the weighted large-margin classifiers individually. Empirically, however, the individually estimated classification boundaries may cross each other even though, theoretically, they should not. In this work, we propose a technique to ensure non-crossing of the estimated classification boundaries. Furthermore, we take advantage of the estimated conditional class probabilities to precondition our training data. The standard SVM is then applied to the preconditioned training data to achieve robustness. Simulations and real data are used to illustrate the finite-sample performance of the proposed methods.
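A rough sketch of the bracketing idea behind this line of work: train SVMs with a grid of class weights and bracket p(y = 1 | x) between the weights at which the prediction flips. The weighting convention, grid, and toy data are our assumptions, and the non-crossing adjustment that is this article's contribution is deliberately omitted.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = np.where(X[:, 0] + 0.5 * rng.normal(size=400) > 0, 1, -1)

# Weight pi on the negative class and 1 - pi on the positive class, so that
# the weighted classifier's sign estimates sign(p(x) - pi).
pis = np.linspace(0.05, 0.95, 19)
machines = [SVC(kernel="linear", class_weight={1: 1 - pi, -1: pi}).fit(X, y)
            for pi in pis]

def prob_interval(x):
    """Bracket p(y = 1 | x) between the weights where the prediction flips.

    With individually trained machines the boundaries may cross, in which
    case lo > hi can occur -- the defect this article's method removes."""
    preds = np.array([m.predict(x.reshape(1, -1))[0] for m in machines])
    pos = pis[preds == 1]
    lo = pos.max() if pos.size else 0.0
    neg = pis[preds == -1]
    hi = neg.min() if neg.size else 1.0
    return lo, hi

print(prob_interval(np.array([1.0, 0.0])))
```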

3.
Conventional multiclass conditional probability estimation methods, such as Fisher's discriminant analysis and logistic regression, often require restrictive distributional model assumptions. In this paper, a model-free estimation method is proposed to estimate multiclass conditional probabilities through a series of conditional quantile regression functions. Specifically, each conditional class probability is formulated as a difference of corresponding cumulative distribution functions, where the cumulative distribution functions are obtained by inverting the estimated conditional quantile regression functions. The proposed estimation method is also efficient in that its computational cost does not increase exponentially with the number of classes. Theoretical and numerical studies demonstrate that the proposed estimation method is highly competitive with existing methods, especially when the number of classes is relatively large.
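One way to picture the quantile-based construction, as a loose sketch under our own assumptions: scikit-learn's QuantileRegressor as the quantile estimator, cutpoints midway between adjacent class labels, and a fixed grid of quantile levels.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(600, 3))
score = X @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=600)
y = np.digitize(score, np.quantile(score, [0.25, 0.5, 0.75])) + 1  # classes 1..4

# One quantile-regression fit per level tau: the cost grows with the tau
# grid, not exponentially with the number of classes.
taus = np.linspace(0.05, 0.95, 19)
fits = [QuantileRegressor(quantile=t, alpha=0.0).fit(X, y) for t in taus]

def class_probs(x, n_classes=4):
    """P(Y = k | x) as successive differences of a CDF recovered by
    inverting the fitted conditional quantile functions."""
    q = np.array([f.predict(x.reshape(1, -1))[0] for f in fits])
    # F(k | x): fraction of quantile levels whose prediction is below k + 1/2.
    F = np.array([np.mean(q <= k + 0.5) for k in range(1, n_classes + 1)])
    F = np.maximum.accumulate(F)   # enforce monotonicity of the CDF
    F[-1] = 1.0                    # P(Y <= K) = 1 by construction
    return np.diff(np.concatenate([[0.0], F]))

print(class_probs(np.array([1.0, -1.0, 0.5])))
```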

4.
Fisher's linear discriminant analysis (FLDA) is a well-known method for finding a discriminative feature space for multi-class classification. As a theory extending FLDA to its ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA shows that the theoretically best nonlinear map for maximizing Fisher's discriminant criterion is formulated in terms of the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Owing to the limitations of the linear model, there is room to improve FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, in this paper, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis: by applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness into FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments on standard datasets and a face classification experiment.
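A loose two-step sketch of the idea, using scikit-learn stand-ins; the dataset, regularization strength, and the in-sample evaluation are ours, not the paper's.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
Xs = StandardScaler().fit_transform(X)

# Step 1: estimate the Bayesian a posteriori probabilities with a sparse
# (L1-regularized) multinomial logistic regression.
mlr = LogisticRegression(penalty="l1", solver="saga", C=1.0,
                         max_iter=5000).fit(Xs, y)
P = mlr.predict_proba(Xs)

# Step 2: run Fisher's discriminant analysis in the posterior-probability
# space, i.e. use the MLR posteriors as the nonlinear map of ONDA.
lda = LinearDiscriminantAnalysis().fit(P, y)
print("training accuracy:", lda.score(P, y))  # optimistic: same data twice
```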

5.
In this paper, a nonparametric discriminant analysis procedure that is less sensitive than traditional procedures to deviations from the usual assumptions is proposed. The procedure uses the projection pursuit methodology, with the two-group transvariation probability as the projection index. Montanari [A. Montanari, Linear discriminant analysis and transvariation, J. Classification 21 (2004), pp. 71–88] proposed and used this projection index to measure group separation but allocated new observations using distances. Our procedure instead allocates a new observation based on the group–group transvariation probability. A simulation study shows that the proposed procedure provides lower misclassification error rates than classical procedures such as linear and quadratic discriminant analysis, as well as recent procedures such as the maximum-depth and Montanari's transvariation-based classifiers, when the underlying distributions are skewed and/or the prior probabilities are unequal.
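A minimal sketch of the two-group transvariation index and a grid-based projection pursuit in two dimensions; the toy data and the direction grid are ours, and the paper's group–group allocation rule is not implemented.

```python
import numpy as np

def transvariation(a, b):
    """Two-group transvariation probability of projected samples a and b:
    the fraction of cross-group pairs ordered against their group means."""
    sign = np.sign(a.mean() - b.mean())
    return np.mean(sign * (a[:, None] - b[None, :]) < 0)

rng = np.random.default_rng(2)
g1 = rng.normal(0.0, 1.0, size=(100, 2))
g2 = rng.normal(1.5, 1.0, size=(100, 2))

# Projection pursuit over directions on the unit circle (2-D case): the most
# discriminating direction is the one minimizing the transvariation index.
thetas = np.linspace(0.0, np.pi, 360)
dirs = np.column_stack([np.cos(thetas), np.sin(thetas)])
vals = [transvariation(g1 @ d, g2 @ d) for d in dirs]
best = dirs[int(np.argmin(vals))]
print("best direction:", best, "index:", min(vals))
```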

6.
In this paper, we propose a new Bayesian inference approach for classification based on the traditional hinge loss of classical support vector machines, which we call the Bayesian Additive Machine (BAM). Unlike existing approaches, the new model has a semiparametric discriminant function in which some feature effects are nonlinear and others are linear. This separation of features is achieved automatically during model fitting, without user pre-specification. Following the literature on sparse regression for high-dimensional models, we can also identify the irrelevant features. By introducing spike-and-slab priors with two sets of indicator variables, these multiple goals are achieved simultaneously and automatically, without any parameter tuning such as cross-validation. An efficient partially collapsed Markov chain Monte Carlo algorithm is developed for posterior exploration, based on a data augmentation scheme for the hinge loss. Our simulations and three real data examples demonstrate that the new approach is a strong competitor to recently proposed methods for challenging high-dimensional classification problems.

7.
In this article, a sequential correction of two linear methods, linear discriminant analysis (LDA) and the perceptron, is proposed. The correction relies on sequentially appending additional features on which the classifier is retrained. These new features are the posterior probabilities determined by a basic classification method such as LDA or the perceptron. At each step, we add the probabilities obtained on a slightly different data set, because the vector of added probabilities varies from step to step. We therefore obtain many classifiers of the same type trained on slightly different data sets. Four sequential correction methods are presented, based on different combining schemes (e.g. the mean rule and the product rule). Experimental results on different data sets demonstrate that the corrections are effective and that this approach outperforms classical linear methods, providing a significant reduction in the mean classification error rate.
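A minimal sketch of the sequential correction with LDA as the base classifier and the mean combining rule; the step count, data, and in-sample evaluation are our choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Sequentially append the posterior probabilities of an LDA classifier as an
# extra feature and retrain; keep each stage's probabilities for combining.
Z = X.copy()
stage_probs = []
for step in range(4):
    clf = LinearDiscriminantAnalysis().fit(Z, y)
    p = clf.predict_proba(Z)[:, 1]
    stage_probs.append(p)
    Z = np.column_stack([Z, p])   # the slightly different data set for the next stage

# Mean rule across stages (the article also studies e.g. a product rule).
p_mean = np.mean(stage_probs, axis=0)
print("training error:", np.mean((p_mean > 0.5).astype(int) != y))
```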

8.
As no single classification method outperforms all others under all circumstances, decision-makers may solve a classification problem using several classification methods and examine their performance on the learning set. Based on this performance, better classification methods can be adopted and poor ones avoided. However, which single method best predicts the classification of new observations remains unclear, especially when several methods offer similar performance on the learning set. In this article we present various regression-based and classical methods that combine several classification methods to predict the classification of new observations. The quality of the combined classifiers is examined on real data; nonparametric regression proves to be the best method of combining classifiers.
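The combination step can be pictured as stacking; below, a k-nearest-neighbor meta-classifier stands in for the article's nonparametric regression combiner (the base learners and data are our choices, not the article's).

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=8, random_state=1)

combined = StackingClassifier(
    estimators=[("lda", LinearDiscriminantAnalysis()),
                ("logit", LogisticRegression(max_iter=1000)),
                ("svm", SVC(probability=True))],
    final_estimator=KNeighborsClassifier(n_neighbors=15),  # nonparametric combiner
    stack_method="predict_proba",
)
print(cross_val_score(combined, X, y, cv=5).mean())
```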

9.
Empirical Bayes is a versatile approach to “learn from a lot” in two ways: first, from a large number of variables and, second, from a potentially large amount of prior information, for example, stored in public repositories. We review applications of a variety of empirical Bayes methods to several well‐known model‐based prediction methods, including penalized regression, linear discriminant analysis, and Bayesian models with sparse or dense priors. We discuss “formal” empirical Bayes methods that maximize the marginal likelihood but also more informal approaches based on other data summaries. We contrast empirical Bayes to cross‐validation and full Bayes and discuss hybrid approaches. To study the relation between the quality of an empirical Bayes estimator and p, the number of variables, we consider a simple empirical Bayes estimator in a linear model setting. We argue that empirical Bayes is particularly useful when the prior contains multiple parameters, which model a priori information on variables termed “co‐data”. In particular, we present two novel examples that allow for co‐data: first, a Bayesian spike‐and‐slab setting that facilitates inclusion of multiple co‐data sources and types and, second, a hybrid empirical Bayes–full Bayes ridge regression approach for estimation of the posterior predictive interval.
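A compact "formal" empirical Bayes example in the linear-model setting: scikit-learn's BayesianRidge estimates its prior and noise precisions by maximizing the marginal likelihood (the simulated design below is ours).

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(3)
n, p = 100, 50
beta = np.concatenate([rng.normal(size=10), np.zeros(p - 10)])
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

# BayesianRidge estimates the prior precision (lambda_) and noise precision
# (alpha_) by maximizing the marginal likelihood -- an empirical Bayes
# counterpart of choosing the ridge penalty by cross-validation.
eb = BayesianRidge().fit(X, y)
print("EB-equivalent ridge penalty:", eb.lambda_ / eb.alpha_)
```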

10.
Penalized logistic regression (PLR) is a powerful statistical tool for classification and has been commonly used in many practical problems. Despite its success, the loss function of the PLR is unbounded, so the resulting classifiers can be sensitive to outliers. To build more robust classifiers, we propose the robust PLR (RPLR), which uses truncated logistic loss functions, and suggest three schemes to estimate conditional class probabilities. Connections of the RPLR to other existing work on robust logistic regression are discussed. Our theoretical results indicate that the RPLR is Fisher consistent and more robust to outliers. Moreover, we develop the estimated generalized approximate cross-validation (EGACV) criterion for tuning parameter selection. Through numerical examples, we demonstrate that truncating the loss function indeed yields better performance in terms of both classification accuracy and class probability estimation.
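A sketch of a truncated logistic loss and a direct non-convex fit; the truncation point, penalty, optimizer, and toy data are our assumptions, and the paper's three probability-estimation schemes and EGACV are not implemented.

```python
import numpy as np
from scipy.optimize import minimize

def truncated_logistic_loss(u, s=-1.0):
    """Logistic deviance log(1 + e^{-u}) capped at its value at u = s, so
    that badly misclassified points (gross outliers) stop pulling the fit."""
    return np.minimum(np.logaddexp(0.0, -u), np.logaddexp(0.0, -s))

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] > 0, 1.0, -1.0)
y[:15] *= -1.0                              # injected label-noise outliers

def objective(beta, lam=1e-2):
    return truncated_logistic_loss(y * (X @ beta)).mean() + lam * beta @ beta

# The truncated loss is non-convex; random restarts stand in here for the
# difference-of-convex algorithms usually employed for such losses.
best = min((minimize(objective, rng.normal(size=2), method="Nelder-Mead")
            for _ in range(5)), key=lambda r: r.fun)
print("fitted direction:", best.x / np.linalg.norm(best.x))
```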

11.
The risk of an individual woman having a pregnancy associated with Down's syndrome is estimated from her age and her α-fetoprotein, human chorionic gonadotropin, and pregnancy-specific β1-glycoprotein levels. The classical estimation method is based on discriminant analysis under the assumption of lognormality of the marker values, but logistic regression is also applied for data classification. In the present work, we compare the performance of the two methods using a dataset of almost 89,000 unaffected and 333 affected pregnancies. Assuming lognormality of the marker values, we also calculate the theoretical detection and false-positive rates for both methods.

12.
Most methods for variable selection work from the top down and steadily remove features until only a small number remain. They often rely on a predictive model, and there are usually significant disconnections in the sequence of methodologies that leads from the training samples to the choice of the predictor, then to variable selection, then to choice of a classifier, and finally to classification of a new data vector. In this paper we suggest a bottom‐up approach that brings the choices of variable selector and classifier closer together, by basing the variable selector directly on the classifier, removing the need to involve predictive methods in the classification decision, and enabling the direct and transparent comparison of different classifiers in a given problem. Specifically, we suggest ‘wrapper methods’, determined by classifier type, for choosing variables that minimize the classification error rate. This approach is particularly useful for exploring relationships among the variables that are chosen for the classifier. It reveals which variables have a high degree of leverage for correct classification using different classifiers; it shows which variables operate in relative isolation, and which are important mainly in conjunction with others; it permits quantification of the authority with which variables are selected; and it generally leads to a reduced number of variables for classification, in comparison with alternative approaches based on prediction.
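Bottom-up wrapper selection of this kind is available off the shelf; below, scikit-learn's SequentialFeatureSelector scores candidate variables by the cross-validated performance of the classifier itself (the classifier, dataset, and target size are our choices).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector

X, y = load_breast_cancer(return_X_y=True)

# Bottom-up wrapper: grow the variable set one feature at a time, scoring
# each candidate set by the cross-validated accuracy of the classifier that
# will actually be used for classification.
clf = LinearDiscriminantAnalysis()
sfs = SequentialFeatureSelector(clf, n_features_to_select=5,
                                direction="forward", scoring="accuracy", cv=5)
sfs.fit(X, y)
print("selected variables:", sfs.get_support(indices=True))
```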

13.
A new density-based classification method that uses semiparametric mixtures is proposed. Like other density-based classifiers, it first estimates the probability density function of the observations in each class, with a semiparametric mixture, and then classifies a new observation according to the highest posterior probability. By making proper use of a recently developed multivariate nonparametric density estimator, it is able to produce adaptively smooth and complicated decision boundaries in high-dimensional spaces and can thus work well in such cases. Issues specific to classification are studied and discussed. Numerical studies using simulated and real-world data show that the new classifier performs very well compared with other commonly used classification methods.
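A minimal density-based Bayes classifier, with a fixed-bandwidth kernel density estimate standing in for the paper's semiparametric mixture (the bandwidth and data are our choices).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KernelDensity

X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# One nonparametric density estimate per class, plus the empirical priors.
kdes = {k: KernelDensity(bandwidth=0.5).fit(X[y == k]) for k in classes}
priors = {k: np.mean(y == k) for k in classes}

def classify(x):
    """Assign x to the class with the highest (unnormalized) log-posterior
    log(prior) + log(density)."""
    x = x.reshape(1, -1)
    scores = np.array([np.log(priors[k]) + kdes[k].score_samples(x)[0]
                       for k in classes])
    return classes[np.argmax(scores)]

preds = np.array([classify(x) for x in X])
print("training error:", np.mean(preds != y))
```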

14.
Food authenticity studies are concerned with determining whether food samples have been correctly labelled. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labelled and unlabelled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method indicate which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be computationally efficient and to achieve excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins.

15.
The problem of two-group classification has implications in a number of fields, such as medicine, finance, and economics. This study compares two-group classification methods for data structures exhibiting fat tails and/or skewness: the minimum-sum-of-deviations linear programming model, linear discriminant analysis, quadratic discriminant analysis, logistic regression, classification based on the multivariate analysis of variance (MANOVA) test, classification based on the unpooled T-square test, support vector machines, the k-nearest neighbor method, and a combined classification method. The comparison is carried out using a simulation procedure designed for various stable distribution structures and sample sizes.

16.
This paper considers the estimation of the regression coefficients in the Cox proportional hazards model with left-truncated and interval-censored data. Using the approaches of Pan [A multiple imputation approach to Cox regression with interval-censored data, Biometrics 56 (2000), pp. 199–203] and Heller [Proportional hazards regression with interval censored data using an inverse probability weight, Lifetime Data Anal. 17 (2011), pp. 373–385], we propose two estimates of the regression coefficients. The first estimate is based on a multiple imputation methodology. The second estimate uses an inverse probability weight to select event time pairs where the ordering is unambiguous. A simulation study is conducted to investigate the performance of the proposed estimators. The proposed methods are illustrated using the Centers for Disease Control and Prevention (CDC) acquired immunodeficiency syndrome (AIDS) Blood Transfusion Data.
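A bare-bones sketch of the multiple-imputation idea, assuming the lifelines package; left truncation (which lifelines handles through an entry column) and Rubin's variance pooling are omitted for brevity, so this is only a caricature of the first estimator.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
n = 300
x = rng.normal(size=n)
t_true = rng.exponential(np.exp(-0.7 * x))   # true event times
left = np.floor(t_true * 4) / 4              # interval-censor onto a 0.25 grid
right = left + 0.25

# Multiple imputation: draw event times uniformly within their censoring
# intervals, fit a Cox model to each completed data set, then pool.
betas = []
for _ in range(20):
    df = pd.DataFrame({"T": rng.uniform(left, right), "E": 1, "x": x})
    betas.append(CoxPHFitter().fit(df, duration_col="T", event_col="E")
                 .params_["x"])
print("pooled coefficient:", np.mean(betas))   # Rubin's rules also pool variances
```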

17.
We propose a hybrid two-group classification method that integrates linear discriminant analysis, a polynomial expansion of the basis (or variable space), and a genetic algorithm with multiple crossover operations to select variables from the expanded basis. Using new product launch data from the biochemical industry, we found that the proposed algorithm offers mean percentage decreases in the misclassification error rate of 50%, 56%, 59%, 77%, and 78% in comparison to a support vector machine, artificial neural network, quadratic discriminant analysis, linear discriminant analysis, and logistic regression, respectively. These improvements correspond to annual cost savings of $4.40–$25.73 million.

18.
This paper discusses visualization methods for discriminant analysis. It does not address numerical methods for classification per se, but rather focuses on graphical methods that can be viewed as pre-processors, aiding the analyst's understanding of the data and the choice of a final classifier. The methods are adaptations of recent results in dimension reduction for regression, including sliced inverse regression and sliced average variance estimation. A permutation test is suggested as a means of determining dimension, and examples are given throughout the discussion.
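A compact sliced inverse regression sketch with the class labels as the slices: solve the generalized eigenproblem of the between-slice covariance against the overall covariance and keep the leading directions for plotting. The dataset is our choice, and the paper's permutation test for dimension is not implemented.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
xbar = X.mean(axis=0)
Sigma = np.cov(X, rowvar=False)

# Between-slice covariance of the slice (here: class) means.
B = sum(np.mean(y == k) * np.outer(X[y == k].mean(axis=0) - xbar,
                                   X[y == k].mean(axis=0) - xbar)
        for k in np.unique(y))

# Generalized eigenproblem B v = lambda Sigma v; the leading eigenvectors
# span the estimated dimension-reduction subspace used for the 2-D view.
evals, evecs = eigh(B, Sigma)
order = np.argsort(evals)[::-1]
Z = (X - xbar) @ evecs[:, order[:2]]   # coordinates to plot and inspect
print("leading eigenvalues:", evals[order][:3])
```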

19.
Nonparametric approaches to classification have gained significant attention in the last two decades. In this paper, we propose a classification methodology based on multivariate rank functions and show that it yields the Bayes rule for spherically symmetric distributions with a location shift. We show that the rank-based classifier is equivalent to the optimal Bayes rule under suitable conditions. We also present an affine-invariant version of the classifier. To accommodate different covariance structures, we construct a classifier based on the central rank region. Asymptotic properties of these classification methods are studied. We illustrate the performance of our proposed methods in comparison with some depth-based classifiers using simulated and real data sets.
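A minimal sketch using the spatial rank, one common choice of multivariate rank function; the paper's affine-invariant version and central rank regions are not implemented, and the data are ours.

```python
import numpy as np

def spatial_rank(x, sample):
    """Multivariate (spatial) rank of x with respect to a sample: the
    average of the unit vectors pointing from the sample points to x."""
    d = x - sample
    norms = np.linalg.norm(d, axis=1, keepdims=True)
    return (d / np.where(norms == 0, 1, norms)).mean(axis=0)

def classify(x, samples):
    """Assign x to the class in which it is most central, i.e. the class
    whose spatial rank of x has the smallest norm (rank near 0 = deep)."""
    return min(samples, key=lambda k: np.linalg.norm(spatial_rank(x, samples[k])))

rng = np.random.default_rng(6)
samples = {0: rng.normal(0, 1, size=(150, 2)),
           1: rng.normal(2, 1, size=(150, 2))}
print(classify(np.array([1.8, 2.1]), samples))   # -> 1
```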

20.
Fast and robust bootstrap
In this paper we review recent developments on a bootstrap method for robust estimators which is computationally faster and more resistant to outliers than the classical bootstrap. This fast and robust bootstrap method is, under reasonable regularity conditions, asymptotically consistent. We describe the method in general and then consider its application to perform inference based on robust estimators for the linear regression and multivariate location-scatter models. In particular, we study confidence and prediction intervals and tests of hypotheses for linear regression models, inference for location-scatter parameters and principal components, and classification error estimation for discriminant analysis.
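A stripped-down illustration of the fast bootstrap idea for a Huber location M-estimator: reuse the full-sample weights and take one fixed-point step per resample instead of re-iterating. The linearization correction of the full fast-and-robust bootstrap is omitted, so this is a caricature, not the reviewed method itself.

```python
import numpy as np

def huber_weights(r, k=1.345):
    """Huber psi(r)/r weights used by the location M-estimator."""
    a = np.abs(r)
    return np.where(a <= k, 1.0, k / a)

rng = np.random.default_rng(7)
x = np.concatenate([rng.normal(0, 1, 95), rng.normal(10, 1, 5)])  # 5% outliers

# Full-sample robust fit: iterate the weighted-mean fixed point to convergence.
mu = np.median(x)
scale = np.median(np.abs(x - mu)) / 0.6745
for _ in range(50):
    w = huber_weights((x - mu) / scale)
    mu = np.sum(w * x) / np.sum(w)

# Fast bootstrap: each replicate reuses the converged weights and applies a
# single fixed-point step to the resampled data, so replicates are cheap and
# stay insensitive to resampled outliers (downweighted already by w).
B, n = 2000, x.size
reps = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)
    reps[b] = np.sum(w[idx] * x[idx]) / np.sum(w[idx])
print("95% bootstrap interval:", np.percentile(reps, [2.5, 97.5]))
```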
