Similar Documents
 20 similar documents found (search time: 484 ms)
1.
The authors consider a robust linear discriminant function based on high breakdown location and covariance matrix estimators. They derive influence functions for the estimators of the parameters of the discriminant function and for the associated classification error. The most B‐robust estimator is determined within the class of multivariate S‐estimators. This estimator, which minimizes the maximal influence that an outlier can have on the classification error, is also the most B‐robust location S‐estimator. A comparison of the most B‐robust estimator with the more familiar biweight S‐estimator is made.
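As a rough illustration of the plug-in idea, the sketch below builds a linear discriminant from high-breakdown location and scatter estimates. It uses scikit-learn's MinCovDet (MCD) purely as a readily available robust estimator standing in for the multivariate S-estimators studied in the paper; the cut-off assumes equal priors, and all data are simulated.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def robust_ldf(X1, X2):
    """Plug-in linear discriminant built from robust location/scatter estimates.

    MinCovDet (MCD) is used here only as an available high-breakdown estimator;
    the paper itself works with multivariate S-estimators.
    """
    mcd1, mcd2 = MinCovDet().fit(X1), MinCovDet().fit(X2)
    m1, m2 = mcd1.location_, mcd2.location_
    # pool the two robust scatter matrices, weighted by sample size
    n1, n2 = len(X1), len(X2)
    S = (n1 * mcd1.covariance_ + n2 * mcd2.covariance_) / (n1 + n2)
    beta = np.linalg.solve(S, m1 - m2)            # discriminant coefficients
    alpha = -0.5 * beta @ (m1 + m2)               # cut-off for equal priors
    return lambda x: np.asarray(x) @ beta + alpha  # > 0 -> allocate to group 1

rng = np.random.default_rng(0)
X1 = rng.normal(0.0, 1.0, size=(100, 3))
X2 = rng.normal(1.0, 1.0, size=(100, 3))
rule = robust_ldf(X1, X2)
print((rule(X1) > 0).mean(), (rule(X2) > 0).mean())
```

Because MCD bounds the effect of contaminated training points on the location and scatter estimates, a single outlier has limited influence on the fitted rule, which is the kind of behaviour the paper quantifies through influence functions.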

2.
The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) obtained the influence function for Mahalanobis's distance between two populations, which can be used for detecting outliers in discriminant analysis. In this paper, influence functions for a variety of parametric functions in multivariate analysis are obtained. Influence functions are derived for the generalized variance, the matrix of regression coefficients, the noncentrality matrix Σ⁻¹δ in multivariate analysis of variance and its eigenvalues, the matrix L (a generalization of 1 − R²), canonical correlations, principal components, and the parameters corresponding to Pillai's (1955) statistic, Hotelling's (1951) generalized T₀² and Wilks' (1932) Λ, all of which can be used for outlier detection in multivariate analysis. Devlin, Gnanadesikan and Kettenring (1975) obtained the influence function for the population correlation coefficient in the bivariate case. It is shown in this paper that influence functions for the parameters corresponding to r², R², and Mahalanobis D² can be obtained as particular cases.
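A finite-sample analogue of such influence functions can be computed directly by deletion. The sketch below is an assumption-light illustration, not the paper's analytical formulas: it measures how much the sample Mahalanobis D² between two groups changes when each observation in the first group is removed in turn.

```python
import numpy as np

def mahalanobis_d2(X1, X2):
    """Sample Mahalanobis distance squared between two groups (pooled covariance)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    S = ((n1 - 1) * np.cov(X1, rowvar=False) +
         (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    d = m1 - m2
    return d @ np.linalg.solve(S, d)

def deletion_influence(X1, X2):
    """Deletion (sample) influence of each group-1 observation on D^2,
    a finite-sample stand-in for the influence function."""
    n1 = len(X1)
    d2_full = mahalanobis_d2(X1, X2)
    return np.array([(n1 - 1) * (d2_full - mahalanobis_d2(np.delete(X1, i, axis=0), X2))
                     for i in range(n1)])
```

Observations whose deletion changes D² sharply are flagged as potential outliers, mirroring the diagnostic use of the influence function described above.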

3.
The purpose of this study was to predict placement and nonplacement outcomes for mildly handicapped three- through five-year-old children given knowledge of developmental screening test data. Discrete discriminant analysis (Anderson, 1951; Cochran & Hopkins, 1961; Goldstein & Dillon, 1978) was used to classify children into either a placement or nonplacement group using developmental information retrieved from longitudinal Child Find records (1982-89). These records were located at the Florida Diagnostic and Learning Resource System (FDLRS) in Sarasota, Florida and provided usable data for 602 children. The developmental variables included performance on screening test activities from the Comprehensive Identification Process (Zehrbach, 1975), and consisted of: (a) gross motor skills, (b) expressive language skills, and (c) social-emotional skills. These three dichotomously scored developmental variables generated eight mutually exclusive and exhaustive combinations of screening data. Combined with one of three different types of cost-of-misclassification functions, each child in a random cross-validation sample of 100 was classified into one of the two outcome groups so as to minimize the expected cost of misclassification, based on the remaining 502 children. For each cost function designed by the researchers, a comparison was made between classifications from the discrete discriminant analysis procedure and actual placement outcomes for the 100 children. A logit analysis and a standard discriminant analysis were likewise conducted using the 502 children and compared with results of the discrete discriminant analysis for selected cost functions.
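The classification step described above is the Bayes minimum-expected-cost rule applied to the eight cells formed by the three dichotomous variables. The sketch below illustrates the rule with hypothetical cell probabilities, priors, and costs; all numbers are made up for illustration.

```python
import numpy as np

def min_cost_classify(pattern, p_pattern_given_group, priors, cost):
    """Assign to the group with the smallest expected misclassification cost.

    pattern: index of the observed cell (0..7 for three dichotomous variables)
    p_pattern_given_group: array (G, 8) of estimated cell probabilities per group
    priors: array (G,) of prior group probabilities
    cost[g, k]: cost of assigning to group k when the true group is g
    """
    joint = priors * p_pattern_given_group[:, pattern]
    post = joint / joint.sum()            # posterior probability of each true group
    expected_cost = post @ cost           # expected cost of each possible assignment
    return int(np.argmin(expected_cost))

# hypothetical numbers: 2 groups (placement / nonplacement), 8 screening cells
p = np.array([[.05, .10, .10, .15, .10, .15, .15, .20],
              [.20, .15, .15, .10, .15, .10, .10, .05]])
priors = np.array([0.3, 0.7])
cost = np.array([[0.0, 3.0],   # missing a child who needs placement is costly
                 [1.0, 0.0]])
print(min_cost_classify(pattern=0, p_pattern_given_group=p, priors=priors, cost=cost))
```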

4.
Fisher's linear discriminant function can be used to classify an individual sampled from one of two multivariate normal populations. In what follows, this function is viewed as the posterior log-odds that the individual belongs to one population rather than the other, given his data vector; the population means and common covariance matrix are assumed unknown. The vector of discriminant coefficients β (p×1) is the gradient of the posterior log-odds, and certain of its linear functions are directional derivatives with a practical meaning. Accordingly, we treat the problem of estimating several linear functions of β. The usual estimators of these functions are scaled versions of the unbiased estimators. In this paper, these estimators are dominated by explicit alternatives under a quadratic loss function. We reduce the problem of estimating β to that of estimating the inverse covariance matrix.
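For reference, the identity behind the "gradient of the posterior log-odds" remark: for two normal populations with common covariance Σ and prior probabilities p1, p2, the posterior log-odds is linear in x and its gradient is β. This is a standard result, stated here only to make the abstract's terminology concrete.

```latex
\ln\frac{P(\pi_1 \mid x)}{P(\pi_2 \mid x)}
  = (\mu_1-\mu_2)'\Sigma^{-1}\Bigl(x-\tfrac{1}{2}(\mu_1+\mu_2)\Bigr)
  + \ln\frac{p_1}{p_2},
\qquad
\nabla_x \ln\frac{P(\pi_1 \mid x)}{P(\pi_2 \mid x)}
  = \Sigma^{-1}(\mu_1-\mu_2) = \beta .
```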

5.
Most discriminant functions refer to qualitatively distinct groups. Tallis et al. (1975) introduced the probit discriminant function for distinguishing between two ordered groups. They showed how to estimate this function for mixture sampling and continuous predictor variables. Here an estimation system is given for the more common separate sampling, applicable to continuous and/or discrete predictor variables. When used solely with continuous variables, this method of estimation is more robust than that of Tallis et al.

The relationship between probit and logistic discrimination is discussed.
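As a minimal illustration of the probit side of this comparison, the sketch below fits a probit model for group membership with statsmodels. It ignores the separate-sampling adjustments that are the point of the paper, and the data are simulated.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
# two ordered groups with a shift in the predictor means (toy data)
X = np.vstack([rng.normal(0, 1, (80, 2)), rng.normal(1, 1, (120, 2))])
y = np.r_[np.zeros(80), np.ones(120)]

probit = sm.Probit(y, sm.add_constant(X)).fit(disp=0)
print(probit.params)                            # intercept and discriminant coefficients
print(probit.predict(sm.add_constant(X))[:5])   # estimated P(group 2 | x)
```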

6.
In multivariate data analysis, Fisher linear discriminant analysis is useful to optimally separate two classes of observations by finding a linear combination of p variables. Functional data analysis deals with the analysis of continuous functions and can thus be seen as a generalisation of multivariate analysis in which the dimension p of the analysis space tends to infinity. Several authors propose methods to perform discriminant analysis in this infinite-dimensional space. Here, a methodology is introduced to perform discriminant analysis not on single infinite-dimensional functions, but by finding a linear combination of p infinite-dimensional continuous functions, providing a set of continuous canonical functions which are optimally separated in the canonical space. Keywords: functional data analysis; linear discriminant analysis; classification.

7.
In discriminant analysis, the dimension of the hyperplane spanned by the population mean vectors is called the dimensionality. The procedures commonly used to estimate this dimension involve testing a sequence of dimensionality hypotheses, as well as model-fitting approaches based on (consistent) Akaike's method, (modified) Mallows' method and Schwarz's method. The marginal log-likelihood (MLL) method is developed, and the asymptotic distribution of the dimensionality estimated by this method for normal populations is derived. Furthermore, a modified marginal log-likelihood (MMLL) method is also considered. The MLL method is not consistent for large samples, and two modified criteria are proposed which attain asymptotic consistency. Some comments are made with regard to the robustness of this method to departures from normality. The operating characteristics of the various methods proposed are examined and compared.

8.
Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA shows that the theoretically best nonlinear map for maximizing Fisher's discriminant criterion is formulated in terms of the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to improve FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, in this paper we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis: by applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated in benchmark experiments on standard datasets and in a face classification experiment.
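A minimal sketch of the estimation step the method relies on, using scikit-learn's penalized logistic regression to produce the class posterior probabilities; the ONDA construction built on top of them is not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# L2-penalized multinomial logistic regression; switch to penalty="l1"
# (with solver="saga") for the sparse variant discussed in the abstract.
mlr = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
mlr.fit(X, y)

posteriors = mlr.predict_proba(X)   # estimated a posteriori class probabilities
print(posteriors[:3].round(3))
```

These estimated posteriors play the role that the linear-regression approximation plays when ONDA reduces to FLDA.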

9.
It is now commonly recognized that firm-level production datasets are affected by some level of random perturbation, and that production frontiers consequently have a stochastic nature. Mathematical programming methods, traditionally employed for frontier evaluation, are therefore reputed capable of mistaking errors for technical (in)efficiency. Recent literature is accordingly oriented towards a statistical view: frontiers are constructed by enveloping data that have been preliminarily filtered from noise. In this paper a nonparametric smoother for filtering panel production data is presented. We pursue a recent approach of Kneip and Simar (1996) and frame it into a more general formulation, a particular setting of which constitutes our specific proposal. The major feature of the method is that noise reduction and outlier detection are handled separately: (i) a high-order local polynomial fit is used as the smoother; and (ii) data are weighted by robustness scores. An extensive numerical study on some common production models yields encouraging results in a comparison with Kneip and Simar's filter.
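A rough sketch of the two ingredients named above, local polynomial smoothing and robustness scores, combined in a LOWESS-style iteration. This is only an illustration of the idea, not the authors' filter; the bandwidth, polynomial degree and weight functions are arbitrary choices.

```python
import numpy as np

def robust_local_poly(x, y, bandwidth=0.3, degree=2, iters=3):
    """Local polynomial smoother with bisquare robustness weights.

    Tricube kernel weights do the local averaging; robustness weights,
    recomputed from residuals at each iteration, downweight outliers.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    robust_w = np.ones_like(y)
    fitted = np.empty_like(y)
    for _ in range(iters):
        for j, x0 in enumerate(x):
            u = np.abs(x - x0) / bandwidth
            kern = np.where(u < 1, (1 - u**3) ** 3, 0.0)          # tricube kernel
            w = kern * robust_w
            coeffs = np.polyfit(x - x0, y, deg=degree, w=np.sqrt(w))
            fitted[j] = coeffs[-1]                                  # fitted value at x0
        resid = y - fitted
        s = np.median(np.abs(resid)) + 1e-12
        r = resid / (6 * s)
        robust_w = np.where(np.abs(r) < 1, (1 - r**2) ** 2, 0.0)    # bisquare weights
    return fitted

x = np.linspace(0, 1, 120)
y = 1 + 2 * x - x**2 + np.random.default_rng(8).normal(0, .05, x.size)
y[10] += 1.0                     # one gross outlier, heavily downweighted in later passes
smooth = robust_local_poly(x, y)
```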

10.
Fisher's linear discriminant function, adapted by Anderson for allocating new observations into one of two existing groups, is considered in this paper. Methods of estimating the misclassification error rates are reviewed and evaluated by Monte Carlo simulations. The investigation is carried out under both ideal (multivariate normal data) and non-ideal (multivariate binary data) conditions. The assessment is based on the usual mean square error (MSE) criterion and also on a new criterion of optimism. The results show that although there is a common cluster of good estimators for both ideal and non-ideal conditions, the single best estimators vary with respect to the different criteria.
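A small Monte Carlo in this spirit: it contrasts the apparent (resubstitution) error rate of the sample LDF with its actual error rate, approximated on a large independent test sample, under multivariate normal data. Dimension, sample sizes and separation are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, delta, reps = 3, 30, 1.5, 200
mu1, mu2 = np.zeros(p), np.r_[delta, np.zeros(p - 1)]

def ldf(X1, X2):
    """Sample LDF coefficients b and midpoint cut-off c (equal priors)."""
    S = ((len(X1) - 1) * np.cov(X1, rowvar=False) +
         (len(X2) - 1) * np.cov(X2, rowvar=False)) / (len(X1) + len(X2) - 2)
    b = np.linalg.solve(S, X1.mean(0) - X2.mean(0))
    c = 0.5 * b @ (X1.mean(0) + X2.mean(0))
    return b, c

apparent, actual = [], []
for _ in range(reps):
    X1, X2 = rng.normal(mu1, 1, (n, p)), rng.normal(mu2, 1, (n, p))
    b, c = ldf(X1, X2)
    # apparent (resubstitution) error rate on the training data
    apparent.append(0.5 * ((X1 @ b <= c).mean() + (X2 @ b > c).mean()))
    # "actual" error rate approximated on a large independent test sample
    T1, T2 = rng.normal(mu1, 1, (5000, p)), rng.normal(mu2, 1, (5000, p))
    actual.append(0.5 * ((T1 @ b <= c).mean() + (T2 @ b > c).mean()))

print(np.mean(apparent), np.mean(actual))   # the apparent rate is optimistically biased
```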

11.
12.
We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.
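To convey the flavor of discriminant analysis on curves with a minimal sketch: each (possibly fragmentary) curve is projected onto a small polynomial basis using only its observed time points, and ordinary LDA is then run on the fitted coefficients. This is a crude stand-in for the paper's FLDA, which handles fragments far more carefully.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def basis_coefficients(times, values, degree=3):
    """Least-squares polynomial basis coefficients for one curve fragment,
    using only the time points at which the curve was actually observed."""
    B = np.vander(np.asarray(times), N=degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(B, np.asarray(values), rcond=None)
    return coef

# toy setup: irregularly observed fragments from two classes
rng = np.random.default_rng(3)
curves, labels = [], []
for k in range(2):
    for _ in range(40):
        t = np.sort(rng.uniform(0, 1, rng.integers(5, 15)))
        y = np.sin(2 * np.pi * t) + 0.5 * k * t + rng.normal(0, .1, t.size)
        curves.append((t, y))
        labels.append(k)

Z = np.array([basis_coefficients(t, y) for t, y in curves])
lda = LinearDiscriminantAnalysis().fit(Z, labels)
print(lda.score(Z, labels))
```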

13.
Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high‐dimension low sample size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two‐class problems we modify Mika's kernel Fisher discriminant function which – in general – remains ill‐posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first‐ and second‐degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.
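To illustrate how such rules operate entirely through kernel evaluations in the HDLSS setting, the sketch below implements a simple polynomial-kernel nearest-centroid rule; it is not the paper's kernel naive Bayes discriminant, but it shows how every computation stays in the n-dimensional kernel space rather than the (possibly huge) variable space.

```python
import numpy as np

def poly_kernel(A, B, degree=2):
    """Polynomial kernel k(a, b) = (1 + a.b)^degree."""
    return (1.0 + A @ B.T) ** degree

def kernel_centroid_rule(Xtrain, y, Xtest, degree=2):
    """Assign each test point to the class whose feature-space mean is closest:
    ||phi(x) - phi_bar_k||^2 = k(x,x) - 2*mean_i k(x, x_i^k) + mean_ij k(x_i^k, x_j^k)."""
    classes = np.unique(y)
    kxx = np.diag(poly_kernel(Xtest, Xtest, degree))
    dists = []
    for c in classes:
        Xc = Xtrain[y == c]
        Kxc = poly_kernel(Xtest, Xc, degree)    # (n_test, n_c)
        Kcc = poly_kernel(Xc, Xc, degree)       # (n_c, n_c)
        dists.append(kxx - 2 * Kxc.mean(axis=1) + Kcc.mean())
    return classes[np.argmin(np.vstack(dists), axis=0)]

# HDLSS toy data: 40 observations, 500 variables
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (20, 500)), rng.normal(0.3, 1, (20, 500))])
y = np.r_[np.zeros(20), np.ones(20)]
print((kernel_centroid_rule(X, y, X) == y).mean())
```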

14.
We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high-dimensional setting where p ≫ n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule obtained from LDA, since it involves all p features. We propose penalized LDA, a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization-maximization approach in order to optimize it efficiently when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher's discriminant problem as a biconvex problem. We evaluate the performance of the resulting methods in a simulation study and on three gene expression data sets. We also survey past methods for extending LDA to the high-dimensional setting, and explore their relationships with our proposal.
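The interpretability gain comes from discriminant vectors with exact zeros. The sketch below illustrates the effect with a deliberately crude device, soft-thresholding a diagonal-covariance discriminant direction; it is not the minorization-maximization algorithm proposed in the paper, only a picture of how an L1-type penalty removes variables from the rule.

```python
import numpy as np

def soft_threshold(v, lam):
    """Elementwise soft-thresholding, the proximal operator of the L1 penalty."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_direction(X1, X2, lam=0.5):
    """Crude sparse discriminant direction: a diagonal-covariance LDA direction,
    soft-thresholded so that weak variables drop out of the rule entirely."""
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    s2 = 0.5 * (X1.var(axis=0, ddof=1) + X2.var(axis=0, ddof=1))
    return soft_threshold(diff / s2, lam)

rng = np.random.default_rng(5)
X1 = rng.normal(0, 1, (30, 200))
X1[:, :5] += 1.0                        # only the first 5 features are informative
X2 = rng.normal(0, 1, (30, 200))
beta = sparse_direction(X1, X2, lam=0.5)
print(np.flatnonzero(beta))             # indices of features retained in the rule
```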

15.
Empirical influence functions for the Mahalanobis distance and for misclassification rates are presented for discriminant analysis with two multivariate normal populations, following Campbell (1978). Conclusions about the effects of outliers drawn from the empirical influence function are contrasted with exact calculations for four simple cases. These cases demonstrate that the higher-order terms discarded in deriving the empirical influence function can be important in practical problems.

16.
The quadratic discriminant function (QDF) with known parameters can be represented as a weighted sum of independent noncentral chi-square variables. To approximate the density function of the QDF as an m-dimensional exponential family, its moments of each order are calculated. This is done using a recursive formula for the moments via Stein's identity in the exponential family. We validate the performance of our method in a simulation study and compare it with other methods in the literature on real data. The results reveal better estimation of the misclassification probabilities and less computation time with our method.
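One standard route to the chi-square representation mentioned above, stated for an observation drawn from the first population; the paper's moment recursions via Stein's identity are not reproduced here.

```latex
% QDF with known parameters (allocate to population 1 for large Q(x)):
Q(x) = \tfrac{1}{2}(x-\mu_2)'\Sigma_2^{-1}(x-\mu_2)
     - \tfrac{1}{2}(x-\mu_1)'\Sigma_1^{-1}(x-\mu_1)
     + \tfrac{1}{2}\ln\frac{|\Sigma_2|}{|\Sigma_1|}.
% For x \sim N_p(\mu_1,\Sigma_1), write x = \mu_1 + \Sigma_1^{1/2} z with z \sim N_p(0,I)
% and diagonalize the quadratic term. With \lambda_j the eigenvalues of
% \tfrac{1}{2}\Sigma_1^{1/2}(\Sigma_2^{-1}-\Sigma_1^{-1})\Sigma_1^{1/2},
% completing the square gives noncentralities \delta_j and a constant c such that
Q \;\overset{d}{=}\; \sum_{j:\,\lambda_j \neq 0} \lambda_j\,\chi^2_{1}(\delta_j) + c,
% i.e. a weighted sum of independent noncentral chi-squares
% (plus an independent normal term if some \lambda_j = 0).
```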

17.
Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of correctly classified observations by each procedure in a holdout sample indicates that while under some experimental conditions the linear programming approaches compete well with the classical procedures, overall their performance lags behind that of the classical procedures.
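One of the simplest mathematical programming formulations entering such comparisons is the "minimize the sum of deviations" (MSD) model. The sketch below solves it with SciPy's linprog; the normalization constraint used to rule out the trivial zero solution is one common choice among several, and the data are simulated.

```python
import numpy as np
from scipy.optimize import linprog

def msd_classifier(X1, X2):
    """'Minimize the sum of deviations' (MSD) linear-programming classifier.

    Find w, c so that group-1 points satisfy w'x >= c and group-2 points satisfy
    w'x <= c as nearly as possible, minimizing the total violation. The constraint
    w'(mean1 - mean2) = 1 rules out the trivial solution w = 0, c = 0.
    """
    n1, p = X1.shape
    n2 = X2.shape[0]
    n = n1 + n2
    # variable vector: [w (p entries), c (1 entry), deviations d (n entries)]
    cost = np.r_[np.zeros(p + 1), np.ones(n)]
    A1 = np.hstack([-X1, np.ones((n1, 1)), -np.eye(n1), np.zeros((n1, n2))])
    A2 = np.hstack([X2, -np.ones((n2, 1)), np.zeros((n2, n1)), -np.eye(n2)])
    A_ub, b_ub = np.vstack([A1, A2]), np.zeros(n)
    A_eq = np.r_[X1.mean(0) - X2.mean(0), 0.0, np.zeros(n)].reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:p], res.x[p]

rng = np.random.default_rng(6)
X1 = rng.normal(1.0, 1.0, (50, 2))
X2 = rng.normal(0.0, 1.0, (50, 2))
w, c = msd_classifier(X1, X2)
print((X1 @ w >= c).mean(), (X2 @ w < c).mean())   # within-sample hit rates
```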

18.
This article considers the problem of statistical classification involving multivariate normal populations and compares the performance of the linear discriminant function (LDF) and the Euclidean distance function (EDF). Although the LDF is quite popular and robust, it has been established (Marco, Young and Turner, 1989) that under certain non-trivial conditions the EDF is "equivalent" to the LDF in terms of equal probabilities of misclassification (error rates). It follows that under those conditions the sample EDF could perform better than the sample LDF, since the sample EDF involves estimation of fewer parameters. Simulation results, also from the above paper, seemed to support this hypothesis. This article compares the two sample discriminant functions through asymptotic expansions of error rates, and identifies situations in which the sample EDF should perform better than the sample LDF. Results from simulation experiments are also reported and discussed.
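A small simulation in the spirit of this comparison: with a spherical common covariance and small samples, the sample EDF (nearest mean in Euclidean distance) can match or beat the sample LDF because it does not estimate a covariance matrix. All settings below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
p, n, reps = 10, 15, 300
mu1, mu2 = np.zeros(p), np.full(p, 0.5)

def error_rate(is_group1, T1, T2):
    """Average of the two conditional misclassification rates."""
    return 0.5 * ((~is_group1(T1)).mean() + is_group1(T2).mean())

ldf_err, edf_err = [], []
for _ in range(reps):
    X1, X2 = rng.normal(mu1, 1, (n, p)), rng.normal(mu2, 1, (n, p))
    m1, m2 = X1.mean(0), X2.mean(0)
    S = ((n - 1) * np.cov(X1, rowvar=False) +
         (n - 1) * np.cov(X2, rowvar=False)) / (2 * n - 2)
    b = np.linalg.solve(S, m1 - m2)
    ldf = lambda X: X @ b > 0.5 * b @ (m1 + m2)                       # sample LDF
    edf = lambda X: ((X - m1) ** 2).sum(1) < ((X - m2) ** 2).sum(1)   # sample EDF
    T1, T2 = rng.normal(mu1, 1, (2000, p)), rng.normal(mu2, 1, (2000, p))
    ldf_err.append(error_rate(ldf, T1, T2))
    edf_err.append(error_rate(edf, T1, T2))

print(np.mean(ldf_err), np.mean(edf_err))   # EDF is competitive here: spherical covariance, small n
```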

19.
The main contribution of this paper is the updating of a nonlinear discriminant function on the basis of data of unknown origin. Specifically, a procedure is developed for updating the nonlinear discriminant function based on two Burr Type III distributions (TBIIID) when the additional observations are mixed or classified. First the nonlinear discriminant function of the assumed model is obtained. Then the total probabilities of misclassification are calculated. In addition, Monte Carlo simulation runs are used to compute the relative efficiencies in order to investigate the performance of the developed updating procedures. Finally, the results obtained in this paper are illustrated through real and simulated data sets.

20.