1.

Kernel naive Bayes discrimination for high‐dimensional pattern recognition

Inge Koch Kanta Naito Hiroaki Tanaka 《Australian & New Zealand Journal of Statistics》2019,61(4):401-428

Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high‐dimension low sample size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two‐class problems we modify Mika's kernel Fisher discriminant function which – in general – remains ill‐posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first‐ and second‐degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.

2.

Discriminant analysis of survey data

Ching-Ho Leu Kam-Wah Tsui 《Journal of statistical planning and inference》1997,60(2):1115-290

We consider the problem of the effect of sample designs on discriminant analysis. The selection of the learning sample is assumed to depend on the population values of auxiliary variables. Under a superpopulation model with a multivariate normal distribution, unbiasedness and consistency are examined for the conventional estimators (derived under the assumptions of simple random sampling), maximum likelihood estimators, probability-weighted estimators and conditionally unbiased estimators of parameters. Four corresponding sampled linear discriminant functions are examined. The rates of misclassification of these four discriminant functions and the effect of sample design on these four rates of misclassification are discussed. The performances of these four discriminant functions are assessed in a simulation study.

3.

Extension of model-based classification for binary data when training and test populations differ

J. Jacques C. Biernacki 《Journal of applied statistics》2010,37(5):749-766

4.

The Euclidean distance classifier: an alternative to the linear discriminant function

Virgil R. Marco Dean M. Young Danny W. Turner 《Communications in Statistics - Simulation and Computation》2013,42(2):485-505

The sample linear discriminant function (LDF) is known to perform poorly when the number of features p is large relative to the size of the training samples. A simple and rarely applied alternative to the sample LDF is the sample Euclidean distance classifier (EDC). Raudys and Pikelis (1980) have compared the sample LDF with three other discriminant functions, including the sample EDC, when classifying individuals from two spherical normal populations. They have concluded that the sample EDC outperforms the sample LDF when p is large relative to the training sample size. This paper derives conditions for which the two classifiers are equivalent when all parameters are known and employs a Monte Carlo simulation to compare the sample EDC with the sample LDF not only for the spherical normal case but also for several nonspherical parameter configurations. For many practical situations, the sample EDC performs as well as or better than the sample LDF, even for nonspherical covariance configurations.
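
As a rough illustration of the classifier this abstract discusses, the sketch below assigns an observation to whichever group has the nearer sample mean in squared Euclidean distance. Unlike the sample LDF, it needs no covariance estimate or matrix inverse. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def class_means(samples):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [fmean(col) for col in zip(*samples)]


def edc_predict(x, mean_a, mean_b):
    """Assign x to group 'a' or 'b' by squared Euclidean distance to each sample mean."""
    dist_a = sum((xi - mi) ** 2 for xi, mi in zip(x, mean_a))
    dist_b = sum((xi - mi) ** 2 for xi, mi in zip(x, mean_b))
    return "a" if dist_a <= dist_b else "b"


# Toy training data (hypothetical values, for illustration only).
group_a = [[0.0, 0.1, -0.2], [0.3, -0.1, 0.0], [-0.3, 0.0, 0.2]]
group_b = [[2.0, 2.1, 1.8], [1.7, 2.0, 2.2], [2.3, 1.9, 2.0]]

mean_a = class_means(group_a)
mean_b = class_means(group_b)

label_near_a = edc_predict([0.2, 0.0, 0.1], mean_a, mean_b)
label_near_b = edc_predict([1.9, 2.2, 2.0], mean_a, mean_b)
```

Because only the two mean vectors are estimated, the rule remains usable when the number of features exceeds the number of training observations, which is the setting the abstract highlights.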

5.

Chapter Notes

Frederick Mosteller 《The American statistician》2013,67(1):20-22

Tests for redundancy of variables in linear two-group discriminant analysis are well known and frequently used. We give a survey of similar tests, including the one-sample T² as a special case, in the situation in which only the mean vector (but no covariance matrix) is available in one sample. Then we show that a relation between linear regression and discriminant functions found by Fisher (1936) can be generalized to this situation. Relating regression and discriminant analysis to a multivariate linear model sheds more light on the relationship between them. Practical and didactical advantages of the regression approach to T² tests and discriminant analysis are outlined.

6.

Anderson's classification statistic based on a post-stratified training sample

C.Y. Leung 《Communications in Statistics - Theory and Methods》2013,42(5):1659-1667

The performance of Anderson's classification statistic based on a post-stratified random sample is examined. It is assumed that the training sample is a random sample from a stratified population consisting of two strata with unknown stratum weights. The sample is first segregated into the two strata by post-stratification. The unknown parameters for each of the two populations are then estimated and used in the construction of the plug-in discriminant. Under this procedure, it is shown that additional estimation of the stratum weight will not seriously affect the performance of Anderson's classification statistic. Furthermore, our discriminant enjoys a much higher efficiency than the procedure based on an unclassified sample from a mixture of normals investigated by Ganesalingam and McLachlan (1978).

7.

The linear and Euclidean discriminant functions: a comparison via asymptotic expansions and simulation study

J. P. Koolaard C. R. O. Lawoko 《Communications in Statistics - Theory and Methods》2013,42(12):2989-3011

This article considers the problem of statistical classification involving multivariate normal populations and compares the performance of the linear discriminant function (LDF) and the Euclidean distance function (EDF). Although the LDF is quite popular and robust, it has been established (Marco, Young and Turner, 1989) that under certain non-trivial conditions, the EDF is "equivalent" to the LDF, in terms of equal probabilities of misclassification (error rates). Thus it follows that under those conditions the sample EDF could perform better than the sample LDF, since the sample EDF involves estimation of fewer parameters. Simulation results, also from the above paper, seemed to support this hypothesis. This article compares the two sample discriminant functions through asymptotic expansions of error rates, and identifies situations when the sample EDF should perform better than the sample LDF. Results from simulation experiments are also reported and discussed.

8.

Minimum Sample Size Considerations for Two-Group Linear and Quadratic Discriminant Analysis with Rare Populations

Shannon Zavorka Jamis J. Perrett 《Communications in Statistics - Simulation and Computation》2013,42(7):1726-1739

Linear discriminant analysis and quadratic discriminant analysis are used to predict group membership. Rare populations present situations in which group sizes differ drastically. This article examined k = 2 and k = 4 predictor variables for groups with different levels of rarity and different levels of sensitivity and specificity. Sample size recommendations were generated for both minimum and maximum group overlap using the leave-one-out (L-O-O) method of estimation. Minimum sample size recommendations are provided in tables for immediate implementation by applied researchers.

9.

An Optimal Semiparametric Method for Two‐group Classification

《Scandinavian Journal of Statistics》2018,45(3):806-846

In the classical discriminant analysis, when two multivariate normal distributions with equal variance–covariance matrices are assumed for two groups, the classical linear discriminant function is optimal with respect to maximizing the standardized difference between the means of two groups. However, for a typical case‐control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t‐statistic for two‐group classification. Biometrics 2015, 71: 404–416) proposed the generalized t‐statistic to obtain a linear discriminant function, which allows for heterogeneity of case group. Their procedure has an optimality property in the class of consideration. We perform a further study of the problem and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in that no further improvement is possible to construct the linear discriminant function more efficiently. We conduct simulation studies and real data examples to illustrate the finite sample performance and the gain that it produces in comparison with existing methods.

10.

The performance of the linear and quadratic discriminant functions for three types of non-normal distribution

Hiroko Nakanishi Yoshiharu Sato 《Communications in Statistics - Theory and Methods》2013,42(5):1181-1200

The purpose of this paper is to investigate the performance of the LDF (linear discriminant function) and QDF (quadratic discriminant function) for classifying observations from three types of univariate and multivariate non-normal distributions on the basis of the misclassification rate. The theoretical and the empirical results are described for univariate distributions, and the empirical results are presented for multivariate distributions. It is also shown that the sign of the skewness of each population and the kurtosis have essential effects on the performance of the two discriminant functions. The variations of the population-specific misclassification rates depend greatly on the sample size. For large dimensional population distributions, if the sample sizes are sufficient, the QDF performs better than the LDF. We show criteria for a choice between the two discriminant functions as an application.

11.

Predicting early educational program placement with discrete discriminant analysis

Louise H. Boothby James K. Brewer 《Communications in Statistics - Theory and Methods》2013,42(11):4049-4060

The purpose of this study was to predict placement and nonplacement outcomes for mildly handicapped three through five year old children given knowledge of developmental screening test data. Discrete discriminant analysis (Anderson, 1951; Cochran & Hopkins, 1961; Goldstein & Dillon, 1978) was used to classify children into either a placement or nonplacement group using developmental information retrieved from longitudinal Child Find records (1982-89). These records were located at the Florida Diagnostic and Learning Resource System (FDLRS) in Sarasota, Florida and provided usable data for 602 children. The developmental variables included performance on screening test activities from the Comprehensive Identification Process (Zehrbach, 1975), and consisted of: (a) gross motor skills, (b) expressive language skills, and (c) social-emotional skills. These three dichotomously scored developmental variables generated eight mutually exclusive and exhaustive combinations of screening data. Combined with one of three different types of cost-of-misclassification functions, each child in a random cross-validation sample of 100 was classified into one of the two outcome groups minimizing the expected cost of misclassification based on the remaining 502 children. For each cost function designed by the researchers a comparison was made between classifications from the discrete discriminant analysis procedure and actual placement outcomes for the 100 children. A logit analysis and a standard discriminant analysis were likewise conducted using the 502 children and compared with results of the discrete discriminant analysis for selected cost functions.

12.

Variance ratio screening for ultrahigh dimensional discriminant analysis

Fengli Song Baohua Shen Guosheng Cheng 《Communications in Statistics - Theory and Methods》2018,47(24):6034-6051

This article is concerned with feature screening for ultrahigh dimensional discriminant analysis. A variance ratio screening method is proposed and the sure screening property of this screening procedure is proved. The proposed method has some additional desirable features. First, it is model-free: it does not require a specific discriminant model and can be applied directly to the multi-category situation. Second, it can effectively screen main effects and interaction effects simultaneously. Third, it is relatively inexpensive in computational cost because of its simple structure. The finite sample properties are examined through Monte Carlo simulation studies and two real-data analyses.

13.

A comparison of the classical and the linear programming approaches to the classification problem in discriminant analysis

《Journal of Statistical Computation and Simulation》2012,82(1-2):73-93

Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of observations correctly classified by each procedure in a holdout sample indicates that, while the linear programming approaches compete well with the classical procedures under some experimental conditions, their overall performance lags behind that of the classical procedures.

14.

Nonlinear Discriminant Functions for Mixed Random Walk Models

H. M. Moustafa S. A. Mohammed 《Communications in Statistics - Simulation and Computation》2013,42(10):1923-1938

A procedure is presented for finding maximum likelihood estimates of the parameters of a mixture of two random walk distributions in two cases, using classified and unclassified observations. Estimation of nonlinear discriminant functions based on small sample sizes is considered. Through simulation experiments, the performance of the corresponding estimated nonlinear discriminant functions is investigated. The total probabilities of misclassification and percentage biases are evaluated and discussed.

15.

Point Estimates of Test Sensitivity and Specificity from Sample Means and Variances

Richard G. Spencer Benjamin D. Cortese Vanessa A. Lukas Nancy Pleshko 《The American statistician》2017,71(1):81-87

In a wide variety of biomedical and clinical research studies, sample statistics from diagnostic marker measurements are presented as a means of distinguishing between two populations, such as with and without disease. Intuitively, a larger difference between the mean values of a marker for the two populations, and a smaller spread of values within each population, should lead to more reliable classification rules based on this marker. We formalize this intuitive notion by deriving practical, new, closed-form expressions for the sensitivity and specificity of three different discriminant tests defined in terms of the sample means and standard deviations of diagnostic marker measurements. The three discriminant tests evaluated are based, respectively, on the Euclidean distance and the Mahalanobis distance between means, and a likelihood ratio analysis. Expressions for the effects of measurement error are also presented. Our final expressions assume that the diagnostic markers follow independent normal distributions for the two populations, although it will be clear that other known distributions may be similarly analyzed. We then discuss applications drawn from the medical literature, although the formalism is clearly not restricted to that application.
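
The intuition in this abstract can be made concrete with a minimal sketch, assuming a single normally distributed marker and a cutoff at the midpoint of the two means. This is a textbook calculation, not the closed-form expressions derived in the paper; the function name and the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def midpoint_rule_accuracy(mean_neg, sd_neg, mean_pos, sd_pos):
    """Sensitivity and specificity of the rule 'positive if x > midpoint of the means'.

    Assumes mean_pos > mean_neg and normally distributed marker values
    in each population (an assumption made for this sketch).
    """
    cutoff = (mean_neg + mean_pos) / 2.0
    # Sensitivity: probability that a diseased subject falls above the cutoff.
    sensitivity = 1.0 - NormalDist(mean_pos, sd_pos).cdf(cutoff)
    # Specificity: probability that a healthy subject falls at or below the cutoff.
    specificity = NormalDist(mean_neg, sd_neg).cdf(cutoff)
    return sensitivity, specificity


# Hypothetical marker: means two standard deviations apart, unit spread in both groups.
sens, spec = midpoint_rule_accuracy(0.0, 1.0, 2.0, 1.0)
```

Widening the gap between the means or shrinking either standard deviation pushes both quantities toward 1, which matches the intuitive notion the abstract describes.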

16.

PERFORMANCE OF THE LOCATION LINEAR DISCRIMINANT FUNCTION UNDER ACROSS-LOCATION HETEROSCEDASTICITY

《Communications in Statistics - Theory and Methods》2013,42(6):1031-1044

Classification of data consisting of both categorical and continuous variables between two groups is often handled by the sample location linear discriminant function confined to each of the locations specified by the observed values of the categorical variables. Homoscedasticity of across-location conditional dispersion matrices of the continuous variables is often assumed. Quite often, interactions between continuous and categorical variables cause across-location heteroscedasticity. In this article, we examine the effect of heterogeneous across-location conditional dispersion matrices on the overall expected and actual error rates associated with the sample location linear discriminant function. Performance of the sample location linear discriminant function is evaluated against the results for the restrictive classifier adjusted for across-location heteroscedasticity. Conclusions based on a Monte Carlo study are reported.

17.

A Comparison of Two Group Classification Approaches to Fat-tailed and Skewed Data

Filiz Kardiyen Hülya Olmuş 《Communications in Statistics - Simulation and Computation》2016,45(1):17-32

The problem of two-group classification has implications in a number of fields, such as medicine, finance, and economics. This study aims to compare the methods of two-group classification. The minimum sum of deviations and linear programming model, linear discriminant analysis, quadratic discriminant analysis and logistic regression, multivariate analysis of variance (MANOVA) test-based classification and the unpooled T-square test-based classification methods, support vector machines and k-nearest neighbor methods, and a combined classification method are compared for data structures having fat tails and/or skewness. The comparison has been carried out by using a simulation procedure designed for various stable distribution structures and sample sizes.

18.

Asymptotic relative efficiency of the linear discriminant function under partial nonrandom classification of the training data

《Journal of Statistical Computation and Simulation》2012,82(4):415-426

This paper considers the problem where the linear discriminant rule is formed from training data that are only partially classified with respect to the two groups of origin. A further complication is that the data of unknown origin do not constitute an observed random sample from a mixture of the two underlying groups. Under the assumption of a homoscedastic normal model, the overall error rate of the sample linear discriminant rule formed by maximum likelihood from the partially classified training data is derived up to and including terms of the first order in the case of univariate feature data. This first-order expansion of the sample rule so formed is used to define its asymptotic efficiency relative to the rule formed from a completely classified random training set and also to the rule formed from a completely unclassified random set.

19.

Influence functions for certain parameters in discriminant analysis when a single discriminant function is not adequate

R. Radhakrishnan 《Communications in Statistics - Theory and Methods》2013,42(3):535-549

The influence function introduced by Hampel (1968, 1973, 1974) is a tool that can be used for outlier detection. Campbell (1978) has derived the influence function for Mahalanobis's distance between two populations, which can be used for detecting outliers in discriminant analysis. Radhakrishnan and Kshirsagar (1981) have obtained influence functions for a variety of parametric functions in multivariate analysis. Radhakrishnan (1983) obtained influence functions for parameters corresponding to "residual" Wilks's Λ and its "direction" and "collinearity" factors in discriminant analysis when a single discriminant function is adequate while discriminating among several groups. In this paper influence functions for parameters that correspond to "residual" Wilks's Λ and its "direction" and "coplanarity" factors used to test the goodness of fit of s (s>1) assigned discriminant functions for discriminating among several groups are obtained. These influence functions can be used for outlier detection in multivariate data when a single discriminant function is not adequate.

20.

Asymptotic Optimality of Sparse Linear Discriminant Analysis with Arbitrary Number of Classes

Ruiyan Luo Xin Qi 《Scandinavian Journal of Statistics》2017,44(3):598-616

Many sparse linear discriminant analysis (LDA) methods have been proposed to overcome the major problems of the classic LDA in high‐dimensional settings. However, the asymptotic optimality results are limited to the case with only two classes. When there are more than two classes, the classification boundary is complicated and no explicit formulas for the classification errors exist. We consider the asymptotic optimality in the high‐dimensional settings for a large family of linear classification rules with arbitrary number of classes. Our main theorem provides easy‐to‐check criteria for the asymptotic optimality of a general classification rule in this family as dimensionality and sample size both go to infinity and the number of classes is arbitrary. We establish the corresponding convergence rates. The general theory is applied to the classic LDA and the extensions of two recently proposed sparse LDA methods to obtain the asymptotic optimality.