Similar literature
 Found 20 similar documents; search time: 31 ms
1.
The solution of the generalized symmetric eigenproblem Ax = λBx is required in many multivariate statistical models, such as canonical correlation, discriminant analysis, the multivariate linear model, and limited-information maximum likelihood. The problem can be solved by either of two efficient numerical algorithms: Cholesky decomposition or singular value decomposition. Practical considerations for implementation are also discussed.
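The Cholesky route mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes A is symmetric and B is symmetric positive definite, and the function name `generalized_eigh_cholesky` and the 2 x 2 example matrices are hypothetical. With B = LLᵀ, the problem Ax = λBx becomes the standard symmetric problem Cz = λz, where C = L⁻¹AL⁻ᵀ and x = L⁻ᵀz.

```python
import numpy as np

def generalized_eigh_cholesky(A, B):
    """Solve A x = lambda B x for symmetric A and symmetric positive-definite B."""
    L = np.linalg.cholesky(B)        # B = L L^T, with L lower triangular
    # Form C = L^{-1} A L^{-T} with two linear solves against L (no explicit inverse).
    Y = np.linalg.solve(L, A)        # Y = L^{-1} A
    C = np.linalg.solve(L, Y.T).T    # C = Y L^{-T}
    C = (C + C.T) / 2.0              # remove round-off asymmetry
    eigvals, Z = np.linalg.eigh(C)   # standard symmetric problem C z = lambda z
    X = np.linalg.solve(L.T, Z)      # back-transform: x = L^{-T} z
    return eigvals, X

# Hypothetical example matrices, chosen only for illustration.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0, 0.0], [0.0, 1.0]])
eigvals, X = generalized_eigh_cholesky(A, B)

# Each column x of X satisfies A x = lambda B x, and X is B-orthonormal.
residual = A @ X - (B @ X) * eigvals
print(np.abs(residual).max())
```

The eigenvectors returned this way satisfy XᵀBX = I, which is the normalization usually wanted in canonical correlation and discriminant analysis.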

2.
This paper discusses the robustness of discriminant analysis against contamination in the training data; the test data are assumed to be uncontaminated. The concept of a training data breakdown point for discriminant analysis is introduced. It is quite different from the usual breakdown point in robust statistics. In robust estimation of a location parameter, outliers are the main concern, but in discriminant analysis both outliers and inliers are a concern.

3.
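Ma Shaopei et al. 《统计研究》 (Statistical Research), 2021, 38(2): 114-134
In the era of big data, fields such as finance, genomics, and image processing generate large volumes of tensor data. Zhong et al. (2015) proposed a tensor sufficient dimension reduction method and gave a sequential iterative algorithm for second-order tensors. Given the wide use of higher-order tensors in practice, this paper extends the algorithm of Zhong et al. (2015) to higher orders and, taking third-order tensors as an example, proposes two different algorithms: a structure-transforming algorithm and a structure-preserving algorithm. Both algorithms preserve the original structural information of the tensor to varying degrees while effectively reducing the variable dimension and the computational complexity and avoiding the problem of a singular covariance matrix. The two algorithms are applied to the classification and recognition of color portrait images, and the classification results are displayed visually as two- and three-dimensional scatter plots. The structure-preserving algorithm is compared with five methods: K-means clustering, t-SNE nonlinear dimension reduction, multidimensional principal component analysis, multidimensional discriminant analysis, and tensor sliced inverse regression. The results show that the proposed method has a clear advantage in classification accuracy and therefore has broad prospects in image recognition and related applications.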

4.
We have compared the efficacy of five imputation algorithms readily available in SAS for the quadratic discriminant function. Here, we have generated several different parametric-configuration training data with missing data, including monotone missing-at-random observations, and used a Monte Carlo simulation to examine the expected probabilities of misclassification for the two-class quadratic statistical discrimination problem under five different imputation methods. Specifically, we have compared the efficacy of the complete observation-only method and the mean substitution, regression, predictive mean matching, propensity score, and Markov Chain Monte Carlo (MCMC) imputation methods. We found that the MCMC and propensity score multiple imputation approaches are, in general, superior to the other imputation methods for the configurations and training-sample sizes we considered.

5.
In this article we propose a new method for constructing discriminant coordinates and their kernel variant, based on regularization (ridge regression). Moreover, we compare discriminant coordinates, functional discriminant coordinates, and the kernel version of functional discriminant coordinates on 20 data sets from a wide variety of application domains, using values of the criterion of goodness and statistical tests. Our experiments show that the kernel variant of discriminant coordinates provides significantly more accurate results on the examined data sets.

6.
We consider the supervised classification setting, in which the data consist of p features measured on n observations, each of which belongs to one of K classes. Linear discriminant analysis (LDA) is a classical method for this problem. However, in the high-dimensional setting where p ≫ n, LDA is not appropriate for two reasons. First, the standard estimate for the within-class covariance matrix is singular, and so the usual discriminant rule cannot be applied. Second, when p is large, it is difficult to interpret the classification rule obtained from LDA, since it involves all p features. We propose penalized LDA, a general approach for penalizing the discriminant vectors in Fisher's discriminant problem in a way that leads to greater interpretability. The discriminant problem is not convex, so we use a minorization-maximization approach in order to efficiently optimize it when convex penalties are applied to the discriminant vectors. In particular, we consider the use of L1 and fused lasso penalties. Our proposal is equivalent to recasting Fisher's discriminant problem as a biconvex problem. We evaluate the performance of the resulting methods in a simulation study and on three gene expression data sets. We also survey past methods for extending LDA to the high-dimensional setting, and explore their relationships with our proposal.
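The first obstacle named above, a singular within-class covariance estimate when p exceeds n, can be checked numerically. The sketch below is only an illustration under assumed settings (simulated Gaussian data with n = 20, p = 50, K = 2, and a hypothetical helper `pooled_within_class_covariance`); it does not implement penalized LDA. The pooled estimate is built from n centered rows that satisfy K linear constraints, so its rank is at most n - K.

```python
import numpy as np

def pooled_within_class_covariance(X, y):
    """Pooled within-class covariance estimate for data X (n x p) and labels y."""
    n, p = X.shape
    classes = np.unique(y)
    S = np.zeros((p, p))
    for k in classes:
        Xk = X[y == k]
        centered = Xk - Xk.mean(axis=0)   # remove the class mean
        S += centered.T @ centered        # accumulate within-class scatter
    return S / (n - len(classes))

# Simulated data, assumed only for illustration: more features than observations.
rng = np.random.default_rng(0)
n, p, K = 20, 50, 2
X = rng.normal(size=(n, p))
y = np.repeat(np.arange(K), n // K)

S = pooled_within_class_covariance(X, y)
rank = np.linalg.matrix_rank(S)
print(S.shape, rank)   # the rank cannot exceed n - K, so S has no inverse
```

Because the usual LDA rule needs the inverse of this matrix, some form of regularization or penalization is required in the p ≫ n setting.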

7.
The problem of orthogonal projection of a point onto a set is an essential problem of computational geometry. It has many practical applications in areas such as robotics and computer graphics. In the present paper, three algorithms for solving this problem are proposed. These algorithms are based on the idea of heuristic random search. Numerical experiments illustrating the work of the proposed methods are presented.
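The abstract does not describe the three algorithms, so the following is only a generic sketch of what a random-search projection could look like, not the authors' method. It assumes the set is given by a membership test, that a starting point inside the set is known, and that a shrinking Gaussian step is acceptable; the names `project_by_random_search` and `in_unit_disk` and all parameter values are hypothetical.

```python
import numpy as np

def project_by_random_search(point, in_set, start, n_iter=5000,
                             step=0.5, shrink=0.999, seed=0):
    """Approximate the point of a set nearest to `point` by local random search."""
    rng = np.random.default_rng(seed)
    point = np.asarray(point, dtype=float)
    best = np.asarray(start, dtype=float)       # must lie inside the set
    best_dist = np.linalg.norm(best - point)
    for _ in range(n_iter):
        candidate = best + step * rng.normal(size=best.shape)
        if in_set(candidate):                   # reject points outside the set
            dist = np.linalg.norm(candidate - point)
            if dist < best_dist:                # keep only improvements
                best, best_dist = candidate, dist
        step *= shrink                          # gradually narrow the search
    return best, best_dist

def in_unit_disk(z):
    """Membership test for the closed unit disk (example set)."""
    return np.linalg.norm(z) <= 1.0

# Project (2, 0) onto the unit disk; the exact projection is (1, 0) at distance 1.
best, best_dist = project_by_random_search([2.0, 0.0], in_unit_disk, start=[0.0, 0.0])
print(best, best_dist)
```

The search only ever accepts points inside the set, so the result is always feasible; its accuracy depends on the iteration budget and on how quickly the step size shrinks.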

8.
We consider the problem of the effect of sample designs on discriminant analysis. The selection of the learning sample is assumed to depend on the population values of auxiliary variables. Under a superpopulation model with a multivariate normal distribution, unbiasedness and consistency are examined for the conventional estimators (derived under the assumptions of simple random sampling), maximum likelihood estimators, probability-weighted estimators and conditionally unbiased estimators of parameters. Four corresponding sampled linear discriminant functions are examined. The rates of misclassification of these four discriminant functions and the effect of sample design on these four rates of misclassification are discussed. The performances of these four discriminant functions are assessed in a simulation study.

9.
Error rate is a popular criterion for assessing the performance of an allocation rule in discriminant analysis. Training samples which involve missing values cause problems for those error rate estimators that require all variables to be observed at all data points. This paper explores imputation algorithms, their effects on, and problems of implementing them with, eight commonly used error rate estimators (three parametric and five non-parametric) in linear discriminant analysis. The results indicate that imputation should not be based on the way error rate estimators are calculated, and that imputed values may underestimate error rates.

10.
Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high‐dimension low sample size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two‐class problems we modify Mika's kernel Fisher discriminant function which – in general – remains ill‐posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first‐ and second‐degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.

11.
The problem of updating discriminant functions estimated from inverse Gaussian populations is investigated in situations where the additional observations are either mixed (unclassified) or classified. In each case two types of discriminant functions, linear and quadratic, are considered. Using simulation experiments, the performance of the updating procedures is evaluated by means of relative efficiencies.

12.
The problem of updating a discriminant function on the basis of data of unknown origin is studied. There are observations of known origin from each of the underlying populations, and subsequently there is available a limited number of unclassified observations assumed to have been drawn from a mixture of the underlying populations. A sample discriminant function can be formed initially from the classified data. The question of whether the subsequent updating of this discriminant function on the basis of the unclassified data produces a reduction in the error rate of sufficient magnitude to warrant the computational effort is considered by carrying out a series of Monte Carlo experiments. The simulation results are contrasted with available asymptotic results.

13.
The location linear discriminant function is used in a two-population classification problem when the available data are generated from both binary and continuous random variables. The asymptotic distribution of the studentized location linear discriminant function is derived directly, without inverting the corresponding characteristic function. The resulting plug-in estimate of the overall error of misclassification consists of the estimate based on the limiting distribution of the discriminant plus a correction term up to the second order. By comparison, our estimate avoids the need for exact knowledge of the Mahalanobis distances, which is necessary when the expansions of Vlachonikolis (1985) are used in the case of an arbitrary cut-off point. An example is re-examined and analysed in the present context.

14.
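An Exploration of Bayes Discriminant Methods Based on the Fisher Transformation
Discriminant analysis is one of the three major methods of multivariate statistical analysis and is widely applied in many fields. Distance discrimination, Fisher discrimination, and Bayes discrimination are usually regarded as three different discriminant analysis methods. This study shows that distance discrimination and Bayes discrimination are the two substantive discriminant methods: the former actually relies on percentiles or confidence intervals, while the latter actually relies on probabilities. The well-known Fisher discrimination merely applies a linear transformation to the discriminating variables, following the idea of analysis of variance, and then uses the result for distance discrimination, so it cannot really be counted as a substantive discriminant method. This paper combines the Fisher transformation with Bayes discrimination: the Fisher transformation is applied first, and Bayes discrimination is then carried out by the maximum-probability principle, giving a new discriminant approach that can further improve discriminant efficiency. Theoretical and empirical analyses show that Bayes discrimination based on the Fisher transformation is applicable in a wide range of settings and achieves the highest discriminant efficiency.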

15.
One of the basic statistical methods of dimensionality reduction is the analysis of discriminant coordinates introduced by Fisher (1936) and Rao (1948). The space of discriminant coordinates is convenient for presenting multidimensional data originating from multiple groups and for the use of various classification methods (methods of discriminant analysis). In the present paper, we adapt the classical discriminant coordinates analysis to multivariate functional data. The theory has been applied to the analysis of textural properties of apples of six varieties, measured over a period of 180 days and stored in two types of refrigeration chamber.

16.
This paper studies the application of genetic algorithms to the construction of exact D-optimal experimental designs. The concept of genetic algorithms is introduced in the general context of the problem of finding optimal designs. The algorithm is then applied specifically to finding exact D-optimal designs for three different types of model. The performance of genetic algorithms is compared with that of the modified Fedorov algorithm in terms of computing time and relative efficiency. Finally, potential applications of genetic algorithms to other optimality criteria and to other types of model are discussed, along with some open problems for possible future research.

17.
The problem of discrimination between two stationary ARMA time series models is considered, in particular AR(p), MA(p), and ARMA(1,1) models. The discriminant based on the likelihood ratio leads to a quadratic form that is generally too complicated to evaluate explicitly. The discriminant can be expressed approximately as a linear combination of independent chi-squared random variables, each with one degree of freedom, whose coefficients are eigenvalues of cumbersome matrices. An analytical solution that gives the coefficients approximately is suggested.

18.
In this article, a variable selection procedure, called surrogate selection, is proposed which can be applied when a support vector machine or kernel Fisher discriminant analysis is used in a binary classification problem. Surrogate selection applies the lasso after substituting the kernel discriminant scores for the binary group labels, as well as values for the input variable observations. Empirical results are reported, showing that surrogate selection performs well.

19.
This paper discusses a supervised classification approach for the differential diagnosis of Raynaud's phenomenon (RP). The classification of data from healthy subjects and from patients suffering from primary and secondary RP is obtained by means of a set of classifiers derived within the framework of linear discriminant analysis. A set of functional variables and shape measures extracted from rewarming/reperfusion curves are proposed as discriminant features. Since the prediction of group membership is based on a large number of these features, the high dimension/small sample size problem is considered to overcome the singularity problem of the within-group covariance matrix. Results on a data set of 72 subjects demonstrate that a satisfactory classification of the subjects can be achieved through the proposed methodology.

20.
In this paper, we describe some results of an ESPRIT project known as StatLog whose purpose is the comparison of classification algorithms. We give a brief summary of some of the algorithms in the project: discriminant analysis; nearest neighbours; decision trees; neural net methods; SMART; kernel methods and other Bayesian approaches. We focus on data sets derived from images, ranging from raw pixel data to features and summaries extracted from such data.
