Similar literature (20 results)
1.
Class specific stratified posterior probability estimators of misclassification probabilities in discriminant analysis simulations are introduced. These estimators afford a significant variance reduction over the usual count estimators. Sufficient conditions for a variance reduction are given. The stratified posterior probability estimator is generalized to other class specific expectations.

2.
In this paper the rank method for forced discrimination in two-population problems, introduced by Randles, Broffitt, Ramberg and Hogg (1978), is extended to cover settings involving more than two populations. Several methods of ranking are compared to the normal theory procedure in a Monte Carlo study. Asymptotic theory is included which confirms that the rank method does balance the limiting probabilities of misclassification in a two-population setting.

3.
Several methods have been proposed to estimate the misclassification probabilities when a linear discriminant function is used to classify an observation into one of several populations. We describe the application of bootstrap sampling to this problem. The proposed method has the advantage of furnishing not only estimates of the misclassification probabilities but also an estimate of their standard error. The method is illustrated by a small simulation experiment. It is then applied to three published, readily accessible data sets, which are typical of the large, medium and small data sets encountered in practice.

4.
In this paper, we propose an asymptotic approximation for the expected probabilities of misclassification (EPMC) in the linear discriminant function on the basis of k-step monotone missing training data for general k. We derive certain relations of the statistics in order to obtain the approximation. Finally, we perform Monte Carlo simulation to evaluate the accuracy of our result and to compare it with existing approximations.

5.
We have compared the efficacy of five imputation algorithms readily available in SAS for the quadratic discriminant function. Here, we have generated several different parametric-configuration training data with missing data, including monotone missing-at-random observations, and used a Monte Carlo simulation to examine the expected probabilities of misclassification for the two-class quadratic statistical discrimination problem under five different imputation methods. Specifically, we have compared the efficacy of the complete observation-only method and the mean substitution, regression, predictive mean matching, propensity score, and Markov Chain Monte Carlo (MCMC) imputation methods. We found that the MCMC and propensity score multiple imputation approaches are, in general, superior to the other imputation methods for the configurations and training-sample sizes we considered.

6.
When classification rules are constructed using sample estimates, it is known that the probability of misclassification is not minimized. This article introduces a biased minimum χ² rule to classify items from a multivariate normal population. Using the principle of variance reduction, the probability of misclassification is reduced when the biased procedure is employed. Results of sampling experiments over a broad range of conditions are provided to demonstrate this improvement.

7.
The quadratic discriminant function is commonly used for the two group classification problem when the covariance matrices in the two populations are substantially unequal. This procedure is optimal when both populations are multivariate normal with known means and covariance matrices. This study examined the robustness of the QDF to non-normality. Sampling experiments were conducted to estimate expected actual error rates for the QDF when sampling from a variety of non-normal distributions. Results indicated that the QDF was robust to non-normality except when the distributions were highly skewed, in which case relatively large deviations from optimal were observed. In all cases studied the average probabilities of misclassification were relatively stable while the individual population error rates exhibited considerable variability.

8.
The von Mises-Fisher distribution is widely used for modeling directional data. In this article, we derive the discriminant rules based on this distribution to assign objects into pre-existing classes. We determine a distance between two von Mises-Fisher populations and we calculate estimates of the misclassification probabilities. We also analyze how this distance and the estimated misclassification probabilities behave when we modify the parameters of the populations, the sample size, or the dimension of the sphere. Finally, we present an example with real spherical data available in the literature.

9.
In simulation studies for discriminant analysis, misclassification errors are often computed using the Monte Carlo method, by testing a classifier on large samples generated from known populations. Although large samples are expected to behave closely to the underlying distributions, they may not do so in a small interval or region, and thus may lead to unexpected results. We demonstrate with an example that the LDA misclassification error computed via the Monte Carlo method may often be smaller than the Bayes error. We give a rigorous explanation and recommend a method to properly compute misclassification errors.

10.
Fisher's linear discriminant analysis (FLDA) is known as a method to find a discriminative feature space for multi-class classification. As a theory of extending FLDA to an ultimate nonlinear form, optimal nonlinear discriminant analysis (ONDA) has been proposed. ONDA indicates that the best theoretical nonlinear map for maximizing Fisher's discriminant criterion is formulated by using the Bayesian a posteriori probabilities. In addition, the theory proves that FLDA is equivalent to ONDA when the Bayesian a posteriori probabilities are approximated by linear regression (LR). Due to some limitations of the linear model, there is room to modify FLDA by using stronger approximation/estimation methods. For the purpose of probability estimation, multinomial logistic regression (MLR) is more suitable than LR. Along this line, we develop a nonlinear discriminant analysis (NDA) in which the posterior probabilities in ONDA are estimated by MLR. We also develop a way to introduce sparseness into discriminant analysis. By applying L1 or L2 regularization to LR or MLR, we can incorporate sparseness in FLDA and our NDA to increase generalization performance. The performance of these methods is evaluated by benchmark experiments using standard datasets and a face classification experiment.

11.
The quadratic discriminant function (QDF) with known parameters has been represented in terms of a weighted sum of independent noncentral chi-square variables. To approximate the density function of the QDF as an m-dimensional exponential family, its moments of each order have been calculated. This is done using the recursive formula for the moments via Stein's identity in the exponential family. We validate the performance of our method using a simulation study and compare it with other methods in the literature on real data. The results reveal better estimation of misclassification probabilities and less computation time with our method.

12.
The empirical influence functions for Mahalanobis distance and for misclassification rates are presented for discriminant analysis with two multivariate normal populations, following Campbell (1978). Conclusions about the effects of outliers from the empirical influence function are contrasted with exact calculations for four simple cases. These cases demonstrate that the higher-order terms discarded in deriving the empirical influence function can be important in practical problems.

13.
The influence of observations in estimating the misclassification probability in multiple discriminant analysis is studied using the common omission approach. An empirical influence function for the misclassification probability is also derived. It can give a very good approximation to the omission approach, but the computational load is much reduced. Various extensions of the measures are suggested. The proposed measures are applied to the famous Iris data set. The same three observations are identified as having the most influence under different measures.

14.
This article extends the work of DiPillo (1976) on the Biased Minimum χ² Rule. The optimum value of k (the biasing factor) is determined and the true probability of misclassification is found. The proportion improvements reported in the 1976 paper are shown to be conservative. Some suggestions for algorithms to determine the optimal value of k are presented.

15.
A random vector is assumed to have one of three known multivariate normal distributions with equal covariance matrices. It is desired to separate the three distributions by means of a single linear discriminant function. Such a function can lead to a classification rule. The function whose classification rule minimizes the average of the three probabilities of misclassification is found. Also the function is found whose rule minimizes the maximum of the three probabilities of misclassification.

16.
The main contribution of this paper is updating a nonlinear discriminant function on the basis of data of unknown origin. Specifically, a procedure is developed for updating the nonlinear discriminant function on the basis of two Burr Type III distributions (TBIIID) when the additional observations are mixed or classified. First, the nonlinear discriminant function of the assumed model is obtained. Then the total probabilities of misclassification are calculated. In addition, Monte Carlo simulation runs are used to compute the relative efficiencies in order to investigate the performance of the developed updating procedures. Finally, the results obtained in this paper are illustrated through real and simulated data sets.

17.
This article considers multinomial data subject to misclassification in the presence of covariates which affect both the misclassification probabilities and the true classification probabilities. A subset of the data may be subject to a secondary measurement according to an infallible classifier. Computations are carried out in a Bayesian setting where it is seen that the prior has an important role in driving the inference. In addition, a new and less problematic definition of nonidentifiability is introduced and is referred to as hierarchical nonidentifiability.

18.
This study investigates the use of stratification to improve discrimination when prior probabilities vary across strata of a population of interest. Sources of heterogeneity in prior probabilities include differences in geographic locale, age differences in the population studied, or differences in the time component of the data collected. The article suggests using logistic regression both to identify the underlying stratification and to estimate prior probabilities. A simulation study compares misclassification rates under two alternative stratification schemes with the traditional discriminant approach that ignores stratification in favor of pooled prior estimates. The simulations show that large asymptotic gains can be realized by stratification, and that these gains can be realized in finite samples, given moderate differences in prior probabilities.

19.
The classification of a random variable based on a mixture can be meaningfully discussed only if the class of all finite mixtures is identifiable. In this paper, we find the maximum-likelihood estimates of the parameters of the mixture of two inverse Weibull distributions by using classified and unclassified observations. Next, we estimate the nonlinear discriminant function of the underlying model. Also, we calculate the total probabilities of misclassification as well as the percentage bias. In addition, we investigate the performance of all results through a series of simulation experiments by means of relative efficiencies. Finally, we analyse some simulated and real data sets through the findings of the paper.

20.
K. Fischer, Chr. Thiele. Statistics, 2013, 47(2): 281-289
Linear discriminant rules for two symmetrical distributions, which only need the first and second moments of these distributions, are presented. The rules are based on Zhezhel's idea using the most unfavourable probabilities of misclassification as an optimality criterion. Also a rule is considered which deals with distributions differing in a location and scale parameter.
