Similar literature
20 similar articles found (search time: 171 ms)
1.
A procedure is presented for finding maximum likelihood estimates of the parameters of a mixture of two random walk distributions in two cases: with classified and with unclassified observations. Estimation of nonlinear discriminant functions is considered for small sample sizes. The performance of the corresponding estimated nonlinear discriminant functions is investigated through simulation experiments. The total probabilities of misclassification and percentage biases are evaluated and discussed.

2.
Fisher's linear discriminant function, adapted by Anderson for allocating new observations into one of two existing groups, is considered in this paper. Methods of estimating the misclassification error rates are reviewed and evaluated by Monte Carlo simulations. The investigation is carried out under both ideal (multivariate normal data) and non-ideal (multivariate binary data) conditions. The assessment is based on the usual mean square error (MSE) criterion and also on a new criterion of optimism. The results show that although there is a common cluster of good estimators for both ideal and non-ideal conditions, the single best estimators vary with respect to the different criteria.
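
To make the notion of "optimism" in error-rate estimation concrete, the following is a minimal sketch, not the estimators studied in the paper. It assumes two groups, a pooled-covariance linear discriminant, and simulated bivariate normal data; the function names (`fit_ldf`, `apparent_error`, `loo_error`) and the simulation settings are hypothetical. It compares the apparent (resubstitution) error rate, which tends to be optimistic, with a leave-one-out estimate.

```python
import numpy as np

def fit_ldf(X0, X1):
    """Fit a two-group linear discriminant using the pooled sample covariance."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    S = ((n0 - 1) * np.cov(X0, rowvar=False)
         + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(S, m1 - m0)   # discriminant direction
    c = w @ (m0 + m1) / 2             # midpoint cut-off (equal priors and costs assumed)
    return w, c

def predict(w, c, X):
    """Allocate each row of X to group 1 if its score exceeds the cut-off."""
    return (X @ w > c).astype(int)

def apparent_error(X, y):
    """Resubstitution error: the rule is tested on its own training data."""
    w, c = fit_ldf(X[y == 0], X[y == 1])
    return float(np.mean(predict(w, c, X) != y))

def loo_error(X, y):
    """Leave-one-out error: each point is classified by a rule fitted without it."""
    n = len(y)
    wrong = 0
    for i in range(n):
        keep = np.arange(n) != i
        Xt, yt = X[keep], y[keep]
        w, c = fit_ldf(Xt[yt == 0], Xt[yt == 1])
        wrong += int(predict(w, c, X[i:i + 1])[0] != y[i])
    return wrong / n

# Hypothetical simulation settings: 30 points per group, unit covariance.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(1.5, 1.0, (30, 2))])
y = np.repeat([0, 1], 30)
err_app = apparent_error(X, y)
err_loo = loo_error(X, y)
```

Repeating such a simulation many times and comparing each estimate with the true error rate is one way the MSE of competing estimators could be assessed.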

3.
Class-specific stratified posterior probability estimators of misclassification probabilities in discriminant analysis simulations are introduced. These estimators afford a significant variance reduction over the usual count estimators. Sufficient conditions for a variance reduction are given. The stratified posterior probability estimator is generalized to other class-specific expectations.

4.
This article investigates the possible use of our newly defined extended projection depth (abbreviated to EPD) in nonparametric discriminant analysis. We propose a robust nonparametric classifier, which relies on the intuitively simple notion of EPD. The EPD-based classifier assigns an observation to the population with respect to which it has the maximum EPD. Asymptotic properties of the misclassification rates and robustness properties of the EPD-based classifier are discussed. A few simulated data sets are used to compare the performance of the EPD-based classifier with Fisher's linear discriminant rule, the quadratic discriminant rule, and the PD-based classifier. It is also found that when the underlying distributions are elliptically symmetric, the EPD-based classifier is asymptotically equivalent to the optimal Bayes classifier.
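
The maximum-depth allocation rule described in this abstract can be written compactly. The notation is an assumption made for illustration, since the abstract gives no symbols: $F_1, \dots, F_J$ denote the competing populations and $\mathrm{EPD}(x; F_j)$ the extended projection depth of an observation $x$ with respect to $F_j$.

```latex
\hat{k}(x) = \arg\max_{1 \le j \le J} \mathrm{EPD}(x; F_j)
```

In a sample version of the rule, each $F_j$ would presumably be replaced by the empirical distribution of the training sample from population $j$.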

5.
K. Fischer, Chr. Thiele. Statistics, 2013, 47(2): 281-289
Linear discriminant rules for two symmetrical distributions, which only need the first and second moments of these distributions, are presented. The rules are based on Zhezhel's idea of using the most unfavourable probabilities of misclassification as an optimality criterion. A rule is also considered for distributions that differ in a location and scale parameter.

6.
The purpose of this paper is to examine the multiple group (>2) discrimination problem in which the group sizes are unequal and the variables used in the classification are correlated with skewed distributions. Using statistical simulation based on data from a clinical study, we compare the performances, in terms of misclassification rates, of nine statistical discrimination methods. These methods are linear and quadratic discriminant analysis applied to untransformed data, rank transformed data, and inverse normal scores data, as well as fixed kernel discriminant analysis, variable kernel discriminant analysis, and variable kernel discriminant analysis applied to inverse normal scores data. It is found that the parametric methods with transformed data generally outperform the other methods, and the parametric methods applied to inverse normal scores usually outperform the parametric methods applied to rank transformed data. Although the kernel methods often have very biased estimates, the variable kernel method applied to inverse normal scores data provides considerable improvement in terms of total nonerror rate.

7.
Parametric and nonparametric methods for estimating the error rates in linear discriminant analysis are examined in both normal and nonnormal situations. A Monte Carlo experiment was carried out under the assumption that two population distributions were characterized by a mixture of two multivariate normal distributions. The bootstrap bias-corrected apparent error rate compares favourably to other available estimators for nonnormal populations with small Mahalanobis distance. The methods for error estimation are also applied to a practical problem in medical diagnosis.

8.
A nonparametric discriminant analysis procedure that is robust to deviations from the usual assumptions is proposed. The procedure uses the projection pursuit methodology where the projection index is the two-group transvariation probability. We use allocation based on the centrality of the new point measured using a smooth version of point-group transvariation. It is shown that the new procedure provides lower misclassification error rates than competing methods for data from skewed heavy-tailed and skewed distributions as well as unequal training data sizes.

9.
The performance of the sample linear discriminant function with known, proportional, covariance matrices and equal but unknown mean vectors is considered. Unconditional misclassification rates are obtained from the Student-t distribution. These results can be used as an aid in verifying simulation programs incorporating the linear discriminant function when Gaussian densities with unequal covariance matrices are used.

10.
A researcher is often confronted with the difficult and subjective task of determining which of m models best fits a set of observed data. A general robust statistical procedure for model selection is examined which uses discriminant analysis on significance levels resulting from various tests of hypotheses concerning the models. The use of Monte Carlo simulation to obtain the significance levels associated with the tests is presented. The technique is illustrated by application to four band recovery models useful in wildlife studies. Error rates due to misclassification are also reported.

11.
The von Mises-Fisher distribution is widely used for modeling directional data. In this article, we derive discriminant rules based on this distribution to assign objects to pre-existing classes. We determine a distance between two von Mises-Fisher populations and we calculate estimates of the misclassification probabilities. We also analyze the behavior of the distance between two von Mises-Fisher populations and of the estimates of the misclassification probabilities when we modify the parameters of the populations, the sample sizes, or the dimension of the sphere. Finally, we present an example with real spherical data available in the literature.

12.
For two or more populations whose covariance matrices have a common set of eigenvectors but different sets of eigenvalues, the common principal components (CPC) model is appropriate. Pepler et al. (2015; Pepler, P. T., Uys, D. W. and Nel, D. G., Regularised covariance matrix estimation under the common principal components model, Communications in Statistics: Simulation and Computation, in press) proposed a regularized CPC covariance matrix estimator and showed that this estimator outperforms the unbiased and pooled estimators in situations where the CPC model is applicable. This article extends their work to the context of discriminant analysis for two groups, by plugging the regularized CPC estimator into the ordinary quadratic discriminant function. Monte Carlo simulation results show that CPC discriminant analysis offers significant improvements in misclassification error rates in certain situations, and at worst performs similarly to ordinary quadratic and linear discriminant analysis. Based on these results, CPC discriminant analysis is recommended for situations where the sample size is small compared to the number of variables, in particular for cases where there is uncertainty about the population covariance matrix structures.
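
The plug-in idea in this abstract can be sketched in formulas. The notation is an assumption for illustration and is not taken from the paper: $B$ is the common orthogonal matrix of eigenvectors, $\Lambda_i$ the diagonal matrix of eigenvalues for group $i$, $\bar{x}_i$ the sample mean, $\pi_i$ the prior probability, and $\hat{\Sigma}_i$ any covariance estimate respecting the CPC structure (for example, a regularized one). The quadratic discriminant score is given in its standard textbook form, which may differ from the exact form used by the authors.

```latex
\Sigma_i = B \Lambda_i B^{\top}, \qquad i = 1, 2,
\\[4pt]
Q_i(x) = -\tfrac{1}{2} \ln \bigl| \hat{\Sigma}_i \bigr|
         - \tfrac{1}{2} (x - \bar{x}_i)^{\top} \hat{\Sigma}_i^{-1} (x - \bar{x}_i)
         + \ln \pi_i .
```

An observation $x$ is then allocated to the group with the larger score $Q_i(x)$.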

13.
There is a large literature on estimation under misclassification. The present paper reviews epidemiologic inference under misclassification in the multiway contingency-table setting, and addresses a few controversial issues. In the 1990s, claims of inefficiency of early closed-form estimators of odds ratios under misclassification arose from misapplication of the estimators to studies with internal validation. In reality, these estimators are maximum likelihood (ML) and hence efficient under the external-validation assumptions used for their derivation. For the internal-validation case, a new closed-form estimator is derived that incorporates the nondifferentiality constraint into the predictive-value (“direct” or “inverse-matrix”) estimator. Results are presented in a general framework that applies to misclassification in models for multiway tables, and that allows the target parameter to be any measure of association or effect.

14.
Kernel discriminant analysis translates the original classification problem into feature space and solves the problem with dimension and sample size interchanged. In high‐dimension low sample size (HDLSS) settings, this reduces the ‘dimension’ to that of the sample size. For HDLSS two‐class problems we modify Mika's kernel Fisher discriminant function which – in general – remains ill‐posed even in a kernel setting; see Mika et al. (1999). We propose a kernel naive Bayes discriminant function and its smoothed version, using first‐ and second‐degree polynomial kernels. For fixed sample size and increasing dimension, we present asymptotic expressions for the kernel discriminant functions, discriminant directions and for the error probability of our kernel discriminant functions. The theoretical calculations are complemented by simulations which show the convergence of the estimators to the population quantities as the dimension grows. We illustrate the performance of the new discriminant rules, which are easy to implement, on real HDLSS data. For such data, our results clearly demonstrate the superior performance of the new discriminant rules, and especially their smoothed versions, over Mika's kernel Fisher version, and typically also over the commonly used naive Bayes discriminant rule.

15.
Multiple discriminant analysis (MDA) is a frequently used statistical technique. Although the dependence of this technique on the underlying assumptions concerning population priors and misclassification costs is well known, the assumption most often made by researchers is that both population priors and misclassification costs are equal. The purpose of this paper is to demonstrate the magnitude of the effect of these assumptions on statistical results. In the savings and loan case used here, the population priors are known; however, the relative misclassification costs are not. To test the sensitivity of the results to the unknown misclassification costs, several different misclassification cost assumptions are used.

16.
We investigate three interval estimators for binomial misclassification rates in a complementary Poisson model where the data are possibly misclassified: a Wald-based interval, a score-based interval, and an interval based on the profile log-likelihood statistic. We investigate the coverage and average width properties of these intervals via a simulation study. For small Poisson counts and small misclassification rates, the intervals can perform poorly in terms of coverage. The profile log-likelihood confidence interval (CI) often outperforms the other intervals, with good coverage and width properties. Lastly, we apply the CIs to a real data set involving traffic accident data that contain misclassified counts.
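
As a rough illustration of why Wald-type intervals can behave poorly for small counts and small rates, the sketch below compares the textbook Wald and Wilson score intervals for a plain binomial proportion. This is a deliberately simplified stand-in: it is not the complementary Poisson model with misclassification used in the paper, and the function names and the example counts are hypothetical.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval

def wald_interval(x, n, z=Z95):
    """Textbook Wald interval for a binomial proportion x / n, clipped to [0, 1]."""
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

def score_interval(x, n, z=Z95):
    """Wilson score interval for a binomial proportion x / n, clipped to [0, 1]."""
    p = x / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical example: no misclassified units observed among 20.
wald_zero = wald_interval(0, 20)
score_zero = score_interval(0, 20)
```

With zero observed events the Wald interval collapses to a single point at zero, whereas the score interval still has positive width, which is one reason coverage comparisons of this kind are informative.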

17.
Empirical influence functions for the Mahalanobis distance and for misclassification rates are presented for discriminant analysis with two multivariate normal populations, following Campbell (1978). Conclusions about the effects of outliers from the empirical influence function are contrasted with exact calculations for four simple cases. These cases demonstrate that the higher-order terms discarded in deriving the empirical influence function can be important in practical problems.

18.
Summary.  An authentic food is one that is what it purports to be. Food processors and consumers need to be assured that, when they pay for a specific product or ingredient, they are receiving exactly what they pay for. Classification methods are an important tool in food authenticity studies where they are used to assign food samples of unknown type to known types. A classification method is developed where the classification rule is estimated by using both the labelled and the unlabelled data, in contrast with many classical methods which use only the labelled data for estimation. This methodology models the data as arising from a Gaussian mixture model with parsimonious covariance structure, as is done in model-based clustering. A missing data formulation of the mixture model is used and the models are fitted by using the EM and classification EM algorithms. The methods are applied to the analysis of spectra of foodstuffs recorded over the visible and near infra-red wavelength range in food authenticity studies. A comparison of the performance of model-based discriminant analysis and the proposed classification method is given. The proposed classification method is shown to yield very good misclassification rates. The correct classification rate was observed to be as much as 15% higher than the correct classification rate for model-based discriminant analysis.

19.
Fast and robust bootstrap (total citations: 1; self-citations: 0; citations by others: 1)
In this paper we review recent developments on a bootstrap method for robust estimators which is computationally faster and more resistant to outliers than the classical bootstrap. This fast and robust bootstrap method is, under reasonable regularity conditions, asymptotically consistent. We describe the method in general and then consider its application to perform inference based on robust estimators for the linear regression and multivariate location-scatter models. In particular, we study confidence and prediction intervals and tests of hypotheses for linear regression models, inference for location-scatter parameters and principal components, and classification error estimation for discriminant analysis.

20.
We formulate closed-form Bayesian estimators for two complementary Poisson rate parameters using double sampling with data subject to misclassification and error-free data. We also derive closed-form Bayesian estimators for two misclassification parameters in the modified Poisson model we assume. We use our results to determine credible sets for the rate and misclassification parameters. Additionally, we use MCMC methods to determine Bayesian estimators for three or more rate parameters and the misclassification parameters. We also perform a limited Monte Carlo simulation to examine the characteristics of these estimators. We demonstrate the efficacy of the new Bayesian estimators and highest posterior density regions with examples using two real data sets.
