1.

A comparison of the classical and the linear programming approaches to the classification problem in discriminant analysis

《Journal of Statistical Computation and Simulation》2012,82(1-2):73-93

Several mathematical programming approaches to the classification problem in discriminant analysis have recently been introduced. This paper empirically compares these newly introduced classification techniques with Fisher's linear discriminant analysis (FLDA), quadratic discriminant analysis (QDA), logit analysis, and several rank-based procedures for a variety of symmetric and skewed distributions. The percentage of observations correctly classified by each procedure in a holdout sample indicates that, although the linear programming approaches compete well with the classical procedures under some experimental conditions, their overall performance lags behind that of the classical procedures.

2.

Functional linear discriminant analysis for irregularly sampled curves

Gareth M. James & Trevor J. Hastie 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2001,63(3):533-550

We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.

3.

The efficiency of Efron's “Bootstrap” Approach Applied to Error Rate Estimation in Discriminant Analysis

《Journal of Statistical Computation and Simulation》2012,82(3-4):273-279

The “bootstrap” approach of Efron is considered in its application to the estimation of error rates in discriminant analysis. Its efficiency relative to parametric estimation is investigated by simulation for Fisher's linear discriminant function in the context of two multivariate normal populations with a common covariance matrix.
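
As a rough illustration of the idea in this abstract, the sketch below applies a bootstrap optimism correction to the apparent error rate of a discriminant rule. It is not the paper's simulation design: it assumes a single predictor, equal priors and a common variance (so the linear discriminant reduces to a nearest-group-mean rule), and it resamples within each group. The function names `fit_rule`, `error_rate` and `bootstrap_error` are hypothetical.

```python
import random
from statistics import mean


def fit_rule(xs, ys):
    """Estimate the two group means (labels 0 and 1) from training data."""
    m0 = mean(x for x, y in zip(xs, ys) if y == 0)
    m1 = mean(x for x, y in zip(xs, ys) if y == 1)
    return m0, m1


def error_rate(rule, xs, ys):
    """Fraction of observations assigned to the wrong group by the rule."""
    m0, m1 = rule
    wrong = sum((abs(x - m1) < abs(x - m0)) != (y == 1) for x, y in zip(xs, ys))
    return wrong / len(xs)


def bootstrap_error(xs, ys, n_boot=200, seed=0):
    """Apparent error rate plus the average bootstrap optimism."""
    rng = random.Random(seed)
    apparent = error_rate(fit_rule(xs, ys), xs, ys)
    idx0 = [i for i, y in enumerate(ys) if y == 0]
    idx1 = [i for i, y in enumerate(ys) if y == 1]
    optimism = 0.0
    for _ in range(n_boot):
        # Assumption: resample within each group so neither group is empty.
        idx = [rng.choice(idx0) for _ in idx0] + [rng.choice(idx1) for _ in idx1]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        rule = fit_rule(bx, by)
        # Optimism: error on the original sample minus error on the resample.
        optimism += error_rate(rule, xs, ys) - error_rate(rule, bx, by)
    return apparent + optimism / n_boot
```

Fixing `seed` makes the estimate reproducible; a multivariate version would replace `fit_rule` with a fitted linear discriminant function.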

4.

Minimum Sample Size Considerations for Two-Group Linear and Quadratic Discriminant Analysis with Rare Populations

Shannon Zavorka Jamis J. Perrett 《Communications in Statistics - Simulation and Computation》2013,42(7):1726-1739

Linear discriminant analysis and quadratic discriminant analysis are used to predict group membership. Rare populations present situations in which group sizes differ drastically. This article examined k = 2 and k = 4 predictor variables for groups with different levels of rarity and different levels of sensitivity and specificity. Sample size recommendations were generated for both minimum and maximum group overlap using the leave-one-out (L-O-O) method of estimation. Minimum sample size recommendations are provided in tables for immediate implementation by applied researchers.
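
The leave-one-out estimate mentioned in this abstract can be sketched as follows. This is a minimal illustration only, assuming a single predictor and a nearest-group-mean rule in place of a full linear or quadratic discriminant; the function name `loo_error` is hypothetical, and each group needs at least two observations so that it is never empty after one is held out.

```python
from statistics import mean


def loo_error(xs, ys):
    """Leave-one-out misclassification rate for a nearest-group-mean rule."""
    wrong = 0
    for i in range(len(xs)):
        # Refit the rule on every observation except the i-th one.
        train = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j != i]
        m0 = mean(x for x, y in train if y == 0)
        m1 = mean(x for x, y in train if y == 1)
        # Classify the held-out observation and count an error if it is wrong.
        pred = 1 if abs(xs[i] - m1) < abs(xs[i] - m0) else 0
        wrong += pred != ys[i]
    return wrong / len(xs)
```

With a rare group, the held-out point can shift that group's mean noticeably, which is one reason minimum sample sizes matter for this estimator.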

5.

Forward and backward stepping in variable selection

《Journal of Statistical Computation and Simulation》2012,82(3-4):177-185

For stepwise regression and discriminant analysis, the parameters F_in and F_out govern the inclusion and deletion of variables. The candidate variable with the largest F-ratio is included if this exceeds F_in; the included variable with the smallest F-ratio is deleted if this is less than F_out. If F_in ≥ F_out, then a return to a previous subset size implies improvement in the criterion measure. This result also holds for a generalization, stepwise multivariate analysis, which includes stepwise regression and discriminant analysis as special cases.

Eliminations do not occur if forward regression and backward elimination yield the same sequence of subsets. Conversely, there is a more liberal stepping rule which always eliminates if the two sequences differ.
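
The stepping rule described in this abstract can be sketched as a single decision function. This is an illustration under stated assumptions, not the paper's algorithm: the F-ratios are taken as given inputs, the deletion threshold is taken to be F_out, the deletion check is performed before the inclusion check, and the function name `stepwise_step` is hypothetical.

```python
def stepwise_step(f_included, f_candidates, f_in, f_out):
    """Decide one stepping action from precomputed F-ratios.

    f_included:   dict mapping each included variable to its F-ratio
    f_candidates: dict mapping each candidate variable to its F-ratio
    Returns ("delete", name), ("include", name) or ("stop", None).
    """
    # Assumption: check for a deletion first.
    if f_included:
        weakest = min(f_included, key=f_included.get)
        if f_included[weakest] < f_out:
            return ("delete", weakest)
    # Otherwise include the strongest candidate if it exceeds F_in.
    if f_candidates:
        strongest = max(f_candidates, key=f_candidates.get)
        if f_candidates[strongest] > f_in:
            return ("include", strongest)
    return ("stop", None)
```

In a full procedure the F-ratios would be recomputed after every inclusion or deletion before calling the function again.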

6.

An Optimal Semiparametric Method for Two‐group Classification

《Scandinavian Journal of Statistics》2018,45(3):806-846

In the classical discriminant analysis, when two multivariate normal distributions with equal variance–covariance matrices are assumed for two groups, the classical linear discriminant function is optimal with respect to maximizing the standardized difference between the means of two groups. However, for a typical case‐control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t‐statistic for two‐group classification. Biometrics 2015, 71: 404–416) proposed the generalized t‐statistic to obtain a linear discriminant function, which allows for heterogeneity of the case group. Their procedure has an optimality property in the class of consideration. We perform a further study of the problem and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in that no further improvement is possible to construct the linear discriminant function more efficiently. We conduct simulation studies and real data examples to illustrate the finite sample performance and the gain that it produces in comparison with existing methods.
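
For reference, the optimality of the classical linear discriminant function mentioned in the first sentence can be written out as follows. The notation is an assumption introduced here, not taken from the paper: group means \mu_0 and \mu_1, common covariance matrix \Sigma, and coefficient vector w.

```latex
% Standardized squared difference between the group means of the score w^T x
J(w) = \frac{\bigl(w^{\top}(\mu_1 - \mu_0)\bigr)^{2}}{w^{\top}\Sigma\, w},
\qquad
\arg\max_{w \neq 0} J(w) \;\propto\; \Sigma^{-1}(\mu_1 - \mu_0).
```

The maximizer is defined only up to a nonzero scalar multiple, since J(w) is unchanged when w is rescaled.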

7.

Estimation of relaxation time distributions in magnetic resonance imaging

Edward Susko Michael J. Bronskill Simon J. Graham Robert J. Tibshirani 《Revue canadienne de statistique》2001,29(3):379-394

Magnetic resonance imaging techniques can be used to measure some biophysical properties of tissue. In this context, the T₂ relaxation time is an important parameter for soft‐tissue contrast. The authors develop a new technique to estimate the integral of the distribution of T₂ relaxation time without imposing any constraint other than the monotonicity of the underlying cumulative relaxation time distribution. They explore the properties of the estimation and its applications for the analysis of breast tissue data. As they show, an extension of linear discriminant analysis is found to distinguish well between two classes of breast tissue.

8.

Selection without (unfair) discrimination

Thomes Johnson 《Communications in Statistics - Theory and Methods》2013,42(11):1079-1098

A method is devised for performing multiple discriminant analysis subject to inequality constraints on the probabilities of misassignment of different subpopulations. This procedure is motivated by attempts to devise fair means of selection of applicants for schools, jobs, and credit. An algorithm is developed and sample calculations are given.

9.

Comparisons of decision tree methods using water data

Muhammad Azam Muhammad Aslam Khushnoor Khan Anwar Mughal Awais Inayat 《Communications in Statistics - Simulation and Computation》2017,46(4):2924-2934

This article demonstrates the application of classification trees (decision trees), logistic regression (LR), and linear discriminant analysis (LDA) to classify data on water quality (i.e., whether the water is fit or not fit for drinking). The data on water quality were obtained from the Pakistan Council of Research in Water Resources (PCRWR) for two cities of Pakistan, one representing an industrial environment (Sialkot) and the other a non-industrial environment (Narowal). To classify the data, three statistical tools were employed using R software libraries: the decision tree methodology with the Gini index, LR, and LDA. The results obtained by the three techniques were compared using misclassification rates (a model with a lower misclassification rate is better). LR performed better than the other two techniques, while decision trees and LDA performed equally well. For illustration purposes, however, decision trees are comparatively easy to draw and interpret.

10.

New methods for fitting multiple sinusoids from irregularly sampled data

Sébastien Bourguignon Hervé Carfantan 《Statistical Methodology》2008,5(4):318-327

A novel framework is proposed for the estimation of multiple sinusoids from irregularly sampled time series. This spectral analysis problem is addressed as an under-determined inverse problem, where the spectrum is discretized on an arbitrarily thin frequency grid. As we focus on line spectra estimation, the solution must be sparse, i.e. the amplitude of the spectrum must be zero almost everywhere. Such prior information is taken into account within the Bayesian framework. Two models are used to account for the prior sparseness of the solution, namely a Laplace prior and a Bernoulli–Gaussian prior, associated with optimization and stochastic sampling algorithms, respectively. Such approaches are efficient alternatives to usual sequential prewhitening methods, especially in the case of strong sampling aliases perturbing the Fourier spectrum. Both methods should be intensively tested on real data sets by physicists.

11.

An asymptotic approximation for EPMC in linear discriminant analysis based on monotone missing data

Nobumichi Shutoh 《Journal of statistical planning and inference》2012,142(1):110-125

In this paper, we propose an asymptotic approximation for the expected probabilities of misclassification (EPMC) in the linear discriminant function on the basis of k-step monotone missing training data for general k. We derive certain relations of the statistics in order to obtain the approximation. Finally, we perform Monte Carlo simulation to evaluate the accuracy of our result and to compare it with existing approximations.

12.

Discriminant analysis with stratified prior probabilities

Berry Wilson 《Communications in Statistics - Theory and Methods》2013,42(5):1283-1295

This study investigates the use of stratification to improve discrimination when prior probabilities vary across strata of a population of interest. Sources of heterogeneity in prior probabilities include differences in geographic locale, age differences in the population studied, or differences in the time component of the data collected. The article suggests using logistic regression both to identify the underlying stratification and to estimate prior probabilities. A simulation study compares misclassification rates under two alternative stratification schemes with the traditional discriminant approach that ignores stratification in favor of pooled prior estimates. The simulations show that large asymptotic gains can be realized by stratification, and that these gains can be realized in finite samples, given moderate differences in prior probabilities.
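
One way to see where stratum-specific priors enter a linear discriminant rule is the standard discriminant score below. The notation is an assumption made for illustration and may differ from the article's formulation: \mu_k is the mean of group k, \Sigma the common covariance matrix, and \pi_{k \mid s} the prior probability of group k within stratum s.

```latex
% Linear discriminant score of group k for an observation x from stratum s
\delta_k(x \mid s) = x^{\top}\Sigma^{-1}\mu_k
  - \tfrac{1}{2}\,\mu_k^{\top}\Sigma^{-1}\mu_k
  + \log \pi_{k \mid s},
\qquad
\hat{g}(x, s) = \arg\max_{k}\, \delta_k(x \mid s).
```

The pooled approach corresponds to replacing \pi_{k \mid s} with a single prior \pi_k for every stratum, so the two rules differ only through the log-prior term.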

13.

Discriminant analyses of peanut allergy severity scores

O. Collignon J.-M. Monnez P. Vallois F. Codreanu J.-M. Renaudin G. Kanny 《Journal of applied statistics》2011,38(9):1783-1799

Peanut allergy is one of the most prevalent food allergies. The possibility of a lethal accidental exposure and the persistence of the disease make it a public health problem. Evaluating the intensity of symptoms is accomplished with a double blind placebo-controlled food challenge (DBPCFC), which scores the severity of reactions and measures the dose of peanut that elicits the first reaction. Since DBPCFC can result in life-threatening responses, we propose an alternate procedure with the long-term goal of replacing invasive allergy tests. Discriminant analyses of DBPCFC score, the eliciting dose and the first accidental exposure score were performed in 76 allergic patients using 6 immunoassays and 28 skin prick tests. A multiple factorial analysis was performed to assign equal weights to both groups of variables, and predictive models were built by cross-validation with linear discriminant analysis, k-nearest neighbours, classification and regression trees, penalized support vector machine, stepwise logistic regression and AdaBoost methods. We developed an algorithm for simultaneously clustering eliciting dose values and selecting discriminant variables. Our main conclusion is that antibody measurements offer information on the allergy severity, especially those directed against rAra-h1 and rAra-h3. Further independent validation of these results and the use of new predictors will help extend this study to clinical practices.

14.

On approximating distribution of the quadratic discriminant function

G. Rekabdar R. Chinipardaz B. Mansouri 《Communications in Statistics - Simulation and Computation》2017,46(5):3614-3626

The quadratic discriminant function (QDF) with known parameters has been represented in terms of a weighted sum of independent noncentral chi-square variables. To approximate the density function of the QDF as an m-dimensional exponential family, its moments of each order have been calculated. This is done using the recursive formula for the moments via Stein's identity in the exponential family. We validate the performance of our method using a simulation study and compare it with other methods in the literature on real data. The results reveal better estimation of misclassification probabilities and less computation time with our method.

15.

Sequential correction of linear classifiers

T. Górecki 《Journal of applied statistics》2013,40(4):763-776

In this article, a sequential correction of two linear methods: linear discriminant analysis (LDA) and perceptron is proposed. This correction relies on sequential joining of additional features on which the classifier is trained. These new features are posterior probabilities determined by a basic classification method such as LDA and perceptron. In each step, we add the probabilities obtained on a slightly different data set, because the vector of added probabilities varies at each step. We therefore have many classifiers of the same type trained on slightly different data sets. Four different sequential correction methods are presented based on different combining schemas (e.g. mean rule and product rule). Experimental results on different data sets demonstrate that the improvements are efficient, and that this approach outperforms classical linear methods, providing a significant reduction in the mean classification error rate.

16.

Optimal design for classification of functional data

Cai Li Luo Xiao 《Revue canadienne de statistique》2020,48(2):285-307

We study the design problem for the optimal classification of functional data. The goal is to select sampling time points so that functional data observed at these time points can be classified accurately. We propose optimal designs that are applicable to either dense or sparse functional data. Using linear discriminant analysis, we formulate our design objectives as explicit functions of the sampling points. We study the theoretical properties of the proposed design objectives and provide a practical implementation. The performance of the proposed design is evaluated through simulations and real data applications. The Canadian Journal of Statistics 48: 285–307; 2020 © 2019 Statistical Society of Canada

17.

Errors of misclassification in discrimination with data from truncated t populations

Apostolos Batsidis 《Statistical Papers》2012,53(2):281-298

The distribution of the probabilities of misclassification arising from the use of the linear discriminant function is derived in this paper. The statistical background is two independent doubly truncated t populations with distinct location parameters and common scale parameter and degrees of freedom. The behavior of the linear discriminant function is studied by comparing the distribution function of the errors of misclassification under the truncated t and truncated normal models.

18.

Comparison of algorithms for replacing missing data in discriminant analysis

J.Twedt Daniel D.S. Gill 《Communications in Statistics - Theory and Methods》2013,42(6):1567-1578

We examined the impact of different methods for replacing missing data in discriminant analyses conducted on randomly generated samples from multivariate normal and non-normal distributions. The probabilities of correct classification were obtained for these discriminant analyses before and after randomly deleting data as well as after deleted data were replaced using: (1) variable means, (2) principal component projections, and (3) the EM algorithm. Populations compared were: (1) multivariate normal with covariance matrices ∑₁=∑₂, (2) multivariate normal with ∑₁≠∑₂ and (3) multivariate non-normal with ∑₁=∑₂. Differences in the probabilities of correct classification were most evident for populations with small Mahalanobis distances or high proportions of missing data. The three replacement methods performed similarly but all were better than non-replacement.
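
The simplest of the three replacement methods, substitution of variable means, can be sketched as below. This is an illustrative assumption about the procedure rather than the authors' code: missing values are marked with `None`, means are taken over all observed values in a column (the abstract does not say whether means were computed within groups), and the function name `impute_column_means` is hypothetical.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        # Assumption: every column has at least one observed value.
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    # Build new rows so the input data are left unchanged.
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]
```

The completed rows can then be passed to any discriminant procedure that requires complete cases.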

19.

Bayesian Variable Selection Under Collinearity

Joyee Ghosh Andrew E. Ghattas 《The American statistician》2015,69(3):165-173

In this article, we highlight some interesting facts about Bayesian variable selection methods for linear regression models in settings where the design matrix exhibits strong collinearity. We first demonstrate via real data analysis and simulation studies that summaries of the posterior distribution based on marginal and joint distributions may give conflicting results for assessing the importance of strongly correlated covariates. The natural question is which one should be used in practice. The simulation studies suggest that posterior inclusion probabilities and Bayes factors that evaluate the importance of correlated covariates jointly are more appropriate, and some priors may be more adversely affected in such a setting. To obtain a better understanding behind the phenomenon, we study some toy examples with Zellner’s g-prior. The results show that strong collinearity may lead to a multimodal posterior distribution over models, in which joint summaries are more appropriate than marginal summaries. Thus, we recommend a routine examination of the correlation matrix and calculation of the joint inclusion probabilities for correlated covariates, in addition to marginal inclusion probabilities, for assessing the importance of covariates in Bayesian variable selection.

20.

Bayesian modelling of the mean and covariance matrix in normal nonlinear models

《Journal of Statistical Computation and Simulation》2012,82(6):837-853

An important problem in statistics is the study of longitudinal data taking into account the effect of other explanatory variables such as treatments and time. In this paper, a new Bayesian approach for analysing longitudinal data is proposed. This innovative approach takes into account the possibility of having nonlinear regression structures on the mean and linear regression structures on the variance–covariance matrix of normal observations, and it is based on the modelling strategy suggested by Pourahmadi [M. Pourahmadi, Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterizations, Biometrika, 87 (1999), pp. 667–690.]. We initially extend the classical methodology to accommodate the fitting of nonlinear mean models, and then we propose our Bayesian approach based on a generalization of the Metropolis–Hastings algorithm of Cepeda [E.C. Cepeda, Variability modeling in generalized linear models, Unpublished Ph.D. Thesis, Mathematics Institute, Universidade Federal do Rio de Janeiro, 2001]. Finally, we illustrate the proposed methodology by analysing one example, the cattle data set, that is used to study cattle growth.