1.

On the Spectral Decomposition in Normal Discriminant Analysis

Luca Bagnato Francesca Greselin Antonio Punzo 《Communications in Statistics - Simulation and Computation》2013,42(6):1471-1489

This article enlarges the set of covariance configurations on which classical linear discriminant analysis is based by considering the four models that arise from the spectral decomposition when the eigenvalue and/or eigenvector matrices are either allowed to vary between groups or held common. As in the classical approach, these configurations are assessed via a test on the training set. The discrimination rule is then built upon the configuration selected by the test, with or without the unlabeled data. Numerical experiments on simulated and real data have been performed to evaluate the gain of our proposal with respect to linear discriminant analysis.
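For orientation, the four configurations can be written out as below. This is only a sketch in assumed notation: the symbols \Sigma_k, \Gamma_k, \Lambda_k and the number of groups K do not appear in the abstract, and the short descriptions in the last column are informal labels rather than the authors' terminology.

```latex
% Assumed notation: \Sigma_k is the covariance matrix of group k,
% \Gamma_k its orthogonal matrix of eigenvectors,
% \Lambda_k its diagonal matrix of eigenvalues.
\[
\Sigma_k = \Gamma_k \Lambda_k \Gamma_k^{\top}, \qquad k = 1, \dots, K .
\]
\[
\begin{array}{lll}
\text{(1)} & \Lambda_k = \Lambda,\ \Gamma_k = \Gamma & \text{equal covariance matrices (classical linear case)} \\
\text{(2)} & \Lambda_k \text{ free},\ \Gamma_k = \Gamma & \text{common eigenvectors, group-specific eigenvalues} \\
\text{(3)} & \Lambda_k = \Lambda,\ \Gamma_k \text{ free} & \text{common eigenvalues, group-specific eigenvectors} \\
\text{(4)} & \Lambda_k \text{ free},\ \Gamma_k \text{ free} & \text{unrestricted covariance matrices}
\end{array}
\]
```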

2.

A study on discriminant analysis techniques applied to multivariate lognormal data

《Journal of Statistical Computation and Simulation》2012,82(1-2):79-100

The purpose of this paper is to examine the multiple group (>2) discrimination problem in which the group sizes are unequal and the variables used in the classification are correlated with skewed distributions. Using statistical simulation based on data from a clinical study, we compare the performances, in terms of misclassification rates, of nine statistical discrimination methods. These methods are linear and quadratic discriminant analysis applied to untransformed data, rank transformed data, and inverse normal scores data, as well as fixed kernel discriminant analysis, variable kernel discriminant analysis, and variable kernel discriminant analysis applied to inverse normal scores data. It is found that the parametric methods with transformed data generally outperform the other methods, and the parametric methods applied to inverse normal scores usually outperform the parametric methods applied to rank transformed data. Although the kernel methods often have very biased estimates, the variable kernel method applied to inverse normal scores data provides considerable improvement in terms of total nonerror rate.
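The inverse normal scores transformation mentioned above can be sketched as follows. The function name `inverse_normal_scores` and the offset `c = 3/8` (a Blom-type plotting position) are assumptions made for illustration; the abstract does not say which formula the authors used.

```python
from statistics import NormalDist


def inverse_normal_scores(values, c=3 / 8):
    """Replace each value by the standard normal quantile of its rank.

    Tied values receive the average of their ranks. The plotting position
    (rank - c) / (n - 2c + 1) with c = 3/8 is an assumed choice.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values in sorted order.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

The transformed variables would then be passed to a linear or quadratic discriminant rule in place of the raw, skewed measurements.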

3.

An Exploration of a Bayes Discriminant Method Based on the Fisher Transformation

杜子芳 (Du Zifang) 刘亚文 (Liu Yawen) 《统计研究》(Statistical Research) 2012,29(3):73-78

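Discriminant analysis is one of the three major multivariate statistical methods and is widely applied in many fields. Distance discrimination, Fisher discrimination, and Bayes discrimination are usually regarded as three different discriminant methods. This paper shows that distance discrimination and Bayes discrimination are the two substantive methods: the former in effect relies on percentiles or confidence intervals, the latter on probabilities. The well-known Fisher discrimination merely applies a linear transformation to the discriminating variables, following the idea of analysis of variance, and then uses the result for distance discrimination, so it cannot really be counted as a substantive discriminant method. This paper combines the Fisher transformation with Bayes discrimination: it first applies the Fisher transformation and then performs Bayes discrimination by the maximum-probability principle. This yields a new discrimination approach that can further improve discriminant efficiency. Theoretical and empirical analyses show that Bayes discrimination based on the Fisher transformation is widely applicable and achieves the highest discriminant efficiency.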

4.

A NEW APPROACH TO DISCRIMINATION AND CLASSIFICATION USING A HAUSDORFF TYPE DISTANCE

Sangit Chatterjee A. Narayanan 《Australian & New Zealand Journal of Statistics》1992,34(3):391-406

A new method of discrimination and classification based on a Hausdorff type distance is proposed. In two groups, the Hausdorff distance is defined as the sum of the furthest distance of the nearest elements of one set to another. This distance has some useful properties and is exploited in developing a discriminant criterion between individual objects belonging to two groups based on a finite number of classification variables. The discrimination criterion is generalized to more than two groups in a couple of ways. Several data sets are analysed and their classification accuracy is compared to that obtained from the linear discriminant function; the results are encouraging. The method is simple, lends itself to parallel computation and imposes less stringent conditions on the data.
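One reading of the distance described above can be sketched in a few lines. The function names are hypothetical, Euclidean distance is assumed, and the `classify` rule (assign a point to the group whose set is closest to it) is an illustrative assumption, not the criterion derived in the paper.

```python
import math


def directed(a_set, b_set):
    """Furthest distance from a point of a_set to its nearest point in b_set."""
    return max(min(math.dist(a, b) for b in b_set) for a in a_set)


def hausdorff_type(a_set, b_set):
    """Sum of the two directed distances, following the abstract's wording."""
    return directed(a_set, b_set) + directed(b_set, a_set)


def classify(x, groups):
    """Hypothetical rule: return the label of the group closest to {x}."""
    return min(groups, key=lambda label: hausdorff_type([x], groups[label]))
```

For a single point x, the distance to a group reduces to the nearest-neighbour distance plus the furthest-neighbour distance, so each new observation can be scored against each group independently.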

5.

A Correction Method for the Arch Effect in Quantification Theory Type II and Its Application

赵雪艳 (Zhao Xueyan) 《统计研究》(Statistical Research) 2020,37(6):106-118

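Correspondence analysis exhibits an "arch effect" when it quantifies qualitative data. Methods for correcting this arch effect have been studied extensively; they prevent potentially erroneous analytical results and matter for both theory and applications. Quantification Theory Type II is a discriminant analysis method for qualitative data that is widely used in China and abroad. Through extensive simulation, this paper finds that Quantification Theory Type II also exhibits an arch effect when quantifying qualitative data, which lowers the correct classification rate and fails to reproduce the information in the original data, leading to erroneous results that are inconsistent with that data. To correct the arch effect, a two-stage discriminant analysis method is proposed and compared with Quantification Theory Type II in terms of the correct classification rate and the degree to which the original data are reproduced. The two-stage method is also applied to personal credit rating, where its discriminant performance is found to be better than that of Quantification Theory Type II.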

6.

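A Comparison of Variable Selection in Logistic and Classification Tree Models: Based on a Response-Rate Analysis of Credit Card Mailing Campaigns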

谢远涛 (Xie Yuantao) 杨娟 (Yang Juan) 王稳 (Wang Wen) 《统计与信息论坛》(Statistics & Information Forum) 2011,26(6):96-101

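Using a response-rate analysis of credit card mailing campaigns, this paper discusses how the logistic model and the classification tree model differ in variable selection, and tries to explain the differences from several angles. The authors argue that no method is dominant in absolute terms; a suitable model has to be chosen in light of the specific setting and the characteristics of each model. The classification tree model tends to overfit the training set, is sensitive to the influence of individual variables, places more emphasis on risk factors in risk-factor analysis, and has a high recognition rate for outliers. The logistic model is easily affected by dependence among the explanatory variables and, once categorical variables are added, tends to select too many variables or factors; it is sensitive to outliers and insensitive to noise points. The difference in discriminant functions is the key factor behind the difference in variable selection.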

7.

Mutability of DNA base pairs: A statistical approach based on linear discrimination

Phaik Mooi Leong Stephan Morgenthaler 《Revue canadienne de statistique》1998,26(3):445-454

A mutational spectrum measures the frequency of mutations at particular base pairs along a given DNA sequence under the influence of a particular treatment. Such spectra have been measured for several genes, in various organisms and under different treatments. This article presents a method of analysis based on the statistical discrimination between mutable base pairs and stable base pairs. The coefficients of the discriminant function characterize the mutational spectrum induced by a given treatment on a given sequence. These coefficients can be interpreted by the user without specialized statistical knowledge.

8.

Classification and similarity analysis of fundamental frequency patterns in infant spoken language acquisition

Hiroko Kato Solvang Masanobu Taniguchi Tomohiro Nakatani Shigeaki Amano 《Statistical Methodology》2008,5(3):187-208

Fundamental frequency (F0) patterns, which indicate the vibration frequency of vocal cords, reflect the developmental changes in infant spoken language. In previous studies of developmental psychology, however, F0 patterns were manually classified into subjectively specified categories. Furthermore, since F0 has sequentially missing values and exhibits mean nonstationarity, classification that employs subsequent partition and conventional discriminant analysis based on stationary and locally stationary processes is considered inadequate. Consequently, we propose a classification method based on discriminant analysis of time series data with mean nonstationarity and sequentially missing values, and a measurement technique for investigating the configuration similarities for classification. Using our proposed procedures, we analyse a longitudinal database of recorded conversations between infants and parents over a five-year period. Various F0 patterns were automatically classified into appropriate pattern groups, and the classification similarities were calculated. These similarities gradually decreased with the infant's age in months until a large change occurred around 20 months. The results suggest that our proposed methods are useful for analysing large-scale data and can contribute to studies of infant spoken language acquisition.

9.

Discrimination of AR, MA and ARMA time series models

H.T. Chan R. Chinipardaz T.F. Cox 《Communications in Statistics - Theory and Methods》2013,42(6):1247-1260

The problem of discrimination between two stationary ARMA time series models is considered, in particular AR(p), MA(p), and ARMA(1,1) models. The discriminant based on the likelihood ratio leads to a quadratic form that is generally too complicated to evaluate explicitly. The discriminant can be expressed approximately as a linear combination of independent chi-squared random variables, each with one degree of freedom, the coefficients of which are eigenvalues of cumbersome matrices. An analytical solution which gives the coefficients approximately is suggested.

10.

Categorical variable selection based on entropy reduction

William M. Stanish Randy U. Allred 《Communications in Statistics - Theory and Methods》2013,42(17):1733-1750

This paper presents the derivation of a categorical variable selection technique which utilizes the entropy function as a measure of variability for nominally scaled variables. The selection criterion uses likelihood ratio statistics which, for the hypotheses under consideration, are identical to minimum discrimination information statistics. Thus, the paper provides an alternative motivation for a selection technique based on discriminatory power, and it provides an extension of that technique to the multipopulation discrimination problem. The selection technique is illustrated for a study in which we discriminate among three populations: cervical cancer patients, population-based controls, and hospital-based controls.
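The idea of ranking categorical variables by how much they reduce the entropy of the population label can be sketched as below. This is a simplified, single-step illustration with hypothetical function names; it does not reproduce the paper's likelihood ratio tests or any stepwise procedure.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def entropy_reduction(x, y):
    """H(y) - H(y | x) for two equal-length categorical sequences."""
    n = len(y)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(y) - conditional


def best_variable(candidates, y):
    """Name of the candidate variable giving the largest entropy reduction."""
    return max(candidates, key=lambda name: entropy_reduction(candidates[name], y))
```

With observed frequencies plugged in, 2n times this entropy reduction equals the likelihood ratio statistic for independence in the corresponding two-way table, which is one way to see the link between entropy and likelihood ratio statistics that the abstract mentions.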

11.

Measuring the Size of China's Underground Economy, 1979-2009, and Analyzing Its Impact

苏飞 (Su Fei) 胡艳 (Hu Yan) 《统计与信息论坛》(Statistics & Information Forum) 2012,27(4):60-66

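Using a MIMIC model and statistical data for 1979-2009, this paper treats the underground economy as a latent variable, with tax burden, crime rate, unemployment rate, government regulation, inflation, and household income as its causal variables and with currency in circulation and the self-employment rate as its indicator variables. The results show that China's underground economy has grown substantially: its share was only 0.78% in 1979 but reached 19.93% in 2009. The size of the underground economy and the official economic growth rate are mutually causal, and the underground economy has a certain positive effect on the official economy. The causality between the size of the underground economy and the household income gap runs one way: the expansion of the underground economy has aggravated income inequality among Chinese residents at the present stage.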

12.

Classification of cyclical time series using complex demodulation

Elizabeth Ann Maharaj 《Statistics and Computing》2014,24(6):1031-1046

A new and innovative procedure based on time varying amplitudes for the classification of cyclical time series is proposed. In many practical situations, the amplitude of a cyclical component of a time series is not constant. Estimated time varying amplitudes obtained through complex demodulation of the time series are used as the discriminating variables in classical discriminant analysis. The aim of this paper is to demonstrate, through simulation studies and applications to well-known data sets, that time varying amplitudes have very good discriminating power and hence that their use in classical discriminant analysis is a simple alternative to more complex methods of time series discrimination.
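Complex demodulation itself can be sketched briefly. The version below assumes an integer cycle length greater than 2 and uses a plain moving average over one full cycle as the low-pass filter; both the function name and the filter choice are assumptions for illustration, not the paper's procedure.

```python
import cmath
import math


def demodulated_amplitude(x, period):
    """Estimate the time-varying amplitude of the cycle of length `period`.

    `period` is assumed to be an integer number of observations (> 2).
    Returns one estimate per window of `period` consecutive observations.
    """
    omega = 2 * math.pi / period
    # Step 1: shift the frequency of interest down to zero.
    shifted = [xt * cmath.exp(-1j * omega * t) for t, xt in enumerate(x)]
    # Step 2: low-pass filter with a moving average over one full cycle,
    # which removes the component left at twice the original frequency.
    amplitudes = []
    for start in range(len(x) - period + 1):
        smooth = sum(shifted[start:start + period]) / period
        # Step 3: the smoothed series carries half the amplitude.
        amplitudes.append(2 * abs(smooth))
    return amplitudes
```

The resulting amplitude series, or summaries of it, would then serve as the discriminating variables in a classical discriminant analysis.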

13.

On cross-validation for discrete kernel estimates in discrimination

Gerhard Tutz 《Communications in Statistics - Theory and Methods》2013,42(11):4145-4162

The choice of smoothing determines the properties of nonparametric estimates of probability densities. In the discrimination problem, the choice is often tied to loss functions. A framework for the cross-validatory choice of smoothing parameters based on general loss functions is given. Several loss functions are considered as special cases. In particular, a family of loss functions, which is connected to discrimination problems, is directly related to measures of performance used in discrimination. Consistency results are given for a general class of loss functions which comprise this family of discriminant loss functions.

14.

Probit and logistic discriminant functions

A. Albert J. A. Anderson 《Communications in Statistics - Theory and Methods》2013,42(7):641-657

Most discriminant functions refer to qualitatively distinct groups. Tallis et al. (1975) introduced the probit discriminant function for distinguishing between two ordered groups. They showed how to estimate this function for mixture sampling and continuous predictor variables. Here an estimation system is given for the more common separate sampling which is applicable to continuous and/or discrete predictor variables. When used solely with continuous variables, this method of estimation is more robust than Tallis'. The relationship of probit and logistic discrimination is discussed.

15.

Application of the Logistic Regression Model in Discriminant Analysis

任康 (Ren Kang) 李刚 (Li Gang) 《统计与信息论坛》(Statistics & Information Forum) 2007,22(6):71-73

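This paper introduces a method of using the logistic regression model for discrimination. Taking the annual precipitation variation of North China and of the middle and lower reaches of the Yangtze River over a given period as the objects of discrimination, the method is used to determine which type of annual variation applies to observation stations located in the zone between the two regions. A comparison with the traditionally used maximum-probability method shows that the logistic approach performs better.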

16.

Classification with discrete and continuous variables via general mixed-data models

A. R. de Leon A. Soo T. Williamson 《Journal of Applied Statistics》2011,38(5):1021-1032

We study the problem of classifying an individual into one of several populations based on mixed nominal, continuous, and ordinal data. Specifically, we obtain a classification procedure as an extension to the so-called location linear discriminant function, by specifying a general mixed-data model for the joint distribution of the mixed discrete and continuous variables. We outline methods for estimating misclassification error rates. Results of simulations of the performance of proposed classification rules in various settings vis-à-vis a robust mixed-data discrimination method are reported as well. We give an example utilizing data on croup in children.

17.

A robust logistic discrimination model

Trevor F. Cox Kim F. Pearce 《Statistics and Computing》1997,7(3):155-161

Logistic discrimination is a well documented method for classifying observations to two or more groups. However, estimation of the discriminant rule can be seriously affected by outliers. To overcome this, Cox and Ferry produced a robust logistic discrimination technique. Although their method worked in practice, parameter estimation was sometimes prone to convergence problems. This paper proposes a simplified robust logistic model which does not have any such problems and which takes a generalized linear model form. Misclassification rates calculated in a simulation exercise are used to compare the new method with ordinary logistic discrimination. Model diagnostics are also presented. The newly proposed model is then used on data collected from pregnant women at two district general hospitals. A robust logistic discriminant is calculated which can be used to predict accurately which method of feeding a woman will eventually use: breast feeding or bottle feeding.

18.

Logistic Discrimination with Total Variation Regularization

Robin Rühlicke 《Communications in Statistics - Simulation and Computation》2013,42(9):1825-1838

This article introduces a regularized logistic discrimination method that is especially suited for discretized stochastic processes (such as periodograms, spectrograms, EEG curves, etc.). The proposed method penalizes the total variation of the discriminant directions, giving smaller misclassification errors than alternative methods, and smoother and more easily interpretable discriminant directions. The properties of the new method are studied by simulation and by a real-data example involving classification of phonemes.
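A total variation penalty on a discretized discriminant direction can be illustrated with the objective function below. This is a minimal sketch under assumptions: two groups coded 0 and 1, the penalty taken as the sum of absolute differences between neighbouring coefficients, and hypothetical names throughout. The article's exact formulation and fitting algorithm may differ.

```python
import math


def total_variation(beta):
    """Sum of absolute differences between neighbouring coefficients."""
    return sum(abs(b - a) for a, b in zip(beta, beta[1:]))


def penalized_loss(beta, intercept, curves, labels, lam):
    """Mean logistic loss plus lam * total_variation(beta).

    curves: list of discretized curves, each as long as beta.
    labels: 0/1 group labels. lam: assumed penalty weight (>= 0).
    """
    loss = 0.0
    for x, y in zip(curves, labels):
        score = intercept + sum(b * v for b, v in zip(beta, x))
        # log(1 + exp(score)) - y * score, written to avoid overflow.
        loss += max(score, 0.0) + math.log1p(math.exp(-abs(score))) - y * score
    return loss / len(curves) + lam * total_variation(beta)
```

Minimizing such an objective favours piecewise-constant coefficient curves, which is consistent with the smoother, more interpretable discriminant directions described in the abstract.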

19.

Discriminating between the generalized Rayleigh and Weibull distributions: Some comparative studies

Murad A. Ahmad Debasis Kundu 《Communications in Statistics - Simulation and Computation》2017,46(6):4880-4895

The generalized Rayleigh distribution was introduced and studied quite effectively in the literature. The closeness and separation between the distributions are extremely important for analyzing any lifetime data. In this spirit, both the generalized Rayleigh and Weibull distributions can be used for analyzing skewed datasets. In this article, we compare these two distributions based on the Fisher information measures and use them for discrimination purposes. It is evident that the Fisher information measures play an important role in separating the distributions. The total information measures and the variances of the different percentile estimators are computed and presented. A real life dataset is analyzed for illustration purposes and a numerical comparison study is performed to assess our procedures in separating these two distributions.

20.

Bootstrapping frequency domain tests in multivariate time series with an application to comparing spectral densities

Holger Dette Efstathios Paparoditis 《Journal of the Royal Statistical Society. Series B, Statistical methodology》2009,71(4):831-857

Summary. We propose a general bootstrap procedure to approximate the null distribution of non-parametric frequency domain tests about the spectral density matrix of a multivariate time series. Under a set of easy-to-verify conditions, we establish asymptotic validity of the bootstrap procedure proposed. We apply a version of this procedure together with a new statistic to test the hypothesis that the spectral densities of not necessarily independent time series are equal. The test statistic proposed is based on an L₂-distance between the non-parametrically estimated individual spectral densities and an overall, 'pooled' spectral density, the latter being obtained by using the whole set of m time series considered. The effects of the dependence between the time series on the power behaviour of the test are investigated. Some simulations are presented and a real life data example is discussed.