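A Discussion of the Statistical Testing System for Discriminant Analysis   Total citations: 3 (self-citations: 0, citations by others: 3)
Discriminant analysis has attracted growing attention and produced important applied results, but in practice it is often applied mechanically, with too little attention paid to testing its applicability, the significance of the discrimination effect, the discriminating power of the individual variables, and the discriminating power of the discriminant functions. To apply discriminant analysis well, it should be subjected to statistical testing within a systematic framework comprising four parts: a test of the applicability of discriminant analysis, a test of the significance of the discrimination effect, a test of the discriminating power of the discriminant variables, and a test of the discriminating power of the discriminant functions.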

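The sparsity and infinite dimensionality of functional data cause traditional cluster analysis to break down. To address this, the paper first defines the concept and scope of functional data and then proposes an adaptive, iteratively updated clustering method. First, the parameter information of the data is used to move from the infinite-dimensional function space to a finite-dimensional multivariate space. On this basis, an adaptively weighted clustering statistic is constructed according to differences in the information content of the variables, and it serves as the similarity measure for functional data to produce an initial partition. Then, subject to a given threshold, the initial class membership of every function is adaptively and iteratively updated, and the converged result is taken as the final partition. Simulations and an empirical study show that, compared with existing functional clustering methods of the same kind, the proposed method achieves a significantly higher rate of correct classification, which reflects its relative merit and its effectiveness in practical applications.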

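An Exploration of Clustering Methods for Functional Data   Total citations: 3 (self-citations: 0, citations by others: 3)
Functional data are a recently emerged data type that combines features of time series and cross-sectional data. They can usually be described as the graph of a function of some variable, and they are highly useful in applications. The paper first briefly reviews the basic characteristics of functional data and some existing functional clustering methods, such as the uniformly corrected K-means method for functional data and hierarchical clustering for functional data. Building on this, it examines functional data clustering from the perspective of analysing functional features, proposes an interval clustering method for functional data based on derivative analysis, and applies the method to employment data from the six provinces of central China to obtain a clustering result.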

This article investigates nonparametric estimation of variance functions for functional data when the mean function is unknown. We obtain asymptotic results for the kernel estimator based on squared residuals. As in the finite-dimensional case, our asymptotic result shows that the smoothness of the unknown mean function affects the rate of convergence. Our simulation studies demonstrate that the estimator based on residuals performs much better than the one based on the conditional second moment of the responses.

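A Simulation Comparison of Discriminant Analysis and Logistic Regression   Total citations: 3 (self-citations: 2, citations by others: 3)
Using random simulation, the paper studies the resubstitution accuracy of classification by discriminant analysis and by logistic regression. The simulations show that logistic regression has higher resubstitution accuracy than discriminant analysis. As the random error grows, the gap between the two narrows, and once the random error exceeds a certain threshold, the accuracy of logistic regression falls below that of discriminant analysis. Building on the simulations, a modified logistic regression classifier is introduced; the results show that it performs slightly better than ordinary logistic regression.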

Many sparse linear discriminant analysis (LDA) methods have been proposed to overcome the major problems of the classic LDA in high-dimensional settings. However, the asymptotic optimality results are limited to the case with only two classes. When there are more than two classes, the classification boundary is complicated and no explicit formulas for the classification errors exist. We consider the asymptotic optimality in high-dimensional settings for a large family of linear classification rules with an arbitrary number of classes. Our main theorem provides easy-to-check criteria for the asymptotic optimality of a general classification rule in this family as dimensionality and sample size both go to infinity and the number of classes is arbitrary. We establish the corresponding convergence rates. The general theory is applied to the classic LDA and to extensions of two recently proposed sparse LDA methods to obtain the asymptotic optimality.

Linear discriminant analysis and quadratic discriminant analysis are used to predict group membership. Rare populations present situations in which group sizes differ drastically. This article examined k = 2 and k = 4 predictor variables for groups with different levels of rarity and different levels of sensitivity and specificity. Sample size recommendations were generated for both minimum and maximum group overlap using the leave-one-out (L-O-O) method of estimation. Minimum sample size recommendations are provided in tables for immediate implementation by applied researchers.


One of the basic statistical methods of dimensionality reduction is the analysis of discriminant coordinates given by Fisher (1936) and Rao (1948). The space of discriminant coordinates is a space convenient for presenting multidimensional data originating from multiple groups and for the use of various classification methods (methods of discriminant analysis). In the present paper, we adapt the classical discriminant coordinates analysis to multivariate functional data. The theory has been applied to analysis of textural properties of apples of six varieties, measured over a period of 180 days, stored in two types of refrigeration chamber.

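Ordered cluster analysis of panel data is an emerging research area in multivariate statistics. Drawing on principal component analysis, the paper reduces the dimension of panel data along the time variable while keeping the loss of variation information to a minimum, so that the result reflects fairly accurately the overall level of change of the sample in each time period. Fisher's optimal partition algorithm is then applied to the principal component scores to perform ordered clustering, offering an approach to studying similarity relations in ordered panel data. The method is applied to a cluster analysis of global climate change, examining the characteristics of global and regional climate change over the past fifty years; comparison with the conclusions of studies abroad shows good applicability.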
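The ordered-clustering step named in the abstract above (Fisher's optimal partition applied to principal component scores) can be illustrated with a minimal sketch. It assumes the scores have already been reduced to a single ordered sequence of numbers and that the loss is the total within-segment sum of squared deviations; the function name `fisher_optimal_partition` and the sample values are hypothetical and not taken from the paper.

```python
def fisher_optimal_partition(values, k):
    """Split an ordered sequence into k contiguous segments that minimise
    the total within-segment sum of squared deviations (assumed loss)."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums give any segment's squared deviation in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def diameter(i, j):
        # Sum of squared deviations from the mean for values[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: smallest loss when the first j items form m segments.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = cost[m - 1][i] + diameter(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    cut[m][j] = i

    # Walk back through the stored cut points to recover the segments.
    bounds = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return bounds, cost[k][n]


# Hypothetical scores for eight consecutive periods, split into three segments.
scores = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8, 9.0, 9.1]
segments, loss = fisher_optimal_partition(scores, 3)
print(segments)  # [(0, 3), (3, 6), (6, 8)]
```

Each returned pair is a half-open index range, so the order of the observations is preserved by construction, which is what distinguishes ordered clustering from ordinary clustering.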

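Statistical Analysis of Functional Data: Ideas, Methods and Applications   Total citations: 9 (self-citations: 0, citations by others: 9)
Yan Mingyi (严明义). Statistical Research (《统计研究》), 2007, 24(2): 87-94
Abstract: In practice, sample observations collected in more and more fields of research have a functional character. Such functional data merge time series and cross-sectional data, and some are even curves or other function graphs. Although the panel data methods developed in econometrics over the past twenty-odd years have considerable applied value, panel data are only one special type of functional data, and those methods depend too heavily on a linear model structure and on their assumptions. Starting from the general characteristics of functional data, this paper introduces a new method for analysing them and is the first to apply it to economic functional data, extending the range of application of functional data analysis. The results show that functional data analysis has advantages over econometric and other statistical methods, in particular in revealing features of the data that other methods cannot.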

We introduce a technique for extending the classical method of linear discriminant analysis (LDA) to data sets where the predictor variables are curves or functions. This procedure, which we call functional linear discriminant analysis (FLDA), is particularly useful when only fragments of the curves are observed. All the techniques associated with LDA can be extended for use with FLDA. In particular FLDA can be used to produce classifications on new (test) curves, give an estimate of the discriminant function between classes and provide a one- or two-dimensional pictorial representation of a set of curves. We also extend this procedure to provide generalizations of quadratic and regularized discriminant analysis.

The aim of this article is to improve the quality of cookie production by classifying cookies as good or bad from the curves of resistance of dough observed during the kneading process. As the predictor variable is functional, functional classification methodologies such as functional logit regression and functional discriminant analysis are considered. A P-spline approximation of the sample curves is proposed to improve the classification ability of these models and to suitably estimate the relationship between the quality of cookies and the resistance of dough. Inference results on the functional parameters and related odds ratios are obtained using the asymptotic normality of the maximum likelihood estimators under the classical regularity conditions. Finally, the classification results are compared with alternative functional data analysis approaches such as componentwise classification on the logit regression model.

It appears to be common practice with ridge regression to obtain a decomposition of the total sum of squares, and assign degrees of freedom, according to established least squares theory. This discussion notes the obvious fallacies of such an approach, and introduces a decomposition based on orthogonality, and degrees of freedom based on expected mean squares, for non-stochastic k.

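Application of the Logistic Regression Model in Discriminant Analysis   Total citations: 2 (self-citations: 0, citations by others: 2)
The paper describes how the logistic regression model can be used for discrimination. Taking the annual variation of precipitation in North China and in the middle and lower reaches of the Yangtze River over a given period as the objects of discrimination, the method is used to determine which type of annual variation applies at a number of observation stations lying in the zone between the two regions. A comparison with the traditionally used maximum probability method shows that logistic regression performs better.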

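To extend the ROC curve from two-population discriminant analysis to the case of multiple populations, the paper applies a transformation of the two-population ROC curve to obtain an ROC surface for multi-population discriminant analysis, and studies some of its properties.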

We derive a likelihood ratio test for the generalized variance in a factor analysis model. By the general theory of likelihood ratio tests, the asymptotic distribution of the test statistic is chi-square with one degree of freedom.

Summary: In this paper the complexity of high dimensional data with cyclical variation is reduced using analysis of variance and factor analysis. It is shown that predicting a small number of main cyclical factors is more useful than forecasting all the time-points separately, as is usually done by seasonal time series models. As an example of this approach, we analyze the electricity demand per quarter of an hour of industrial customers in Germany. The necessity of such predictions results from the liberalization of the German electricity market in 1998 due to legal requirements of the EC in 1996.

The correspondence analysis (CA) method appears to be an effective tool for analysis of interrelations between rows and columns in two-way contingency data. A discrete version of the method, box clustering, is developed in the paper using an approximation version of the CA model extended to the case when CA factor values are required to be Boolean. Several properties of the proposed SEFIT-BOX algorithm are proved to facilitate interpretation of its output. It is also shown that two known partitioning algorithms (applied within row or column sets only) could be considered as locally optimal algorithms for fitting the model, and extensions of these algorithms to a simultaneous row and column partitioning problem are proposed.

Functional linear models are useful in longitudinal data analysis. They include many classical and recently proposed statistical models for longitudinal data and other functional data. Recently, smoothing spline and kernel methods have been proposed for estimating their coefficient functions nonparametrically but these methods are either intensive in computation or inefficient in performance. To overcome these drawbacks, in this paper, a simple and powerful two-step alternative is proposed. In particular, the implementation of the proposed approach via local polynomial smoothing is discussed. Methods for estimating standard deviations of estimated coefficient functions are also proposed. Some asymptotic results for the local polynomial estimators are established. Two longitudinal data sets, one of which involves time-dependent covariates, are used to demonstrate the approach proposed. Simulation studies show that our two-step approach improves the kernel method proposed by Hoover and co-workers in several aspects such as accuracy, computational time and visual appeal of the estimators.

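In research on clustering methods for panel data, the Euclidean distance function is improved to reflect the fact that panel data have both a cross-sectional and a time dimension. Indicator weights and time weights are taken into account during clustering, and a "weighted distance function" suited to panel data clustering is proposed together with a corresponding Ward.D clustering method. First, Euclidean distance functions are defined that account for indicator levels (absolute values), growth rates between adjacent time points, and the degree of fluctuation. Then the indicator weights and time weights are aggregated through a linear model into a composite weighted distance, which yields the weighted clustering procedure for panel data. Empirical results show that the weighted method, which accounts for indicator and time weights, discriminates better and improves the accuracy of sample clustering.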
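The abstract above does not give the exact form of its weighted distance, so the following is only a rough sketch of one way such a composite distance could be assembled. Everything in it is an assumption made for illustration: the function names, the use of the coefficient of variation as the fluctuation measure, applying the later period's time weight to each growth rate, and the equal default mixing weights.

```python
import math


def growth_rates(series):
    # Period-over-period growth rates; assumes no value in the series is zero.
    return [(series[t] - series[t - 1]) / series[t - 1]
            for t in range(1, len(series))]


def coefficient_of_variation(series):
    # Population standard deviation divided by the mean (mean assumed nonzero).
    mean = sum(series) / len(series)
    variance = sum((v - mean) ** 2 for v in series) / len(series)
    return math.sqrt(variance) / mean


def weighted_panel_distance(a, b, indicator_w, time_w, mix=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical composite distance between two panel units.

    a, b        : lists indexed as [indicator][time]
    indicator_w : one weight per indicator
    time_w      : one weight per time point
    mix         : linear weights for the level, growth and fluctuation parts
    """
    d_level = d_growth = d_fluct = 0.0
    for j, v in enumerate(indicator_w):
        xa, xb = a[j], b[j]
        # Level part: time-weighted squared gap between indicator values.
        d_level += v * sum(w * (p - q) ** 2 for w, p, q in zip(time_w, xa, xb))
        # Growth part: squared gap between growth rates, weighted by the later period.
        ga, gb = growth_rates(xa), growth_rates(xb)
        d_growth += v * sum(w * (p - q) ** 2 for w, p, q in zip(time_w[1:], ga, gb))
        # Fluctuation part: squared gap between coefficients of variation.
        d_fluct += v * (coefficient_of_variation(xa) - coefficient_of_variation(xb)) ** 2
    parts = (math.sqrt(d_level), math.sqrt(d_growth), math.sqrt(d_fluct))
    # Linear aggregation of the three parts into one distance.
    return sum(m * p for m, p in zip(mix, parts))


# Two hypothetical units, two indicators, three time points.
unit_a = [[1.0, 1.2, 1.5], [10.0, 11.0, 12.0]]
unit_b = [[1.1, 1.1, 1.2], [9.0, 12.0, 15.0]]
print(weighted_panel_distance(unit_a, unit_b, [0.6, 0.4], [0.2, 0.3, 0.5]))
```

A matrix of such pairwise distances could then be passed to a hierarchical clustering routine; the Ward.D step mentioned in the abstract is not reproduced here.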
