1.
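This article introduces an expectation-maximization (EM) clustering algorithm based on the Gaussian mixture model and simplifies the model. A case study illustrates the model's application in economics and management, and the probability density of the study sample is visualized graphically.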
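The EM procedure named in this abstract can be illustrated with a minimal sketch. It is not the article's simplified model: it assumes one-dimensional data, exactly two components, and a hypothetical initialization at the data extremes, and the function name `em_gmm_1d` is invented for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only)."""
    # Assumed initialization: means at the data extremes, pooled variance.
    means = [min(data), max(data)]
    mean_all = sum(data) / len(data)
    var_all = sum((x - mean_all) ** 2 for x in data) / len(data)
    variances = [var_all, var_all]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    return weights, means, variances


# Synthetic data: two well-separated groups (an assumption for the demo).
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200)]
sample += [random.gauss(6.0, 1.0) for _ in range(200)]
weights, means, variances = em_gmm_1d(sample)
```

Each point's cluster is then the component with the larger posterior probability, and the fitted mixture density is the kind of quantity the abstract describes plotting.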
2.
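For panel data whose variables are nonlinearly correlated, existing panel data clustering methods based on linear algorithms cannot accurately measure the similarity between samples, and their clustering results are hard to interpret. Taking both the nonlinear correlation among variables and the interpretability of clustering results into account, this paper proposes a clustering method for nonlinear panel data: sample similarity is measured with nonlinear kernel principal component analysis, and samples are then clustered probabilistically with a Gaussian mixture model. Empirical results indicate that the method is effective and improves the interpretability of the clustering results.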
3.
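In view of the large investment, long construction period, and technical complexity of modern engineering projects, this article builds a risk evaluation index system for large engineering projects. On this basis, the analytic network process (ANP) is used to analyze the interactions among the indices and obtain their weights. Grey clustering analysis then converts the scattered risk evaluation information into evaluation values for different grey classes and yields a comprehensive risk evaluation value. Finally, an example illustrates the effectiveness and soundness of the evaluation method.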
4.
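The Dirichlet process is a typical Bayesian model with a variable number of parameters. Cluster analysis based on it does not require the number of clusters to be fixed in advance; the number of clusters is a model parameter inferred from the model and the data. It has therefore become a hot topic in machine learning research and can be applied to clustering massive data sets. This article builds a Dirichlet process infinite mixture model for cluster analysis of DNA gene expression data. Experiments on a simulated test data set and on an acute leukemia DNA gene expression test data set show that the Dirichlet process infinite mixture model accurately estimates the number of clusters in the data.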
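To make concrete why the number of clusters need not be fixed in advance, the sketch below samples a partition from the Chinese restaurant process, the prior over partitions induced by a Dirichlet process. It covers the prior only, not the article's posterior inference on gene expression data; the function name and the concentration value `alpha=1.0` are assumptions chosen for illustration.

```python
import random


def chinese_restaurant_process(n, alpha, rng):
    """Sample cluster assignments for n items from a CRP with concentration alpha > 0."""
    tables = []        # tables[k] = number of items currently in cluster k
    assignments = []   # assignments[i] = cluster index of item i
    for i in range(n):
        # Item i joins cluster k with probability size_k / (i + alpha)
        # and opens a new cluster with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        acc = 0.0
        chosen = len(tables)  # default: open a new cluster
        for k, size in enumerate(tables):
            acc += size
            if u < acc:
                chosen = k
                break
        if chosen == len(tables):
            tables.append(1)
        else:
            tables[chosen] += 1
        assignments.append(chosen)
    return assignments, tables


assignments, tables = chinese_restaurant_process(100, alpha=1.0, rng=random.Random(0))
n_clusters = len(tables)  # varies from draw to draw; it is never supplied as an input
```

Larger values of `alpha` tend to open more clusters, which is how the prior expresses a preference over the cluster count without fixing it.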
5.
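The traditional K-Prototypes clustering algorithm clusters mixed data by partitioning. As the dimensionality of the mixed data grows, however, the dissimilarities between objects become almost equal, which makes the algorithm difficult to apply. To address this shortcoming, this article proposes an improved K-Prototypes clustering algorithm that removes the irrelevant dimensions of each cluster before clustering, so that the high-dimensional mixed data are projected to a lower dimension and then clustered. A worked example on the Heart Disease Databases verifies the effectiveness of the algorithm.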
6.
7.
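Fisher's optimal partition method, the traditional solution to clustering ordered samples, places high demands on computer storage and is unsuitable for long sample sequences. The optimal bisection method commonly used in practice finds only a locally optimal solution. This article proposes a new algorithm, based on a genetic algorithm, for the ordered-sample clustering problem. The algorithm works with a variety of clustering distances, suits large samples, and can solve directional clustering problems.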
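For context, the baseline this abstract contrasts against, Fisher's optimal partition of an ordered sample, can be sketched as a dynamic program. The version below assumes univariate values and the within-segment sum of squared deviations as the cost; the function names are hypothetical, and the article's genetic algorithm is not reproduced here.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for values[i:j] (j exclusive)."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def fisher_partition(values, k):
    """Split an ordered sequence into k contiguous segments with minimal total cost."""
    n = len(values)
    prefix, prefix_sq = [0.0], [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)
    inf = float("inf")
    # best[m][j]: minimal cost of splitting values[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    cut[m][j] = i
    # Backtrack through the stored cut points to recover segment boundaries.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = cut[m][j]
        bounds.append((i, j))
        j = i
    bounds.reverse()
    return best[k][n], bounds


values = [1.0, 1.2, 0.9, 5.0, 5.1, 4.9, 9.0, 9.2]
total_cost, bounds = fisher_partition(values, 3)
# bounds == [(0, 3), (3, 6), (6, 8)]
```

The tables `best` and `cut` grow with both the sample length and the number of segments, and the triple loop is quadratic in the sample length, which is consistent with the abstract's concern about long samples.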
8.
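This paper first introduces a method for constructing composite scores of cluster variables with the variable clustering procedure VARCLUS, and then uses a concrete example to show how these composite scores are applied to ranking and evaluation problems in multi-indicator (multi-variable) systems.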
9.
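Clustering algorithms proposed in China and abroad now number in the hundreds or thousands. This paper proposes a classification framework based on the constituent elements of clustering algorithms, reviews the literature, and identifies four research hot spots.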
10.
Projection Pursuit Clustering Based on a Genetic Algorithm (cited 2 times in total: 0 self-citations, 2 citations by others)
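The traditional projection pursuit clustering algorithm PROCLUS is effective for clustering high-dimensional data. It uses hill climbing to iterate over the cluster centers and select the best ones; because hill climbing is a local search method, the optimum it finds may be only local. To address this shortcoming, an improved projection pursuit clustering algorithm is proposed that uses a genetic algorithm to iterate over the cluster centers in search of the global optimum. Simulation results demonstrate the feasibility and effectiveness of the new algorithm.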
11.
12.
Sam Efromovich 《Scandinavian Journal of Statistics》2016,43(1):70-82
It is well known that adaptive sequential nonparametric estimation of differentiable functions with assigned mean integrated squared error and minimax expected stopping time is impossible. In other words, no sequential estimator can compete with an oracle estimator that knows how many derivatives an estimated curve has. Differentiable functions are typical in probability density and regression models but not in spectral density models, where considered functions are typically smoother. This paper shows that for a large class of spectral densities, which includes spectral densities of classical autoregressive moving average processes, an adaptive minimax sequential estimation with assigned mean integrated squared error is possible. Furthermore, a two‐stage sequential procedure is proposed, which is minimax and adaptive to smoothness of an underlying spectral density.
13.
Consider a Gaussian random field model on
, observed on a rectangular region. Suppose it is desired to estimate a set of parameters in the covariance function. Spectral and circulant approximations to the likelihood are often used to facilitate estimation of the parameters. The purpose of the paper is to give a careful treatment of the quality of these approximations. A spectral approximation for the likelihood was given by Guyon (Biometrika 69 (1982) 95–105) but without proof. The results given here generalize those of Guyon, and fill in the details of the proof. In addition some matrix results are derived which may be of independent interest. Applications are made to Fisher information and bias calculations for maximum likelihood estimates.
14.
For time series with obvious periodicity (e.g., electric motor systems and cardiac monitors) or vague periodicity (e.g., earthquakes and explosions, speech, and stock data), frequency-based techniques using spectral analysis can usually capture the features of the series. This approach not only maps the data into the frequency domain, reducing its dimension, but also lets general classification methods such as linear discriminant analysis (LDA) and k-nearest-neighbor (KNN) use these frequencies to classify the time series. It is a combination of two classical approaches. However, LDA and KNN are difficult to use in the frequency domain because of the excessive dimension of the data. We overcome this obstacle by using singular value decomposition to select the essential frequencies. Two data sets illustrate our approach. The classification error rates of our simple approach are comparable to those of several more complicated methods.
15.
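Functional clustering algorithms involve two basic elements, projection and clustering. The optimal projection does not necessarily preserve class information, which can degrade the subsequent clustering. This paper therefore reviews the constituent elements and workflow of functional clustering. Drawing on the clustering property of non-negative matrix factorization (NMF), it proposes an NMF-based functional clustering algorithm, builds a framework in which projection and clustering proceed in parallel, solves it by alternating iterative updates, and analyzes the algorithm's computational time complexity. Validation on randomly simulated data and on speech recognition data shows that the algorithm helps improve clustering performance. An application to hourly nitrogen dioxide (NO2) concentration data for Beijing shows that its classification of air quality monitoring station types captures the spatial pattern of the station layout well, indicating good practical value.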
16.
The circulant embedding method for generating statistically exact simulations of time series from certain Gaussian distributed
stationary processes is attractive because of its advantage in computational speed over a competitive method based upon the
modified Cholesky decomposition. We demonstrate that the circulant embedding method can be used to generate simulations from
stationary processes whose spectral density functions are dictated by a number of popular nonparametric estimators, including
all direct spectral estimators (a special case being the periodogram), certain lag window spectral estimators, all forms of
Welch's overlapped segment averaging spectral estimator and all basic multitaper spectral estimators. One application for
this technique is to generate time series for bootstrapping various statistics. When used with bootstrapping, our proposed
technique avoids some – but not all – of the pitfalls of previously proposed frequency domain methods for simulating time
series.
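The basic circulant embedding step referred to in this abstract can be sketched as follows. The sketch assumes a known autocovariance sequence (here that of an AR(1) process with a hypothetical coefficient `phi = 0.7`) rather than one implied by a nonparametric spectral estimator as in the paper, and it uses a naive O(m²) discrete Fourier transform to stay dependency-free; a real implementation would use an FFT.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (adequate for a short demo)."""
    m = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * t * k / m) for t in range(m))
        for k in range(m)
    ]


def circulant_embedding(acvs, rng):
    """Simulate a Gaussian series of length len(acvs) with autocovariances acvs."""
    n = len(acvs) - 1
    # First row of the circulant matrix: s_0, ..., s_N, s_{N-1}, ..., s_1.
    row = list(acvs) + list(acvs[-2:0:-1])
    m = len(row)
    # Eigenvalues of the circulant matrix are the DFT of its first row.
    eig = [v.real for v in dft(row)]
    if min(eig) < -1e-9:
        raise ValueError("embedding is not nonnegative definite for this sequence")
    eig = [max(e, 0.0) for e in eig]
    # Complex Gaussian weights scaled by the square roots of the eigenvalues.
    w = [
        math.sqrt(e / m) * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        for e in eig
    ]
    # The real part of the transform has the target covariance on lags 0..N.
    return [v.real for v in dft(w)[: n + 1]], eig


phi = 0.7  # assumed AR(1) coefficient, for illustration only
acvs = [phi ** k / (1.0 - phi ** 2) for k in range(64)]
series, eig = circulant_embedding(acvs, random.Random(1))
```

The imaginary part of the same transform gives a second, independent realization, so each pass yields two series at no extra cost.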
17.
Vahid Tadayon 《Communications in Statistics - Simulation and Computation》2015,44(9):2431-2441
The customary approach to modeling spatial data in the presence of censoring is to assume that the underlying random field is Gaussian. In practice, however, exploratory data analysis often reveals skewness in the data, which violates the normality assumption. In such settings, the skew Gaussian (SG) spatial model is used to overcome this issue. In this article, the SG model is fitted to censored observations. For this purpose, we adopt the Bayesian approach and use Markov chain Monte Carlo algorithms and data augmentation to carry out the calculations. A numerical example illustrates the methodology.