
1.

Wu Yuxia, Mou Yuanchao 《统计与决策》(Statistics and Decision) 2010,(5)

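Clustering methods can effectively capture the behavioral characteristics of different types of customers, which helps identify suspicious transactions. Combining real transaction data from the customers of a securities company with synthetic data, the article builds the clustering model in Clementine, identifies outliers, and computes a degree of suspicion for each suspicious record. The results can give financial intelligence units high-quality data for investigation and effectively ease the workload of their staff.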
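The abstract does not say how the degree of suspicion is computed. One common approach, shown here only as a minimal sketch and not as the article's method, is to run K-means and score each record by its distance to the nearest centroid. The function names, the sample points, and the scaling to [0, 1] are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def suspicion_scores(points, centroids):
    """Distance to the nearest centroid, scaled so the most isolated record scores 1.0."""
    distances = [min(math.dist(p, c) for c in centroids) for p in points]
    largest = max(distances) or 1.0
    return [d / largest for d in distances]

# Hypothetical two-feature transaction records; the last one is far from both groups.
records = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
           (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
           (9.0, 0.0)]
centroids = kmeans(records, k=2)
scores = suspicion_scores(records, centroids)
most_suspicious = scores.index(max(scores))
```

Records with the highest scores would be the first candidates for manual review. Note that an isolated record also pulls its centroid toward itself, so a more careful version would use a robust distance or a dedicated outlier cluster.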

2.

Development of a Simulation System for Textile-Aided Design Based on Digital Image Processing

Sun Jiali 《华南农业大学学报(社会科学版)》(Journal of South China Agricultural University, Social Science Edition) 2015,33(4)

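The article studies and designs a simulation system for textile-aided design based on digital image processing (DIP). By combining a maximization approach, K-means clustering, and morphological opening and closing operations, the system automatically extracts the morphological structure of the weave from images of real fabric weaves. It also proposes a new yarn color replacement algorithm that renders a variety of weave structures into textile design patterns. Experiments show that the system achieves good weave simulation results.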
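The morphological opening and closing operations mentioned above can be sketched on a small binary image. This is only an illustration of the two standard operations, not the system's implementation: the 3x3 structuring element, the choice to ignore cells outside the image, and the sample images are assumptions.

```python
def _window(img, r, c):
    """Values in the 3x3 window around (r, c), ignoring cells outside the image."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - 1, 0), min(r + 2, rows))
            for j in range(max(c - 1, 0), min(c + 2, cols))]

def erode(img):
    """A pixel stays 1 only if every pixel in its window is 1."""
    return [[int(all(_window(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img):
    """A pixel becomes 1 if any pixel in its window is 1."""
    return [[int(any(_window(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def opening(img):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(img))

def closing(img):
    """Dilation followed by erosion: fills holes smaller than the window."""
    return erode(dilate(img))

# Hypothetical 5x5 masks: a 3x3 region with one noise pixel, and a filled
# region with a one-pixel hole.
noisy = [[0, 0, 0, 0, 1],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]
holed = [[1] * 5 for _ in range(5)]
holed[2][2] = 0

cleaned = opening(noisy)
filled = closing(holed)
```

In a pipeline like the one described, such masks would typically come from thresholding or from the K-means cluster labels of the fabric image.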

3.

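An Empirical Analysis of the Readers of an Electronic Reading Room (total citations: 1; self-citations: 0; citations by others: 1)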

Tao Xiangrong, Lei Shuxia, Li Peng 《榆林高等专科学校学报》(Journal of Yulin College) 2005,15(5):85-87

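Taking the users of the electronic reading room as the research subjects, the study applies clustering and stepwise regression fitting to the raw data. The results show that the readers who used the computers in June 2004 were characterized by two main purposes: study and entertainment.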

4.

Time-varying clustering of multivariate longitudinal observations

Antonello Maruotti, Maurizio Vichi 《Communications in Statistics - Theory and Methods》2013,42(2):430-443

Abstract

We propose a statistical method for clustering multivariate longitudinal data into homogeneous groups. This method relies on a time-varying extension of the classical K-means algorithm, where a multivariate vector autoregressive model is additionally assumed for modeling the evolution of clusters' centroids over time. Model inference is based on a least-squares method and on a coordinate descent algorithm. To illustrate our work, we consider a longitudinal dataset on human development. Three variables are modeled, namely life expectancy, education and gross domestic product.

5.

Clustering transformed compositional data using K-means, with applications in gene expression and bicycle sharing system data

Antoine Godichon-Baggioni, Cathy Maugis-Rabusseau, Andrea Rau 《Journal of Applied Statistics》2019,46(1):47-65

Although there is no shortage of clustering algorithms proposed in the literature, the question of the most relevant strategy for clustering compositional data (i.e. data whose rows belong to the simplex) remains largely unexplored in cases where the observed value is equal to or close to zero for one or more samples. This work is motivated by the analysis of two applications, both focused on the categorization of compositional profiles: (1) identifying groups of co-expressed genes from high-throughput RNA sequencing data, in which a given gene may be completely silent in one or more experimental conditions; and (2) finding patterns in the usage of stations over the course of one week in the Velib' bicycle sharing system in Paris, France. For both of these applications, we make use of appropriately chosen data transformations, including the Centered Log Ratio and a novel extension called the Log Centered Log Ratio, in conjunction with the K-means algorithm. We use a non-asymptotic penalized criterion, whose penalty is calibrated with the slope heuristics, to select the number of clusters. Finally, we illustrate the performance of this clustering strategy, which is implemented in the Bioconductor package coseq, on both the gene expression and bicycle sharing system data.
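For readers unfamiliar with the Centered Log Ratio, the standard transform subtracts the mean log from the log of each part. The sketch below shows only that standard definition; it does not reproduce the Log Centered Log Ratio extension or the coseq implementation, and the function name and sample profile are assumptions. It also makes the zero-value problem raised in the abstract concrete, since the logarithm is undefined at zero.

```python
import math

def clr(profile):
    """Centered log-ratio: log of each part minus the mean of the logs.

    Every part must be strictly positive, because log(0) is undefined.
    """
    if any(x <= 0 for x in profile):
        raise ValueError("CLR is undefined for zero or negative parts")
    logs = [math.log(x) for x in profile]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical compositional profile (parts sum to 1).
profile = [0.2, 0.3, 0.5]
transformed = clr(profile)

# The transform ignores overall scale, so raw counts give the same result.
from_counts = clr([2.0, 3.0, 5.0])
```

The transformed rows sum to zero and can be passed to an ordinary K-means routine; a profile containing an exact zero raises an error here, which is the case the abstract's extension is designed to handle.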

6.

On the strengths of the self-updating process clustering algorithm

《Journal of Statistical Computation and Simulation》2012,82(5):1010-1031

The self-updating process (SUP) is a clustering algorithm that takes the viewpoint of the data points and simulates how they move and cluster themselves. It is an iterative process on the sample space and allows for both time-varying and time-invariant operators. Through simulations and comparisons, this paper shows that SUP is particularly competitive in clustering (i) data with noise, (ii) data with a large number of clusters, and (iii) unbalanced data. When noise is present in the data, SUP is able to isolate the noise data points while simultaneously performing clustering. Its local updating property enables SUP to handle data with a large number of clusters and data of various structures. In this paper, we show that the blurring mean-shift is a static SUP. Therefore, our discussion of the strengths of SUP also applies to the blurring mean-shift.
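The blurring mean-shift named in the abstract can be sketched in a few lines: at every iteration each point is replaced by a kernel-weighted average of all current points, so the data set itself moves. This is a generic illustration, not the paper's SUP code; the Gaussian kernel, the bandwidth, the iteration count, and the sample data are assumptions.

```python
import math

def blurring_mean_shift(points, bandwidth=1.0, iters=30):
    """Move every point to the Gaussian-weighted mean of the current point set."""
    pts = [list(p) for p in points]
    for _ in range(iters):
        updated = []
        for p in pts:
            weights = [math.exp(-math.dist(p, q) ** 2 / (2 * bandwidth ** 2))
                       for q in pts]
            total = sum(weights)  # always > 0: each point weights itself by 1
            updated.append([sum(w * q[d] for w, q in zip(weights, pts)) / total
                            for d in range(len(p))])
        pts = updated  # all points are updated together ("blurring")
    return pts

# Hypothetical one-dimensional data with two well-separated groups.
data = [[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]]
moved = blurring_mean_shift(data, bandwidth=1.0)
```

After the iterations, points that started in the same group sit at essentially the same location, and cluster labels can be read off by grouping coincident points.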

7.

Clustering microarray data using model-based double K-means

Francesca Martella, Maurizio Vichi 《Journal of Applied Statistics》2012,39(9):1853-1869

Microarray technology allows the expression levels of thousands of genes to be measured simultaneously. The dimension and complexity of gene expression data obtained by microarrays create challenging data analysis and management problems, ranging from the analysis of images produced by microarray experiments to the biological interpretation of results. Therefore, statistical and computational approaches are beginning to assume a substantial position within the molecular biology area. We consider the problem of simultaneously clustering genes and tissue samples (in general, conditions) of a microarray data set. This can be useful for revealing groups of genes involved in the same molecular process as well as groups of conditions where this process takes place. The need to find a subset of genes and tissue samples defining a homogeneous block has led to the application of double clustering techniques to gene expression data. Here, we focus on an extension of standard K-means to simultaneously cluster observations and features of a data matrix, namely double K-means introduced by Vichi (2000). We introduce this model in a probabilistic framework and discuss the advantages of using this approach. We also develop a coordinate ascent algorithm and test its performance via simulation studies and a real data set. Finally, we validate the results obtained on the real data set by building resampling confidence intervals for block centroids.
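To make the idea of clustering rows and columns at the same time concrete, the sketch below alternates between computing block means and reassigning rows and columns to reduce the squared error. It is a plain least-squares illustration under stated assumptions, not the probabilistic model or the coordinate ascent algorithm of the paper; the function names, the starting labels, the handling of empty blocks, and the toy matrix are all hypothetical.

```python
def block_means(X, rows, cols, P, Q):
    """Mean of the entries in each (row cluster, column cluster) block."""
    sums = [[0.0] * Q for _ in range(P)]
    counts = [[0] * Q for _ in range(P)]
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            sums[rows[i]][cols[j]] += value
            counts[rows[i]][cols[j]] += 1
    # Assumption: an empty block gets centroid 0.0.
    return [[sums[p][q] / counts[p][q] if counts[p][q] else 0.0
             for q in range(Q)] for p in range(P)]

def double_kmeans(X, rows, cols, P, Q, iters=10):
    """Alternate row and column reassignment around the block centroids."""
    rows, cols = list(rows), list(cols)
    n, m = len(X), len(X[0])
    for _ in range(iters):
        C = block_means(X, rows, cols, P, Q)
        rows = [min(range(P),
                    key=lambda p: sum((X[i][j] - C[p][cols[j]]) ** 2
                                      for j in range(m)))
                for i in range(n)]
        C = block_means(X, rows, cols, P, Q)
        cols = [min(range(Q),
                    key=lambda q: sum((X[i][j] - C[rows[i]][q]) ** 2
                                      for i in range(n)))
                for j in range(m)]
    return rows, cols, block_means(X, rows, cols, P, Q)

# Hypothetical 4x4 matrix with a 2x2 block structure; the starting row labels
# deliberately misplace the third row.
X = [[1, 1, 9, 9],
     [1, 1, 9, 9],
     [5, 5, 0, 0],
     [5, 5, 0, 0]]
row_labels, col_labels, centroids = double_kmeans(
    X, rows=[0, 0, 0, 1], cols=[0, 0, 1, 1], P=2, Q=2)
```

Like ordinary K-means, this procedure depends on the starting labels and can stall in a poor local solution, so several random starts are usually advisable.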

8.

Tax Assessment Based on Probabilistic Neural Networks and the K-means Algorithm

Zhao Lei, Zhang Yanrong 《河北工程大学学报(社会科学版)》(Journal of Hebei University of Engineering, Social Science Edition) 2011,28(1):27-28

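Tax assessment is a complex system for which an accurate mathematical model is hard to build, and it is also a typical pattern recognition problem. Neural network methods have distinct advantages for tax assessment. The performance of the probabilistic neural network (PNN) algorithm depends heavily on how the training samples are chosen: whether the selected samples reflect the information characteristics of the whole population determines the recognition performance of the classifier. The article applies the K-means algorithm to cluster taxpayer information samples, finds the cluster centers, and uses them as the basis for selecting training samples for the PNN, thereby optimizing the PNN algorithm. The results show that the improved PNN algorithm classifies well and has practical value for tax assessment.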
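The idea of choosing PNN training samples around cluster centers can be sketched as follows. The abstract does not give the selection rule, so this example assumes the simplest one: within each labeled group, keep the n samples closest to the group centroid (which is what K-means with a single cluster would return). The PNN is written as a Gaussian Parzen-window classifier; the labels, data, n, and sigma are all hypothetical.

```python
import math

def nearest_to_centroid(samples, n):
    """Keep the n samples closest to the group mean (K-means centroid for k=1)."""
    centroid = [sum(col) / len(samples) for col in zip(*samples)]
    return sorted(samples, key=lambda s: math.dist(s, centroid))[:n]

def pnn_predict(x, training, sigma=0.5):
    """Return the label whose training samples give the highest mean Gaussian kernel value at x."""
    def density(samples):
        return sum(math.exp(-math.dist(x, s) ** 2 / (2 * sigma ** 2))
                   for s in samples) / len(samples)
    return max(training, key=lambda label: density(training[label]))

# Hypothetical two-feature taxpayer indicators; each group has one atypical sample.
raw = {
    "normal": [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 0.0)],
    "abnormal": [(4.0, 4.0), (4.2, 3.9), (3.8, 4.1), (6.0, 6.5)],
}
training = {label: nearest_to_centroid(samples, n=3)
            for label, samples in raw.items()}

first = pnn_predict((1.1, 0.9), training)
second = pnn_predict((4.1, 4.0), training)
```

Dropping samples far from the centroid keeps atypical records out of the pattern layer, which is one plausible reading of how the clustering step could improve the classifier.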

9.

The sequence analysis of hospitals that established social work department under fuzzy environment

Li Wang 《European Journal of Social Work》2019,22(4):647-663

This paper reviews the existing literature on hospital social work and discusses intervention strategies for improving social work practice in hospitals. The objective of this study was to improve the quality of medical care. However, few studies have compared social work services between different hospitals. This study describes qualitative analysis under a fuzzy environment, extracts the main influencing factors, and establishes a comprehensive evaluation index system. It provides a comprehensive evaluation of alternative hospitals using the fuzzy clustering method. This paper proposes a new mixed fuzzy clustering algorithm based on an analysis of the axiomatic fuzzy set (AFS) and K-means algorithms, which is not affected by some complicated parameter issues and has higher statistical validity. The Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) is applied to select the best option for each cluster, and a comparative analysis is carried out. Results from a case study in Shanghai, China, tested using information entropy, confirm that the proposed approach is effective. In a comparison of the AFS, K-means and C-means algorithms, the hybrid algorithm can find the two closest attributes of the evaluation index of hospital social work, and the proposed approach can readily help raise the level of hospital social work services.
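TOPSIS, which the abstract applies within each cluster, ranks alternatives by how close they are to an ideal solution and how far they are from the worst one. The sketch below follows the standard formulation with vector normalization; it is not taken from the paper, and the criteria, weights, and scores are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness of each alternative to the ideal solution (1.0 = ideal, 0.0 = worst).

    matrix: one row per alternative, one column per criterion.
    benefit: True where larger is better, False where smaller is better.
    Assumes the alternatives are not all identical (otherwise both distances are 0).
    """
    n_crit = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    best = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    worst = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical hospitals scored on two benefit criteria and one cost criterion.
hospitals = [[9, 9, 1],
             [5, 5, 5],
             [1, 1, 9]]
closeness = topsis(hospitals,
                   weights=[1 / 3, 1 / 3, 1 / 3],
                   benefit=[True, True, False])
best_index = closeness.index(max(closeness))
```

In this toy case the first hospital is best on every criterion, so it coincides with the ideal solution, and the last coincides with the worst one.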

10.

Approaches to blockmodeling dynamic networks: A Monte Carlo simulation study

《Social Networks》2023

Blockmodeling refers to a variety of statistical methods for reducing and simplifying large and complex networks. While methods for blockmodeling networks observed at one time point are well established, it is only recently that researchers have proposed several methods for analysing dynamic networks (i.e., networks observed at multiple time points). The considered approaches are based on k-means or stochastic blockmodeling, with different ways being used to model time dependency among time points. Their novelty means they have yet to be extensively compared and evaluated and the paper therefore aims to compare and evaluate them using Monte Carlo simulations. Different network characteristics are considered, including whether tie formation is random or governed by local network mechanisms. The results show the Dynamic Stochastic Blockmodel (Matias and Miele 2017) performs best if the blockmodel does not change; otherwise, the Stochastic Blockmodel for Multipartite Networks (Bar-Hen et al. 2020) does.