121 results found; search time 15 ms
This paper is about techniques for clustering sequences such as nucleic or amino acids. Our application is to defining viral subtypes of HIV on the basis of similarities of V3 loop region amino acids of the envelope (env) gene. The techniques introduced here could apply with virtually no change to other HIV genes as well as to other problems and data not necessarily of viral origin. As applied to quantitative data, these algorithms have found much application in engineering contexts for compressing images and speech. They are called vector quantization and involve a mapping from a large number of possible inputs into a much smaller number of outputs. Many implementations, in particular those that go by the name generalized Lloyd or k-means, exist for choosing sets of possible outputs and mappings. Each attempts to maximize similarities among inputs that map to any single output or, alternatively, to minimize some measure of distortion between input and output. Here, two standard types of vector quantization are brought to bear upon the cited problem of clustering V3 loop amino acid sequences. Results of this clustering are compared to those of the well-known UPGMA algorithm, the unweighted pair group method in which arithmetic averages are employed.
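As a rough illustration of the generalized Lloyd (k-means) idea described above, the following sketch alternates between mapping each input to its nearest codeword and replacing each codeword with the mean of the inputs mapped to it. It assumes numeric vectors and squared Euclidean distortion; the function names and toy data are hypothetical, and the paper's handling of amino acid sequences is not reproduced here.

```python
def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, codebook):
    # Index of the codeword closest to the given point.
    return min(range(len(codebook)),
               key=lambda j: squared_distance(point, codebook[j]))


def lloyd(points, codebook, iterations=20):
    # Alternate assignment and centroid update until the codebook stops changing.
    codebook = [tuple(c) for c in codebook]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for p in points:
            cells[nearest(p, codebook)].append(p)
        updated = []
        for j, cell in enumerate(cells):
            if cell:
                dim = len(cell[0])
                updated.append(tuple(sum(p[d] for p in cell) / len(cell)
                                     for d in range(dim)))
            else:
                # Keep an unused codeword where it is (one simple convention).
                updated.append(codebook[j])
        if updated == codebook:
            break
        codebook = updated
    return codebook


def distortion(points, codebook):
    # Mean squared distance between each input and its assigned output.
    return sum(squared_distance(p, codebook[nearest(p, codebook)])
               for p in points) / len(points)


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
initial = [(0.0, 0.0), (1.0, 0.0)]
codebook = lloyd(points, initial)
```

On this toy data the average distortion under the final codebook is lower than under the initial one, which is the quantity the iteration is designed to reduce.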
Clusters of galaxies are a useful proxy to trace the distribution of mass in the universe. By measuring the mass of clusters of galaxies on different scales, one can follow the evolution of the mass distribution (Martínez and Saar, Statistics of the Galaxy Distribution, 2002). It can be shown that finding galaxy clusters is equivalent to finding density contour clusters (Hartigan, Clustering Algorithms, 1975): connected components of the level set S_c ≡ {f > c}, where f is a probability density function. Cuevas et al. (Can. J. Stat. 28, 367–382, 2000; Comput. Stat. Data Anal. 36, 441–459, 2001) proposed a nonparametric method for density contour clusters, attempting to find density contour clusters by the minimal spanning tree. While their algorithm is conceptually simple, it requires intensive computations for large datasets. We propose a more efficient clustering method based on their algorithm with the Fast Fourier Transform (FFT). The method is applied to a study of galaxy clustering on large astronomical sky survey data.
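To make the notion of a density contour cluster concrete, here is a minimal one-dimensional sketch. It assumes a plain histogram as the density estimate and treats each run of adjacent bins with estimated density above c as one connected component of the level set. This is a simplified stand-in for illustration only, not the minimal-spanning-tree or FFT-based method of the abstract; all names and data are hypothetical.

```python
def histogram_density(data, lo, hi, bins):
    # Histogram density estimate on [lo, hi) with equal-width bins.
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x < hi:
            counts[int((x - lo) / width)] += 1
    n = len(data)
    return [count / (n * width) for count in counts]


def level_set_clusters(density, c):
    # Connected components of {f > c}: maximal runs of adjacent bins above c.
    clusters, current = [], []
    for i, f in enumerate(density):
        if f > c:
            current.append(i)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters


# Toy sample (hypothetical): two dense groups and one isolated point.
data = [0.1, 0.2, 0.25, 0.3, 0.35, 2.6, 2.7, 2.75, 2.8, 1.5]
density = histogram_density(data, lo=0.0, hi=3.0, bins=6)
clusters = level_set_clusters(density, c=0.3)
```

Lowering the threshold c admits the bin holding the isolated point as a third component, which shows how the choice of level changes the clusters that are found.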
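To address the problem of how to extract summaries automatically and effectively from massive data so as to improve retrieval efficiency, this paper proposes a method for partitioning topic regions in automatic summarization. The method performs cluster analysis on a vector model of the article's paragraphs to obtain the article's topic structure. It is applicable to texts of various styles, effectively handles the problem that topics may be distributed freely within an article, and accurately delineates the article's topic regions.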
The k-means algorithm is one of the most common nonhierarchical methods of clustering. It aims to construct clusters so as to minimize the within-cluster sum of squared distances. However, like most estimators defined in terms of objective functions depending on global sums of squares, the k-means procedure is not robust with respect to atypical observations in the data. Alternative techniques have thus been introduced in the literature, e.g., the k-medoids method. The k-means and k-medoids methodologies are particular cases of the generalized k-means procedure. In this article, the focus is on the error rate these clustering procedures achieve when one expects the data to be distributed according to a mixture distribution. Two different definitions of the error rate are under consideration, depending on the data at hand. It is shown that contamination may make one of these two error rates decrease even under optimal models. The consequence of this is emphasized through a comparison of the influence functions and breakdown points of these error rates.
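The lack of robustness mentioned above can be seen in the simplest case of a single cluster. In the sketch below, the k-means center is the mean, while the k-medoids center is taken to be the observation with the smallest total absolute distance to the others. The data and function names are hypothetical, and the example does not touch the error rates, influence functions, or breakdown points studied in the article.

```python
def mean(values):
    # Center used by k-means for a single cluster.
    return sum(values) / len(values)


def medoid(values):
    # Center used by k-medoids here: the observation that minimizes the
    # total absolute distance to all observations (first one wins on ties).
    return min(values, key=lambda m: sum(abs(m - v) for v in values))


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = clean + [100.0]  # one atypical observation

mean_shift = abs(mean(contaminated) - mean(clean))
medoid_shift = abs(medoid(contaminated) - medoid(clean))
```

With this toy data a single outlier moves the mean far from the bulk of the observations, while the medoid stays where it was.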
Mixture distribution survival trees are constructed by approximating different nodes in the tree by distinct types of mixture distributions to improve within-node homogeneity. Previously, we proposed a mixture distribution survival tree-based method for determining clinically meaningful patient groups from a given dataset of patients’ length of stay. This article extends this approach to examine the interrelationship between length of stay in hospital, outcome measures, and other covariates. We describe an application of this approach to patient pathways and examine the relationship between length of stay in hospital and/or treatment outcome using five years of retrospective data on stroke patients.

Genetic data are frequently categorical and have complex dependence structures that are not always well understood. For this reason, clustering and classification based on genetic data, while highly relevant, are challenging statistical problems. Here we consider a versatile U-statistics-based approach for non-parametric clustering that allows for an unconventional way of solving these problems. In this paper we propose a statistical test to assess group homogeneity that takes multiple testing issues into account, and a clustering algorithm based on dissimilarities within and between groups that greatly speeds up the homogeneity test. We also propose a test to verify the significance of classifying a sample into one of two groups. We present Monte Carlo simulations that evaluate the size and power of the proposed tests under different scenarios. Finally, the methodology is applied to three different genetic data sets: global human genetic diversity, breast tumour gene expression and Dengue virus serotypes. These applications showcase this statistical framework's ability to answer diverse biological questions in the high-dimension, low-sample-size scenario while adapting to the specificities of the different data types.
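Stock return volatility has the typical characteristics of a continuous function; analyzing it within the framework of continuous dynamic functions can uncover deeper information that existing discrete analysis methods cannot reveal. From the perspective of continuous dynamic functions, this paper studies the pattern categories and time-period characteristics of return volatility for the constituent stocks of the SSE 50 Index. First, driven by the information in the actual discrete observations themselves, the intrinsic return volatility functions implicit in the data are reconstructed. Next, functional principal component analysis is used to orthogonally decompose the main trends of return function volatility; building on principal-component dimension reduction with no loss of core information, adaptive-weight cluster analysis is introduced to objectively partition the pattern categories of stock return function volatility. Finally, functional analysis of variance is used to test the significance and robustness of volatility differences among the categories of return functions, and, based on a periodic division of the volatility functions into time segments, the potential-energy transformation patterns of each category's return function in different periods are displayed graphically and analyzed visually. The study finds that the dominant trend of stock return volatility in the SSE Composite Index can be decomposed into four sub-patterns, that the 50 stocks fall into five significantly distinct categories of volatility patterns, and that the differences among the five patterns appear mainly in the initial stage of the study period. This paper broadens the research perspective on classifying stock return volatility patterns and analyzing the factors behind their differences, and can provide empirical support for financial regulators in formulating management strategies and for portfolio allocation in securities markets.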
The ability to accurately forecast and control inpatient census, and thereby workloads, is a critical and long-standing problem in hospital management. The majority of current literature focuses on optimal scheduling of inpatients, but largely ignores the process of accurate estimation of the trajectory of patients throughout the treatment and recovery process. The result is that current scheduling models are optimizing based on inaccurate input data. We developed a Clustering and Scheduling Integrated (CSI) approach to capture patient flows through a network of hospital services. CSI functions by clustering patients into groups based on similarity of trajectory using a novel semi-Markov model (SMM)-based clustering scheme, as opposed to clustering by patient attributes as in previous literature. Our methodology is validated by simulation and then applied to real patient data from a partner hospital where we demonstrate that it outperforms a suite of well-established clustering methods. Furthermore, we demonstrate that extant optimization methods achieve significantly better results on key hospital performance measures under CSI, compared with traditional estimation approaches, increasing elective admissions by 97% and utilization by 22% compared to 30% and 8% using traditional estimation techniques. From a theoretical standpoint, the SMM-clustering is a novel approach applicable to any temporal-spatial stochastic data that is prevalent in many industries and application areas.
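An Empirical Study of Technological Change in China's Manufacturing Industry   Total citations: 1, self-citations: 0, citations by others: 1
Fang Hong, Wang Hongxia. Statistical Research (《统计研究》), 2008, 25(4): 40-44
The development of manufacturing bears on the development of the secondary sector and of the national economy as a whole, and manufacturing growth is the leading force behind China's industrial economic growth. Using the DEA-Malmquist index method, this paper measures changes in total factor productivity (TFP) for 20 Chinese manufacturing industries from 1997 to 2005 and, on that basis, divides manufacturing into three sectors (nondurable consumer goods, intermediate inputs, and capital goods and durable consumer goods) to compare differences in technological change across sectors. The results show that the average TFP growth rate of manufacturing was 3.76%, with technological progress the main cause of TFP growth. Among the three sectors, the nondurable consumer goods industry had the fastest average TFP growth, the capital goods and durable consumer goods industry performed best overall, and low scale efficiency held back improvement in the technical efficiency of the intermediate inputs sector, thereby hindering that sector's TFP growth.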
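For readers unfamiliar with the Malmquist index, the sketch below shows the standard multiplicative decomposition that analyses of this kind typically rely on: TFP change equals technical efficiency change times technical change, and efficiency change in turn equals pure efficiency change times scale efficiency change. The input values are hypothetical and are not taken from the paper; the DEA step that would estimate them from industry data is omitted.

```python
def malmquist_tfp_change(pure_efficiency_change, scale_efficiency_change,
                         technical_change):
    # Technical efficiency change = pure efficiency change x scale efficiency change.
    efficiency_change = pure_efficiency_change * scale_efficiency_change
    # Malmquist TFP change = efficiency change x technical change (frontier shift).
    return efficiency_change * technical_change


# Hypothetical index values for one sector (values above 1 mean improvement).
tfp_change = malmquist_tfp_change(
    pure_efficiency_change=1.01,
    scale_efficiency_change=0.98,   # scale efficiency below 1 drags TFP down
    technical_change=1.05,
)
tfp_growth_rate = tfp_change - 1.0  # growth rate as a fraction
```

In this made-up case the scale efficiency component below 1 offsets part of the technical change, the same mechanism the abstract describes for the intermediate inputs sector.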
Multivariate dose-response models have recently been proposed for developmental toxicity data to simultaneously model malformation incidence (a binary outcome) and reductions in fetal weight (a continuous outcome). In this and other applications, the binary outcome often represents a dichotomization of another outcome or a composite of outcomes, which facilitates analysis. For example, in Segment II developmental toxicology studies, multiple malformation types (i.e., external, visceral, skeletal) are evaluated on each fetus; malformation status may also be ordinally measured (e.g., normal, signs of variation, full malformation). A model is proposed for fetal weight and multiple malformation variables measured on an ordinal scale, where the correlations between the outcomes and between the offspring within a litter are taken into account. Fully specifying the joint distribution of outcomes within a litter is avoided by specifying only the distribution of the multivariate outcome for each fetus and using generalized estimating equation methodology to account for correlations due to litter clustering. The correlations between the outcomes are required to characterize joint risk to the fetus, and are therefore a focus of inference. Dose-response models and their application to quantitative risk assessment are illustrated using data from a recent developmental toxicology experiment of ethylene oxide in mice.