18 similar articles found.
1.
2.
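Research on the class overlap problem in classification and methods for handling it  Total citations: 1, self-citations: 0, citations by others: 1
Class overlap is one of the bottleneck problems in data mining and machine learning, and it becomes more complicated when class imbalance is also present. Building on the existing literature, this paper summarizes three class-overlap learning algorithms and proposes a new one, the separating method. It also applies the support vector data description algorithm for the first time to identify overlapping samples in real data, and systematically studies the class overlap problem and its interaction with class imbalance. Experiments with five classifiers on real data show that: 1) in most cases the separating method is the best-performing class-overlap learning algorithm; 2) the separating method is usually more effective for classifiers based on decision boundaries than for rule-based ones; 3) the separating method performs well under class imbalance, especially when the base classifier is a support vector machine. Finally, a theoretical analysis of the experimental results for support vector machines is given.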
3.
4.
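To overcome the limitations of traditional machine learning methods in R&D project management, this paper examines the feasibility and effectiveness of applying support vector machines to R&D project termination decisions, and proposes a diagnostic method for such decisions based on support vector machines and a genetic algorithm. Using a collected two-class sample of R&D projects as an example, an empirical study applying the method verifies its feasibility and effectiveness.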
5.
6.
7.
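Starting from three shortcomings of software project performance evaluation (an incomplete indicator system, non-standardized evaluation methods, and models that consider too few factors), this paper applies statistical analysis theory to build an indicator system covering the state of the software organization and the characteristics of the software project itself; defines the meaning of software project performance through a literature review; proposes a new network topology design method and builds a software project performance evaluation model based on a fuzzy neural network; and introduces an improved particle swarm learning algorithm that accurately and efficiently determines the connection weight coefficients of the evaluation model. An empirical study shows that the model can effectively evaluate software project performance and identify project risk factors, providing decision support information for software organizations to formulate risk-avoidance strategies and improve project performance.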
8.
9.
10.
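An investment risk measurement method based on non-normal distributions  Total citations: 2, self-citations: 0, citations by others: 2
This paper quantitatively analyzes the risk of investment projects by computing the risk point, and introduces the equivalent normalization method to measure risk when the independent variables are non-normally distributed. On this basis, a new risk measurement method for investment projects is proposed.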
11.
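This paper studies the problems microfinance companies face when assessing customers' credit risk, builds a credit risk assessment indicator system, and remedies the shift of the separating hyperplane that occurs when a support vector machine (SVM) classifies imbalanced samples. It first analyzes characteristics of microfinance companies such as strongly regional business, non-standardized sources of credit data, and inconsistent evaluation criteria, and gives indicators along four dimensions for customer credit risk assessment. Because the traditional SMOTE algorithm operates on all minority-class samples when handling imbalanced data, the paper proposes an improvement that synthesizes new samples only from misclassified samples, and gives the specific algorithm steps. Applied to a customer credit risk assessment case at a microfinance company, the improved algorithm achieves higher classification accuracy than the other algorithms, indicating that the method is feasible and effective.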
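The abstract above does not list the algorithm steps, so the following is only a minimal sketch of the stated idea: apply SMOTE-style interpolation to the misclassified minority samples instead of to all minority samples. The function name `smote_on_misclassified`, the `predict` callable, the parameters `k` and `per_sample`, and the choice to draw neighbours from all minority samples are assumptions made for illustration, not details taken from the paper.

```python
import math
import random


def smote_on_misclassified(minority, predict, minority_label, k=3, per_sample=1, seed=0):
    """Create synthetic points only from minority samples the classifier gets wrong.

    minority: list of numeric tuples (the minority-class samples).
    predict: hypothetical callable mapping a sample to a predicted label.
    """
    rng = random.Random(seed)
    # Keep only the minority samples that the current classifier misclassifies.
    wrong = [x for x in minority if predict(x) != minority_label]
    synthetic = []
    for x in wrong:
        # k nearest minority neighbours of x, excluding x itself (assumed choice).
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        if not neighbours:
            continue
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            u = rng.random()
            # Standard SMOTE step: a random point on the segment from x to nb.
            synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nb)))
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
    # Toy classifier: predicts the minority label 1 only when the first feature is below 3.
    toy_predict = lambda x: 1 if x[0] < 3 else 0
    print(smote_on_misclassified(minority, toy_predict, minority_label=1, per_sample=2))
```

In this toy run only the point (5.0, 5.0) is misclassified, so both synthetic points lie on segments that start at it. In practice the synthetic points would be added to the training set and the classifier retrained.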
12.
For several years machine learning methods have been proposed for risk classification. While machine learning methods have also been used for failure diagnosis and condition monitoring, to the best of our knowledge, these methods have not been used for probabilistic risk assessment. Probabilistic risk assessment is a subjective process. The problem of how well machine learning methods can emulate expert judgments is challenging. Expert judgments are based on mental shortcuts, heuristics, which are susceptible to biases. This paper presents a process for developing natural language-based probabilistic risk assessment models, applying deep learning algorithms to emulate experts’ quantified risk estimates. This allows the risk analyst to obtain an a priori risk assessment when there is limited information in the form of text and numeric data. Universal sentence embedding (USE) with gradient boosting regression (GBR) trees trained over limited structured data presented the most promising results. When we apply these models’ outputs to generate survival distributions for autonomous systems’ likelihood of loss with distance, we observe that for open water and ice shelf operating environments, the differences between the survival distributions generated by the machine learning algorithm and those generated by the experts are not statistically significant.
13.
14.
Herbert Moskowitz, Paul Drnevich, Okan Ersoy, Kemal Altinkemer, Alok Chaturvedi. Decision Sciences, 2011, 42(2): 477-493
Multi‐organizational collaborative decision making in high‐magnitude crisis situations requires real‐time information sharing and dynamic modeling for effective response. Information technology (IT) based decision support tools can play a key role in facilitating such effective response. We explore one promising class of decision support tools based on machine learning, known as support vector machines (SVM), which have the capability to dynamically model and analyze decision processes. To examine this capability, we use a case study with a design science approach to evaluate improved decision‐making effectiveness of an SVM algorithm in an agent‐based simulation experimental environment. Testing and evaluation of real‐time decision support tools in simulated environments provides an opportunity to assess their value under various dynamic conditions. Decision making in high‐magnitude crisis situations involves multiple different patterns of behavior, requiring the development, application, and evaluation of different models. Therefore, we employ a multistage linear support vector machine (MLSVM) algorithm that permits partitioning decision maker response into behavioral subsets, which can then individually model and examine their diverse patterns of response behavior. The results of our case study indicate that our MLSVM is clearly superior to both single stage SVMs and traditional approaches such as linear and quadratic discriminant analysis for understanding and predicting behavior. We conclude that machine learning algorithms show promise for quickly assessing response strategy behavior and for providing the capability to share information with decision makers in multi‐organizational collaborative environments, thus supporting more effective decision making in such contexts.
15.
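Credit risk evaluation is one of the key steps in risk prevention and control at financial institutions. In recent years, credit risk evaluation models based on machine learning have drawn growing attention for their predictive accuracy, but their weak interpretability prevents investors from fully trusting their predictions. To address this, the paper proposes an improved pedagogical method that uses a machine learning model to guide the generation of a credit risk evaluation decision tree balancing accuracy and interpretability, to support investors' decisions. To improve the decision tree's ability to learn the correct functions of the machine learning model, a pseudo-dataset generation method based on the Weight Synthetic Minority Over-sampling Technique (Weight-SMOTE) is proposed, which raises the proportion of pseudo-samples in the pseudo-dataset that are labeled by high-confidence functions. To achieve an effective trade-off among the accuracy of the generated tree, its interpretability, and its consistency with the machine learning model, a new pruning method is proposed for the tree generation process. To address the limitations of the fidelity metric, a true-fidelity metric is proposed to measure how closely the decision tree approximates the correct functions of the machine learning model. Finally, the improved pedagogical method is validated on three real credit risk evaluation datasets; the experimental results show that it can generate accurate and interpretable credit risk evaluation models that meet investors' decision preferences and practical needs.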
16.
In this paper we consider the scheduling problem with machine cost and rejection penalties. We are given a sequence of independent jobs, each characterized by its processing time (size) and its penalty. No machine is initially provided, and when a job is revealed the algorithm has the option to purchase new machines. When a new job arrives, we have the following choices: (i) reject it, in which case we pay its penalty; (ii) non-preemptively process it on an existing machine, which contributes to the machine load; (iii) purchase a new machine and assign the job to it. The objective is to minimize the sum of the makespan, the cost of purchasing machines, and the total penalty of all rejected jobs. For the small-job case (where all jobs have sizes no greater than the cost of purchasing one machine, and which generalizes the Ski-Rental Problem), we present an optimal online algorithm with a competitive ratio of 2.
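To illustrate where a competitive ratio of 2 comes from, the sketch below implements the classic break-even rule for the Ski-Rental Problem named in the abstract. It is not the paper's scheduling algorithm, which the abstract does not spell out. The function names and the unit rental price per day are assumptions made for the example.

```python
def break_even_cost(days, buy_cost):
    """Online cost of the break-even rule with a rental price of 1 per day.

    Rent for the first buy_cost - 1 days; if a further day is needed, buy.
    The number of days is not known in advance.
    """
    if days < buy_cost:
        return days  # the season ended before buying was triggered
    return (buy_cost - 1) + buy_cost  # rent paid so far plus the purchase


def offline_optimum(days, buy_cost):
    """Cost of the best decision when the number of days is known in advance."""
    return min(days, buy_cost)


if __name__ == "__main__":
    buy_cost = 10
    for days in (3, 10, 40):
        online = break_even_cost(days, buy_cost)
        best = offline_optimum(days, buy_cost)
        print(days, online, best, online <= 2 * best)
```

Whatever the number of days turns out to be, the online cost never exceeds twice the offline optimum, which is the sense in which the rule is 2-competitive.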
17.
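Under the premise of project cooperation, it is first assumed that alliance members jointly undertake a project by contributing three kinds of resources (equipment, labor, and capital) and that the project's maximum output value can be predicted. Three utility functions are constructed for a single alliance member engaged in project cooperation, covering the firm's revenue, resource constraints, and resource input risk, and from these a resource-based composite utility function for a single member is derived. On this basis, a utility function model for multiple alliance members is built with the input and output of the three resources as dependent variables, and finally a new neural network method is applied to solve for the variable values of this latter model by simulation.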
18.
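The permutation assembly-line scheduling problem (PASP) in enterprises is a typical NP-hard production scheduling problem and a major concern of modern integrated manufacturing systems (CIMS). The problem can be described as follows: n jobs are to be processed on m machines; each job goes through m operations, each requiring a different machine; the n jobs pass through the m machines in the same order, and their processing order is the same on every machine. The main objective is to find the processing order of the n jobs that minimizes the maximum completion time (makespan). Because PASP is NP-hard, this paper solves it with a genetic algorithm. Although genetic algorithms are commonly used for scheduling problems, their selection and crossover mechanisms tend to cause convergence to local optima and slow convergence. This paper therefore proposes an improved genetic algorithm based on block mining and recombination for the permutation assembly-line scheduling problem. Association rules are first used to mine different high-quality genes; genes with better results are then combined into dominant blocks to generate advantageous artificial solutions; and a fast-converging local search method is introduced to improve both the chance of finding the optimal solution and the convergence efficiency. The solution quality and efficiency of the improved genetic algorithm are verified on the Taillard benchmark instances from OR-Library. The results show that, compared with five well-known existing algorithms for scheduling problems, the proposed algorithm converges faster and produces better solutions.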
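The objective described above can be made concrete with a short sketch that computes the makespan of a given job order, plus a brute-force search that is only usable for tiny instances. This is not the block-mining genetic algorithm from the abstract; the function names and the toy processing times are assumptions chosen for illustration.

```python
from itertools import permutations


def makespan(order, proc):
    """Maximum completion time when jobs visit machines 0..m-1 in the given order.

    proc[j][k] is the processing time of job j on machine k.
    """
    m = len(proc[0])
    finish = [0] * m  # finish[k]: completion time of the latest job on machine k
    for j in order:
        for k in range(m):
            # Job j starts on machine k once the machine is free and
            # job j has finished on the previous machine.
            ready = finish[k - 1] if k > 0 else 0
            finish[k] = max(finish[k], ready) + proc[j][k]
    return finish[-1]


def best_order_bruteforce(proc):
    """Try every permutation; feasible only for a handful of jobs."""
    jobs = range(len(proc))
    best = min(permutations(jobs), key=lambda order: makespan(order, proc))
    return best, makespan(best, proc)


if __name__ == "__main__":
    # Toy instance (assumed data): 2 jobs, 2 machines.
    proc = [[3, 2], [1, 4]]
    print(makespan([0, 1], proc))       # order job 0 then job 1
    print(best_order_bruteforce(proc))  # best order and its makespan
```

The number of permutations grows as n!, which is why heuristics such as genetic algorithms are used on benchmark-sized instances; the `makespan` function would serve as the fitness evaluation in such a search.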