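Research on reinforcement learning algorithms based on the average reward model
Cite this article: HUANG Bing-qiang, CAO Guang-yi, FEI Yan-qiong, WANG Zhan-quan. Research on reinforcement learning algorithms based on the average reward model[J]. Journal of University of Shanghai for Science and Technology (Social Science), 2006, 28(5): 418-422.
Authors: HUANG Bing-qiang  CAO Guang-yi  FEI Yan-qiong  WANG Zhan-quan
Affiliations: School of Electronic, Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200030 (HUANG Bing-qiang, CAO Guang-yi); School of Mechanical and Power Engineering, Shanghai Jiaotong University, Shanghai 200030 (FEI Yan-qiong); College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237 (WANG Zhan-quan)
Abstract: For cyclical tasks with absorbing goal states, reinforcement learning based on the average reward model is a reasonable approach. Average reward reinforcement learning has advantages such as fast convergence and strong robustness. This paper introduces the three main algorithms of average reward reinforcement learning, namely R-learning, H-learning and LC-learning, and presents its main applications and research directions.

Keywords: average reward reinforcement learning  R-learning  H-learning  LC-learning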

Survey of average reinforcement learning algorithms
HUANG Bing-qiang, CAO Guang-yi, FEI Yan-qiong, WANG Zhan-quan. Survey of average reinforcement learning algorithms[J]. Journal of University of Shanghai for Science and Technology (Social Science), 2006, 28(5): 418-422.
Authors:HUANG Bing-qiang  CAO Guang-yi  FEI Yan-qiong  WANG Zhan-quan
Institution: 1. School of Electronic, Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200030, China; 2. School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200030, China; 3. College of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
Abstract: For cyclical tasks with absorbing goal states, it is reasonable to adopt average reward reinforcement learning algorithms. These algorithms converge quickly and are robust. A detailed study of average reward reinforcement learning, covering R-learning, H-learning and LC-learning, is presented, and applications and directions for future research are proposed.
Keywords:average reward reinforcement learning  R-learning  H-learning  LC-learning