Representative paper
Efficient Exploration for Multi-Agent Reinforcement Learning via Transferable Successor Features
Abstract:
In multi-agent reinforcement learning (MARL), the behaviors of each agent can influence the learning of others, and the agents have to search in an exponentially enlarged joint-action space. Hence, it is challenging for the multi-agent teams to explore in the environment. Agents may achieve suboptimal policies and fail to solve some complex tasks. To improve the exploring efficiency as well as the performance of MARL tasks, in this paper, we propose a new approach by transferring the knowledge across tasks. Differently from the traditional MARL algorithms, we first assume that the reward functions can be computed by linear combinations of a shared feature function and a set of task-specific weights. Then, we define a set of basic MARL tasks in the source domain and pre-train them as the basic knowledge for further use. Finally, once the weights for target tasks are available, it will be easier to get a well-performed policy to explore in the target domain. Hence, the learning process of agents for target tasks is speeded up by taking full use of the basic knowledge that was learned previously. We evaluate the proposed algorithm on two challenging MARL tasks: cooperative box-pushing and non-monotonic predator-prey. The experiment results have demonstrated the improved performance compared with state-of-the-art MARL algorithms.
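The linear-reward assumption in the abstract can be illustrated with a minimal sketch. It assumes the standard successor-feature identities, r = φ · w and Q(s, a) = ψ(s, a) · w, and uses generalized policy improvement (GPI) over pre-trained source policies as one common way to reuse them. The abstract does not say whether the paper uses GPI, and every name and number below is hypothetical.

```python
# Illustrative sketch only: names and numbers are hypothetical, not from the paper.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def reward(phi, w):
    """Linear reward assumption: r = phi . w (shared features, task weights)."""
    return dot(phi, w)

def q_values(psi, w):
    """Q(s, a) = psi(s, a) . w for each action; psi[a] is the successor
    feature vector of one policy at a fixed state for action a."""
    return [dot(psi_a, w) for psi_a in psi]

def gpi_action(source_psis, w_target):
    """Assumed reuse rule (GPI): for each action take the best value over all
    source policies under the target weights, then act greedily."""
    n_actions = len(source_psis[0])
    best = [max(dot(psi[a], w_target) for psi in source_psis)
            for a in range(n_actions)]
    action = max(range(n_actions), key=lambda a: best[a])
    return action, best

# Two hypothetical source policies, two actions, two feature dimensions.
psi_1 = [[1.0, 0.0], [0.5, 0.5]]
psi_2 = [[0.0, 1.0], [0.0, 0.5]]
w_target = [1.0, 2.0]  # hypothetical target-task weights

action, best = gpi_action([psi_1, psi_2], w_target)
```

Because only `w_target` changes between tasks, the pre-trained successor features can be re-evaluated for a new task without retraining, which is the sense in which the source-domain knowledge transfers.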
Keywords:
Authors:
Wenzhang Liu;Lu Dong;Dan Niu;Changyin Sun
Author affiliations:
School of Artificial Intelligence, Anhui University, Hefei 230039, and also with the Peng Cheng Laboratory, Shenzhen 518055, China; School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China; School of Automation, Southeast University, Nanjing 210096, China; School of Automation, Southeast University, Nanjing 210096; Peng Cheng Laboratory, Shenzhen 518055, China
Citation format:
[1] Wenzhang Liu; Lu Dong; Dan Niu; Changyin Sun. Efficient Exploration for Multi-Agent Reinforcement Learning via Transferable Successor Features [J]. IEEE/CAA Journal of Automatica Sinica, 2022(09): 1673-1686
Class A:
Successor
Class B:
Efficient,Exploration,Multi,Reinforcement,Learning,via,Transferable,Features,In,multi,reinforcement,learning,MARL,behaviors,each,can,influence,others,agents,have,search,exponentially,enlarged,joint,action,space,Hence,challenging,teams,explore,environment,Agents,may,achieve,suboptimal,policies,fail,solve,some,complex,tasks,To,exploring,efficiency,well,performance,this,paper,new,approach,by,transferring,knowledge,across,Differently,from,traditional,algorithms,first,assume,that,reward,functions,computed,linear,combinations,shared,feature,set,specific,weights,Then,define,basic,source,domain,train,them,further,use,Finally,once,target,available,will,easier,performed,policy,process,speeded,up,taking,full,was,learned,previously,We,evaluate,proposed,two,cooperative,box,pushing,monotonic,predator,prey,experiment,results,demonstrated,improved,compared,state,art
AB value:
0.575581