Featured paper
Coach-assisted multi-agent reinforcement learning framework for unexpected crashed agents
Abstract:
Multi-agent reinforcement learning is difficult to apply in practice, partially because of the gap between simulated and real-world scenarios. One reason for the gap is that simulated systems always assume that agents can work normally all the time, while in practice, one or more agents may unexpectedly "crash" during the coordination process due to inevitable hardware or software failures. Such crashes destroy the cooperation among agents and lead to performance degradation. In this work, we present a formal conceptualization of a cooperative multi-agent reinforcement learning system with unexpected crashes. To enhance the robustness of the system to crashes, we propose a coach-assisted multi-agent reinforcement learning framework that introduces a virtual coach agent to adjust the crash rate during training. We have designed three coaching strategies (fixed crash rate, curriculum learning, and adaptive crash rate) and a re-sampling strategy for our coach agent. To our knowledge, this work is the first to study unexpected crashes in a multi-agent system. Extensive experiments on grid-world and StarCraft II micromanagement tasks demonstrate the efficacy of the adaptive strategy compared with the fixed crash rate strategy and curriculum learning strategy. The ablation study further illustrates the effectiveness of our re-sampling strategy.
Keywords:
Authors:
Jian ZHAO; Youpeng ZHAO; Weixun WANG; Mingyu YANG; Xunhan HU; Wengang ZHOU; Jianye HAO; Houqiang LI
Author affiliations:
School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China; College of Intelligence and Computing, Tianjin University, Tianjin 300072, China
Citation format:
[1] Jian ZHAO; Youpeng ZHAO; Weixun WANG; Mingyu YANG; Xunhan HU; Wengang ZHOU; Jianye HAO; Houqiang LI. Coach-assisted multi-agent reinforcement learning framework for unexpected crashed agents[J]. Frontiers of Information Technology & Electronic Engineering, 2022(07): 1032-1042
Class A:
crashed,coaching,StarCraft,micromanagement
Class B:
Coach,assisted,multi,reinforcement,learning,framework,agents,Multi,difficult,apply,practice,partially,because,gap,between,simulated,real,world,scenarios,One,reason,that,systems,always,assume,can,normally,while,one,more,may,unexpectedly,during,coordination,process,due,inevitable,hardware,software,failures,Such,crashes,destroy,cooperation,among,lead,performance,degradation,In,this,present,formal,conceptualization,cooperative,To,enhance,robustness,propose,introduces,virtual,adjust,training,We,have,designed,three,strategies,fixed,curriculum,adaptive,sampling,strategy,our,knowledge,first,study,Extensive,experiments,grid,tasks,demonstrate,efficacy,compared,ablation,further,illustrates,effectiveness
AB value:
0.455142