Featured Article
Reinforcement Learning Behavioral Control for Nonlinear Autonomous System
Abstract:
Behavior-based autonomous systems rely on human intelligence to resolve multi-mission conflicts by designing mission priority rules and nonlinear controllers. In this work, a novel two-layer reinforcement learning behavioral control (RLBC) method is proposed to reduce such dependence by trial-and-error learning. Specifically, in the upper layer, a reinforcement learning mission supervisor (RLMS) is designed to learn the optimal mission priority. Compared with existing mission supervisors, the RLMS improves the dynamic performance of mission priority adjustment by maximizing cumulative rewards and reducing hardware storage demand when using neural networks. In the lower layer, a reinforcement learning controller (RLC) is designed to learn the optimal control policy. Compared with existing behavioral controllers, the RLC reduces the control cost of mission priority adjustment by balancing control performance and consumption. All error signals are proved to be semi-globally uniformly ultimately bounded (SGUUB). Simulation results show that the number of mission priority adjustment and the control cost are significantly reduced compared to some existing mission supervisors and behavioral controllers, respectively.
Keywords:
CLC number:
Authors:
Zhenyi Zhang; Zhibin Mo; Yutao Chen; Jie Huang
Author affiliations:
College of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108; Key Laboratory of Industrial Automation Control Technology and Information Processing, Education Department of Fujian Province, Fuzhou 350108; G+Industrial Internet Institute, Fuzhou University, Fuzhou 350108, China
Source:
Citation format:
[1] Zhenyi Zhang; Zhibin Mo; Yutao Chen; Jie Huang. Reinforcement Learning Behavioral Control for Nonlinear Autonomous System [J]. Acta Automatica Sinica (English Edition), 2022(09): 1561-1573
Class A:
RLBC,SGUUB
Class B:
Reinforcement,Learning,Behavioral,Control,Nonlinear,Autonomous,System,autonomous,systems,rely,human,intelligence,resolve,multi,mission,conflicts,by,designing,priority,rules,nonlinear,controllers,In,this,novel,layer,reinforcement,learning,behavioral,method,proposed,such,dependence,trial,error,Specifically,upper,RLMS,designed,optimal,Compared,existing,supervisors,improves,dynamic,performance,adjustment,maximizing,cumulative,rewards,reducing,hardware,storage,demand,when,using,neural,networks,lower,RLC,policy,reduces,cost,balancing,consumption,All,signals,proved,semi,globally,uniformly,ultimately,bounded,Simulation,results,show,that,number,significantly,reduced,compared,some,respectively
AB value:
0.516452
Similar Articles
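The machine-assigned CLC number is generated automatically by Yutian Data Technology (域田数据科技) from publicly available online material and is provided for study and research reference only.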