Hierarchical reinforcement learning guidance with threat avoidance | LI Bohao; WU Yunjie; LI Guofei | School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China

Featured article

Hierarchical reinforcement learning guidance with threat avoidance

Abstract:

The guidance strategy is a critical factor in determining the strike effectiveness of a missile operation. A novel guidance law is presented that exploits deep reinforcement learning (DRL) with a hierarchical deep deterministic policy gradient (DDPG) algorithm. The reward functions are constructed to minimize the line-of-sight (LOS) angle rate and to avoid the threat posed by opposing obstacles. To attenuate chattering of the acceleration, a hierarchical reinforcement learning structure and an improved reward function with an action penalty are put forward. The simulation results validate that a missile guided by the proposed method can hit the target successfully and keep away from the threatened areas effectively.
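The abstract names three reward ingredients (LOS angle rate, threat avoidance, action penalty) without giving their form. The following is only a minimal Python sketch of how such a shaped reward might be combined. The function name `guidance_reward`, the circular threat zone, the weights `w_los`, `w_threat`, `w_action`, and the choice to penalize the change in commanded acceleration are all illustrative assumptions, not the authors' formulation.

```python
import math


def guidance_reward(los_rate, missile_pos, threat_center, threat_radius,
                    accel, prev_accel,
                    w_los=1.0, w_threat=10.0, w_action=0.1):
    """Hypothetical shaped reward; the form and the weights are assumptions."""
    # Penalize the magnitude of the line-of-sight (LOS) angle rate.
    r_los = -w_los * abs(los_rate)

    # Penalize penetration into an assumed circular threat zone
    # (the term is zero when the missile is outside the zone).
    dist = math.dist(missile_pos, threat_center)
    r_threat = -w_threat * max(0.0, threat_radius - dist) / threat_radius

    # Action penalty: discourage abrupt changes in the commanded
    # acceleration, which is one way to reduce chattering.
    r_action = -w_action * (accel - prev_accel) ** 2

    return r_los + r_threat + r_action
```

In this sketch a step with zero LOS rate, a position outside the threat zone, and an unchanged acceleration command receives zero reward; any deviation makes the reward negative, so a DDPG-style agent maximizing it is pushed toward all three goals at once.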

Keywords:

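Chinese Library Classification (CLC):

[1] Automation Technology, Computer Technology (TP) / Basic Theory of Automation (TP1) / Artificial Intelligence Theory (TP18)

[2] Automation Technology, Computer Technology (TP) / Computing Technology, Computer Technology (TP3) / Computer Applications (TP39) / Information Processing (TP391)

[3] Medicine, Health (R) / Neurology and Psychiatry (R74) / Neurology (R741)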

Authors:

LI Bohao; WU Yunjie; LI Guofei

Author affiliations:

State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China; School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China; Science and Technology on Aircraft Control Laboratory, Beijing 100191, China; School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China

Source:

Journal of Systems Engineering and Electronics (系统工程与电子技术（英文版）)

Citation format:

[1] LI Bohao; WU Yunjie; LI Guofei. Hierarchical reinforcement learning guidance with threat avoidance[J]. Journal of Systems Engineering and Electronics (系统工程与电子技术（英文版）), 2022(05): 1173-1185

Class A terms:

Class B terms:

Hierarchical, reinforcement, learning, guidance, avoidance, strategy, extremely, critical, determining, striking, missile, operation, novel, law, presented, by, exploiting, deep, DRL, hierarchical, deterministic, policy, gradient, DDPG, algorithm, reward, functions, constructed, minimize, line, sight, LOS, angle, caused, opposed, obstacles, To, attenuate, chattering, acceleration, structure, improved, action, penalty, put, forward, simulation, results, validate, that, under, proposed, method, can, hit, target, successfully, keep, away, from, threatened, areas, effectively

AB value:

0.653123

Similar articles

Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach

Ya-kun ZHANG; Guo-fang GONG; Hua-yong YANG; Yu-xi CHEN; Geng-lin CHEN - State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China; School of Electrical and Power Engineering, China University of Mining and Technology, Xuzhou 221116, China

Instability energy mechanism of super-large section crossing chambers in deep coal mines

Deyuan Fan; Xuesheng Liu; Yunliang Tan; Xuebin Li; Purev Lkhamsuren - College of Energy and Mining Engineering, Shandong University of Science and Technology, Qingdao 266590, China; State Key Laboratory of Mining Disaster Prevention and Control, Shandong University of Science and Technology, Qingdao 266590, China; School of Geology and Mining, Mongolian University of Science and Technology, Ulaanbaatar 120646, Mongolia

Active control of flow past an elliptic cylinder using an artificial neural network trained by deep reinforcement learning

Bofu WANG; Qiang WANG; Quan ZHOU; Yulu LIU - Shanghai Key Laboratory of Mechanics in Energy Engineering, Shanghai Institute of Applied Mathematics and Mechanics, School of Mechanics and Engineering Science, Shanghai University, Shanghai 200072, China; Shanghai Frontiers Science Base for Mechanoinfomatic, Shanghai University, Shanghai 200072, China; School of Science, Shanghai Institute of Technology, Shanghai 201418, China

Minimax Q-learning design for H∞ control of linear discrete-time systems

Xinxing LI; Lele XI; Wenzhong ZHA; Zhihong PENG - Information Science Academy, China Electronics Technology Group Corporation, Beijing 100086, China; School of Automation, Beijing Institute of Technology, Beijing 100081, China; Peng Cheng Laboratory, Shenzhen 518052, China

Multi-agent differential game based cooperative synchronization control using a data-driven method