
Featured article

A deep reinforcement learning method for multi-stage equipment development planning in uncertain environments

Abstract:

Equipment development planning (EDP) is usually a long-term process, often performed in a highly uncertain environment. Traditional multi-stage dynamic programming cannot cope with this kind of uncertainty, in which situations are unpredictable. To deal with this problem, a multi-stage EDP model based on a deep reinforcement learning (DRL) algorithm is proposed to respond quickly to any environmental changes within a reasonable range. First, the basic problem of multi-stage EDP is described, and a mathematical planning model is constructed. Then, for two kinds of uncertainty (future capability requirements and the amount of investment in each stage), a corresponding DRL framework is designed to define the environment, state, action, and reward function for multi-stage EDP. After that, the dueling deep Q-network (Dueling DQN) algorithm is used to solve the multi-stage EDP and generate an approximately optimal multi-stage equipment development scheme. Finally, a case of ten kinds of equipment in 100 randomly generated possible environments is used to test the feasibility and effectiveness of the proposed models. The results show that the algorithm can respond instantaneously in any state of the multi-stage EDP environment and that, unlike traditional algorithms, it does not need to re-optimize the problem whenever the environment changes. In addition, when the equipment capability requirements change, the algorithm can flexibly adjust the subsequent planning stages to adapt to the new requirements.
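
The abstract names the dueling deep Q-network but does not give the network architecture, state encoding, or reward function used in the paper. As a rough illustration only, the sketch below shows the mean-subtracted aggregation step that dueling architectures commonly use to combine a state value V(s) with per-action advantages A(s, a). The function names and the numbers are hypothetical and are not taken from the paper.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q-values.

    Uses the mean-subtracted form common in dueling networks:
        Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')
    """
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical numbers: three candidate development actions at one planning stage.
q = dueling_q_values(state_value=2.0, advantages=[1.0, 3.0, -1.0])
best = greedy_action(q)
```

In a trained agent the value and advantage terms would come from two heads of a neural network evaluated on the current planning state. Because selecting an action is then a single forward pass plus an argmax, this is consistent with the abstract's claim that no re-optimization is needed when the environment changes.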

Keywords:

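Chinese Library Classification (CLC) number:

[1] Automation technology, computer technology (TP) / Computing technology, computer technology (TP3) / Computer applications (TP39) / Information processing (TP391)

[2] Medicine, health (R) / Pharmacy (R9) / Pharmacology (R96) / Experimental pharmacology (R965)

[3] Automation technology, computer technology (TP) / Computing technology, computer technology (TP3) / Computer applications (TP39) / Information processing (TP391) / Pattern recognition and devices (TP391.4)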

Authors:

LIU Peng; XIA Boyuan; YANG Zhiwei; LI Jichao; TAN Yuejin

Affiliation:

College of Systems Engineering, National University of Defense Technology, Changsha 410073, China

Source:

Journal of Systems Engineering and Electronics

Citation format:

[1] LIU Peng; XIA Boyuan; YANG Zhiwei; LI Jichao; TAN Yuejin. A deep reinforcement learning method for multi-stage equipment development planning in uncertain environments[J]. Journal of Systems Engineering and Electronics, 2022(06): 1159-1175

Similar articles

Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach

Ya-kun ZHANG; Guo-fang GONG; Hua-yong YANG; Yu-xi CHEN; Geng-lin CHEN - State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China; School of Electrical and Power Engineering, China University of Mining and Technology, Xuzhou 221116, China

Multi-agent deep reinforcement learning for end–edge orchestrated resource allocation in industrial wireless networks

Xiaoyu LIU; Chi XU; Haibin YU; Peng ZENG - State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China; Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; University of Chinese Academy of Sciences, Beijing 100049, China

Behavioral control task supervisor with memory based on reinforcement learning for human-multi-robot coordination systems

Jie HUANG; Zhibin MO; Zhenyi ZHANG; Yutao CHEN - School of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China; G+Industrial Internet Institute, Fuzhou University, Fuzhou 350108, China; Key Laboratory of Industrial Automation Control Technology and Information Processing of Fujian Province, Fuzhou University, Fuzhou 350108, China

A deep Q-learning network based active object detection model with a novel training algorithm for service robots

Shaopeng LIU; Guohui TIAN; Yongcheng CUI; Xuyang SHAO - School of Control Science and Engineering, Shandong University, Jinan 250061, China

Image-based traffic signal control via world models