Featured paper
A deep reinforcement learning method for multi-stage equipment development planning in uncertain environments
Abstract:
Equipment development planning (EDP) is usually a long-term process often performed in an environment with high uncertainty. The traditional multi-stage dynamic programming cannot cope with this kind of uncertainty with unpredictable situations. To deal with this problem, a multi-stage EDP model based on a deep reinforcement learning (DRL) algorithm is proposed to respond quickly to any environmental changes within a reasonable range. Firstly, the basic problem of multi-stage EDP is described, and a mathematical planning model is constructed. Then, for two kinds of uncertainties (future capability requirements and the amount of investment in each stage), a corresponding DRL framework is designed to define the environment, state, action, and reward function for multi-stage EDP. After that, the dueling deep Q-network (Dueling DQN) algorithm is used to solve the multi-stage EDP to generate an approximately optimal multi-stage equipment development scheme. Finally, a case of ten kinds of equipment in 100 possible environments, which are randomly generated, is used to test the feasibility and effectiveness of the proposed models. The results show that the algorithm can respond instantaneously in any state of the multi-stage EDP environment and, unlike traditional algorithms, the algorithm does not need to re-optimize the problem for any change in the environment. In addition, the algorithm can flexibly adjust at subsequent planning stages in the event of a change to the equipment capability requirements to adapt to the new requirements.
Authors:
LIU Peng; XIA Boyuan; YANG Zhiwei; LI Jichao; TAN Yuejin
Affiliation:
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Citation:
[1] LIU Peng; XIA Boyuan; YANG Zhiwei; LI Jichao; TAN Yuejin. A deep reinforcement learning method for multi-stage equipment development planning in uncertain environments[J]. Journal of Systems Engineering and Electronics, 2022(06): 1159-1175