Minimax Q-learning design for H∞ control of linear discrete-time systems | Xinxing LI; Lele XI; Wenzhong ZHA; Zhihong PENG | School of Automation, Beijing Institute of Technology, Beijing 100081, China

Featured article

Minimax Q-learning design for H∞ control of linear discrete-time systems

Abstract:

The H∞ control method is an effective approach for attenuating the effect of disturbances on practical systems, but it is difficult to obtain the H∞ controller due to the nonlinear Hamilton–Jacobi–Isaacs equation, even for linear systems. This study deals with the design of an H∞ controller for linear discrete-time systems. To solve the related game algebraic Riccati equation (GARE), a novel model-free minimax Q-learning method is developed, on the basis of an offline policy iteration algorithm, which is shown to be Newton's method for solving the GARE. The proposed minimax Q-learning method, which employs off-policy reinforcement learning, learns the optimal control policies for the controller and the disturbance online, using only the state samples generated by the implemented behavior policies. Different from existing Q-learning methods, a novel gradient-based policy improvement scheme is proposed. We prove that the minimax Q-learning method converges to the saddle solution under initially admissible control policies and an appropriate positive learning rate, provided that certain persistence of excitation (PE) conditions are satisfied. In addition, the PE conditions can be easily met by choosing appropriate behavior policies containing certain excitation noises, without causing any excitation noise bias. In the simulation study, we apply the proposed minimax Q-learning method to design an H∞ load-frequency controller for an electrical power system generator that suffers from load disturbance, and the simulation results indicate that the obtained H∞ load-frequency controller has good disturbance rejection performance.
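The offline policy iteration mentioned in the abstract can be illustrated on a scalar zero-sum linear-quadratic game. The sketch below is not the paper's algorithm: it is a minimal model-based example, assuming a hypothetical scalar system x_{k+1} = a x_k + b u_k + e w_k with stage cost q x² + r u² − γ² w², control policy u = −K x and disturbance policy w = L x. All numerical values and function names are illustrative assumptions. In this scalar case the GARE reduces to P = q + a² P / D with D = 1 + P (b²/r − e²/γ²), and the loop alternates policy evaluation (a scalar Lyapunov equation) with a simultaneous update of both gains.

```python
def gare_residual(P, a, b, e, q, r, gamma):
    """Residual of the scalar GARE: q + a^2 P / D - P, where
    D = 1 + P (b^2 / r - e^2 / gamma^2). Zero at the solution."""
    D = 1.0 + P * (b**2 / r - e**2 / gamma**2)
    return q + a**2 * P / D - P


def policy_iteration(a, b, e, q, r, gamma, iters=30):
    """Simultaneous-update policy iteration for a scalar zero-sum LQ game.

    Starts from K = L = 0, which is admissible only if |a| < 1
    (an assumption of this sketch).
    """
    K, L = 0.0, 0.0  # u = -K x (controller), w = L x (disturbance)
    P = 0.0
    for _ in range(iters):
        a_cl = a - b * K + e * L  # closed-loop dynamics under (K, L)
        if abs(a_cl) >= 1.0:
            raise ValueError("policy pair is not stabilizing")
        # Policy evaluation: P = q + r K^2 - gamma^2 L^2 + a_cl^2 P
        P = (q + r * K**2 - gamma**2 * L**2) / (1.0 - a_cl**2)
        if gamma**2 - e**2 * P <= 0.0:
            raise ValueError("gamma is too small for a saddle point")
        # Policy improvement: stationary point of the Q-function
        # Q(x, u, w) = q x^2 + r u^2 - gamma^2 w^2 + P (a x + b u + e w)^2
        D = 1.0 + P * (b**2 / r - e**2 / gamma**2)
        K = P * a * b / (r * D)
        L = P * a * e / (gamma**2 * D)
    return P, K, L


# Illustrative parameters (assumptions, not taken from the paper).
params = dict(a=0.9, b=1.0, e=0.5, q=1.0, r=1.0, gamma=2.0)
P, K, L = policy_iteration(**params)
print("P =", P, "K =", K, "L =", L)
print("GARE residual =", gare_residual(P, **params))
```

The residual printed at the end is a quick check that the returned P satisfies the scalar GARE. The model-free method described in the abstract replaces the evaluation step, which here uses a, b, and e directly, with estimates learned from state samples.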

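Chinese Library Classification (CLC) number:

[1] Mathematical sciences and chemistry (O) / Mechanics (O3) / Vibration theory (O32) / Nonlinear vibration (O322)

[2] Automation and computer technology (TP) / Basic theory of automation (TP1) / Automatic control theory (TP13)

[3] Medicine and health (R) / Basic medicine (R3) / Pathology (R36) / Pathological processes (R364)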

Authors:

Xinxing LI; Lele XI; Wenzhong ZHA; Zhihong PENG

Affiliations:

Information Science Academy, China Electronics Technology Group Corporation, Beijing 100086, China; School of Automation, Beijing Institute of Technology, Beijing 100081, China; Peng Cheng Laboratory, Shenzhen 518052, China

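Source:

Frontiers of Information Technology & Electronic Engineering (信息与电子工程前沿（英文）)

Citation format:

[1] Xinxing LI; Lele XI; Wenzhong ZHA; Zhihong PENG. Minimax Q-learning design for H∞ control of linear discrete-time systems [J]. Frontiers of Information Technology & Electronic Engineering, 2022(03): 438-451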

Similar articles

Towards autonomous and optimal excavation of shield machine: a deep reinforcement learning-based approach

Ya-kun ZHANG; Guo-fang GONG; Hua-yong YANG; Yu-xi CHEN; Geng-lin CHEN - State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China; School of Electrical and Power Engineering, China University of Mining and Technology, Xuzhou 221116, China

Dual-steering mode based on direct yaw moment control for multi-wheel hub motor driven vehicles: Theoretical design and experimental assessment

Zheng Zhang; Xiao-jun Ma; Chun-guang Liu; Shu-guang Wei - Department of Weapons and Control Engineering, Army Academy of Armored Forces, Beijing 100072, China

Knowledge transfer in multi-agent reinforcement learning with incremental number of agents

LIU Wenzhang; DONG Lu; LIU Jian; SUN Changyin - School of Automation, Southeast University, Nanjing 210096, China; School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China

Joint optimization of inspection-based and age-based preventive maintenance and spare ordering policies for single-unit systems

MA Weining; ZHAO Fei; LI Xin; HU Qiwei; SHANG Bingcong - Department of Equipment Command and Management Shijiazhuang Campus, Army Engineering University, Shijiazhuang 050003, China; School of Business Administration, Northeastern University, Shenyang 110057, China; Northeastern University at Qinhuangdao, Qinhuangdao 066099, China

Memory-augmented adaptive flocking control for multi-agent systems subject to uncertain external disturbances