Representative paper
                Hybrid Q-learning for data-based optimal control of non-linear switching system
            Abstract:
                    In this paper, the optimal control of non-linear switching system is investigated without knowing the system dynamics. First, the Hamilton-Jacobi-Bellman (HJB) equation is derived with the consideration of hybrid action space. Then, a novel data-based hybrid Q-learning (HQL) algorithm is proposed to find the optimal solution in an iterative manner. In addition, the theoretical analysis is provided to illustrate the convergence and optimality of the proposed algorithm. Finally, the algorithm is implemented with the actor-critic (AC) structure, and two linear-in-parameter neural networks are utilized to approximate the functions. Simulation results validate the effectiveness of the data-driven method.
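                    The iterative, data-based idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's HQL algorithm: it is a simplified value-iteration style Q-learning on a hypothetical two-mode scalar switched linear system, with a quadratic cost and a linear-in-parameter critic whose features are [x^2, x*u, u^2]. The system constants `A`, `B`, the cost weights, and all function names are assumptions made for illustration. The hybrid action is the pair (mode, u); the minimizing control is obtained in closed form from the fitted weights rather than from a separate actor network, and the learner only touches sampled transitions.

```python
import numpy as np

# Hypothetical two-mode scalar switched system x' = A[v]*x + B[v]*u.
# These constants are used only to generate data; the learner never reads them.
A = np.array([0.9, 1.2])
B = np.array([0.5, 1.0])
Q_COST, R_COST = 1.0, 1.0  # stage cost: Q_COST*x^2 + R_COST*u^2


def features(x, u):
    # Linear-in-parameter basis for a quadratic Q-function of (x, u).
    return np.stack([x * x, x * u, u * u], axis=1)


def collect_data(n=400, seed=0):
    # Sample transitions (x, mode, u, cost, x_next) with random hybrid actions.
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, n)
    v = rng.integers(0, 2, n)
    u = rng.uniform(-1.0, 1.0, n)
    x_next = A[v] * x + B[v] * u
    cost = Q_COST * x**2 + R_COST * u**2
    return x, v, u, cost, x_next


def hybrid_q_iteration(data, n_iter=200):
    x, v, u, cost, x_next = data
    p = 0.0  # value function V(x) = p*x^2, starting from V = 0
    weights = np.zeros((2, 3))  # one weight vector per discrete mode
    for _ in range(n_iter):
        # Bellman target: stage cost plus the current value at the next state.
        target = cost + p * x_next**2
        for mode in range(2):
            m = v == mode
            weights[mode], *_ = np.linalg.lstsq(
                features(x[m], u[m]), target[m], rcond=None
            )
        # Minimize each mode's quadratic Q over u, then over the mode.
        p_modes = weights[:, 0] - weights[:, 1] ** 2 / (4.0 * weights[:, 2])
        p = float(p_modes.min())
    best_mode = int(np.argmin(p_modes))
    gain = -weights[best_mode, 1] / (2.0 * weights[best_mode, 2])  # u = gain*x
    return p, best_mode, gain, weights
```

                    Calling `hybrid_q_iteration(collect_data())` returns the value coefficient, the selected mode, and the feedback gain. In the scalar case the best mode does not depend on the state, which keeps the value function exactly quadratic; for a non-linear or higher-dimensional system this shortcut would not hold and a richer approximator would be needed.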
                Authors:
                    
                        LI Xiaofeng; DONG Lu; SUN Changyin
                    
                Affiliations:
                    School of Automation, Southeast University, Nanjing 210096, China; School of Artificial Intelligence, Anhui University, Hefei 230601, China; School of Cyber Science and Engineering, Southeast University, Nanjing 211189, China
                Citation format:
                    
                        [1] LI Xiaofeng; DONG Lu; SUN Changyin. Hybrid Q-learning for data-based optimal control of non-linear switching system[J]. Journal of Systems Engineering and Electronics (系统工程与电子技术(英文版)), 2022(05): 1186-1194
                    
                Class A terms:
                HQL
                Class B terms:
                    Hybrid, learning, data, control, linear, switching, system, In, this, paper, investigated, without, knowing, dynamics, First, Hamilton, Jacobi, Bellman, HJB, equation, derived, consideration, hybrid, action, space, Then, novel, algorithm, proposed, find, solution, iterative, manner, addition, theoretical, analysis, provided, illustrate, convergence, optimality, Finally, implemented, actor, critic, AC, structure, parameter, neural, networks, are, utilized, approximate, functions, Simulation, results, validate, effectiveness, driven, method
                AB value:
                    0.616504