Design of high parallel CNN accelerator based on FPGA for AIoT | Lin Zhijian; Gao Xuewei; Chen Xiaopei; Zhu Zhipeng; Du Xiaoyong; Chen Pingping | College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China

Featured article

Design of high parallel CNN accelerator based on FPGA for AIoT

Abstract:

To address the difficulty of deploying convolutional neural networks (CNNs) on field-programmable gate arrays (FPGAs), which stems from their computational complexity, a high-performance CNN hardware accelerator was designed in the Verilog hardware description language. It uses a pipelined architecture with three dimensions of parallelism: input channels, output channels, and convolution kernels. First, two multiply-and-accumulate (MAC) operations were packed into one digital signal processing (DSP) block of the FPGA to double the computation rate of the accelerator. Second, feature map block partitioning and a special memory arrangement were proposed to reduce the total amount of off-chip memory access and relieve the pressure on FPGA bandwidth. Finally, an efficient computational array combining a multiplicative-additive tree with the Winograd fast convolution algorithm was designed to balance hardware resource consumption against computational performance. The high parallel CNN accelerator was deployed on the ZU3EG from Alinx, with the YOLOv3-tiny algorithm as the test object, and achieved an average computing performance of 127.5 giga operations per second (GOPS). The experimental results show that the hardware architecture effectively improves CNN computing power and outperforms other existing schemes in power consumption and in the efficiency of DSPs and block random access memory (BRAMs).
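Two of the arithmetic ideas named in the abstract can be illustrated in software. The sketch below is only a minimal Python model under stated assumptions, not the authors' Verilog design. `packed_dual_multiply` assumes that two signed 8-bit operands share a third operand and are separated by an 18-bit shift inside one wide multiplier; the abstract does not give the operand widths or the packing layout. `winograd_f23` assumes the one-dimensional F(2,3) Winograd tile, which the abstract does not specify either. All function names are hypothetical.

```python
from fractions import Fraction


def packed_dual_multiply(a, b, c, shift=18):
    """Compute a*b and a*c with a single wide multiplication.

    Assumption: a, b, c are signed 8-bit values, so a*c fits in the low
    `shift` bits (as a signed number) and never overlaps a*b.
    """
    packed = (b << shift) + c           # pack b into the high field, c into the low field
    product = a * packed                # the one multiplication: a*b*2^shift + a*c
    low = product & ((1 << shift) - 1)  # low field, read as unsigned
    if low >= 1 << (shift - 1):         # sign-extend the low field to recover a*c
        low -= 1 << shift
    high = (product - low) >> shift     # remove a*c, then shift down to recover a*b
    return high, low


def direct_conv_1d(d, g):
    """Reference: two outputs of a 3-tap filter over four inputs (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs with 4 multiplications."""
    g0, g1, g2 = (Fraction(x) for x in g)
    # Filter transform; it can be precomputed once per kernel.
    u = [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]
    # Input transform; additions and subtractions only.
    v = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Element-wise products: the only multiplications on the data path.
    m = [ui * vi for ui, vi in zip(u, v)]
    # Output transform.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


if __name__ == "__main__":
    print(packed_dual_multiply(-128, -128, 127))
    print(winograd_f23([1, 2, 3, 4], [1, 0, -1]) == direct_conv_1d([1, 2, 3, 4], [1, 0, -1]))
```

In this model, F(2,3) trades six multiplications for four at the cost of extra additions, which is the kind of resource-versus-performance balance the abstract describes. The packing model shows why one wide multiplier can serve two narrow products when they share an operand.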

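Keywords:

Chinese Library Classification (CLC) codes:

[1] Automation technology, computer technology (TP) / Computing technology, computer technology (TP3) / Computer applications (TP39) / Information processing (TP391)

[2] Automation technology, computer technology (TP) / Basic theory of automation (TP1) / Artificial intelligence theory (TP18) / Artificial neural networks and computing (TP183)

[3] Medicine and health (R) / Pharmacy (R9) / Pharmacology (R96) / Experimental pharmacology (R965)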

Authors:

Lin Zhijian; Gao Xuewei; Chen Xiaopei; Zhu Zhipeng; Du Xiaoyong; Chen Pingping

Author affiliations:

School of Advanced Manufacturing, Fuzhou University, Quanzhou 362251, China; College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China

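Source:

The Journal of China Universities of Posts and Telecommunications (中国邮电高校学报（英文版）)

Citation format:

[1] Lin Zhijian; Gao Xuewei; Chen Xiaopei; Zhu Zhipeng; Du Xiaoyong; Chen Pingping. Design of high parallel CNN accelerator based on FPGA for AIoT[J]. The Journal of China Universities of Posts and Telecommunications, 2022(05): 1-9, 61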

Class A:

ZU3EG,Alinx,giga,DSPs,BRAMs

Class B:

Design,high,parallel,accelerator,FPGA,AIoT,To,tackle,challenge,applying,convolutional,neural,network,field,programmable,gate,array,due,its,computational,complexity,performance,hardware,Verilog,description,language,was,designed,which,utilizes,pipeline,architecture,three,dimensions,including,input,channels,output,kernels,Firstly,multiply,accumulate,MAC,operations,were,packed,into,one,digital,signal,processing,block,double,Secondly,strategies,feature,map,partitioning,special,memory,arrangement,proposed,optimize,total,amount,off,chip,access,reduce,pressure,bandwidth,Finally,efficient,combining,multiplicative,additive,tree,Winograd,fast,algorithm,balance,resource,consumption,deployed,using,YOLOv3,tiny,test,object,average,computing,second,GOPS,experimental,results,show,that,effectively,improves,power,provides,better,compared,other,existing,schemes,terms,efficiency,random

AB value:

0.572816

Similar articles

An ε-domination based two-archive 2 algorithm for many-objective optimization

WU Tianwei;AN Siguang;HAN Jianqiang;SHENTU Nanying-College of Mechanical and Electrical Engineering,China Jiliang University,Hangzhou 310018,China;Key Laboratory of Intelligent Manufacturing Quality Big Data Tracing and Analysis of Zhejiang Province,China Jiliang University,Hangzhou 310018,China

Predictive cruise control for heavy trucks based on slope information under cloud control system

LI Shuyan;WAN Keke;GAO Bolin;LI Rui;WANG Yue;LI Keqiang-College of Engineering,China Agricultural University,Beijing 100083,China;School of Vehicle and Mobility,Tsinghua University,Beijing 100084,China

FPGA implementation and image encryption application of a new PRNG based on a memristive Hopfield neural network with a special activation gradient

Fei Yu;Zinan Zhang;Hui Shen;Yuanyuan Huang;Shuo Cai;Sichun Du-School of Computer and Communication Engineering,Changsha University of Science and Technology,Changsha 410114,China;College of Computer Science and Electronic Engineering,Hunan University,Changsha 410082,China

Exponential sine chaotification model for enhancing chaos and its hardware implementation

Rui Wang;Meng-Yang Li;Hai-Jun Luo-College of Physics and Electronic Engineering,Chongqing Normal University,Chongqing 401331,China;The University of Chicago,Chicago 60637,United States of America;National Center for Applied Mathematics in Chongqing,Chongqing 401331,China

An energy-efficient reconfigurable asymmetric modular cryptographic operation unit for RSA and ECC