Representative Literature
Design and implementation of near-memory computing array architecture based on shared buffer
Abstract:
Deep learning algorithms have been widely used in computer vision, natural language processing and other fields. However, due to the ever-increasing scale of deep learning models, the requirements on storage and computing performance keep rising, and processors based on the von Neumann architecture have gradually exposed significant shortcomings such as high power consumption and long latency. To alleviate this problem, large-scale processing systems are shifting from the traditional computing-centric model to a data-centric model. A near-memory computing array architecture based on a shared buffer is proposed in this paper to improve system performance; it supports instructions with the characteristic of store-calculation integration, reducing data movement between the processor and main memory. Through data reuse, the processing speed of the algorithm is further improved. The proposed architecture is verified and tested through a parallel realization of the convolutional neural network (CNN) algorithm. The experimental results show that at a frequency of 110 MHz, the calculation speed of a single convolution operation is increased by 66.64% on average compared with a CNN architecture that performs parallel calculations on a field programmable gate array (FPGA). The processing speed of the whole convolution layer is improved by 8.81% compared with a reconfigurable array processor that does not support near-memory computing.
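The abstract does not give implementation details of the shared-buffer architecture, but the data-reuse idea it relies on can be illustrated with a minimal software sketch of a convolution whose input rows are staged in a small local buffer and reused by every overlapping window. All names, dimensions and the line-buffer scheme below are illustrative assumptions, not the design described in the paper.

/*
 * Minimal sketch of 2-D convolution with row (line-buffer) reuse.
 * Illustrative only: sizes, names and the buffering scheme are
 * assumptions, not the architecture described in the paper.
 */
#include <stdio.h>
#include <string.h>

#define IN_H   8                 /* input feature-map height */
#define IN_W   8                 /* input feature-map width  */
#define K      3                 /* kernel size              */
#define OUT_H  (IN_H - K + 1)
#define OUT_W  (IN_W - K + 1)

/* "Shared buffer" stand-in: holds only the K input rows needed for
 * the current output row, so each input row is fetched from main
 * memory once and reused by every overlapping window. */
static int line_buf[K][IN_W];

static void conv2d_line_buffer(const int in[IN_H][IN_W],
                               const int kernel[K][K],
                               int out[OUT_H][OUT_W])
{
    for (int r = 0; r < OUT_H; r++) {
        /* Stage the K rows covering output row r. A real line buffer
         * would shift rows instead of reloading; a plain copy keeps
         * the sketch short. */
        for (int i = 0; i < K; i++)
            memcpy(line_buf[i], in[r + i], sizeof(line_buf[i]));

        /* Every window on this output row reads from the local buffer
         * only -- this is the data reuse being modeled. */
        for (int c = 0; c < OUT_W; c++) {
            int acc = 0;
            for (int i = 0; i < K; i++)
                for (int j = 0; j < K; j++)
                    acc += line_buf[i][c + j] * kernel[i][j];
            out[r][c] = acc;
        }
    }
}

int main(void)
{
    int in[IN_H][IN_W];
    int out[OUT_H][OUT_W];
    const int kernel[K][K] = { {1, 0, -1}, {1, 0, -1}, {1, 0, -1} };

    for (int r = 0; r < IN_H; r++)
        for (int c = 0; c < IN_W; c++)
            in[r][c] = r + c;    /* arbitrary test pattern */

    conv2d_line_buffer(in, kernel, out);
    printf("out[0][0] = %d\n", out[0][0]);  /* -6 for this pattern */
    return 0;
}

In a near-memory setting, the staging step would correspond to moving data once into a buffer close to the compute array, while the inner loops correspond to the repeated accesses that then stay local; the sketch only models the reuse pattern, not the store-calculation integrated instructions themselves.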
Keywords:
CLC number:
Authors:
SHAN Rui;GAO Xu;FENG Yani;HUI Chao;CUI Xinyue;CHAI Miaomiao
Author affiliations:
School of Electronic Engineering,Xi'an University of Posts and Telecommunications,Xi'an 710121,P.R.China;School of Computer,Xi'an University of Posts and Telecommunications,Xi'an 710121,P.R.China
Source: 高技术通讯(英文版), 2022(04): 345-353
Citation:
[1] SHAN Rui, GAO Xu, FENG Yani, HUI Chao, CUI Xinyue, CHAI Miaomiao. Design and implementation of near-memory computing array architecture based on shared buffer[J]. 高技术通讯(英文版), 2022(04): 345-353.
Category A:
Category B:
Design,implementation,near,memory,computing,array,architecture,shared,buffer,Deep,learning,algorithms,have,been,widely,used,computer,vision,natural,language,other,fields,However,due,increasing,scale,deep,model,quirements,storage,performance,getting,higher,processors,von,Neumann,gradually,exposed,significant,shortcomings,such,sumption,long,latency,In,order,alleviate,this,problem,large,processing,systems,shifting,from,traditional,centric,data,proposed,paper,which,supports,instructions,characteristics,store,calculation,integration,reduc,movement,between,main,Through,reuse,speed,further,improved,verified,tested,through,parallel,realization,convolutional,neural,network,experimental,sults,show,that,frequency,MHz,single,operation,increased,by,average,compared,performs,culations,programmable,gate,FPGA,whole,layer,reconfigurable,does,not
AB value:
0.527993