Featured literature
Efficient decoding self-attention for end-to-end speech synthesis
Abstract:
Self-attention has been innovatively applied to text-to-speech (TTS) because of its parallel structure and superior strength in modeling sequential data. However, when used in end-to-end speech synthesis with an autoregressive decoding scheme, its inference speed becomes relatively low due to the quadratic complexity in sequence length. This problem becomes particularly severe on devices without graphics processing units (GPUs). To alleviate the dilemma, we propose an efficient decoding self-attention (EDSA) module as an alternative. Combined with a dynamic programming decoding procedure, TTS model inference can be effectively accelerated to have a linear computation complexity. We conduct studies on Mandarin and English datasets and find that our proposed model with EDSA can achieve 720% and 50% higher inference speed on the central processing unit (CPU) and GPU respectively, with almost the same performance. Thus, this method may make the deployment of such models easier when there are limited GPU resources. In addition, our model may perform better than the baseline Transformer TTS on out-of-domain utterances.
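The abstract's central point is a complexity contrast: autoregressive self-attention decoding attends over all previously generated frames, so even with key/value caching the total cost grows quadratically with sequence length, whereas a decoder that carries a fixed-size running state does linear work overall. The sketch below is a generic NumPy illustration of that contrast only; it is not the paper's EDSA module or its dynamic-programming decoding procedure, and the function names (cached_self_attention_decode, linear_state_decode) and the feature map are assumptions made for illustration.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cached_self_attention_decode(x, Wq, Wk, Wv):
    """Standard autoregressive decoding with a key/value cache.
    Step t still attends over all t earlier positions, so the total
    decoding cost is O(T^2 * d) in sequence length T."""
    T, d = x.shape
    keys, values, outputs = [], [], []
    for t in range(T):
        q = x[t] @ Wq
        keys.append(x[t] @ Wk)
        values.append(x[t] @ Wv)
        K = np.stack(keys)                    # grows with t -> per-step cost O(t*d)
        V = np.stack(values)
        w = softmax(q @ K.T / np.sqrt(d))
        outputs.append(w @ V)
    return np.stack(outputs)

def linear_state_decode(x, Wq, Wk, Wv):
    """Fixed-size running state in the spirit of linear (kernelized) attention:
    per-step cost is O(d^2) regardless of t, so total work is O(T * d^2),
    i.e., linear in T. This is only a complexity illustration, not EDSA."""
    T, d = x.shape
    phi = lambda z: np.maximum(z, 0.0) + 1e-6  # simple positive feature map (assumed)
    S = np.zeros((d, d))                       # running sum of phi(k) v^T
    z = np.zeros(d)                            # running normalizer sum of phi(k)
    outputs = []
    for t in range(T):
        q, k, v = phi(x[t] @ Wq), phi(x[t] @ Wk), x[t] @ Wv
        S += np.outer(k, v)
        z += k
        outputs.append((q @ S) / (q @ z))
    return np.stack(outputs)

# Minimal usage: same inputs, quadratic vs. linear decoding cost in T.
rng = np.random.default_rng(0)
T, d = 16, 8
x = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
y_quadratic = cached_self_attention_decode(x, Wq, Wk, Wv)
y_linear = linear_state_decode(x, Wq, Wk, Wv)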
Keywords:
CLC classification number:
Authors:
Wei ZHAO; Li XU
Author affiliations:
College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China; Institute of Robotics, Zhejiang University, Yuyao 315400, China
Source:
Citation format:
[1] Wei ZHAO, Li XU. Efficient decoding self-attention for end-to-end speech synthesis [J]. 信息与电子工程前沿(英文), 2022(07): 1127-1138.