Featured paper
A Comparative Study of CNN- and Transformer-Based Visual Style Transfer
Abstract:
Vision Transformer has shown impressive performance on image classification tasks. Given that most existing visual style transfer (VST) algorithms are based on the texture-biased convolutional neural network (CNN), this raises the question of whether the shape-biased Vision Transformer can perform style transfer as CNN does. In this work, we focus on comparing and analyzing the shape bias between CNN- and transformer-based models from the view of VST tasks. For comprehensive comparisons, we propose three kinds of transformer-based visual style transfer (Tr-VST) methods (Tr-NST for optimization-based VST, Tr-WCT for reconstruction-based VST, and Tr-AdaIN for perceptual-based VST). By engaging three mainstream VST methods in the transformer pipeline, we show that transformer-based models pre-trained on ImageNet are not proper for style transfer methods. Due to the strong shape bias of the transformer-based models, these Tr-VST methods cannot render style patterns. We further analyze the shape bias by considering the influence of the learned parameters and the structure design. Results prove that with proper style supervision, the transformer can learn similar texture-biased features as CNN does. With the reduced shape bias in the transformer encoder, Tr-VST methods can generate higher-quality results compared with state-of-the-art VST methods.
Keywords:
Authors:
Hua-Peng Wei; Ying-Ying Deng; Fan Tang; Xing-Jia Pan; Wei-Ming Dong
Affiliations:
School of Artificial Intelligence, Jilin University, Changchun 130012, China; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; Youtu Laboratory, Tencent Incorporated, Shanghai 200233, China
Citation:
[1] Hua-Peng Wei; Ying-Ying Deng; Fan Tang; Xing-Jia Pan; Wei-Ming Dong. A Comparative Study of CNN- and Transformer-Based Visual Style Transfer [J]. Journal of Computer Science and Technology, 2022(03): 601-614