Representative Literature
Lexicon-Augmented Cross-Domain Chinese Word Segmentation with Graph Convolutional Network
Abstract:
Existing neural approaches have achieved significant progress for Chinese word segmentation (CWS). The performance of these methods tends to drop dramatically in cross-domain scenarios due to the data distribution mismatch across domains and the out-of-vocabulary word problem. To address these two issues, this paper proposes a lexicon-augmented graph convolutional network for cross-domain CWS. The novel model can capture the information of word boundaries from all candidate words and utilize domain lexicons to alleviate the distribution gap across domains. Experimental results on the cross-domain CWS datasets (SIGHAN-2010 and TCM) show that the proposed method successfully models information of domain lexicons for neural CWS approaches and helps to achieve competitive performance for cross-domain CWS. The two problems of cross-domain CWS can be effectively solved through various interactions between characters and candidate words based on graphs. Further, experiments on the CWS benchmarks (Bakeoff-2005) also demonstrate the robustness and efficiency of the proposed method.
Keywords:
CLC number:
Authors:
YU Hao; HUANG Kaiyu; WANG Yu; HUANG Degen
Affiliations:
School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China; School of Foreign Languages, Dalian University of Technology, Dalian 116024, China
Source:
Citation:
[1] YU Hao; HUANG Kaiyu; WANG Yu; HUANG Degen. Lexicon-Augmented Cross-Domain Chinese Word Segmentation with Graph Convolutional Network[J]. Chinese Journal of Electronics, 2022(05): 949-957.