Representative paper
CDCAT: A Multi-Language Cross-Document Entity and Event Coreference Annotation Tool
Abstract:
A tool for the manual annotation of cross-document entity and event coreferences that helps annotators to label mention coreference relations in text is essential for the annotation of coreference corpora. To the best of our knowledge, CROss-document Main Events and entities Recognition (CROMER) is the only open-source manual annotation tool available for cross-document entity and event coreferences. However, CROMER lacks multi-language support and extensibility. Moreover, to label cross-document mention coreference relations, CROMER requires the support of another intra-document coreference annotation tool known as Content Annotation Tool, which is now unavailable. To address these problems, we introduce Cross-Document Coreference Annotation Tool (CDCAT), a new multi-language open-source manual annotation tool for cross-document entity and event coreference, which can handle different input/output formats, preprocessing functions, languages, and annotation systems. Using this new tool, annotators can label a reference relation with only two mouse clicks. Best practice analyses reveal that annotators can reach an annotation speed of 0.025 coreference relations per second on a corpus with a coreference density of 0.076 coreference relations per word. As the first multi-language open-source cross-document entity and event coreference annotation tool, CDCAT can theoretically achieve higher annotation efficiency than CROMER.
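The two figures quoted in the abstract (0.025 coreference relations per second and 0.076 coreference relations per word) can be combined into more intuitive rates. The sketch below is only illustrative arithmetic on those two numbers; the function name `derived_rates` and its dictionary keys are hypothetical and do not come from the paper or from CDCAT itself.

```python
def derived_rates(relations_per_second: float, relations_per_word: float) -> dict:
    """Derive throughput figures from annotation speed and coreference density.

    Both inputs are the values quoted in the abstract; everything returned
    here is simple arithmetic on them, not a result reported by the authors.
    """
    return {
        # Average time an annotator spends per labeled relation
        "seconds_per_relation": 1.0 / relations_per_second,
        # Average number of corpus words per coreference relation
        "words_per_relation": 1.0 / relations_per_word,
        # Corpus words covered per second of annotation
        "words_per_second": relations_per_second / relations_per_word,
    }


# Values taken from the abstract
rates = derived_rates(relations_per_second=0.025, relations_per_word=0.076)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Under these figures, an annotator spends about 40 seconds per relation and covers roughly 0.33 words of corpus text per second, or about 13 words per relation.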
Keywords:
Authors:
Yang Xu; Boming Xia; Yueliang Wan; Fan Zhang; Jiabo Xu; Huansheng Ning
Author affiliations:
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Beijing Engineering Research Center for Cyberspace Data Analysis and Applications, Beijing 100083, China; School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Research Institute with Run Technologies Company, Ltd., Beijing 100192, China; School of Information Engineering, Xinjiang Institute of Engineering, Urumqi 830091, China
Citation format:
[1] Yang Xu; Boming Xia; Yueliang Wan; Fan Zhang; Jiabo Xu; Huansheng Ning. CDCAT: A Multi-Language Cross-Document Entity and Event Coreference Annotation Tool [J]. Tsinghua Science and Technology, 2022(03): 589-598
Class A:
CDCAT,Coreference,coreferences,annotators,coreference,CROss,CROMER
Class B:
Multi,Language,Cross,Document,Entity,Annotation,Tool,tool,manual,annotation,cross,document,entity,event,that,helps,label,mention,relations,text,essential,corpora,best,knowledge,Main,Events,entities,Recognition,only,open,source,However,lacks,multi,support,extensibility,Moreover,requires,another,intra,known,as,Content,which,unavailable,address,these,problems,introduce,new,can,handle,different,input,output,formats,preprocessing,functions,languages,systems,Using,this,two,mouse,clicks,Best,practice,analyses,reveal,reach,speed,per,second,corpus,density,word,first,theoretically,achieve,higher,efficiency,than
AB value:
0.383541
Similar papers
Annotating TSSs in Multiple Cell Types Based on DNA Sequence and RNA-seq Data via DeeReCT-TSS
Juexiao Zhou; Bin Zhang; Haoyang Li; Longxi Zhou; Zhongxiao Li; Yongkang Long; Wenkai Han; Mengran Wang; Huanhuan Cui; Jingjing Li; Wei Chen; Xin Gao
Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia; Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China; Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China; Academy for Advanced Interdisciplinary Studies, Southern University of Science and Technology, Shenzhen 518055, China