Featured paper
PSLDA: a novel supervised pseudo document-based topic model for short texts
Abstract:
Various kinds of online social media applications, such as Twitter and Weibo, have brought a huge volume of short texts. However, mining semantic topics from short texts efficiently is still a challenging problem because of the sparseness of word-occurrence and the diversity of topics. To address these problems, we propose a novel supervised pseudo-document-based maximum entropy discrimination latent Dirichlet allocation model (PSLDA for short). Specifically, we first assume that short texts are generated from normal-size latent pseudo documents, and that the topic distributions are sampled from the pseudo documents. In this way, the model reduces the sparseness of word-occurrence and the diversity of topics because it implicitly aggregates short texts into longer and higher-level pseudo documents. To make full use of the labeled information in the training data, we introduce labels into the model and further propose a supervised topic model to learn a reasonable distribution of topics. Extensive experiments demonstrate that our proposed method achieves better performance than some state-of-the-art methods.
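The pseudo-document assumption in the abstract can be illustrated with a minimal generative sketch. This is not the authors' PSLDA: it leaves out the labels and the maximum entropy discrimination part entirely and shows only the idea that each short text belongs to one latent pseudo document and draws its topics from that pseudo document's topic distribution. The function name `generate_short_texts`, all parameter names, and the hyperparameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def generate_short_texts(n_texts=6, n_pseudo_docs=2, n_topics=3,
                         vocab_size=10, text_len=5,
                         alpha=0.5, beta=0.5, seed=0):
    """Sample toy short texts under a pseudo-document assumption.

    Illustrative only: hyperparameters are assumed values, and no
    supervision (labels) is modelled here.
    """
    rng = np.random.default_rng(seed)

    # One word distribution per topic, shape (n_topics, vocab_size).
    phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    # One topic distribution per pseudo document, shape (n_pseudo_docs, n_topics).
    theta = rng.dirichlet(np.full(n_topics, alpha), size=n_pseudo_docs)
    # Distribution over pseudo documents, shared by all short texts.
    psi = rng.dirichlet(np.ones(n_pseudo_docs))

    texts, assignments = [], []
    for _ in range(n_texts):
        # Each short text is attached to exactly one latent pseudo document.
        p = int(rng.choice(n_pseudo_docs, p=psi))
        # Topics come from the pseudo document, not from the short text itself.
        topics = rng.choice(n_topics, size=text_len, p=theta[p])
        # Each word id is drawn from the word distribution of its topic.
        words = [int(rng.choice(vocab_size, p=phi[k])) for k in topics]
        texts.append(words)
        assignments.append(p)
    return texts, assignments


if __name__ == "__main__":
    texts, assignments = generate_short_texts()
    for words, p in zip(texts, assignments):
        print(f"pseudo document {p}: word ids {words}")
```

Because several short texts share one pseudo document, their words pool into a single topic distribution, which is the aggregation effect the abstract credits with reducing sparseness.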
Keywords:
Authors:
Mingtao SUN; Xiaowei ZHAO; Jingjing LIN; Jian JING; Deqing WANG; Guozhu JIA
Affiliations:
School of Economics and Management, Beihang University, Beijing 100191, China; School of Computer Science, Beihang University, Beijing 100191, China; School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China
Source:
Citation:
[1] Mingtao SUN; Xiaowei ZHAO; Jingjing LIN; Jian JING; Deqing WANG; Guozhu JIA. PSLDA: a novel supervised pseudo document-based topic model for short texts [J]. Frontiers of Computer Science, 2022(06): 67-76
Class A:
PSLDA, sparseness
Class B:
novel, supervised, pseudo, model, short, texts, Various, kinds, online, social, media, applications, such, Twitter, Weibo, have, brought, huge, volume, However, mining, semantic, topics, from, efficiently, still, challenging, because, word, occurrence, diversity, To, address, above, problems, maximum, entropy, discrimination, latent, Dirichlet, allocation, Specifically, first, assume, that, generated, normal, size, documents, distributions, sampled, In, this, way, will, reduce, implicitly, aggregates, longer, higher, level, make, full, labeled, information, training, data, introduce, labels, into, further, learn, reasonable, Extensive, experiments, demonstrate, our, proposed, achieves, better, performance, compared, some, state, art, methods
AB value:
0.531046