Instance-sequence reasoning for video question answering | Rui LIU; Yahong HAN | Tianjin Key Lab of Machine Learning, Tianjin University, Tianjin 300350, China

Featured article

Instance-sequence reasoning for video question answering

Abstract:

Video question answering (Video QA) involves a thorough understanding of video content and question language, as well as the grounding of textual semantics in the visual content of videos. Thus, to answer questions more accurately, not only should each semantic entity be associated with a certain visual instance in the video frames, but the action or event in the question should also be localized to a corresponding temporal slot. This turns out to be a more challenging task that requires the ability to reason over correlations between instances along temporal frames. In this paper, we propose an instance-sequence reasoning network for video question answering with instance grounding and temporal localization. In our model, both visual instances and textual representations are first embedded into graph nodes, which benefits intra- and inter-modality integration. Then, we propose graph causal convolution (GCC) on graph-structured sequences with a large receptive field to capture more causal connections, which is vital for visual grounding and instance-sequence reasoning. Finally, we evaluate our model on the TVQA+ dataset, which contains ground truth for instance grounding and temporal localization, as well as on three other Video QA datasets and three multimodal language processing datasets. Extensive experiments demonstrate the effectiveness and generalization of the proposed method. Specifically, our method outperforms the state-of-the-art methods on these benchmarks.
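The abstract does not give the exact formulation of graph causal convolution, so the following is only a minimal sketch of the general idea under stated assumptions: each frame holds a small graph of instance nodes with scalar features, node features are first mixed within a frame by mean aggregation over neighbours, and a dilated causal filter is then applied along the frame axis so that the output at frame t never depends on later frames. The function names (`row_normalize`, `graph_aggregate`, `graph_causal_conv`, `receptive_field`), the scalar features, and the fixed filter weights are all hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def row_normalize(adj: Matrix) -> Matrix:
    """Add self-loops and scale each row to sum to 1 (mean aggregation)."""
    n = len(adj)
    out = []
    for i in range(n):
        row = [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        total = sum(row)
        out.append([v / total for v in row])
    return out


def graph_aggregate(adj_norm: Matrix, x: List[float]) -> List[float]:
    """Mix each node's feature with its neighbours' features within one frame."""
    return [sum(a * v for a, v in zip(row, x)) for row in adj_norm]


def graph_causal_conv(seq: Matrix, adj: Matrix, weights: List[float],
                      dilation: int = 1) -> Matrix:
    """Assumed form of a graph causal convolution (illustrative only).

    seq[t][i] is the scalar feature of node i at frame t. The output at
    frame t combines frames t, t - d, t - 2d, ... and never a future frame.
    """
    adj_norm = row_normalize(adj)
    mixed = [graph_aggregate(adj_norm, frame) for frame in seq]
    n = len(adj)
    out = []
    for t in range(len(seq)):
        frame_out = [0.0] * n
        for k, w in enumerate(weights):
            src = t - k * dilation
            if src < 0:  # implicit zero padding before the first frame
                continue
            for i in range(n):
                frame_out[i] += w * mixed[src][i]
        out.append(frame_out)
    return out


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Number of frames seen by one output of a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    adj = [[0.0, 1.0], [1.0, 0.0]]            # two connected instance nodes
    seq = [[1.0, 3.0], [2.0, 4.0], [0.0, 0.0]]  # three frames
    print(graph_causal_conv(seq, adj, weights=[1.0, 0.5]))
    print(receptive_field(2, [1, 2, 4]))
```

Stacking such layers with growing dilation is one common way to obtain a large receptive field with few layers; whether the paper uses this exact scheme is not stated in the abstract.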

Keywords:

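Chinese Library Classification (CLC) number:

[1] Automation and computer technology (TP) / Computing and computer technology (TP3) / Computer applications (TP39) / Information processing (TP391)

[2] Culture, science, education, and sports (G) / Sports (G8)

[3] Automation and computer technology (TP) / Computing and computer technology (TP3) / Computer applications (TP39) / Information processing (TP391) / Pattern recognition and devices (TP391.4)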

Authors:

Rui LIU; Yahong HAN

Affiliations:

College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; Tianjin Key Lab of Machine Learning, Tianjin University, Tianjin 300350, China

Source:

Frontiers of Computer Science (计算机科学前沿)

Citation format:

[1] Rui LIU; Yahong HAN. Instance-sequence reasoning for video question answering[J]. Frontiers of Computer Science (计算机科学前沿), 2022(06): 89-97

Class A:

TVQA+dataset,groundtruth

Class B:

Instance, sequence, reasoning, answering, Video, involves, thorough, understanding, content, language, well, grounding, textual, semantic, visual, videos, Thus, questions, more, accurately, not, only, entity, should, associated, certain, frames, but, also, action, event, localized, corresponding, temporal, slot, It, turns, challenging, task, that, requires, ability, conducting, correlations, between, instances, along, this, paper, network, localization, our, model, both, representations, are, firstly, embedded, into, graph, nodes, which, benefits, integration, intra, inter, modality, Then, causal, convolution, GCC, structured, large, receptive, field, capture, connections, vital, Finally, evaluate, contains, three, other, datasets, multimodal, processing, Extensive, experiments, demonstrate, effectiveness, generalization, proposed, Specifically, outperforms, state, art, methods, these, benchmarks

AB value:

0.504363

Similar articles

Vision-based aerial image mosaicking algorithm with object detection

HAN Jun; LI Weixing; FENG Kai; PAN Feng - School of Automation, Beijing Institute of Technology, Beijing 100081, China

Unsupervised change detection of man-made objects using coherent and incoherent features of multi-temporal SAR images

FENG Hao; WU Jianzhong; ZHANG Lu; LIAO Mingsheng - State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; Key Laboratory of Land Subsidence Monitoring and Prevention, Ministry of Land and Resources, Shanghai 200072, China; Shanghai Engineering Research Center of Land Subsidence, Shanghai 200072, China; Shanghai Institute of Geological Survey, Shanghai 200072, China

Label correlation for partial label learning

GE Lingchi; FANG Min; LI Haikun; CHEN Bo - School of Computer Science and Technology, Xidian University, Xi'an 710071, China

Influence fast or later: Two types of influencers in social networks

Fang Zhou; Chang Su; Shuqi Xu; Linyuan Lv - Yangtze Delta Region Institute (Huzhou) & Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Huzhou 313001, China; Beijing Computational Science Research Center, Beijing 100193, China

Yijun Ran; Tianyu Liu; Tao Jia; Xiao-Ke Xu - College of Computer and Information Science, Southwest University, Chongqing 400715, China; College of Information and Communication Engineering, Dalian Minzu University, Dalian 116600, China