Video Paragraph Captioning as a Text Summarization Task

Abstract

Video paragraph captioning aims to generate a set of coherent sentences that describe a video containing several events. Most previous methods simplify this task by using ground-truth event segments. In this work, we propose a novel framework that treats the task as text summarization. We first generate many sentence-level captions focusing on different video clips and then summarize these captions to obtain the final paragraph caption. Our method does not depend on ground-truth event segments. Experiments on two popular datasets, ActivityNet Captions and YouCookII, demonstrate the advantages of our new framework. On the ActivityNet dataset, our method even outperforms some previous methods that use ground-truth event segment labels.
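The caption-then-summarize idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not specify the summarizer, so the code below swaps in a simple graph-centrality (LexRank-style) extractive step as an assumption. The function names `cosine`, `centrality`, and `summarize`, the bag-of-words similarity, and the `(start_time, caption)` input format are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def centrality(captions, damping=0.85, iters=50):
    """PageRank-style centrality over the caption similarity graph (assumed scoring)."""
    bags = [Counter(c.lower().split()) for c in captions]
    n = len(bags)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    # Row-normalise so each row sums to 1; a caption similar to nothing jumps uniformly.
    for i, row in enumerate(sim):
        total = sum(row)
        sim[i] = [v / total for v in row] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n + damping * sum(scores[j] * sim[j][i] for j in range(n))
            for i in range(n)
        ]
    return scores


def summarize(candidates, k=2):
    """candidates: list of (start_time_seconds, caption) pairs, one per video clip.

    Keeps the k most central captions and emits them in temporal order.
    """
    scores = centrality([caption for _, caption in candidates])
    top = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)[:k]
    return " ".join(candidates[i][1] for i in sorted(top, key=lambda i: candidates[i][0]))
```

Calling `summarize(candidates, k=2)` on captions produced for overlapping clips keeps the captions that agree most with the rest and drops outliers, without needing event segment boundaries as input.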

Citation (APA)

Liu, H., & Wan, X. (2021). Video Paragraph Captioning as a Text Summarization Task. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (Vol. 2, pp. 55–60). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.acl-short.9
