Multimodal Named Entity Recognition via Co-attention-Based Method with Dynamic Visual Concept Expansion

Abstract

Multimodal named entity recognition (MNER), which recognizes named entities in text with the help of images, has become a popular topic in recent years. Previous MNER studies use visual features or concepts detected in a given image directly, without considering the implicit knowledge that links visual concepts. Taking into account concepts that are not detected but are relevant to those in the image provides rich prior knowledge, an approach that has proven effective on other multimodal tasks. This paper proposes a novel method that exploits this external implicit knowledge, called the Co-attention-based model with Dynamic Visual Concept Expansion (CDVCE). In CDVCE, we adopt the concept co-occurrence matrix of a large-scale annotated image database as implicit knowledge among visual concepts, and dynamically expand the detected visual concepts conditioned on that matrix and the input text. Experiments on two public MNER datasets demonstrate the effectiveness of the proposed method, which outperforms other state-of-the-art methods in most cases.
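The abstract does not say how the expansion is computed, so the following is only a minimal sketch of the general idea, not the CDVCE model itself. It counts concept co-occurrences over a toy set of image annotations, then adds undetected concepts that co-occur with the detected ones, with a boost when a candidate also appears in the input text. The function names, the `top_k` parameter, the toy data, and the simple token-match boost (used here in place of the paper's co-attention) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(annotations):
    """Count how often each ordered pair of concepts shares an image."""
    counts = Counter()
    for concepts in annotations:
        for a, b in combinations(sorted(set(concepts)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def expand_concepts(detected, cooccurrence, text_tokens, vocabulary, top_k=2):
    """Append up to top_k undetected concepts to the detected ones.

    Candidates are scored by their summed co-occurrence with the detected
    concepts. The text conditioning is a hypothetical stand-in: the score
    is doubled when the candidate matches a token of the input text.
    """
    text = {token.lower() for token in text_tokens}
    scores = {}
    for candidate in vocabulary:
        if candidate in detected:
            continue
        co = sum(cooccurrence[(d, candidate)] for d in detected)
        if co == 0:
            continue  # never seen together with any detected concept
        scores[candidate] = co * (2.0 if candidate in text else 1.0)
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return list(detected) + ranked[:top_k]


# Toy annotations standing in for a large annotated image database.
annotations = [
    ["dog", "leash", "park"],
    ["dog", "leash"],
    ["dog", "ball", "park"],
    ["cat", "sofa"],
]
cooccurrence = build_cooccurrence(annotations)
vocabulary = sorted({c for image in annotations for c in image})
expanded = expand_concepts(
    ["dog"], cooccurrence, "My dog loves the ball".split(), vocabulary
)
print(expanded)
```

In this toy setup "cat" and "sofa" are never proposed for an image where only "dog" was detected, because they never co-occur with it. A learned model would replace both the raw counts and the token-match boost with trained attention weights.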

Cite

Zhao, X., & Tang, B. (2021). Multimodal Named Entity Recognition via Co-attention-Based Method with Dynamic Visual Concept Expansion. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13108 LNCS, pp. 476–487). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-92185-9_39
