Cross-modal hashing has received considerable attention owing to its low storage cost and high retrieval efficiency. However, existing cross-modal retrieval approaches often fail to align semantic information effectively because of the information asymmetry between the image and text modalities. To address this issue, we propose the Heterogeneous Interactive Learning Network (HILN), an unsupervised cross-modal retrieval method that alleviates the heterogeneous semantic gap. Specifically, we introduce a multi-head self-attention mechanism to capture the global dependencies of semantic features within each modality. Moreover, since the semantic relations among object entities are consistent across modalities, we perform heterogeneous feature fusion through a heterogeneous feature interaction module, whose cross-attention learns the interactions between features of different modalities. Finally, to further maintain semantic consistency, we introduce an adversarial loss into network learning to generate more robust hash codes. Extensive experiments demonstrate that the proposed HILN improves the accuracy of the T→I and I→T cross-modal retrieval tasks by 7.6% and 5.5%, respectively, over the best competitor, DGCPN, on the NUS-WIDE dataset. Code is available at https://github.com/Z000204/HILN.
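The cross-attention interaction between modalities described in the abstract can be sketched as follows. This is an illustrative PyTorch sketch, not the authors' implementation: the class name, feature dimensions, and head count are all assumptions, and it only shows the general pattern of one modality's features attending to another's.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Cross-attention sketch: queries come from one modality,
    keys/values from the other, so each query position aggregates
    semantically related content from the opposite modality."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=heads,
                                          batch_first=True)

    def forward(self, query_feats: torch.Tensor,
                context_feats: torch.Tensor) -> torch.Tensor:
        # query_feats: (B, Lq, dim) from modality A
        # context_feats: (B, Lk, dim) from modality B
        fused, _ = self.attn(query_feats, context_feats, context_feats)
        return fused  # (B, Lq, dim): modality-A features enriched by B

# Hypothetical usage: fuse image-region and text-token features.
img = torch.randn(4, 49, 512)   # e.g. a flattened 7x7 CNN feature map
txt = torch.randn(4, 20, 512)   # e.g. token embeddings
img_enriched = CrossModalAttention()(img, txt)   # image attends to text
txt_enriched = CrossModalAttention()(txt, img)   # text attends to image
```

In such designs the module is typically applied symmetrically (as above) so both modalities are enriched before the fused features are quantized into hash codes.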
Zheng, Y., & Zhang, X. (2023). Heterogeneous Interactive Learning Network for Unsupervised Cross-Modal Retrieval. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13844 LNCS, pp. 692–707). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-26316-3_41