Deep unsupervised embedding for remote sensing image retrieval using textual cues

Abstract

Compared to image-image retrieval, text-image retrieval has received less attention in the remote sensing community, possibly because of the difficulty of tying textual data to the corresponding visual representations. Moreover, a single image may be described by multiple sentences, depending on the perception of the human labeler and the structure of the language they use, which increases the difficulty further. In this paper, we propose an unsupervised method for text-image retrieval in remote sensing imagery. In this method, image representations are obtained via visual Big Transfer (BiT) models, while textual descriptions are encoded via a bidirectional Long Short-Term Memory (Bi-LSTM) network. The proposed retrieval architecture is trained with an unsupervised embedding loss, which aims to make the features of an image closest to those of its corresponding textual description and different from the features of other images, and vice versa. To demonstrate the performance of the proposed architecture, experiments are performed on two datasets, obtaining plausible text/image retrieval outcomes.
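The abstract does not give the exact form of the embedding loss, so the following is only a minimal sketch of the general idea, assuming a symmetric softmax (contrastive) objective over cosine similarities as a stand-in. The function names, the `temperature` parameter, and the use of plain NumPy arrays in place of BiT image features and Bi-LSTM text features are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def bidirectional_embedding_loss(image_feats, text_feats, temperature=0.1):
    """Hypothetical stand-in loss: row i of each input is assumed to be a
    matching image/text pair. Each image is pulled toward its own text and
    pushed away from the other texts, and vice versa."""
    img = l2_normalize(image_feats)
    txt = l2_normalize(text_feats)
    sim = img @ txt.T / temperature          # (N, N) scaled cosine similarities
    idx = np.arange(sim.shape[0])            # matching pairs lie on the diagonal
    loss_i2t = -log_softmax(sim)[idx, idx].mean()    # image -> text direction
    loss_t2i = -log_softmax(sim.T)[idx, idx].mean()  # text -> image direction
    return 0.5 * (loss_i2t + loss_t2i)


def retrieve(query_feat, gallery_feats, k=5):
    """Return indices of the k gallery items most similar to the query."""
    q = l2_normalize(query_feat[None, :])
    g = l2_normalize(gallery_feats)
    scores = (g @ q.T).ravel()
    return np.argsort(-scores)[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))         # random vectors standing in for encoder outputs
    print(bidirectional_embedding_loss(feats, feats))
    print(retrieve(feats[3], feats, k=3))
```

Under these assumptions, the loss is small when each image embedding is closest to its own description and grows when pairs are misaligned. At query time, retrieval reduces to ranking gallery embeddings by cosine similarity to the encoded query, in either the text-to-image or image-to-text direction.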

This article is free to access.

Cite

Al Rahhal, M. M., Bazi, Y., Abdullah, T., Mekhalfi, M. L., & Zuair, M. (2020). Deep unsupervised embedding for remote sensing image retrieval using textual cues. Applied Sciences (Switzerland), 10(24), 1–14. https://doi.org/10.3390/app10248931
