An Improved End-to-End Multi-Target Tracking Method Based on Transformer Self-Attention

Yong Hong; Deren Li; Shupei Luo; Xin Chen; Yi Yang; Mi Wang

Journal Article · Open Access

Remote Sensing (2022) 14(24)

DOI: 10.3390/rs14246354

Abstract

Current multi-target multi-camera tracking algorithms place increasing demands on re-identification accuracy and tracking reliability. This study proposed an improved end-to-end multi-target tracking algorithm that adapts to multi-view, multi-scale scenes and is based on the self-attention mechanism of the transformer's encoder–decoder structure. A multi-dimensional feature extraction backbone network was combined with a self-built raster semantic map, which was stored in the encoder for correlation, to generate target position encodings and multi-dimensional feature vectors. The decoder incorporated four methods: spatial clustering and semantic filtering of multi-view targets; dynamic matching of multi-dimensional features; space–time logic-based multi-target tracking; and space–time convergence network (STCN)-based parameter passing. By fusing these decoding methods, multi-camera targets were tracked along three dimensions: temporal logic, spatial logic, and feature matching. On the MOT17 dataset, the proposed method outperformed the current state-of-the-art method by 2.2% on the multiple object tracking accuracy (MOTA) metric. Furthermore, this study introduced, for the first time, a retrospective mechanism that uses reverse-order processing to correct historically mislabeled targets and thereby improve the identification F1-score (IDF1). On the self-built dataset OVIT-MOT01, the retrospective mechanism improved IDF1 from 0.948 to 0.967 and the multi-camera tracking accuracy (MCTA) from 0.878 to 0.909, a marked gain in continuous tracking accuracy and reliability.
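For readers unfamiliar with the two single-camera metrics named in the abstract, the sketch below computes them from their standard definitions: MOTA = 1 − (FN + FP + IDSW) / GT and IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN). The function names and the example counts are illustrative assumptions, not taken from the paper, and MCTA is not covered here.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Multiple object tracking accuracy.

    All counts are summed over every frame of the sequence;
    num_gt is the total number of ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identification F1-score from identity-level TP, FP and FN counts."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_mota = mota(false_negatives=120, false_positives=80,
                    id_switches=10, num_gt=1000)
example_idf1 = idf1(idtp=900, idfp=100, idfn=100)
print(f"MOTA = {example_mota:.3f}, IDF1 = {example_idf1:.3f}")
```

Because identity switches enter MOTA only as one error term among three, a tracker can score well on MOTA while still swapping identities often; IDF1 measures identity consistency directly, which is why the abstract reports the retrospective mechanism's effect in terms of IDF1.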

Cite

APA

Hong, Y., Li, D., Luo, S., Chen, X., Yang, Y., & Wang, M. (2022). An Improved End-to-End Multi-Target Tracking Method Based on Transformer Self-Attention. Remote Sensing, 14(24). https://doi.org/10.3390/rs14246354
