Video-based person re-identification (re-id), which aims to match people across videos captured by non-overlapping camera views, has attracted considerable research interest recently. In this paper, we propose a novel hybrid 2D and 3D convolution based recurrent neural network for the video-based person re-id task, which simultaneously exploits local short-term fast-varying motion information and global long-term spatial and temporal information. Specifically, the 3D convolutional module captures the local short-term fast-varying motion information, while the recurrent layer learns the global long-term spatial and temporal information. We evaluate the proposed hybrid neural network on the publicly available PRID 2011, iLIDS-VID and MARS multi-shot pedestrian re-identification datasets, and the experimental results demonstrate the effectiveness of our approach on the task of video-based person re-id.
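The abstract does not give layer configurations, so the following is only a minimal PyTorch sketch of how a hybrid 2D and 3D convolution based recurrent network could be wired together. The class name `Hybrid2D3DRecurrentNet`, all channel counts and kernel sizes, the choice of a GRU as the recurrent layer, and the temporal average pooling at the end are assumptions made for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn


class Hybrid2D3DRecurrentNet(nn.Module):
    """Illustrative sketch; every size below is an assumption, not the paper's."""

    def __init__(self, feat_dim=128, hidden_dim=128):
        super().__init__()
        # 2D convolution: spatial appearance features, applied frame by frame.
        self.conv2d = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # 3D convolution: the kernel spans 3 adjacent frames, so it responds
        # to local short-term motion between neighbouring frames.
        self.conv3d = nn.Sequential(
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(4)
        self.fc = nn.Linear(32 * 4 * 4, feat_dim)
        # Recurrent layer (a GRU here, by assumption): carries information
        # across the whole sequence, i.e. the long-term temporal context.
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)

    def forward(self, clips):
        # clips: (B, T, 3, H, W)
        b, t, c, h, w = clips.shape
        x = self.conv2d(clips.reshape(b * t, c, h, w))  # (B*T, 16, H/2, W/2)
        x = x.reshape(b, t, *x.shape[1:])               # (B, T, 16, H/2, W/2)
        x = x.permute(0, 2, 1, 3, 4)                    # (B, 16, T, H/2, W/2)
        x = self.conv3d(x)                              # (B, 32, T, H/2, W/2)
        x = x.permute(0, 2, 1, 3, 4)                    # (B, T, 32, H/2, W/2)
        x = x.reshape(b * t, *x.shape[2:])              # (B*T, 32, H/2, W/2)
        x = self.pool(x).flatten(1)                     # (B*T, 512)
        x = self.fc(x).reshape(b, t, -1)                # (B, T, feat_dim)
        out, _ = self.rnn(x)                            # (B, T, hidden_dim)
        # Temporal average pooling gives one embedding per video sequence.
        return out.mean(dim=1)                          # (B, hidden_dim)


model = Hybrid2D3DRecurrentNet()
clips = torch.randn(2, 8, 3, 64, 32)  # 2 sequences of 8 RGB frames, 64x32 each
embeddings = model(clips)
```

In a re-id setting, sequence-level embeddings like these would typically be compared with a distance measure (for example Euclidean distance) to rank gallery sequences against a probe; the training loss is not specified in the abstract and is left out of the sketch.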
Cheng, L., Jing, X. Y., Zhu, X., Qi, F., Ma, F., Jia, X., … Wang, C. (2018). A hybrid 2D and 3D convolution based recurrent network for video-based person re-identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11301 LNCS, pp. 439–451). Springer Verlag. https://doi.org/10.1007/978-3-030-04167-0_40