Most recent works leverage the two-stream framework to model spatiotemporal information for video action recognition and achieve remarkable performance. In this paper, we propose a novel convolutional architecture, called Residual Gating Fusion Network (RGFN), that improves on this framework by fully exploiting the spatiotemporal information in residual signals. To further exploit the local details captured by low-level layers, we introduce Multi-Scale Convolution Fusion (MSCF), which performs spatiotemporal fusion at multiple levels. Since RGFN is an end-to-end network, it can be trained on various kinds of video datasets and applied to other video analysis tasks. We evaluate RGFN on two standard benchmarks, UCF101 and HMDB51, and analyze the design of the convolutional network. Experimental results demonstrate the advantages of RGFN, which achieves state-of-the-art performance.
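As a hedged illustration of the gated residual fusion idea named above (the exact RGFN formulation is given in the paper; the function name, gating parameterization, and shapes here are all assumptions for exposition), a sigmoid gate can modulate how much of the temporal residual signal is fused into the spatial stream:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_residual_fusion(spatial, temporal, w_gate):
    """Fuse a temporal residual into the spatial stream via a learned gate.

    spatial, temporal: (C,) feature vectors from the two streams.
    w_gate: (C, 2C) hypothetical gating weights -- an assumption,
    not the paper's exact parameterization.
    """
    concat = np.concatenate([spatial, temporal])  # (2C,)
    gate = sigmoid(w_gate @ concat)               # (C,) values in (0, 1)
    # Residual-style fusion: spatial stream plus gated temporal signal.
    return spatial + gate * temporal

# Tiny usage example with random features.
rng = np.random.default_rng(0)
C = 8
fused = gated_residual_fusion(rng.normal(size=C),
                              rng.normal(size=C),
                              rng.normal(size=(C, 2 * C)))
```

The design choice being sketched is that the gate lets the network learn, per channel, how strongly the temporal residual should contribute, rather than summing the two streams with a fixed weight.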
Citation:
Zhang, J., & Hu, H. (2018). Residual gating fusion network for human action recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10996 LNCS, pp. 79–86). Springer Verlag. https://doi.org/10.1007/978-3-319-97909-0_9