Skeleton-based hand gesture recognition has achieved great success in recent years. However, most existing methods cannot extract spatiotemporal features well because of noise in the skeleton data. In real applications, some large models also suffer from a huge number of parameters and low execution speed. To address these issues, this paper presents a lightweight skeleton-based hand gesture recognition network that uses multi-input fusion. The static branch consists of two joint-oriented features: the Center Joint Distances (CJD) feature and the Center Joint Angles (CJA) feature. The motion branch consists of the Global Linear Velocities (GLV) feature and the Local Angular Velocities (LAV) feature. Fusing the static and motion branches generates a robust input, which is fed into a lightweight CNN-based network to recognize hand gestures. With only 2.24M parameters, our method achieves 95.8% and 92.5% hand gesture recognition accuracy on the 14-gesture and 28-gesture settings of the SHREC’17 dataset, respectively. Experimental results show that the proposed method outperforms state-of-the-art (SOTA) methods.
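To make the four input features concrete, the following is a minimal sketch of how they might be computed from a skeleton sequence. The abstract does not give the exact definitions, so everything here is an assumption: the skeleton is taken to be a list of frames, each a list of 3D joint coordinates; the center joint (index 0) and the reference joint (index 1) are hypothetical choices; CJD is taken as the Euclidean distance from each joint to the center joint; CJA as the angle between the center-to-reference vector and each center-to-joint vector; and GLV and LAV as frame-to-frame differences of the joint positions and of the CJA angles. All function names are illustrative and are not the authors' implementation.

```python
import math

def center_joint_distances(frame, center=0):
    """Assumed CJD: Euclidean distance from each joint to the center joint."""
    c = frame[center]
    return [math.dist(c, joint) for joint in frame]

def center_joint_angles(frame, center=0, ref=1):
    """Assumed CJA: angle (radians) between center->ref and center->joint."""
    c = frame[center]
    r = [a - b for a, b in zip(frame[ref], c)]
    norm_r = math.hypot(*r)
    angles = []
    for joint in frame:
        v = [a - b for a, b in zip(joint, c)]
        norm_v = math.hypot(*v)
        if norm_v == 0 or norm_r == 0:
            # Angle is undefined for a zero-length vector; use 0.0 by convention.
            angles.append(0.0)
            continue
        cos = sum(a * b for a, b in zip(v, r)) / (norm_v * norm_r)
        # Clamp to [-1, 1] to guard against floating-point drift.
        angles.append(math.acos(max(-1.0, min(1.0, cos))))
    return angles

def temporal_diff(seq):
    """Difference between consecutive per-frame feature vectors."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(seq, seq[1:])]

def build_inputs(skeleton, center=0, ref=1):
    """Return the static (CJD, CJA) and motion (GLV, LAV) feature groups."""
    cjd = [center_joint_distances(f, center) for f in skeleton]
    cja = [center_joint_angles(f, center, ref) for f in skeleton]
    # Flatten each frame to one vector of coordinates for the linear velocity.
    flat = [[x for joint in f for x in joint] for f in skeleton]
    glv = temporal_diff(flat)  # assumed GLV: per-coordinate displacement per frame
    lav = temporal_diff(cja)   # assumed LAV: change of each CJA angle per frame
    return {"static": (cjd, cja), "motion": (glv, lav)}
```

Under these assumptions the motion features have one fewer frame than the static features, so a real pipeline would need to pad or resample one of the two branches before fusing them.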
CITATION STYLE
Hu, Q., Gao, Q., Gao, H., & Ju, Z. (2022). Skeleton-Based Hand Gesture Recognition by Using Multi-input Fusion Lightweight Network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13455 LNAI, pp. 24–34). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-13844-7_3