Human gesture recognition is an important research field in human-computer interaction due to its potential applications in many domains, but existing methods still face challenges in achieving high accuracy. To address this issue, some existing studies propose fusing global features with cropped features, called focuses, that cover vital body parts such as the hands. However, most methods rely on experience when choosing focuses, and the scheme of focus selection is rarely discussed in detail. In this paper, a hierarchical body part combination method is proposed that accounts for the number of body parts, their combinations, and the logical relationships between them. Multiple focuses are generated with this method, and a chart-based surface modality is employed alongside the red-green-blue (RGB) and optical flow modalities to enhance each focus. A feature-level fusion scheme based on the residual connection structure is proposed to fuse the different modalities at convolution stages, and a focus fusion scheme is proposed to learn the relevancy of focus channels for each gesture class individually. Experiments conducted on the ChaLearn isolated gesture dataset (IsoGD) show that using multiple focuses in conjunction with multi-modal features and these fusion strategies leads to better gesture recognition accuracy.
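To make the two fusion ideas more concrete, the following is a minimal PyTorch sketch. All module names, tensor shapes, channel counts, and the choice of 3-D convolutions are illustrative assumptions, not the paper's released implementation: the residual feature-level fusion adds projected auxiliary-modality features (e.g., optical flow and chart-based surface) onto the RGB stream at a convolution stage, and the focus fusion learns a softmax-normalized relevancy weight for each (class, focus) pair.

```python
import torch
import torch.nn as nn


class ResidualModalityFusion(nn.Module):
    """Feature-level fusion at one convolution stage: auxiliary-modality
    feature maps are projected with 1x1x1 convolutions and added to the
    RGB feature map through residual connections (hypothetical sketch)."""

    def __init__(self, channels: int, num_aux: int = 2):
        super().__init__()
        self.proj = nn.ModuleList(
            [nn.Conv3d(channels, channels, kernel_size=1) for _ in range(num_aux)]
        )

    def forward(self, rgb_feat, aux_feats):
        # rgb_feat and each aux_feats[i]: (N, C, T, H, W) maps from one stage.
        fused = rgb_feat
        for proj, aux in zip(self.proj, aux_feats):
            fused = fused + proj(aux)  # residual addition keeps the RGB path intact
        return fused


class FocusFusion(nn.Module):
    """Focus fusion: one learnable relevancy weight per (class, focus) pair,
    so each gesture class can emphasize different body-part combinations."""

    def __init__(self, num_focuses: int, num_classes: int):
        super().__init__()
        self.relevancy = nn.Parameter(torch.zeros(num_classes, num_focuses))

    def forward(self, focus_logits):
        # focus_logits: (N, F, K) per-focus class scores.
        weights = torch.softmax(self.relevancy, dim=1)  # (K, F), normalized over focuses
        # out[n, k] = sum_f weights[k, f] * focus_logits[n, f, k]
        return torch.einsum('nfk,kf->nk', focus_logits, weights)
```

A usage example under the same assumptions: with 64-channel stage features and five focuses, `ResidualModalityFusion(64)` fuses `(N, 64, T, H, W)` RGB, flow, and surface maps, and `FocusFusion(num_focuses=5, num_classes=249)` (IsoGD defines 249 gesture classes) reduces `(N, 5, 249)` per-focus scores to `(N, 249)` class scores.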