Journal Home > Volume 4, Issue 3




A pruning-then-quantization model compression framework for facial emotion recognition

Han Sun1, Wei Shao2 (corresponding author), Tao Li3, Jiayu Zhao3, Weitao Xu2, Linqi Song2
1 Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
2 Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China, and also with City University of Hong Kong Shenzhen Research Institute, Shenzhen 518057, China
3 Shenzhen Konka Electronic Technology Co., Ltd., Shenzhen 518053, China

Han Sun and Wei Shao contributed equally to this paper.

Abstract

Facial emotion recognition has achieved great success with the help of large neural models, but the large size of these models prevents their deployment in practical situations. To bridge this gap, we combine two mainstream model compression methods, pruning and quantization, and propose a pruning-then-quantization framework to compress neural models for facial emotion recognition tasks. Experiments on three datasets show that our model achieves a high compression ratio while maintaining high performance. In addition, we analyze the layer-wise compression performance of the proposed framework to explore its effect and adaptability in fine-grained modules.

Keywords: model compression, facial emotion recognition, ResNet
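To make the pruning-then-quantization idea concrete, the following is a minimal, generic sketch of such a pipeline on a single weight tensor: magnitude pruning first, then symmetric uniform 8-bit quantization. The function names, the sparsity level, and the bit width here are illustrative assumptions, not the paper's actual method or hyperparameters.

```python
def prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)
    # Threshold at the k-th smallest magnitude; prune everything at or below it.
    threshold = sorted(abs(w) for w in weights)[k - 1] if k > 0 else float("-inf")
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize(weights, bits=8):
    """Symmetric uniform quantization to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / qmax or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]
pruned = prune(weights, sparsity=0.5)   # half the weights become exact zeros
q, scale = quantize(pruned)             # 8-bit integers plus one float scale
dequant = [qi * scale for qi in q]      # approximate reconstruction
```

Ordering matters in this sketch: pruning first means the quantizer's scale is fit only to the surviving weights, and the pruned zeros map exactly to the integer 0, so sparsity is preserved through quantization.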


Publication history

Received: 23 February 2023
Revised: 30 May 2023
Accepted: 30 June 2023
Published: 30 September 2023
Issue date: September 2023

Copyright

© All articles included in the journal are copyrighted to the ITU and TUP.

Acknowledgements

This work was supported in part by the Technological Breakthrough Project of Science, Technology and Innovation Commission of Shenzhen Municipality (No. JSGG20201102162000001), InnoHK Initiative of Hong Kong SAR Government, and the Laboratory for AI-Powered Financial Technologies Ltd.

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/
