


Synthesis, Style Editing, and Animation of 3D Cartoon Face

Ming Guo1, Feng Xu1( ), Shunfei Wang2, Zhibo Wang1, Ming Lu3, Xiufen Cui2, Xiao Ling2
1 School of Software and BNRist, Tsinghua University, Beijing 100084, China
2 Department of Multimedia and Smart, Guangdong OPPO Mobile Telecommunications Corp. Ltd., Dongguan 523860, China
3 Intel Labs, Intel, Beijing 100089, China

Abstract

As a popular kind of stylized face, cartoon faces have rich application scenarios. Creating personalized 3D cartoon faces directly from 2D real photos is challenging. Moreover, to support more application scenarios, automatic style editing and animation of cartoon faces are crucial problems that the industry urgently needs solved but for which no complete solution yet exists. To address these problems, we first propose the "3D face cartoonizer", which generates high-quality 3D cartoon faces with texture from 2D facial images. We contribute the first hybrid 3D cartoon face dataset and a new training strategy that first pretrains our network with low-quality triplets in a reconstruction-then-generation manner, and then finetunes it with high-quality triplets in an adversarial manner, fully leveraging the hybrid dataset. In addition, we implement style editing for 3D cartoon faces based on k-means, which is easily achieved without retraining the neural network. Finally, we propose a new blendshape generation method for cartoon faces and, based on it, realize expression animation of 3D cartoon faces, enabling more practical applications. Our dataset and code will be released for future research.
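The k-means-based style editing mentioned above can be illustrated with a minimal sketch. Everything below beyond plain k-means — clustering mesh vertices by position into editable regions, the helper name `kmeans`, and the per-region scaling rule — is an illustrative assumption, not the authors' exact procedure.

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means clustering: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct input points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Hypothetical use: cluster cartoon-mesh vertices into k editable regions,
# then scale one region about its centroid to exaggerate a facial feature —
# no network retraining involved.
verts = np.random.default_rng(1).normal(size=(500, 3))  # stand-in mesh vertices
labels, centroids = kmeans(verts, k=8)
region = labels == 0
verts[region] = centroids[0] + 1.2 * (verts[region] - centroids[0])
```

Because the edit is a purely geometric operation on cluster regions, it can be applied to any generated cartoon face after the fact, which matches the paper's claim that style editing needs no retraining.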

Keywords: cartoon, computer graphics, animation, 3D face
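Expression animation with blendshapes, as referenced in the abstract, conventionally uses the delta blendshape formulation: a posed face is the neutral mesh plus a weighted sum of per-expression offsets. The sketch below shows that standard formulation only; the mesh sizes, names, and toy bases are illustrative assumptions, not the paper's generated blendshapes.

```python
import numpy as np

def animate(neutral, blendshapes, weights):
    """Delta blendshape model: neutral + sum_i w_i * (B_i - neutral).

    neutral:     (V, 3) neutral-expression cartoon face vertices
    blendshapes: (N, V, 3) one mesh per expression basis (e.g., jaw open, smile)
    weights:     (N,) per-expression activations, typically in [0, 1]
    """
    deltas = blendshapes - neutral[None]          # (N, V, 3) offsets from neutral
    return neutral + np.tensordot(weights, deltas, axes=1)

# Toy example: 4 vertices, 2 expression bases that each shift the whole mesh.
neutral = np.zeros((4, 3))
shapes = np.stack([neutral + [1.0, 0.0, 0.0], neutral + [0.0, 1.0, 0.0]])
posed = animate(neutral, shapes, np.array([0.5, 0.25]))
# Each vertex ends up at 0.5*(1,0,0) + 0.25*(0,1,0) = (0.5, 0.25, 0).
```

Driving `weights` over time (e.g., from a face tracker) yields the expression animation; generating the `blendshapes` array for an arbitrary cartoon identity is the problem the paper's blendshape generation method addresses.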


Publication history

Received: 19 February 2023
Revised: 03 April 2023
Accepted: 04 April 2023
Published: 22 September 2023
Issue date: April 2024

Copyright

© The author(s) 2024.

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2018YFA0704000), the Beijing Natural Science Foundation (No. M22024), the National Natural Science Foundation of China (No. 62021002), and the Key Research and Development Project of Tibet Autonomous Region (No. XZ202101ZY0019G). This work was also supported by the Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, BLBCI, and Beijing Municipal Education Commission.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
