


Synthesis, Style Editing, and Animation of 3D Cartoon Face

Ming Guo1, Feng Xu1( ), Shunfei Wang2, Zhibo Wang1, Ming Lu3, Xiufen Cui2, Xiao Ling2
1 School of Software and BNRist, Tsinghua University, Beijing 100084, China
2 Department of Multimedia and Smart, Guangdong OPPO Mobile Telecommunications Corp. Ltd., Dongguan 523860, China
3 Intel Labs, Intel, Beijing 100089, China

Abstract

As a popular kind of stylized face, cartoon faces have rich application scenarios. Creating personalized 3D cartoon faces directly from 2D real photos is challenging. Moreover, to support more application scenarios, automatic style editing and animation of cartoon faces are crucial problems that the industry urgently needs solved but for which no complete solution yet exists. To address these problems, we first propose the "3D face cartoonizer", which generates high-quality 3D cartoon faces with texture from 2D facial images. We contribute the first hybrid 3D cartoon face dataset and a new training strategy that first pretrains our network with low-quality triplets in a reconstruction-then-generation manner, and then finetunes it with high-quality triplets in an adversarial manner, fully leveraging the hybrid dataset. In addition, we implement style editing for 3D cartoon faces based on k-means, which is easily achieved without retraining the neural network. Finally, we propose a new blendshape generation method for cartoon faces and, based on it, realize expression animation of 3D cartoon faces, enabling more practical applications. Our dataset and code will be released for future research.
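The k-means-based style editing mentioned above can be illustrated with a minimal sketch. Everything below beyond plain k-means — clustering mesh vertices by position into editable regions, the helper name `kmeans`, and the per-region scaling rule — is an illustrative assumption, not the authors' exact procedure.

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means clustering: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct input points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Hypothetical use: cluster cartoon-mesh vertices into k editable regions,
# then scale one region about its centroid to exaggerate a facial feature —
# no network retraining involved.
verts = np.random.default_rng(1).normal(size=(500, 3))  # stand-in mesh vertices
labels, centroids = kmeans(verts, k=8)
region = labels == 0
verts[region] = centroids[0] + 1.2 * (verts[region] - centroids[0])
```

Because the edit is a purely geometric operation on cluster regions, it can be applied to any generated cartoon face after the fact, which matches the paper's claim that style editing needs no retraining.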

Keywords: cartoon, computer graphics, animation, 3D face
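Expression animation with blendshapes, as referenced in the abstract, conventionally uses the delta blendshape formulation: a posed face is the neutral mesh plus a weighted sum of per-expression offsets. The sketch below shows that standard formulation only; the mesh sizes, names, and toy bases are illustrative assumptions, not the paper's generated blendshapes.

```python
import numpy as np

def animate(neutral, blendshapes, weights):
    """Delta blendshape model: neutral + sum_i w_i * (B_i - neutral).

    neutral:     (V, 3) neutral-expression cartoon face vertices
    blendshapes: (N, V, 3) one mesh per expression basis (e.g., jaw open, smile)
    weights:     (N,) per-expression activations, typically in [0, 1]
    """
    deltas = blendshapes - neutral[None]          # (N, V, 3) offsets from neutral
    return neutral + np.tensordot(weights, deltas, axes=1)

# Toy example: 4 vertices, 2 expression bases that each shift the whole mesh.
neutral = np.zeros((4, 3))
shapes = np.stack([neutral + [1.0, 0.0, 0.0], neutral + [0.0, 1.0, 0.0]])
posed = animate(neutral, shapes, np.array([0.5, 0.25]))
# Each vertex ends up at 0.5*(1,0,0) + 0.25*(0,1,0) = (0.5, 0.25, 0).
```

Driving `weights` over time (e.g., from a face tracker) yields the expression animation; generating the `blendshapes` array for an arbitrary cartoon identity is the problem the paper's blendshape generation method addresses.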


Publication history

Received: 19 February 2023
Revised: 03 April 2023
Accepted: 04 April 2023
Published: 22 September 2023
Issue date: April 2024

Copyright

© The author(s) 2024.

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2018YFA0704000), the Beijing Natural Science Foundation (No. M22024), the National Natural Science Foundation of China (No. 62021002), and the Key Research and Development Project of Tibet Autonomous Region (No. XZ202101ZY0019G). This work was also supported by the Institute for Brain and Cognitive Sciences, BNRist, Tsinghua University, BLBCI, and Beijing Municipal Education Commission.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
