Volume 9, Issue 2

Sphere Face Model: A 3D morphable model with hypersphere manifold latent space using joint 2D/3D training

Diqiong Jiang¹, Yiwei Jin¹, Fang-Lue Zhang², Zhe Zhu³, Yun Zhang⁴, Ruofeng Tong¹, Min Tang¹
¹ Zhejiang University, Hangzhou 310058, China
² Victoria University of Wellington, Wellington 6012, New Zealand
³ Duke University, Durham, North Carolina 27708, USA
⁴ Communication University of Zhejiang, Hangzhou 310019, China

Abstract

3D morphable models (3DMMs) are generative models for face shape and appearance. Recent works impose face recognition constraints on 3DMM shape parameters so that the face shapes of the same person remain consistent. However, the shape parameters of traditional 3DMMs follow a multivariate Gaussian distribution, whereas identity embeddings follow a hyperspherical distribution. This conflict makes it challenging for face reconstruction models to preserve faithfulness and shape consistency simultaneously. In other words, the recognition loss and the reconstruction loss cannot decrease jointly because of their conflicting distributions. To address this issue, we propose the Sphere Face Model (SFM), a novel 3DMM for monocular face reconstruction that preserves both shape fidelity and identity consistency. The core of our SFM is the basis matrix, which can be used to reconstruct 3D face shapes. The basis matrix is learned with a two-stage training approach in which 3D and 2D training data are used in the first and second stages, respectively. We design a novel loss to resolve the distribution mismatch, enforcing that the shape parameters follow a hyperspherical distribution. Our model accepts both 2D and 3D data for constructing the sphere face models. Extensive experiments show that SFM has high representation ability and clustering performance in its shape parameter space. Moreover, it consistently produces high-fidelity face shapes under challenging conditions in monocular face reconstruction. The code will be released at https://github.com/a686432/SIR

Keywords: deep learning, facial modeling, face reconstruction, 3D morphable model (3DMM)
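
As a rough illustration of the idea summarized in the abstract, the sketch below assumes a linear 3DMM in which a face shape is decoded as a mean shape plus a basis matrix times a shape parameter vector, and in which the shape parameters are projected onto a unit hypersphere so that identities can be compared by cosine similarity. It is not the authors' implementation: the function names, the toy dimensions, the random basis, and the plain L2 normalization are all assumptions made for illustration, and the paper's actual loss and training stages are not reproduced here.

```python
import numpy as np


def normalize_to_sphere(alpha, radius=1.0, eps=1e-12):
    """Project shape parameters onto a hypersphere of the given radius (assumed L2 projection)."""
    norm = np.linalg.norm(alpha, axis=-1, keepdims=True)
    return radius * alpha / np.maximum(norm, eps)


def reconstruct_shape(mean_shape, basis, alpha):
    """Linear 3DMM decoding: flattened vertices = mean + basis @ alpha, reshaped to (V, 3)."""
    return (mean_shape + basis @ alpha).reshape(-1, 3)


def cosine_similarity(a, b):
    """Cosine of the angle between two parameter vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy, hypothetical sizes; a real model has thousands of vertices.
rng = np.random.default_rng(0)
num_vertices, num_params = 5, 4
mean_shape = rng.normal(size=3 * num_vertices)
# Random stand-in for a learned basis matrix.
basis = rng.normal(size=(3 * num_vertices, num_params))

# Two parameter vectors constrained to the unit hypersphere.
alpha_a = normalize_to_sphere(rng.normal(size=num_params))
alpha_b = normalize_to_sphere(rng.normal(size=num_params))

vertices = reconstruct_shape(mean_shape, basis, alpha_a)
similarity = cosine_similarity(alpha_a, alpha_b)
```

Because both vectors have unit norm, their cosine similarity equals their dot product, which is the kind of angular comparison that hypersphere-based recognition embeddings rely on.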

Publication history

Received: 10 January 2022
Accepted: 05 April 2022
Published: 03 January 2023
Issue date: June 2023

Copyright

© The Author(s) 2022.

Acknowledgements

The research is supported in part by National Natural Science Foundation of China (61972342, 61832016), Science and Technology Department of Zhejiang Province (2018C01080), Zhejiang Province Public Welfare Technology Application Research (LGG22F020009), Key Laboratory of Film and TV Media Technology of Zhejiang Province (2020E10015), and Teaching Reform Project of Communication University of Zhejiang (jgxm202131).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
