
In recent years, 3D face reconstruction has become an active research topic in computer graphics and computer vision. Most current methods based on 3D Morphable Models (3DMMs) focus on learning displacement maps to recover high-frequency facial details. However, they pay less attention to mid-frequency facial details, and the displacement maps they introduce are noisy, which reduces reconstruction accuracy. This work therefore presents a novel approach to regressing accurate and detailed 3D face shapes. First, we design a novel feature consistency loss to recover mid-frequency facial details. Specifically, we use the pretrained CLIP model as prior knowledge of faces to extract geometric and semantic features, which guide the reconstructed 3D geometric details to match local details in the input image. Second, we propose a parameter refinement module that learns fine-grained features, yielding more accurate model parameters and improving reconstruction accuracy. Extensive experiments on the FaceScape and REALY benchmarks demonstrate that our method outperforms several state-of-the-art methods in reconstruction accuracy, and comprehensive qualitative results show that it achieves better visual quality than existing methods.
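
As a rough illustration of the feature consistency idea, the sketch below compares feature vectors extracted from the input image with those extracted from a rendering of the reconstructed face, and penalizes their mean cosine distance. It is not the formulation used in this work: the abstract does not specify the loss, so the cosine-distance form, the function names `cosine_similarity` and `feature_consistency_loss`, and the toy feature vectors are all assumptions. No CLIP encoder is called here; the features are assumed to have been extracted beforehand.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


def feature_consistency_loss(
    input_feats: Sequence[Sequence[float]],
    rendered_feats: Sequence[Sequence[float]],
) -> float:
    """Mean cosine distance (1 - cosine similarity) over paired features.

    Hypothetical loss form: each pair holds the features of one local
    region in the input image and in the rendered reconstruction.
    """
    if not input_feats or len(input_feats) != len(rendered_feats):
        raise ValueError("need the same non-zero number of feature vectors")
    distances = [
        1.0 - cosine_similarity(f, g)
        for f, g in zip(input_feats, rendered_feats)
    ]
    return sum(distances) / len(distances)


# Toy features standing in for encoder outputs (assumed, not real CLIP features).
input_feats = [[1.0, 0.0], [0.0, 2.0]]
rendered_feats = [[1.0, 0.0], [3.0, 0.0]]  # first region matches, second does not
loss = feature_consistency_loss(input_feats, rendered_feats)
```

In this toy case the first region contributes no penalty and the second, whose features are orthogonal, contributes the full distance of 1, so minimizing the loss pushes the rendered features toward those of the input image. In a training setting the same quantity would be computed with a differentiable framework so that gradients can flow back to the reconstruction parameters.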