Cartoonizing portrait images is a stylish and eye-catching application in both computer vision and graphics. We aimed to train a face cartoonization model using very few (e.g., 5–10) style examples. The main difficulty in this challenging task lies in producing stylizations of high quality while preserving the identity of the input, particularly when the style examples contain strong exaggerations. To address this, we propose a novel cross-domain center loss for few-shot generative adversarial network (GAN) adaptation, which forces the distribution of the target domain to be similar to that of the source. We then employ it to solve this few-shot problem along with a two-stage strategy. Stage I generates an intermediate cartoonization for the input, where we first stylize the individual facial components locally and then deform them to mimic the desired exaggeration under the guidance of landmarks. Stage II focuses on global refinement of the intermediate image. First, we adapt a pretrained StyleGAN model using the proposed cross-domain center loss to the target domain defined by a few examples. Subsequently, the intermediate cartoonization from Stage I can be holistically refined through GAN inversion. The generative power of StyleGAN guarantees high image quality, while the local translation and landmark-guided deformation applied to facial components provide high identity fidelity. Experiments show that the proposed method outperforms state-of-the-art few-shot stylization approaches both qualitatively and quantitatively.
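The abstract states that the cross-domain center loss pulls the target-domain feature distribution toward that of the source; the exact formulation is not given here. Below is a minimal NumPy sketch of the center-matching idea, assuming features are extracted from a batch of source-domain and target-domain images (the function name, feature shapes, and squared-distance form are illustrative assumptions, not the paper's definition):

```python
import numpy as np

def cross_domain_center_loss(source_feats, target_feats):
    """Hypothetical sketch of a center-matching loss.

    source_feats, target_feats: arrays of shape (batch, feature_dim).
    Penalizes the squared distance between the mean (center) of the
    source-domain features and that of the target-domain features,
    encouraging the adapted model's feature distribution to stay
    close to the source distribution.
    """
    c_src = source_feats.mean(axis=0)   # center of source features
    c_tgt = target_feats.mean(axis=0)   # center of target features
    return float(np.sum((c_src - c_tgt) ** 2))
```

In a few-shot adaptation loop, such a term would typically be added to the adversarial objective so that only a handful of style examples suffice without the generator drifting far from the pretrained source model.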
Computational Visual Media 2025, 11(2): 269-287
Published: 08 May 2025