Research Article | Open Access

Intuitive user-guided portrait image editing with asymmetric conditional GAN

Linlin Liu1,*, Qian Fu2,*, Fei Hou3,4, Ying He5 (✉)

1 Interdisciplinary Graduate School, Nanyang Technological University, and Alibaba Group, Singapore 639798, Singapore
2 Data61, Commonwealth Scientific and Industrial Research Organisation, Sydney 2122, Australia
3 Key Laboratory of System Software (CAS) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
4 University of Chinese Academy of Sciences, Beijing 100049, China
5 School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore

* Linlin Liu and Qian Fu contributed equally to this work.

Abstract

We propose PortraitACG, a novel framework for user-guided portrait image editing that leverages an asymmetric conditional generative adversarial network (GAN) to support fine-grained editing of geometry, colors, lights, and shadows with a single network model. Existing conditional GAN-based approaches usually feed the same conditional information to both the generator and the discriminator, which is suboptimal because the two modules serve different purposes. To facilitate flexible user-guided editing, we propose an asymmetric conditional GAN in which the generators take transformed conditional inputs, such as edge maps, color palettes, sliders, and masks, that users can edit directly, while the discriminators take the conditional inputs in a form that guides controllable image generation more effectively. This design makes image editing simpler and more intuitive: for example, the user can specify the desired colors of hair, skin, eyes, lips, and background directly from a color palette, blend colors with a slider, and edit lights and shadows by modifying their corresponding masks.
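To make the asymmetric conditioning concrete, the sketch below pairs a generator that consumes the user-editable inputs named in the abstract (an edge map, a rasterized color palette blended by a slider, and light/shadow masks) with a discriminator conditioned on a different representation, here a semantic parsing map. This is a minimal illustration under stated assumptions, not the paper's architecture: the module names (EdgePaletteGenerator, CondDiscriminator, blend_palettes), the channel counts, and the choice of a 19-class parsing map as the discriminator's condition are all hypothetical.

```python
# Minimal sketch of asymmetric conditioning in a conditional GAN (PyTorch).
# All module names, channel counts, and the parsing-map condition are
# illustrative assumptions, not the architecture from the paper.
import torch
import torch.nn as nn

class EdgePaletteGenerator(nn.Module):
    """Generator conditioned on directly user-editable inputs: an edge map,
    a per-region color palette rasterized over the image, and light/shadow
    masks."""
    def __init__(self, cond_ch=1 + 3 + 2, out_ch=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(cond_ch, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, out_ch, 3, padding=1), nn.Tanh(),
        )

    def forward(self, edge, palette_map, light_masks):
        # edge: (B,1,H,W); palette_map: (B,3,H,W) palette colors rasterized
        # per region; light_masks: (B,2,H,W) user-editable light/shadow masks.
        cond = torch.cat([edge, palette_map, light_masks], dim=1)
        return self.net(cond)

class CondDiscriminator(nn.Module):
    """Discriminator conditioned differently from the generator: it sees the
    image paired with a dense semantic parsing map (19 classes assumed here)
    rather than the raw user edits, so it can judge conditional consistency
    at the region level."""
    def __init__(self, img_ch=3, cond_ch=19, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(img_ch + cond_ch, width, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(width, 1, 4, stride=2, padding=1),  # patch-level scores
        )

    def forward(self, img, parsing_map):
        return self.net(torch.cat([img, parsing_map], dim=1))

def blend_palettes(palette_a, palette_b, slider):
    """Slider-based color blending: linearly interpolate two rasterized
    palettes with slider in [0, 1] before feeding the generator."""
    return (1.0 - slider) * palette_a + slider * palette_b

if __name__ == "__main__":
    B, H, W = 2, 64, 64
    G, D = EdgePaletteGenerator(), CondDiscriminator()
    edge = torch.rand(B, 1, H, W)
    pal = blend_palettes(torch.rand(B, 3, H, W), torch.rand(B, 3, H, W), 0.3)
    masks = torch.rand(B, 2, H, W)
    fake = G(edge, pal, masks)                 # (B, 3, 64, 64)
    scores = D(fake, torch.rand(B, 19, H, W))  # patch scores for the fake pair
    print(fake.shape, scores.shape)
```

Note that the asymmetry lives entirely in what each module is conditioned on; the adversarial training loop itself would be unchanged.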

Electronic Supplementary Material

Video: cvm-11-2-361_ESM.mp4

Computational Visual Media
Pages 361–379
Cite this article:
Liu L, Fu Q, Hou F, et al. Intuitive user-guided portrait image editing with asymmetric conditional GAN. Computational Visual Media, 2025, 11(2): 361-379. https://doi.org/10.26599/CVM.2025.9450370


Received: 06 January 2023
Accepted: 19 July 2023
Published: 08 May 2025
© The Author(s) 2025.

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
