
An attention-embedded GAN for SVBRDF recovery from a single image

Zeqi Shi, Xiangyu Lin, Ying Song (corresponding author)
Zhejiang Sci-Tech University, 2011 Collaborative Innovation Center for Garment Personal Customization of Zhejiang Province, Hangzhou 310018, China

Abstract

Learning-based approaches have made substantial progress in capturing spatially-varying bidirectional reflectance distribution functions (SVBRDFs) from a single image with unknown lighting and geometry. However, most existing networks only consider per-pixel losses, which limits their ability to recover local features such as smooth glossy regions. A few generative adversarial networks use multiple discriminators for different parameter maps, increasing network complexity. We present a novel end-to-end generative adversarial network (GAN) that recovers appearance from a single picture of a nearly-flat surface lit by flash. We use a single unified adversarial framework for all parameter maps, and an attention module guides the network to focus on details of the maps. Furthermore, an SVBRDF map loss is combined with the adversarial loss to prevent the network from paying excessive attention to specular highlights. We demonstrate and evaluate our method on both public datasets and real data. Quantitative analysis and visual comparisons indicate that our method achieves better results than the state of the art in most cases.

Keywords: attention mechanism, generative adversarial network (GAN), spatially-varying bidirectional reflectance distribution function (SVBRDF), appearance capture
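The abstract names two ingredients: an attention module that steers the generator toward details in the parameter maps, and a generator objective that pairs a single adversarial term with a per-pixel SVBRDF map loss so specular highlights do not dominate training. The sketch below (PyTorch) only illustrates how such pieces could be wired up; it is not the authors' implementation, and every layer choice, tensor shape, and the `map_weight` value are illustrative assumptions.

```python
# Minimal sketch, NOT the paper's exact architecture: (a) a spatial attention
# gate that re-weights decoder features, (b) a generator loss that combines one
# adversarial term with an L1 loss over the stacked SVBRDF maps.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialAttention(nn.Module):
    """Predicts a per-pixel weight in [0, 1] and rescales the feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.score(feat))   # B x 1 x H x W attention mask
        return feat * attn                       # emphasized features


def generator_loss(d_fake_logits: torch.Tensor,
                   pred_maps: torch.Tensor,
                   gt_maps: torch.Tensor,
                   map_weight: float = 10.0) -> torch.Tensor:
    """Single adversarial term plus an L1 loss over the stacked SVBRDF maps
    (e.g., normal, diffuse, roughness, specular), instead of one discriminator
    per parameter map; map_weight is an assumed balancing factor."""
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    map_l1 = F.l1_loss(pred_maps, gt_maps)
    return adv + map_weight * map_l1
```

In a design of this kind, the adversarial term encourages sharp local structure while the L1 map term anchors the prediction to the ground-truth maps, which is one way to keep bright highlights in the input from overwhelming the recovered reflectance.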

Electronic supplementary material
41095_0289_ESM.rar (115.1 MB)

Publication history

Received: 20 January 2022
Accepted: 20 April 2022
Published: 22 March 2023
Issue date: September 2023

Copyright

© The Author(s) 2023.

Acknowledgements

The authors would like to thank Jie Guo from Nanjing University for his kind help with the comparison. Ying Song was partially supported by the National Natural Science Foundation of China (No. 61602416) and Shaoxing Science and Technology Plan Project (No. 2020B41006).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
