
AOGAN: A generative adversarial network for screen space ambient occlusion

Lei Ren1, Ying Song1,2,3 (corresponding author)
1 Zhejiang Sci-Tech University, Hangzhou 310018, China
2 2011 Collaborative Innovation Center for Garment Personal Customization of Zhejiang Province, China
3 Key Lab of Silk and Culture Heritage and Products Design Digital Technology, Ministry of Culture and Tourism, China

Abstract

Ambient occlusion (AO) is a widely used real-time rendering technique that estimates light intensity on visible scene surfaces. Recently, a number of learning-based AO approaches have been proposed; they bring a new angle to screen space shading by casting it in a unified learning framework with competitive quality and speed. However, most such methods produce high error in complex scenes or tend to ignore details. We propose an end-to-end generative adversarial network for producing realistic AO, and explore the importance of perceptual loss to the AO accuracy of the generative model. We also describe an attention mechanism that improves the accuracy of details; its effectiveness is demonstrated on a wide variety of scenes.

Keywords: perceptual loss, generative adversarial network (GAN), ambient occlusion (AO), attention mechanism
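
A perceptual loss of the kind mentioned in the abstract is commonly computed by comparing deep features of a pretrained network (typically VGG-16) rather than raw pixels. The sketch below is a minimal, hypothetical illustration in PyTorch, not the authors' implementation; the feature layer cutoff and the loss weight are illustrative assumptions.

import torch
import torch.nn as nn
from torchvision import models

class VGGPerceptualLoss(nn.Module):
    """L1 distance between VGG-16 feature maps of predicted and reference AO maps."""
    def __init__(self, num_layers=16):
        super().__init__()
        # Truncated, frozen VGG-16 used purely as a fixed feature extractor.
        vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        self.features = vgg.features[:num_layers].eval()
        for p in self.features.parameters():
            p.requires_grad = False
        self.l1 = nn.L1Loss()

    def forward(self, pred_ao, gt_ao):
        # AO maps are single-channel; tile to 3 channels to match VGG input.
        pred = pred_ao.repeat(1, 3, 1, 1)
        gt = gt_ao.repeat(1, 3, 1, 1)
        return self.l1(self.features(pred), self.features(gt))

# Illustrative generator objective: adversarial term plus a weighted perceptual term.
# perceptual_weight is a placeholder, not a value reported in the paper.
# g_loss = adversarial_loss + perceptual_weight * VGGPerceptualLoss()(fake_ao, real_ao)
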


Publication history

Received: 13 May 2021
Accepted: 13 July 2021
Published: 06 January 2022
Issue date: September 2022

Copyright

© The Author(s) 2021.

Acknowledgements

The authors would like to thank the anonymous reviewers for their helpful suggestions and comments. Ying Song was supported by the National Natural Science Foundation of China (No. 61602416), Shaoxing Science and Technology Bureau Key Project (No. 2020B41006), and the Opening Fund (No. 2020WLB10) of the Key Laboratory of Silk Culture Heritage and Product Design Digital Technology.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
