Discriminative feature encoding for intrinsic image decomposition

Zongji Wang; Yunfei Liu; Feng Lu

doi:10.1007/s41095-022-0294-4

Research Article | Open Access

Zongji Wang¹, Yunfei Liu², Feng Lu²,³

¹ Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100190, China

² State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing 100191, China

³ Peng Cheng Laboratory, Shenzhen 518000, China

Abstract

Intrinsic image decomposition is an important and long-standing computer vision problem. Given an input image, recovering the physical scene properties is ill-posed. Several physically motivated priors have been used to restrict the solution space of the optimization problem for intrinsic image decomposition. This work takes advantage of deep learning, and shows that it can solve this challenging computer vision problem with high efficiency. The focus lies in the feature encoding phase to extract discriminative features for different intrinsic layers from an input image. To achieve this goal, we explore the distinctive characteristics of different intrinsic components in the high-dimensional feature embedding space. We define feature distribution divergence to efficiently separate the feature vectors of different intrinsic components. The feature distributions are also constrained to fit the real ones through a feature distribution consistency. In addition, a data refinement approach is provided to remove data inconsistency from the Sintel dataset, making it more suitable for intrinsic image decomposition. Our method is also extended to intrinsic video decomposition based on pixel-wise correspondences between adjacent frames. Experimental results indicate that our proposed network structure can outperform the existing state-of-the-art.
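To make the ill-posedness mentioned above concrete, the following is a minimal sketch. It assumes the commonly used Lambertian formation model, in which an image is the pixel-wise product of a reflectance (albedo) layer and a shading layer. That model, the array sizes, and every name in the code are illustrative assumptions and are not taken from the authors' implementation.

```python
import numpy as np


def compose(albedo, shading):
    """Assumed formation model: image = albedo * shading, pixel-wise."""
    return albedo * shading


# Hypothetical toy layers: RGB albedo and single-channel shading.
rng = np.random.default_rng(0)
albedo = rng.uniform(0.2, 0.8, size=(4, 4, 3))
shading = rng.uniform(0.2, 0.8, size=(4, 4, 1))
image = compose(albedo, shading)

# Rescale the two layers in opposite directions by an arbitrary factor.
k = 1.25
alt_albedo = albedo * k
alt_shading = shading / k
alt_image = compose(alt_albedo, alt_shading)

# Both layer pairs explain the same observed image.
same_image = np.allclose(image, alt_image)
print("same image from different layers:", same_image)
```

Because two different layer pairs reproduce the same observation, the image alone cannot identify the layers, which is why priors or learned constraints are needed to restrict the solution space.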

Keywords

deep learning; intrinsic image decomposition; feature distribution; data refinement

Computational Visual Media

Volume 9, Issue 3, September 2023

Pages 597-618

DOI: 10.1007/s41095-022-0294-4

Cite this article:

Wang Z, Liu Y, Lu F. Discriminative feature encoding for intrinsic image decomposition. Computational Visual Media, 2023, 9(3): 597-618. https://doi.org/10.1007/s41095-022-0294-4

Received: 11 January 2022

Accepted: 09 May 2022

Published: 18 April 2023

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.