High-quality indoor scene 3D reconstruction with RGB-D cameras: A brief review

Jianwei Li¹, Wei Gao²,³, Yihong Wu²,³, Yangdong Liu⁴, Yanfei Shen¹ (corresponding author)
¹ School of Sports Engineering, Beijing Sports University, Beijing 100084, China
² National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
³ University of Chinese Academy of Sciences, Beijing 100049, China
⁴ Huawei Technologies Co., Ltd., Beijing 100085, China

Abstract

High-quality 3D reconstruction is an important topic in computer graphics and computer vision, with many applications such as robotics and augmented reality. The advent of consumer RGB-D cameras has profoundly advanced indoor scene reconstruction, and over the past few years researchers have spent significant effort developing algorithms to capture 3D models with these cameras. However, because the depth images produced by consumer RGB-D cameras are noisy and incomplete when surfaces are shiny, bright, transparent, or far from the camera, obtaining high-quality 3D scene models remains a challenge for existing systems. We review high-quality 3D indoor scene reconstruction methods using consumer RGB-D cameras. In this paper, we make comparisons and analyses from the following aspects: (i) depth processing methods in 3D reconstruction are reviewed in terms of enhancement and completion, (ii) ICP-based, feature-based, and hybrid camera pose estimation methods are reviewed, and (iii) surface reconstruction methods are reviewed in terms of surface fusion, optimization, and completion. The performance of state-of-the-art methods is also compared and analyzed. This survey will be useful for researchers who want to follow best practices in designing new high-quality 3D reconstruction methods.

Keywords: 3D reconstruction, image processing, camera pose estimation, surface fusion
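The second aspect above, ICP-based camera pose estimation, is easiest to grasp from its classical core. The sketch below is a minimal, illustrative point-to-point ICP in Python; it is not code from this paper or any surveyed system, and the function names and the NumPy/SciPy dependencies are our own assumptions for illustration. Practical RGB-D pipelines such as KinectFusion instead use projective data association and a point-to-plane error, but the alternation shown here (match closest points, then solve the pose in closed form) is the same.

```python
# Illustrative point-to-point ICP sketch (not taken from the surveyed systems).
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Closed-form least-squares rigid motion (R, t) with R @ src[i] + t ~ dst[i]."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)        # 3x3 cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # reflection guard: enforce det(R) = +1
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp_point_to_point(src, dst, iters=30):
    """Alternate closest-point matching with closed-form pose updates."""
    tree = cKDTree(dst)                        # nearest-neighbour correspondences
    R_acc, t_acc = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)               # match each source point to dst
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t                    # apply the incremental motion
        R_acc, t_acc = R @ R_acc, R @ t_acc + t
    return R_acc, t_acc                        # pose aligning src onto dst
```

On the fusion side (the third aspect), most volumetric systems descend from the weighted running average of Curless and Levoy: each voxel's signed distance D and weight W are updated per frame as D ← (W·D + w·d)/(W + w) and W ← W + w, where d is the new distance measurement and w its confidence.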


Publication history

Received: 20 May 2021
Accepted: 30 July 2021
Published: 06 March 2022
Issue date: September 2022

Copyright

© The Author(s) 2021.

Acknowledgements

This work is supported by the National Key R&D Program of China under Grant No. 2018YFC2000600, the Open Projects Program of National Laboratory of Pattern Recognition under Grant No. 202100009, the National Natural Science Foundation of China under Grant No. 72071018, and the Fundamental Research Funds for the Central Universities under Grant No. 2021TD006.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
