Volume 8, Issue 3

High-quality indoor scene 3D reconstruction with RGB-D cameras: A brief review

Jianwei Li1, Wei Gao2,3, Yihong Wu2,3, Yangdong Liu4, Yanfei Shen1
1 School of Sports Engineering, Beijing Sport University, Beijing 100084, China
2 National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
4 Huawei Technologies Co., Ltd., Beijing 100085, China

Abstract

High-quality 3D reconstruction is an important topic in computer graphics and computer vision with many applications, such as robotics and augmented reality. The advent of consumer RGB-D cameras has brought profound advances in indoor scene reconstruction. Over the past few years, researchers have devoted significant effort to developing algorithms that capture 3D models with RGB-D cameras. Because depth images produced by consumer RGB-D cameras are noisy and incomplete when surfaces are shiny, bright, transparent, or far from the camera, obtaining high-quality 3D scene models remains a challenge for existing systems. Here we review high-quality 3D indoor scene reconstruction methods using consumer RGB-D cameras. In this paper, we compare and analyze methods from the following aspects: (i) depth processing methods in 3D reconstruction are reviewed in terms of enhancement and completion, (ii) camera pose estimation methods are reviewed in three groups: ICP-based, feature-based, and hybrid, and (iii) surface reconstruction methods are reviewed in terms of surface fusion, optimization, and completion. The performance of state-of-the-art methods is also compared and analyzed. This survey will be useful for researchers who want to follow best practices in designing new high-quality 3D reconstruction methods.

Keywords: 3D reconstruction, image processing, camera pose estimation, surface fusion
Received: 20 May 2021; Accepted: 30 July 2021; Published: 06 March 2022; Issue date: September 2022
References(160)
[1]
Orts-Escolano, S.; Rhemann, C.; Fanello, S.; Chang, W.; Kowdle, A.; Degtyarev, Y.; Kim, D.; Davidson, P. L.; Khamis, S.; Dou, M.; et al. Holoportation: Virtual 3D teleportation in real-time. In: Proceedings of the 29th Annual Symposium on User Interface Software and Technology, 741-754, 2016.
[2]
DGene. Available at https://www.dgene.com/tech/model.
DOI
[3]
Choi, S.; Zhou, Q. Y.; Koltun, V. Robust reconstruction of indoor scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5556-5565, 2015.
[4]
Newcombe, R. A.; Izadi, S.; Hilliges, O.; Molyneaux, D.; Kim, D.; Davison, A. J.; Kohi, P.; Shotton, J.; Hodges, S.; Fitzgibbon, A. KinectFusion: Real-time dense surface mapping and tracking. In: Proceedings of the 10th IEEE International Symposium on Mixed and Augmented Reality, 127-136, 2011.
[5]
Curless, B.; Levoy, M. A volumetric method for building complex models from range images. In: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, 303-312, 1996.
[6]
Rusinkiewicz, S.; Levoy, M. Efficient variants of the ICP algorithm. In: Proceedings of the 3rd International Conference on 3-D Digital Imaging and Modeling, 145-152, 2001.
[7]
Whelan, T.; Kaess, M.; Fallon, M.; Johannsson, H.; Leonard, J.; Mcdonald, J. Kintinuous: Spatially extended KinectFusion. Robotics and Autonomous Systems Vol. 69, No. C, 3-14, 2012.
[8]
Whelan, T.; Kaess, M.; Johannsson, H.; Fallon, M.; Leonard, J. J.; McDonald, J. Real-time large-scale dense RGB-D SLAM with volumetric fusion. The International Journal of Robotics Research Vol. 34, Nos. 4-5, 598-626, 2015.
[9]
Thomas, D.; Sugimoto, A. Modeling large-scale indoor scenes with rigid fragments using RGB-D cameras. Computer Vision and Image Understanding Vol. 157, 103-116, 2017.
[10]
Golodetz, S.; Cavallari, T.; Lord, N. A.; Prisacariu, V. A.; Murray, D. W.; Torr, P. H. S. Collaborative large-scale dense 3D reconstruction with online inter-agent pose optimisation. IEEE Transactions on Visualization and Computer Graphics Vol. 24, No. 11, 2895-2905, 2018.
[11]
Dai, A.; Diller, C.; Niessner, M. SG-NN: Sparse generative neural networks for self-supervised scene completion of RGB-D scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 846-855, 2020.
[12]
Salas-Moreno, R. F.; Newcombe, R. A.; Strasdat, H.; Kelly, P. H. J.; Davison, A. J. SLAM++: Simultaneous localisation and mapping at the level of objects. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1352-1359, 2013.
[13]
Shao, T. J.; Xu, W. W.; Zhou, K.; Wang, J. D.; Li, D. P.; Guo, B. N. An interactive approach to semantic modeling of indoor scenes with an RGBD camera. ACM Transactions on Graphics Vol. 31, No. 6, Article No. 136, 2012.
[14]
Chen, K.; Lai, Y. K.; Wu, Y. X.; Martin, R.; Hu, S. M. Automatic semantic modeling of indoor scenes from low-quality RGB-D data using contextual information. ACM Transactions on Graphics Vol. 33, No. 6, Article No. 208, 2014.
[15]
McCormac, J.; Handa, A.; Davison, A.; Leutenegger, S. SemanticFusion: Dense 3D semantic mapping with convolutional neural networks. In: Proceedings of the IEEE International Conference on Robotics and Automation, 4628-4635, 2017.
[16]
Hou, J.; Dai, A.; Nießner, M. 3D-SIS: 3D semantic instance segmentation of RGB-D scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4416-4425, 2019.
[17]
Cai, Y. J.; Chen, X. S.; Zhang, C.; Lin, K. Y.; Wang, X. G.; Li, H. S. Semantic scene completion via integrating instances and scene in-the-loop. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 324-333, 2021.
[18]
Newcombe, R. A.; Fox, D.; Seitz, S. M. DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 343-352, 2015.
[19]
Dou, M. S.; Khamis, S.; Degtyarev, Y.; Davidson, P.; Fanello, S. R.; Kowdle, A.; Escolano, S. O.; Rhemann, C.; Kim, D.; Taylor, J.; et al. Fusion4D: Real-time performance capture of challenging scenes. ACM Transactions on Graphics Vol. 35, No. 4, Article No. 114, 2016.
[20]
Meerits, S.; Thomas, D.; Nozick, V.; Saito, H. FusionMLS: Highly dynamic 3D reconstruction with consumer-grade RGB-D cameras. Computational Visual Media Vol. 4, No. 4, 287-303, 2018.
[21]
Saito, S.; Simon, T.; Saragih, J.; Joo, H. PIFuHD: Multi-level pixel-aligned implicit function for high-resolution 3D human digitization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 81-90, 2020.
[22]
Nießner, M.; Zollhöfer, M.; Izadi, S.; Stamminger, M. Real-time 3D reconstruction at scale using voxel hashing. ACM Transactions on Graphics Vol. 32, No. 6, Article No. 169, 2013.
[23]
Steinbrücker, F.; Sturm, J.; Cremers, D. Volumetric 3D mapping in real-time on a CPU. In: Proceedings of the IEEE International Conference on Robotics and Automation, 2021-2028, 2014.
[24]
Kähler, O.; Adrian Prisacariu, V.; Yuheng Ren, C.; Sun, X.; Torr, P.; Murray, D. Very high frame rate volumetric integration of depth images on mobile devices. IEEE Transactions on Visualization and Computer Graphics Vol. 21, No. 11, 1241-1250, 2015.
[25]
Prisacariu, V. A.; Kähler, O.; Golodetz, S.; Sapienza, M.; Cavallari, T.; Torr, P. H.; Murray, D. W. InfiniTAM v3: A framework for large-scale 3D reconstruction with loop closure. arXiv preprint arXiv:1708.00783, 2017.
[26]
Dai, A.; Nießner, M.; Zollhöfer, M.; Izadi, S.; Theobalt, C. BundleFusion: Real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Transactions on Graphics Vol. 36, No. 4, Article No. 76a, 2017.
[27]
Maier, R.; Kim, K.; Cremers, D.; Kautz, J.; Nießner, M. Intrinsic3D: High-quality 3D reconstruction by joint appearance and geometry optimization with spatially-varying lighting. In: Proceedings of the IEEE International Conference on Computer Vision, 3133-3141, 2017.
[28]
Cao, Y. P.; Kobbelt, L.; Hu, S. M. Real-time high-accuracy three-dimensional reconstruction with consumer RGB-D cameras. ACM Transactions on Graphics Vol. 37, No. 5, Article No. 171, 2018.
[29]
Whelan, T.; Salas-Moreno, R. F.; Glocker, B.; Davison, A. J.; Leutenegger, S. ElasticFusion: Real-time dense SLAM and light source estimation. The International Journal of Robotics Research Vol. 35, No. 14, 1697-1716, 2016.
[30]
Jeon, J.; Jung, Y.; Kim, H.; Lee, S. Texture map generation for 3D reconstructed scenes. The Visual Computer Vol. 32, Nos. 6-8, 955-965, 2016.
[31]
Dai, A.; Chang, A. X.; Savva, M.; Halber, M.; Funkhouser, T.; Nießner, M. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2432-2443, 2017.
[32]
Sturm, J.; Engelhard, N.; Endres, F.; Burgard, W.; Cremers, D. A benchmark for the evaluation of RGB-D SLAM systems. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 573-580, 2012.
[33]
Zhou, Q. Y.; Miller, S.; Koltun, V. Elastic fragments for dense scene reconstruction. In: Proceedings of the IEEE International Conference on Computer Vision, 473-480, 2013.
[34]
Xiao, J. X.; Owens, A.; Torralba, A. SUN3D: A database of big spaces reconstructed using SfM and object labels. In: Proceedings of the IEEE International Conference on Computer Vision, 1625-1632, 2013.
[35]
Handa, A.; Whelan, T.; McDonald, J.; Davison, A. J. A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM. In: Proceedings of the IEEE International Conference on Robotics and Automation, 1524-1531, 2014.
[36]
Hua, B. S.; Pham, Q. H.; Nguyen, D. T.; Tran, M. K.; Yu, L. F.; Yeung, S. K. SceneNN: A scene meshes dataset with aNNotations. In: Proceedings of the 4th International Conference on 3D Vision, 92-101, 2016.
[37]
Wasenmüller, O.; Meyer, M.; Stricker, D. CoRBS: Comprehensive RGB-D benchmark for SLAM using Kinect v2. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 1-7, 2016.
[38]
Chang, A.; Dai, A.; Funkhouser, T.; Halber, M.; Niebner, M.; Savva, M.; Song, S.; Zeng, A.; Zhang, Y. Matterport3D: Learning from RGB-D data in indoor environments. In: Proceedings of the International Conference on 3D Vision, 667-676, 2017.
[39]
McCormac, J.; Handa, A.; Leutenegger, S.; Davison, A. J. SceneNet RGB-D: Can 5M synthetic images beat generic ImageNet pre-training on indoor segmentation? In: Proceedings of the IEEE International Conference on Computer Vision, 2697-2706, 2017.
[40]
Palazzolo, E.; Behley, J.; Lottes, P.; Giguère, P.; Stachniss, C. ReFusion: 3D reconstruction in dynamic environments for RGB-D cameras exploiting residuals. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 7855-7862, 2019.
[41]
Li, W. B.; Saeedi, S.; McCormac, J.; Clark, R.; Leutenegger, S. InteriorNet: Mega-scale Multi-sensor Photo-realistic indoor scenes dataset. arXiv preprint arXiv:1809.00716, 2018.
[42]
Straub, J.; Whelan, T.; Ma, L. N.; Chen, Y. F.; Wijmans, E.; Green, S.; Engel, J. J.; Mur-Artal, R.; Ren, C.; Verma, S.; et al. The replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:1906.05797, 2019.
[43]
Shi, X. S.; Li, D. J.; Zhao, P. P.; Tian, Q. B.; Tian, Y. X.; Long, Q. W.; Zhu, C.; Song, J.; Qiao, F.; Song, L.; et al. Are we ready for service robots? The OpenLORIS-scene datasets for lifelong SLAM. In: Proceedings of the IEEE International Conference on Robotics and Automation, 3139-3145, 2020.
[44]
Kadambi, A.; Taamazyan, V.; Shi, B. X.; Raskar, R. Polarized 3D: High-quality depth sensing with polarization cues. In: Proceedings of the IEEE International Conference on Computer Vision, 3370-3378, 2015.
[45]
Berger, M.; Tagliasacchi, A.; Seversky, L.; Alliez, P.; Levine, J.; Sharf, A.; Silva, C. State of the art in surface reconstruction from point clouds. In: Proceedings of the Eurographics 2014 - State of the Art Reports, 161-185, 2014.
[46]
Chen, K.; Lai, Y. K.; Hu, S. M. 3D indoor scene modeling from RGB-D data: A survey. Computational Visual Media Vol. 1, No. 4, 267-278, 2015.
[47]
Stotko, P. State of the art in real-time registration of RGB-D images. In: Proceedings of the Central European Seminar on Computer Graphics for Students, 2016.
[48]
Xu, K.; Kim, V. G.; Huang, Q. X.; Mitra, N.; Kalogerakis, E. Data-driven shape analysis and processing. In: Proceedings of the SIGGRAPH ASIA 2016 Courses, Article No. 4, 2016.
[49]
Zollhöfer, M.; Stotko, P.; Görlitz, A.; Theobalt, C.; Nießner, M.; Klein, R.; Kolb, A. State of the art on 3D reconstruction with RGB-D cameras. Computer Graphics Forum Vol. 37, No. 2, 625-652, 2018.
[50]
Han, X. F.; Laga, H.; Bennamoun, M. Image-based 3D object reconstruction: State-of-the-art and trends in the deep learning era. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 43, No. 5, 1578-1604, 2021.
[51]
Roldão, L.; Charette, R. D.; Verroust-Blondet, A. 3D semantic scene completion: A survey. arXiv preprint arXiv:2103.07466, 2021.
[52]
Liu, Y. Z.; Fu, Y. J.; Chen, F. D.; Goossens, B.; Zhao, H. Simultaneous localization and mapping related datasets: A comprehensive survey. arXiv preprint arXiv:2102.04036, 2021.
[53]
Nguyen, C. V.; Izadi, S.; Lovell, D. Modeling kinect sensor noise for improved 3D reconstruction and tracking. In: Proceedings of the 2nd International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission, 524-530, 2012.
DOI
[54]
Sarbolandi, H.; Lefloch, D.; Kolb, A. Kinect range sensing: Structured-light versus Time-of-Flight Kinect. Computer Vision and Image Understanding Vol. 139, 1-20, 2015.
[55]
Wasenmüller, O.; Stricker, D. Comparison of kinect V1 and V2 depth images in terms of accuracy and precision. In: Computer Vision - ACCV 2016 Workshops. Lecture Notes in Computer Science, Vol. 10117. Chen, C. S.; Lu, J.; Ma, K, K. Eds. Springer Cham, 34-45, 2017.
DOI
[56]
Tomasi, C.; Manduchi, R. Bilateral filtering for gray and color images. In: Proceedings of the 6th International Conference on Computer Vision, 839-846, 1998.
[57]
Li, J. W.; Gao, W.; Wu, Y. H. Elaborate scene reconstruction with a consumer depth camera. International Journal of Automation and Computing Vol. 15, No. 4, 443-453, 2018.
[58]
Sterzentsenko, V.; Saroglou, L.; Chatzitofis, A.; Thermos, S.; Zioulis, N.; Doumanoglou, A.; Zarpalas, D.; Daras, P. Self-supervised deep depth denoising. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 1242-1251, 2019.
[59]
Ferstl, D.; Rüther, M.; Bischof, H. Variational depth superresolution using example-based edge representations. In: Proceedings of the IEEE International Conference on Computer Vision, 513-521, 2015.
[60]
Kopf, J.; Cohen, M. F.; Lischinski, D.; Uyttendaele, M. Joint bilateral upsampling. ACM Transactions on Graphics Vol. 26, No. 3, 96-es, 2007.
[61]
Kiechle, M.; Hawe, S.; Kleinsteuber, M. A joint intensity and depth co-sparse analysis model for depth map super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision, 1545-1552, 2013.
[62]
Park, J.; Kim, H.; Tai, Y. W.; Brown, M. S.; Kweon, I. S. High-quality depth map upsampling and completion for RGB-D cameras. IEEE Transactions on Image Processing Vol. 23, No. 12, 5559-5572, 2014.
[63]
Hui, T. W.; Loy, C. C.; Tang, X. O. Depth map super-resolution by deep multi-scale guidance. In: Computer Vision - ECCV 2016. Lecture Notes in Computer Science, Vol. 9907. Leibe, B.; Matas, J.; Sebe, N.; Welling, M. Eds. Springer Cham, 353-369, 2016.
DOI
[64]
Riegler, G.; Ferstl, D.; Rüther, M.; Bischof, H. A deep primal-dual network for guided depth super-resolution. In: Procedings of the British Machine Vision Conference, 2016.
[65]
Riegler, G.; Rüther, M.; Bischof, H. ATGV-net: Accurate depth super-resolution. In: Computer Vision - ECCV 2016. Lecture Notes in Computer Science, Vol. 9907. Leibe, B.; Matas, J.; Sebe, N.; Welling, M. Eds. Springer Cham, 268-284, 2016.
DOI
[66]
Zhang, R.; Tsai, P. S.; Cryer, J. E.; Shah, M. Shape-from-shading: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 21, No. 8, 690-706, 1999.
[67]
Han, Y.; Lee, J.-Y.; Kweon, I. S. High quality shape from a single RGB-D image under uncalibrated natural illumination. In: Proceedings of the IEEE International Conference on Computer Vision, 1617-1624, 2013.
[68]
Yu, L. F.; Yeung, S. K.; Tai, Y. W.; Lin, S. Shading-based shape refinement of RGB-D images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1415-1422, 2013.
[69]
Wu, C. L.; Zollhöfer, M.; Nießner, M.; Stamminger, M.; Izadi, S.; Theobalt, C. Real-time shading-based refinement for consumer depth cameras. ACM Transactions on Graphics Vol. 33, No. 6, Article No. 200, 2014.
[70]
Or-El, R.; Rosman, G.; Wetzler, A.; Kimmel, R.; Bruckstein, A. M. RGBD-fusion: Real-time high precision depth recovery. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5407-5416, 2015.
[71]
Cui, Z. P.; Gu, J. W.; Shi, B. X.; Tan, P.; Kautz, J. Polarimetric multi-view stereo. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 369-378, 2017.
[72]
Ba, Y.; Gilbert, A. R.; Wang, F.; Yang, J.; Chen, R.; Wang, Y.; Yan, L.; Shi, B.; Kadambi, A. Deep shapefrom polarization. arXiv preprint arXiv:1903.10210,2019.
[73]
Deschaintre, V.; Lin, Y. M.; Ghosh, A. Deep polarization imaging for 3D shape and SVBRDF acquisition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 15562-15571, 2021.
[74]
DOI
[75]
Silberman, N.; Hoiem, D.; Kohli, P.; Fergus, R. Indoor segmentation and support inference from RGBD images. In: Computer Vision - ECCV 2012. Lecture Notes in Computer Science, Vol. 7576. Fitzgibbon, A.; Lazebnik, S.; Perona, P.; Sato, Y.; Schmid, C. Eds. Springer Berlin Heidelberg, 746-760, 2012.
DOI
[76]
Chen, Q. F.; Koltun, V. Fast MRF optimization with application to depth reconstruction. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3914-3921, 2014.
[77]
Ma, F. C.; Karaman, S. Sparse-to-dense: Depth prediction from sparse depth samples and a single image. In: Proceedings of the IEEE International Conference on Robotics and Automation, 4796-4803, 2018.
[78]
Chen, Z.; Badrinarayanan, V.; Drozdov, G.; Rabinovich, A. Estimating depth from RGB and sparse sensing. In: Computer Vision - ECCV 2018. Lecture Notes in Computer Science, Vol. 11208. Ferrari, V.; Hebert, M.; Sminchisescu, C.; Weiss, Y. Eds. Springer Cham, 176-192, 2018.
DOI
[79]
Cheng, X. J.; Wang, P.; Yang, R. G. Depth estimation via affinity learned with convolutional spatial propagation network. In: Computer Vision - ECCV 2018. Lecture Notes in Computer Science, Vol. 11220. Ferrari, V.; Hebert, M.; Sminchisescu, C.; Weiss, Y. Eds. Springer Cham, 108-125, 2018.
DOI
[80]
Cheng, X. J.; Wang, P.; Yang, R. G. Learning depth with convolutional spatial propagation network. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 42, No. 10, 2361-2379, 2020.
[81]
Lee, B. U.; Jeon, H. G.; Im, S.; Kweon, I. S. Depth completion with deep geometry and context guidance. In: Proceedings of the International Conference on Robotics and Automation, 3281-3287, 2019.
[82]
Cheng, X. J.; Wang, P.; Guan, C. Y.; Yang, R. G. CSPN++: Learning context and resource aware convolutional spatial propagation networks for depth completion. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 34, No. 7, 10615-10622, 2020.
[83]
Imran, S.; Long, Y. F.; Liu, X. M.; Morris, D. Depth coefficients for depth completion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 12438-12447, 2019.
[84]
Zhu, L. Y.; Mousavian, A.; Xiang, Y.; Mazhar, H.; Eenbergen, J. V.; Debnath, S.; Fox, D. RGB-D local implicit function for depth completion of transparent objects. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4647-4656, 2021.
[85]
Li, J. W.; Gao, W.; Wu, Y. H. High-quality 3D reconstruction with depth super-resolution and completion. IEEE Access Vol. 7, 19370-19381, 2019.
[86]
Slavcheva, M.; Kehl, W.; Navab, N.; Ilic, S. SDF-2-SDF: Highly accurate 3D object reconstruction. In: Computer Vision - ECCV 2016. Lecture Notes in Computer Science, Vol. 9905. Leibe, B.; Matas, J.; Sebe, N.; Welling, M. Eds. Springer Cham, 680-696, 2016.
DOI
[87]
Zeng, A.; Song, S. R.; Nießner, M.; Fisher, M.; Xiao, J. X.; Funkhouser, T. 3DMatch: Learning local geometric descriptors from RGB-D reconstructions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 199-208, 2017.
[88]
Lee, J. K.; Yea, J.; Park, M. G.; Yoon, K. J. Joint layout estimation and global multi-view registration for indoor reconstruction. In: Proceedings of the IEEE International Conference on Computer Vision, 162-171, 2017.
[89]
Besl, P. J.; McKay, N. D. Method for registration of 3-D shapes. In: Proceedings of the SPIE 1611, Sensor Fusion IV: Control Paradigms and Data Structures, 586-606, 1992.
[90]
Low, K.-L. Linear least-squares optimization for point-to-plane ICP surface registration. Technical Report TR04-004. Department of Computer Science, University of North Carolina at Chapel Hill, 2004.
DOI
[91]
Kerl, C.; Sturm, J.; Cremers, D. Robust odometry estimation for RGB-D cameras. In: Proceedings of the IEEE International Conference on Robotics and Automation, 3748-3754, 2013.
[92]
Whelan, T.; Johannsson, H.; Kaess, M.; Leonard, J. J.; McDonald, J. Robust real-time visual odometry for dense RGB-D mapping. In: Proceedings of the IEEE International Conference on Robotics and Automation, 5724-5731, 2013.
[93]
Johnson, A. E.; Kang, S. B. Registration and integration of textured 3D data. Image and Vision Computing Vol. 17, No. 2, 135-147, 1999.
[94]
Haehnel, D.; Thrun, S.; Burgard, W. An extension of the icp algorithm for modeling nonrigid objects with mobile robots. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence, 915-920, 2003.
[95]
Segal, A.; Haehnel, D.; Thrun, S. Generalized-ICP. Robotics: Science and Systems Vol. 2, No. 4, 435, 2009.
[96]
Serafin, J.; Grisetti, G. NICP: Dense normal based point cloud registration. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 742-749, 2015.
[97]
Wang, Y.; Solomon, J. Deep closest point: Learning representations for point cloud registration. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 3522-3531, 2019.
[98]
Wang, Y.; Solomon, J. M. PRNet: Self-supervised learning for partial-to-partial registration. arXiv preprint arXiv:1910.12240, 2019.
[99]
Aoki, Y.; Goforth, H.; Srivatsan, R. A.; Lucey, S. PointNetLK: Robust & efficient point cloud registration using PointNet. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7156-7165, 2019.
DOI
[100]
Ginzburg, D.; Raviv, D. Deep Weighted Consensus: Dense correspondence confidence maps for 3D shape registration. arXiv preprint arXiv:2105.02714, 2021.
[101]
Mur-Artal, R.; Tardós, J. D. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Transactions on Robotics Vol. 33, No. 5, 1255-1262, 2017.
[102]
Sarlin, P. E.; Unagar, A.; Larsson, M.; Germain, H.; Toft, C.; Larsson, V.; Pollefeys, M.; Lepetit, V.; Hammarstrand, L.; Kahl, F.; et al. Back to the feature: Learning robust camera localization from pixels to pose. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3246-3256, 2021.
[103]
Choi, C.; Trevor, A. J. B.; Christensen, H. I. RGB-D edge detection and edge-based registration. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 1568-1575, 2013.
[104]
Lu, Y.; Song, D. Z. Robust RGB-D odometry using point and line features. In: Proceedings of the IEEE International Conference on Computer Vision, 3934-3942, 2015.
[105]
Zhou, Q. Y.; Koltun, V. Depth camera tracking with contour cues. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 632-638, 2015.
[106]
Taguchi, Y.; Jian, Y. D.; Ramalingam, S.; Feng, C. Point-plane SLAM for hand-held 3D sensors. In: Proceedings of the IEEE International Conference on Robotics and Automation, 5182-5189, 2013.
[107]
Salas-Moreno, R. F.; Glocken, B.; Kelly, P. H. J.; Davison, A. J. Dense planar SLAM. In: Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 157-164, 2014.
[108]
Ma, L. N.; Kerl, C.; Stückler, J.; Cremers, D. CPA-SLAM: Consistent plane-model alignment for direct RGB-D SLAM. In: Proceedings of the IEEE International Conference on Robotics and Automation, 1285-1291, 2016.
[109]
Shi, Y. F.; Xu, K.; Nießner, M.; Rusinkiewicz, S.; Funkhouser, T. PlaneMatch: Patch coplanarity prediction for robust RGB-D reconstruction. In: Computer Vision - ECCV 2018. Lecture Notes in Computer Science, Vol. 11212. Ferrari, V.; Hebert, M.; Sminchisescu, C.; Weiss, Y. Eds. Springer Cham, 767-784, 2018.
DOI
[110]
Yunus, R.; Li, Y. Y.; Tombari, F. ManhattanSLAM: Robust planar tracking and mapping leveraging mixture of Manhattan frames. In: Proceedings of the IEEE International Conference on Robotics and Automation, 6687-6693, 2021.
[111]
Kehl, W.; Tombari, F.; Ilic, S.; Navab, N. Real-time 3D model tracking in color and depth on a single CPU core. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 465-473, 2017.
[112]
Schönberger, J. L.; Pollefeys, M.; Geiger, A.; Sattler, T. Semantic visual localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6896-6906, 2018.
[113]
Yin, Z. C.; Shi, J. P. GeoNet: Unsupervised learning of dense depth, optical flow and camera pose. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1983-1992, 2018.
[114]
Tang, S.; Tang, C.; Huang, R.; Zhu, S.; Tan, P. Learning camera localization via dense scene matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1831-1841, 2021.
[115]
Puri, P.; Jia, D. Y.; Kaess, M. GravityFusion: Real-time dense mapping without pose graph using deformation and orientation. In: Procedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 6506-6513, 2017.
[116]
Dong, W.; Wang, Q. Y.; Wang, X.; Zha, H. B. PSDF fusion: Probabilistic signed distance function for on-the-fly 3D data fusion and scene reconstruction. In: Computer Vision - ECCV 2018. Lecture Notes in Computer Science, Vol. 11213. Ferrari, V.; Hebert, M.; Sminchisescu, C.; Weiss, Y. Eds. Springer Cham, 714-730, 2018.
DOI
[117]
Steinbrucker, F.; Kerl, C.; Cremers, D.; Sturm, J. Large-scale multi-resolution surface reconstruction from RGB-D sequences. In: Proceedings of the IEEE International Conference on Computer Vision, 3264-3271, 2013.
[118]
Sumner, R. W.; Schmid, J.; Pauly, M. Embedded deformation for shape manipulation. ACM Transactions on Graphics Vol. 26, No. 3, 80-es, 2007.
[119]
Zheng, Z. R.; Yu, T.; Wei, Y. X.; Dai, Q. H.; Liu, Y. B. DeepHuman: 3D human reconstruction from a single image. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 7738-7748, 2019.
[120]
Park, J. J.; Florence, P.; Straub, J.; Newcombe, R.; Lovegrove, S. DeepSDF: Learning continuous signed distance functions for shape representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 165-174, 2019.
[121]
Sitzmann, V.; Zollhöfer, M.; Wetzstein, G. Scene representation networks: Continuous 3D-structure-aware neural scene representations. arXiv preprint arXiv:1906.01618, 2019.
[122]
Li, Z. Q.; Niklaus, S.; Snavely, N.; Wang, O. Neural scene flow fields for space-time view synthesis of dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6494-6504, 2021.
[123]
Pfister, H.; Zwicker, M.; van Baar, J.; Gross, M. Surfels: Surface elements as rendering primitives. In: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, 335-342, 2000.
[124]
Andersen, V.; Aans, H.; Brentzen, J. A. Surfel based geometry resonstruction. In: Theory and Practice of Computer Graphics. The Eurographics Association, 39-44, 2010.
DOI
[125]
Keller, M.; Lefloch, D.; Lambers, M.; Izadi, S.; Weyrich, T.; Kolb, A. Real-time 3D reconstructionin dynamic scenes using point-based fusion. In: Proceedings of the International Conference on 3D Vision, 1-8, 2013.
[126]
Mihajlovic, M.; Weder, S.; Pollefeys, M.; Oswald, M. R. DeepSurfels: learning online appearance fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 14519-14530, 2021.
[127]
Mandikal, P.; Radhakrishnan, V. B. Dense 3D point cloud reconstruction using a deep pyramid network. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 1052-1060, 2019.
[128]
Wolff, K.; Kim, C.; Zimmer, H.; Schroers, C.; Botsch, M.; Sorkine-Hornung, O.; Sorkine-Hornung, A. Point cloud noise and outlier removal for image-based 3D reconstruction. In: Proceedings of the 4th International Conference on 3D Vision, 118-127, 2016.
[129]
Casajus, P. H.; Ritschel, T.; Ropinski, T. Total denoising: Unsupervised learning of 3D point cloud cleaning. In: Proceedings of the IEEE/CVF Inter-national Conference on Computer Vision, 52-60, 2019.
[130]
Delaunoy, A.; Prados, E. Gradient flows for optimizing triangular mesh-based surfaces: Applications to 3D reconstruction problems dealing with visibility. International Journal of Computer Vision Vol. 95, No. 2, 100-123, 2011.
[131]
Wang, P. S.; Liu, Y.; Tong, X. Mesh denoising via cascaded normal regression. ACM Transactions on Graphics Vol. 35, No. 6, Article No. 232, 2016.
[132]
Schertler, N.; Tarini, M.; Jakob, W.; Kazhdan, M.; Gumhold, S.; Panozzo, D. Field-aligned online surface reconstruction. ACM Transactions on Graphics Vol. 36, No. 4, Article No. 77, 2017.
[133]
Tsai, C. Y.; Sankaranarayanan, A. C.; Gkioulekas, I. Beyond volumetric albedo—A surface optimization framework for non-line-of-sight imaging. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1545-1555, 2019.
[134]
Firman, M.; Aodha, O. M.; Julier, S.; Brostow, G. J. Structured prediction of unobserved voxels from a single depth image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5431-5440, 2016.
[135]
Huang, J.; Dai, A.; Guibas, L. J.; Niessner, M. 3Dlite: Towards commodity 3D scanning for content creation. ACM Transactions on Graphics Vol. 36, No. 6, Article No. 203, 2017.
[136]
Zollhöfer, M.; Dai, A.; Innmann, M.; Wu, C. L.; Stamminger, M.; Theobalt, C.; Nießner, M. Shading-based refinement on volumetric signed distance functions. ACM Transactions on Graphics Vol. 34, No. 4, Article No. 96, 2015.
[137]
Xu, D.; Duan, Q.; Zheng, J. M.; Zhang, J. Y.; Cai, J. F.; Cham, T. J. Shading-based surface detail recovery under general unknown illumination. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 40, No. 2, 423-436, 2018.
[138]
Chen, Z. Q.; Kim, V. G.; Fisher, M.; Aigerman, N.; Zhang, H.; Chaudhuri, S. DECOR-GAN: 3D shape detailization by conditional refinement. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 15735-15744, 2021.
[139]
Liu, Z. N.; Cao, Y. P.; Kuang, Z. F.; Kobbelt, L.; Hu, S. M. High-quality textured 3D shape reconstruction with cascaded fully convolutional networks. IEEE Transactions on Visualization and Computer Graphics Vol. 27, No. 1, 83-97, 2021.
[140]
Huang, J. W.; Thies, J.; Dai, A.; Kundu, A.; Jiang, C. Y.; Guibas, L. J.; Nießner, M.; Funkhouser, T. Adversarial texture optimization from RGB-D scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1556-1565, 2020.
[141]
Davis, J.; Marschner, S. R.; Garr, M.; Levoy, M. Filling holes in complex surfaces using volumetric diffusion. In: Proceedings of the International Symposium on 3D Data Processing Visualization and Transmission, 428-441, 2002.
[142]
Rock, J.; Gupta, T.; Thorsen, J.; Gwak, J.; Shin, D.; Hoiem, D. Completing 3D object shape from one depth image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2484-2493, 2015.
[143]
Harary, G.; Tal, A.; Grinspun, E. Context-based coherent surface completion. ACM Transactions on Graphics Vol. 33, No. 1, Article No. 5, 2014.
[144]
Wu, Z. R.; Song, S. R.; Khosla, A.; Yu, F.; Zhang, L. G.; Tang, X. O.; Xiao, J. 3D ShapeNets: A deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1912-1920, 2015.
[145]
Sharma, A.; Grau, O.; Fritz, M. VConv-DAE: Deep volumetric shape learning without object labels. In: Computer Vision - ECCV 2016 Workshops. Lecture Notes in Computer Science, Vol. 9915. Hua, G.; Jégou, H. Eds. Springer Cham, 236-250, 2016.
[146]
Wu, J.; Zhang, C.; Xue, T.; Freeman, W. T.; Tenenbaum, J. B. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In: Proceedings of the 29th Conference on Neural Information Processing Systems, 82-90, 2016.
[147]
Riegler, G.; Ulusoy, A. O.; Bischof, H.; Geiger, A. OctNetFusion: Learning depth fusion from data. In: Proceedings of the International Conference on 3D Vision, 57-66, 2017.
[148]
Dai, A.; Qi, C. R.; Nießner, M. Shape completion using 3D-encoder-predictor CNNs and shape synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6545-6554, 2017.
[149]
Nicastro, A.; Clark, R.; Leutenegger, S. X-section: Cross-section prediction for enhanced RGB-D fusion. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 1517-1526, 2019.
[150]
Hou, J.; Dai, A.; Nießner, M. RevealNet: Seeing behind objects in RGB-D scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2095-2104, 2020.
[151]
Zhang, J. Z.; Chen, X. Y.; Cai, Z.; Pan, L.; Zhao, H. Y.; Yi, S.; Yeo, C. K.; Dai, B.; Loy, C. C. Unsupervised 3D shape completion through GAN inversion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 1768-1777, 2021.
[152]
Silberman, N.; Shapira, L.; Gal, R.; Kohli, P. A contour completion model for augmenting surface reconstructions. In: Computer Vision - ECCV 2014. Lecture Notes in Computer Science, Vol. 8691. Fleet, D.; Pajdla, T.; Schiele, B.; Tuytelaars, T. Eds. Springer Cham, 488-503, 2014.
[153]
Sung, M.; Kim, V. G.; Angst, R.; Guibas, L. Data-driven structural priors for shape completion. ACM Transactions on Graphics Vol. 34, No. 6, Article No. 175, 2015.
[154]
Song, S. R.; Yu, F.; Zeng, A.; Chang, A. X.; Savva, M.; Funkhouser, T. Semantic scene completion from a single depth image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 190-198, 2017.
[155]
Dzitsiuk, M.; Sturm, J.; Maier, R.; Ma, L. N.; Cremers, D. De-noising, stabilizing and completing 3D reconstructions on-the-go using plane priors. In: Proceedings of the IEEE International Conference on Robotics and Automation, 3976-3983, 2017.
[156]
Dai, A.; Ritchie, D.; Bokeloh, M.; Reed, S.; Sturm, J.; Nießner, M. ScanComplete: Large-scale scene completion and semantic segmentation for 3D scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4578-4587, 2018.
[157]
Li, J.; Liu, Y.; Yuan, X.; Zhao, C. X.; Siegwart, R.; Reid, I.; Cadena, C. Depth based semantic scene completion with position importance aware loss. IEEE Robotics and Automation Letters Vol. 5, No. 1, 219-226, 2020.
[158]
Endres, F.; Hess, J.; Engelhard, N.; Sturm, J.; Cremers, D.; Burgard, W. An evaluation of the RGB-D SLAM system. In: Proceedings of the IEEE International Conference on Robotics and Automation, 1691-1696, 2012.
[159]
Ferstl, D.; Reinbacher, C.; Ranftl, R.; Ruether, M.; Bischof, H. Image guided depth upsampling using anisotropic total generalized variation. In: Proceedings of the IEEE International Conference on Computer Vision, 993-1000, 2013.
[160]
Zhang, Y. D.; Song, S. R.; Yumer, E.; Savva, M.; Lee, J. Y.; Jin, H. L.; Funkhouser, T. Physically-based rendering for indoor scene understanding using convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5057-5065, 2017.

Publication history

Received: 20 May 2021
Accepted: 30 July 2021
Published: 06 March 2022
Issue date: September 2022

Copyright

© The Author(s) 2021.

Acknowledgements

This work is supported by the National Key R&D Program of China under Grant No. 2018YFC2000600, the Open Projects Program of National Laboratory of Pattern Recognition under Grant No. 202100009, the National Natural Science Foundation of China under Grant No. 72071018, and the Fundamental Research Funds for Central Universities under Grant No. 2021TD006.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
