3D indoor scene modeling from RGB-D data: a survey

Kang Chen; Yu-Kun Lai; Shi-Min Hu

doi:10.1007/s41095-015-0029-x

Computational Visual Media 2015, 1(4): 267-278 https://doi.org/10.1007/s41095-015-0029-x

Review Article | Open Access | Published: 04 December 2015

Kang Chen¹, Yu-Kun Lai², Shi-Min Hu¹

1 Tsinghua University, Beijing 100084, China.

2 Cardiff University, Cardiff, CF24 3AA, Wales, UK.

Keywords:

3D indoor scenes, geometric modeling, semantic modeling, survey, RGB-D camera

Cite this article:

Chen K, Lai Y-K, Hu S-M. 3D indoor scene modeling from RGB-D data: a survey. Computational Visual Media, 2015, 1(4): 267-278. https://doi.org/10.1007/s41095-015-0029-x

Abstract

3D scene modeling has long been a fundamental problem in computer graphics and computer vision. With the popularity of consumer-level RGB-D cameras, there is a growing interest in digitizing real-world indoor 3D scenes. However, modeling indoor 3D scenes remains a challenging problem because of the complex structure of interior objects and poor quality of RGB-D data acquired by consumer-level sensors. Various methods have been proposed to tackle these challenges. In this survey, we provide an overview of recent advances in indoor scene modeling techniques, as well as public datasets and code libraries which can facilitate experiments and evaluation.

About this article

Publication history

Received: 09 October 2015

Revised: 09 October 2015

Accepted: 19 November 2015

Published: 04 December 2015

Issue date: December 2015

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Project No. 61120106007), Research Grant of Beijing Higher Institution Engineering Research Center, and Tsinghua University Initiative Scientific Research Program.

Rights and permissions

This article is published with open access at Springerlink.com

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.