Volume 1, Issue 4


3D indoor scene modeling from RGB-D data: a survey

Kang Chen¹, Yu-Kun Lai², Shi-Min Hu¹
¹ Tsinghua University, Beijing 100084, China.
² Cardiff University, Cardiff, CF24 3AA, Wales, UK.

Abstract

3D scene modeling has long been a fundamental problem in computer graphics and computer vision. With the popularity of consumer-level RGB-D cameras, there is a growing interest in digitizing real-world indoor 3D scenes. However, modeling indoor 3D scenes remains a challenging problem because of the complex structure of interior objects and the poor quality of RGB-D data acquired by consumer-level sensors. Various methods have been proposed to tackle these challenges. In this survey, we provide an overview of recent advances in indoor scene modeling techniques, as well as public datasets and code libraries that can facilitate experiments and evaluation.

Keywords:

RGB-D camera, 3D indoor scenes, geometric modeling, semantic modeling, survey
Received: 09 October 2015; Revised: 09 October 2015; Accepted: 19 November 2015; Published: 04 December 2015; Issue date: December 2015
Copyright

© The Author(s) 2015

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Project No. 61120106007), the Research Grant of Beijing Higher Institution Engineering Research Center, and the Tsinghua University Initiative Scientific Research Program.

Rights and permissions

This article is published with open access at Springerlink.com.

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
