Volume 9, Issue 4

Semantic segmentation-assisted instance feature fusion for multi-level 3D part instance segmentation

Chun-Yu Sun¹, Xin Tong², Yang Liu²
¹ Institute for Advanced Study, Tsinghua University, Beijing 100084, China
² Microsoft Research Asia, Beijing 100080, China

Abstract

Recognizing 3D part instances from a 3D point cloud is crucial for 3D structure and scene understanding. Several learning-based approaches use semantic segmentation and instance center prediction as training tasks, but they fail to further exploit the inherent relationship between shape semantics and part instances. In this paper, we present a new method for 3D part instance segmentation. Our method exploits semantic segmentation to fuse nonlocal instance features, such as center prediction, and further enhances the fusion scheme in a multi- and cross-level way. We also propose a semantic region center prediction task for training and use its predictions to improve the clustering of instance points. Our method outperforms existing methods by a large margin on the PartNet benchmark. We also demonstrate that our feature fusion scheme can be applied to other existing methods to improve their performance in indoor scene instance segmentation tasks.
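
The idea of using semantic segmentation to fuse nonlocal instance features can be illustrated with a small NumPy sketch. This is not the implementation described in the paper: the function name `fuse_by_semantics`, the probability-weighted averaging, and the final concatenation are assumptions chosen only to show one plausible way that per-point instance features could be pooled over points sharing the same semantics and then returned to each point.

```python
import numpy as np

def fuse_by_semantics(inst_feat, sem_prob, eps=1e-8):
    """Hypothetical semantic-guided fusion (illustrative assumption only).

    inst_feat: (N, C) per-point instance features, e.g. predicted center offsets.
    sem_prob:  (N, K) per-point semantic probabilities over K part classes.
    Returns:   (N, 2C) local features concatenated with semantic-pooled features.
    """
    # Total probability mass assigned to each semantic class: shape (K, 1).
    class_weight = sem_prob.sum(axis=0)[:, None]
    # Probability-weighted mean instance feature per semantic class: shape (K, C).
    class_feat = (sem_prob.T @ inst_feat) / (class_weight + eps)
    # Send the pooled (nonlocal) class features back to every point: shape (N, C).
    nonlocal_feat = sem_prob @ class_feat
    # Keep the local feature and append the nonlocal one.
    return np.concatenate([inst_feat, nonlocal_feat], axis=1)

# Toy input: four points, one feature channel, two semantic classes (one-hot).
feats = np.array([[0.0], [2.0], [10.0], [14.0]])
probs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
fused = fuse_by_semantics(feats, probs)
print(fused.shape)
```

With one-hot probabilities, the appended column is simply the mean feature of all points in the same semantic region; soft probabilities blend the class means instead. A multi-level variant would presumably repeat this pooling at each level of the part hierarchy, but that detail is not specified in the abstract.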

Keywords: feature fusion, 3D part instance segmentation, 3D deep learning

Publication history

Received: 08 April 2022
Accepted: 03 June 2022
Published: 30 June 2023
Issue date: December 2023

Copyright

© The Author(s) 2023.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
