Volume 9, Issue 2

Semi-supervised 3D shape segmentation with multilevel consistency and part substitution

Chun-Yu Sun¹, Yu-Qi Yang¹, Hao-Xiang Guo¹, Peng-Shuai Wang², Xin Tong², Yang Liu², Heung-Yeung Shum¹
¹ Institute for Advanced Study, Tsinghua University, Beijing 100084, China
² Microsoft Research Asia, Beijing 100080, China

Abstract

The lack of fine-grained 3D shape segmentation data is the main obstacle to developing learning-based 3D segmentation techniques. We propose an effective semi-supervised method for learning 3D segmentations from a few labeled 3D shapes and a large amount of unlabeled 3D data. For the unlabeled data, we present a novel multilevel consistency loss to enforce consistency of network predictions between perturbed copies of a 3D shape at multiple levels: point level, part level, and hierarchical level. For the labeled data, we develop a simple yet effective part substitution scheme to augment the labeled 3D shapes with more structural variations to enhance training. Our method has been extensively validated on the task of 3D object semantic segmentation on PartNet and ShapeNetPart, and indoor scene semantic segmentation on ScanNet. It exhibits superior performance to existing semi-supervised and unsupervised pre-training 3D approaches.

Keywords: shape segmentation, semi-supervised learning, multilevel consistency
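
The abstract describes consistency between predictions on perturbed copies of a shape at the point and part levels. As a rough illustration only, the sketch below shows one generic way such terms could be written with NumPy. It is not the formulation used in the article, which this page does not give. The squared-difference measure, the function names, and the `part_ids` array are all assumptions made for the example, and the two prediction arrays are assumed to be aligned so that row i refers to the same point in both copies.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax over the class axis, numerically stabilized."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def point_consistency(logits_a, logits_b):
    """Mean squared difference between per-point class probabilities.

    Assumption: row i of both (N, C) arrays refers to the same point.
    """
    pa, pb = softmax(logits_a), softmax(logits_b)
    return float(np.mean(np.sum((pa - pb) ** 2, axis=1)))

def part_consistency(logits_a, logits_b, part_ids):
    """Same comparison after averaging probabilities within each part.

    part_ids is a hypothetical (N,) integer array grouping points into parts.
    """
    pa, pb = softmax(logits_a), softmax(logits_b)
    losses = []
    for part in np.unique(part_ids):
        mask = part_ids == part
        diff = pa[mask].mean(axis=0) - pb[mask].mean(axis=0)
        losses.append(np.sum(diff ** 2))
    return float(np.mean(losses))

# Toy data: 6 points, 3 classes, two parts of 3 points each.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 3))
noisy = logits + 0.1 * rng.normal(size=(6, 3))  # stands in for a perturbed copy
part_ids = np.array([0, 0, 0, 1, 1, 1])

print(point_consistency(logits, noisy))
print(part_consistency(logits, noisy, part_ids))
```

Both terms are zero when the two copies yield identical predictions and grow as the predictions diverge. A hierarchical-level term would need the label hierarchy, which is not described on this page, so it is left out.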


Publication history

Received: 06 January 2022
Accepted: 05 March 2022
Published: 03 January 2023
Issue date: June 2023

Copyright

© The Author(s) 2022.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
