Volume 9, Issue 3




A survey of deep learning-based 3D shape generation

Qun-Ce Xu1, Tai-Jiang Mu1 (✉), Yong-Liang Yang2
1 BNRist, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2 Department of Computer Science, University of Bath, Bath, UK

Abstract

Deep learning has been successfully used for tasks in the 2D image domain. Research on 3D computer vision and deep geometry learning has also attracted growing attention, and considerable achievements have been made in feature extraction and discrimination of 3D shapes. Following recent advances in deep generative models such as generative adversarial networks, effective generation of 3D shapes has become an active research topic. Unlike 2D images, which have a regular grid structure, 3D shapes have various representations, such as voxels, point clouds, meshes, and implicit functions. For deep learning on 3D shapes, the choice of representation has to be taken into account, as no single representation covers all tasks well: how faithfully a representation captures geometry and topology largely determines the quality of the generated 3D shapes. In this survey, we comprehensively review works on deep-learning-based 3D shape generation, classifying and discussing them in terms of the underlying shape representation and the architecture of the shape generator. We further analyze the advantages and disadvantages of each class, and review the 3D shape datasets commonly used for shape generation. Finally, we present several potential research directions that we hope will inspire future work on this topic.
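The four representations named in the abstract can be made concrete with a minimal sketch of how each might encode the same unit sphere. The grid resolution, sample count, and NumPy encodings below are illustrative assumptions, not the conventions of any particular method surveyed here.

```python
import numpy as np

# The same unit sphere in the four shape representations named above.
# All sizes (32^3 grid, 2048 points) are arbitrary illustration choices.

# 1. Voxels: a regular occupancy grid (True = inside the shape).
n = 32
ax = np.linspace(-1, 1, n)
x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
voxels = x**2 + y**2 + z**2 <= 1.0                     # (32, 32, 32) bool

# 2. Point cloud: an unordered set of surface samples.
rng = np.random.default_rng(0)
v = rng.normal(size=(2048, 3))
points = v / np.linalg.norm(v, axis=1, keepdims=True)  # (2048, 3) on the sphere

# 3. Mesh: vertices plus triangular faces (here a crude octahedral approximation).
verts = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                  [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
faces = np.array([[0, 2, 4], [2, 1, 4], [1, 3, 4], [3, 0, 4],
                  [2, 0, 5], [1, 2, 5], [3, 1, 5], [0, 3, 5]])

# 4. Implicit function: a signed distance field, negative inside the surface.
def sdf(p):
    return np.linalg.norm(p, axis=-1) - 1.0

print(voxels.mean())      # fraction of occupied voxels
print(sdf(np.zeros(3)))   # -1.0 at the centre, 0 on the surface
```

Note how the first representation is dense and resolution-limited, the second is unordered and surface-only, the third carries explicit connectivity, and the fourth is continuous; these trade-offs are exactly what drives the classification used in this survey.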

Keywords: deep learning, generative models, 3D representations, geometry learning

[135]
Huang, H.; Kalogerakis, E.; Marlin, B. Analysis and synthesis of 3D shape families via deep-learned generative models of surfaces. Computer Graphics Forum Vol. 34, No. 5, 25–38, 2015.
[136]
Sung, M.; Su, H.; Kim, V. G.; Chaudhuri, S.; Guibas, L. ComplementMe: Weakly-supervised component suggestions for 3D modeling. ACM Transactions on Graphics Vol. 36, No. 6, Article No. 226, 2017.
[137]
Chen, Z. Q.; Tagliasacchi, A.; Zhang, H. BSP-net: Generating compact meshes via binary space partitioning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 42–51, 2020.
[138]
Paschalidou, D.; Katharopoulos, A.; Geiger, A.; Fidler, S. Neural parts: Learning expressive 3D shape abstractions with invertible neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3203–3214, 2021.
[139]
Xiao, Y. P.; Lai, Y. K.; Zhang, F. L.; Li, C. P.; Gao, L. A survey on deep geometry learning: From a representation perspective. Computational Visual Media Vol. 6, No. 2, 113–133, 2020.
[140]
Li, R. H.; Li, X. Z.; Heng, P. A.; Fu, C. W. PointAugment: An auto-augmentation framework for point cloud classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6377–6386, 2020.
[141]
Guo, M. H.; Cai, J. X.; Liu, Z. N.; Mu, T. J.; Martin, R. R.; Hu, S. M. PCT: Point cloud transformer. Computational Visual Media Vol. 7, No. 2, 187–199, 2021.
[142]
Huang, S. S.; Ma, Z. Y.; Mu, T. J.; Fu, H.; Hu, S. M. Supervoxel convolution for online 3D semantic segmentation. ACM Transactions on Graphics Vol. 40, No. 3, Article No. 34, 2021.
[143]
Huang, J. H.; Wang, H.; Birdal, T.; Sung, M.; Arrigoni, F.; Hu, S. M.; Guibas, L. MultiBodySync: Multi-body segmentation and motion estimation via 3D scan synchronization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7104–7114, 2021.
[144]
Bronstein, M. M.; Bruna, J.; LeCun, Y.; Szlam, A.; Vandergheynst, P. Geometric deep learning: Going beyond Euclidean data. IEEE Signal Processing Magazine Vol. 34, No. 4, 18–42, 2017.
[145]
Maturana, D.; Scherer, S. 3D Convolutional Neural Networks for landing zone detection from LiDAR. In: Proceedings of the IEEE International Conference on Robotics and Automation, 3471–3478, 2015.
[146]
Maturana, D.; Scherer, S. VoxNet: A 3D Convolutional Neural Network for real-time object recognition. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 922–928, 2015.
[147]
Qi, C. R.; Su, H.; Mo, K. C.; Guibas, L. J. PointNet: Deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 77–85, 2017.
[148]
Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on X-transformed points. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, 828–838, 2018.
[149]
Hanocka, R.; Hertz, A.; Fish, N.; Giryes, R.; Fleishman, S.; Cohen-Or, D. MeshCNN: A network with an edge. ACM Transactions on Graphics Vol. 38, No. 4, Article No. 90, 2019.
[150]
Yuan, Y. J.; Lai, Y. K.; Yang, J.; Duan, Q.; Fu, H.; Gao, L. Mesh variational autoencoders with edge contraction pooling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 1105–1112, 2020.
[151]
Hu, S. M.; Liu, Z. N.; Guo, M. H.; Cai, J. X.; Huang, J. H.; Mu, T. J.; Martin, R. R. Subdivision-based mesh convolution networks. ACM Transactions on Graphics Vol. 41, No. 3, Article No. 25, 2022.
[152]
Lorensen, W. E.; Cline, H. E. Marching cubes: A high resolution 3D surface construction algorithm. In: Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, 163–169, 1987.
[153]
Hinton, G. E.; Zemel, R. Autoencoders, minimum description length and Helmholtz free energy. In: Proceedings of the 6th International Conference on Neural Information Processing Systems, 3–10, 1993.
[154]
Goodfellow, I. J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In: Proceedings of the 27th International Conference on Neural Information Processing Systems, 2672–2680, 2014.
[155]
Kingma, D. P.; Welling, M. Auto-encoding variational Bayes. In: Proceedings of the International Conference on Learning Representations, 2014.
[156]
Dinh, L.; Krueger, D.; Bengio, Y. NICE: Non-linear independent components estimation. In: Proceedings of the International Conference on Learning Representations Workshops, 2015.
[157]
Dinh, L.; Sohl-Dickstein, J.; Bengio, S. Density estimation using Real NVP. In: Proceedings of the International Conference on Learning Representations, 2017.
[158]
Kingma, D. P.; Dhariwal, P. Glow: Generative flow with invertible 1×1 convolutions. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, 10236–10245, 2018.
[159]
Rezende, D. J.; Mohamed, S. Variational inference with normalizing flows. In: Proceedings of the 32nd International Conference on Machine Learning, 1530–1538, 2015.
[160]
Chen, R. T. Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D. Neural ordinary differential equations. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, 6572–6583, 2018.
[161]
Grathwohl, W.; Chen, R. T. Q.; Bettencourt, J.; Sutskever, I.; Duvenaud, D. FFJORD: Free-form continuous dynamics for scalable reversible generative models. In: Proceedings of the International Conference on Learning Representations, 2019.
[162]
Edwards, H.; Storkey, A. J. Towards a neural statistician. In: Proceedings of the International Conference on Learning Representations, 2017.
[163]
Riegler, G.; Ulusoy, A. O.; Geiger, A. OctNet: Learning deep 3D representations at high resolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 6620–6629, 2017.
[164]
Wang, P. S.; Liu, Y.; Guo, Y. X.; Sun, C. Y.; Tong, X. O-CNN: Octree-based convolutional neural networks for 3D shape analysis. ACM Transactions on Graphics Vol. 36, No. 4, Article No. 72, 2017.
[165]
Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of the International Conference on Learning Representations, 2016.
[166]
He, K. M.; Zhang, X. Y.; Ren, S. Q.; Sun, J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778, 2016.
[167]
Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning, 214–223, 2017.
[168]
Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved training of Wasserstein GANs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, 5769–5779, 2017.
[169]
Isola, P.; Zhu, J. Y.; Zhou, T. H.; Efros, A. A. Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5967–5976, 2017.
[170]
Lazarow, J.; Jin, L.; Tu, Z. W. Introspective neural networks for generative modeling. In: Proceedings of the IEEE International Conference on Computer Vision, 2793–2802, 2017.
[171]
Razavi, A.; Van den Oord, A.; Vinyals, O. Generating diverse high-fidelity images with VQ-VAE-2. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, 14866–14876, 2019.
[172]
Esser, P.; Rombach, R.; Ommer, B. Taming transformers for high-resolution image synthesis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 12868–12878, 2021.
[173]
Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4171–4186, 2019.
[174]
LeCun, Y.; Chopra, S.; Hadsell, R.; Ranzato, M.; Huang, F. A tutorial on energy-based learning. In: Predicting Structured Data. Bakir, G.; Hofmann, T.; Schölkopf, B.; Smola, A.; Taskar, B. Eds. MIT Press, 2006.
[175]
Xie, J. W.; Lu, Y.; Zhu, S. C.; Wu, Y. N. A theory of generative ConvNet. In: Proceedings of the 33rd International Conference on Machine Learning, 2635–2644, 2016.
[176]
Qi, C. R.; Yi, L.; Su, H.; Guibas, L. J. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, 5105–5114, 2017.
[177]
Wang, Y.; Sun, Y. B.; Liu, Z. W.; Sarma, S. E.; Bronstein, M. M.; Solomon, J. M. Dynamic graph CNN for learning on point clouds. ACM Transactions on Graphics Vol. 38, No. 5, Article No. 146, 2019.
[178]
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000–6010, 2017.
[179]
Zhao, H.; Jiang, L.; Jia, J.; Torr, P. H.; Koltun, V. Point transformer. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 16239–16248, 2021.
[180]
Pan, X.; Xia, Z.; Song, S.; Li, L. E.; Huang, G. 3D object detection with pointformer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7459–7468, 2021.
[181]
Sederberg, T. W.; Parry, S. R. Free-form deformation of solid geometric models. In: Proceedings of the 13th Annual Conference on Computer Graphics and Interactive Techniques, 151–160, 1986.
[182]
Navaneet, K. L.; Mandikal, P.; Agarwal, M.; Babu, R. V. CAPNet: Continuous approximation projection for 3D point cloud reconstruction using 2D supervision. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 33, No. 01, 8819–8826, 2019.
[183]
Han, Z. Z.; Chen, C.; Liu, Y. S.; Zwicker, M. DRWR: A differentiable renderer without rendering for unsupervised 3D structure learning from silhouette images. In: Proceedings of the 37th International Conference on Machine Learning, 3994–4005, 2020.
[184]
Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8107–8116, 2020.
[185]
Sohl-Dickstein, J.; Weiss, E. A.; Maheswaranathan, N.; Ganguli, S. Deep unsupervised learning using nonequilibrium thermodynamics. In: Proceedings of the 32nd International Conference on Machine Learning, 2256–2265, 2015.
[186]
Maron, H.; Galun, M.; Aigerman, N.; Trope, M.; Dym, N.; Yumer, E.; Kim, V. G.; Lipman, Y. Convolutional neural networks on surfaces via seamless toric covers. ACM Transactions on Graphics Vol. 36, No. 4, Article No. 71, 2017.
[187]
Saquil, Y.; Xu, Q. C.; Yang, Y. L.; Hall, P. Rank3DGAN: Semantic mesh generation using relative attributes. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence, 5586–5594, 2020.
[188]
Aigerman, N.; Lipman, Y. Hyperbolic orbifold tutte embeddings. ACM Transactions on Graphics Vol. 34, No. 6, Article No. 190, 2015.
[189]
Bruna, J.; Zaremba, W.; Szlam, A.; LeCun, Y. Spectral networks and locally connected networks on graphs. In: Proceedings of the International Conference on Learning Representations, 2014.
[190]
Atwood, J.; Towsley, D. Diffusion-convolutional neural networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, 2001–2009, 2016.
[191]
Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. In: Proceedings of the 30th International Conference on Neural Information Processing Systems, 3844–3852, 2016.
[192]
Qiao, Y. L.; Gao, L.; Yang, J.; Rosin, P. L.; Lai, Y. K.; Chen, X. L. Learning on 3D meshes with Laplacian encoding and pooling. IEEE Transactions on Visualization and Computer Graphics Vol. 28, No. 2, 1317–1327, 2022.
[193]
Feng, Y.; Feng, Y.; You, H.; Zhao, X.; Gao, Y. MeshNet: Mesh neural network for 3D shape representation. Proceedings of the AAAI Conference on Artificial Intelligence Vol. 33, No. 01, 8279–8286, 2019.
[194]
Liu, H. T. D.; Kim, V. G.; Chaudhuri, S.; Aigerman, N.; Jacobson, A. Neural subdivision. ACM Transactions on Graphics Vol. 39, No. 4, Article No. 124, 2020.
[195]
Hu, S. M.; Liang, D.; Yang, G. Y.; Yang, G. W.; Zhou, W. Y. Jittor: A novel deep learning framework with meta-operators and unified graph execution. Science China Information Sciences Vol. 63, No. 12, 222103, 2020.
[196]
He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, 2980–2988, 2017.
[197]
Gregor, K.; Danihelka, I.; Graves, A.; Rezende, D. J.; Wierstra, D. DRAW: A recurrent neural network for image generation. In: Proceedings of the International Conference on Machine Learning, 1462–1471, 2015.
[198]
Kato, H.; Ushiku, Y.; Harada, T. Neural 3D mesh renderer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 3907–3916, 2018.
[199]
Liu, S.; Li, T.; Chen, W.; Li, H. Soft rasterizer: A differentiable renderer for image-based 3D reasoning. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 7707–7716, 2019.
[200]
Pavlakos, G.; Choutas, V.; Ghorbani, N.; Bolkart, T.; Osman, A. A. A.; Tzionas, D.; Black, M. J. Expressive body capture: 3D hands, face, and body from a single image. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10967–10977, 2019.
[201]
Michalkiewicz, M.; Pontes, J. K.; Jack, D.; Baktashmotlagh, M.; Eriksson, A. Deep level sets: Implicit surface representations for 3D shape inference. arXiv preprint arXiv:1901.06802, 2019.
[202]
Chibane, J.; Mir, A.; Pons-Moll, G. Neural unsigned distance fields for implicit function learning. In: Proceedings of the 34th International Conference on Neural Information Processing Systems, 21638–21652, 2020.
[203]
Venkatesh, R.; Karmali, T.; Sharma, S.; Ghosh, A.; Babu, R. V.; Jeni, L. A.; Singh, M. Deep implicit surface point prediction networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 12633–12642, 2021.
[204]
Aumentado-Armstrong, T.; Tsogkas, S.; Dickinson, S.; Jepson, A. Representing 3D shapes with probabilistic directed distance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 19321–19332, 2022.
[205]
Radford, A.; Kim, J. W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In: Proceedings of the International Conference on Machine Learning, 8748–8763, 2021.
[206]
Schuster, M.; Paliwal, K. K. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing Vol. 45, No. 11, 2673–2681, 1997.
[207]
Cho, K.; van Merrienboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 1724–1734, 2014.
[208]
Gao, L.; Lai, Y. K.; Yang, J.; Zhang, L. X.; Xia, S. H.; Kobbelt, L. Sparse data driven mesh deformation. IEEE Transactions on Visualization and Computer Graphics Vol. 27, No. 3, 2085–2100, 2021.
[209]
Sedaghat, N.; Zolfaghari, M.; Amiri, E.; Brox, T. Orientation-boosted voxel nets for 3D object recognition. In: Proceedings of the British Machine Vision Conference, 97.1–97.13, 2017.
[210]
Xiang, Y.; Kim, W.; Chen, W.; Ji, J.; Choy, C.; Su, H.; Mottaghi, R.; Guibas, L.; Savarese, S. ObjectNet3D: A large scale database for 3D object recognition. In: Computer Vision – ECCV 2016. Lecture Notes in Computer Science, Vol. 9912. Leibe, B.; Matas, J.; Sebe, N.; Welling, M. Eds. Springer Cham, 160–176, 2016.
[211]
Song, S. R.; Yu, F.; Zeng, A.; Chang, A. X.; Savva, M.; Funkhouser, T. Semantic scene completion from a single depth image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 190–198, 2017.
[212]
Sun, X. Y.; Wu, J. J.; Zhang, X. M.; Zhang, Z. T.; Zhang, C. K.; Xue, T. F.; Tenenbaum, J. B.; Freeman, W. T. Pix3D: Dataset and methods for single-image 3D shape modeling. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2974–2983, 2018.
[213]
Mo, K.; Zhu, S.; Chang, A. X.; Yi, L.; Tripathi, S.; Guibas, L. J.; Su, H. PartNet: A large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 909–918, 2019.
[214]
Fu, H.; Jia, R. F.; Gao, L.; Gong, M. M.; Zhao, B. Q.; Maybank, S.; Tao, D. C. 3D-FUTURE: 3D furniture shape with TextURE. International Journal of Computer Vision Vol. 129, No. 12, 3313–3337, 2021.
[215]
Xiang, Y.; Mottaghi, R.; Savarese, S. Beyond PASCAL: A benchmark for 3D object detection in the wild. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 75–82, 2014.
[216]
Bronstein, A. M.; Bronstein, M. M.; Kimmel, R. Numerical Geometry of Non-Rigid Shapes. Springer New York, 2009.
[217]
Bogo, F.; Romero, J.; Loper, M.; Black, M. J. FAUST: Dataset and evaluation for 3D mesh registration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3794–3801, 2014.
[218]
Song, S. R.; Lichtenberg, S. P.; Xiao, J. X. SUN RGB-D: A RGB-D scene understanding benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 567–576, 2015.
[219]
Choi, S.; Zhou, Q. Y.; Miller, S.; Koltun, V. A large dataset of object scans. arXiv preprint arXiv:1602.02481, 2016.
[220]
Welinder, P.; Branson, S.; Mita, T.; Wah, C.; Schroff, F.; Belongie, S. J.; Perona, P. Caltech-UCSD birds 200. Computation & Neural Systems Technical Report, 2010-001. California Institute of Technology, 2010. Available at https://resolver.caltech.edu/CaltechAUTHORS:20111026-155425465.
[221]
Shen, T. C.; Gao, J.; Yin, K. X.; Liu, M. Y.; Fidler, S. Deep marching tetrahedra: A hybrid representation for high-resolution 3D shape synthesis. In: Proceedings of the 35th Conference on Neural Information Processing Systems, 6087–6101, 2021.
[222]
Yuan, Y. J.; Sun, Y. T.; Lai, Y. K.; Ma, Y. W.; Jia, R. F.; Gao, L. NeRF-editing: Geometry editing of neural radiance fields. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 18332–18343, 2022.
[223]
Liao, Y. Y.; Donné, S.; Geiger, A. Deep marching cubes: Learning explicit surface representations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2916–2925, 2018.
[224]
Zhang, S. H.; Zhang, S. K.; Xie, W. Y.; Luo, C. Y.; Yang, Y. L.; Fu, H. B. Fast 3D indoor scene synthesis by learning spatial relation priors of objects. IEEE Transactions on Visualization and Computer Graphics Vol. 28, No. 9, 3082–3092, 2022.
[225]
Qian, Y.; Hou, J.; Kwong, S.; He, Y. PUGeo-Net: A geometry-centric network for 3D point cloud upsampling. In: Computer Vision – ECCV 2020. Lecture Notes in Computer Science, Vol. 12364. Vedaldi, A.; Bischof, H.; Brox, T.; Frahm, J. M. Eds. Springer Cham, 752–769, 2020.
[226]
Liang, Y. Q.; Zhao, S. S.; Yu, B. S.; Zhang, J.; He, F. Z. MeshMAE: Masked autoencoders for 3D mesh data analysis. arXiv preprint arXiv:2207.10228, 2022.
Publication history

Received: 24 August 2022
Accepted: 26 October 2022
Published: 18 May 2023
Issue date: September 2023

Copyright

© The Author(s) 2023.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61902210) and RCUK grant CAMERA (Grant Nos. EP/M023281/1, EP/T022523/1).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.