Open Access

Artificial Intelligence for Metaverse: A Framework

Yuchen Guo¹, Tao Yu¹, Jiamin Wu¹,², Yuwang Wang¹, Sen Wan¹,², Jiyuan Zheng¹, Lu Fang¹,³, Qionghai Dai¹,²

1 Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China

2 Department of Automation, Tsinghua University, Beijing 100084, China

3 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Abstract

The metaverse has recently attracted considerable attention. It aims to build a virtual environment in which people can interact with the world and cooperate with each other. In this survey paper, we re-introduce the metaverse in a new framework based on a broad range of technologies: perception, which enables us to precisely capture the characteristics of the real world; computation, which supports the heavy computational demands of large-scale data; reconstruction, which builds the virtual world from the real one; cooperation, which facilitates long-distance communication and teamwork between users; and interaction, which bridges users and the virtual world. Despite the metaverse's popularity, the fundamental techniques in this framework are still immature, and new techniques are needed to facilitate its applications. In recent years, artificial intelligence (AI), especially deep learning, has shown promising results in empowering various areas, from science to industry. It is therefore natural to ask how AI can be combined with the framework to promote the development of the metaverse. In this survey, we present recent achievements of AI for the metaverse within the proposed framework, covering perception, computation, reconstruction, cooperation, and interaction. We also discuss directions for future work in which AI can contribute to the metaverse.

Keywords

artificial intelligence; metaverse; perception; computation; reconstruction; cooperation; interaction

References

1

N. Stephenson, Snow Crash, New York, NY, USA: Spectra, 2000.

2

A. Scavarelli, A. Arya, and R. J Teather, Virtual reality and augmented reality in social learning spaces: A literature review, Virtual Reality, vol. 25, no. 1, pp. 257–277, 2021.

Crossref Google Scholar

3

G. W. Wei, Protein structure prediction beyond alphafold, Nat. Mach. Intell., vol. 1, no. 8, pp. 336–337, 2019.

Crossref Google Scholar

4

S. H. M. Mehr, M. Craven, A. I. Leonov, G. Keenan, and L. Cronin, A universal system for digitization and automatic execution of the chemical synthesis literature, Science, vol. 370, no. 6512, pp. 101–108, 2020.

Crossref Google Scholar

5

A. Mirhoseini, A. Goldie, M. Yazgan, J. W. Jiang, E. Songhori, S. Wang, Y. J. Lee, E. Johnson, O. Pathak, A. Nazi, et al., A graph placement methodology for fast chip design, Nature, vol. 594, no. 7862, pp. 207–212, 2021.

Crossref Google Scholar

6

Q. Li, J. Pellegrino, D. J. Lee, A. A. Tran, H. A. Chaires, R Wang, J. E. Park, K. Ji, D. Chow, N. Zhang, et al., Synthetic group a streptogramin antibiotics that overcome vat resistance, Nature, vol. 586, no. 7827, pp. 145–150, 2020.

Crossref Google Scholar

7

J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, and F. Li, ImageNet: A large-scale hierarchical image database, in Proc. 2009 IEEE Conf. on Computer Vision and Pattern Recognition, Miami, FL, USA, 2009, pp. 248–255.

8

X. Wang, X. Zhang, Y. Zhu, Y. Guo, X. Yuan, L. Xiang, Z. Wang, G. Ding, D. Brady, Q. Dai, et al., PANDA: A gigapixel-level human-centric video dataset, in Proc. 2020 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 3268–3278.

9

X. Yuan, M. Ji, J. Wu, D. J. Brady, Q. Dai, and L. Fang, A modular hierarchical array camera, Light: Sci. Appl., vol. 10, no. 1, p. 37, 2021.

Crossref Google Scholar

10

X. Ding, Y. Guo, G. Ding, and J. Han, ACNet: Strengthening the kernel skeletons for powerful CNN via asymmetric convolution blocks, in Proc. 2019 IEEE/CVF Int. Conf. on Computer Vision, Seoul, Republic of Korea, 2019, pp. 1911–1920.

11

X. Ding, T. Hao, J. Tan, J. Liu, J. Han, Y. Guo, and G. Ding, ResRep: Lossless CNN pruning via decoupling remembering and forgetting, in Proc. 2021 IEEE/CVF Int. Conf. on Computer Vision, Montreal, Canada, 2021, pp. 4510–4520.

12

B. Zhang, Y. Guo, Y. Li, Y. He, H. Wang, and Q. Dai, Memory recall: A simple neural network training framework against catastrophic forgetting, IEEE Trans. Neural Netw. Learn. Syst., vol. 33, no. 5, pp. 2010–2022, 2022.

Crossref Google Scholar

13

X. Lin, Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan, All-optical machine learning using diffractive deep neural networks, Science, vol. 361, no. 6406, pp. 1004–1008, 2018.

Crossref Google Scholar

14

T. Zhou, X. Lin, J. Wu, Y. Chen, H. Xie, Y. Li, J. Fan, H. Wu, L. Fang, and Q. Dai, Large-scale neuromorphic optoelectronic computing with a reconfigurable diffractive processing unit, Nat. Photonics, vol. 15, no. 5, pp. 367–373, 2021.

Crossref Google Scholar

15

T. Yu, Z. Zheng, K. Guo, P. Liu, Q. Dai, and Y. Liu, Function4D: Real-time human volumetric capture from very sparse consumer RGBD sensors, in Proc. 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Nashville, TN, USA, 2021, pp. 5746–5756.

16

Z. Zheng, T. Yu, Q. Dai, and Y. Liu, Deep implicit templates for 3D shape representation, in Proc. 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Nashville, TN, USA, 2021, pp. 1429–1439.

17

L. I. Rudin, S. Osher, and E. Fatemi, Nonlinear total variation based noise removal algorithms, Phys. D:Nonlinear Phenom., vol. 60, no. 1-4, pp. 259–268, 1992.

Crossref Google Scholar

18

Y. Hitomi, J. Gu, M. Gupta, T. Mitsunaga, and S. K. Nayar, Video from a single coded exposure photograph using a learned over-complete dictionary, in Proc. 2011 Int. Conf. on Computer Vision, Barcelona, Spain, 2011, pp. 287–294.

19

X. Yuan, T. H. Tsai, R. Zhu, P. Llull, D. J. Brady, and L. Carin, Compressive hyperspectral imaging with side information, IEEE J. Sel. Top. Signal Process., vol. 9, no. 6, pp. 964–976, 2015.

Crossref Google Scholar

20

J. Yang, X. Yuan, X. Liao, P. Llull, D. J. Brady, G. Sapiro, and L. Carin, Video compressive sensing using Gaussian mixture models, IEEE Trans. Image Process., vol. 23, no. 11, pp. 4863–4878, 2014.

Crossref Google Scholar

21

J. Yang, X. Liao, X. Yuan, P. Llull, D. J. Brady, G. Sapiro, and L. Carin, Compressive sensing by learning a Gaussian mixture model from measurements, IEEE Trans. Image Process., vol. 24, no. 1, pp. 106–119, 2015.

Crossref Google Scholar

22

Y. Liu, X. Yuan, J. Suo, D. J. Brady, and Q. Dai, Rank minimization for snapshot compressive imaging, IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 12, pp. 2990–3006, 2019.

Crossref Google Scholar

23

X. Yuan, D. J. Brady, and A. K. Katsaggelos, Snapshot compressive imaging: Theory, algorithms, and applications, IEEE Signal Process. Mag., vol. 38, no. 2, pp. 65–88, 2021.

Crossref Google Scholar

24

J. Ma, X. Liu, Z. Shou, and X. Yuan, Deep tensor ADMM-Net for snapshot compressive imaging, in Proc. 2019 IEEE/CVF Int. Conf. on Computer Vision, Seoul, Republic of Korea, 2019, pp. 10222–10231.

25

M. Qiao, Z. Meng, J. Ma, and X. Yuan, Deep learning for video compressive sensing, APL Photonics, vol. 5, no. 3, p. 030801, 2020.

Crossref Google Scholar

26

X. Miao, X. Yuan, Y. Pu, and V. Athitsos, Lambda-net: Reconstruct hyperspectral images from a snapshot measurement, in Proc. 2019 IEEE/CVF Int. Conf. on Computer Vision, Seoul, Republic of Korea, 2019, pp. 4058–4068.

27

Z. Meng, J. Ma, and X. Yuan, End-to-end low cost compressive spectral imaging with spatial-spectral self-attention, in Proc. 16^th European Conf. on Computer Vision, Glasgow, UK, 2020, pp. 187–204.

28

T. Huang, W. Dong, X. Yuan, J. Wu, and G. Shi, Deep gaussian scale mixture prior for spectral compressive imaging, in Proc. 2021 IEEE Conf. on Computer Vision and Pattern Recognition, Nashville, TN, USA, 2021, pp. 16211–16220.

29

Y. Li, M. Qi, R. Gulve, M. Wei, R. Genov, K. N. Kutulakos, and W. Heidrich, End-to-end video compressive sensing using anderson-accelerated unrolled networks, in Proc. 2020 IEEE Int. Conf. on Computational Photography, St. Louis, MO, USA, 2020, pp. 1–12.

30

Z. Cheng, R. Lu, Z. Wang, H. Zhang, B. Chen, Z. Meng, and X. Yuan, BIRNAT: Bidirectional recurrent neural networks with adversarial training for video snapshot compressive imaging, in Proc. 16^th European Conf. on Computer Vision, Glasgow, UK, 2020, pp. 258–275.

31

S. Zheng, C. Wang, X. Yuan, and H. Xin, Super-compression of large electron microscopy time series by deep compressive sensing learning, Patterns, vol. 2, no. 7, p. 100292, 2021.

Crossref Google Scholar

32

M. Qiao, X. Liu, and X. Yuan, Snapshot temporal compressive microscopy using an iterative algorithm with untrained neural networks, Opt. Lett., vol. 46, no. 8, pp. 1888–1891, 2021.

Crossref Google Scholar

33

Z. Meng, Z. Yu, K. Xu, and X. Yuan, Self-supervised neural networks for spectral snapshot compressive imaging, in Proc. 2021 IEEE/CVF Int. Conf. on Computer Vision, Montreal, Canada, 2021, pp. 2602–2611.

34

Z. Cheng, B. Chen, G. Liu, H. Zhang, R. Lu, Z. Wang, and X. Yuan, Memory-efficient network for large-scale video compressive sensing, in Proc. 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Nashville, TN, USA, 2021, pp. 16241–16250.

35

Z. Wang, H. Zhang, Z. Cheng, B. Chen, and X. Yuan, MetaSCI: Scalable and adaptive reconstruction for video compressive sensing, in Proc. 2021 IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Nashville, TN, USA, 2021, pp. 2083–2092.

36

J. Chang and G. Wetzstein, Deep optics for monocular depth estimation and 3D object detection, in Proc. 2019 IEEE/CVF Int. Conf. on Computer Vision, Seoul, Republic of Korea, 2019, pp. 10193–10202.

37

V. Sitzmann, S. Diamond, Y. Peng, X. Dun, S. Boyd, W. Heidrich, F. Heide, and G. Wetzstein, End-to-end optimization of optics and image processing for achromatic extended depth of field and super-resolution imaging, ACM Trans. Graphics, vol. 37, no. 4, p. 114, 2018.

Crossref Google Scholar

38

L. Wang, T. Zhang, Y. Fu, and H. Huang, HyperReconNet: Joint coded aperture optimization and image reconstruction for compressive hyperspectral imaging, IEEE Trans. Image Process., vol. 28, no. 5, pp. 2257–2270, 2019.

Crossref Google Scholar

39

Y. Inagaki, Y. Kobayashi, K. Takahashi, T. Fujii, and H. Nagahara, Learning to capture light fields through a coded aperture camera, in Proc. 15^th European Conf. on Computer Vision, Munich, Germany, 2018, pp. 418–434.

40

U. Akpinar, E. Sahin, and A. Gotchev, Learning optimal phase-coded aperture for depth of field extension, in Proc. 2019 IEEE Int. Conf. on Image Processing, Taipei, China, 2019, pp. 4315–4319.

41

J. Zhang, C. Zhao, and W. Gao, Optimization-inspired compact deep compressive sensing, IEEE J. Sel. Top. Signal Process., vol. 14, no. 4, pp. 765–774, 2020.

Crossref Google Scholar

42

J. W. Han, J. H. Kim, H. T. Lee, and S. J. Ko, A novel training based auto-focus for mobile-phone cameras, IEEE Trans. Consum. Electron., vol. 57, no. 1, pp. 232–238, 2011.

Crossref Google Scholar

43

P. A. Shedligeri, S. Mohan, and K. Mitra, Data driven coded aperture design for depth recovery, in Proc. 2017 IEEE Int. Conf. on Image Processing, Beijing, China, 2017, pp. 56–60.

44

M. Gupta, A. Agrawal, A. Veeraraghavan, and S. G. Narasimhan, Flexible voxels for motion-aware videography, in Proc. 11^th European Conf. on Computer Vision, Heraklion, Greece, 2010, pp. 100–114.

45

Y. S. Rawat and M. S. Kankanhalli, Context-aware photography learning for smart mobile devices, ACM Trans. Multimedia Comput. , Commun. , Appl., vol. 12, no. 1s, p. 19, 2015.

Crossref Google Scholar

46

Y. S. Rawat and M. S. Kankanhalli, ClickSmart: A context-aware viewpoint recommendation system for mobile photography, IEEE Trans. Circuits Syst. Video Technol., vol. 27, no. 1, pp. 149–158, 2017.

Crossref Google Scholar

47

C. Wang, Q. Fu, X. Dun, and W. Heidrich, Megapixel adaptive optics: Towards correcting large-scale distortions in computational cameras, ACM Trans. Graphics, vol. 37, no. 4, p. 115, 2018.

Crossref Google Scholar

48

S. Rao, K. Y. Ni, and Y. Owechko, Context and task-aware knowledge-enhanced compressive imaging, in Proc. SPIE 8877, Unconventional Imaging and Wavefront Sensing 2013, San Diego, CA, USA, 2013, p. 88770E.

49

A. Ashok, P. K. Baheti, and M. A. Neifeld, Compressive imaging system design using task-specific information, Appl. Opt., vol. 47, no. 25, pp. 4457–4471, 2008.

Crossref Google Scholar

50

F. Zhou and Y. Chai, Near-sensor and in-sensor computing, Nat. Electron., vol. 3, no. 11, pp. 664–671, 2020.

Crossref Google Scholar

51

Y. Chai, In-sensor computing for machine vision, Nature, vol. 579, no. 7797, pp. 32–33, 2020.

Crossref Google Scholar

52

M. A. Zidan, J. P. Strachan, and W. D. Lu, The future of electronics based on memristive systems, Nat. Electron., vol. 1, no. 1, pp. 22–29, 2018.

Crossref Google Scholar

53

Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, and O. Temam, ShiDianNao: Shifting vision processing closer to the sensor, in Proc. ACM/IEEE 42^nd Annu. Int. Symp. on Computer Architecture (ISCA), Portland, OR, USA, 2015, pp. 92–104.

54

R. LiKamWa, Y. Hou, Y. Gao, M. Polansky, and L. Zhong, RedEye: Analog convnet image sensor architecture for continuous mobile vision, in Proc. ACM/IEEE 43^rd Annu. Int. Symp. on Computer Architecture (ISCA), Seoul, Republic of Korea, 2016, pp. 255–266.

55

L. Mennel, J. Symonowicz, S. Wachter, D. K. Polyushkin, A. J. Molina-Mendoza, and T. Mueller, Ultrafast machine vision with 2D material neural network image sensors, Nature, vol. 579, no. 7797, pp. 62–66, 2020.

Crossref Google Scholar

56

W. Wang, G. Pedretti, V. Milo, R. Carboni, A. Calderoni, N. Ramaswamy, A. S. Spinelli, and D. Ielmini, Learning of spatiotemporal patterns in a spiking neural network with resistive switching synapses, Sci. Adv., vol. 4, no. 9, p. eaat4752, 2018.

Crossref Google Scholar

57

U. A. Butt, M. Mehmood, S. B. H. Shah, R. Amin, M. W. Shaukat, S. M. Raza, D. Y. Suh, and M. J. Piran, A review of machine learning algorithms for cloud computing security, Electronics, vol. 9, no. 9, p. 1379, 2020.

Crossref Google Scholar

58

H. M. Said, I. El Emary, B. A. Alyoubi, and A. A. Alyoubi, Application of intelligent data mining approach in securing the cloud computing, Int. J. Adv. Comput. Sci. Appl. , vol. 7, no. 9, 2016,doi: 10.14569/IJACSA.2016.070921.

59

X. Yuan, C. Li, and X. Li, DeepDefense: Identifying DDoS attack via deep learning, in Proc. 2017 IEEE Int. Conf. on Smart Computing (SMARTCOMP), Hong Kong, China, 2017, pp. 1–8.

60

A. A. Grusho, M. I. Zabezhailo, A. A. Zatsarinnyi, and V. O. Piskovskii, On some artificial intelligence methods and technologies for cloud-computing protection, Autom. Doc. Math. Linguist., vol. 51, no. 2, pp. 62–74, 2017.

Crossref Google Scholar

61

H. M. El-Boghdadi and R. A. Ramadan, Resource scheduling for offline cloud computing using deep reinforcement learning, Int. J. Comput. Sci. Netw. Secur., vol. 19, no. 4, pp. 54–60, 2019.

62

J. Gao, Machine learning applications for data center optimization,https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/42542.pdf, 2014.

63

C. Coleman, D. Narayanan, D. Kang, T. Zhao, J. Zhang, L. Nardi, P. Bailis, K. Olukotun, C. Ré, and M. A. Zaharia, DAWNBench: An end-to-end deep learning benchmark and competition,https://cs.stanford.edu/~deepakn/assets/papers/dawnbench-sysml18.pdf, 2017.

64

J. Lin, R. Men, A. Yang, C. Zhou, M. Ding, Y. Zhang, P. Wang, A. Wang, L. Jiang, X. Jia, et al., M6: A Chinese multimodal pretrainer, arXiv preprint arXiv: 2103.00823, 2021.

65

J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, arXiv preprint arXiv: 1810.04805, 2019.

66

L. Floridi and M. Chiriatti, GPT-3: Its nature, scope, limits, and consequences, Minds Mach., vol. 30, no. 4, pp. 681–694, 2020.

Crossref Google Scholar

67

S. Yuan, H. Zhao, Z. Du, M. Ding, X. Liu, Y. Cen, X. Zou, Z. Yang, and J. Tang, WuDaoCorpora: A super large-scale Chinese corpora for pre-training language models, AI Open, vol. 2, pp. 65–68, 2021.

Crossref Google Scholar

68

J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, et al., Highly accurate protein structure prediction with AlphaFold, Nature, vol. 596, no. 7873, pp. 583–589, 2021.

Crossref Google Scholar

69

P. Li, J. Li, Z. Huang, T. Li, C. Gao, S. M. Yiu, and K. Chen, Multi-key privacy-preserving deep learning in cloud computing, Future Gener. Comput. Syst., vol. 74, pp. 76–85, 2017.

Crossref Google Scholar

70

Y. Huang, Z. Song, K. Li, and S. Arora, InstaHide: Instance-hiding schemes for private distributed learning, in Proc. 37^th Int. Conf. on Machine Learning, 2020, pp. 4507–4518.

71

Y. Li, H. Li, G. Xu, T. Xiang, X. Huang, and R. Lu, Toward secure and privacy-preserving distributed deep learning in fog-cloud computing, IEEE Internet Things J., vol. 7, no. 12, pp. 11460–11472, 2020.

Crossref Google Scholar

72

T. Li, A. K. Sahu, A. S. Talwalkar, and V. Smith, Federated learning: Challenges, methods, and future directions, IEEE Signal Process. Mag., vol. 37, no. 3, pp. 50–60, 2020.

Crossref Google Scholar

73

J. McCarthy, Generality in artificial intelligence, Commun. ACM, vol. 30, no. 12, pp. 1030–1035, 1987.

Crossref Google Scholar

74

The HEP Software Foundation, J. Albrecht, A. A. Alves Jr, G. Amadio, G. Andronico, N. Anh-Ky, L. Aphecetche, J. Apostolakis, M. Asai, L. Atzori, et al., A roadmap for HEP software and computing R&D for the 2020s,Comput. Softw. Big Sci., vol. 3, no. 1, p. 7, 2019.

Crossref Google Scholar

75

S. Srinivas and R. V. Babu, Data-free parameter pruning for deep neural networks, in Proc. 2015 British Machine Vision Conf., Swansea, UK, 2015, pp. 31.1–31.12.

76

S. Han, J. Pool, J. Tran, and W. J. Dally, Learning both weights and connections for efficient neural networks, in Proc. 28^th Int. Conf. on Neural Information Processing Systems, Montreal, Canada, 2015, pp. 1135–1143.

77

W. Chen, J. T. Wilson, S. Tyree, K. Q. Weinberger, and Y. Chen, Compressing neural networks with the hashing trick, in Proc. 32^nd Int. Conf. on Machine Learning, Lille, France, 2015, pp. 2285–2294.

78

A. Rasmus, H. Valpola, M. Honkala, M. Berglund, and T. Raiko, Semi-supervised learning with ladder networks, in Proc. 28^th Int. Conf. on Neural Information Processing Systems, Montreal, Canada, 2015, pp. 3546–3554.

79

Y. Gong, L. Liu, M. Yang, and L. Bourdev, Compressing deep convolutional networks using vector quantization, arXiv preprint arXiv: 1412.6115, 2014.

80

J. Wu, C. Leng, Y. Wang, Q. Hu, and J. Cheng, Quantized convolutional neural networks for mobile devices, in Proc. 2016 IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 4820–4828.

81

V. Vanhoucke, A. Senior, and M. Z. Mao, Improving the speed of neural networks on CPUs, http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/37631.pdf, 2011.

82

R. Rigamonti, A. Sironi, V. Lepetit, and P. Fua, Learning separable filters, in Proc. 2013 IEEE Conf. on Computer Vision and Pattern Recognition, Portland, OR, USA, 2013, pp. 2754–2761.

83

E. Denton, W. Zaremba, J. Bruna, Y. LeCun, and R. Fergus, Exploiting linear structure within convolutional networks for efficient evaluation, in Proc. 27^th Int. Conf. on Neural Information Processing Systems, Montreal, Canada, 2014, pp. 1269–1277.

84

V. Lebedev, Y. Ganin, M. Rakhuba, I. Oseledets, and V. Lempitsky, Speeding-up convolutional neural networks using fine-tuned CP-decomposition, in Proc. 3^rd Int. Conf. on Learning Representations, San Diego, CA, USA, 2015.

85

S. Zhai, Y. Cheng, W. Lu, and Z. Zhang, Doubly convolutional neural networks, in Proc. 30^th Int. Conf. on Neural Information Processing Systems, Barcelona, Spain, 2016, pp. 1090–1098.

86

W. Shang, K. Sohn, D. Almeida, and H. Lee, Understanding and improving convolutional neural networks via concatenated rectified linear units, in Proc. 33^rd Int. Conf. on Machine Learning, New York City, NY, USA, 2016, pp. 2217–2225.

87

T. Cohen and M. Welling, Group equivariant convolutional networks, in Proc. 33^rd Int. Conf. on Machine Learning, New York, NY, USA, 2016, pp. 2990–2999.

88

L. J. Ba and R. Caruana, Do deep nets really need to be deep? in Proc. 27^th Int. Conf. on Neural Information Processing Systems, Montreal, Canada, 2014, pp. 2654–2662.

89

G. Hinton, O. Vinyals, and J. Dean, Distilling the knowledge in a neural network, arXiv preprint arXiv: 1503.02531, 2015.

90

A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio, FitNets: Hints for thin deep nets, in Proc. 3^rd Int. Conf. on Learning Representations, San Diego, CA, USA, 2014.

91

X. Jiang, H. Wang, Y. Chen, Z. Wu, L. Wang, B. Zou, Y. Yang, Z. Cui, Y. Cai, T. Yu, et al., MNN: A universal and efficient inference engine, in Proc. Machine Learning and Systems 2020, Austin, TX, USA, 2020.

92

Tencent/ncnn, https://github.com/Tencent/ncnn, 2017.

93

T. Chen, T. Moreau, Z. Jiang, L. Zheng, E. Yan, H. Shen, M. Cowan, L. Wang, Y. Hu, L. Ceze, et al., TVM: An automated end-to-end optimizing compiler for deep learning, in Proc. 13^th USENIX Symp. on Operating Systems Design and Implementation, Carlsbad, CA, USA, 2018, pp. 578–594.

94

Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol. 521, no. 7553, pp. 436–444, 2015.

Crossref Google Scholar

95

K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proc. 2016 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 770–778.

96

V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. A. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., Human-level control through deep reinforcement learning, Nature, vol. 518, no. 7540, pp. 529–533, 2015.

Crossref Google Scholar

97

A. Gholami, Z. Yao, S. Kim, M. W. Mahoney, and K. Keutzer, AI and memory wall, https://medium.com/riselab/ai-and-memory-wall-2cb4265cb0b8, 2021.

98

Y. Ma, Y. Du, L. Du, J. Lin, and Z. Wang, In-memory computing: The next-generation AI computing paradigm, in Proc. 2020 on Great Lakes Symp. on VLSI, China, 2020, pp. 265–270.

99

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

Crossref Google Scholar

100

J. Zhang, Z. Wang, and N. Verma, In-memory computation of a machine-learning classifier in a standard 6T SRAM array, IEEE J. Solid-State Circuits, vol. 52, no. 4, pp. 915–924, 2017.

Crossref Google Scholar

101

A. Biswas and A. P. Chandrakasan, CONV-SRAM: An energy-efficient SRAM with in-memory dot-product computation for low-power convolutional neural networks, IEEE J. Solid-State Circuits, vol. 54, no. 1, pp. 217–230, 2019.

Crossref Google Scholar

102

T. Yoo, H. Kim, Q. Chen, T. T. H. Kim, and B. Kim, A logic compatible 4T dual embedded DRAM array for in-memory computation of deep neural networks, in Proc. 2019 IEEE/ACM Int. Symp. on Low Power Electronics and Design (ISLPED), Lausanne, Switzerland, 2019, pp. 1–6.

103

F. Merrikh Bayat, X. Guo, M. Klachko, N. Do, K. Likharev, and D. Strukov, Model-based high-precision tuning of NOR flash memory cells for analog computing applications, in Proc. 74^th Annu. Device Research Conf. (DRC), Newark, DE, USA, 2016, pp. 1–2.

104

J. F. Kang, P. Huang, R. Z. Han, Y. C. Xiang, X. L. Cui, and X. Y. Liu, Flash-based computing in-memory scheme for IOT, in Proc. IEEE 13^th Int. Conf. on ASIC (ASICON), Chongqing, China, 2019, pp. 1–4.

105

G. W. Burr, B. N. Kurdi, J. C. Scott, C. H. Lam, K. Gopalakrishnan, and R. S. Shenoy, Overview of candidate device technologies for storage-class memory, IBM J. Res. Dev, vol. 52, no. 4-5, pp. 449–464, 2008.

Crossref Google Scholar

106

E. Chen, D. Apalkov, Z. Diao, A. Driskill-Smith, D. Druist, D. Lottis, V. Nikitin, X. Tang, S. Watts, S. Wang, et al., Advances and future prospects of spin-transfer torque random access memory, IEEE Trans. Magn., vol. 46, no. 6, pp. 1873–1878, 2010.

Crossref Google Scholar

107

T. S. Moise, S. R. Summerfelt, H. McAdams, S. Aggarwal, K. R. Udayakumar, F. G. Celii, J. S. Martin, G. Xing, L. Hall, K. J. Taylor, et al. , Demonstration of a 4 MB, high density ferroelectric memory embedded within a 130 nm, 5 LM Cu/FSG logic process, in Int. Electron Devices Meeting, San Francisco, CA, USA, 2002, pp. 535–538.

108

S. J. Ahn, Y. J. Song, C. W. Jeong, J. M. Shin, Y. Fai, Y. N. Hwang, S. H. Lee, K. C. Ryoo, S. Y. Lee, J. H. Park, et al. , Highly manufacturable high density phase change memory of 64Mb and beyond, in Proc. 2004 IEEE Int. Electron Devices Meeting, San Francisco, CA, USA, 2004, pp. 907–910.

109

V. Joshi, M. Le Gallo, S. Haefeli, I. Boybat, S. R. Nandakumar, C. Piveteau, M. Dazzi, B. Rajendran, A. Sebastian, and E. Eleftheriou, Accurate deep neural network inference using computational phase-change memory, Nat. Commun., vol. 11, no. 1, p. 2473, 2020.

Crossref Google Scholar

110

P. Chi, S. Li, C. Xu, T. Zhang, J. Zhao, Y. Liu, Y. Wang, and Y. Xie, PRIME: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory, in Proc. ACM/IEEE 43^rd Ann. Int. Symp. on Computer Architecture, Seoul, Republic of Korea, 2016, pp. 27–39.

111

H. Wu and Q. Dai, Artificial intelligence accelerated by light, Nature, vol. 589, no. 7840, pp. 25–26, 2021.

Crossref Google Scholar

112

Y. Shen, N. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, et al., Deep learning with coherent nanophotonic circuits, Nat. Photonics, vol. 11, no. 7, pp. 441–446, 2017.

Crossref Google Scholar

113

J. Feldmann, N. Youngblood, C. D. Wright, H. Bhaskaran, and W. H. P. Pernice, All-optical spiking neurosynaptic networks with self-learning capabilities, Nature, vol. 569, no. 7755, pp. 208–214, 2019.

Crossref Google Scholar

114

J. Feldmann, N. Youngblood, M. Karpov, H. Gehring, X. Li, M. Stappers, M. Le Gallo, X. Fu, A. Lukashchuk, A. S. Raja, et al., Parallel convolutional processing using an integrated photonic tensor core, Nature, vol. 589, no. 7840, pp. 52–58, 2021.

Crossref Google Scholar

115

X. Xu, M. Tan, B. Corcoran, J. Wu, A. Boes, T. G. Nguyen, S. T. Chu, B. E. Little, D. G. Hicks, R. Morandotti, et al., 11 tops photonic convolutional accelerator for optical neural networks, Nature, vol. 589, no. 7840, pp. 44–51, 2021.

Crossref Google Scholar

116

E. Khoram, A. Chen, D. Liu, L. Ying, Q. Wang, M. Yuan, and Z. Yu, Nanophotonic media for artificial neural inference, Photonics Res., vol. 7, no. 8, pp. 823–827, 2019.

Crossref Google Scholar

117

M. Grieves, Digital twin: Manufacturing excellence through virtual factory replication, 03 White paper, https://www.researchgate.net/publication/275211047_Digital_Twin_Manufacturing_Excellence_through_Virtual_Factory_Replication, 2014.

118

E. Glaessgen and D. Stargel, The digital twin paradigm for future NASA and U. S. air force vehicles, in Proc. 53^rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and materials Conf. , Honolulu, HI, USA, 2012, p. 2012-1818.

119

F. Tao, W. Liu, J. Liu, X. Liu, Q. Liu, T. Qu, T. Hu, Z. Zhang, F. Xiang, W. Xu, et al., Digital twin and its potential application exploration, Comput. Integr. Manuf. Syst., vol. 24, no. 1, pp. 1–8, 2018.

120

X. F. Han, H. Laga, and M. Bennamoun, Image-based 3D object reconstruction: State-of-the-art and trends in the deep learning era, IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 5, pp. 1578–1604, 2021.

Crossref Google Scholar

121

M. Zollhöfer, P. Stotko, A. Görlitz, C. Theobalt, M. Nießner, R. Klein, and A. Kolb, State of the art on 3D reconstruction with RGB-D cameras, Comput. Graphics Forum, vol. 37, no. 2, pp. 625–652, 2018.

Crossref Google Scholar

122

M. Zollhöfer, J. Thies, P. Garrido, D. Bradley, T. Beeler, P. Pérez, M. Stamminger, M. Nießner, and C. Theobalt, State of the art on monocular 3D face reconstruction, tracking, and applications, Comput. Graphics Forum, vol. 37, no. 2, pp. 523–550, 2018.

Crossref Google Scholar

123

M. Dahnert, J. Hou, M. Nießner, and A. Dai, Panoptic 3D scene reconstruction from a single RGB image, in Proc. 35^th Conf. on Neural Information Processing Systems, 2021, pp. 8282–8293.

124

S. Kumar, Y. Dai, and H. Li, Superpixel soup: Monocular dense 3D reconstruction of a complex dynamic scene, IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 5, pp. 1705–1717, 2021.

Crossref Google Scholar

125

J. N. P. Martel, D. B. Lindell, C. Z. Lin, E. R. Chan, M. Monteiro, and G. Wetzstein, Acorn: Adaptive coordinate networks for neural scene representation, ACM Trans. Graphics, vol. 40, no. 4, p. 58, 2021.

Crossref Google Scholar

126

B. Curless and M. Levoy, A volumetric method for building complex models from range images, in Proc. 23^rd Annu. Conf. on Computer Graphics and Interactive Techniques, New Orleans, LA, USA, 1996, pp. 303–312.

127

R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohi, J. Shotton, S. Hodges, and A. Fitzgibbon, KinectFusion: Real-time dense surface mapping and tracking, in Proc. 10^th IEEE Int. Symp. on Mixed and Augmented Reality, Basel, Switzerland, 2011, pp. 127–136.

128

J. J. Park, P. Florence, J. Straub, R. Newcombe, and S. Lovegrove, DeepSDF: Learning continuous signed distance functions for shape representation, in Proc. 2019 IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 165–174.

129

L. Mescheder, M. Oechsle, M. Niemeyer, S. Nowozin, and A. Geiger, Occupancy networks: Learning 3D reconstruction in function space, in Proc. 2019 IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 2019, pp. 4455–4465.

130

B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, NeRF: Representing scenes as neural radiance fields for view synthesis, Commun. ACM, vol. 65, no. 1, pp. 99–106, 2022.

Crossref Google Scholar

131

S. Peng, M. Niemeyer, L. M. Mescheder, M. Pollefeys, and A. Geiger, Convolutional occupancy networks, in Proc. 16^th European Conf. on Computer Vision (ECCV), Glasgow, UK, 2020, pp. 523–540.

132

A. Tewari, O. Fried, J. Thies, V. Sitzmann, S. Lombardi, K. Sunkavalli, R. Martin-Brualla, T. Simon, J. Saragih, M. Nießner, et al., State of the art on neural rendering, Comput. Graphics Forum, vol. 39, no. 2, pp. 701–727, 2020.

Crossref Google Scholar

133

S. M. Ali Eslami, D. J. Rezende, F. Besse, F. Viola, A. S. Morcos, M. Garnelo, A. Ruderman, A. A. Rusu, I. Danihelka, K. Gregor, et al., Neural scene representation and rendering, Science, vol. 360, no. 6394, pp. 1204–1210, 2018.

Crossref Google Scholar

134

2022 Global networking trends report, https://www.cisco.com/c/en/us/solutions/enterprise-networks/2022-networking-report-preview.html, 2021.

135

X. Chen, C. Wu, T. Chen, H. Zhang, Z. Liu, Y. Zhang, and M. Bennis, Age of information aware radio resource management in vehicular networks: A proactive deep reinforcement learning perspective, IEEE Trans. Wireless Commun., vol. 19, no. 4, pp. 2268–2281, 2020.

Crossref Google Scholar

136

L. Li, Y. Xu, J. Yin, W. Liang, X. Li, W. Chen, and Z. Han, Deep reinforcement learning approaches for content caching in cache-enabled D2D networks, IEEE Internet Things J., vol. 7, no. 1, pp. 544–557, 2020.

Crossref Google Scholar

137

L. T. Tan and R. Q. Hu, Mobility-aware edge caching and computing in vehicle networks: A deep reinforcement learning, IEEE Trans. Veh. Technol., vol. 67, no. 11, pp. 10190–10203, 2018.

138

M. Yan, G. Feng, J. Zhou, Y. Sun, and Y. C. Liang, Intelligent resource scheduling for 5G radio access network slicing, IEEE Trans. Veh. Technol., vol. 68, no. 8, pp. 7691–7703, 2019.

139

D. Bega, M. Gramaglia, M. Fiore, A. Banchs, and X. Costa-Perez, DeepCog: Cognitive network management in sliced 5G networks with deep learning, in Proc. 2019 IEEE Conf. on Computer Communications, Paris, France, 2019, pp. 280–288.

140

H. Li, K. Ota, and M. Dong, Learning IoT in edge: Deep learning for the internet of things with edge computing, IEEE Netw., vol. 32, no. 1, pp. 96–101, 2018.

141

D. Ravì, C. Wong, B. Lo, and G. Z. Yang, A deep learning approach to on-node sensor data analytics for mobile or wearable devices, IEEE J. Biomed. Health Inform., vol. 21, no. 1, pp. 56–64, 2017.

142

H. Ye, G. Y. Li, and B. H. Juang, Power of deep learning for channel estimation and signal detection in OFDM systems, IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114–117, 2018.

143

C. K. Wen, W. T. Shih, and S. Jin, Deep learning for massive MIMO CSI feedback, IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 748–751, 2018.

144

X. Ma, Z. Gao, F. Gao, and M. Di Renzo, Model-driven deep learning based channel estimation and feedback for millimeter-wave massive hybrid MIMO systems, IEEE J. Sel. Areas Commun., vol. 39, no. 8, pp. 2388–2406, 2021.

145

R. Q. Shaddad, E. M. Saif, H. M. Saif, Z. Y. Mohammed, and A. H. Farhan, Channel estimation for intelligent reflecting surface in 6G wireless network via deep learning technique, in Proc. 1st Int. Conf. on Emerging Smart Technologies and Applications (eSmarTA), Sana'a, Yemen, 2021, pp. 1–5.

146

T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, On deep learning-based channel decoding, in Proc. 51st Annu. Conf. on Information Sciences and Systems (CISS), Baltimore, MD, USA, 2017, pp. 1–6.

147

F. Liang, C. Shen, and F. Wu, An iterative BP-CNN architecture for channel decoding, IEEE J. Sel. Top. Signal Process., vol. 12, no. 1, pp. 144–159, 2018.

148

H. Xie, Z. Qin, G. Y. Li, and B. H. Juang, Deep learning enabled semantic communication systems, IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021.

149

Z. Weng and Z. Qin, Semantic communication systems for speech transmission, IEEE J. Sel. Areas Commun., vol. 39, no. 8, pp. 2434–2444, 2021.

150

S. Nakamoto, Bitcoin: A peer-to-peer electronic cash system, https://bitcoin.org/bitcoin.pdf, 2008.

151

A. Maxmen, AI researchers embrace bitcoin technology to share medical data, Nature, vol. 555, no. 7696, pp. 293–294, 2018.

152

K. Salah, M. H. Ur Rehman, N. Nizamuddin, and A. Al-Fuqaha, Blockchain for AI: Review and open research challenges, IEEE Access, vol. 7, pp. 10127–10149, 2019.

153

Y. Rizk, M. Awad, and E. W. Tunstel, Decision making in multiagent systems: A survey, IEEE Trans. Cognit. Dev. Syst., vol. 10, no. 3, pp. 514–529, 2018.

154

N. Dhieb, H. Ghazzai, H. Besbes, and Y. Massoud, A secure AI-driven architecture for automated insurance systems: Fraud detection and risk measurement, IEEE Access, vol. 8, pp. 58546–58558, 2020.

155

Z. Zhang, H. Ning, F. Shi, F. Farha, Y. Xu, J. Xu, F. Zhang, and K. K. R. Choo, Artificial intelligence in cyber security: Research advances, challenges, and opportunities, Artif. Intell. Rev., vol. 55, no. 2, pp. 1029–1053, 2022.

156

Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, and C. Wang, Machine learning and deep learning methods for cybersecurity, IEEE Access, vol. 6, pp. 35365–35381, 2018.

157

X. Lu, L. Xiao, T. Xu, Y. Zhao, Y. Tang, and W. Zhuang, Reinforcement learning based PHY authentication for VANETs, IEEE Trans. Veh. Technol., vol. 69, no. 3, pp. 3068–3079, 2020.

158

H. Bao, H. He, Z. Liu, and Z. Liu, Research on information security situation awareness system based on big data and artificial intelligence technology, in Proc. 2019 Int. Conf. on Robots & Intelligent System (ICRIS), Haikou, China, 2019, pp. 318–322.

159

Y. Zhang, X. Chen, D. Guo, M. Song, Y. Teng, and X. Wang, PCCN: Parallel cross convolutional neural network for abnormal network traffic flows detection in multi-class imbalanced network traffic flows, IEEE Access, vol. 7, pp. 119904–119916, 2019.

160

J. J. Vidal, Toward direct brain-computer communication, Annu. Rev. Biophys. Bioeng., vol. 2, pp. 157–180, 1973.

161

H. Zhang, M. Zhao, C. Wei, D. Mantini, Z. Li, and Q. Liu, EEGdenoiseNet: A benchmark dataset for deep learning solutions of EEG denoising, J. Neural Eng., vol. 18, no. 5, p. 056057, 2021.

162

W. Sun, Y. Su, X. Wu, and X. Wu, A novel end-to-end 1D-ResCNN model to remove artifact from EEG signals, Neurocomputing, vol. 404, pp. 108–121, 2020.

163

N. M. N. Leite, E. T. Pereira, E. C. Gurjão, and L. R. Veloso, Deep convolutional autoencoder for EEG noise filtering, in Proc. 2018 IEEE Int. Conf. on Bioinformatics and Biomedicine (BIBM), Madrid, Spain, 2018, pp. 2605–2612.

164

F. R. Willett, D. T. Avansino, L. R. Hochberg, J. M. Henderson, and K. V. Shenoy, High-performance brain-to-text communication via handwriting, Nature, vol. 593, no. 7858, pp. 249–254, 2021.

165

D. A. Moses, M. K. Leonard, J. G. Makin, and E. F. Chang, Real-time decoding of question-and-answer speech dialogue using human cortical activity, Nat. Commun., vol. 10, no. 1, p. 3096, 2019.

166

M. Capogrosso, T. Milekovic, D. Borton, F. Wagner, E. M. Moraud, J. B. Mignardot, N. Buse, J. Gandar, Q. Barraud, D. Xing, et al., A brain-spine interface alleviating gait deficits after spinal cord injury in primates, Nature, vol. 539, no. 7628, pp. 284–288, 2016.

167

K. W. Scangos, A. N. Khambhati, P. M. Daly, G. S. Makhoul, L. P. Sugrue, H. Zamanian, T. X. Liu, V. R. Rao, K. K. Sellers, H. E. Dawes, et al., Closed-loop neuromodulation in an individual with treatment-resistant depression, Nat. Med., vol. 27, no. 10, pp. 1696–1700, 2021.

168

O. Rudovic, M. Zhang, B. Schuller, and R. W. Picard, Multi-modal active learning from human data: A deep reinforcement learning approach, in Proc. 2019 Int. Conf. on Multimodal Interaction, Suzhou, China, 2019, pp. 6–15.

169

J. Gao, P. Li, Z. Chen, and J. Zhang, A survey on deep learning for multimodal data fusion, Neural Comput., vol. 32, no. 5, pp. 829–864, 2020.

170

L. Shi, B. Li, C. Kim, P. Kellnhofer, and W. Matusik, Towards real-time photorealistic 3D holography with deep neural networks, Nature, vol. 591, no. 7849, pp. 234–239, 2021.

171

Y. Takahashi, S. Murata, H. Idei, H. Tomita, and Y. Yamashita, Neural network modeling of altered facial expression recognition in autism spectrum disorders based on predictive processing framework, Sci. Rep., vol. 11, no. 1, p. 14684, 2021.

CAAI Artificial Intelligence Research

Volume 1, Issue 1, September 2022

Pages 54-67

DOI: 10.26599/AIR.2022.9150004

Cite this article:

Guo Y, Yu T, Wu J, et al. Artificial Intelligence for Metaverse: A Framework. CAAI Artificial Intelligence Research, 2022, 1(1): 54-67. https://doi.org/10.26599/AIR.2022.9150004
