
Portable Perceptron Network-Based Fast Mode Decision for Video-Based Point Cloud Compression

Shicheng Que, Yue Li (corresponding author)
College of Computer Science, University of South China, Hengyang 421001, China

Abstract

In Video-based Point Cloud Compression (V-PCC), the 2D videos to be encoded are generated by projecting the 3D point cloud onto 2D planes and are then compressed with High Efficiency Video Coding (HEVC). During 2D video compression, the best mode of each Coding Unit (CU) is found by a brute-force search, which greatly increases encoding complexity. To address this issue, we first propose a simple and effective Portable Perceptron Network (PPN)-based fast mode decision method for V-PCC under the Random Access (RA) configuration. Second, we extract seven simple hand-crafted features as the input of the PPN. Third, we design an adaptive loss function, which assigns different weights to samples according to their Rate-Distortion (RD) costs, to train the PPN. Finally, experimental results show that the proposed method reduces encoding complexity by 43.13% with almost no loss of coding efficiency under the RA configuration, outperforming state-of-the-art methods. The source code is available at https://github.com/Mesks/PPNforV-PCC.

Keywords: Video-based Point Cloud Compression (V-PCC), high efficiency video coding, fast mode decision, portable perceptron network
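
To make the described pipeline concrete, the following Python (PyTorch) sketch illustrates the idea summarized in the abstract: a small perceptron network maps seven hand-crafted CU features to an early-termination probability, and training uses a loss whose per-sample weights are derived from RD costs. The feature placeholders, hidden width, labeling convention, and weighting formula below are illustrative assumptions rather than the authors' exact design; the released source code contains the actual implementation.

# Minimal sketch of a PPN-style fast mode decision classifier (illustrative only).
# Assumptions (not from the paper): a 7-16-1 multilayer perceptron, a binary
# "terminate mode search early" label, and RD-cost-proportional loss weights.
import torch
import torch.nn as nn

class PPN(nn.Module):
    """Tiny perceptron network: seven hand-crafted CU features -> early-termination probability."""
    def __init__(self, in_features: int = 7, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # probability that the remaining mode search can be skipped
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def adaptive_loss(pred: torch.Tensor, label: torch.Tensor, rd_cost: torch.Tensor) -> torch.Tensor:
    """Per-sample weighted binary cross-entropy: CUs with larger RD costs receive
    larger weights, so misclassifying them is penalized more. The exact weighting
    scheme used in the paper may differ."""
    weight = rd_cost / (rd_cost.mean() + 1e-8)  # normalize RD costs into relative weights
    return nn.functional.binary_cross_entropy(pred, label, weight=weight)

# Toy training step on random stand-ins for real CU features, labels, and RD costs.
model = PPN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.rand(32, 7)                 # 7 hand-crafted features per CU
labels = torch.randint(0, 2, (32,)).float()  # 1 = early termination is safe for this CU
rd_costs = torch.rand(32) * 1000.0           # RD cost associated with each CU

pred = model(features)
loss = adaptive_loss(pred, labels, rd_costs)
optimizer.zero_grad()
loss.backward()
optimizer.step()

In an encoder integration, such a classifier would be evaluated before the exhaustive CU mode search, and the remaining candidate modes would be skipped whenever the predicted probability exceeds a chosen threshold.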


Publication history

Received: 27 July 2023
Revised: 11 September 2023
Accepted: 03 November 2023
Published: 19 December 2023
Issue date: December 2023

Copyright

© The author(s) 2023.

Acknowledgements

This research was partially supported by the National Natural Science Foundation of China (No. 62001209).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
