
Portable Perceptron Network-Based Fast Mode Decision for Video-Based Point Cloud Compression

Shicheng Que, Yue Li (corresponding author)
College of Computer Science, University of South China, Hengyang 421001, China

Abstract

In Video-based Point Cloud Compression (V-PCC), the 2D videos to be encoded are generated by projecting the 3D point cloud onto 2D planes and are then compressed with High Efficiency Video Coding (HEVC). During 2D video compression, the best mode of each Coding Unit (CU) is found by a brute-force search, which greatly increases encoding complexity. To address this issue, we first propose a simple and effective Portable Perceptron Network (PPN)-based fast mode decision method for V-PCC under the Random Access (RA) configuration. Second, we extract seven simple hand-crafted features as the input of the PPN. Third, we design an adaptive loss function, which assigns different weights to samples according to their Rate-Distortion (RD) costs, to train the PPN. Finally, experimental results show that the proposed method reduces encoding complexity by 43.13% with almost no loss of coding efficiency under the RA configuration, outperforming state-of-the-art methods. The source code is available at https://github.com/Mesks/PPNforV-PCC.

Keywords: Video-based Point Cloud Compression (V-PCC), high efficiency video coding, fast mode decision, portable perceptron network
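
To make the described pipeline concrete, the following Python (PyTorch) sketch illustrates the idea summarized in the abstract: a small perceptron network maps seven hand-crafted CU features to an early-termination probability, and training uses a loss whose per-sample weights are derived from RD costs. The feature placeholders, hidden width, labeling convention, and weighting formula below are illustrative assumptions rather than the authors' exact design; the released source code contains the actual implementation.

# Minimal sketch of a PPN-style fast mode decision classifier (illustrative only).
# Assumptions (not from the paper): a 7-16-1 multilayer perceptron, a binary
# "terminate mode search early" label, and RD-cost-proportional loss weights.
import torch
import torch.nn as nn

class PPN(nn.Module):
    """Tiny perceptron network: seven hand-crafted CU features -> early-termination probability."""
    def __init__(self, in_features: int = 7, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),  # probability that the remaining mode search can be skipped
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def adaptive_loss(pred: torch.Tensor, label: torch.Tensor, rd_cost: torch.Tensor) -> torch.Tensor:
    """Per-sample weighted binary cross-entropy: CUs with larger RD costs receive
    larger weights, so misclassifying them is penalized more. The exact weighting
    scheme used in the paper may differ."""
    weight = rd_cost / (rd_cost.mean() + 1e-8)  # normalize RD costs into relative weights
    return nn.functional.binary_cross_entropy(pred, label, weight=weight)

# Toy training step on random stand-ins for real CU features, labels, and RD costs.
model = PPN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

features = torch.rand(32, 7)                 # 7 hand-crafted features per CU
labels = torch.randint(0, 2, (32,)).float()  # 1 = early termination is safe for this CU
rd_costs = torch.rand(32) * 1000.0           # RD cost associated with each CU

pred = model(features)
loss = adaptive_loss(pred, labels, rd_costs)
optimizer.zero_grad()
loss.backward()
optimizer.step()

In an encoder integration, such a classifier would be evaluated before the exhaustive CU mode search, and the remaining candidate modes would be skipped whenever the predicted probability exceeds a chosen threshold.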


Publication history

Received: 27 July 2023
Revised: 11 September 2023
Accepted: 03 November 2023
Published: 19 December 2023
Issue date: December 2023

Copyright

© The author(s) 2023.

Acknowledgements

This research was partially supported by the National Natural Science Foundation of China (No. 62001209).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
