Intelligent and Converged Networks 2022, 3(2): 204-216 https://doi.org/10.23919/ICN.2022.0014

Open Access | Issue | Published: 06 September 2022

PointGAT: Graph attention networks for 3D object detection

Haoran Zhou¹,†, Wei Wang¹,†, Gang Liu¹, Qingguo Zhou¹

1 School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China

Keywords: attention mechanism, graph neural network, 3D object detection, pointcloud

Cite this article:

Zhou H, Wang W, Liu G, et al. PointGAT: Graph attention networks for 3D object detection. Intelligent and Converged Networks, 2022, 3(2): 204-216. https://doi.org/10.23919/ICN.2022.0014

Abstract

3D object detection is a critical technology in many applications, and among the various detection methods, pointcloud-based methods have been the most popular research topic in recent years. Because Graph Neural Networks (GNNs) are considered effective at processing pointclouds, in this work we combine a GNN with the attention mechanism and propose a 3D object detection method named PointGAT. PointGAT outperforms previous approaches on the KITTI test dataset. Experiments in real campus scenarios also demonstrate the potential of our method for further applications.

About this article

Publication history

Received: 15 June 2022

Revised: 06 July 2022

Accepted: 01 August 2022

Published: 06 September 2022

Issue date: June 2022

Acknowledgements

This work was supported in part by the Gansu Provincial Science and Technology Major Special Innovation Consortium Project (No. 21ZD3GA002). The name of the innovation consortium is Gansu Province Green and Smart Highway Transportation Innovation Consortium, and the project name is Gansu Province Green and Smart Highway Key Technology Research and Demonstration.

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/