Volume 3, Issue 2

PointGAT: Graph attention networks for 3D object detection

Haoran Zhou, Wei Wang, Gang Liu, Qingguo Zhou
School of Information Science and Engineering, Lanzhou University, Lanzhou 730000, China

Abstract

3D object detection is a critical technology in many applications, and among the various detection methods, point-cloud-based methods have been the most popular research topic in recent years. Because graph neural networks (GNNs) are considered effective for processing point clouds, in this work we combine a GNN with the attention mechanism and propose a 3D object detection method named PointGAT. PointGAT outperforms previous approaches on the KITTI test dataset. Experiments in real campus scenarios also demonstrate the potential of our method for further applications.

Keywords: attention mechanism, graph neural network, 3D object detection, pointcloud


Publication history

Received: 15 June 2022
Revised: 06 July 2022
Accepted: 01 August 2022
Published: 06 September 2022
Issue date: June 2022

Copyright

© All articles included in the journal are copyrighted to the ITU and TUP.

Acknowledgements

This work was supported in part by the Gansu Provincial Science and Technology Major Special Innovation Consortium Project (No. 21ZD3GA002). The name of the innovation consortium is Gansu Province Green and Smart Highway Transportation Innovation Consortium, and the project name is Gansu Province Green and Smart Highway Key Technology Research and Demonstration.

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/