
Learning Multi-Modal Scale-Aware Attentions for Efficient and Robust Road Segmentation

School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
Computer Vision and Robot Research Center, International Digital Economy Academy, Shenzhen, Guangdong, P. R. China

This paper was recommended for publication in its revised form by Special Issue Editors: Jie Chen, Ben M. Chen and Jie Huang.


Abstract

Road segmentation is essential to unmanned systems, contributing to road perception and navigation in autonomous driving. Multi-modal road segmentation methods have shown promising results by leveraging complementary RGB and depth data to provide robust 3D geometric information, but existing methods suffer from severe efficiency problems that hinder their practical deployment in autonomous driving. Their direct concatenation of multi-modal features through densely connected networks widens the semantic gaps among modalities and scales and incurs high computational and time complexity. To address these issues, we propose a Multi-modal Scale-aware Attention Network (MSAN) that fuses RGB and depth data effectively via a novel transformer-based cross-attention module, the Multi-modal Scale-aware Transformer (MST), which fuses RGB-D features from a global perspective across multiple scales. To better consolidate features at different scales, we further propose a Scale-aware Attention Module (SAM) that captures channel-wise attention efficiently for cross-scale fusion. These two attention-based modules exploit the complementarity of modalities and scales, narrowing the gaps while avoiding complex structures for road segmentation. Extensive experiments demonstrate that MSAN achieves competitive performance at low computational cost, making it suitable for real-time deployment on edge devices in autonomous driving systems.
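The two modules described in the abstract can be sketched in miniature: a cross-attention step in which RGB tokens query depth tokens (the role MST plays across modalities), followed by channel-wise gating (the role SAM plays for cross-scale fusion). This is an illustrative sketch only, not the authors' implementation: all shapes, the random stand-in weights, and the squeeze-and-excitation-style gate are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(rgb, depth, d_k=32, seed=0):
    """One cross-attention head: RGB tokens (queries) attend to depth
    tokens (keys/values), fusing geometry into appearance features.

    rgb, depth: (n_tokens, d_model) flattened feature maps.
    The projection weights are random stand-ins for learned parameters.
    """
    rng = np.random.default_rng(seed)
    d_model = rgb.shape[1]
    w_q = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    w_k = rng.standard_normal((d_model, d_k)) / np.sqrt(d_model)
    w_v = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
    q, k, v = rgb @ w_q, depth @ w_k, depth @ w_v
    attn = softmax(q @ k.T / np.sqrt(d_k), axis=-1)  # (n_rgb, n_depth)
    return rgb + attn @ v                            # residual fusion

def channel_attention(feats):
    """Channel-wise gating (squeeze-and-excitation style): each channel
    is re-weighted by a sigmoid of its global average, so informative
    channels can dominate a cross-scale combination."""
    gate = 1.0 / (1.0 + np.exp(-feats.mean(axis=0)))  # (d_model,)
    return feats * gate

# 196 tokens of dimension 64, e.g. a flattened 14x14 feature map.
rgb = np.random.default_rng(1).standard_normal((196, 64))
depth = np.random.default_rng(2).standard_normal((196, 64))
fused = channel_attention(cross_attention(rgb, depth))
print(fused.shape)  # → (196, 64)
```

Because the attention matrix is computed globally over all token pairs, every RGB location can draw on depth evidence from anywhere in the image; the channel gate then costs only one scalar per channel, which is the kind of lightweight cross-scale weighting the abstract attributes to SAM.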

Unmanned Systems, Pages 201-213

Cite this article:
Zhou Y, Yang J, Cao H, et al. Learning Multi-Modal Scale-Aware Attentions for Efficient and Robust Road Segmentation. Unmanned Systems, 2024, 12(2): 201-213. https://doi.org/10.1142/S2301385024410048

Views: 1068 | Crossref: 0 | Web of Science: 2 | Scopus: 3 | CSCD: 0
Received: 13 August 2023
Revised: 24 October 2023
Accepted: 27 October 2023
Published: 07 December 2023
© World Scientific Publishing Company