Development and testing of an image transformer for explainable autonomous driving systems

Jiqian Dong1, Sikai Chen1 (corresponding author), Mohammad Miralinaghi1, Tiantian Chen2, Samuel Labi3

1 Center for Connected and Automated Transportation (CCAT) and Lyles School of Civil Engineering, Purdue University, West Lafayette, Indiana, USA
2 Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Kowloon, China
3 Center for Connected and Automated Transportation (CCAT) and Lyles School of Civil Engineering, Purdue University, West Lafayette, Indiana, USA

Abstract

Purpose

Perception has been identified as the main cause underlying most autonomous vehicle-related accidents. Deep learning (DL)-based computer vision models, the key technology in perception, are generally considered black boxes owing to their poor interpretability. This opacity has exacerbated user distrust and further forestalled the widespread deployment of these models in practice. This paper aims to develop explainable DL models for autonomous driving by jointly predicting potential driving actions and the corresponding explanations. Such explainable DL models can not only boost user trust in autonomy but also serve as a diagnostic approach for identifying model deficiencies or limitations during the system development phase.

Design/methodology/approach

This paper proposes an explainable end-to-end autonomous driving system based on the Transformer, a state-of-the-art self-attention (SA)-based model. The model maps visual features extracted from images captured by onboard cameras to potential driving actions and their corresponding explanations, applying soft attention over the image's global features.
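To make the described architecture concrete, the following is a minimal PyTorch sketch of this kind of design: a CNN backbone extracts a spatial feature map, a Transformer encoder applies self-attention over all spatial positions (soft attention over the image's global features), and two heads jointly predict driving actions and explanations. The ResNet-50 backbone, the layer sizes, and the 4-action/21-explanation output dimensions (matching the BDD-OIA label set) are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch only; the backbone and hyperparameters are assumptions,
# not the paper's exact configuration. Positional encodings are omitted
# for brevity.
import torch
import torch.nn as nn
import torchvision.models as models

class ExplainableDrivingTransformer(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_layers=2,
                 n_actions=4, n_explanations=21):  # BDD-OIA label counts
        super().__init__()
        # CNN backbone: keep the spatial feature map, drop pooling/fc.
        resnet = models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
        # Transformer encoder: self-attention across all spatial positions,
        # i.e., soft attention over the image's global features.
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Joint prediction: one head for actions, one for explanations.
        self.action_head = nn.Linear(d_model, n_actions)
        self.explanation_head = nn.Linear(d_model, n_explanations)

    def forward(self, images):                    # (B, 3, H, W)
        f = self.proj(self.backbone(images))      # (B, d_model, h, w)
        tokens = f.flatten(2).transpose(1, 2)     # (B, h*w, d_model)
        fused = self.encoder(tokens).mean(dim=1)  # pooled global feature
        return self.action_head(fused), self.explanation_head(fused)
```

Training would treat both outputs as multi-label classification (e.g., a binary cross-entropy loss on each head), with the explanation head serving as the sanity-check channel discussed under Originality/value.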

Findings

The results demonstrate the efficacy of the proposed model: on a public data set (BDD-OIA), it outperforms the benchmark model by a significant margin in correctly predicting both actions and explanations, at a much lower computational cost. The ablation studies further show that the proposed SA module outperforms other attention mechanisms in feature fusion and generates meaningful representations for downstream prediction.
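As a rough illustration of how "correct prediction of actions and explanations" is typically scored on multi-label benchmarks such as BDD-OIA, the snippet below computes micro- and macro-averaged F1 per prediction head from thresholded logits; the sigmoid/0.5-threshold protocol is an assumption, not necessarily the paper's exact evaluation procedure.

```python
# Illustrative multi-label scoring sketch; the 0.5 threshold is an assumption.
import numpy as np
from sklearn.metrics import f1_score

def multilabel_f1(logits, labels, threshold=0.5):
    """Return (micro F1, macro F1) for one prediction head."""
    probs = 1.0 / (1.0 + np.exp(-logits))    # sigmoid per label
    preds = (probs > threshold).astype(int)  # binarize predictions
    return (f1_score(labels, preds, average="micro"),
            f1_score(labels, preds, average="macro"))

# e.g., action_f1 = multilabel_f1(action_logits, action_labels)
```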

Originality/value

In the contexts of situational awareness and driver assistance, the proposed model can serve as a driving alarm system for both human-driven and autonomous vehicles because it is capable of quickly understanding/characterizing the environment and identifying any infeasible driving actions. In addition, the explanation head of the proposed model provides an additional channel for sanity checks, helping to guarantee that the model learns the ideal causal relationships. This provision is critical in the development of autonomous systems.

Keywords: Computer vision, Autonomous driving, Transformer, Explainable deep learning



Publication history

Received: 14 June 2022
Revised: 17 June 2022
Accepted: 19 June 2022
Published: 13 July 2022
Issue date: October 2022

Copyright

© 2022 Jiqian Dong, Sikai Chen, Mohammad Miralinaghi, Tiantian Chen and Samuel Labi. Published in Journal of Intelligent and Connected Vehicles. Published by Emerald Publishing Limited.

Rights and permissions

This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
