Development and testing of an image transformer for explainable autonomous driving systems

Jiqian Dong1, Sikai Chen1 (corresponding author), Mohammad Miralinaghi1, Tiantian Chen2, Samuel Labi3

1 Center for Connected and Automated Transportation (CCAT) and Lyles School of Civil Engineering, Purdue University, West Lafayette, Indiana, USA
2 Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, Kowloon, China
3 Center for Connected and Automated Transportation (CCAT) and Lyles School of Civil Engineering, Purdue University, West Lafayette, Indiana, USA

Abstract

Purpose

Perception has been identified as the main cause underlying most autonomous vehicle-related accidents. Deep learning (DL)-based computer vision models, the key technology in perception, are generally considered black boxes owing to their poor interpretability. This opacity has exacerbated user distrust and further forestalled the widespread deployment of these models in practice. This paper aims to develop explainable DL models for autonomous driving by jointly predicting potential driving actions and the corresponding explanations. Such explainable DL models can not only boost user trust in autonomy but also serve as a diagnostic approach for identifying model deficiencies or limitations during the system development phase.

Design/methodology/approach

This paper proposes an explainable end-to-end autonomous driving system based on the Transformer, a state-of-the-art self-attention (SA)-based model. The model maps visual features extracted from images captured by onboard cameras to potential driving actions and their corresponding explanations, applying soft attention over the image's global features.
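To make the described architecture concrete, the following is a minimal PyTorch sketch of this kind of design: a CNN backbone extracts a spatial feature map, a Transformer encoder applies self-attention over all spatial positions (soft attention over the image's global features), and two heads jointly predict driving actions and explanations. The ResNet-50 backbone, the layer sizes, and the 4-action/21-explanation output dimensions (matching the BDD-OIA label set) are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch only; the backbone and hyperparameters are assumptions,
# not the paper's exact configuration. Positional encodings are omitted
# for brevity.
import torch
import torch.nn as nn
import torchvision.models as models

class ExplainableDrivingTransformer(nn.Module):
    def __init__(self, d_model=256, n_heads=8, n_layers=2,
                 n_actions=4, n_explanations=21):  # BDD-OIA label counts
        super().__init__()
        # CNN backbone: keep the spatial feature map, drop pooling/fc.
        resnet = models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.proj = nn.Conv2d(2048, d_model, kernel_size=1)
        # Transformer encoder: self-attention across all spatial positions,
        # i.e., soft attention over the image's global features.
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # Joint prediction: one head for actions, one for explanations.
        self.action_head = nn.Linear(d_model, n_actions)
        self.explanation_head = nn.Linear(d_model, n_explanations)

    def forward(self, images):                    # (B, 3, H, W)
        f = self.proj(self.backbone(images))      # (B, d_model, h, w)
        tokens = f.flatten(2).transpose(1, 2)     # (B, h*w, d_model)
        fused = self.encoder(tokens).mean(dim=1)  # pooled global feature
        return self.action_head(fused), self.explanation_head(fused)
```

Training would treat both outputs as multi-label classification (e.g., a binary cross-entropy loss on each head), with the explanation head serving as the sanity-check channel discussed under Originality/value.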

Findings

The results demonstrate the efficacy of the proposed model: on a public data set (BDD-OIA), it outperforms the benchmark model by a significant margin in correctly predicting both actions and explanations, at a much lower computational cost. The ablation studies further show that the proposed SA module outperforms other attention mechanisms in feature fusion and generates meaningful representations for downstream prediction.
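As a rough illustration of how "correct prediction of actions and explanations" is typically scored on multi-label benchmarks such as BDD-OIA, the snippet below computes micro- and macro-averaged F1 per prediction head from thresholded logits; the sigmoid/0.5-threshold protocol is an assumption, not necessarily the paper's exact evaluation procedure.

```python
# Illustrative multi-label scoring sketch; the 0.5 threshold is an assumption.
import numpy as np
from sklearn.metrics import f1_score

def multilabel_f1(logits, labels, threshold=0.5):
    """Return (micro F1, macro F1) for one prediction head."""
    probs = 1.0 / (1.0 + np.exp(-logits))    # sigmoid per label
    preds = (probs > threshold).astype(int)  # binarize predictions
    return (f1_score(labels, preds, average="micro"),
            f1_score(labels, preds, average="macro"))

# e.g., action_f1 = multilabel_f1(action_logits, action_labels)
```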

Originality/value

In the contexts of situational awareness and driver assistance, the proposed model can serve as a driving alarm system for both human-driven and autonomous vehicles because it is capable of quickly understanding/characterizing the environment and identifying any infeasible driving actions. In addition, the explanation head of the proposed model provides an additional channel for sanity checks, helping to guarantee that the model learns the ideal causal relationships. This provision is critical in the development of autonomous systems.

Keywords: Computer vision, Autonomous driving, Transformer, Explainable deep learning



Publication history

Received: 14 June 2022
Revised: 17 June 2022
Accepted: 19 June 2022
Published: 13 July 2022
Issue date: October 2022

Copyright

© 2022 Jiqian Dong, Sikai Chen, Mohammad Miralinaghi, Tiantian Chen and Samuel Labi. Published in Journal of Intelligent and Connected Vehicles. Published by Emerald Publishing Limited.

Rights and permissions

This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
