Connected autonomous vehicles for improving mixed traffic efficiency in unsignalized intersections with deep reinforcement learning

Bile Peng; Musa Furkan Keskin; Balázs Kulcsár; Henk Wymeersch

doi:10.1016/j.commtr.2021.100017

Communications in Transportation Research 2021, 1(1): 100017 https://doi.org/10.1016/j.commtr.2021.100017

Research Article | Open Access | Published: 29 November 2021

Bile Peng^a, Musa Furkan Keskin^b, Balázs Kulcsár^b, Henk Wymeersch^b

^a Institute for Communications Technology, TU Braunschweig, 38 106, Braunschweig, Germany

^b Department of Electrical Engineering, Chalmers University of Technology, 41 296, Gothenburg, Sweden

Keywords:

Autonomous driving, Deep reinforcement learning, Connected vehicles, Intelligent transportation systems

Cite this article:

Peng B, Keskin MF, Kulcsár B, et al. Connected autonomous vehicles for improving mixed traffic efficiency in unsignalized intersections with deep reinforcement learning. Communications in Transportation Research, 2021, 1(1): 100017. https://doi.org/10.1016/j.commtr.2021.100017

Abstract

Human-driven vehicles (HDVs) with selfish objectives cause low traffic efficiency in an unsignalized intersection. Autonomous vehicles, by contrast, can overcome this inefficiency through perfect coordination. In this paper, we propose an intermediate solution that uses vehicular communication and a small number of autonomous vehicles to improve the efficiency of the transportation system in such intersections. In our solution, two connected autonomous vehicles (CAVs) lead multiple HDVs in a double-lane intersection in order to avoid congestion in front of the intersection. The CAVs can communicate and coordinate their behavior, which is controlled by a deep reinforcement learning (DRL) agent. We design an altruistic reward function that enables the CAVs to adjust their velocities flexibly in order to avoid queuing in front of the intersection. The proximal policy optimization (PPO) algorithm is applied to train the policy, and generalized advantage estimation (GAE) is used to estimate state values. Training results show that two CAVs achieve significantly better traffic efficiency than similar scenarios with no or only one altruistic autonomous vehicle.
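For readers unfamiliar with the two training components named in the abstract, the following is a minimal sketch of how GAE advantages and the PPO clipped surrogate term are commonly computed. It is not the authors' implementation: the function names, the discount factor `gamma`, the GAE parameter `lam`, and the clipping range `clip_eps` are illustrative assumptions, and the abstract does not state which values were used.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    last_value: float = 0.0,
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
    lam: float = 0.95,    # assumed GAE parameter, not taken from the paper
) -> List[float]:
    """Generalized advantage estimates for one finite trajectory.

    rewards[t] is the reward received after the action at step t,
    values[t] is the critic's value estimate for the state at step t,
    last_value is the value estimate for the state after the final step
    (0.0 if the episode terminated there).
    """
    if len(rewards) != len(values):
        raise ValueError("rewards and values must have the same length")

    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later residuals.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def ppo_clip_term(ratio: float, advantage: float, clip_eps: float = 0.2) -> float:
    """PPO clipped surrogate for a single sample (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s); clip_eps is an assumed clipping range.
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    adv = gae_advantages(rewards=[1.0, 1.0], values=[0.0, 0.0], gamma=1.0, lam=1.0)
    print(adv)
    print(ppo_clip_term(ratio=1.5, advantage=1.0))
```

In a setup like the one the abstract describes, the rewards would come from the altruistic reward function and the value estimates from a learned critic; both are outside the scope of this sketch.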

Publication history

Received: 26 October 2021

Revised: 21 November 2021

Accepted: 21 November 2021

Published: 29 November 2021

Issue date: December 2021

Acknowledgements

The authors would like to thank Mr A. Kreidieh and Mr E. Vinitsky for their insightful suggestions.

The project has been partially funded by Chalmers Transport Area of Advance under IRIS: Inverse Reinforcement-Learning and Intelligent Swarm Algorithms for Resilient Transportation Networks.

Rights and permissions

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).