Human-driven vehicles (HDVs) with selfish objectives cause low traffic efficiency at unsignalized intersections. Fully autonomous vehicles, by contrast, can overcome this inefficiency through perfect coordination. In this paper, we propose an intermediate solution that uses vehicular communication and a small number of autonomous vehicles to improve transportation efficiency at such intersections. In our solution, two connected autonomous vehicles (CAVs) lead multiple HDVs through a double-lane intersection in order to prevent congestion from building up in front of it. The CAVs communicate and coordinate their behavior, which is controlled by a deep reinforcement learning (DRL) agent. We design an altruistic reward function that enables the CAVs to adjust their velocities flexibly so that queuing in front of the intersection is avoided. The proximal policy optimization (PPO) algorithm is used to train the policy, and generalized advantage estimation (GAE) is used to compute advantage estimates from the learned state values. Training results show that two CAVs achieve significantly better traffic efficiency than comparable scenarios with no autonomous vehicle or with a single altruistic autonomous vehicle.
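The abstract names two standard DRL components, PPO and GAE. As a minimal sketch (not the authors' implementation), the code below shows how GAE turns per-step rewards and learned state values V(s_t) into advantage estimates, alongside a purely hypothetical "altruistic" reward that blends a CAV's own speed with the mean speed of surrounding vehicles. The function names, the weight w, and the reward form are illustrative assumptions; the paper's exact state, action, and reward definitions are not reproduced here.

```python
import numpy as np

def altruistic_reward(ego_speed, all_speeds, w=0.5):
    """Hypothetical altruistic reward (illustration only, not the paper's).

    Blends the CAV's own speed with the mean speed of all vehicles,
    so the agent is also rewarded for keeping surrounding HDVs moving.
    """
    return (1.0 - w) * ego_speed + w * float(np.mean(all_speeds))

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one trajectory.

    rewards: length-T array of per-step rewards r_t
    values:  length-(T+1) array of value estimates V(s_t);
             values[T] bootstraps the return past the last step
    """
    T = len(rewards)
    advantages = np.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # A_t = sum_k (gamma * lam)^k * delta_{t+k}, accumulated backwards
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages

# Tiny usage example with made-up numbers:
r = np.array([altruistic_reward(10.0, [10.0, 8.0, 9.0]) for _ in range(3)])
v = np.array([4.0, 3.0, 2.0, 1.0])
print(gae_advantages(r, v))
```

In PPO, advantage estimates of this form weight the clipped policy-gradient objective; GAE's lam parameter trades bias against variance in those estimates.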
The authors would like to thank Mr A. Kreidieh and Mr E. Vinitsky for their insightful suggestions.
The project has been partially funded by Chalmers Transport Area of Advance under IRIS: Inverse Reinforcement-Learning and Intelligent Swarm Algorithms for Resilient Transportation Networks.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).