Human-driven vehicles (HDVs) with selfish objectives cause low traffic efficiency at unsignalized intersections. Fully autonomous vehicles, by contrast, can overcome this inefficiency through perfect coordination. In this paper, we propose an intermediate solution that uses vehicular communication and a small number of autonomous vehicles to improve transportation efficiency at such intersections. In our solution, two connected autonomous vehicles (CAVs) lead multiple HDVs through a double-lane intersection in order to prevent congestion from building up in front of it. The CAVs communicate and coordinate their behavior, which is controlled by a deep reinforcement learning (DRL) agent. We design an altruistic reward function that enables the CAVs to adjust their velocities flexibly so that queuing in front of the intersection is avoided. The proximal policy optimization (PPO) algorithm is used to train the policy, and generalized advantage estimation (GAE) is used to compute advantage estimates from the learned state values. Training results show that two CAVs achieve significantly better traffic efficiency than comparable scenarios with no autonomous vehicle or with a single altruistic autonomous vehicle.
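The abstract names two standard DRL components, PPO and GAE. As a minimal sketch (not the authors' implementation), the code below shows how GAE turns per-step rewards and learned state values V(s_t) into advantage estimates, alongside a purely hypothetical "altruistic" reward that blends a CAV's own speed with the mean speed of surrounding vehicles. The function names, the weight w, and the reward form are illustrative assumptions; the paper's exact state, action, and reward definitions are not reproduced here.

```python
import numpy as np

def altruistic_reward(ego_speed, all_speeds, w=0.5):
    """Hypothetical altruistic reward (illustration only, not the paper's).

    Blends the CAV's own speed with the mean speed of all vehicles,
    so the agent is also rewarded for keeping surrounding HDVs moving.
    """
    return (1.0 - w) * ego_speed + w * float(np.mean(all_speeds))

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one trajectory.

    rewards: length-T array of per-step rewards r_t
    values:  length-(T+1) array of value estimates V(s_t);
             values[T] bootstraps the return past the last step
    """
    T = len(rewards)
    advantages = np.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        # One-step TD error: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # A_t = sum_k (gamma * lam)^k * delta_{t+k}, accumulated backwards
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages

# Tiny usage example with made-up numbers:
r = np.array([altruistic_reward(10.0, [10.0, 8.0, 9.0]) for _ in range(3)])
v = np.array([4.0, 3.0, 2.0, 1.0])
print(gae_advantages(r, v))
```

In PPO, advantage estimates of this form weight the clipped policy-gradient objective; GAE's lam parameter trades bias against variance in those estimates.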
The authors would like to thank Mr A. Kreidieh and Mr E. Vinitsky for their insightful suggestions.
The project has been partially funded by Chalmers Transport Area of Advance under IRIS: Inverse Reinforcement-Learning and Intelligent Swarm Algorithms for Resilient Transportation Networks.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).