Volume 28, Issue 2

A Fairness-Enhanced Intelligent MAC Scheme Using Q-Learning-Based Bidirectional Backoff for Distributed Vehicular Communication Networks

Ping Wang, Shuai Wang
College of Information Science and Technology, Donghua University, Shanghai 201620, China

Abstract

With growing attention to cutting-edge vehicular communication applications, distributed resource allocation benefits direct communication between vehicle nodes. However, in highly dynamic distributed vehicular networks, the quality of service (QoS) can degrade dramatically because of severe packet collisions when link knowledge is insufficient. Focusing on fairness optimization, a Q-learning-based collision avoidance (QCA) scheme is proposed within an intelligent distributed media access control (MAC) protocol. The scheme is characterized by a bidirectional backoff reward model, R_QCA, that covers arbitrary backoff stage transitions. In QCA, an intelligent bidirectional backoff agent based on the Markov decision process model drives each vehicle agent to update itself toward an optimal backoff sub-interval, BSI_opt, through either a positive or a negative transition, yielding distinctly fair communication with a balanced allocation of resources. According to reinforcement learning theory, evaluating the goodness of the backoff stage self-selection policy is equivalent to maximizing the Q function of the vehicle in the current environment. The final decision on BSI_opt, which corresponds to an optimal contention window range, is obtained by maximizing the Q value (Q_max). The ε-greedy algorithm is used to keep the convergence of the Q_max solution reasonable. To evaluate the fairness of QCA, four kinds of dynamic impacts on vehicular networks were investigated with the network simulator NS2: mobility, density, payload size, and data rate. The results indicate that QCA achieves fair communication efficiently and robustly, with a superior Jain’s fairness index, a relatively high packet delivery ratio, and low time delay.
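
The abstract does not specify the reward model, the state and action definitions, or any hyperparameters, so the following is only a minimal sketch of the general technique it names: tabular Q-learning with ε-greedy action selection over backoff stages, plus Jain's fairness index. Everything concrete here is an assumption made for illustration: five backoff stages, three actions (one stage down, stay, one stage up) rather than arbitrary transitions, a toy reward that is not the paper's reward model, a hypothetical `BEST_STAGE` standing in for the optimal backoff sub-interval, and the values of `ALPHA`, `GAMMA`, and `EPSILON`.

```python
import random

N_STAGES = 5                  # assumed number of backoff stages
ACTIONS = (-1, 0, +1)         # negative transition, stay, positive transition
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # assumed learning rate, discount, exploration
BEST_STAGE = 2                # hypothetical stand-in for the optimal sub-interval


def next_stage(stage, action):
    """Apply a bidirectional transition, clamped to the valid stage range."""
    return min(max(stage + action, 0), N_STAGES - 1)


def toy_reward(new_stage):
    """Toy reward, NOT the paper's reward model: +1 at BEST_STAGE, else -1."""
    return 1.0 if new_stage == BEST_STAGE else -1.0


def choose_action(q, stage, rng):
    """Epsilon-greedy: explore with probability EPSILON, otherwise exploit."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    row = q[stage]
    return row.index(max(row))


def q_update(q, stage, a_idx, reward, new_stage):
    """One-step Q-learning update toward reward + GAMMA * max Q(new_stage)."""
    target = reward + GAMMA * max(q[new_stage])
    q[stage][a_idx] += ALPHA * (target - q[stage][a_idx])


def train(episodes=2000, steps=10, seed=0):
    """Train a Q table in the toy environment, cycling through start stages."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STAGES)]
    for ep in range(episodes):
        stage = ep % N_STAGES
        for _ in range(steps):
            a_idx = choose_action(q, stage, rng)
            new_stage = next_stage(stage, ACTIONS[a_idx])
            q_update(q, stage, a_idx, toy_reward(new_stage), new_stage)
            stage = new_stage
    return q


def greedy_policy(q):
    """For each stage, the action with the largest learned Q value."""
    return [ACTIONS[row.index(max(row))] for row in q]


def jain_index(xs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2); 1.0 means equal shares."""
    total = sum(xs)
    squares = sum(x * x for x in xs)
    return total * total / (len(xs) * squares) if squares else 0.0
```

Calling `greedy_policy(train())` returns one preferred transition per stage; in this toy setting the learned transitions should point toward `BEST_STAGE` from either side, which mirrors the bidirectional idea described above. `jain_index` can be applied to per-vehicle throughputs, for example `jain_index([1, 1, 1, 1])` for perfectly equal shares.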

Keywords: media access control (MAC), fairness, Q-learning-based collision avoidance (QCA), bidirectional backoff agent, vehicular communication networks

Publication history

Received: 24 August 2021
Revised: 17 December 2021
Accepted: 31 December 2021
Published: 29 September 2022
Issue date: April 2023

Copyright

© The author(s) 2023.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
