Model-based reinforcement learning for router port queue configurations

Ajay Kattepur 1, Sushanth David 2, and Swarup Kumar Mohalik 1

1 Artificial Intelligence System Group, Ericsson Research, Bangalore 560093, India
2 Ericsson Managed Services Unit, Plano, TX 75025, USA

Abstract

Fifth-generation (5G) systems have brought new challenges for ensuring Quality of Service (QoS) in differentiated services, including low-latency applications, scalable machine-to-machine communication, and enhanced mobile broadband connectivity. To satisfy these requirements, network slicing has been introduced to partition the network into slices with specific characteristics. To meet the requirements of network slices, routers and switches must be effectively configured to provide priority queue provisioning, resource contention management, and adaptation. Configuring routers from vendors such as Ericsson, Cisco, and Juniper has traditionally been an expert-driven process with static rules for individual flows, which is prone to suboptimal configurations under varying traffic conditions. In this paper, we model the internal ingress and egress queues within routers via a queuing model, and study in detail the effects of changing queue configurations with respect to priorities, weights, flow limits, and packet drops. This model is used to train a model-based Reinforcement Learning (RL) algorithm that generates optimal policies for flow prioritization, fairness, and congestion control. The efficacy of the resulting RL policies is demonstrated in scenarios involving ingress queue traffic policing, egress queue traffic shaping, and traffic conditioning coordinated across one-hop routers. The approach is further evaluated on a real application use case in which a statically configured router proved suboptimal with respect to the desired QoS requirements. Such automated configuration of routers and switches will be critical for 5G deployments with varying flow requirements and traffic patterns.
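
To make the ingress policing scenario concrete, a minimal sketch of a single-rate token-bucket policer is given below. The rate, burst depth, and drop-only handling of out-of-profile packets are illustrative assumptions, not a vendor configuration taken from the paper.

```python
from dataclasses import dataclass

# Toy single-rate token-bucket policer, as used for ingress traffic policing.
# Out-of-profile packets are simply dropped here; real policers may instead
# remark them to a lower class.

@dataclass
class TokenBucket:
    rate: float           # token refill rate (packets per second), assumed
    burst: float          # bucket depth, i.e., maximum burst size in packets
    tokens: float = 0.0   # current token count
    last_ts: float = 0.0  # timestamp of the previous packet

    def conforms(self, ts: float) -> bool:
        """Refill tokens for the elapsed time, then admit if a token is free."""
        self.tokens = min(self.burst, self.tokens + self.rate * (ts - self.last_ts))
        self.last_ts = ts
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # in-profile: forward the packet
        return False      # out-of-profile: police (drop)

# Offer a 200 packet/s burst to a 100 packet/s policer with a 20-packet bucket.
policer = TokenBucket(rate=100.0, burst=20.0, tokens=20.0)
arrivals = [i * 0.005 for i in range(50)]
admitted = sum(policer.conforms(t) for t in arrivals)
print(f"admitted {admitted} of {len(arrivals)} packets offered at twice the policed rate")
```

Egress traffic shaping differs mainly in that out-of-profile packets are buffered and delayed to smooth the output rate, rather than dropped at arrival.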

Keywords: network slicing, router port queues, model-based Reinforcement Learning (RL)
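
The model-based RL formulation itself can be illustrated with a toy planner: a single egress port with two weighted priority queues and tail drop is cast as a small Markov Decision Process whose action is the scheduler weight given to the high-priority queue, and value iteration is run directly on the known transition model. Every constant below (buffer size, arrival probabilities, candidate weights, reward penalties) is an assumption for illustration; the queuing model studied in the paper is considerably richer.

```python
import itertools

# Toy egress port: two priority queues (high, low) with finite buffers and
# tail drop, served by a weighted scheduler. The action is the weight given
# to the high-priority queue; all constants are illustrative assumptions.

BUF = 4                               # per-queue buffer size (packets)
P_ARRIVE = {"high": 0.6, "low": 0.6}  # per-step Bernoulli arrival probabilities
ACTIONS = [0.2, 0.5, 0.8]             # candidate high-priority weights
GAMMA = 0.95                          # discount factor

STATES = list(itertools.product(range(BUF + 1), repeat=2))  # (q_high, q_low)

def transitions(state, w_high):
    """Enumerate (probability, next_state, reward) for one scheduling step.

    The scheduler serves the high queue with probability w_high, else the
    low queue; each queue then sees an independent arrival, dropped if the
    buffer is full. The reward penalises backlog (a delay proxy) and drops,
    with heavier penalties on the high-priority class.
    """
    qh, ql = state
    out = []
    for serve_high, p_serve in ((True, w_high), (False, 1.0 - w_high)):
        h = max(qh - 1, 0) if serve_high else qh
        l = ql if serve_high else max(ql - 1, 0)
        for ah in (0, 1):
            for al in (0, 1):
                p = p_serve
                p *= P_ARRIVE["high"] if ah else 1.0 - P_ARRIVE["high"]
                p *= P_ARRIVE["low"] if al else 1.0 - P_ARRIVE["low"]
                drop_h = ah and h == BUF
                drop_l = al and l == BUF
                nxt = (min(h + ah, BUF), min(l + al, BUF))
                reward = -(2.0 * nxt[0] + nxt[1]) - (10.0 * drop_h + 2.0 * drop_l)
                out.append((p, nxt, reward))
    return out

def value_iteration(eps=1e-6):
    """Plan directly on the known model: the 'model-based' part of the loop."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(
                sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))
                for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Greedy policy: queue occupancies -> scheduler weight.
    return {s: max(ACTIONS, key=lambda a: sum(
                p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a)))
            for s in STATES}

if __name__ == "__main__":
    policy = value_iteration()
    for s in [(0, 0), (2, 2), (4, 1), (1, 4)]:
        print(f"occupancy {s} -> high-priority weight {policy[s]}")
```

Planning against a known (or fitted) transition model, rather than learning purely from sampled interactions, is what makes the approach model-based; the resulting policy maps observed queue occupancies to scheduler weights, which is the shape of output needed to configure a router port automatically.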


Publication history

Received: 25 August 2021
Revised: 13 September 2021
Accepted: 29 September 2021
Published: 01 September 2021
Issue date: September 2021

Copyright

© All articles included in the journal are copyrighted by ITU and TUP.

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/
