Model-based reinforcement learning for router port queue configurations

Ajay Kattepur 1, Sushanth David 2, and Swarup Kumar Mohalik 1

1 Artificial Intelligence System Group, Ericsson Research, Bangalore 560093, India
2 Ericsson Managed Services Unit, Plano, TX 75025, USA

Abstract

Fifth-generation (5G) systems have brought new challenges for ensuring Quality of Service (QoS) in differentiated services, including low-latency applications, scalable machine-to-machine communication, and enhanced mobile broadband connectivity. To satisfy these requirements, network slicing has been introduced to partition the network into slices with specific characteristics. To meet the requirements of network slices, routers and switches must be effectively configured to provide priority queue provisioning, resource contention management, and adaptation. Configuring routers from vendors such as Ericsson, Cisco, and Juniper has traditionally been an expert-driven process with static rules for individual flows, which is prone to suboptimal configurations under varying traffic conditions. In this paper, we model the internal ingress and egress queues within routers via a queuing model, and study in detail the effects of changing queue configurations with respect to priorities, weights, flow limits, and packet drops. This model is used to train a model-based Reinforcement Learning (RL) algorithm that generates optimal policies for flow prioritization, fairness, and congestion control. The efficacy of the resulting RL policies is demonstrated in scenarios involving ingress queue traffic policing, egress queue traffic shaping, and traffic conditioning coordinated across one-hop routers. The approach is further evaluated on a real application use case in which a statically configured router proved suboptimal with respect to the desired QoS requirements. Such automated configuration of routers and switches will be critical for 5G deployments with varying flow requirements and traffic patterns.
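
To make the ingress policing scenario concrete, a minimal sketch of a single-rate token-bucket policer is given below. The rate, burst depth, and drop-only handling of out-of-profile packets are illustrative assumptions, not a vendor configuration taken from the paper.

```python
from dataclasses import dataclass

# Toy single-rate token-bucket policer, as used for ingress traffic policing.
# Out-of-profile packets are simply dropped here; real policers may instead
# remark them to a lower class.

@dataclass
class TokenBucket:
    rate: float           # token refill rate (packets per second), assumed
    burst: float          # bucket depth, i.e., maximum burst size in packets
    tokens: float = 0.0   # current token count
    last_ts: float = 0.0  # timestamp of the previous packet

    def conforms(self, ts: float) -> bool:
        """Refill tokens for the elapsed time, then admit if a token is free."""
        self.tokens = min(self.burst, self.tokens + self.rate * (ts - self.last_ts))
        self.last_ts = ts
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # in-profile: forward the packet
        return False      # out-of-profile: police (drop)

# Offer a 200 packet/s burst to a 100 packet/s policer with a 20-packet bucket.
policer = TokenBucket(rate=100.0, burst=20.0, tokens=20.0)
arrivals = [i * 0.005 for i in range(50)]
admitted = sum(policer.conforms(t) for t in arrivals)
print(f"admitted {admitted} of {len(arrivals)} packets offered at twice the policed rate")
```

Egress traffic shaping differs mainly in that out-of-profile packets are buffered and delayed to smooth the output rate, rather than dropped at arrival.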

Keywords: network slicing, router port queues, model-based Reinforcement Learning (RL)
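
The model-based RL formulation itself can be illustrated with a toy planner: a single egress port with two weighted priority queues and tail drop is cast as a small Markov Decision Process whose action is the scheduler weight given to the high-priority queue, and value iteration is run directly on the known transition model. Every constant below (buffer size, arrival probabilities, candidate weights, reward penalties) is an assumption for illustration; the queuing model studied in the paper is considerably richer.

```python
import itertools

# Toy egress port: two priority queues (high, low) with finite buffers and
# tail drop, served by a weighted scheduler. The action is the weight given
# to the high-priority queue; all constants are illustrative assumptions.

BUF = 4                               # per-queue buffer size (packets)
P_ARRIVE = {"high": 0.6, "low": 0.6}  # per-step Bernoulli arrival probabilities
ACTIONS = [0.2, 0.5, 0.8]             # candidate high-priority weights
GAMMA = 0.95                          # discount factor

STATES = list(itertools.product(range(BUF + 1), repeat=2))  # (q_high, q_low)

def transitions(state, w_high):
    """Enumerate (probability, next_state, reward) for one scheduling step.

    The scheduler serves the high queue with probability w_high, else the
    low queue; each queue then sees an independent arrival, dropped if the
    buffer is full. The reward penalises backlog (a delay proxy) and drops,
    with heavier penalties on the high-priority class.
    """
    qh, ql = state
    out = []
    for serve_high, p_serve in ((True, w_high), (False, 1.0 - w_high)):
        h = max(qh - 1, 0) if serve_high else qh
        l = ql if serve_high else max(ql - 1, 0)
        for ah in (0, 1):
            for al in (0, 1):
                p = p_serve
                p *= P_ARRIVE["high"] if ah else 1.0 - P_ARRIVE["high"]
                p *= P_ARRIVE["low"] if al else 1.0 - P_ARRIVE["low"]
                drop_h = ah and h == BUF
                drop_l = al and l == BUF
                nxt = (min(h + ah, BUF), min(l + al, BUF))
                reward = -(2.0 * nxt[0] + nxt[1]) - (10.0 * drop_h + 2.0 * drop_l)
                out.append((p, nxt, reward))
    return out

def value_iteration(eps=1e-6):
    """Plan directly on the known model: the 'model-based' part of the loop."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            best = max(
                sum(p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a))
                for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            break
    # Greedy policy: queue occupancies -> scheduler weight.
    return {s: max(ACTIONS, key=lambda a: sum(
                p * (r + GAMMA * V[s2]) for p, s2, r in transitions(s, a)))
            for s in STATES}

if __name__ == "__main__":
    policy = value_iteration()
    for s in [(0, 0), (2, 2), (4, 1), (1, 4)]:
        print(f"occupancy {s} -> high-priority weight {policy[s]}")
```

Planning against a known (or fitted) transition model, rather than learning purely from sampled interactions, is what makes the approach model-based; the resulting policy maps observed queue occupancies to scheduler weights, which is the shape of output needed to configure a router port automatically.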


Publication history

Received: 25 August 2021
Revised: 13 September 2021
Accepted: 29 September 2021
Published: 01 September 2021
Issue date: September 2021

Copyright

© All articles included in the journal are copyrighted by ITU and TUP.

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/
