
A Review of Reinforcement Learning Based Intelligent Optimization for Manufacturing Scheduling

Ling Wang1, Zixiao Pan1, Jingjing Wang1
1 Department of Automation, Tsinghua University, Beijing 100084, China

Abstract

As a critical component of manufacturing systems, production scheduling aims to optimize objectives such as profit, efficiency, and energy consumption by appropriately determining key factors, including the processing path, machine assignment, and execution time. Because of the large scale and strongly coupled constraints of these problems, as well as real-time solving requirements in certain scenarios, manufacturing scheduling remains highly challenging. With the development of machine learning, Reinforcement Learning (RL) has achieved breakthroughs in a variety of decision-making problems. For manufacturing scheduling problems, this paper summarizes the designs of states and actions, sorts out RL-based scheduling algorithms, reviews the applications of RL to different types of scheduling problems, and discusses the fusion modes of RL and meta-heuristics. Finally, we analyze the open problems in current research and point out future research directions and key topics to promote the research and application of RL-based scheduling optimization.

Keywords: Reinforcement Learning (RL), manufacturing scheduling, scheduling optimization
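
To make the state, action, and reward designs discussed in this review concrete, the following minimal sketch shows a tabular Q-learning agent that chooses among classical dispatching rules (SPT, LPT, EDD) for a single-machine total-tardiness instance. It is an illustrative example only, not a method from any of the surveyed papers: the state features, reward, and hyperparameters are assumptions chosen for readability.

# Illustrative sketch only (not from the surveyed works): a tabular Q-learning
# agent that picks a dispatching rule for a single-machine total-tardiness
# problem. State features, reward, and hyperparameters are assumptions.
import random

RULES = ["SPT", "LPT", "EDD"]  # action space: classical dispatching rules


def make_jobs(n, seed):
    rng = random.Random(seed)
    # Each job is (processing time, due date).
    return [(rng.randint(1, 9), rng.randint(5, 60)) for _ in range(n)]


def state_of(queue, t):
    # Coarse state: (queue-length bucket, average-slack bucket).
    slack = sum(max(d - t, 0) for _, d in queue) / len(queue)
    return (min(len(queue) // 3, 3), min(int(slack) // 10, 3))


def pick_job(queue, rule):
    if rule == "SPT":
        return min(queue, key=lambda j: j[0])  # shortest processing time
    if rule == "LPT":
        return max(queue, key=lambda j: j[0])  # longest processing time
    return min(queue, key=lambda j: j[1])      # earliest due date


def episode(Q, jobs, eps=0.1, alpha=0.1, gamma=0.95):
    queue, t, total_tardiness = list(jobs), 0, 0
    while queue:
        s = state_of(queue, t)
        if random.random() < eps:
            a = random.randrange(len(RULES))  # explore
        else:
            a = max(range(len(RULES)), key=lambda i: Q.get((s, i), 0.0))  # exploit
        job = pick_job(queue, RULES[a])
        queue.remove(job)
        t += job[0]
        step_tardiness = max(t - job[1], 0)
        total_tardiness += step_tardiness
        r = -step_tardiness  # reward: negative tardiness of the job just scheduled
        if queue:
            s2 = state_of(queue, t)
            best_next = max(Q.get((s2, i), 0.0) for i in range(len(RULES)))
        else:
            best_next = 0.0
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))
    return total_tardiness


Q = {}
for ep in range(2000):  # train on small random instances
    episode(Q, make_jobs(12, seed=ep % 50))
print("greedy-policy total tardiness:", episode(Q, make_jobs(12, seed=0), eps=0.0, alpha=0.0))

In the deep RL studies covered by the review, the tabular Q-function is typically replaced by a neural network and richer state features and action spaces are used, but the state, action, and reward decomposition sketched above remains the common starting point.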



Publication history

Received: 21 October 2021
Accepted: 22 November 2021
Published: 31 December 2021
Issue date: December 2021

Copyright

© The author(s) 2021

Acknowledgements

This work was supported in part by the National Science Fund for Distinguished Young Scholars of China (No. 61525304) and the National Natural Science Foundation of China (No. 61873328).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
