
A Review of Reinforcement Learning Based Intelligent Optimization for Manufacturing Scheduling

Ling Wang1, Zixiao Pan1, Jingjing Wang1
1 Department of Automation, Tsinghua University, Beijing 100084, China

Abstract

As a critical component of manufacturing systems, production scheduling aims to optimize objectives such as profit, efficiency, and energy consumption by appropriately determining key factors, including the processing path, machine assignment, and execution time. Because of the large scale and strongly coupled constraints of these problems, as well as real-time solving requirements in certain scenarios, manufacturing scheduling remains highly challenging. With the development of machine learning, Reinforcement Learning (RL) has achieved breakthroughs in a variety of decision-making problems. For manufacturing scheduling problems, this paper summarizes the designs of states and actions, sorts out RL-based scheduling algorithms, reviews the applications of RL to different types of scheduling problems, and discusses the fusion modes of RL and meta-heuristics. Finally, we analyze the open problems in current research and point out future research directions and key topics to promote the research and application of RL-based scheduling optimization.

Keywords: Reinforcement Learning (RL), manufacturing scheduling, scheduling optimization
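
To make the state, action, and reward designs discussed in this review concrete, the following minimal sketch shows a tabular Q-learning agent that chooses among classical dispatching rules (SPT, LPT, EDD) for a single-machine total-tardiness instance. It is an illustrative example only, not a method from any of the surveyed papers: the state features, reward, and hyperparameters are assumptions chosen for readability.

# Illustrative sketch only (not from the surveyed works): a tabular Q-learning
# agent that picks a dispatching rule for a single-machine total-tardiness
# problem. State features, reward, and hyperparameters are assumptions.
import random

RULES = ["SPT", "LPT", "EDD"]  # action space: classical dispatching rules


def make_jobs(n, seed):
    rng = random.Random(seed)
    # Each job is (processing time, due date).
    return [(rng.randint(1, 9), rng.randint(5, 60)) for _ in range(n)]


def state_of(queue, t):
    # Coarse state: (queue-length bucket, average-slack bucket).
    slack = sum(max(d - t, 0) for _, d in queue) / len(queue)
    return (min(len(queue) // 3, 3), min(int(slack) // 10, 3))


def pick_job(queue, rule):
    if rule == "SPT":
        return min(queue, key=lambda j: j[0])  # shortest processing time
    if rule == "LPT":
        return max(queue, key=lambda j: j[0])  # longest processing time
    return min(queue, key=lambda j: j[1])      # earliest due date


def episode(Q, jobs, eps=0.1, alpha=0.1, gamma=0.95):
    queue, t, total_tardiness = list(jobs), 0, 0
    while queue:
        s = state_of(queue, t)
        if random.random() < eps:
            a = random.randrange(len(RULES))  # explore
        else:
            a = max(range(len(RULES)), key=lambda i: Q.get((s, i), 0.0))  # exploit
        job = pick_job(queue, RULES[a])
        queue.remove(job)
        t += job[0]
        step_tardiness = max(t - job[1], 0)
        total_tardiness += step_tardiness
        r = -step_tardiness  # reward: negative tardiness of the job just scheduled
        if queue:
            s2 = state_of(queue, t)
            best_next = max(Q.get((s2, i), 0.0) for i in range(len(RULES)))
        else:
            best_next = 0.0
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best_next - Q.get((s, a), 0.0))
    return total_tardiness


Q = {}
for ep in range(2000):  # train on small random instances
    episode(Q, make_jobs(12, seed=ep % 50))
print("greedy-policy total tardiness:", episode(Q, make_jobs(12, seed=0), eps=0.0, alpha=0.0))

In the deep RL studies covered by the review, the tabular Q-function is typically replaced by a neural network and richer state features and action spaces are used, but the state, action, and reward decomposition sketched above remains the common starting point.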



Publication history

Received: 21 October 2021
Accepted: 22 November 2021
Published: 31 December 2021
Issue date: December 2021

Copyright

© The author(s) 2021

Acknowledgements

This work was supported in part by the National Science Fund for Distinguished Young Scholars of China (No. 61525304) and the National Natural Science Foundation of China (No. 61873328).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
