Volume 29, Issue 5

Solving Combinatorial Optimization Problems with Deep Neural Network: A Survey

Feng Wang, Qi He, and Shicheng Li
School of Computer Science, Wuhan University, Wuhan 430072, China

Abstract

Combinatorial Optimization Problems (COPs) are a class of optimization problems commonly encountered in industrial production and everyday life. Over the last few decades, traditional algorithms, such as exact algorithms, approximation algorithms, and heuristic algorithms, have been proposed to solve COPs. However, as real-world COPs grow more complex, traditional algorithms struggle to generate optimal solutions within a limited amount of time. Since Deep Neural Networks (DNNs) do not depend heavily on expert knowledge and are sufficiently flexible to generalize to various COPs, numerous DNN-based algorithms have been proposed over the last decade for solving COPs. Herein, we categorize these algorithms into four classes and provide a brief overview of their applications to real-world problems.
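To make the constructive DNN-based approach concrete: pointer-network-style solvers build a tour step by step, scoring the unvisited nodes at each step and selecting one. The sketch below imitates only that decoding loop on a toy TSP instance; the hand-coded `neg_distance` score is a hypothetical stand-in for the attention scores a trained model would produce, and all names here are illustrative, not from the surveyed papers.

```python
import math

def greedy_construct(coords, score):
    """Build a tour by repeatedly picking the highest-scoring unvisited
    city, mimicking the greedy decoding loop of pointer-network-style
    constructive solvers (a learned model would supply `score`)."""
    n = len(coords)
    tour = [0]          # start arbitrarily at city 0
    visited = {0}
    while len(tour) < n:
        cur = tour[-1]
        # mask already-visited cities, then take the argmax score
        best = max((c for c in range(n) if c not in visited),
                   key=lambda c: score(coords[cur], coords[c]))
        tour.append(best)
        visited.add(best)
    return tour

def neg_distance(a, b):
    # stand-in for a learned attention score: closer city = higher score
    return -math.dist(a, b)

def tour_length(coords, tour):
    # total length of the closed tour
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = greedy_construct(cities, neg_distance)
print(tour, tour_length(cities, tour))
```

With a learned scoring function, the same loop becomes the decoder of the neural constructive methods surveyed here; the surveyed work differs mainly in how the score is parameterized (attention, GNN embeddings) and trained (supervised or reinforcement learning).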

Keywords: Graph Neural Network (GNN), Transformer, Reinforcement Learning (RL), pointer network, Combinatorial Optimization Problems (COPs)



Publication history

Received: 18 May 2023
Revised: 15 July 2023
Accepted: 26 July 2023
Published: 02 May 2024
Issue date: October 2024

Copyright

© The Author(s) 2024.

Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 62173258 and 61773296).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).