Volume 29, Issue 5

Solving Combinatorial Optimization Problems with Deep Neural Network: A Survey

Feng Wang, Qi He, and Shicheng Li
School of Computer Science, Wuhan University, Wuhan 430072, China

Abstract

Combinatorial Optimization Problems (COPs) are a class of optimization problems commonly encountered in industrial production and everyday life. Over the last few decades, traditional algorithms, such as exact algorithms, approximation algorithms, and heuristic algorithms, have been proposed to solve COPs. However, as real-world COPs grow more complex, traditional algorithms struggle to generate optimal solutions within a limited amount of time. Since Deep Neural Networks (DNNs) do not depend heavily on expert knowledge and are sufficiently flexible to generalize to various COPs, numerous DNN-based algorithms have been proposed over the last decade for solving COPs. Herein, we categorize these algorithms into four classes and provide a brief overview of their applications to real-world problems.
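To make the constructive DNN-based approach concrete: pointer-network-style solvers build a tour step by step, scoring the unvisited nodes at each step and selecting one. The sketch below imitates only that decoding loop on a toy TSP instance; the hand-coded `neg_distance` score is a hypothetical stand-in for the attention scores a trained model would produce, and all names here are illustrative, not from the surveyed papers.

```python
import math

def greedy_construct(coords, score):
    """Build a tour by repeatedly picking the highest-scoring unvisited
    city, mimicking the greedy decoding loop of pointer-network-style
    constructive solvers (a learned model would supply `score`)."""
    n = len(coords)
    tour = [0]          # start arbitrarily at city 0
    visited = {0}
    while len(tour) < n:
        cur = tour[-1]
        # mask already-visited cities, then take the argmax score
        best = max((c for c in range(n) if c not in visited),
                   key=lambda c: score(coords[cur], coords[c]))
        tour.append(best)
        visited.add(best)
    return tour

def neg_distance(a, b):
    # stand-in for a learned attention score: closer city = higher score
    return -math.dist(a, b)

def tour_length(coords, tour):
    # total length of the closed tour
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = greedy_construct(cities, neg_distance)
print(tour, tour_length(cities, tour))
```

With a learned scoring function, the same loop becomes the decoder of the neural constructive methods surveyed here; the surveyed work differs mainly in how the score is parameterized (attention, GNN embeddings) and trained (supervised or reinforcement learning).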

Keywords: Graph Neural Network (GNN), Transformer, Reinforcement Learning (RL), pointer network, Combinatorial Optimization Problems (COPs)



Publication history

Received: 18 May 2023
Revised: 15 July 2023
Accepted: 26 July 2023
Published: 02 May 2024
Issue date: October 2024

Copyright

© The Author(s) 2024.

Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 62173258 and 61773296).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).