[8]
O. Vinyals, M. Fortunato, and N. Jaitly, Pointer networks, in Proc. 28th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2015, pp. 2692–2700.
[9]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, Attention is all you need, in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 6000–6010.
[11]
T. N. Kipf and M. Welling, Semi-supervised classification with graph convolutional networks, arXiv preprint arXiv:1609.02907, 2017.
[16]
R. S. Sutton, D. McAllester, S. Singh, and Y. Mansour, Policy gradient methods for reinforcement learning with function approximation, in Proc. 12th Int. Conf. Neural Information Processing Systems, Denver, CO, USA, 1999, pp. 1057–1063.
[17]
I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3104–3112.
[18]
J. Gu, Z. Lu, H. Li, and V. O. K. Li, Incorporating copying mechanism in sequence-to-sequence learning, in Proc. 54th Annu. Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, 2016, pp. 1631–1640.
[19]
I. Bello, H. Pham, Q. V. Le, M. Norouzi, and S. Bengio, Neural combinatorial optimization with reinforcement learning, arXiv preprint arXiv:1611.09940, 2017.
[20]
M. Nazari, A. Oroojlooy, M. Takáč, and L. V. Snyder, Reinforcement learning for solving the vehicle routing problem, in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2018, pp. 9861–9871.
[21]
M. Deudon, P. Cournut, A. Lacoste, Y. Adulyasak, and L. M. Rousseau, Learning heuristics for the TSP by policy gradient, in Proc. 15th Int. Conf. Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Delft, The Netherlands, 2018, pp. 170–181.
[22]
W. Kool, H. van Hoof, and M. Welling, Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475, 2019.
[24]
Y. D. Kwon, J. Choo, B. Kim, I. Yoon, Y. Gwon, and S. Min, POMO: Policy optimization with multiple optima for reinforcement learning, in Proc. 34th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 21188–21198.
[25]
M. Kim, J. Park, and J. Park, Sym-NCO: Leveraging symmetricity for neural combinatorial optimization, arXiv preprint arXiv:2205.13209, 2023.
[26]
A. Hottung, Y. D. Kwon, and K. Tierney, Efficient active search for combinatorial optimization problems, arXiv preprint arXiv:2106.05126, 2022.
[27]
J. Choo, Y. D. Kwon, J. Kim, J. Jae, A. Hottung, K. Tierney, and Y. Gwon, Simulation-guided beam search for neural combinatorial optimization, arXiv preprint arXiv:2207.06190, 2022.
[28]
Q. Ma, S. Ge, D. He, D. Thaker, and I. Drori, Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning, arXiv preprint arXiv:1911.04936, 2019.
[29]
M. Kim, J. Park, and J. Kim, Learning collaborative policies to solve NP-hard routing problems, arXiv preprint arXiv:2110.13987, 2021.
[31]
Q. Hou, J. Yang, Y. Su, X. Wang, and Y. Deng, Generalize learned heuristics to solve large-scale vehicle routing problems in real-time, in Proc. 11th Int. Conf. Learning Representations, Kigali, Rwanda, 2023, pp. 1–13.
[34]
H. Dai, E. B. Khalil, Y. Zhang, B. Dilkina, and L. Song, Learning combinatorial optimization algorithms over graphs, in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 6351–6361.
[35]
S. Manchanda, A. Mittal, A. Dhawan, S. Medya, S. Ranu, and A. Singh, Learning heuristics over large graphs via deep reinforcement learning, arXiv preprint arXiv:1903.03332, 2020.
[36]
A. Nowak, S. Villar, A. S. Bandeira, and J. Bruna, Revised note on learning algorithms for quadratic assignment with graph neural networks, arXiv preprint arXiv:1706.07450, 2018.
[37]
Z. Li, Q. Chen, and V. Koltun, Combinatorial optimization with graph convolutional networks and guided tree search, in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2018, pp. 537–546.
[38]
C. K. Joshi, T. Laurent, and X. Bresson, An efficient graph convolutional network technique for the travelling salesman problem, arXiv preprint arXiv:1906.01227, 2019.
[41]
W. Kool, H. van Hoof, J. Gromicho, and M. Welling, Deep policy dynamic programming for vehicle routing problems, in Proc. 19th Int. Conf. Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Los Angeles, CA, USA, 2022, pp. 190–213.
[42]
B. Hudson, Q. Li, M. Malencia, and A. Prorok, Graph neural network guided local search for the traveling salesperson problem, arXiv preprint arXiv:2110.05291, 2022.
[43]
L. F. R. Ribeiro, P. H. P. Saverese, and D. R. Figueiredo, struc2vec: Learning node representations from structural identity, in Proc. 23rd ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Halifax, Canada, 2017, pp. 385–394.
[46]
K. Helsgaun, An Extension of the Lin-Kernighan-Helsgaun TSP Solver for Constrained Traveling Salesman and Vehicle Routing Problems. Roskilde, Denmark: Roskilde University, 2017.
[47]
X. Chen and Y. Tian, Learning to perform local rewriting for combinatorial optimization, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 6281–6292.
[48]
H. Lu, X. Zhang, and S. Yang, A learning-based iterative method for solving vehicle routing problems, in Proc. 8th Int. Conf. Learning Representations, Addis Ababa, Ethiopia, 2020, pp. 1–15.
[49]
L. Gao, M. Chen, Q. Chen, G. Luo, N. Zhu, and Z. Liu, Learn to design the heuristics for vehicle routing problem, arXiv preprint arXiv:2002.08539, 2020.
[51]
Y. Ma, J. Li, Z. Cao, W. Song, L. Zhang, Z. Chen, and J. Tang, Learning to iteratively solve routing problems with dual-aspect collaborative transformer, in Proc. 35th Int. Conf. Neural Information Processing Systems, Virtual Event, 2021, pp. 11096–11107.
[53]
L. Xin, W. Song, Z. Cao, and J. Zhang, NeuroLKH: Combining deep learning model with Lin-Kernighan-Helsgaun heuristic for solving the traveling salesman problem, in Proc. 35th Int. Conf. Neural Information Processing Systems, Virtual Event, 2021, pp. 7472–7483.
[54]
H. Mao, M. Alizadeh, I. Menache, and S. Kandula, Resource management with deep reinforcement learning, in Proc. 15th ACM Workshop on Hot Topics in Networks, Atlanta, GA, USA, 2016, pp. 50–56.
[55]
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, Proximal policy optimization algorithms, arXiv preprint arXiv:1707.06347, 2017.
[57]
M. Gasse, D. Chételat, N. Ferroni, L. Charlin, and A. Lodi, Exact combinatorial optimization with graph convolutional neural networks, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 15580–15592.
[59]
Q. Han, L. Yang, Q. Chen, X. Zhou, D. Zhang, A. Wang, R. Sun, and X. Luo, A GNN-guided predict-and-search framework for mixed-integer linear programming, arXiv preprint arXiv:2302.05636, 2023.
[60]
H. Sun, W. Chen, H. Li, and L. Song, Improving learning to branch via reinforcement learning, in Proc. 1st Workshop on Learning Meets Combinatorial Algorithms, Vancouver, Canada, 2020, pp. 1–12.
[61]
T. Zhang, A. Banitalebi-Dehkordi, and Y. Zhang, Deep reinforcement learning for exact combinatorial optimization: Learning to branch, in Proc. 26th Int. Conf. Pattern Recognition, Montreal, Canada, 2022, pp. 3105–3111.
[64]
Y. Bengio, J. Louradour, R. Collobert, and J. Weston, Curriculum learning, in Proc. 26th Annu. Int. Conf. Machine Learning, Montreal, Canada, 2009, pp. 41–48.
[65]
J. Bi, Y. Ma, J. Wang, Z. Cao, J. Chen, Y. Sun, and Y. M. Chee, Learning generalizable models for vehicle routing problems via knowledge distillation, arXiv preprint arXiv:2210.07686, 2023.
[72]
X. Lin, Z. Yang, and Q. Zhang, Pareto set learning for neural multi-objective combinatorial optimization, in Proc. 10th Int. Conf. Learning Representations, Virtual Event, 2022, pp. 1–14.
[77]
K. Li, T. Zhang, R. Wang, and L. Wang, Deep reinforcement learning for online routing of unmanned aerial vehicles with wireless power transfer, arXiv preprint arXiv:2204.11477, 2022.