References
[1]
L. Li, Y. Chai, and Y. Liu, Evolution of e-commerce patterns: Model and economic analysis, (in Chinese), Journal of Tsinghua University (Science and Technology), vol. 52, no. 11, pp. 1524–1529, 2012.
[2]
X. Liu and Y. Li, VRP model and a heuristic algorithm for across-region distribution in the environment of E-commerce, (in Chinese), Journal of Tsinghua University (Science and Technology), vol. 46, pp. 1014–1018, 2006.
[3]
A. Seghezzi, M. Winkenbach, and R. Mangiaracina, On-demand food delivery: A systematic literature review, Int. J. Logist. Manag.
[5]
C. Li and L. Miao, Planning methods of regional logistics systems and logistics parks, (in Chinese), Journal of Tsinghua University (Science and Technology), vol. 44, no. 3, pp. 398–401, 2004.
[6]
X. Wang, L. Wang, C. Dong, H. Ren, and K. Xing, An online deep reinforcement learning-based order recommendation framework for rider-centered food delivery system, IEEE Trans. Intell. Transp. Syst., vol. 24, no. 5, pp. 5640–5654, 2023.
[7]
E. Jiang, L. Wang, and J. Wang, Decomposition-based multi-objective optimization for energy-aware distributed hybrid flow shop scheduling with multiprocessor tasks, Tsinghua Science and Technology, vol. 26, no. 5, pp. 646–663, 2021.
[9]
B. Yildiz and M. Savelsbergh, Provably high-quality solutions for the meal delivery routing problem, Transp. Sci., vol. 53, no. 5, pp. 1372–1388, 2019.
[10]
M. W. Ulmer, B. W. Thomas, A. M. Campbell, and N. Woyak, The restaurant meal delivery problem: Dynamic pickup and delivery with deadlines and random ready times, Transp. Sci., vol. 55, no. 1, pp. 75–100, 2021.
[11]
S. Liu, L. He, and Z. J. M. Shen, On-time last-mile delivery: Order assignment with travel-time predictors, Manag. Sci., vol. 67, no. 7, pp. 4095–4119, 2021.
[12]
J. F. Chen, L. Wang, H. Ren, J. Pan, S. Wang, J. Zheng, and X. Wang, An imitation learning-enhanced iterated matching algorithm for on-demand food delivery, IEEE Trans. Intell. Transp. Syst., vol. 23, no. 10, pp. 18603–18619, 2022.
[13]
Z. Steever, M. Karwan, and C. Murray, Dynamic courier routing for a food delivery service, Comput. Oper. Res., vol. 107, pp. 173–188, 2019.
[14]
S. Paul, S. Rathee, J. Matthew, and K. M. Adusumilli, An optimization framework for on-demand meal delivery system, in Proc. 2020 IEEE Int. Conf. Industrial Engineering and Engineering Management (IEEM), Singapore, 2020, pp. 822–826.
[15]
M. Joshi, A. Singh, S. Ranu, A. Bagchi, P. Karia, and P. Kala, Batching and matching for food delivery in dynamic road networks, in Proc. 2021 IEEE 37th Int. Conf. Data Engineering (ICDE), Chania, Greece, 2021, pp. 2099–2104.
[16]
H. Jahanshahi, A. Bozanta, M. Cevik, E. M. Kavuk, A. Tosun, S. B. Sonuc, B. Kosucu, and A. Başar, A deep reinforcement learning approach for the meal delivery problem, Knowl. Based Syst., vol. 243, p. 108489, 2022.
[17]
L. Wang, Z. Pan, and J. Wang, A review of reinforcement learning based intelligent optimization for manufacturing scheduling, Complex System Modeling and Simulation, vol. 1, no. 4, pp. 257–270, 2021.
[18]
G. Shani, D. Heckerman, and R. I. Brafman, An MDP-based recommender system, J. Mach. Learn. Res., vol. 6, no. 43, pp. 1265–1295, 2005.
[19]
N. Taghipour and A. Kardan, A hybrid web recommender system based on Q-learning, in Proc. 2008 ACM Symp. on Applied Computing, Fortaleza, Brazil, 2008, pp. 1164–1168.
[20]
X. Bai, J. Guan, and H. Wang, A model-based reinforcement learning with adversarial training for online recommendation, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 10735–10746.
[21]
X. Xin, A. Karatzoglou, I. Arapakis, and J. M. Jose, Self-supervised reinforcement learning for recommender systems, in Proc. 43rd Int. ACM SIGIR Conf. Research and Development in Information Retrieval, Virtual Event, China, 2020, pp. 931–940.
[22]
X. Chen, C. Huang, L. Yao, X. Wang, W. Liu, and W. Zhang, Knowledge-guided deep reinforcement learning for interactive recommendation, in Proc. 2020 Int. Joint Conf. Neural Networks (IJCNN), Glasgow, UK, 2020, pp. 1–8.
[23]
X. Zhao, L. Zhang, Z. Ding, L. Xia, J. Tang, and D. Yin, Recommendations with negative feedback via pairwise deep reinforcement learning, in Proc. 24th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, London, UK, 2018, pp. 1040–1048.
[24]
Y. Deng, F. Bao, Y. Kong, Z. Ren, and Q. Dai, Deep direct reinforcement learning for financial signal representation and trading, IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 3, pp. 653–664, 2017.
[25]
L. Zou, L. Xia, Z. Ding, J. Song, W. Liu, and D. Yin, Reinforcement learning to optimize long-term user engagement in recommender systems, in Proc. 25th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, Anchorage, AK, USA, 2019, pp. 2810–2818.
[26]
X. Wang, L. Wang, S. Wang, J. F. Chen, and C. Wu, An XGBoost-enhanced fast constructive algorithm for food delivery route planning problem, Comput. Ind. Eng., vol. 152, p. 107029, 2021.
[27]
Y. Tang, L. Li, and X. Liu, State-of-the-art development of complex systems and their simulation methods, Complex System Modeling and Simulation, vol. 1, no. 4, pp. 271–290, 2021.
[28]
H. Salehinejad, S. Sankar, J. Barfett, E. Colak, and S. Valaee, Recent advances in recurrent neural networks, arXiv preprint arXiv:1801.01078, 2017.
[29]
S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[30]
D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, Deterministic policy gradient algorithms, in Proc. 31st Int. Conf. Machine Learning, Beijing, China, 2014, pp. 387–395.
[31]
C. M. Bishop, Pattern Recognition and Machine Learning. New York, NY, USA: Springer, 2006.
[32]
T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, Continuous control with deep reinforcement learning, arXiv preprint arXiv:1509.02971, 2015.