A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery

Jingfang Chen; Ling Wang; Zixiao Pan; Yuting Wu; Jie Zheng; Xuetao Ding

doi:10.26599/TST.2023.9010069

AI Chat Paper

Note: Please note that the following content is generated by AMiner AI. SciOpen does not take any responsibility related to this content.

Chat more with AI

| Sign up

Browse by Subject

Search for peer-reviewed journals with full access.

Journals A - Z

About Us

Discover the SciOpen Platform and Achieve Your Research Goals with Ease.

About Us

Publish with Us

Support

Journals A - Z

About Us

Publish with Us

Support

PDF (504.3 KB)

Cite

EndNote(RIS) BibTeX

Collect

Submit Manuscript

AI Chat Paper

Show Outline

Outline

Show full outline

Hide outline

Outline

Show full outline

Hide outline

Open Access

A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery

Jingfang Chen^¹, Ling Wang^¹(

), Zixiao Pan^¹, Yuting Wu^¹, Jie Zheng^¹, Xuetao Ding^²

1Department of Automation, Tsinghua University, Beijing 100084, China

2Department of Delivery Technology, Meituan, Beijing 100102, China

Show Author Information

Abstract

The on-demand food delivery (OFD) service has gained rapid development in the past decades but meanwhile encounters challenges for further improving operation quality. The order dispatching problem is one of the most concerning issues for the OFD platforms, which refer to dynamically dispatching a large number of orders to riders reasonably in very limited decision time. To solve such a challenging combinatorial optimization problem, an effective matching algorithm is proposed by fusing the reinforcement learning technique and the optimization method. First, to deal with the large-scale complexity, a decoupling method is designed by reducing the matching space between new orders and riders. Second, to overcome the high dynamism and satisfy the stringent requirements on decision time, a reinforcement learning based dispatching heuristic is presented. To be specific, a sequence-to-sequence neural network is constructed based on the problem characteristic to generate an order priority sequence. Besides, a training approach is specially designed to improve learning performance. Furthermore, a greedy heuristic is employed to effectively dispatch new orders according to the order priority sequence. On real-world datasets, numerical experiments are conducted to validate the effectiveness of the proposed algorithm. Statistical results show that the proposed algorithm can effectively solve the problem by improving delivery efficiency and maintaining customer satisfaction.

Keywords

reinforcement learning order dispatching on-demand delivery decoupling strategy sequence-to-sequence neural network

References

[1]

Y. Huang, Y. Chai, Y. Liu, and J. Shen, Architecture of next-generation e-commerce platform, Tsinghua Science and Technology, vol. 24, no. 1, pp. 18–29, 2019.

Crossref Google Scholar

[2]

J. Tao, H. Dai, W. Chen, and H. Jiang, The value of personalized dispatch in O2O on-demand delivery services, Eur. J. Oper. Res., vol. 304, no. 3, pp. 1022–1035, 2023.

Crossref Google Scholar

[3]

Statista, Online food delivery — worldwide, https://www.statista.com/outlook/dmo/eservices/online-food-delivery/worldwide#revenue, 2023.

[4]

D. Reyes, A. Erera, M. Savelsbergh, S. Sahasrabudhe, and R. O’Neil, The meal delivery routing problem, http://www.optimization-online.org/DB_FILE/2018/04/6571.pdf, 2018.

[5]

B. Yildiz and M. Savelsbergh, Provably high-quality solutions for the meal delivery routing problem, Transp. Sci., vol. 53, no. 5, pp. 1372–1388, 2019.

Crossref Google Scholar

[6]

M. Cosmi, G. Oriolo, V. Piccialli, and P. Ventura, Single courier single restaurant meal delivery (without routing), Oper. Res. Lett., vol. 47, no. 6, pp. 537–541, 2019.

Crossref Google Scholar

[7]

M. Cosmi, G. Nicosia, and A. Pacifici, Scheduling for last-mile meal-delivery processes, IFAC-PapersOnLine, vol. 52, no. 13, pp. 511–516, 2019.

Crossref Google Scholar

[8]

H. Yu, X. Luo, and T. Wu, Online pickup and delivery problem with constrained capacity to minimize latency, J. Comb. Optim., vol. 43, pp. 974–993, 2022.

Crossref Google Scholar

[9]

Z. Steever, M. Karwan, and C. Murray, Dynamic courier routing for a food delivery service, Comput. Oper. Res., vol. 107, pp. 173–188, 2019.

Crossref Google Scholar

[10]

K. Wang, Y. Zhou, and L. Zhang, A workload-balancing order dispatch scheme for O2O food delivery with order splitting choice, J. Theor. Appl. Electron. Commer. Res., vol. 17, no. 1, pp. 295–312, 2022.

Crossref Google Scholar

[11]

M. W. Ulmer, B. W. Thomas, A. M. Campbell, and N. Woyak, The restaurant meal delivery problem: Dynamic pickup and delivery with deadlines and random ready times, Transp. Sci., vol. 55, no. 1, pp. 75–100, 2021.

Crossref Google Scholar

[12]

S. Liu, L. He, and Z. J. Max Shen, On-time last-mile delivery: Order assignment with travel-time predictors, Manag. Sci., vol. 67, no. 7, pp. 4095–4119, 2020.

Crossref Google Scholar

[13]

A. Kohar and S. K. Jakhar, A capacitated multi pickup online food delivery problem with time windows: A branch-and-cut algorithm, Ann. Oper. Res., .

Crossref Google Scholar

[14]

Y. Liu, An optimization-driven dynamic vehicle routing algorithm for on-demand meal delivery using drones, Comput. Oper. Res., vol. 111, pp. 1–20, 2019.

Crossref Google Scholar

[15]

H. Huang, C. Hu, J. Zhu, M. Wu, and R. Malekian, Stochastic task scheduling in UAV-based intelligent on-demand meal delivery system, IEEE Trans. Intell. Transp. Syst., vol. 23, no. 8, pp. 13040–13054, 2022.

Crossref Google Scholar

[16]

J. F. Chen, L. Wang, S. Wang, X. Wang, and H. Ren, An effective matching algorithm with adaptive tie-breaking strategy for online food delivery problem, Complex Intell. Syst., vol. 8, pp. 107–128, 2022.

Crossref Google Scholar

[17]

J. F. Chen, L. Wang, H. Ren, J. Pan, S. Wang, J. Zheng, and X. Wang, An imitation learning-enhanced iterated matching algorithm for on-demand food delivery, IEEE Trans. Intell. Transp. Syst., vol. 23, no. 10, pp. 18603–18619, 2022.

Crossref Google Scholar

[18]

J. Zheng, L. Wang, L. Wang, S. Wang, J. F. Chen, and X. Wang, Solving stochastic online food delivery problem via iterated greedy algorithm with decomposition-based strategy, IEEE Trans. Syst. Man Cybern.: Syst., vol. 53, no. 2, pp. 957–969, 2023.

Crossref Google Scholar

[19]

J. F. Chen, S. Wang, L. Wang, J. Zheng, Y. Cha, J. Hao, R. He, and Z. Sun, A hybrid differential evolution algorithm for the online meal delivery problem, in Proc. 2020 IEEE Congress on Evolutionary Computation (CEC), Glasgow, UK, 2020, pp. 1–8.

Crossref Google Scholar

[20]

X. Wang, S. Wang, L. Wang, H. Zheng, J. Hao, R. He, and Z. Sun, An effective iterated greedy algorithm for online route planning problem, in Proc. 2020 IEEE Congr. Evol. Comput. (CEC), Glasgow, UK, 2020, pp. 1–8.

Crossref Google Scholar

[21]

X. Wang, L. Wang, S. Wang, J. F. Chen, and C. Wu, An XGBoost-enhanced fast constructive algorithm for food delivery route planning problem, Comput. Ind. Eng., vol. 152, p. 107029, 2021.

Crossref Google Scholar

[22]

K. K. Kottakki, S. Rathee, K. Mitra Adusumilli, J. Mathew, B. Nayak, and S. Ahuja, Customer experience driven assignment logic for online food delivery, in Proc. 2020 IEEE Int. Conf. Ind. Eng. Eng. Manag. (IEEM), Singapore, 2020, pp. 827–831.

Crossref Google Scholar

[23]

S. Paul, S. Rathee, J. Matthew, and K. M. Adusumilli, An optimization framework for on-demand meal delivery system, in Proc. 2020 IEEE Int. Conf. Ind. Eng. Eng. Manag. (IEEM), Singapore, 2020, pp. 822–826.

Crossref Google Scholar

[24]

M. Joshi, A. Singh, S. Ranu, A. Bagchi, P. Karia, and P. Kala, Batching and matching for food delivery in dynamic road networks, http://arxiv.org/abs/2008.12905, 2020.

Crossref

[25]

H. Jahanshahi, A. Bozanta, M. Cevik, E. M. Kavuk, A. Tosun, S. B. Sonuc, B. Kosucu, and A. Başar, A deep reinforcement learning approach for the meal delivery problem, Knowl. Based Syst., vol. 243, p. 108489, 2022.

Crossref Google Scholar

[26]

A. Bozanta, M. Cevik, C. Kavaklioglu, E. M. Kavuk, A. Tosun, S. B. Sonuc, A. Duranel, and A. Basar, Courier routing and assignment for food delivery service using reinforcement learning, Comput. Ind. Eng., vol. 164, p. 107871, 2022.

Crossref Google Scholar

[27]

J. Hu, H. Wu, B. Zhong, and R. Xiao, Swarm intelligence-based optimisation algorithms: An overview and future research issues, Int. J. Autom. Control, vol. 14, nos. 5&6, pp. 656–693, 2020.

Crossref Google Scholar

[28]

W. Fang, R. Min, and Q. Wang, Large-scale global optimisation using cooperative co-evolution with self-adaptive differential grouping, Int. J. Autom. Control, vol. 15, no. 1, pp. 58–77, 2021.

Crossref Google Scholar

[29]

W. Fan, P. Chen, D. Shi, X. Guo, and L. Kou, Multi-agent modeling and simulation in the AI age, Tsinghua Science and Technology, vol. 26, no. 5, pp. 608–624, 2021.

Crossref Google Scholar

[30]

F. Zhao, S. Di, J. Cao, J. Tang, and Jonrinaldi, A novel cooperative multi-stage hyper-heuristic for combination optimization problems, Complex System Modeling and Simulation, vol. 1, no. 2, pp. 91–108, 2021.

Crossref Google Scholar

[31]

L. Wang, Z. Pan, and J. Wang, A review of reinforcement learning based intelligent optimization for manufacturing scheduling, Complex System Modeling and Simulation, vol. 1, no. 4, pp. 257–270, 2021.

Crossref Google Scholar

Tsinghua Science and Technology

Volume 29 Issue 2,
April 2024

Pages 386-399

DOI: 10.26599/TST.2023.9010069

Cite this article:

Chen J, Wang L, Pan Z, et al. A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery. Tsinghua Science and Technology, 2024, 29(2): 386-399. https://doi.org/10.26599/TST.2023.9010069

412

Views

Downloads

Crossref

Web of Science

Scopus

CSCD

Google Scholar
Citation

Altmetrics

Received: 31 May 2023

Revised: 30 June 2023

Accepted: 05 July 2023

Published: 22 September 2023

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).