Research Article | Open Access

Bidirectional Q-learning for recycling path planning of used appliances under strong and weak constraints

Yang Qi (a), Jinxin Cao (a,b) (corresponding author), Baijing Wu (c)
(a) Institute of Transportation Engineering, Inner Mongolia University, Hohhot, 010010, China
(b) Inner Mongolia Academy of Science and Technology, Hohhot, 010010, China
(c) Institute of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou, 730070, China

Abstract

With continuous innovation in household appliance technology and rising living standards, the volume of discarded household appliances has increased rapidly, making their recycling increasingly important. Traditional path planning algorithms struggle to balance efficiency and constraints in the multi-objective, multi-constraint problem posed by discarded-appliance recycling routes. To tackle this issue, this study introduces a bidirectional Q-learning-based path planning algorithm. By developing a bidirectional Q-learning mechanism and improving the Q-table initialization, the algorithm optimizes recycling routes efficiently, updating the state-action value function from both the starting point and the target point. Additionally, a hierarchical reinforcement learning strategy and guided rewards are introduced to reduce blind exploration and accelerate convergence: by decomposing the complex recycling task into multiple sub-tasks and seeking high-performing paths at each sub-task level, the blindness of early exploration is reduced. To validate the proposed algorithm, grid-based models of real-world environments are built. Comparative experiments show significant improvements in iteration counts and path lengths, confirming the algorithm's practical applicability to path planning for recycling operations.
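
The bidirectional-update and guided-reward ideas in the abstract can be made concrete with a small grid-world sketch. The following Python snippet is a minimal illustration under assumed details, not the authors' implementation: it maintains two tabular Q-functions, one updated by an agent exploring from the start toward the goal and one by an agent exploring from the goal toward the start, with a Manhattan-distance shaping term standing in for the guided reward. All names, parameters, and the shaping scheme are hypothetical, and the paper's hierarchical sub-task decomposition and improved initialization are not reproduced here.

```python
import numpy as np

GRID = 10                                      # grid-world side length (assumed)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1              # learning rate, discount, exploration (assumed)
START, GOAL = (0, 0), (GRID - 1, GRID - 1)

# Two state-action value tables: one grown from the start toward the goal,
# one grown from the goal toward the start.
q_fwd = np.zeros((GRID, GRID, len(ACTIONS)))
q_bwd = np.zeros((GRID, GRID, len(ACTIONS)))

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def step(state, a, target):
    """Apply action a, clamp to the grid, return (next_state, reward, done).
    The reward adds a guided (shaping) bonus for moving closer to target."""
    nxt = (min(max(state[0] + ACTIONS[a][0], 0), GRID - 1),
           min(max(state[1] + ACTIONS[a][1], 0), GRID - 1))
    shaping = 0.1 * (manhattan(state, target) - manhattan(nxt, target))
    reward = 10.0 if nxt == target else shaping - 0.01
    return nxt, reward, nxt == target

def choose(q, s):
    """Epsilon-greedy action selection."""
    if np.random.rand() < EPS:
        return np.random.randint(len(ACTIONS))
    return int(np.argmax(q[s]))

def backup(q, s, a, r, s2):
    """Standard one-step Q-learning update."""
    q[s][a] += ALPHA * (r + GAMMA * q[s2].max() - q[s][a])

for episode in range(500):
    s_f, s_b = START, GOAL                     # the two agents start at opposite ends
    done_f = done_b = False
    for _ in range(4 * GRID * GRID):
        if not done_f:                         # forward search: start -> goal
            a = choose(q_fwd, s_f)
            s2, r, done_f = step(s_f, a, GOAL)
            backup(q_fwd, s_f, a, r, s2)
            s_f = s2
        if not done_b:                         # backward search: goal -> start
            a = choose(q_bwd, s_b)
            s2, r, done_b = step(s_b, a, START)
            backup(q_bwd, s_b, a, r, s2)
            s_b = s2
        if done_f and done_b:
            break
```

Because value estimates grow from both ends of the route, states near the middle of the path receive useful backups much earlier than under one-directional Q-learning; once the two explorations overlap, their partial paths can be stitched together. This is the intuition behind the reduction in blind early exploration that the abstract reports.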

Communications in Transportation Research
Article number: 100153
Cite this article:
Qi Y, Cao J, Wu B. Bidirectional Q-learning for recycling path planning of used appliances under strong and weak constraints. Communications in Transportation Research, 2024, 4(4): 100153. https://doi.org/10.1016/j.commtr.2024.100153

Received: 07 July 2024
Revised: 08 September 2024
Accepted: 08 September 2024
Published: 27 November 2024
© 2024 The Authors.

This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
