Edge Information Systems (EIS) have recently received considerable attention. In EIS, Distributed Machine Learning (DML), which requires fewer computing resources, can implement many artificial intelligence applications efficiently. However, because the network topology is dynamic and the transmission quality at the edge fluctuates, worker node selection strongly affects the performance of DML. In this paper, we focus on the Internet of Vehicles (IoV), a typical EIS scenario, and take DML-based High Definition (HD) mapping and intelligent driving decision models as examples. The worker selection problem is modeled as a Markov Decision Process (MDP) that maximizes the aggregate performance of the DML model, which depends on the timeliness of the local model, the transmission quality of the model parameter uploads, and the effective sensing area of the worker. We propose a Deep Reinforcement Learning (DRL) based solution, called the Worker Selection based on Policy Gradient (PG-WS) algorithm. The policy that maps the system state to the worker selection action is represented by a deep neural network. Episodic simulations are built, and the REINFORCE algorithm with baseline is used to train the policy network. Results show that the proposed PG-WS algorithm outperforms the comparison methods.
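To make the training idea in the abstract concrete, the following is a minimal sketch of REINFORCE with a baseline applied to a toy worker selection task. It is not the PG-WS algorithm itself: the deep policy network is replaced by a linear softmax policy, and the three per-worker features, the reward (a plain sum of the selected worker's features), the function names `run_episode` and `train`, and all hyperparameters are assumptions made only for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def run_episode(theta, rng, n_workers, horizon):
    """Roll out one episode; return per-step rewards and grad-log-probabilities."""
    rewards, grads = [], []
    for _ in range(horizon):
        # Hypothetical per-worker features in [0, 1]:
        # [local model timeliness, transmission quality, effective sensing area]
        feats = rng.random((n_workers, 3))
        probs = softmax(feats @ theta)
        a = rng.choice(n_workers, p=probs)  # sample one worker from the policy
        rewards.append(feats[a].sum())      # assumed toy reward, not the paper's
        # Gradient of log pi(a|s) for a linear softmax policy:
        # phi(a) - sum_b pi(b|s) * phi(b)
        grads.append(feats[a] - probs @ feats)
    return np.array(rewards), np.array(grads)

def train(n_episodes=1500, n_workers=5, horizon=10, lr=0.05, gamma=0.9, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(3)           # policy parameters (one weight per feature)
    baseline = np.zeros(horizon)  # running mean of the return at each time step
    for _ in range(n_episodes):
        rewards, grads = run_episode(theta, rng, n_workers, horizon)
        # Discounted return G_t, computed backwards through the episode
        returns = np.zeros(horizon)
        g = 0.0
        for t in reversed(range(horizon)):
            g = rewards[t] + gamma * g
            returns[t] = g
        # REINFORCE update with baseline: sum_t (G_t - b_t) * grad log pi(a_t|s_t)
        # (the gamma^t weighting of the textbook update is omitted here)
        advantages = returns - baseline
        theta += lr * (advantages @ grads) / horizon
        # Move the baseline towards the observed returns
        baseline += 0.05 * (returns - baseline)
    return theta
```

Calling `theta = train()` returns the learned feature weights. Because the assumed reward grows with every feature, the weights should become positive, meaning the policy learns to prefer workers with fresher models, better links, and larger sensing areas. Subtracting the baseline leaves the gradient estimate unbiased while reducing its variance.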
This work was supported by the Science and Technology Foundation of Beijing Municipal Commission of Education (No. KM201810005027), the National Natural Science Foundation of China (No. U1633115), and the Beijing Natural Science Foundation (No. L192002).
© All articles included in the journal are copyrighted to the ITU and TUP. This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/.