
Reinforcement learning, as a form of autonomous learning, is driving artificial intelligence (AI) toward practical applications. Having demonstrated the potential to significantly improve on synchronous parallel learning, the parallel-computing-based asynchronous advantage actor-critic (A3C) opens a new door for reinforcement learning. Unfortunately, the influence of this acceleration on A3C's robustness has been largely overlooked. In this paper, we perform the first robustness assessment of A3C based on parallel computing. By perceiving the policy's actions, we construct a global matrix of action probability deviation and define two novel measures, skewness and sparseness, which together form an integral robustness measure. Building on this static assessment, we then develop a dynamic robustness assessment algorithm through situational whole-space state sampling over changing episodes. Extensive experiments with different combinations of agent number and learning rate are conducted on an A3C-based pathfinding application, demonstrating that the proposed assessment can effectively measure the robustness of A3C, achieving an accuracy of 83.3%.
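
The abstract does not spell out the formulas behind the deviation matrix or the skewness and sparseness measures. The Python sketch below is therefore only an illustration of how such a static assessment could be assembled: it uses sample skewness and a Hoyer-style sparseness as stand-ins, and the names deviation_matrix, robustness_score, the reference-policy baseline, and the combination weights are assumptions rather than the authors' definitions.

    # Hypothetical sketch of a static robustness assessment over a
    # states x actions matrix of action probability deviations.
    # The specific statistics and weights are illustrative stand-ins.
    import numpy as np

    def deviation_matrix(policy_probs, reference_probs):
        """Per-state, per-action deviation of the learned policy's
        action probabilities from a reference policy
        (both arrays have shape: n_states x n_actions)."""
        return np.abs(policy_probs - reference_probs)

    def skewness(values):
        """Sample skewness of the flattened deviation values."""
        v = values.ravel()
        mu, sigma = v.mean(), v.std()
        if sigma == 0:
            return 0.0
        return float(np.mean(((v - mu) / sigma) ** 3))

    def sparseness(values):
        """Hoyer-style sparseness in [0, 1]; 1 = maximally sparse."""
        v = values.ravel()
        n = v.size
        l1, l2 = np.abs(v).sum(), np.linalg.norm(v)
        if l2 == 0:
            return 1.0
        return float((np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1))

    def robustness_score(policy_probs, reference_probs,
                         w_skew=0.5, w_sparse=0.5):
        """Integral measure combining the two statistics;
        the weights are arbitrary placeholders."""
        d = deviation_matrix(policy_probs, reference_probs)
        return w_skew * skewness(d) + w_sparse * sparseness(d)

    # Toy usage: 100 sampled states, 4 discrete actions.
    rng = np.random.default_rng(0)
    policy = rng.dirichlet(np.ones(4), size=100)
    reference = rng.dirichlet(np.ones(4), size=100)
    print(robustness_score(policy, reference))

The dynamic assessment described in the paper would, under the same assumptions, repeat such a computation over states sampled across changing episodes rather than over a single fixed batch.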

Publication history

Received: 12 December 2020
Accepted: 26 July 2021
Published: 30 September 2021
Issue date: September 2021

Copyright

© Institute of Computing Technology, Chinese Academy of Sciences 2021