
Deep Reinforcement Learning Object Tracking Based on Actor-Double Critic Network

Jing Xin1, Jianglei Zhou1, Xinhong Hei2 (corresponding author), Pengyu Yue1, and Jia Zhao1
1 School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China
2 School of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China

Abstract

To address the poor tracking robustness of deep learning object tracking algorithms under severe occlusion, deformation, and object rotation in complex scenes, an improved deep reinforcement learning object tracking algorithm based on an actor-double critic network is proposed. In the offline training phase, the actor network moves the rectangular box representing the object's location according to the input image sequence, producing an action value, i.e., the horizontal, vertical, and scale transformation of the object. The designed double critic network then evaluates this action, and the two output Q values are averaged to guide the actor network in optimizing its tracking strategy. The double critic design effectively improves training stability and convergence, and tracking performance improves markedly, especially in challenging scenes such as object occlusion. In the online tracking phase, the trained actor network infers the change of the bounding box and directly moves the box to the object's position in the current frame. Comparative tracking experiments on the OTB100 visual tracking benchmark show that denser reward settings significantly increase the probability that the actor network outputs positive actions. As a result, the tracking algorithm proposed in this paper outperforms mainstream deep reinforcement learning and deep learning tracking algorithms under challenging attributes such as occlusion, deformation, and rotation.

Keywords: deep reinforcement learning, object tracking, actor-double critic network
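
To make the two-phase procedure in the abstract concrete, the sketch below shows one plausible way to wire an actor-double-critic update in PyTorch. It is a minimal illustration under stated assumptions, not the authors' implementation: the 512-dimensional state feature, the bounded (dx, dy, dscale) action parameterization, the network widths, the 0.1 scale step, and the learning rate are all invented for the example.

```python
import torch
import torch.nn as nn

# Minimal actor-double-critic sketch (illustrative assumptions throughout;
# this is NOT the authors' released code). The state is a feature vector
# extracted around the current bounding box; the action is a continuous
# (dx, dy, dscale) move that shifts and rescales the box.

STATE_DIM, ACTION_DIM = 512, 3  # assumed sizes


class Actor(nn.Module):
    """Maps a state to a bounded box transformation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM), nn.Tanh(),  # keep moves in [-1, 1]
        )

    def forward(self, state):
        return self.net(state)


class Critic(nn.Module):
    """Maps a (state, action) pair to a scalar Q value."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


actor, critic1, critic2 = Actor(), Critic(), Critic()
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)  # assumed lr


def actor_update(state_batch):
    """Offline-training step for the actor: both critics score the actor's
    action and their Q values are averaged, as the abstract describes."""
    action = actor(state_batch)
    q_mean = 0.5 * (critic1(state_batch, action) + critic2(state_batch, action))
    loss = -q_mean.mean()  # gradient ascent on the averaged Q value
    actor_opt.zero_grad()
    loss.backward()
    actor_opt.step()
    return loss.item()


@torch.no_grad()
def track_step(state, box):
    """Online tracking: the trained actor directly moves the box.
    box is (x, y, w, h); the 0.1 scale step is an assumed damping factor."""
    dx, dy, ds = actor(state).squeeze(0).tolist()
    x, y, w, h = box
    return (x + dx * w, y + dy * h, w * (1.0 + 0.1 * ds), h * (1.0 + 0.1 * ds))
```

One design point worth noting: the abstract specifies averaging the two Q values to guide the actor, rather than taking their minimum as TD3's clipped double-Q update does. Two critics contributing a single smoothed estimate reduce the variance of the signal the actor follows, which is consistent with the claimed gains in stability and convergence.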


Publication history

Received: 16 December 2022
Revised: 06 April 2023
Accepted: 28 April 2023
Published: 30 June 2023
Issue date: December 2023

Copyright

© The author(s) 2023.

Acknowledgements

This work was supported in part by the National Key R&D Program of China (No. 2022YFB2602203), in part by the National Natural Science Foundation of China (Nos. U20A20225 and 61873200), and in part by the Shaanxi Provincial Key Research and Development Program (No. 2022-GY111).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
