
Deep Reinforcement Learning Object Tracking Based on Actor-Double Critic Network

Jing Xin1, Jianglei Zhou1, Xinhong Hei2 (corresponding author), Pengyu Yue1, and Jia Zhao1
1 School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China
2 School of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China

Abstract

To address the poor tracking robustness of deep learning object tracking algorithms under severe occlusion, deformation, and object rotation in complex scenes, an improved deep reinforcement learning object tracking algorithm based on an actor-double critic network is proposed. In the offline training phase, the actor network moves the rectangular box representing the object's location according to the input image sequence, producing an action value, i.e., the horizontal, vertical, and scale transformation of the object. The designed double critic network then evaluates this action, and the two output Q values are averaged to guide the actor network in optimizing its tracking strategy. The double critic design effectively improves training stability and convergence, and tracking performance improves markedly, especially in challenging scenes such as object occlusion. In the online tracking phase, the trained actor network infers the change of the bounding box and directly moves the box to the object's position in the current frame. Comparative tracking experiments on the OTB100 visual tracking benchmark show that denser reward settings significantly increase the probability that the actor network outputs positive actions. As a result, the tracking algorithm proposed in this paper outperforms mainstream deep reinforcement learning and deep learning tracking algorithms under challenging attributes such as occlusion, deformation, and rotation.

Keywords: deep reinforcement learning, object tracking, actor-double critic network
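
To make the two-phase procedure in the abstract concrete, the sketch below shows one plausible way to wire an actor-double-critic update in PyTorch. It is a minimal illustration under stated assumptions, not the authors' implementation: the 512-dimensional state feature, the bounded (dx, dy, dscale) action parameterization, the network widths, the 0.1 scale step, and the learning rate are all invented for the example.

```python
import torch
import torch.nn as nn

# Minimal actor-double-critic sketch (illustrative assumptions throughout;
# this is NOT the authors' released code). The state is a feature vector
# extracted around the current bounding box; the action is a continuous
# (dx, dy, dscale) move that shifts and rescales the box.

STATE_DIM, ACTION_DIM = 512, 3  # assumed sizes


class Actor(nn.Module):
    """Maps a state to a bounded box transformation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM), nn.Tanh(),  # keep moves in [-1, 1]
        )

    def forward(self, state):
        return self.net(state)


class Critic(nn.Module):
    """Maps a (state, action) pair to a scalar Q value."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


actor, critic1, critic2 = Actor(), Critic(), Critic()
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)  # assumed lr


def actor_update(state_batch):
    """Offline-training step for the actor: both critics score the actor's
    action and their Q values are averaged, as the abstract describes."""
    action = actor(state_batch)
    q_mean = 0.5 * (critic1(state_batch, action) + critic2(state_batch, action))
    loss = -q_mean.mean()  # gradient ascent on the averaged Q value
    actor_opt.zero_grad()
    loss.backward()
    actor_opt.step()
    return loss.item()


@torch.no_grad()
def track_step(state, box):
    """Online tracking: the trained actor directly moves the box.
    box is (x, y, w, h); the 0.1 scale step is an assumed damping factor."""
    dx, dy, ds = actor(state).squeeze(0).tolist()
    x, y, w, h = box
    return (x + dx * w, y + dy * h, w * (1.0 + 0.1 * ds), h * (1.0 + 0.1 * ds))
```

One design point worth noting: the abstract specifies averaging the two Q values to guide the actor, rather than taking their minimum as TD3's clipped double-Q update does. Two critics contributing a single smoothed estimate reduce the variance of the signal the actor follows, which is consistent with the claimed gains in stability and convergence.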


Publication history

Received: 16 December 2022
Revised: 06 April 2023
Accepted: 28 April 2023
Published: 30 June 2023
Issue date: December 2023

Copyright

© The author(s) 2023.

Acknowledgements

This work was supported in part by the National Key R&D Program of China (No. 2022YFB2602203), in part by the National Natural Science Foundation of China (Nos. U20A20225 and 61873200), and in part by the Shaanxi Provincial Key Research and Development Program (No. 2022-GY111).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
