[1]
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, et al., An image is worth 16×16 words: Transformers for image recognition at scale, in Proc. Int. Conf. on Learning Representations, Virtual, 2021.
[2]
C. F. R. Chen, Q. Fan, and R. Panda, CrossViT: Cross-attention multi-scale vision transformer for image classification, in Proc. IEEE Int. Conf. on Computer Vision, Montreal, Canada, 2021, pp. 347–356.
[4]
C. J. Holder and M. Shafique, On efficient real-time semantic segmentation: A survey, arXiv preprint arXiv: 2206.08605, 2022.
[6]
X. Wang, L. Bo, and F. Li, Adaptive wing loss for robust face alignment via heatmap regression, in Proc. IEEE Int. Conf. on Computer Vision, Seoul, Republic of Korea, 2019, pp. 6970–6980.
[9]
C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, Intriguing properties of neural networks, in Proc. Int. Conf. on Learning Representations, Banff, Canada, 2014.
[11]
D. Jin, Z. Jin, J. T. Zhou, and P. Szolovits, Is BERT really robust? A strong baseline for natural language attack on text classification and entailment, in Proc. AAAI Conf. on Artificial Intelligence, New York, NY, USA, 2020, pp. 8018–8025.
[12]
J. Yu, M. Gao, H. Yin, J. Li, C. Gao, and Q. Wang, Generating reliable friends via adversarial training to improve social recommendation, in Proc. IEEE Int. Conf. on Data Mining, Beijing, China, 2019, pp. 768–777.
[13]
G. Beigi, A. Mosallanezhad, R. Guo, H. Alvari, A. Nou, and H. Liu, Privacy-aware recommendation with private-attribute protection using adversarial learning, in Proc. ACM Int. Conf. on Web Search and Data Mining, Houston, TX, USA, 2020, pp. 34–42.
[14]
I. Goodfellow, J. Shlens, and C. Szegedy, Explaining and harnessing adversarial examples, in Proc. Int. Conf. on Learning Representations, San Diego, CA, USA, 2015, pp. 7–9.
[15]
A. Kurakin, I. Goodfellow, and S. Bengio, Adversarial machine learning at scale, arXiv preprint arXiv: 1611.01236, 2016.
[16]
Y. Dong, F. Liao, T. Pang, H. Su, J. Zhu, X. Hu, and J. Li, Boosting adversarial attacks with momentum, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 9185–9193.
[17]
N. Carlini and D. Wagner, Towards evaluating the robustness of neural networks, in Proc. IEEE Symposium on Security and Privacy, San Jose, CA, USA, 2017, pp. 39–57.
[18]
H. Li, X. Xu, X. Zhang, S. Yang, and B. Li, QEBA: Query-efficient boundary-based blackbox attack, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 1221–1230.
[19]
T. Maho, T. Furon, and E. Le Merrer, SurFree: A fast surrogate-free black-box attack, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2021, pp. 10430–10439.
[20]
J. Fu, J. Sun, and G. Wang, Boosting black-box adversarial attacks with meta learning, in Proc. Chinese Control Conference, Hefei, China, 2022, pp. 7308–7313.
[23]
J. Lu, H. Sibai, and E. Fabry, Adversarial examples that fool detectors, arXiv preprint arXiv: 1712.02494, 2017.
[25]
W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, SSD: Single shot multibox detector, in Proc. European Conf. on Computer Vision, Amsterdam, the Netherlands, 2016, pp. 21–37.
[26]
J. Redmon and A. Farhadi, YOLOv3: An incremental improvement, arXiv preprint arXiv: 1804.02767, 2018.
[27]
A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, YOLOv4: Optimal speed and accuracy of object detection, arXiv preprint arXiv: 2004.10934, 2020.
[28]
R. Girshick, Fast R-CNN, in Proc. IEEE Int. Conf. on Computer Vision, Santiago, Chile, 2015, pp. 1440–1448.
[29]
S. Ren, K. He, R. Girshick, and J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, in Proc. Advances in Neural Information Processing Systems, Montreal, Canada, 2015, pp. 91–99.
[30]
A. Saha, A. Subramanya, K. Patil, and H. Pirsiavash, Role of spatial context in adversarial robustness for object detection, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 2020, pp. 784–785.
[31]
J. Bao, Sparse adversarial attack to object detection, arXiv preprint arXiv: 2012.13692, 2020.
[32]
S. Wu, T. Dai, and S.-T. Xia, DPAttack: Diffused patch attacks against universal object detection, arXiv preprint arXiv: 2010.11679, 2020.
[33]
A. Shapira, A. Zolfi, L. Demetrio, B. Biggio, and A. Shabtai, Phantom sponges: Exploiting non-maximum suppression to attack deep object detectors, in Proc. IEEE Winter Conf. on Applications of Computer Vision, Waikoloa, HI, USA, 2023, pp. 4571–4580.
[34]
C. Xie, J. Wang, Z. Zhang, Y. Zhou, L. Xie, and A. Yuille, Adversarial examples for semantic segmentation and object detection, in Proc. IEEE Int. Conf. on Computer Vision, Venice, Italy, 2017, pp. 1369–1378.
[35]
X. Wei, S. Liang, N. Chen, and X. Cao, Transferable adversarial attacks for image and video object detection, arXiv preprint arXiv: 1811.12641, 2018.
[36]
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, in Proc. Advances in Neural Information Processing Systems, Montreal, Canada, 2014.
[37]
A. Radford, L. Metz, and S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv preprint arXiv: 1511.06434, 2015.
[38]
X. Wang, X. He, J. Wang, and K. He, Admix: Enhancing the transferability of adversarial attacks, in Proc. IEEE Int. Conf. on Computer Vision, Montreal, Canada, 2021, pp. 16158–16167.
[39]
T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, Feature pyramid networks for object detection, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017, pp. 2117–2125.
[40]
K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv: 1409.1556, 2014.
[41]
K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 770–778.
[42]
C.-Y. Wang, H.-Y. M. Liao, Y.-H. Wu, P.-Y. Chen, J.-W. Hsieh, and I.-H. Yeh, CSPNet: A new backbone that can enhance learning capability of CNN, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 390–391.
[43]
C. Xiao, B. Li, J.-Y. Zhu, W. He, M. Liu, and D. Song, Generating adversarial examples with adversarial networks, arXiv preprint arXiv: 1801.02610, 2018.
[44]
P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, Image-to-image translation with conditional adversarial networks, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017, pp. 1125–1134.
[45]
P. Zhu, Y. Sun, L. Wen, Y. Feng, and Q. Hu, Drone based RGBT vehicle detection and counting: A challenge, arXiv preprint arXiv: 2003.02437, 2020.
[46]
D. Du, P. Zhu, L. Wen, X. Bian, H. Lin, Q. Hu, T. Peng, J. Zheng, X. Wang, Y. Zhang, et al., VisDrone-DET2019: The vision meets drone object detection in image challenge results, in Proc. IEEE Int. Conf. on Computer Vision Workshops, Seoul, Republic of Korea, 2019, pp. 213–226.
[47]
J. Redmon and A. Farhadi, YOLO9000: Better, faster, stronger, in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017, pp. 7263–7271.
[48]
T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, Focal loss for dense object detection, in Proc. IEEE Int. Conf. on Computer Vision, Venice, Italy, 2017, pp. 2980–2988.
[49]
K. Chen, J. Wang, J. Pang, Y. Cao, Y. Xiong, X. Li, S. Sun, W. Feng, Z. Liu, J. Xu, et al., MMDetection: Open MMLab detection toolbox and benchmark, arXiv preprint arXiv: 1906.07155, 2019.
[50]
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, Grad-CAM: Visual explanations from deep networks via gradient-based localization, in Proc. IEEE Int. Conf. on Computer Vision, Venice, Italy, 2017, pp. 618–626.