Volume 27, Issue 2




Asymmetric Deep Hashing for Person Re-Identifications

Yali Zhao, Yali Li, and Shengjin Wang
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China

Abstract

The person re-identification (re-ID) community has witnessed an explosion in the scale of the data it must handle. On one hand, for efficiency, large-scale re-ID should provide constant or sublinear search time and dramatically reduce the storage cost of data points. On the other hand, the semantic affinity present in the original feature space should be preserved, because it greatly boosts re-ID accuracy. To this end, we adopt deep hashing, which exploits pairwise similarity and classification labels to learn deep hash mapping functions that yield discriminative representations. More importantly, given the advantages of asymmetric hashing over the conventional symmetric variant, we propose an asymmetric deep hashing (ADH) method for large-scale re-ID. Specifically, a two-stream asymmetric convolutional neural network is constructed to learn the similarity between image pairs. An asymmetric pairwise loss is then formulated to capture the similarity between the binary hash codes and the real-valued representations produced by the deep hash mapping functions, thereby constraining the binary codes in the Hamming space to preserve the semantic structure of the original space. Image labels are further exploited, through a classification loss, to directly influence hash function learning. Moreover, an efficient alternating algorithm is designed to jointly optimize the asymmetric deep hash functions and the high-quality binary codes, updating one set of parameters while holding the others fixed. Experiments on four benchmarks, i.e., DukeMTMC-reID, Market-1501, Market-1501+500k, and CUHK03, substantiate the competitive accuracy and superior efficiency of the proposed ADH over state-of-the-art methods for large-scale re-ID.
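The asymmetric objective sketched above, a pairwise loss between real-valued network outputs and binary database codes, optimized by alternating updates, can be illustrated with a small NumPy example. This is a minimal sketch, not the authors' implementation: the ADSH-style objective ||UBᵀ − kS||² and the one-shot sign update for B are assumptions for illustration, where U holds real-valued outputs of the hash network, B the binary codes, S the ±1 pairwise similarity matrix, and k the code length.

```python
import numpy as np

def asymmetric_pairwise_loss(U, B, S, k):
    """Squared error between inner products <u_i, b_j> and the
    scaled pairwise similarity k * S[i, j], with S[i, j] in {+1, -1}.
    U: (n, k) real-valued outputs; B: (m, k) binary codes."""
    return float(np.sum((U @ B.T - k * S) ** 2))

def update_codes(U, S, k):
    """Update the binary codes with the network outputs U held fixed,
    as one step of an alternating scheme: B = sign(k * S.T @ U), i.e.,
    each code aligns with the similarity-weighted sum of the
    real-valued outputs (ties broken toward +1)."""
    B = np.sign(k * S.T @ U)
    B[B == 0] = 1.0
    return B

# Toy example: identities 0 and 1 share one 8-bit code, 2 and 3 the
# opposite one, so the similarity structure is perfectly recoverable.
v = np.array([1., -1., 1., 1., -1., 1., -1., 1.])   # an 8-bit code
U = np.stack([v, v, -v, -v])                         # network outputs
S = np.array([[ 1.,  1., -1., -1.],
              [ 1.,  1., -1., -1.],
              [-1., -1.,  1.,  1.],
              [-1., -1.,  1.,  1.]])                 # same-ID similarity
B = update_codes(U, S, k=8)                          # recovers U exactly
print(asymmetric_pairwise_loss(U, B, S, k=8))        # 0.0 for this toy case
```

In the full method the network parameters would in turn be updated by gradient descent on this loss (together with the classification term) while B is held fixed, completing the alternation; the sketch shows only the binary-code step.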

Keywords:

person re-identification, deep hashing, asymmetric hashing, large-scale
Received: 09 October 2020 Revised: 16 February 2021 Accepted: 23 February 2021 Published: 29 September 2021 Issue date: April 2022


Copyright

© The author(s) 2022

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61701277 and 61771288), the State Key Development Program in the 13th Five-Year Plan (No. 2017YFC0821601), and the Open Project Fund of the National Engineering Laboratory for Intelligent Video Analysis and Application.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
