Volume 26, Issue 4


Incremental Face Clustering with Optimal Summary Learning Via Graph Convolutional Network

Xuan Zhao, Zhongdao Wang, Lei Gao, Yali Li, and Shengjin Wang
Department of Electronic Engineering, Tsinghua University, Beijing 100084, China.
First Research Institute of Ministry of Public Security, Beijing 100084, China.

Abstract

In this study, we address the problems encountered in incremental face clustering. Without the benefit of having observed the entire data distribution, incremental face clustering is more challenging than clustering a static dataset. Conventional methods rely on the statistical information of previous clusters to improve the efficiency of incremental clustering, which can cause errors to accumulate. Therefore, this study proposes to predict the summaries of previous data directly from the data distribution via supervised learning. Moreover, an efficient framework for clustering previous summaries together with new data is explored. Although learning summaries from the original data is more expensive than deriving them from previous clusters, the entire framework incurs only a small additional time cost because clustering the current data and generating summaries for the new data share most of the computation. Experiments show that the proposed approach significantly outperforms existing incremental face clustering methods, improving the average F-score from 0.644 to 0.762. Compared with state-of-the-art static face clustering methods, our method yields comparable accuracy while consuming much less time.
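
To make the data flow described above more concrete, the minimal sketch below mimics the incremental loop: each new batch is clustered jointly with the summaries carried over from earlier rounds, and fresh summaries are then derived from that same clustering result. The paper's GCN-based summary learner and its clusterer are not reproduced here; predict_summaries() is a hypothetical stand-in that simply averages each cluster, and DBSCAN serves as a generic clusterer, so this illustrates only the overall workflow, not the authors' model.

```python
# A rough sketch of the incremental clustering loop, under the assumptions above.
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_batch(embeddings, eps=0.5, min_samples=2):
    """Cluster face embeddings; returns one label per row (-1 marks noise)."""
    return DBSCAN(eps=eps, min_samples=min_samples, metric="cosine").fit_predict(embeddings)


def predict_summaries(embeddings, labels):
    """Hypothetical stand-in for the learned summary predictor:
    keeps one mean embedding per discovered cluster."""
    dim = embeddings.shape[1]
    means = [embeddings[labels == lab].mean(axis=0) for lab in set(labels) if lab != -1]
    return np.vstack(means) if means else np.empty((0, dim))


def incremental_clustering(batches):
    """Process batches one by one, carrying only summaries of past data."""
    summaries = np.empty((0, batches[0].shape[1]))
    for batch in batches:
        # Cluster the previous summaries jointly with the new batch.
        data = np.vstack([summaries, batch])
        labels = cluster_batch(data)
        batch_labels = labels[len(summaries):]  # labels assigned to the new faces
        # Summaries for the next round are derived from the clustering just
        # computed, which is the shared-computation idea from the abstract.
        summaries = predict_summaries(data, labels)
        yield batch_labels


# Toy usage with random vectors standing in for real face embeddings.
rng = np.random.default_rng(0)
batches = [rng.normal(size=(100, 128)) for _ in range(3)]
for i, labels in enumerate(incremental_clustering(batches)):
    print(f"batch {i}: {len(set(labels) - {-1})} clusters")
```

The point the sketch reflects is that the summaries fed into the next round come from the same clustering pass used to label the new faces, which is why the extra summary-generation step adds little runtime on top of clustering itself.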

Keywords: incremental face clustering, supervised learning, Graph Convolutional Network (GCN), optimal summary learning


Publication history

Received: 13 July 2020
Accepted: 29 July 2020
Published: 04 January 2021
Issue date: August 2021

Copyright

© The author(s) 2021

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61701277 and 61771288) and the State Key Development Program of the 13th Five-Year Plan (Nos. 2016YFB0801301, 044007008, and 2016YFB1001005). This work was also supported by the National Engineering Laboratory for Intelligent Video Analysis and Application of China.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
