Journal Home > Volume 9 , Issue 3

Subspace clustering methods which embrace a self-expressive model that represents each data point as a linear combination of other data points in the dataset provide powerful unsupervised learning techniques. However, when dealing with large datasets, representation of each data point by referring to all data points via a dictionary suffers from high computational complexity. To alleviate this issue, we introduce a parallelizable multi-subset based self-expressive model (PMS) which represents each data point by combining multiple subsets, with each consisting of only a small proportion of the samples. The adoption of PMS in subspace clustering (PMSSC) leads to computational advantages because the optimization problems decomposed over each subset are small, and can be solved efficiently in parallel. Furthermore, PMSSC is able to combine multiple self-expressive coefficient vectors obtained from subsets, which contributes to an improvement in self-expressiveness. Extensive experiments on synthetic and real-world datasets show the efficiency and effectiveness of our approach in comparison to other methods.


menu
Abstract
Full text
Outline
About this article

PMSSC: Parallelizable multi-subset based self-expressive model for subspace clustering

Show Author's information Katsuya Hotta1Takuya Akashi2Shogo Tokai1Chao Zhang1( )
Department of Engineering, University of Fukui, Fukui-shi 910-8507, Japan
Faculty of Science and Engineering, Iwate University,Morioka-shi 020-8551, Japan

Abstract

Subspace clustering methods which embrace a self-expressive model that represents each data point as a linear combination of other data points in the dataset provide powerful unsupervised learning techniques. However, when dealing with large datasets, representation of each data point by referring to all data points via a dictionary suffers from high computational complexity. To alleviate this issue, we introduce a parallelizable multi-subset based self-expressive model (PMS) which represents each data point by combining multiple subsets, with each consisting of only a small proportion of the samples. The adoption of PMS in subspace clustering (PMSSC) leads to computational advantages because the optimization problems decomposed over each subset are small, and can be solved efficiently in parallel. Furthermore, PMSSC is able to combine multiple self-expressive coefficient vectors obtained from subsets, which contributes to an improvement in self-expressiveness. Extensive experiments on synthetic and real-world datasets show the efficiency and effectiveness of our approach in comparison to other methods.

Keywords: big data, subspace clustering, self-expressive model, subsetting

References(53)

[1]
Vidal, R. Subspace clustering. IEEE Signal Processing Magazine Vol. 28, No. 2, 52–68, 2011.
[2]
Hotta, K.; Xie, H. R.; Zhang, C. Affine subspace clustering with nearest subspace neighbor. In: Proceedings of the SPIE 11766, International Workshop on Advanced Imaging Technology, 267–271, 2021.
DOI
[3]
Zhang, C. Energy minimization over m-branched enumeration for generalized linear subspace clustering. IEICE Transactions on Information and Systems Vol. E102.D, No. 12, 2485–2492, 2019.
[4]
Yang, A. Y.; Wright, J.; Ma, Y.; Sastry, S. S. Unsupervised segmentation of natural images via lossy data compression. Computer Vision and Image Understanding Vol. 110, No. 2, 212–225, 2008.
[5]
Vidal, R.; Tron, R.; Hartley, R. Multiframe motion segmentation with missing data using Power-Factorization and GPCA. International Journal of Computer Vision Vol. 79, No. 1, 85–105, 2008.
[6]
Tierney, S.; Gao, J. B.; Guo, Y. Subspace clustering for sequential data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1019–1026, 2014.
DOI
[7]
Zhang, C.; Lu, X. Q.; Hotta, K.; Yang, X. G2MF-WA: Geometric multi-model fitting with weakly annotated data. Computational Visual Media Vol. 6, No. 2, 135–145, 2020.
[8]
Elhamifar, E.; Vidal, R. Sparse subspace clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2790–2797, 2009.
DOI
[9]
Elhamifar, E.; Vidal, R. Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 35, No. 11, 2765–2781, 2013.
[10]
You, C.; Robinson, D. P.; Vidal, R. Scalable sparse subspace clustering by orthogonal matching pursuit. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3918–3927, 2016.
DOI
[11]
Guo, Y.; Tierney, S.; Gao, J. B. Efficient sparse subspace clustering by nearest neighbour filtering. Signal Processing Vol. 185, 108082, 2021.
[12]
You, C.; Li, C.; Robinson, D. P.; Vidal, R. Self-representation based unsupervised exemplar selection in a union of subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 44, No. 5, 2698–2711, 2022.
[13]
Peng, X.; Zhang, L.; Yi, Z. Scalable sparse subspace clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 430–437, 2013.
DOI
[14]
Matsushima, S.; Brbic, M. Selective sampling-based scalable sparse subspace clustering. In: Proceedings of the 33rd Conference on Neural Information Processing Systems, 12416–12425, 2019.
[15]
Tseng, P. Nearest q-flat to m points. Journal of Optimization Theory and Applications Vol. 105, No. 1, 249–252, 2000.
[16]
Zhang, T.; Szlam, A.; Lerman, G. Median K-Flats for hybrid linear modeling with many outliers. In: Proceedings of the IEEE 12th International Conference on Computer Vision Workshops, 234–241, 2009.
[17]
Lipor, J.; Hong, D.; Tan, Y. S.; Balzano, L. Subspace clustering using ensembles of K-subspaces. Information and Inference: A Journal of the IMA Vol. 10, No. 1, 73–107, 2021.
[18]
Lane, C.; Haeffele, B. D.; Vidal, R. Adaptive online k-subspaces with cooperative re-initialization. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop, 678–688, 2019.
DOI
[19]
Shi, J. B.; Malik, J. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 22, No. 8, 888–905, 2000.
[20]
Luxburg, U. A tutorial on spectral clustering. Statistics and Computing Vol. 17, No. 4, 395–416, 2007.
[21]
Lu, C. Y.; Feng, J. S.; Lin, Z. C.; Mei, T.; Yan, S. C. Subspace clustering by block diagonal representation. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 41, No. 2, 487–501, 2019.
[22]
Dong, W. H.; Wu, X. J.; Kittler, J.; Yin, H. F. Sparse subspace clustering via nonconvex approximation. Pattern Analysis and Applications Vol. 22, No. 1, 165–176, 2019.
[23]
Hotta, K.; Xie, H.; Zhang, C. Candidate subspace screening for linear subspace clustering with energy minimization. In: Proceedings of the Irish Machine Vision and Image Processing Conference, 125–128, 2020.
[24]
Yan, J.; Pollefeys, M. A general framework for motion segmentation: Independent, articulated, rigid, non-rigid, degenerate and non-degenerate. In: Computer Vision – ECCV 2006. Lecture Notes in Computer Science, Vol. 3954. Leonardis, A.; Bischof, H.; Pinz, A. Eds. Springer Berlin Heidelberg, 94–106, 2006.
DOI
[25]
Chen, G. L.; Lerman, G. Spectral curvature clustering (SCC). International Journal of Computer Vision Vol. 81, No. 3, 317–330, 2009.
[26]
Donoho, D. L. For most large underdetermined systems of linear equations the minimal 1-norm solution is also the sparsest solution. Communications on Pure and Applied Mathematics Vol. 59, No. 6, 797–829, 2006.
[27]
Lu, C. Y.; Min, H.; Zhao, Z. Q.; Zhu, L.; Huang, D. S.; Yan, S. C. Robust and efficient subspace segmentation via least squares regression. In: Computer Vision – ECCV 2012. Lecture Notes in Computer Science, Vol. 7578. Fitzgibbon, A.; Lazebnik, S.; Perona, P.; Sato, Y.; Schmid, C. Eds. Springer Berlin Heidelberg, 347–360, 2012.
DOI
[28]
Liu, G.; Lin, Z.; Yu, Y. Robust subspace segmentation by lowrank representation. In: Proceedings of the 27th International Conference on International Conference on Machine Learning, 663–670, 2010.
[29]
You, C.; Li, C. G.; Robinson, D. P.; Vidal, R. Oracle based active set algorithm for scalable elastic net subspace clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3928–3937, 2016.
DOI
[30]
Ji, P.; Salzmann, M.; Li, H. D. Efficient dense subspace clustering. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 461–468, 2014.
[31]
Dyer, E. L.; Sankaranarayanan, A. C.; Baraniuk, R. G. Greedy feature selection for subspace clustering. Journal of Machine Learning Research Vol. 14, No. 1, 2487–2517, 2013.
[32]
Nasihatkon, B.; Hartley, R. Graph connectivity in sparse subspace clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2137–2144, 2011.
DOI
[33]
You, C.; Li, C.; Robinson, D. P.; Vidal, R. A scalable exemplar-based subspace clustering algorithm for class-imbalanced data. In: Computer Vision – ECCV 2018. Lecture Notes in Computer Science, Vol. 11213. Ferrari, V.; Hebert, M.; Sminchisescu, C.; Weiss, Y. Eds. Springer Cham, 68–85, 2018.
DOI
[34]
Chen, Y.; Li, C. G.; You, C. Stochastic sparse subspace clustering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4154–4163, 2020.
DOI
[35]
You, C.; Donnat, C.; Robinson, D. P.; Vidal, R. A divide-and-conquer framework for large-scale subspace clustering. In: Proceedings of 50th Asilomar Conference on Signals, Systems and Computers, 1014–1018, 2016.
DOI
[36]
Davenport, M. A.; Wakin, M. B. Analysis of orthogonal matching pursuit using the restricted isometry property. IEEE Transactions on Information Theory Vol. 56, No. 9, 4395–4401, 2010.
[37]
Tropp, J. A. Greed is good: Algorithmic results for sparse approximation. IEEE Transactions on Information Theory Vol. 50, No. 10, 2231–2242, 2004.
[38]
Pati, Y. C.; Rezaiifar, R.; Krishnaprasad, P. S. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decom-position. In: Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers, 40–44, 1993.
[39]
Wong, C. K.; Easton, M. C. An efficient method for weighted sampling without replacement. SIAM Journal on Computing Vol. 9, No. 1, 111–113, 1980.
[40]
Heckel, R.; Bölcskei, H. Subspace clustering via thresholding and spectral clustering. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 3263–3267, 2013.
DOI
[41]
Vidal, R.; Favaro, P. Low rank subspace clustering (LRSC). Pattern Recognition Letters Vol. 43, 47–61, 2014.
[42]
Chung, F. R. K. Spectral Graph Theory. American Mathematical Society, 1997.
[43]
Lee, K. C.; Ho, J.; Kriegman, D. J. Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 27, No. 5, 684–698, 2005.
[44]
Samaria, F. S.; Harter, A. C. Parameterisation of a stochastic model for human face identification. In: Proceedings of the IEEE Workshop on Applications of Computer Vision, 138–142, 1994.
[45]
Greene, D.; Cunningham, P. Practical solutions to the problem of diagonal dominance in kernel document clustering. In: Proceedings of the 23rd International Conference on Machine Learning, 377–384, 2006.
DOI
[46]
Stallkamp, J.; Schlipsing, M.; Salmen, J.; Igel, C. Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Networks Vol. 32, 323–332, 2012.
[47]
LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE Vol. 86, No. 11, 2278–2324, 1998.
[48]
Cohen, G.; Afshar, S.; Tapson, J.; van Schaik, A. EMNIST: Extending MNIST to handwritten letters. In: Proceedings of the International Joint Conference on Neural Networks, 2921–2926, 2017.
DOI
[49]
Krizhevsky, A. Learning multiple layers of features from tiny images. Technical Report. University of Toronto, 2009.
[50]
Cai, D.; He, X. F.; Hu, Y. X.; Han, J. W.; Huang, T. Learning a spatially smooth subspace for face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1–7, 2007.
DOI
[51]
Bruna, J.; Mallat, S. Invariant scattering convolution networks. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 35, No. 8, 1872–1886, 2013.
[52]
Zhang, S. Z.; You, C.; Vidal, R.; Li, C. G. Learning a self-expressive network for subspace clustering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 12388–12398, 2021.
DOI
[53]
Yu, Y. D.; Chan, K. H. R.; You, C.; Song, C. B.; Ma, Y. Learning diverse and discriminative representations via the principle of maximal coding rate reduction. In: Proceedings of the 34th International Conference on Neural Information Processing Systems, Article No. 790, 9422–9434, 2020.
Publication history
Copyright
Acknowledgements
Rights and permissions

Publication history

Received: 14 February 2022
Accepted: 08 May 2022
Published: 31 March 2023
Issue date: September 2023

Copyright

© The Author(s) 2023.

Acknowledgements

This work was supported by JSPS KAKENHI Grant Number JP20K19568.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduc-tion in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

Return