Volume 2, Issue 3




Efficient Preference Clustering via Random Fourier Features

Jingshu Liu, Li Wang, and Jinglei Liu
College of Data Science, Taiyuan University of Technology, Jinzhong 030600, China.
School of Computer and Control Engineering, Yantai University, Yantai 264005, China.

Abstract

Approximations based on random Fourier features have recently emerged as an efficient and elegant tool for large-scale machine learning. Unlike approaches based on the Nyström method, which randomly samples the training examples, we use random Fourier features, whose basis functions (i.e., cosine and sine) are sampled from a distribution independent of the training set, to cluster the preference data that arises extensively in recommender systems. Specifically, we propose a two-stage preference clustering framework: we first use random Fourier features to map the preference matrix into a feature matrix, and then apply the traditional k-means algorithm to cluster the preference data in the transformed feature space. Compared with traditional preference clustering, our method avoids the problem of insufficient memory and greatly improves running efficiency. Experiments on movie data sets containing 100 000 ratings show that the proposed method achieves higher clustering accuracy than the Nyström method and k-means, while also achieving better overall performance than these clustering approaches.
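The two-stage framework described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature map follows the standard cosine/sine random Fourier construction for the Gaussian kernel, the clustering step is plain Lloyd's k-means with a farthest-point initialization, and all names, parameters, and the toy "preference matrix" are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(X, n_features=50, gamma=1.0):
    """Stage 1: map X (n x d) to an explicit feature space whose inner
    products approximate the Gaussian kernel exp(-gamma * ||x - y||^2).
    Frequencies W are drawn from a Gaussian independent of the data."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    proj = X @ W
    # Cosine and sine basis functions, scaled so z(x) . z(y) ~ k(x, y).
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(n_features)

def kmeans(Z, k, n_iter=50):
    """Stage 2: Lloyd's k-means on the transformed feature matrix Z."""
    # Farthest-point initialization to avoid degenerate starts.
    idx = [int(rng.integers(len(Z)))]
    for _ in range(k - 1):
        d2 = ((Z[:, None, :] - Z[idx][None, :, :]) ** 2).sum(-1).min(axis=1)
        idx.append(int(np.argmax(d2)))
    centers = Z[idx].copy()
    for _ in range(n_iter):
        labels = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

# Toy preference matrix: two groups of users with distinct rating patterns.
X = np.vstack([rng.normal(0.0, 0.1, (20, 5)) + 1.0,
               rng.normal(0.0, 0.1, (20, 5)) - 1.0])
Z = random_fourier_features(X, n_features=50, gamma=1.0)
labels = kmeans(Z, k=2)
```

Because the feature map is explicit, k-means never needs the full n-by-n similarity matrix, which is where the memory savings over kernel-based preference clustering come from.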

Keywords:

random Fourier features, matrix decomposition, similarity matrix
Received: 13 November 2018 Accepted: 13 February 2019 Published: 04 April 2019 Issue date: September 2019


Copyright

© The author(s) 2019

Acknowledgements

We gratefully acknowledge the detailed and helpful comments of the anonymous reviewers, who have enabled us to considerably improve this paper. This work was supported by the National Natural Science Foundation of China (Nos. 61872260 and 61592419) and the Natural Science Foundation of Shanxi Province (No. 201703D421013).

Rights and permissions

Reprint and permission requests may be sought directly from the editorial office.
