TST: Threshold Based Similarity Transitivity Method in Collaborative Filtering with Cloud Computing

Feng Xie; Zhen Chen; Hongfeng Xu; Xiwei Feng; Qi Hou

Tsinghua Science and Technology 2013, 18(3): 318-327 https://doi.org/10.1109/TST.2013.6522590

Open Access | Issue | Published: 03 June 2013

Department of Automation, Research Institute of Information Technology and Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, China

Research Institute of Information Technology and Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, China

Department of Computer Science and Technologies and Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, China

Department of Electronic Engineering and Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing 100084, China

Keywords:

big data, machine learning, cloud computing, collaborative filtering, data mining, MapReduce, recommender systems, similarity transitivity, Android applications

Cite this article:

Xie F, Chen Z, Xu H, et al. TST: Threshold Based Similarity Transitivity Method in Collaborative Filtering with Cloud Computing. Tsinghua Science and Technology, 2013, 18(3): 318-327. https://doi.org/10.1109/TST.2013.6522590

Abstract

Collaborative filtering addresses the information overload problem by presenting personalized content to individual users based on their interests, and it has been extensively applied in real-world recommender systems. As a class of simple but efficient collaborative filtering methods, similarity-based approaches make predictions by finding users with similar tastes or items that have been similarly chosen. However, as the number of users or items grows rapidly, the traditional approach suffers from the data sparsity problem: inaccurate similarities derived from sparse user-item associations produce an inaccurate neighborhood for each user or item. The resulting poor recommendations motivate the Threshold based Similarity Transitivity (TST) method proposed in this paper. TST first filters out inaccurate similarities by setting an intersection threshold and then replaces them with transitivity similarities. In addition, TST is designed to scale using the MapReduce framework on a cloud computing platform. We evaluate our algorithm on the public MovieLens data set and on a real-world data set from AppChina (an Android application market) using several well-known metrics, including precision, recall, coverage, and popularity. The experimental results demonstrate that TST copes well with the tradeoff between the quality and quantity of similarities when an appropriate threshold is set. Moreover, the optimal threshold can be found experimentally, and it becomes smaller as the data set becomes sparser. The results also show that TST significantly outperforms the traditional approach even as the data becomes sparser.
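The two steps named in the abstract (filter by an intersection threshold, then substitute a transitivity similarity) can be illustrated with a minimal single-machine sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: the function name `tst_similarities`, binary (set-valued) user feedback, cosine similarity as the direct measure, the intersection taken as the number of co-chosen items, and the transitivity rule taken as the best two-hop product through a third user. The MapReduce design mentioned in the abstract is not reproduced.

```python
from itertools import combinations
from math import sqrt


def tst_similarities(user_items, threshold):
    """Return {(u, v): similarity} for user pairs with u < v.

    user_items: dict mapping a user id to the set of items the user chose.
    threshold:  minimum number of co-chosen items for a direct similarity
                to be trusted (must be at least 1).
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")

    users = sorted(user_items)
    direct = {}    # trusted similarities (intersection >= threshold)
    filtered = []  # pairs whose direct similarity is discarded

    for u, v in combinations(users, 2):
        common = len(user_items[u] & user_items[v])
        if common >= threshold:
            # Cosine similarity for binary feedback (assumed measure).
            direct[(u, v)] = common / sqrt(len(user_items[u]) * len(user_items[v]))
        else:
            filtered.append((u, v))

    def trusted(a, b):
        # Look up a trusted similarity regardless of argument order.
        return direct.get((a, b) if a < b else (b, a), 0.0)

    result = dict(direct)
    for u, v in filtered:
        # Assumed transitivity rule: strongest two-hop path u -> k -> v
        # built only from trusted similarities.
        best = max(
            (trusted(u, k) * trusted(k, v) for k in users if k not in (u, v)),
            default=0.0,
        )
        if best > 0.0:
            result[(u, v)] = best
    return result


if __name__ == "__main__":
    data = {"a": {1, 2, 3, 4}, "b": {3, 4, 5, 6}, "c": {4, 5, 6, 7}}
    for pair, sim in sorted(tst_similarities(data, threshold=2).items()):
        print(pair, sim)
```

With `threshold=2`, users `a` and `c` share only one item, so their direct similarity is discarded and replaced by the product of the two trusted similarities through `b`. Lowering the threshold to 1 keeps the direct value instead, which is the quality-versus-quantity tradeoff the abstract describes.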

Publication history

Received: 15 April 2013

Revised: 15 May 2013

Accepted: 15 May 2013

Published: 03 June 2013

Issue date: June 2013

Acknowledgements

The authors would like to thank Prof. Jun Li of NSLAB from RIIT for his careful guidance on the paper’s structure and writing. We are also grateful to Prof. Junwei Cao from RIIT, and to Dr. Zihong Huang and Xiaoping Feng from the Electronic Engineering Department, for their help.

This work is supported by the Ministry of Science and Technology of China under the National Key Basic Research and Development (973) Program of China (Nos. 2012CB315801 and 2011CB302805), the National Natural Science Foundation of China A3 Program (No. 61161140320), and the National Natural Science Foundation of China (No. 61233016). This work is also supported by the Intel Research Council under the title "Security Vulnerability Analysis based on Cloud Platform with Intel IA Architecture".