Journal Home > Volume 23 , Issue 4

Graph clustering has a long-standing problem in that it is difficult to identify all the groups of vertices that are cohesively connected along their internal edges but only sparsely connected along their external edges. Apart from structural information in social networks, the quality of the location-information clustering has been improved by identifying clusters in the graph that are closely connected and spatially compact. However, in real-world scenarios, the location information of some users may be unavailable for privacy reasons, which renders existing solutions ineffective. In this paper, we investigate the clustering problem of privacy-preserving social networks, and propose an algorithm that uses a prediction-and-clustering approach. First, the location of each invisible user is predicted with a probability distribution. Then, each user is iteratively assigned to different clusters. The experimental results verify the effectiveness and efficiency of our method, and our proposed algorithm exhibits high scalability on large social networks.


menu
Abstract
Full text
Outline
About this article

Location- and Relation-Based Clustering on Privacy-Preserving Social Networks

Show Author's information Dan YinYiran Shen( )
College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China.

Abstract

Graph clustering has a long-standing problem in that it is difficult to identify all the groups of vertices that are cohesively connected along their internal edges but only sparsely connected along their external edges. Apart from structural information in social networks, the quality of the location-information clustering has been improved by identifying clusters in the graph that are closely connected and spatially compact. However, in real-world scenarios, the location information of some users may be unavailable for privacy reasons, which renders existing solutions ineffective. In this paper, we investigate the clustering problem of privacy-preserving social networks, and propose an algorithm that uses a prediction-and-clustering approach. First, the location of each invisible user is predicted with a probability distribution. Then, each user is iteratively assigned to different clusters. The experimental results verify the effectiveness and efficiency of our method, and our proposed algorithm exhibits high scalability on large social networks.

Keywords: clustering, social networks, location prediction, privacy-preserving

References(45)

[1]
S. Fortunato, Community detection in graphs, Phys. Rep., vol. 486, nos. 3–5, pp. 75-174, 2010.
[2]
A. D. King, N. Pržulj, and I. Jurisica, Protein complex prediction via cost-based clustering, Bioinformatics, vol. 20, no. 17, pp. 3013-3020, 2004.
[3]
M. Sasaki and H. Shinnou, Spam detection using text clustering, in Proc. 2005 Int. Conf. Cyberworlds, Singapore, 2005, p. 4.
DOI
[4]
S. Bandyopadhyay and E. J. Coyle, An energy efficient hierarchical clustering algorithm for wireless sensor networks, in Proc. 22th Annu. Joint Conf. IEEE Computer and Communications Societies, San Francisco, CA, USA, 2003, pp. 1713-1723.
[5]
H. D. White, B. Wellman, and N. Nazer, Does citation reflect social structure?: Longitudinal evidence from the “Globenet” interdisciplinary research group, J. Am. Soc. Inf. Sci. Technol., vol. 55, no. 2, pp. 111-126, 2004.
[6]
X. Zheng, Z. P. Cai, J. G. Yu, C. K. Wang, and Y. S. Li, Follow but no track: Privacy preserved profile publishing in cyber-physical social systems, IEEE Internet Things J., vol. 4, no. 6, pp. 1868-1878, 2017.
[7]
H. Shiokawa, Y. Fujiwara, and M. Onizuka, Fast algorithm for modularity-based graph clustering, in Proc. 27th AAAI Conf. on Artificial Intelligence, Bellevue, DC, USA, 2013, pp. 1170-1176.
[8]
M. Ester, H. P. Kriegel, J. Sander, and X. W. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in Proc. 2nd Int. Conf. Knowledge Discovery and Data Mining, Portland, OR, USA, 1996, pp. 226-231.
[9]
M. C. V. Nascimento and A. C. P. L. F. De Carvalho, Spectral methods for graph clustering—A survey, Eur. J. Oper. Res., vol. 211, no. 2, pp. 221-231, 2011.
[10]
W. Chen, Y. F. Yuan, and L. Zhang, Scalable influence maximization in social networks under the linear threshold model, in Proc. 2010 IEEE Int. Conf. Data Mining, Sydney, Australia, 2010, pp. 88-97.
DOI
[11]
L. C. Zhang, Z. P. Cai, and X. M. Wang, FakeMask: A novel privacy preserving approach for smartphones, IEEE Trans. Netw. Serv. Manage., vol. 13, no. 2, pp. 335-348, 2016.
[12]
M. Han, J. Li, Z. P. Cai, and Q. L. Han, Privacy reserved influence maximization in GPS-enabled cyber-physical and online social networks, in Proc. 2016 IEEE Int. Conf. Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom), Atlanta, GA, USA, 2016, pp. 284-292.
DOI
[13]
Q. N. Li, Y. Zheng, X. Xie, Y. K. Chen, W. Y. Liu, and W. Y. Ma, Mining user similarity based on location history, in Proc. 16th ACM SIGSPATIAL Int. Conf. on Advances in geographic Information Systems, Irvine, CA, USA, 2008, p. 34.
DOI
[14]
A. K. Jain, Data clustering: 50 years beyond k-means, Pattern Recognit. Lett., vol. 31, no. 8, pp. 651-666, 2010.
[15]
A. Shepitsen, J. Gemmell, B. Mobasher, and R. Burke, Personalized recommendation in social tagging systems using hierarchical clustering, in Proc. 2008 ACM Conf. Recommender Systems, Lausanne, Switzerland, 2008, pp. 259-266.
DOI
[16]
Y. Shi, Y. Peng, W. X. Xu, and X. W. Tang, Data mining via multiple criteria linear programming: Applications in credit card portfolio management, Int. J. Inf. Technol. Decis. Making., vol. 1, no. 1, pp. 131-151, 2002.
[17]
M. Han, J. B. Wang, M. Y. Yan, C. Y. Ai, Z. J. Duan, and Z. Hong. Near-complete privacy protection: Cognitive optimal strategy in location-based services, Procedia Comput. Sci., vol. 129, pp. 298-304, 2018.
[18]
C. C. Aggarwal and H. X. Wang, A survey of clustering algorithms for graph data, in Managing and Mining Graph Data, C.C. Aggarwal and H. X. Wang, eds. Springer, 2010, pp. 275-301.
DOI
[19]
M. Girvan and M. E. J. Newman, Community structure in social and biological networks, Proc. Natl. Acad. Sci. USA, vol. 99, no. 12, pp. 7821-7826, 2002.
[20]
J. Abello, M. G. C. Resende, and S. Sudarsky, Massive quasi-clique detection, in Proc. 5th American Symposium Cancun LATIN 2002: Theoretical Informatics, Mexico, 2002, pp. 598-612.
DOI
[21]
R. K. Ahuja, T. L. Magnanti, and J. B. Orlin, Network Flows: Theory, Algorithms, and Applications. Upper Saddle River, NJ, USA: Prentice Hall, 1993.
[22]
M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, New York, NY, USA: W H Freeman and Co, 1979, p. 90.
[23]
M. J. Rattigan, M. Maier, and D. Jensen, Graph clustering with network structure indices, in Proc. 24th Int. Conf. Machine Learning, Corvalis, OR, USA, 2007, pp. 783-790.
DOI
[24]
F. R. K. Chung, Spectral Graph TheoryCBMS Regional Conferance Series in Mathematics. Providence, RI, USA: American Mathematical Society, 1997.
DOI
[25]
Y. Liang, Z. P. Cai, Q. L. Han, and Y. S. Li, Location privacy leakage through sensory data, Secur. Commun. Netw., vol. 2017, p. 7576307, 2017.
[26]
X. Zheng, Z. P. Cai, J. Z. Li, and H. Gao, Location-privacy-aware review publication mechanism for local business service systems, in Proc. 2017 IEEE Conf. Computer Communications, Atlanta, GA, USA, 2017, pp. 1-9.
DOI
[27]
J. Li, Z. P. Cai, M. Y. Yan, and Y. S. Li, Using crowd, sourced data in location-based social networks to explore influence maximization, in Proc. 35th Annu. IEEE Int. Conf. Computer Communications, San Francisco, CA, USA, 2016, pp. 1-9.
DOI
[28]
Z. B. He, Z. P. Cai, J. G. Yu, X. M. Wang, Y. C. Sun, and Y. S. Li, Cost-efficient strategies for restraining rumor spreading in mobile social networks, IEEE Trans. Vehic. Technol., vol. 66, no. 3, pp. 2789-2800, 2017.
[29]
T. Qiu, R. X. Qiao, and D. O. Wu, EABS: An event-aware backpressure scheduling scheme for emergency internet of things, IEEE Trans. Mobile Comput., vol. 17, no. 1, pp. 72-84, 2018.
[30]
T. Y. Song, N. Capurso, X. Z. Cheng, J. G. Yu, B. Chen, and W. Zhao, Enhancing GPS with lane-level navigation to facilitate highway driving, IEEE Trans. Vehic. Technol., vol. 66, no. 6, pp. 4579-4591, 2017.
[31]
J. J. Wang, Y. L. Han, and X. Y. Yang, An efficient location privacy protection scheme based on the Chinese remainder theorem, Tsinghua Sci. Technol., vol. 21, no. 3, pp. 260-269, 2016.
[32]
Y. Wang, D. B. Xu, and F. Li, Providing location-aware location privacy protection for mobile location-based services, Tsinghua Sci. Technol., vol. 21, no. 3, pp. 243-259, 2016.
[33]
M. Barthélemy, Spatial networks, Phys. Rep., vol. 499, nos. 1–3, pp. 1-101, 2011.
[34]
D. Guo, Regionalization with dynamically constrained agglomerative clustering and partitioning (REDCAP), Int. J. Geogr. Inf. Sci., vol. 22, no. 7, pp. 801-823, 2008.
[35]
P. Expert, T. S. Evans, V. D. Blondel, and R. Lambiotte, Uncovering space-independent communities in spatial networks, Proc. Natl. Acad. Sci. USA, vol. 108, no. 19, pp. 7663-7668, 2011.
[36]
Y. Chen, J. Xu, and M. Z. Xu, Finding community structure in spatially constrained complex networks, Int. J. Geogr. Inf. Sci., vol. 29, no. 6, pp. 889-911, 2015.
[37]
Y. X. Fang, R. Cheng, X. D. Li, S. Q. Luo, and J. F. Hu, Effective community search over large spatial graphs, Proc. VLDB Endow., vol. 10, no. 6, pp. 709-720, 2017.
[38]
Z. B. He, Z. P. Cai, and J. G. Yu, Latent-data privacy preserving with customized data utility for social network data, IEEE Trans. Vehic. Technol., vol. 67, no. 1, pp. 665-673, 2018.
[39]
M. Han, Q. L. Han, L. J. Li, J. Li, and Y. S. Li, Maximizing influence in sensed heterogenous social network with privacy preservation, Int J. Sensor Netw., .
[40]
N. Cao, Z. Y. Yang, C. Wang, K. Ren, and W. J. Lou, Privacy-preserving query over encrypted graph-structured data in cloud computing, in Proc. 31@st Int. Conf. Distributed Computing Systems, Minneapolis, MN, USA, 2011, pp. 393-402.
DOI
[41]
X. Y. He, J. Vaidya, B. Shafiq, N. Adam, and X. D. Lin, Reachability analysis in privacy-preserving perturbed graphs, in Proc. 2010 IEEE/WIC/ACM Int. Conf. Web Intelligence and Intelligent Agent Technology, Toronto, ON, Canada, 2010, pp. 691-694.
DOI
[42]
Z. P. Cai, Z. B. He, X. Guan, and Y. S. Li, Collective data-sanitization for preventing sensitive information inference attacks in social networks, IEEE Trans. Depend. Secure Comput., .
[43]
K. Mouratidis and M. L. Yiu, Shortest path computation with no information leakage, Proc. VLDB Endow., vol. 5, no. 8, pp. 692-703, 2012.
[44]
P. Damaschke, Degree-preserving spanning trees in small-degree graphs, Discrete Math., vol. 222, nos. 1–3, pp. 51-60, 2000.
[45]
A. Jeckmans, Q. Tang, and P. Hartel, Privacy-preserving profile matching using the social graph, in Proc. 2011 Int. Conf. Computational Aspects of Social Networks, Salamanca, Spain, 2011, pp. 42-47.
DOI
Publication history
Copyright
Acknowledgements
Rights and permissions

Publication history

Received: 19 August 2017
Accepted: 14 September 2017
Published: 16 August 2018
Issue date: August 2018

Copyright

© The authors 2018

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Nos. 61602129, 61702132, and 61702133), the Natural Science Foundation of Heilongjiang Province (Nos. QC2017069 and QC2017071), Fundamental Research Funds for the Central Universities (Nos. HEUCFJ170602 and HEUCFJ160601), the China Postdoctoral Science Foundation (No. 166875), and Heilongjiang Postdoctoral Fund (No. LBH-Z16042).

Rights and permissions

Return