Volume 21, Issue 1


A Comparative Analysis on Weibo and Twitter

Wentao Han, Xiaowei Zhu, Ziyan Zhu, Wenguang Chen, Weimin Zheng, Jianguo Lu
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.
Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA. This work was done during his visit to Tsinghua University.
School of Computer Science, University of Windsor, Windsor, ON N9B 3P4, Canada.

Abstract

Weibo, the Chinese counterpart of Twitter, has attracted hundreds of millions of users. In 2013 we crawled an almost complete Weibo user network containing 222 million users and 27 billion links. This paper analyzes the structural properties of this network and compares it with a Twitter user network. The topological properties we studied include degree distributions, connected components, distance distributions, reciprocity, clustering coefficient, PageRank centrality, and degree assortativity. We find that Weibo users have a higher diversity index and a higher Gini index, but lower reciprocity and a lower clustering coefficient for most of the nodes. Surprisingly, the reciprocity of Weibo is only about a quarter of that of the Twitter user network. We also show that the Weibo adoption rate correlates positively with economic development, and that the Weibo network can be used to quantify the connections between provinces and regions in China. In particular, point-wise mutual information is shown to be accurate in quantifying the strength of connections. We developed an interactive software framework for this analysis and released the data and code online.

Keywords: complex network, mutual information, Weibo, Twitter, online social network
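
As an illustration of three of the measures named in the abstract, the following Python sketch computes reciprocity, the Gini index of a degree sequence, and point-wise mutual information (PMI) between regions on a toy directed graph. It is not the released code: the abstract does not state the exact definitions, so the sketch assumes commonly used ones (reciprocity as the fraction of links whose reverse link exists, and PMI as log2 of p(a, b) / (p_src(a) p_dst(b)) over directed links). The function names, the toy edge list, and the `region_of` mapping are all hypothetical.

```python
import math
from collections import Counter


def reciprocity(edges):
    """Fraction of directed links (u, v) whose reverse link (v, u) also exists.

    Assumed definition; the paper may normalize differently.
    """
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def gini(values):
    """Gini index of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Closed form on sorted data: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def region_pmi(edges, region_of):
    """PMI between source and target regions, estimated from link counts.

    pmi(a, b) = log2( p(a, b) / (p_src(a) * p_dst(b)) ), where p(a, b) is the
    fraction of links going from region a to region b (assumed definition).
    """
    pair = Counter((region_of[u], region_of[v]) for u, v in edges)
    total = sum(pair.values())
    src, dst = Counter(), Counter()
    for (a, b), count in pair.items():
        src[a] += count
        dst[b] += count
    return {
        (a, b): math.log2(count * total / (src[a] * dst[b]))
        for (a, b), count in pair.items()
    }


# Hypothetical toy data: four users in two regions, (follower, followee) pairs.
edges = [(1, 2), (2, 1), (1, 3), (3, 4), (4, 3), (2, 3)]
region_of = {1: "A", 2: "A", 3: "B", 4: "B"}

in_degree = Counter(v for _, v in edges)
print("reciprocity:", reciprocity(edges))
print("Gini of in-degrees:", gini([in_degree[n] for n in region_of]))
print("PMI:", region_pmi(edges, region_of))
```

A positive PMI for a pair of regions means that links between them are more frequent than their overall link volumes would predict, and a negative value means they are less frequent. On a network of billions of links the same counts would have to be accumulated in a streaming or out-of-core fashion, not held in in-memory Python sets.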


Publication history

Received: 23 June 2015
Accepted: 01 July 2015
Published: 04 February 2016
Issue date: February 2016

Copyright

© The author(s) 2016

Acknowledgements

This work was supported by NSERC (Natural Sciences and Engineering Research Council of Canada) Discovery grant (No. RGPIN-2014-04463), the National High-Tech Research and Development (863) Program of China (No. 2012AA010903), and the National Natural Science Foundation of China (Nos. 61433008 and U1435216).
