Volume 1, Issue 4

Understanding the Behavioral Differences Between American and German Users: A Data-Driven Study

Chenxi Yang, Yang Chen, Qingyuan Gong, Xinlei He, Yu Xiao, Yuhuan Huang, Xiaoming Fu
School of Computer Science, Fudan University, Shanghai 200433, China, and the Engineering Research Center of Cyber Security Auditing and Monitoring, Ministry of Education, Shanghai 200433, China.
Department of Communications and Networking, Aalto University, 02150 Espoo, Finland.
Faculty of European Languages and Cultures, Guangdong University of Foreign Studies, Guangzhou 510420, China.
Institute of Computer Science, University of Göttingen, 37077 Göttingen, Germany.

Abstract

Given that the USA and Germany are the most populous countries in North America and Western Europe, respectively, understanding the behavioral differences between American and German users of online social networks is essential. In this work, we conduct a data-driven study based on the Yelp Open Dataset. We characterize the behavior of American and German users from three aspects: social connectivity, review styles, and spatiotemporal patterns. In addition, we construct a classification model that distinguishes American from German users based on their behavioral data. Our model achieves high classification performance, with an F1-score of 0.891 and an AUC of 0.949.

Keywords: machine learning, online social networks, behavioral difference, Yelp
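
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how an F1-score and an AUC are computed for a binary classifier. The labels, scores, 0.5 decision threshold, and function names are illustrative assumptions; they are not taken from the paper or from the Yelp Open Dataset, and encoding one user group as the positive class (1) is likewise an assumption.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = one user group, 0 = the other (assumed encoding).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]      # made-up classifier scores
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"F1  = {f1_score(labels, preds):.3f}")
print(f"AUC = {auc(labels, scores):.3f}")
```

The F1-score depends on the chosen decision threshold, whereas the AUC is computed from the ranking of the scores alone, which is one reason the two are often reported together.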


Publication history

Received: 05 March 2018
Accepted: 20 March 2018
Published: 02 July 2018
Issue date: December 2018

Copyright

© The author(s) 2018

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61602122 and 71731004), the Natural Science Foundation of Shanghai (No. 16ZR1402200), the Shanghai Pujiang Program (No. 16PJ1400700), the EU FP7 IRSES MobileCloud project (No. 612212), and the Lindemann Foundation (No. 12-2016).
