Volume 2, Issue 4




Tweetluenza: Predicting Flu Trends from Twitter Data

Balsam Alkouz, Zaher Al Aghbari, Jemal Hussien Abawajy
Department of Computer Science, University of Sharjah, Sharjah 27272, UAE.
Department of Science, Engineering and Built Environment, Deakin University, Melbourne 3125, Australia.

Abstract

Health authorities worldwide strive to detect Influenza prevalence as early as possible so that they can prepare for it and minimize its impact. To this end, we address the problem of Influenza prevalence surveillance and prediction. In this paper, we develop a new Influenza prevalence prediction model, called Tweetluenza, that predicts the spread of Influenza in real time using cross-lingual data harvested from Twitter data streams, with an emphasis on the United Arab Emirates (UAE). Based on the features of tweets, Tweetluenza filters Influenza-related tweets and classifies them into two classes: reporting and non-reporting. The reporting tweets are used to monitor the growth of Influenza. Furthermore, a linear regression model uses the reporting tweets to predict future Influenza-related hospital visits. We evaluated Tweetluenza empirically to study its feasibility and compared its results with the actual hospital visits recorded by the UAE Ministry of Health. The results of our experiments demonstrate the practicality of Tweetluenza, as verified by the high correlation between Influenza-related Twitter data and hospital visits due to Influenza. The evaluation also shows that combining English and Arabic tweets improves the correlation results.

Keywords: Twitter data analysis, Influenza forecasting, prediction using social media, social media mining
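
The abstract describes two quantitative steps: correlating reporting tweets with hospital visits, and fitting a linear regression to predict future visits. The sketch below illustrates that general idea only; it is not the authors' implementation. The function names (pearson, fit_line), the weekly aggregation, and all numbers are assumptions made up for illustration, and the data is synthetic rather than taken from the paper or from the UAE Ministry of Health.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, made-up weekly counts (NOT data from the paper).
weekly_reporting_tweets = [10, 20, 30, 40, 50]
weekly_hospital_visits = [25, 45, 65, 85, 105]

# Step 1: how strongly do reporting tweets track hospital visits?
r = pearson(weekly_reporting_tweets, weekly_hospital_visits)

# Step 2: fit a line and predict visits for a hypothetical week with 60 reporting tweets.
slope, intercept = fit_line(weekly_reporting_tweets, weekly_hospital_visits)
predicted_visits = slope * 60 + intercept

print(f"correlation={r:.3f}, slope={slope:.3f}, intercept={intercept:.3f}")
print(f"predicted visits for 60 reporting tweets: {predicted_visits:.1f}")
```

In a cross-lingual setting such as the one the abstract describes, the tweet counts passed to these functions could be the sum of English and Arabic reporting tweets per period, which would allow the correlation to be compared against either language alone.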


Publication history

Received: 08 January 2019
Revised: 12 May 2019
Accepted: 22 May 2019
Published: 05 August 2019
Issue date: December 2019

Copyright

© The author(s) 2019

Acknowledgements

The authors would like to express their sincere thanks to the UAE Ministry of Health, especially its Statistics and Research Center, for providing the actual counts of Influenza-related hospital visits used in this research.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
