Comparative Study of Statistical Features to Detect the Target Event During Disaster

Madichetty Sreenivasulu; M. Sridevi

doi:10.26599/BDMA.2019.9020021


Open Access


∙ Department of CSE, National Institute of Technology, Tiruchirappalli, Tamilnadu 620015, India.


Abstract

Microblogs, such as Facebook and Twitter, have attracted much attention among users and organizations. Twitter is now especially popular because of its real-time nature. People often interact with real-time events such as earthquakes and floods through Twitter. During a disaster, the number of posts or tweets on Twitter increases drastically, which makes detecting a target event a challenging task. In this paper, a framework is proposed for observing tweets and detecting the target event. To detect the target event, a classifier is devised based on different combinations of statistical features: the position of the keyword in a tweet, the length of a tweet, the frequency of hashtags, and the frequency of user mentions and URLs. The results show that the combination of hashtag frequency and keyword position provides better classification results than the other feature combinations. Hence, using these two features, namely the frequency of hashtags and the position of the earthquake keyword, reduces the event detection time. These two features are also helpful for detecting sub-events, which are used for filtering the tweets related to the disaster. Additionally, different classifiers such as Artificial Neural Networks (ANN), decision tree, and K-Nearest Neighbor (KNN) are compared using these two features. However, the Support Vector Machine (SVM) with a linear kernel, using the combination of the position of the earthquake keyword and the frequency of hashtags, outperforms state-of-the-art methods. Therefore, SVM (linear kernel) with the proposed features is applied for detecting the earthquake during a disaster. The proposed algorithm is tested on the Nepal earthquake and landslide datasets from 2015.
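As an illustration of the statistical features named in the abstract, the following minimal Python sketch computes them for a single tweet. It is not the authors' implementation. The abstract does not define how each feature is measured, so the choices here are assumptions: tweet length is counted in whitespace-separated tokens, keyword position is the 1-based index of the first matching token (0 if the keyword is absent), and hashtags, mentions, and URLs are counted with simple regular expressions. The function name `extract_features` and the dictionary keys are hypothetical.

```python
import re

# Simple patterns; real tweets may need more careful tokenization (assumption).
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")


def extract_features(tweet: str, keyword: str = "earthquake") -> dict:
    """Compute the five statistical features for one tweet.

    Assumed definitions (not taken from the paper):
    - keyword_position: 1-based index of the first token equal to the
      keyword after lowercasing and stripping '#' and punctuation; 0 if absent.
    - tweet_length: number of whitespace-separated tokens.
    - hashtag_count, mention_count, url_count: regex match counts.
    """
    tokens = tweet.split()
    normalized = [t.lower().strip("#.,!?:;\"'") for t in tokens]
    position = normalized.index(keyword) + 1 if keyword in normalized else 0
    return {
        "keyword_position": position,
        "tweet_length": len(tokens),
        "hashtag_count": len(HASHTAG.findall(tweet)),
        "mention_count": len(MENTION.findall(tweet)),
        "url_count": len(URL.findall(tweet)),
    }


if __name__ == "__main__":
    sample = "Massive #earthquake hits Nepal, stay safe @redcross http://example.com #Nepal"
    print(extract_features(sample))
```

Under these assumptions, the two features the abstract singles out (`hashtag_count` and `keyword_position`) could be passed as a two-column feature matrix to any linear classifier, such as a linear-kernel SVM.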

Keywords

Twitter; disaster; Support Vector Machine (SVM); statistical features


Big Data Mining and Analytics

Volume 3 Issue 2, June 2020

Pages 121-130

DOI: 10.26599/BDMA.2019.9020021

Cite this article:

Sreenivasulu M, Sridevi M. Comparative Study of Statistical Features to Detect the Target Event During Disaster. Big Data Mining and Analytics, 2020, 3(2): 121-130. https://doi.org/10.26599/BDMA.2019.9020021


Received: 05 November 2019

Accepted: 21 November 2019

Published: 27 February 2020

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).