Open Access

Analysis and Classification of Fake News Using Sequential Pattern Mining

College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen 518107, China
Department of Computer Science, Faculty of Computing and Information Technology, University of Sargodha, Sargodha 40100, Pakistan

Abstract

Disinformation, often called fake news, is a major issue that has received considerable attention in recent years, and many researchers have proposed methods for detecting and addressing it. Current machine learning and deep learning methods for fake news classification and detection are content-based, network (propagation) based, or multimodal methods that combine textual and visual information. We introduce a framework called FNACSPM, based on sequential pattern mining (SPM), for fake news analysis and classification. In this framework, six publicly available datasets containing a diverse range of fake and real news, along with their combination, are first transformed into a suitable format. SPM algorithms are then applied to the transformed datasets to extract frequent patterns (and rules) of words, phrases, or linguistic features. The extracted patterns capture distinctive characteristics of fake or real news content and provide insight into the underlying structures and commonalities of misinformation. The discovered frequent patterns are subsequently used as features for fake news classification. The framework is evaluated with eight classifiers, whose performance is assessed using several metrics. Extensive experiments show that FNACSPM outperforms other state-of-the-art approaches to fake news classification and that it speeds up the classification task while maintaining high accuracy.
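
The pipeline described in the abstract (transform text into sequences, mine frequent patterns, use the patterns as classification features) can be illustrated with a minimal Python sketch. This is not the FNACSPM implementation: a brute-force enumeration stands in for a real SPM algorithm, and the function names (`mine_frequent_patterns`, `contains`, `to_features`), the toy headlines, and the thresholds are all hypothetical choices made for illustration.

```python
from itertools import combinations


def mine_frequent_patterns(sequences, min_support, max_len=2):
    """Return {pattern: support} for ordered word patterns (gaps allowed)
    of length 1..max_len that occur in at least min_support sequences.
    Brute-force stand-in for a real SPM algorithm (assumption)."""
    counts = {}
    for seq in sequences:
        seen = set()
        for k in range(1, max_len + 1):
            # combinations() keeps the original word order of the sequence
            for pattern in combinations(seq, k):
                seen.add(pattern)
        # count each pattern at most once per sequence
        for pattern in seen:
            counts[pattern] = counts.get(pattern, 0) + 1
    return {p: s for p, s in counts.items() if s >= min_support}


def contains(seq, pattern):
    """True if pattern is a (possibly gapped) subsequence of seq."""
    it = iter(seq)
    return all(word in it for word in pattern)


def to_features(seq, patterns):
    """Binary vector: 1 where the document contains the pattern."""
    return [1 if contains(seq, p) else 0 for p in patterns]


# Toy, made-up headlines (hypothetical data, not from the paper's datasets)
docs = [
    "shocking secret cure revealed".split(),
    "shocking cure doctors hate".split(),
    "council approves budget plan".split(),
    "council budget vote delayed".split(),
]
labels = ["fake", "fake", "real", "real"]

frequent = mine_frequent_patterns(docs, min_support=2)
patterns = sorted(frequent)
features = [to_features(d, patterns) for d in docs]

for label, row in zip(labels, features):
    print(label, row)
```

The resulting binary vectors, paired with `labels`, could then be passed to any standard classifier. Enumerating all word combinations is only feasible for short texts and small `max_len`; dedicated SPM algorithms exist precisely to avoid this exhaustive search.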

Big Data Mining and Analytics
Pages 942-963
Cite this article:
Nawaz MZ, Nawaz MS, Fournier-Viger P, et al. Analysis and Classification of Fake News Using Sequential Pattern Mining. Big Data Mining and Analytics, 2024, 7(3): 942-963. https://doi.org/10.26599/BDMA.2024.9020015


Received: 28 January 2024
Accepted: 11 March 2024
Published: 28 August 2024
© The author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
