High Performance Frequent Subgraph Mining on Transaction Datasets: A Survey and Performance Comparison

Bismita S. Jena; Cynthia Khan; Rajshekhar Sunderraman

doi:10.26599/BDMA.2019.9020006


Open Access


∙ Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA.


Abstract

Graph data mining has become a crucial area of research. Large amounts of graph data are produced in many areas, such as bioinformatics, cheminformatics, and social networks. Scalable graph data mining methods are increasingly popular and necessary as graphs grow more complex. Frequent Subgraph Mining (FSM) is one such area, where the task is to find frequently recurring patterns (subgraphs). Many main memory-based methods were proposed to tackle this problem, but they proved inefficient as data sizes grew exponentially over time. In the past few years, several research groups have attempted to handle the FSM problem in multiple ways. Many authors have tried to achieve better performance using Graphics Processing Units (GPUs), which offer a multi-fold improvement over in-memory methods on large datasets. Later, Google’s MapReduce model with the Hadoop framework proved to be a major breakthrough in high-performance large batch processing. Although MapReduce came with many benefits, its disk I/O and non-iterative model did not help much in the FSM domain, since subgraph mining is an iterative process. In recent years, Spark has emerged as the de facto industry standard with its distributed in-memory computing capability, which also makes it a good fit for iterative programming. In this survey, we cover how high-performance computing has tremendously improved performance for transactional directed and undirected graphs, and we compare the performance of various FSM techniques based on experimental results.

Keywords

frequent subgraphs; isomorphism; Spark


Big Data Mining and Analytics, Volume 2, Issue 3, September 2019, Pages 159-180



Cite this article:

Jena BS, Khan C, Sunderraman R. High Performance Frequent Subgraph Mining on Transaction Datasets: A Survey and Performance Comparison. Big Data Mining and Analytics, 2019, 2(3): 159-180. https://doi.org/10.26599/BDMA.2019.9020006


Received: 13 November 2018

Accepted: 15 February 2019

Published: 04 April 2019