Open Access Article
Appl. Sci. 2018, 8(2), 230; https://doi.org/10.3390/app8020230

A Parallel Approach for Frequent Subgraph Mining in a Single Large Graph Using Spark

Qiao, F. 1,†,*, Zhang, X. 1,†, Li, P. 1,†, Ding, Z. 1,†, Jia, S. 2,† and Wang, H. 1,†
1 College of Engineering System, National University of Defense Technology, Changsha 410073, Hunan, China
2 Digital Media Center, Hunan Education Publishing House, Changsha 410073, Hunan, China
† Current address: No.109, Deya Road, Changsha 410073, Hunan, China.
* Author to whom correspondence should be addressed.
Received: 3 January 2018 / Revised: 28 January 2018 / Accepted: 31 January 2018 / Published: 2 February 2018
(This article belongs to the Special Issue Socio-Cognitive and Affective Computing)

Abstract

Frequent subgraph mining (FSM) plays an important role in graph mining and has attracted a great deal of attention in areas such as bioinformatics, web data mining and social networks. In this paper, we propose SSiGraM (Spark based Single Graph Mining), a Spark-based parallel algorithm for frequent subgraph mining in a single large graph. To address the two computational challenges of FSM, we perform subgraph extension and support evaluation in parallel across all worker nodes of the distributed cluster. In addition, we employ a heuristic search strategy and three novel optimizations (load balancing, pre-search pruning and top-down pruning) in the support evaluation process, which significantly improve performance. Extensive experiments on four real-world datasets demonstrate that the proposed algorithm outperforms the existing GraMi (Graph Mining) algorithm by an order of magnitude on all datasets and can work with a lower support threshold.
Keywords: frequent subgraph mining; parallel algorithm; constraint satisfaction problem; Spark
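
To make the support evaluation step concrete, the following is a minimal sketch and not the SSiGraM implementation. It assumes a minimum-image-based support measure, which the abstract does not specify, and it uses a thread pool only as a stand-in for Spark worker nodes. The toy graph, the candidate patterns and all function names are hypothetical. Each candidate pattern is matched against the single graph by backtracking over label- and edge-consistent assignments, in the style of a constraint satisfaction problem.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy undirected labeled graph (hypothetical data, not from the paper).
LABELS = {0: "A", 1: "B", 2: "A", 3: "B", 4: "C"}
EDGES = {(0, 1), (1, 2), (2, 3), (3, 4), (0, 3)}
ADJ = {v: set() for v in LABELS}
for u, v in EDGES:
    ADJ[u].add(v)
    ADJ[v].add(u)


def embeddings(p_labels, p_edges):
    """Yield every injective mapping pattern -> graph that keeps labels and edges."""
    order = sorted(p_labels)

    def consistent(assign, pv, gv):
        # Label constraint and injectivity constraint.
        if LABELS[gv] != p_labels[pv] or gv in assign.values():
            return False
        # Edge constraint against pattern neighbours that are already assigned.
        for a, b in p_edges:
            other = b if a == pv else a if b == pv else None
            if other is not None and other in assign and assign[other] not in ADJ[gv]:
                return False
        return True

    def extend(assign, i):
        if i == len(order):
            yield dict(assign)
            return
        pv = order[i]
        for gv in LABELS:
            if consistent(assign, pv, gv):
                assign[pv] = gv
                yield from extend(assign, i + 1)
                del assign[pv]

    yield from extend({}, 0)


def mni_support(pattern):
    """Assumed support measure: the smallest number of distinct graph nodes
    that any single pattern node is mapped to over all embeddings."""
    p_labels, p_edges = pattern
    images = {pv: set() for pv in p_labels}
    for emb in embeddings(p_labels, p_edges):
        for pv, gv in emb.items():
            images[pv].add(gv)
    return min(len(nodes) for nodes in images.values())


# Hypothetical candidate patterns: (node labels, edge list).
PATTERNS = {
    "A-B": ({0: "A", 1: "B"}, [(0, 1)]),
    "B-C": ({0: "B", 1: "C"}, [(0, 1)]),
    "A-B-A": ({0: "A", 1: "B", 2: "A"}, [(0, 1), (1, 2)]),
}


def frequent(patterns, threshold, workers=2):
    """Evaluate candidates concurrently; threads stand in for cluster workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        supports = list(pool.map(mni_support, patterns.values()))
    return {name: s for name, s in zip(patterns, supports) if s >= threshold}


if __name__ == "__main__":
    print(frequent(PATTERNS, threshold=2))
```

With a threshold of 2, the patterns `A-B` and `A-B-A` are kept and `B-C` is discarded, since the label `C` occurs only once in the toy graph. In a real Spark setting the candidate patterns would be distributed across executors instead of threads, and the optimizations named in the abstract are not modeled here.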
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

MDPI and ACS Style

Qiao, F.; Zhang, X.; Li, P.; Ding, Z.; Jia, S.; Wang, H. A Parallel Approach for Frequent Subgraph Mining in a Single Large Graph Using Spark. Appl. Sci. 2018, 8, 230.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Appl. Sci., EISSN 2076-3417. Published by MDPI AG, Basel, Switzerland.