Open Access Article

PEnBayes: A Multi-Layered Ensemble Approach for Learning Bayesian Network Structure from Big Data

1 Data Science and Knowledge Engineering Laboratory, College of Computer and Information, Hohai University, Nanjing 210036, China
2 Department of Information Systems, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
3 San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
* Authors to whom correspondence should be addressed.
Sensors 2019, 19(20), 4400; https://doi.org/10.3390/s19204400
Received: 1 September 2019 / Revised: 25 September 2019 / Accepted: 2 October 2019 / Published: 11 October 2019
(This article belongs to the Special Issue Intelligent Sensor Signal in Machine Learning)
Discovering the Bayesian network (BN) structure from big datasets containing rich causal relationships is becoming increasingly valuable for modeling and reasoning under uncertainty in the many areas where big data is gathered from sensors at high volume and velocity. Most current BN structure learning algorithms have shortcomings when facing big data. First, learning a BN structure from an entire big dataset is an expensive task that often ends in failure due to memory constraints. Second, it is difficult to select, from the numerous BN structure learning algorithms, a learner that consistently achieves good learning accuracy. Lastly, no intelligent method exists for merging separately learned BN structures into a well-structured BN. To address these shortcomings, we introduce a novel parallel learning approach called PEnBayes (Parallel Ensemble-based Bayesian network learning). PEnBayes starts with an adaptive data preprocessing phase that calculates the Appropriate Learning Size and intelligently divides a big dataset for fast distributed local structure learning. Then, PEnBayes learns a collection of local BN structures in parallel using a two-layered weighted adjacency matrix-based structure ensemble method. Lastly, PEnBayes merges the local BN structures into a global network structure using the structure ensemble method at the global layer. For the experiments, we generate big datasets by simulating sensor data from the patient monitoring, transportation, and disease diagnosis domains. The experimental results show that PEnBayes achieves significantly improved execution performance, with more consistent and stable results, compared with three baseline learning algorithms.
Keywords: Bayesian network learning; big data; ensemble method; Distributed Data Parallelization; scientific workflow
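
The abstract describes merging separately learned BN structures through a weighted adjacency matrix. The sketch below is one plausible reading of that idea, not the authors' implementation: the function names (`merge_structures`, `has_path`), the per-structure weights, the 0.5 vote threshold, and the greedy cycle check are all assumptions made for illustration, since the abstract does not specify them.

```python
def has_path(adj, src, dst):
    """Depth-first search: is there a directed path from src to dst?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(k for k, edge in enumerate(adj[node]) if edge)
    return False


def merge_structures(adj_matrices, weights, threshold=0.5):
    """Merge candidate DAGs by weighted edge voting (illustrative only).

    adj_matrices: list of n x n 0/1 matrices; m[i][j] == 1 means edge i -> j.
    weights: one non-negative weight per matrix (hypothetical, e.g. a
             normalized structure score).
    threshold: fraction of the total weight an edge must exceed to be
               kept (an assumed value, not taken from the paper).
    """
    total = float(sum(weights))
    n = len(adj_matrices[0])
    # Weighted adjacency matrix: share of the total weight voting for each edge.
    votes = [[sum(w * m[i][j] for w, m in zip(weights, adj_matrices)) / total
              for j in range(n)] for i in range(n)]
    # Edges above the threshold, strongest first (ties broken by index).
    candidates = sorted(
        ((votes[i][j], i, j) for i in range(n) for j in range(n)
         if i != j and votes[i][j] > threshold),
        key=lambda t: (-t[0], t[1], t[2]))
    merged = [[0] * n for _ in range(n)]
    for _, i, j in candidates:
        # Adding i -> j closes a cycle if j already reaches i; skip it then.
        if not has_path(merged, j, i):
            merged[i][j] = 1
    return merged


# Three hypothetical 3-node structures with equal weights.
local_structures = [
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1, 1 -> 2
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1, 1 -> 2
    [[0, 0, 0], [1, 0, 0], [0, 1, 0]],  # 1 -> 0, 2 -> 1
]
merged = merge_structures(local_structures, [1.0, 1.0, 1.0])
```

Under these assumptions the same function could serve both layers: once per data partition to combine the outputs of several learners, and once more to combine the resulting local structures into a global one.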

MDPI and ACS Style

Tang, Y.; Wang, J.; Nguyen, M.; Altintas, I. PEnBayes: A Multi-Layered Ensemble Approach for Learning Bayesian Network Structure from Big Data. Sensors 2019, 19, 4400.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
