Open Access Article

Application of Imbalanced Data Classification Quality Metrics as Weighting Methods of the Ensemble Data Stream Classification Algorithms

Department of Systems and Computer Networks, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland
* Author to whom correspondence should be addressed.
Entropy 2020, 22(8), 849; https://doi.org/10.3390/e22080849
Received: 15 June 2020 / Revised: 27 July 2020 / Accepted: 28 July 2020 / Published: 31 July 2020
(This article belongs to the Section Signal and Data Analysis)
With a growing number of tools and applications constantly producing massive amounts of data, processing and correctly classifying that data is becoming both harder and more important. The task is hindered by changes in the data distribution over time, called concept drift, and by disproportion between classes, as seen in network attack detection or fraud detection. In this work, we propose modifications to two existing stream processing solutions, Accuracy Weighted Ensemble (AWE) and Accuracy Updated Ensemble (AUE), which have demonstrated their effectiveness in adapting to time-varying class distributions. The modifications aim to improve their quality in binary classification of imbalanced data. They consist of including aggregate metrics, such as F1-score, G-mean and balanced accuracy score, in the calculation of the member classifiers’ weights, which affects both the composition of the ensemble and its final prediction. The impact of data sampling on the algorithms’ effectiveness was also examined. Comprehensive experiments were conducted to identify the most promising modification type and to compare the proposed methods with existing solutions. Experimental evaluation shows an improvement in classification quality compared to the underlying algorithms and to other solutions for processing imbalanced data streams.
Keywords: data streams; imbalanced data; classification; classifier ensembles; oversampling
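As a rough illustration of the aggregate metrics named in the abstract, the following sketch computes F1-score, G-mean and balanced accuracy from binary predictions on a single data chunk and uses one of them as a member weight in a weighted vote. It is not the authors' implementation: the function names, the convention that label 1 is the minority (positive) class, and the simple weighted-majority rule are all assumptions made for this example, and the actual weighting and member-replacement rules of AWE and AUE are not reproduced here.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Assumption: an undefined ratio (empty denominator) is scored as 0.
    return numerator / denominator if denominator else 0.0


def f1_score(y_true, y_pred):
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return _ratio(2 * precision * recall, precision + recall)


def g_mean(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    recall = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return math.sqrt(recall * specificity)


def balanced_accuracy(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    recall = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return (recall + specificity) / 2


def weighted_vote(member_predictions, weights, positive=1, negative=0):
    """Hypothetical combination rule: weighted majority vote per sample.

    member_predictions holds one list of labels per ensemble member;
    ties are resolved in favour of the negative (majority) class.
    """
    decisions = []
    for sample_votes in zip(*member_predictions):
        positive_weight = sum(
            w for vote, w in zip(sample_votes, weights) if vote == positive
        )
        negative_weight = sum(weights) - positive_weight
        decisions.append(positive if positive_weight > negative_weight else negative)
    return decisions


# Imbalanced toy chunk: 2 minority samples out of 8.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
member_weight = f1_score(y_true, y_pred)  # metric used as the member's weight
```

On this toy chunk plain accuracy is 0.75, while the F1-score is 0.5, which shows why a metric sensitive to the minority class may rank ensemble members differently than accuracy does.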
MDPI and ACS Style

Wegier, W.; Ksieniewicz, P. Application of Imbalanced Data Classification Quality Metrics as Weighting Methods of the Ensemble Data Stream Classification Algorithms. Entropy 2020, 22, 849. https://doi.org/10.3390/e22080849

AMA Style

Wegier W, Ksieniewicz P. Application of Imbalanced Data Classification Quality Metrics as Weighting Methods of the Ensemble Data Stream Classification Algorithms. Entropy. 2020; 22(8):849. https://doi.org/10.3390/e22080849

Chicago/Turabian Style

Wegier, Weronika; Ksieniewicz, Pawel. 2020. "Application of Imbalanced Data Classification Quality Metrics as Weighting Methods of the Ensemble Data Stream Classification Algorithms" Entropy 22, no. 8: 849. https://doi.org/10.3390/e22080849

