Article

Measuring Performance Metrics of Machine Learning Algorithms for Detecting and Classifying Transposable Elements

1 Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170001, Colombia
2 Department of Systems and Informatics, Universidad de Caldas, Manizales 170004, Colombia
3 Research Group in Software Engineering, Universidad Autónoma de Manizales, Manizales 170001, Colombia
4 Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales 170001, Colombia
5 Institut de Recherche pour le Développement, Univ. Montpellier, UMR DIADE, 34394 Montpellier, France
* Authors to whom correspondence should be addressed.
Processes 2020, 8(6), 638; https://doi.org/10.3390/pr8060638
Received: 25 April 2020 / Revised: 20 May 2020 / Accepted: 22 May 2020 / Published: 27 May 2020
(This article belongs to the Special Issue Bioinformatics Applications Based On Machine Learning)
Because of the promising results obtained by machine learning (ML) approaches in several fields, the use of ML to solve problems in bioinformatics has become increasingly common. In genomics, a current challenge is the detection and classification of transposable elements (TEs), given the tedious tasks involved in traditional bioinformatics methods. ML has therefore recently been evaluated on TE datasets, demonstrating better results than bioinformatics applications. A crucial step for ML approaches is the selection of metrics that measure the realistic performance of algorithms. Each metric has specific characteristics and measures properties that may differ from the predicted results. Although the most common way to compare measures is empirical analysis, a non-result-based methodology, called measure invariance properties, has been proposed. These properties are calculated on the basis of whether a given measure changes its value under certain modifications of the confusion matrix, giving comparative parameters that are independent of the datasets. Measure invariance properties make metrics more or less informative, particularly on unbalanced, monomodal, or multimodal negative class datasets and for real or simulated datasets. Although several studies have applied ML to detect and classify TEs, no works have evaluated performance metrics for TE tasks. Here, we analyzed 26 different metrics used in binary, multiclass, and hierarchical classification, through bibliographic sources, together with their invariance properties. We then corroborated our findings using freely available TE datasets and commonly used ML algorithms. Based on our analysis, the most suitable metrics for TE tasks must remain stable even with highly unbalanced datasets, a multimodal negative class, and training datasets containing errors or outliers.
Based on these parameters, we conclude that the F1-score and the area under the precision-recall curve are the most informative metrics, since they are calculated from other metrics, providing insight into the development of an ML application.
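The invariance-property idea described in the abstract can be illustrated with a minimal sketch (not part of the original article). It shows one concrete invariance: the F1-score is computed only from true positives, false positives, and false negatives, so it is invariant to changes in the true-negative count, whereas accuracy is not. This is why F1 stays informative on highly unbalanced TE datasets, where non-TE sequences can vastly outnumber TEs. The confusion-matrix values below are illustrative, not taken from the paper.

```python
def metrics_from_cm(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# A roughly balanced confusion matrix (hypothetical values)
_, _, f1_a, acc_a = metrics_from_cm(tp=80, fp=20, fn=20, tn=80)

# Same positive-class outcomes, but the negative class inflated tenfold,
# as when non-TE genomic windows vastly outnumber TE windows
_, _, f1_b, acc_b = metrics_from_cm(tp=80, fp=20, fn=20, tn=800)

print(f"F1:       {f1_a:.3f} vs {f1_b:.3f}")  # unchanged: F1 ignores TN
print(f"Accuracy: {acc_a:.3f} vs {acc_b:.3f}")  # inflated by the extra TN
```

Under this check, F1 keeps the same value (0.800) in both scenarios, while accuracy rises from 0.800 to roughly 0.957 purely because of the added true negatives, which is exactly the kind of dataset-dependent behavior the invariance analysis is designed to expose.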
Keywords: transposable elements; metrics; machine learning; deep learning; detection; classification

MDPI and ACS Style

Orozco-Arias, S.; Piña, J.S.; Tabares-Soto, R.; Castillo-Ossa, L.F.; Guyot, R.; Isaza, G. Measuring Performance Metrics of Machine Learning Algorithms for Detecting and Classifying Transposable Elements. Processes 2020, 8, 638. https://doi.org/10.3390/pr8060638

