Open Access Article

Machine Learning Models for Error Detection in Metagenomics and Polyploid Sequencing Data

Faculty of Mathematics and Informatics, University of Sofia “St. Kliment Ohridski”, 5 James Bourchier Blvd., 1164 Sofia, Bulgaria
Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Acad. G. Bonchev Str., Block 8, 1113 Sofia, Bulgaria
Author to whom correspondence should be addressed.
This paper is an extended version of our paper presented at the 18th International Conference AIMSA 2018, Varna, Bulgaria, 12–14 September 2018.
Information 2019, 10(3), 110
Received: 27 January 2019 / Revised: 6 March 2019 / Accepted: 6 March 2019 / Published: 11 March 2019
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)

Metagenomics studies, as well as genomics studies of polyploid species such as wheat, deal with the analysis of high-variation data. Such data contain sequences from similar but distinct genetic chains, which presents an obstacle to analysis and research. In particular, instrumentation errors introduced during the digitalization of the sequences can be hard to detect, because they may be indistinguishable from the real biological variation in the digital data. This can prevent the determination of the correct sequences and, at the same time, make variant studies significantly more difficult. This paper details a collection of ML-based models used to distinguish a real variant from an erroneous one. The focus is on using these models directly, but experiments are also carried out in which they are combined with other predictors that isolate a pool of error candidates.
Keywords: machine learning; neural network; NGS errors; metagenomics; polyploid genomes

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Krachunov, M.; Nisheva, M.; Vassilev, D. Machine Learning Models for Error Detection in Metagenomics and Polyploid Sequencing Data. Information 2019, 10, 110.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Information, EISSN 2078-2489. Published by MDPI AG, Basel, Switzerland.