Article

InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning

1 Department of Computer Science, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
2 Department of Systems and Informatics, Universidad de Caldas, 170002 Manizales, Colombia
3 Department of Physics and Mathematics, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
4 Department of Electronics and Automation, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
5 Institut de Recherche pour le Développement, CIRAD, University of Montpellier, 34394 Montpellier, France
* Authors to whom correspondence should be addressed.
Academic Editor: Dariusz Grzebelus
Genes 2021, 12(2), 190; https://doi.org/10.3390/genes12020190
Received: 30 December 2020 / Revised: 21 January 2021 / Accepted: 22 January 2021 / Published: 28 January 2021
(This article belongs to the Special Issue Transposable Elements in Plant Genomes)
Long terminal repeat (LTR) retrotransposons are mobile elements that constitute the major fraction of most plant genomes. Identifying and annotating these elements with bioinformatics approaches is a major challenge in the era of massive plant genome sequencing. In addition to their involvement in genome size variation, LTR retrotransposons are associated with the function and structure of different chromosomal regions and can, among other effects, alter the function of coding regions. Several sequence databases of plant LTR retrotransposons are available, either publicly, such as PGSB and RepetDB, or under restricted access, such as Repbase. Although these databases are useful for identifying LTR retrotransposons in new genomes by similarity, their elements are not fully classified to the lineage (also called family) level. Here, we present InpactorDB, a semi-curated dataset composed of 130,439 elements from 195 plant genomes (belonging to 108 plant species) classified to the lineage level. This dataset has been used to train two deep neural networks (one fully connected and one convolutional) for the rapid classification of these elements. In lineage-level classification, we obtain scores of up to 98% for F1-score, precision and recall.
Keywords: LTR retrotransposons; machine learning; deep neural networks; bioinformatics; plant genomes; genomics; InpactorDB
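
The abstract reports lineage-level performance as F1-score, precision and recall. As a reminder of how these three scores relate in a multi-class setting, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the function names `per_class_scores` and `macro_average` are hypothetical, the toy labels are not taken from InpactorDB, and the abstract does not say which averaging scheme the authors used, so the macro average shown here is only one common choice.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for a multi-class task."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy data (assumption): six elements with illustrative lineage labels.
y_true = ["TAT", "TAT", "SIRE", "SIRE", "CRM", "CRM"]
y_pred = ["TAT", "SIRE", "SIRE", "SIRE", "CRM", "CRM"]

scores = per_class_scores(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_average(scores)
print(f"precision={macro_precision:.3f} recall={macro_recall:.3f} f1={macro_f1:.3f}")
```

In this toy case one element is misassigned to another lineage, which lowers recall for its true lineage and precision for the predicted one. Reporting all three scores together, as the abstract does, makes that kind of asymmetry visible.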

MDPI and ACS Style

Orozco-Arias, S.; Jaimes, P.A.; Candamil, M.S.; Jiménez-Varón, C.F.; Tabares-Soto, R.; Isaza, G.; Guyot, R. InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning. Genes 2021, 12, 190. https://doi.org/10.3390/genes12020190

AMA Style

Orozco-Arias S, Jaimes PA, Candamil MS, Jiménez-Varón CF, Tabares-Soto R, Isaza G, Guyot R. InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning. Genes. 2021; 12(2):190. https://doi.org/10.3390/genes12020190

Chicago/Turabian Style

Orozco-Arias, Simon, Paula A. Jaimes, Mariana S. Candamil, Cristian F. Jiménez-Varón, Reinel Tabares-Soto, Gustavo Isaza, and Romain Guyot. 2021. "InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning" Genes 12, no. 2: 190. https://doi.org/10.3390/genes12020190
