Deep Learning Models for Genomics

A special issue of Life (ISSN 2075-1729). This special issue belongs to the section "Biochemistry, Biophysics and Computational Biology".

Deadline for manuscript submissions: closed (15 January 2023) | Viewed by 16910

Special Issue Editors

Guest Editor
Institute of Computer Science (ICS) Foundation for Research and Technology, 71110 Heraklion, Greece
Interests: computational biology; machine learning; health informatics; natural language processing; clinical decision support systems; medical and genomics data analysis

Guest Editor
Institute of Computer Science (ICS) Foundation for Research and Technology – Hellas (FORTH), 70013 Heraklion, Greece
Interests: biomedical informatics; computational medicine; translational research in biomedicine; machine learning; intelligent decision support

Guest Editor
Institute of Computer Science (ICS) Foundation for Research and Technology – Hellas (FORTH), 70013 Heraklion, Greece
Interests: bioinformatics; population genetics; computational biology; scientific workflows; data mining; machine learning

Special Issue Information

Dear Colleagues,

With the evolution of biotechnology and the introduction of high-throughput and next-generation sequencing (NGS), biomedical researchers can produce and analyze vast amounts of -omics data. The entrance of post-genomics research into the big data era forces bioinformatics approaches to adopt advanced machine learning methodologies in order to identify patterns, make predictions and model the progression or treatment of a disease.

By effectively leveraging large data sets, deep learning (DL) and sophisticated neural network architectures have already transformed research fields such as computer vision and natural language processing. This success has triggered the adoption of DL approaches in biomedical research, making DL the method of choice for many genomics modelling tasks. This move creates unprecedented momentum in biomedical informatics, with the introduction of new bioinformatics and computational biology methodologies. Compared with state-of-the-art methodologies, elaborate DL models achieve better performance and higher predictive accuracy in specific genomics tasks. Moreover, DL approaches have proved effective in dealing with multimodal data, a critical factor in holistic, systems-based genomics research, where the generation of heterogeneous data is the rule. This makes DL an excellent candidate for the realization of precision medicine endeavors.

Given the increasing interest in, and growing trend toward, the application of DL architectures in genomics research, this special issue aims to collect new efforts in genomic biomedical research that utilize DL approaches and employ the respective models. Of special interest are methodologies that ease the explainability of DL predictions and thereby support the validation of the respective diagnostic, prognostic and therapeutic models.

Dr. Lefteris Koumakis
Dr. George Potamias
Dr. Alexandros Kanterakis
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Life is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep Learning
  • Bioinformatics
  • Computational Biology
  • Genomics 
  • Transcriptomics 
  • Proteomics
  • Genetic variability
  • Precision Medicine
  • Explainable AI for genomics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)

Research

20 pages, 7462 KiB  
Article
Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction
by Chuan-Ming Liu, Van-Dai Ta, Nguyen Quoc Khanh Le, Direselign Addis Tadesse and Chongyang Shi
Life 2022, 12(8), 1213; https://doi.org/10.3390/life12081213 - 10 Aug 2022
Cited by 10 | Viewed by 2431
Abstract
In recent years, much research has found that dysregulation of glutarylation is associated with many human diseases, such as diabetes, cancer, and glutaric aciduria type I. Therefore, glutarylation identification and characterization are essential tasks for determining modification-specific proteomics. This study aims to propose a novel deep neural network framework based on word embedding techniques for glutarylation sites prediction. Multiple deep neural network models are implemented to evaluate the performance of glutarylation sites prediction. Furthermore, an extensive experimental comparison of word embedding techniques is conducted to utilize the most efficient method for improving protein sequence data representation. The results suggest that the proposed deep neural networks not only improve protein sequence representation but also work effectively in glutarylation sites prediction by obtaining a higher accuracy and confidence rate compared to the previous work. Moreover, embedding techniques were proven to be more productive than the pre-trained word embedding techniques for glutarylation sequence representation. Our proposed method has significantly outperformed all traditional performance metrics compared to the advanced integrated vector support, with accuracy, specificity, sensitivity, and correlation coefficient of 0.79, 0.89, 0.59, and 0.51, respectively. It shows the potential to detect new glutarylation sites and uncover the relationships between glutarylation and well-known lysine modification.
(This article belongs to the Special Issue Deep Learning Models for Genomics)

20 pages, 2700 KiB  
Article
Cancer Type Classification in Liquid Biopsies Based on Sparse Mutational Profiles Enabled through Data Augmentation and Integration
by Alexandra Danyi, Myrthe Jager and Jeroen de Ridder
Life 2022, 12(1), 1; https://doi.org/10.3390/life12010001 - 21 Dec 2021
Cited by 4 | Viewed by 2655
Abstract
Identifying the cell of origin of cancer is important to guide treatment decisions. Machine learning approaches have been proposed to classify the cell of origin based on somatic mutation profiles from solid biopsies. However, solid biopsies can cause complications and certain tumors are not accessible. Liquid biopsies are promising alternatives but their somatic mutation profile is sparse and current machine learning models fail to perform in this setting. We propose an improved method to deal with sparsity in liquid biopsy data. Firstly, data augmentation is performed on sparse data to enhance model robustness. Secondly, we employ data integration to merge information from: (i) SNV density; (ii) SNVs in driver genes and (iii) trinucleotide motifs. Our adapted method achieves an average accuracy of 0.88 and 0.65 on data where only 70% and 2% of SNVs are retained, compared to 0.83 and 0.41 with the original model, respectively. The method and results presented here open the way for application of machine learning in the detection of the cell of origin of cancer from liquid biopsy data.
(This article belongs to the Special Issue Deep Learning Models for Genomics)

Review

28 pages, 1818 KiB  
Review
Artificial Intelligence and Cardiovascular Genetics
by Chayakrit Krittanawong, Kipp W. Johnson, Edward Choi, Scott Kaplin, Eric Venner, Mullai Murugan, Zhen Wang, Benjamin S. Glicksberg, Christopher I. Amos, Michael C. Schatz and W. H. Wilson Tang
Life 2022, 12(2), 279; https://doi.org/10.3390/life12020279 - 14 Feb 2022
Cited by 19 | Viewed by 10362
Abstract
Polygenic diseases, which are genetic disorders caused by the combined action of multiple genes, pose unique and significant challenges for the diagnosis and management of affected patients. A major goal of cardiovascular medicine has been to understand how genetic variation leads to the clinical heterogeneity seen in polygenic cardiovascular diseases (CVDs). Recent advances and emerging technologies in artificial intelligence (AI), coupled with the ever-increasing availability of next generation sequencing (NGS) technologies, now provide researchers with unprecedented possibilities for dynamic and complex biological genomic analyses. Combining these technologies may lead to a deeper understanding of heterogeneous polygenic CVDs, better prognostic guidance, and, ultimately, greater personalized medicine. Advances will likely be achieved through increasingly frequent and robust genomic characterization of patients, as well as the integration of genomic data with other clinical data, such as cardiac imaging, coronary angiography, and clinical biomarkers. This review discusses the current opportunities and limitations of genomics; provides a brief overview of AI; and identifies the current applications, limitations, and future directions of AI in genomics.
(This article belongs to the Special Issue Deep Learning Models for Genomics)