Advances in AI and Machine Learning for the Analysis of -Omics and Complex Molecular Data

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Molecular Informatics".

Deadline for manuscript submissions: closed (20 February 2025) | Viewed by 8626

Special Issue Editors


Guest Editor
1. Department für Biotechnologie, Universität für Bodenkultur Wien (BOKU), Vienna, Austria
2. Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
Interests: machine learning; artificial intelligence; quantitative assays

Guest Editor
Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
Interests: machine learning; computational biology; bioinformatics; protein function

Special Issue Information

Dear Colleagues,

Increasingly, AI and machine learning spearhead efforts in analyzing the complex datasets generated by high-throughput -omic technologies. Advances in AI and machine learning, on the one hand, and progress in their applications, on the other hand, are traditionally pursued by different scientific communities, which we aim to bring together in this Special Issue of the IJMS.

We thus invite you to share your best work in the following domains:

(1) Advancing AI and machine learning for the analysis of -omics and complex molecular data. We welcome methodological advances or insights that robustly generalize to different data sources. Where complex algorithms or pipelines are introduced, individual steps need to be justified, such as through ablation studies.

(2) Applying AI and machine learning for novel insights into the mechanisms of biological processes or systems at the molecular level. We welcome novel insights concerning molecular functions, regulation mechanisms, pathways (regulation, signaling, metabolic, etc.), or molecular pathology. The identification of biomarkers is of interest if robust across cohorts or linked to mechanisms.

Novel insights should be developed in the context of complex systems, including, but not limited to, studies on organism interactions, healthy cohorts, or heterogeneous diseases, such as cardiovascular, autoimmune, or ageing-related diseases, and cancer.

We sincerely hope that this Special Issue can showcase your latest work!

This Special Issue is edited by members of COST Action AtheroNET CA21153 (Network for implementing multi-omics approaches in atherosclerotic cardiovascular disease prevention and research, www.atheronet.eu).

Prof. Dr. David P Kreil
Dr. Aleksandra Gruca
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

 

Keywords

  • computational biology
  • bioinformatics
  • machine learning/AI
  • high-throughput data analysis
  • multi-omics
  • genomics
  • transcriptomics
  • proteomics
  • metabolomics
  • regulation mechanisms
  • pathway analysis (regulation, signaling, and metabolic)
  • molecular pathology
  • complex diseases (cancer, cardiovascular, autoimmune, ageing-related, etc.)
  • biomarkers
  • functional prediction/annotation
  • benchmarking

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)

Research

16 pages, 4225 KiB  
Article
Integrating Metabolomics Domain Knowledge with Explainable Machine Learning in Atherosclerotic Cardiovascular Disease Classification
by Everton Santana, Eliana Ibrahimi, Evangelos Ntalianis, Nicholas Cauwenberghs and Tatiana Kuznetsova
Int. J. Mol. Sci. 2024, 25(23), 12905; https://doi.org/10.3390/ijms252312905 - 30 Nov 2024
Viewed by 807
Abstract
Metabolomic data often present challenges due to high dimensionality, collinearity, and variability in metabolite concentrations. Machine learning (ML) application in metabolomic analyses is enabling the extraction of meaningful information from complex data. Bringing together domain-specific knowledge from metabolomics with explainable ML methods can refine the predictive performance and interpretability of models used in atherosclerosis research. In this work, we aimed to identify the most impactful metabolites associated with the presence of atherosclerotic cardiovascular disease (ASCVD) in cross-sectional case–control studies using explainable ML methods integrated with metabolomics domain knowledge. For this, a subset from the FLEMENGHO cohort with metabolomic data available was used as the training cohort, including 63 patients with a history of ASCVD and 52 non-smoking controls matched by age, sex, and body mass index from the same population. First, Partial Least Squares Discriminant Analysis (PLS-DA) was applied for dimensionality reduction. The selected metabolites’ correlations were analyzed by considering their chemical categorization. Then, eXtreme Gradient Boosting (XGBoost) was used to identify metabolites that characterize ASCVD. Next, the selected metabolites were evaluated in an external cohort to determine their effectiveness in distinguishing between cases and controls. A total of 56 metabolites were selected for ASCVD discrimination using PLS-DA. The primary identified metabolites’ superclasses included lipids, organic acids, and organic oxygen compounds. Upon integrating these metabolites with the XGBoost model, the classification yielded a test area under the curve (AUC) of 0.75. SHAP analyses ranked cholesterol, 3-methylhistidine, and glucuronic acid among the most impactful features and showed the diversity of metabolites considered for building the ASCVD discriminator. 
Also using XGBoost, the selected metabolites achieved an AUC of 0.93 in an independent external validation cohort. In conclusion, the combination of different metabolites has the potential to build classifiers for ASCVD. Integrating metabolite categorization within the SHAP analysis further enhanced the interpretability of the model, offering insights into metabolite-specific contributions to ASCVD risk.

19 pages, 3020 KiB  
Article
Multimodal Identification of Molecular Factors Linked to Severe Diabetic Foot Ulcers Using Artificial Intelligence
by Anita Omo-Okhuasuyi, Yu-Fang Jin, Mahmoud ElHefnawi, Yidong Chen and Mario Flores
Int. J. Mol. Sci. 2024, 25(19), 10686; https://doi.org/10.3390/ijms251910686 - 4 Oct 2024
Viewed by 2126
Abstract
Diabetic foot ulcers (DFUs) are a severe complication of diabetes mellitus (DM), which often lead to hospitalization and non-traumatic amputations in the United States. Diabetes prevalence estimates in South Texas exceed the national estimate and the number of diagnosed cases is higher among Hispanic adults compared to their non-Hispanic white counterparts. San Antonio, a predominantly Hispanic city, reports significantly higher annual rates of diabetic amputations compared to Texas. The late identification of severe foot ulcers minimizes the likelihood of reducing amputation risk. The aim of this study was to identify molecular factors related to the severity of DFUs by leveraging a multimodal approach. We first utilized electronic health records (EHRs) from two large demographic groups, encompassing thousands of patients, to identify blood tests such as cholesterol, blood sugar, and specific protein tests that are significantly associated with severe DFUs. Next, we translated the protein components from these blood tests into their ribonucleic acid (RNA) counterparts and analyzed them using public bulk and single-cell RNA sequencing datasets. Using these data, we applied a machine learning pipeline to uncover cell-type-specific and molecular factors associated with varying degrees of DFU severity. Our results showed that several blood test results, such as the Albumin/Creatinine Ratio (ACR) and cholesterol and coagulation tissue factor levels, correlated with DFU severity across key demographic groups. These tests exhibited varying degrees of significance based on demographic differences. Using bulk RNA-Sequenced (RNA-Seq) data, we found that apolipoprotein E (APOE) protein, a component of lipoproteins that are responsible for cholesterol transport and metabolism, is linked to DFU severity. Furthermore, the single-cell RNA-Seq (scRNA-seq) analysis revealed a cluster of cells identified as keratinocytes that showed overexpression of APOE in severe DFU cases. 
Overall, this study demonstrates how integrating extensive EHRs data with single-cell transcriptomics can refine the search for molecular markers and identify cell-type-specific and molecular factors associated with DFU severity while considering key demographic differences.

21 pages, 16949 KiB  
Article
Combining Hyperspectral Techniques and Genome-Wide Association Studies to Predict Peanut Seed Vigor and Explore Associated Genetic Loci
by Zhenhui Xiong, Shiyuan Liu, Jiangtao Tan, Zijun Huang, Xi Li, Guidan Zhuang, Zewu Fang, Tingting Chen and Lei Zhang
Int. J. Mol. Sci. 2024, 25(15), 8414; https://doi.org/10.3390/ijms25158414 - 1 Aug 2024
Viewed by 1489
Abstract
Seed vigor significantly affects peanut breeding and agricultural yield by influencing seed germination and seedling growth and development. Traditional vigor testing methods are inadequate for modern high-throughput assays. Although hyperspectral technology shows potential for monitoring various crop traits, its application in predicting peanut seed vigor is still limited. This study developed and validated a method that combines hyperspectral technology with genome-wide association studies (GWAS) to achieve high-throughput detection of seed vigor and identify related functional genes. Hyperspectral phenotyping data and physiological indices from different peanut seed populations were used as input data to construct models using machine learning regression algorithms to accurately monitor changes in vigor. Model-predicted phenotypic data from 191 peanut varieties were used in GWAS, gene-based association studies, and haplotype analyses to screen for functional genes. Real-time fluorescence quantitative PCR (qPCR) was used to analyze the expression of functional genes in three high-vigor and three low-vigor germplasms. The results indicated that the random forest and support vector machine models provided effective phenotypic data. We identified Arahy.VMLN7L and Arahy.7XWF6F, with Arahy.VMLN7L negatively regulating seed vigor and Arahy.7XWF6F positively regulating it, suggesting distinct regulatory mechanisms. This study confirms that GWAS based on hyperspectral phenotyping reveals genetic relationships in seed vigor levels, offering novel insights and directions for future peanut breeding, accelerating genetic improvements, and boosting agricultural yields. This approach can be extended to monitor and explore germplasms and other key variables in various crops.

20 pages, 23188 KiB  
Article
Boosting Clear Cell Renal Carcinoma-Specific Drug Discovery Using a Deep Learning Algorithm and Single-Cell Analysis
by Yishu Wang, Xiaomin Chen, Ningjun Tang, Mengyao Guo and Dongmei Ai
Int. J. Mol. Sci. 2024, 25(7), 4134; https://doi.org/10.3390/ijms25074134 - 8 Apr 2024
Cited by 1 | Viewed by 3219
Abstract
Clear cell renal carcinoma (ccRCC), the most common subtype of renal cell carcinoma, has the high heterogeneity of a highly complex tumor microenvironment. Existing clinical intervention strategies, such as target therapy and immunotherapy, have failed to achieve good therapeutic effects. In this article, single-cell transcriptome sequencing (scRNA-seq) data from six patients downloaded from the GEO database were adopted to describe the tumor microenvironment (TME) of ccRCC, including its T cells, tumor-associated macrophages (TAMs), endothelial cells (ECs), and cancer-associated fibroblasts (CAFs). Based on the differential typing of the TME, we identified tumor cell-specific regulatory programs that are mediated by three key transcription factors (TFs), whilst the TF EPAS1/HIF-2α was identified via drug virtual screening through our analysis of ccRCC’s protein structure. Then, a combined deep graph neural network and machine learning algorithm were used to select anti-ccRCC compounds from bioactive compound libraries, including the FDA-approved drug library, natural product library, and human endogenous metabolite compound library. Finally, five compounds were obtained, including two FDA-approved drugs (flufenamic acid and fludarabine), one endogenous metabolite, one immunology/inflammation-related compound, and one inhibitor of DNA methyltransferase (N4-methylcytidine, a cytosine nucleoside analogue that, like zebularine, has the mechanism of inhibiting DNA methyltransferase). Based on the tumor microenvironment characteristics of ccRCC, five ccRCC-specific compounds were identified, which would give direction of the clinical treatment for ccRCC patients.