Unravelling Convergent Signaling Mechanisms Underlying the Aging-Disease Nexus Using Computational Language Analysis

Junyent, Marina; Noori, Haki; De Schepper, Robin; Frajdenberg, Shanna; Elsaigh, Razan Khalid Abdullah Hussen; McDonald, Patricia H.; Duckett, Derek; Maudsley, Stuart

doi:10.3390/cimb47030189

by Marina Junyent ¹,², Haki Noori ¹,³, Robin De Schepper ¹, Shanna Frajdenberg ¹, Razan Khalid Abdullah Hussen Elsaigh ¹, Patricia H. McDonald ⁴, Derek Duckett ⁵ and Stuart Maudsley ¹,⁵,*

¹ Receptor Biology Lab., University of Antwerp, 2610 Wilrijk, Belgium

² IMIM, Hospital del Mar Research Institute, 08003 Barcelona, Spain

³ Department of Chemistry, KU Leuven, Oude Markt 13, 3000 Leuven, Belgium

⁴ Lexicon Pharmaceuticals Inc., 2445 Technology Forest Blvd Fl 1, The Woodlands, TX 77381, USA

⁵ Department of Drug Discovery, H. Lee Moffitt Cancer Center, 12902 Magnolia Drive, Tampa, FL 33612, USA

* Author to whom correspondence should be addressed.

Curr. Issues Mol. Biol. 2025, 47(3), 189; https://doi.org/10.3390/cimb47030189

Submission received: 21 January 2025 / Revised: 12 February 2025 / Accepted: 8 March 2025 / Published: 14 March 2025

(This article belongs to the Collection Bioinformatics Approaches to Biomedicine)

Abstract

Multiple lines of evidence suggest that many of the pathological conditions and diseases that account for the majority of human mortality are driven by the molecular aging process. At the cellular level, aging can largely be conceptualized as the progressive accumulation of molecular damage, leading to cellular dysfunction. As many diseases, e.g., cancer, coronary heart disease, chronic obstructive pulmonary disease, Type II diabetes mellitus, or chronic kidney disease, potentially share a common molecular etiology, the identification of such mechanisms may reveal an ideal locus at which to develop targeted prophylactic agents that can mitigate this disease-driving mechanism. Here, using artificial intelligence systems to generate unbiased disease and aging mechanism profiles, we aimed to identify key signaling mechanisms that may represent new pathways suitable for the creation of disease-preventing chemical interventions. Using a combinatorial informatics approach, we have identified a potentially critical mechanism involving the recently identified kinase, Dual specificity tyrosine-phosphorylation-regulated kinase 3 (DYRK3), and the epidermal growth factor receptor (EGFR) that may function as a regulator of the pathological transition from health to disease via the control of cellular fate in response to stressful insults.

Keywords:

disease; aging; mechanisms; informatics; target; convergence; signal transduction; receptor; kinase; stress

1. Introduction

Human aging is a complex process that involves a panoply of coordinated molecular changes in cells and tissues throughout the body. These changes can include alterations in gene expression, protein structure and function, cellular signaling pathways, and the accumulation of several types of damage to biomolecules such as DNA, proteins, and lipids. This damage invariably causes a unidirectional diminution of physiological integrity that engenders loss of physiological functionality, resulting in an increased cellular vulnerability to morbidity and finally mortality [1,2,3,4,5,6,7]. With this deleterious effect of aging on cellular resilience—at a local and a systemic level—it is unsurprising that such a negative impact on the global cellular health status can lead to the increased incidence of multiple disease pathophysiologies including neurodegenerative conditions such as Alzheimer’s disease, osteoporosis, chronic kidney disease, schizophrenia, depression, and T2DM [8,9,10,11,12,13,14,15,16,17]. Given this significant intersection between the aging process and disease, it has recently been a subject of debate as to whether ‘pathological aging’ per se can be considered a specific disease [18,19]. As aging and the accumulation of damage is a natural process observed in nearly all organisms, it differs in a sense from aging-related diseases by its near universal presence in organisms, while aging-related diseases present at a much lower incidence rate [20,21,22]. Hence, the aging process itself is generally not considered to be a specific disease [23,24]. Aging, however, presents itself as a potent pathological process that facilitates, or perhaps even induces, the creation of disease(s) in elderly people [25] and can as such be considered a prime risk factor for the development of many of the world’s most debilitating disorders [26,27,28,29].

Given the profound linkage between aging and disease, it is ever more important to understand how this intersection may occur at the molecular level and which specific therapeutically tractable targets could be present at this nexus. Molecular aging is likely driven by a combination of both intrinsic and extrinsic factors that cause progressive changes that lead to imbalances in cellular homeostasis [30,31,32]. These factors can include genetic predisposition, environmental stressors such as radiation or oxidative stress, and lifestyle factors such as diet and exercise [33,34,35]. Some of the molecular changes, often codified as ‘hallmarks’ of aging, include telomere shortening (leading to genomic instability), mitochondrial dysfunction, altered intercellular communication, oxidation-associated DNA damage, accumulation of senescent cells, and impaired protein quality control (proteostasis) [1,2]. These disruptions of normal cellular function invariably induce molecular damage, not all of which can be repaired [36]. This unrepaired damage thus accumulates over time, and beyond a certain degree it results in the transition from healthy homeostasis to disrupted homeostasis, i.e., pathophysiology and disease [37,38].

In this context, it is evident that aging-related disorders share many common features: oxidative stress, DNA damage, metabolic disruption, and loss of cellular resilience [1,2,3,39,40,41], suggesting the potential for new multidimensional therapeutic strategies to mitigate this generic point of disease/aging genesis. Here, we have chosen to apply a multi-methodological approach to impartially identify tractable intervention mechanisms to control the key signaling features of the potential molecular intersection between cellular aging mechanisms and the ultimate disease process. To this end, we have used large language artificial intelligence models, interactomic analyses, semantic data mining, and molecular signature analysis to identify the signaling systems that could be manipulated to prevent aging-induced multi-disease etiology. Using this comprehensive combinatorial approach, we have identified a potentially new disease intervention mechanism linked to DYRK3 (Dual specificity tyrosine-phosphorylation-regulated kinase 3) and the epidermal growth factor receptor (EGFR).

2. Materials and Methods

2.1. Large Language Model-Based Data Curation

Using the freely accessible large language model application ChatGPT v. 3.5 (https://chat.openai.com/: OpenAI Inc., San Francisco, CA, USA), a simple semantic query was employed as follows: ‘Can you generate a list of 100 proteins associated with…’. This interrogation term was completed with the insertion of a specific disease term: Coronary Heart Disease; Cancer; Chronic Obstructive Pulmonary Disorder; Stroke; Alzheimer’s Disease; Type II Diabetes Mellitus; Chronic Kidney Disease; Non-Alcoholic Fatty Liver Disease; Long-COVID; Major Depressive Disorder. The specific interrogator terms used for the diverse aging mechanisms were: genomic instability; telomere attrition; disrupted epigenetic regulation; disrupted proteostasis; disrupted nutrient sensing; mitochondrial dysfunction; stem cell depletion; disrupted cell-cell communication; cell senescence; cell frailty. To generate random lists of 100 proteins, the application Random Gene Set Generator (https://molbiotools.com/randomgenesetgenerator.php: molbiotools, Prague, Czech Republic) was employed. Random Gene Set Generator is a freely available web-based platform for the generation of random selections of genes from the genome of a specified organism. The random selection algorithm employed is based on the Mersenne Twister pseudorandom number generator (PRNG), and the source lists of genes have been parsed from species-specific genome annotation files that are readily available at Ensembl (https://useast.ensembl.org/info/data/ftp/index.html). As an alternative mechanism to generate text-associated protein lists, we employed the application PubPular v. 3.1.2 (https://heart.shinyapps.io/PubPular/: Denver, CO, USA). PubPular is an application that enables rapid bibliometric analysis, in conjunction with text mining and data curation, to generate prioritized proteins statistically associated with a specific input research topic [42]. All web-based applications were accessed on 1 July 2024.
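The random control lists described above can be illustrated with a minimal Python sketch. It is not the Random Gene Set Generator itself: the gene pool is a hypothetical placeholder (a real run would parse symbols from an Ensembl annotation file), and the function name, seed, and list count are assumptions. The one grounded point is that Python's `random` module also uses the Mersenne Twister.

```python
import random

def random_gene_sets(gene_pool, n_sets=10, set_size=100, seed=0):
    """Draw n_sets random gene lists, each sampled without replacement."""
    # random.Random is backed by the Mersenne Twister (MT19937) generator.
    rng = random.Random(seed)
    pool = sorted(set(gene_pool))  # de-duplicate and fix the order for reproducibility
    return [rng.sample(pool, set_size) for _ in range(n_sets)]

# Hypothetical placeholder symbols standing in for a parsed genome annotation.
gene_pool = [f"GENE{i}" for i in range(20000)]
random_sets = random_gene_sets(gene_pool)
print(len(random_sets), len(random_sets[0]))
```

Fixing the seed makes the control lists reproducible, which is useful when the same random baseline is reused across several comparisons.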

2.2. Network Function Analysis

To generate and analyze protein-protein interaction (PPI) networks, STRING (https://string-db.org/) v. 12 was employed with the protein ID set as human. STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) [43] is a widely used database and analytical tool that integrates known and predicted protein-protein associations from multiple sources. STRING provides insights into how proteins interact within cellular processes, aiding in the understanding of biological functions and pathways. For clustering proteins into subgroups within a network, k-means clustering was employed. Unless specifically stated otherwise, the minimum required interaction score was set at 0.4. For multiplexed protein-protein/chemical network analysis, the applications INDRA (Integrated Network and Dynamical Reasoning Assembler) and NDEx [44] were employed via the DarkKinome database system (https://darkkinome.org/) [45]. This database is a resource that focuses on the underexplored or ‘dark’ kinome, i.e., kinases that have been the subject of few studies compared to more commonly studied kinases. All web-based applications were accessed on 1 July 2024.

2.3. Pathway Enrichment Analysis

Multiple applications were employed to perform pathway enrichment analysis. Unless specifically stated, the primary method used was over-representation analysis (ORA), using hypergeometric tests of probability. For all the pathway/Gene Ontology enrichment results stated, the basal inclusion criterion for a scored pathway/Gene Ontology term was the presence of at least two independent proteins enriching the specific category at a probability of p ≤ 0.05. Two major multiplexed ORA platforms were employed routinely in this study to facilitate specific pathway enrichment analyses, i.e., Enrichr (https://maayanlab.cloud/Enrichr/: [46,47,48]) and GeneTrail v.3.2 (https://genetrail.bioinf.uni-sb.de/: [49]). All web-based applications were accessed on 1 July 2024.
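The hypergeometric test behind ORA, together with the two-protein inclusion criterion, can be sketched as follows. This is an illustrative stdlib implementation, not the code used by Enrichr or GeneTrail; the background size of 20,000 genes and the function names are assumptions.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k): chance of k or more pathway members in a signature.

    M: background size, n: pathway members in the background,
    N: signature size, k: observed overlap.
    """
    upper = min(n, N)
    total = comb(M, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total

def passes_inclusion(k, M, n, N, alpha=0.05, min_hits=2):
    """Apply the stated criterion: at least two proteins and p <= 0.05."""
    return k >= min_hits and hypergeom_sf(k, M, n, N) <= alpha

# Illustrative numbers only: a 100-protein signature, a 100-gene pathway,
# and an assumed background of 20,000 genes.
print(passes_inclusion(5, 20000, 100, 100))
```

No multiple-testing correction is applied here; the enrichment platforms named above typically report adjusted p-values alongside the raw ones.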

2.4. Data Representation and Venn Analyses

Word frequency analysis and word cloud generation were performed using the Word Frequency Counter from Code Beautify (https://codebeautify.org/word-frequency-counter) and Word Art (https://wordart.com/create). Venn diagram separation and analysis were performed using InteractiVenn [50] and the online Venn Diagram generator from the Bioinformatics and Evolutionary Genomics group at Ghent University (https://bioinformatics.psb.ugent.be/webtools/Venn/). All web-based applications were accessed on 1 July 2024.

2.5. Cell Culture and Treatment

Human HEK293 cells (CRL 1573) were obtained from ECACC and cultured at 37 °C with 5% CO₂ ambient tension, according to the approved culture protocols defined for these cells. Cells were maintained in Dulbecco’s Modified Eagle Medium (DMEM; Sigma-Aldrich, St. Louis, MO, USA) with 10% fetal bovine serum (FBS)-containing propagation media, supplemented with 1% Penicillin/Streptomycin antibiotics as previously described [51]. One day prior to experimentation, 3 × 10⁶ cells were seeded into 10 cm plates to obtain a 50–80% cell confluence on the day of the transfection. Cells were counted using a Luna II Automated Cell Counter (Invitrogen-Life Technologies, Thermo Fisher Scientific, Carlsbad, CA, USA). To induce oxidative stress, cells were treated with 100 nM hydrogen peroxide (H₂O₂/peroxide) for 90 min. Cellular proteins were extracted using an NP-40-based lysis buffer (150 mM NaCl, 50 mM Tris, 0.5% Sodium deoxycholate, 1% NP-40) supplemented with a phosphatase inhibitor cocktail (PhosSTOP, Roche Diagnostics, Rotkreuz, Switzerland) and a protease inhibitor cocktail (complete mini, Roche Diagnostics).

2.6. Immunoblot and Immunoprecipitation

Extracted proteins were separated on 4–12% SDS-PAGE gels (Life Technologies), transferred to PVDF membrane (Amersham, UK), and blocked using 5% BLOTTO milk. Primary antibodies for immunoblots were: HER1/EGFR (sc-373746; Santa Cruz, CA, USA), DYRK3 (sc-390532; Santa Cruz), and beta-Actin (A2228; Millipore Sigma, Sigma-Aldrich, St. Louis, MO, USA). The membrane was then incubated with species-appropriate secondary antibodies conjugated to horseradish peroxidase (HRP), and immune complexes were then identified using enhanced chemiluminescence (ECL, GE Healthcare, Chicago, IL, USA) and an Amersham Imager 680 system. Western blot (WB) quantification was performed with GE-ImageQuant TL v.8 and ImageJ software v.1.54. Cerebral cortex lysates (1 mg/mL protein content), as described previously [52], were obtained from C57BL/6 mice, bred under NIH protocol numbers 432-LCI-2015 and 433-LCI-2015, according to approval of the Institutional Review Board. All animal studies performed were approved according to the guidelines of the NIA Animal Care and Use Committee. EGFR/HER1 or DYRK3 was immunoprecipitated with 2 μg of either anti-EGFR or anti-DYRK3 antibody incubated with the clarified whole cell lysate in addition to 20 μL of a 30% slurry of protein G plus/protein A-agarose (Pierce). Following overnight incubation with agitation at 4 °C, immune complexes were collected by centrifugation (15 min, 14,000 rpm) and washed twice in ice-cold NP-40-based lysis buffer. Proteins were then eluted from the immune complexes using an equal volume of 2× Laemmli sample buffer [51]. Immunoprecipitated proteins were then resolved via SDS-PAGE, as previously described, before selective immunoblotting.

2.7. Statistical Analyses

In each histogram or figure, the data are represented as the means ± SEM (standard error of the mean). Statistical analyses (Student’s t-test) were performed using GraphPad Prism version 9.5 (GraphPad Software, San Diego, CA, USA). Significance level is indicated in each figure as * p ≤ 0.05; ** p ≤ 0.01; *** p ≤ 0.001.

3. Results

3.1. Signature Generation for Common Diseases and Generic Aging Mechanisms

We employed an openly available large language model (LLM) AI application (ChatGPT v. 3.5: https://chat.openai.com/ accessed on 20 February 2025) to generate a nuanced and impartial assessment of the molecular protein associations with either a common disease or a generic aging mechanism. ChatGPT is an advanced conversational AI system based on an LLM that uses deep learning techniques, specifically the transformer architecture, to understand and generate human-like semantic responses to query questions. Molecular signature generation using this LLM application was limited to a simple 100-protein signature for each of the desired categories. Hence, the disease signatures were generated for: coronary heart disease (CHD); cancer; chronic obstructive pulmonary disease (COPD); stroke; Alzheimer’s disease (AD); Type II diabetes mellitus (T2DM); chronic kidney disease (CKD); Non-alcoholic fatty liver disease (NAFLD); Long-COVID (also known as post-acute sequelae of SARS-CoV-2 infection (PASC)); major depression (MD) (Table S1). At the same scale (i.e., 100 proteins), signatures were made using the LLM for multiple aspects of the molecular aging process: genomic instability; telomere attrition; disrupted epigenetic regulation; disrupted proteostasis; disrupted nutrient sensing; mitochondrial dysfunction; stem cell depletion; disrupted cell-cell communication; cell senescence; cellular frailty (Table S2). To control for LLM-based signature generation bias, random protein lists of the same size (100 proteins) were also created using the Random Gene Set Generator from molbiotools (https://molbiotools.com/randomgenesetgenerator.php accessed on 20 February 2025). The specific 100-protein signatures were then analyzed for functional associations between the proteins using STRING-based analysis.
The specific disease-based networks (Figure 1A) were assessed for their network edge enrichment (i.e., number of observed edges/number of expected edges), average node degree, and average clustering coefficient. A similar level of network investigation was applied for the mechanisms of aging (Figure 1B) and the ten random 100-protein datasets. It was evident that for both human diseases and aging mechanisms, there were significant differences in the network parameters (edge enrichment, average node degree, average clustering coefficient) compared to those generated for the random 100-protein networks (Figure 1C; Supplementary Figure S1).
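The three network parameters named above can be computed from an edge list as in the sketch below. STRING reports these values directly; this stdlib version only illustrates the definitions. The toy graph and the `expected_edges` value are assumptions (in STRING the expected edge count comes from its own random-network model).

```python
from itertools import combinations

def network_metrics(nodes, edges, expected_edges):
    """Return edge enrichment, average node degree, and average clustering."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    n_edges = sum(len(nb) for nb in adj.values()) // 2

    def local_clustering(node):
        nb = adj[node]
        if len(nb) < 2:
            return 0.0
        # Fraction of neighbour pairs that are themselves connected.
        links = sum(1 for u, v in combinations(nb, 2) if v in adj[u])
        return 2 * links / (len(nb) * (len(nb) - 1))

    return {
        "edge_enrichment": n_edges / expected_edges,
        "avg_node_degree": 2 * n_edges / len(adj),
        "avg_clustering": sum(local_clustering(n) for n in adj) / len(adj),
    }

# Toy four-node network with an assumed expectation of two edges.
metrics = network_metrics(
    ["A", "B", "C", "D"],
    [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")],
    expected_edges=2,
)
print(metrics)
```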

To assess the specificity of the employed LLM in identifying meaningful simple insights into complex processes, we compared it to the retrieval capacity of PubPular v3.1 (https://heart.shinyapps.io/PubPular/ accessed on 20 February 2025 [42]), which uses a semantics-based approach to retrieve protein identities linked to an input search topic or disease identifier. We compared the retrieval capacity of PubPular (Table S3: Disease; Table S4: Pathomechanisms) to the LLM and then also to a similarly sized randomly generated dataset using simple overlapping protein analysis (Figure 1). Here, we found that for human diseases across all ten studied terms, there was a 19.6 ± 3.03% overlap between PubPular retrieval and the LLM, compared to a 0.34 ± 0.067% overlap with the random data lists (comparing across all ten diseases, this distinction was p < 0.0001, two-tailed t-test). A similarly clear correspondence between PubPular and LLM results was observed for the aging mechanisms: 21.2 ± 2.95% overlap with LLM for PubPular retrieval versus 0.24 ± 0.026% overlap with LLM for the random data (p < 0.0001 for cross-mechanism means). Hence, there is a strong and meaningful connection between the LLM and PubPular signatures despite their different modes of construction.
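A simple overlap analysis of this kind might look like the following sketch. The overlap definition (shared proteins as a percentage of the first list) is an assumption, as the text does not state the denominator, and the short protein lists are illustrative placeholders rather than the study's data.

```python
from math import sqrt
from statistics import mean, stdev

def percent_overlap(list_a, list_b):
    """Percentage of distinct entries in list_a that also occur in list_b."""
    a, b = set(list_a), set(list_b)
    return 100 * len(a & b) / len(a)

def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))

# Illustrative placeholder lists, not taken from Tables S1-S4.
llm_list = ["TP53", "EGFR", "AKT1", "MTOR"]
pubpular_list = ["EGFR", "TP53", "INS", "IL6"]
print(percent_overlap(llm_list, pubpular_list))

# Summarizing per-term overlaps (hypothetical values) as mean and SEM.
print(mean_sem([10.0, 20.0, 30.0]))
```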

The functional phenotypes of the human disease and aging mechanisms signatures were assessed using DisGeNET human disease pathway enrichment (https://www.disgenet.org/ accessed on 20 February 2025) [53] and Gene Ontology-Biological Process annotation (https://www.geneontology.org/ accessed on 20 February 2025) using the Enrichr analytical platform. Enrichr is an open access web-based platform designed for gene set enrichment analysis, and it integrates a wide variety of biological datasets. Using this innovative data collection approach, i.e., an openly available LLM, the resulting disease-based analytical outputs (DisGeNET) were highly accurate in identifying the specific disease phenotype from each LLM signature (Figure 2). The accuracy of the top 5 DisGeNET annotations indicates the high quality of the data retrieval from the LLM. A similar data accuracy was found with respect to the Gene Ontology-Biological Process annotation of the aging mechanism signatures (Figure 3). In this regard, the application of a simple LLM prompt has demonstrated a novel method for impartial data retrieval and curation.

3.2. Signature Refinement and Analysis for Core Properties

Next, we assessed the degree of distinction of the human disease and aging mechanism protein signatures. Clustering the proteins in the two signature categories based on their commonality across the multiple diseases (Figure 4A) or aging mechanisms (Figure 4B) revealed that there was a greater overall level of commonality for proteins across the multiple disease states (686 total distinct proteins) compared to the different aging mechanisms (772 total distinct proteins). Given that the input in each of the categories (disease or mechanism) was ten lists of 100 proteins, it is interesting to note that, at the specific protein level, the characteristic aging mechanisms appear to be more diverse than the diseases (Figure 4C,D). This suggests that disease signatures represent some form of condensed biology of the aging process and that the etiological mechanisms inducing disease are more varied than the ultimate pathological processes causing disease. We further assessed the protein identity distribution using ten random 100-protein identity lists to compare to the disease or pathomechanism protein distributions (Figure 4E). Comparing the specific datasets with the random datasets, it was evident that there was a clear distinction between specific retrieval and random retrieval processes. Assessing the percentage of proteins that were unique to a specific disease process or aging mechanism, we found that, on average across all ten items, the mean percentage of disease-associated proteins that were unique to a specific disease was 51.5 ± 4.4%, while in contrast, 63.5 ± 7.8% of the aging mechanism proteins were unique to a specific aging mechanism. The difference between these was not statistically significant but is consistent with a greater convergence of disease proteins compared to aging mechanism-driving proteins (Figure 4E).
However, when compared to the percentage of list-specific proteins found in the random distribution diagram, both disease and pathomechanisms demonstrated a significant difference (Figure 4E). This would suggest that a greater level of diversity in the transition between health and disease exists in the damage-related mechanisms, and once a specific ‘tipping point’ of damage is reached, aging-related disease progression occurs in a relatively generic manner.
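The percentage of list-specific proteins discussed above can be computed as in this sketch. The three short signatures are illustrative placeholders, not the Table S1 or S2 lists, and the function name is an assumption.

```python
from collections import Counter

def percent_unique(signatures):
    """For each list, the percentage of its proteins found in no other list."""
    counts = Counter(p for proteins in signatures.values() for p in set(proteins))
    result = {}
    for name, proteins in signatures.items():
        distinct = set(proteins)
        unique = sum(1 for p in distinct if counts[p] == 1)
        result[name] = 100 * unique / len(distinct)
    return result

# Illustrative placeholder signatures (four proteins each rather than 100).
signatures = {
    "AD": ["APP", "MAPT", "TP53", "IL6"],
    "Cancer": ["TP53", "EGFR", "MYC", "IL6"],
    "T2DM": ["INS", "IL6", "AKT1", "EGFR"],
}
unique_pct = percent_unique(signatures)
total_distinct = len(set().union(*map(set, signatures.values())))
print(unique_pct, total_distinct)
```

The count of distinct proteins across all lists corresponds to the totals quoted in the text (686 for diseases, 772 for mechanisms): the fewer distinct proteins, the more the lists share.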

3.3. Multilevel Functional Analyses of Aging Mechanisms and Disease Processes

Next, we generated a multilevel approach to investigate the potential mechanistic convergence of aging-related mechanisms and disease processes. To this end, we created a cohort of datasets extracted from the LLM-derived signatures. As we have previously described, there are two initial levels of interrogation that are apparent with respect to analysis across the complete range of aging-related diseases or aging mechanisms, i.e., proteins that were unique to just one disease/mechanism or ones that were shared. Using this simple parsing mechanism, we created a Venn diagram using the whole disease or mechanism datasets overlapping with the datasets comprising exclusively those proteins shared across at least two diseases or mechanisms (denoted ‘>2 protein common’) (Figure 5). With this simple four-way (total disease; >2 protein common disease; total mechanisms; >2 protein common mechanisms) overlap analysis, we found that there were 21 proteins common to all four datasets. In addition to this core of 21 proteins, there were 57 additional proteins that were found in at least three of the four distinct datasets (37 proteins common to total diseases, >2 protein common disease, total mechanisms; 20 proteins common to total disease, >2 protein common mechanisms, total mechanisms). The final group of proteins that displayed some level of commonality between the aging-related diseases and mechanisms comprised 55 proteins (each found only once in either a specific disease or mechanism) common only to the total disease and total mechanisms datasets.
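The four-way overlap can be expressed as a partition of proteins by the exact combination of datasets that contain them. The sketch below uses hypothetical placeholder identifiers (P1 to P6) and assumed dataset names; it is not the InteractiVenn procedure itself.

```python
from collections import defaultdict

def venn_regions(named_sets):
    """Group items by the exact combination of sets that contain them."""
    regions = defaultdict(set)
    for item in set().union(*named_sets.values()):
        key = tuple(name for name, members in named_sets.items() if item in members)
        regions[key].add(item)
    return dict(regions)

# Hypothetical placeholder datasets; the 'shared' sets are subsets of the totals.
named_sets = {
    "disease_total": {"P1", "P2", "P3", "P4", "P5"},
    "disease_shared": {"P1", "P2"},
    "mech_total": {"P1", "P2", "P3", "P6"},
    "mech_shared": {"P1", "P3"},
}
regions = venn_regions(named_sets)
core = regions[("disease_total", "disease_shared", "mech_total", "mech_shared")]
print(sorted(core))
```

In this toy example P1 plays the role of the 21-protein core, while P2 and P3 correspond to the two three-of-four regions (37 and 20 proteins in the text).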

In addition to the simple parsing of whether disease or mechanism proteins are found in more than one respective disease or mechanism category, we analyzed the total protein cohort in each case and assessed what level of commonality across either diseases or mechanisms was greater than two standard deviations (SD) above the mean number of disease or mechanism commonalities. Across the ten diseases, the mean number of commonalities across all the disease-associated proteins was 1.45 and 2 SD was 2.065, giving a cut-off of 3.515. As this cut-off falls between 3 and 4 commonalities, we employed the 71 proteins that were found in at least three different diseases for further investigation. Applying this metric to the mechanisms datasets, we identified 55 proteins that possessed at least three mechanism commonalities (mean commonalities plus 2 SD was 1.29 + 1.51 = 2.80).
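The cut-off arithmetic can be checked directly, and the same metric can be computed from raw lists as sketched below. The use of the population standard deviation is an assumption (the text does not say which estimator was used), and the toy signatures are hypothetical placeholders.

```python
from collections import Counter
from statistics import mean, pstdev

# Reported values: mean commonalities plus the reported 2 SD term.
disease_cutoff = 1.45 + 2.065    # about 3.515, between 3 and 4
mechanism_cutoff = 1.29 + 1.51   # about 2.80, so at least three commonalities

def commonality_cutoff(signatures):
    """Count list memberships per protein and return (counts, mean + 2 SD)."""
    counts = Counter(p for proteins in signatures for p in set(proteins))
    values = list(counts.values())
    # Assumption: population SD; a sample SD would give a slightly higher cut-off.
    return counts, mean(values) + 2 * pstdev(values)

# Hypothetical toy lists: P0 occurs in all five, U1..U9 occur once each.
toy_signatures = [
    ["P0", "U1", "U2"],
    ["P0", "U3", "U4"],
    ["P0", "U5", "U6"],
    ["P0", "U7", "U8"],
    ["P0", "U9"],
]
counts, cutoff = commonality_cutoff(toy_signatures)
selected = sorted(p for p, c in counts.items() if c > cutoff)
print(round(disease_cutoff, 3), round(mechanism_cutoff, 2), selected)
```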

We next combined these two metrics to generate five distinct levels of dataset investigation (Table S5), constructed to progressively intensify the degree of convergence between aging-related disease and aging-driving mechanisms. Level 5 comprises the 21 proteins common to all four of the initial input datasets (totals and >2 commonalities across ten diseases/mechanisms). Level 4 includes this cohort of proteins plus the additional 57 proteins found in at least three out of four of these datasets (generating a total of 78 proteins). Level 3 (133 proteins) comprises these 78 proteins plus the additional 55 proteins found common between the disease and mechanisms datasets that were only found in one category. Level 2 (206 proteins) was generated by supplementing the Level 3 list with additional proteins that were found in the datasets of either disease or mechanisms demonstrating at least three category commonalities (>2 SD away from the total disease/mechanism population mean). Level 1 comprises the total of the LLM-generated data, i.e., 1316 proteins (Figure 5A).

Applying classical over-representation analysis (ORA) to the five different data levels, we found a strong representation of aging-associated biology across all five data levels (Figure 5). For several distinct forms of signaling pathway analysis (KEGG Pathway Analysis, Figure 5B; BioCarta, Figure 5C; WikiPathways, Figure 5D), a highly significant enrichment of the KEGG ‘Longevity Regulating Pathway’, the BioCarta ‘Longevity Pathway’, and the WikiPathways ‘Calorie Restriction (CR), and Aging’ terms was found (Figure 5B–D). In general, the degree of enrichment probability was relatively similar for these specific signaling pathways from Level 1 to Level 5 of the different scale datasets (Figure 5B–D, left panel). Interestingly, in contrast to this relative consistency of enrichment, it was clear that the percentage of proteins in these aging-associated signaling pathways decreased significantly from Level 5 to Level 1 (Figure 5B–D, right panel). Hence, a profound concentration of the aging-specific proteins was found in the Level 5 dataset compared to the other data levels. This suggests that with respect to common disease-generating processes, the 21 proteins comprising this level (Level 5) of data representation may have a profound impact on the progress and etiology of aging-associated disease. Given that this core of proteins may be pivotal in the aging-disease nexus, we next performed an informatic expansion process.
This involved the identification of the associated protein interactomes for each of these 21 ‘Level 5’ proteins using seven different protein-protein association databases: BioGRID (https://thebiogrid.org/) [54]; The Molecular INTeraction Database (MINT) (https://mint.bio.uniroma2.it/) [55]; STRING (https://string-db.org/); GeneShot (https://maayanlab.cloud/geneshot/) [56]; PubPular [42] (https://heart.shinyapps.io/PubPular/); IntACT Molecular Interaction Database (IntACT) (https://www.ebi.ac.uk/intact/home) [57]; Humanbase (https://hb.flatironinstitute.org/) [58]. Employing these diverse curated databases, protein-protein interaction (PPI) datasets for each of the 21 core aging-disease proteins were created. Across the seven curated databases, the average total number of interacting proteins for the 21 core proteins was 1890 ± 121.25. To refine this extraction process, interacting proteins that were identified in at least two of the seven PPI databases were also assessed, with an average of 399 ± 37.64 proteins found in at least two databases. These proteins, found in at least two of the seven PPI databases for each core protein, were then aggregated to form the 21-protein expansion (Table S6). The distribution of these proteins across the 21 core proteins is shown in Figure 6A. This network of proteins, all verified as protein associates of the original core 21 aging-disease proteins, likely represents a cohort of proteins that constitutes an important potential target network for interdicting aging-driven disease. To further investigate this, we then interrogated this expansion network with the 2023 edition of the curated Proteomics Drug Atlas of therapeutic molecule signatures [59].
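The refinement step, retaining only interactors reported by at least two of the seven databases and then aggregating across core proteins, can be sketched as follows. The database hits are hypothetical placeholders (P1 to P5), only three of the seven databases are shown for brevity, and the function names are assumptions.

```python
from collections import Counter

def consensus_interactors(db_hits, min_databases=2):
    """Interactors reported by at least min_databases of the queried databases."""
    counts = Counter(p for partners in db_hits.values() for p in set(partners))
    return {p for p, c in counts.items() if c >= min_databases}

def build_expansion(per_core_hits, min_databases=2):
    """Union of the consensus interactors over all core proteins."""
    expansion = set()
    for db_hits in per_core_hits.values():
        expansion |= consensus_interactors(db_hits, min_databases)
    return expansion

# Hypothetical placeholder results for two core proteins.
per_core_hits = {
    "CORE_A": {"BioGRID": {"P1", "P2"}, "MINT": {"P2", "P3"}, "STRING": {"P1", "P2", "P4"}},
    "CORE_B": {"BioGRID": {"P5"}, "MINT": {"P5", "P1"}, "STRING": {"P3"}},
}
expansion = build_expansion(per_core_hits)
print(sorted(expansion))
```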

3.4. Therapeutic Signature Analysis of the Aging-Disease Nexus Core Expansion

We next performed Proteomics Drug Atlas (PDA) ORA enrichment analysis across multiple levels of the expansion dataset, defined by the number of core-21 interactomes in which an expansion protein was found (from >2 to >18 commonalities). We found that there were 35 PDA therapeutic signatures that were found in all the expansion dataset series (Figure 6B). Hence, these PDA molecular signatures likely share some important molecular features key to the regulation of the aging-disease nexus. We aggregated all the proteins identified in the core 21 expansion cohort that were enriched in the 35 common PDA therapeutic signatures (Figure 7 and Figure 8) and then analyzed the frequency of incorporation of specific proteins across these 35 therapeutic signatures. We found that the protein with the highest inclusion frequency, i.e., in 27 of the 35 signatures, was the Dual specificity tyrosine-phosphorylation-regulated kinase 3 (DYRK3).

DYRK3 is a member of the DYRK (dual-specificity tyrosine-regulated kinase) family of protein kinases—this comprises two Class I kinases (DYRK1A, DYRK1B) and three Class II kinases (DYRK2, 3, and 4) [60]. DYRK3 is a dual function kinase and is thus capable of phosphorylating both serine/threonine and tyrosine residues in substrate proteins. DYRK3 has been shown to be involved in multiple cellular processes that can affect aging/longevity including cell cycle regulation, DNA damage response, stress response, and protein stabilization [61,62,63]. Given these features, it is not surprising that an altered expression or dysregulation of DYRK3 has also been associated with certain cancers and neurodegenerative diseases [64,65,66].

We next investigated how the functional signature of DYRK3 may intersect with the aging-disease nexus core 21 expansion. First, we created an unbiased interactomic interpretation of human DYRK3 (UniProt ID: O43781) using a seven-database (BioGRID, MINT, STRING, GeneShot, PubPular, IntACT, Humanbase) aggregation process. Using these diverse platforms, we accumulated a 321-protein PPI signature for DYRK3 (Figure 8) that we then used to identify the total number of DYRK3-associated factors in the 1497-protein expansion dataset. We thus found 73 proteins common to the DYRK3 and expansion datasets. This intersection was then repeated with 10 random datasets, each comprising 321 randomly chosen proteins, and the level of random intersection with the expansion dataset was only 20.5 ± 1.66 (mean ± SEM). This random protein intersection process was assessed across all the various levels of protein commonalities for the expansion dataset. In each case, the actual number of DYRK3-associated factors was consistently above the level of random protein intersection (Figure 8).
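A random-intersection baseline of this type can be sketched as below. The gene universe of 20,000 placeholder symbols, the seed, and the function name are assumptions; only the set sizes (1497 and 321) and the ten random repeats follow the text, so the numbers produced are illustrative rather than a reproduction of the reported values.

```python
import random
from math import sqrt
from statistics import mean, stdev

def random_intersection_baseline(target_set, universe, query_size, n_random=10, seed=0):
    """Mean and SEM of the overlap between target_set and random protein lists."""
    rng = random.Random(seed)
    pool = sorted(universe)  # fixed order keeps the draws reproducible
    sizes = [
        len(target_set & set(rng.sample(pool, query_size)))
        for _ in range(n_random)
    ]
    return mean(sizes), stdev(sizes) / sqrt(n_random)

# Hypothetical universe of placeholder symbols; the first 1497 stand in for the expansion set.
universe = [f"GENE{i}" for i in range(20000)]
expansion_set = set(universe[:1497])
baseline_mean, baseline_sem = random_intersection_baseline(expansion_set, universe, 321)
print(baseline_mean, baseline_sem)
```

Under these assumptions the expected random overlap is roughly 321 × 1497 / 20,000, i.e., about 24 proteins, which is the scale against which an observed intersection of 73 proteins would be compared.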

3.5. Mechanistic Investigation of the DYRK3-Aging/Disease Nexus

Given our identification of the potential therapeutic importance of DYRK3 in the aging/disease nexus, we decided to characterize the functionality of the intersection protein dataset between DYRK3 and the aging/disease nexus expansion. KEGG pathway investigation of the 73-protein DYRK3-aging/disease intersection showed that, recapitulating our initial posit, this dataset is indeed focused on regulating longevity (Figure 9). With further pathway-based interrogation of this target dataset, we found a consistent presence of EGFR/EGF-associated connections (Figure 9). To assess whether this association could be verified via other metadata sources, we consulted the DarkKinome database [45]. Using the INDRA (Integrated Network and Dynamical Reasoning Assembler) platform, we identified the functional intersection between EGFR and DYRK3 [44]. INDRA is a computational platform that allows the construction, analysis, and simulation of biological networks based on the integration of information from various biological data sources. Further reinforcing the importance of DYRK3 in the aging process, the INDRA analysis revealed functional associations with multiple factors linked to aging-related disease therapy/pathophysiology (NEDD4L [67]; SIRT1 [68]; mTORC1 [69]; CREB [70]) (Figure 9).

To develop a more in-depth inspection of the potential DYRK3-EGFR association, we created a PPI dataset (in a comparable manner to DYRK3) for EGFR and then assessed the degree (versus random datasets) of intersection between DYRK3 and EGFR (Figure 10).

Our assembled EGFR PPI dataset (created using multiple PPI databases) comprised a total of 5059 proteins, of which 1179 were common to at least two PPI databases. This 1179-protein dataset and those protein groups with greater levels of PPI database commonality (i.e., >2, 3, 4, 5, 6, 7) were then assessed for their intersection (compared to random datasets) with the 321-protein DYRK3 PPI dataset. At each of the EGFR-DYRK3 intersection levels, we found a highly significant association between the EGFR and DYRK3 datasets compared to random protein datasets of the same size as the DYRK3 dataset. When comparing the 1179-protein (>2 commonality) EGFR PPI dataset with the DYRK3 PPI dataset, we found 42 proteins that represent the functional link between these two factors. When we investigated (employing STRING network analysis) the synergistic functional relationship between EGFR and DYRK3, it was evident that a highly nuanced interconnection between EGFR functionality and aging/longevity mechanisms was present in the EGFR-DYRK3 nexus. Hence, multiple proteins within the EGFR-DYRK3 nexus were simultaneously involved in aging/disease mechanisms and EGFR activity. Expanding our analytical pipeline to this 42-protein intersection (Figure 11), we found that this intersection participated in the etiology of multiple disease processes (using the Jensen Disease database: https://diseases.jensenlab.org/) that encompass many of the initial age-related disease factors targeted, i.e., hyperglycemia (T2DM), cancer, fatty liver disease (NAFLD), lung disease (COPD), and brain disease (AD, MDD, Stroke).
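
The database-commonality filtering and level-by-level intersection described above can be sketched as follows. This is a minimal illustration, assuming that the commonality thresholds mean "reported by at least k databases" (the text equates the >2 level with proteins common to at least two databases); the function names, database labels, and protein names are hypothetical placeholders.

```python
from collections import Counter


def proteins_by_support(db_hits, min_databases):
    """Proteins reported by at least `min_databases` of the databases."""
    support = Counter()
    for partners in db_hits.values():
        support.update(set(partners))  # one vote per database
    return {p for p, n in support.items() if n >= min_databases}


def overlap_by_level(db_hits, other_set, levels):
    """Size of the intersection with `other_set` at each support level."""
    other_set = set(other_set)
    return {k: len(proteins_by_support(db_hits, k) & other_set) for k in levels}


# Hypothetical toy input (placeholder names, not study data).
toy_hits = {
    "db_1": {"A", "B", "C"},
    "db_2": {"B", "C", "D"},
    "db_3": {"C", "E"},
}
toy_partner_set = {"C", "D", "X"}
result = overlap_by_level(toy_hits, toy_partner_set, levels=[1, 2, 3])
print(result)
```

Raising the support threshold trades coverage for confidence: fewer proteins survive, but each is backed by more independent sources, so an intersection that persists across levels is less likely to depend on a single database.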

With respect to potential transcriptional effects, we observed the most intense enrichment probability for SNAI2, which controls (among many other functions) cell stemness in the context of aging/cancer/metabolic dysfunction [71,72,73]. Given the importance of the EGFR in both aging [74,75] and oncology [76], we assessed the potential EGFR-DYRK3 enrichment of oncogenic signatures using the MSigDB (Molecular Signatures Database: https://www.gsea-msigdb.org/gsea/msigdb/). Using the Oncogenic Signatures section of MSigDB, we found an interesting series of closely associated enrichments, i.e., ones involving EGFR, BMI1 (Polycomb complex protein BMI-1), and MEL18 (also known as Polycomb group RING finger protein 2, PCGF2). It is interesting to note that both BMI1 and MEL18 also appear to play a reciprocal role in controlling cell stemness and cell fate [77,78,79]. Using an additional subset of the MSigDB, i.e., the chemical and genetic perturbations (CGP) collection, we found another highly enriched Polycomb group factor, i.e., SUZ12, a member of the Polycomb repressive complex 2 (PRC2). It is interesting to note that healthy aging is associated with a close control of PRC2 complex activity [80,81,82]. We additionally found a specific enrichment of SUZ12, again using the EGFR-DYRK3 intersection data, through the ESCAPE database (https://www.maayanlab.net/ESCAPE/), which curates evidence of stem cell pluripotency [83,84]. This suggests that the functional link between EGFR and DYRK3 may exert an important function in the aging-disease nexus via the control of cellular fate and stemness. This connection was further reinforced using TF Perturbations Followed by Expression enrichment analysis (via Enrichr interrogation of the NCBI Gene Expression Omnibus: https://www.ncbi.nlm.nih.gov/geo/). Using this platform, we found that the strongest enrichment was for PRRX1, which has been shown to regulate cell stemness in the setting of aging and oncology [85,86,87,88].
These data suggest that perhaps one of the most important anti-aging functions of the EGFR-DYRK3 relationship could be controlling age-associated damage and disease progression via the control of cellular fate in times of cell stress and damage. Therefore, the network controlled by the EGFR-DYRK3 axis may represent an important target for the interdiction of multiple age-related diseases.
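
The over-representation (ORA) enrichments reported in this section can be illustrated with a one-sided hypergeometric test, which is a statistic commonly used for ORA. Whether each platform used here applies exactly this statistic is an assumption, and the function name `ora_p_value` and all numbers in the example are hypothetical rather than values from this study.

```python
from math import comb


def ora_p_value(n_background, n_term, n_query, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: size of the protein universe
    n_term:       proteins annotated to the term/signature
    n_query:      size of the query list
    n_overlap:    observed number of query proteins annotated to the term
    """
    upper = min(n_term, n_query)
    favourable = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_background, n_query)


# Hypothetical numbers for illustration only (not taken from the study).
p = ora_p_value(n_background=20000, n_term=100, n_query=42, n_overlap=5)
print(f"p = {p:.3g}")
```

In practice, a p-value of this kind is computed for every term in a library and then corrected for multiple testing; that correction step is omitted from this sketch.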

4. Discussion

Here, we have investigated the potential importance of a novel form of unbiased data curation for complex molecular biological situations, i.e., unravelling novel therapeutic mechanisms for interventions in aging-related disorders. We initially generated novel molecular signatures for both aging-related diseases and molecular drivers of the aging process at the cellular level. To this end, we employed ChatGPT 3.5 as a mechanism by which to identify protein cohorts linking aging diseases to aging mechanisms, using a similar mechanism of molecular signature generation in each case, i.e., queries applied at the same time using a standardized semantic input phrase that can generate an output of identical size.

Large language model (LLM) artificial intelligence (AI) systems such as ChatGPT-3 have been making significant contributions to biomedical science in recent years. These open use applications have the potential to improve our understanding of complex biological systems and accelerate the development of new treatments for a wide range of diseases [89,90,91]. LLMs, in particular, and AI, in general, are especially useful with respect to the capacity to comprehensively appreciate the gestalt nature of large datasets such as genomics, proteomics, and medical imaging [14,15,92,93,94,95,96]. AI algorithms possess capacities to assimilate and analyze large ‘omic’-level datasets and thus identify cryptic patterns and relationships that would be difficult or impossible for humans to detect. This has led to new insights into the underlying causes of diseases and, hopefully, will lead to the development of more effective treatments. Here, we have employed the capacity of an LLM to perform a novel aggregation of molecular signatures of highly complex processes, i.e., aging-related diseases and molecular mechanisms of aging pathophysiology. The facile capacity to create these customized signatures adds an extra level of high-dimensionality data analysis that may indeed possess some benefits over traditional concept clustering [97,98,99,100]. For example, LLM base training sets can be rapidly updated and thus may include more novel insights compared to databases that may be modified at a lower frequency. In addition, these models can also employ a broader range of data than classical informatic platforms. While LLM technologies bring novel aspects of database interrogation, it is vital that benchmarking and integration with other informatic systems still take place to ensure that sole reliance on LLMs is not pursued.
It is important to note that LLMs can generate signature protein lists through condensing multiple text sources that will also be based on empirical research data and domain expert human curation. While providing a novel mechanism of text-based research, it is important to consider the potential limitations of LLM-based investigation. Training data bias is a common issue for LLMs and can depend on the relative surfeit or dearth of previous experimentation/text concerning a specific protein in signaling networks. Hence, data can be biased by trends in research concerning protein targets or disease burdens. The time to curate LLM databases can also be an issue with respect to inclusion of the most cutting-edge data into the training datasets. With respect to protein interaction/association databases, there is also the issue of whether the coincidence of protein expression in a specific dataset guarantees functional interaction (either direct or indirect). Associations based on the co-occurrence of proteins and diseases in the training data may not always align with experimentally validated findings. Hence, correlation does not always imply causation, so inferred protein-disease links may reflect shared experimental contexts rather than direct relevance. Given these limitations of LLM systems, it is important to introduce mitigation strategies to address these issues. Hence, we cross-checked with other databases, demonstrated functional relevance of extracted data using empirical data investigation (e.g., INDRA, PDA), verified the quality of data with classical enrichment analyses, and rigorously compared the veracity of the generated profiles using unbiased random data interrogation simultaneously. Here, we have demonstrated how a simple integration of LLM and classical informatic investigation can be used to create a novel mechanism for investigating complex biology and pathophysiology.
Future development and refinement of LLMs alongside standard bioinformatic resources will likely improve the efficacy of this process to define disease mechanisms and therapeutic effects.

In this current research, the initial LLM-derived molecular signatures (Figure 1) were benchmarked using standard ORA-based pathway/Gene Ontology enrichment analysis (Figure 2). Thus, even with a distinct mode of generation, these signatures were effective condensations of the specific concept used to assemble the lists. From these signatures, we discovered a potential distinction in proteomic convergence between two different concepts in aging, i.e., the relative difference in molecular convergence between aging-related disease and molecular aging mechanisms. Thus, we found that there was a greater diversity of proteins involved in aging mechanisms, whereas the proteins associated with age-related disease showed a stronger level of convergence (Figure 3). It has been shown that, across complex syndromic conditions, convergences of protein complexity are often observed [101,102,103,104]. Using our combinatorial informatics approach, we were able to identify a core series of 21 protein factors (Figure 4) that were reliably found in a data scale-independent manner to link both aging-related disease and aging mechanisms. We found that this 21-protein cohort possessed a profound concentration of factors associated with longevity (Figure 5), suggesting that this data cohort represents a microcosm of the aging-disease nexus. These factors likely function as consistent drivers/regulators of the stressful responses to insults that cause aging (e.g., oxygen radical damage) as well as regulators of the eventual disease-inducing pathophysiological process. As this cohort may represent an extremely important therapeutic target, we performed an informatic ‘expansion’ process to capture as much as possible of the functional profundity of this aging-nexus core (Figure 6). This ‘expansion’ created a rationally generated functional interactomic signature that could be employed for therapeutic screening analysis.
We investigated therapeutic intervention enrichment in this core-21 expansion using the Proteomics Drug Atlas 2023 [59]. Across multiple commonality levels within the core-21 PPI expansion, we found a significant enrichment of 35 therapeutic signatures that were consistently found across all scale levels in the expansion (Figure 7). Upon inspection of the proteins involved with these 35 significantly enriched therapeutic interventions, we found that the most consistently represented signaling factor was the dual specificity kinase DYRK3 (Figure 8). With respect to its potential role in the aging-disease nexus, DYRK3 has been shown to control cell survival dynamics through its ability to phosphorylate the aging regulator SIRT1 [62,105], as well as to control CREB activity [106]. DYRK3 has also been shown to be involved in controlling subcellular structure through LLPS (liquid-liquid phase separation) and stress granule (SG) and non-membrane bound organelle (NMO) generation in response to pro-aging cellular stress [107,108]. DYRK3 also possesses the capacity to link factors involved in nutrient sensing (mTORC1) to cellular protective effects linked to SG formation [63,109]. To further investigate the potential role of DYRK3 in the aging-disease nexus, we extracted the specific DYRK3-associated features from our original total aging-disease (1497 factor) protein cohort (Supplementary Figure S2). We found that, across every scale of inspection, the DYRK3 component found in the original aging-disease cohort was significantly greater than that expected randomly. The total number of aging-disease proteins common with our DYRK3 PPI signature was 73 proteins. This aging-disease-DYRK3 cohort was shown to be strongly associated with longevity regulation (Figure 9) and unexpectedly linked to regulatory modulation of the epidermal growth factor receptor (EGFR) (Figure 9 and Figure 10).
This was an interesting finding as we have previously demonstrated that the EGFR is not only a subtle regulator of complex cell survival and proliferation activity [110,111,112] but also a key feature of classical neurometabolic aging [74,113,114]. In addition to this functional cooperation in the aging-disease nexus, it has also been demonstrated that DYRK3 may control EGFR functionality through a direct or indirect interaction process [67]. To reinforce and further investigate this connectivity, we characterized the intersection signatures of both human DYRK3 and EGFR (Figure 11). We found that the degree of functional intersection between DYRK3 and EGFR was considerably greater than that expected randomly, suggesting that this association is functionally relevant. Indeed, when assessing the functional sequelae of the 42 common proteins, we found that the majority of the DYRK3-EGFR common PPI proteins generated a phenotype dominated by both EGFR functional regulation (‘EGFR tyrosine kinase inhibitor resistance’ [KEGG], ‘Downregulation of ERBB2 signaling’ [Reactome], ‘EGFR tyrosine kinase inhibitor resistance’ [WikiPathways]) and longevity/disease regulatory pathways (‘Longevity regulating pathway’ [KEGG], ‘Cellular Senescence’ [Reactome], ‘NAD metabolism, sirtuins and aging’ [WikiPathways]). These data suggest that the functional association of DYRK3 with EGFR signaling pathways has the potential to regulate the transition from pro-aging mechanisms to disease etiology.
It is interesting to note that there is considerable expression overlap (24 tissues: adipose; ovary; brain; colon; endometrium; heart; pancreas; prostate; kidney; testes; skin; stomach; thyroid; pancreatic beta cell; lung; urinary bladder; gall bladder; mammary gland; smooth muscle; coronary artery; trabecular bone tissue; adrenal gland; hippocampus; temporal lobe; hypothalamus) between curated database profiles for both DYRK3 and EGFR (DYRK3 expression profiles: https://www.bgee.org/gene/ENSG00000143479, accessed on 20 February 2025; https://www.ebi.ac.uk/gxa/genes/ensg00000143479, accessed on 20 February 2025; EGFR expression profiles: https://www.bgee.org/gene/ENSG00000146648, accessed on 20 February 2025; https://www.ebi.ac.uk/gxa/genes/ensg00000146648, accessed on 20 February 2025). This considerable coordination of protein expression reinforces the potential for this interaction to occur in multiple tissues in the body. Hence, this association may possess activities across multiple physiological sites.

We also demonstrated, using human cell culture (HEK293) and ex vivo tissue analysis, that EGFR and DYRK3 can physically associate with each other in a manner that is sensitive both to stressors that mimic aging (i.e., oxidative stress induced by hydrogen peroxide exposure: Supplementary Figure S3A,B) and to a natural advancing aging process (from ages of 3 to 12 months in murine cerebral cortex samples) (Supplementary Figure S3C). Hence, this physical demonstration reinforces the informatic definitions we have created thus far. In addition to these aging-disease investigations, we further sought more nuanced aspects as to how this DYRK3-EGFR axis may exert its pathophysiological actions (Figure 12). Here, we found that this DYRK3-EGFR axis is associated with a broad range of diseases and may control cell fate (as well as stem cell fate) through control of the PRC2 complex via control of BMI1, MEL18, and SUZ12 (Figure 13). This complex has previously been shown to significantly impact the longevity of lower organisms such as C. elegans [115], and to contribute to widespread disease prevention through the modulation of pro-aging cellular insults [116,117,118]. Given these data, it is reasonable to propose that chemical modulators of the DYRK3-EGFR axis may serve as promising agents to experimentally intervene in aging-associated disease and damage. While our data present evidence of the potential impact of a DYRK3-EGFR interaction, they do not exclude the importance of many other protein-protein interactions in the aging-disease nexus. Our discovery process simply highlights a novel mechanism for investigating complex, multifactorial biomedical processes, such as the pathological aging process. Further research will potentially reveal linkages between this protein nexus and other factors linked to aging-disease pathomechanisms.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cimb47030189/s1.

Author Contributions

Experimental and theoretical conceptualization, S.M., M.J. and H.N.; methodology, S.M., M.J., H.N., R.D.S., S.F. and R.K.A.H.E.; software, S.M.; validation, S.M.; formal analysis, S.M.; investigation, S.M., M.J., H.N., R.D.S., S.F. and R.K.A.H.E.; resources, S.M.; data curation, S.M.; writing—original draft preparation, S.M.; writing—review and editing, S.M., M.J., H.N., R.D.S., S.F., R.K.A.H.E., P.H.M. and D.D.; visualization, S.M.; supervision, S.M.; project administration, S.M.; funding acquisition, S.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the FWO-OP/Odysseus Program (42/FA010100/32/6484), a FWO Ph.D. Fundamental Research grant (1198020N), and the University of Antwerp Seal of Excellence Award.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data pertaining to primary experimental result acquisition are made available in the included Supplementary Figures S1 and S2 and Supplementary Tables S1–S5.

Conflicts of Interest

Author Patricia H. McDonald was employed by the company Lexicon Pharmaceuticals Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

López-Otín, C.; Blasco, M.A.; Partridge, L.; Serrano, M.; Kroemer, G. The hallmarks of aging. Cell 2013, 153, 1194–1217.
López-Otín, C.; Blasco, M.A.; Partridge, L.; Serrano, M.; Kroemer, G. Hallmarks of aging: An expanding universe. Cell 2023, 186, 243–278.
van Gastel, J.; Boddaert, J.; Jushaj, A.; Premont, R.T.; Luttrell, L.M.; Janssens, J.; Martin, B.; Maudsley, S. GIT2-A keystone in ageing and age-related disease. Ageing Res. Rev. 2018, 43, 46–63.
Leysen, H.; van Gastel, J.; Hendrickx, J.O.; Santos-Otte, P.; Martin, B.; Maudsley, S. G Protein-Coupled Receptor Systems as Crucial Regulators of DNA Damage Response Processes. Int. J. Mol. Sci. 2018, 19, 2919.
McEwen, B.S. Central effects of stress hormones in health and disease: Understanding the protective and damaging effects of stress and stress mediators. Eur. J. Pharmacol. 2008, 583, 174–185.
Paronetto, M.P.; Passacantilli, I.; Sette, C. Alternative splicing, and cell survival: From tissue homeostasis to disease. Cell Death Differ. 2016, 23, 1919–1929.
Santos-Otte, P.; Leysen, H.; van Gastel, J.; Hendrickx, J.O.; Martin, B.; Maudsley, S. G Protein-Coupled Receptor Systems and Their Role in Cellular Senescence. Comput. Struct. Biotechnol. J. 2019, 17, 1265–1277.
Coppens, V.; De Wachter, O.; Goossens, J.; Hendrix, J.; Maudsley, S.; Azmi, A.; van Gastel, J.; Van Saet, A.; Lauwers, T.; Morrens, M. Profiling of the Peripheral Blood Mononuclear Cell Proteome in Schizophrenia and Mood Disorders for the Discovery of Discriminatory Biomarkers: A Proof-of-Concept Study. Neuropsychobiology 2020, 79, 324–334.
van Gastel, J.; Hendrickx, J.O.; Leysen, H.; Martin, B.; Veenker, L.; Beuning, S.; Coppens, V.; Morrens, M.; Maudsley, S. Enhanced Molecular Appreciation of Psychiatric Disorders Through High-Dimensionality Data Acquisition and Analytics. Methods Mol. Biol. 2019, 2011, 671–723.
Opdebeeck, B.; Maudsley, S.; Azmi, A.; De Maré, A.; De Leger, W.; Meijers, B.; Verhulst, A.; Evenepoel, P.; D’Haese, P.C.; Neven, E. Indoxyl Sulfate and p-Cresyl Sulfate Promote Vascular Calcification and Associate with Glucose Intolerance. J. Am. Soc. Nephrol. 2019, 30, 751–766.
Curtis, E.; Litwic, A.; Cooper, C.; Dennison, E. Determinants of Muscle and Bone Aging. J. Cell Physiol. 2015, 230, 2618–2625.
Laursen, K.R.; Hulman, A.; Witte, D.R.; Terkildsen Maindal, H. Social relations, depressive symptoms, and incident type 2 diabetes mellitus: The English Longitudinal Study of Ageing. Diabetes Res. Clin. Pract. 2017, 126, 86–94.
Neergaard, J.S.; Dragsbæk, K.; Kehlet, S.N.; Hansen, H.B.; Hansen, G.; Byrjalsen, I.; Alexandersen, P.; Lindgren, L.M.; Bihlet, A.R.; Riis, B.J.; et al. Cohort Profile: The Prospective Epidemiological Risk Factor (PERF) study. Int. J. Epidemiol. 2017, 46, 1104.
Maudsley, S.; Devanarayan, V.; Martin, B.; Geerts, H.; Brain Health Modeling Initiative (BHMI). Intelligent and effective informatic deconvolution of “Big Data” and its future impact on the quantitative nature of neurodegenerative disease therapy. Alzheimer’s Dement. 2018, 14, 961–975.
Hendrickx, J.O.; van Gastel, J.; Leysen, H.; Martin, B.; Maudsley, S. High-dimensionality Data Analysis of Pharmacological Systems Associated with Complex Diseases. Pharmacol. Rev. 2020, 72, 191–217.
Pedditzi, E.; Peters, R.; Beckett, N. The risk of overweight/obesity in mid-life and late life for the development of dementia: A systematic review and meta-analysis of longitudinal studies. Age Ageing 2016, 45, 14–21.
Rogulj, D.; El Aklouk, I.; Konjevoda, P.; Ljubić, S.; Pibernik Okanović, M.; Barbir, A.; Luburić, M.; Radman, M.; Budinski, N.; Vučić Lovrenčić, M. Age-dependent systemic DNA damage in early Type 2 Diabetes mellitus. Acta Biochim. Pol. 2017, 64, 233–238.
Mendoza-Núñez, V.M.; Mendoza-Soto, A.B. Is Aging a Disease? A Critical Review Within the Framework of Ageism. Cureus 2024, 16, e54834.
Gladyshev, T.V.; Gladyshev, V.N. A Disease or Not a Disease? Aging As a Pathology. Trends Mol. Med. 2016, 22, 995–996.
Le Couteur, D.G.; Thillainadesan, J. What Is an Aging-Related Disease? An Epidemiological Perspective. J. Gerontol. A Biol. Sci. Med. Sci. 2022, 77, 2168–2174.
Franceschi, C.; Garagnani, P.; Morsiani, C.; Conte, M.; Santoro, A.; Grignolio, A.; Monti, D.; Capri, M.; Salvioli, S. The Continuum of Aging and Age-Related Diseases: Common Mechanisms but Different Rates. Front. Med. 2018, 5, 61.
Li, Z.; Zhang, Z.; Ren, Y.; Wang, Y.; Fang, J.; Yue, H.; Ma, S.; Guan, F. Aging and age-related diseases: From mechanisms to therapeutic strategies. Biogerontology 2021, 22, 165–187.
Novoselov, V.M. Is aging a disease? Adv. Gerontol. 2017, 30, 836–840.
Saborido, C.; García-Barranquero, P. Is Aging a Disease? The Theoretical Definition of Aging in the Light of the Philosophy of Medicine. J. Med. Philos. 2022, 47, 770–783.
Rattan, S.I. Aging is not a disease: Implications for intervention. Aging Dis. 2014, 5, 196–202.
Bakula, D.; Aliper, A.M.; Mamoshina, P.; Petr, M.A.; Teklu, A.; Baur, J.A.; Campisi, J.; Ewald, C.Y.; Georgievskaya, A.; Gladyshev, V.N.; et al. Aging, and drug discovery. Aging 2018, 10, 3079–3088.
Miquel, S.; Champ, C.; Day, J.; Aarts, E.; Bahr, B.A.; Bakker, M.; Bánáti, D.; Calabrese, V.; Cederholm, T.; Cryan, J.; et al. Poor cognitive ageing: Vulnerabilities, mechanisms and the impact of nutritional interventions. Ageing Res. Rev. 2018, 42, 40–55.
Bland, J.S. Age as a Modifiable Risk Factor for Chronic Disease. Integr. Med. 2018, 17, 16–19.
MacNee, W.; Rabinovich, R.A.; Choudhury, G. Ageing and the border between health and disease. Eur. Respir. J. 2014, 44, 1332–1352.
Kourtis, N.; Tavernarakis, N. Cellular Stress response pathways and ageing: Intricate molecular relationships. EMBO J. 2011, 30, 2520–2531.
Tenchov, R.; Sasso, J.M.; Wang, X.; Zhou, Q.A. Aging Hallmarks and Progression and Age-Related Diseases: A Landscape View of Research Advancement. ACS Chem. Neurosci. 2024, 15, 1–30.
Gaspar-Silva, F.; Trigo, D.; Magalhaes, J. Ageing in the brain: Mechanisms and rejuvenating strategies. Cell Mol. Life Sci. 2023, 80, 190.
Ho, E.; Qualls, C.; Villareal, D.T. Effect of Diet, Exercise, or Both on Biological Age and Healthy Aging in Older Adults with Obesity: Secondary Analysis of a Randomized Controlled Trial. J. Nutr. Health Aging 2022, 26, 552–557.
Tong, J.; Hei, T.K. Aging and age-related health effects of ionizing radiation. Radiat. Med. Prot. 2020, 1, 15–23.
Liguori, I.; Russo, G.; Curcio, F.; Bulli, G.; Aran, L.; Della-Morte, D.; Gargiulo, G.; Testa, G.; Cacciatore, F.; Bonaduce, D.; et al. Oxidative stress, aging, and diseases. Clin. Interv. Aging 2018, 13, 757–772.
Schumacher, B.; Pothof, J.; Vijg, J.; Hoeijmakers, J.H.J. The central role of DNA damage in the ageing process. Nature 2021, 592, 695–703.
Li, Y.; Tian, X.; Luo, J.; Bao, T.; Wang, S.; Wu, X. Molecular mechanisms of aging and anti-aging strategies. Cell Commun. Signal 2024, 22, 285.
Yousefzadeh, M.; Henpita, C.; Vyas, R.; Soto-Palma, C.; Robbins, P.; Niedernhofer, L. DNA damage-how and why we age? Elife 2021, 10, e62852.
van Gastel, J.; Leysen, H.; Boddaert, J.; Vangenechten, L.; Luttrell, L.M.; Martin, B.; Maudsley, S. Aging-related modifications to G protein-coupled receptor signaling diversity. Pharmacol. Ther. 2021, 223, 107793.
Guo, J.; Huang, X.; Dou, L.; Yan, M.; Shen, T.; Tang, W.; Li, J. Aging, and aging-related diseases: From molecular mechanisms to interventions and treatments. Signal Transduct. Target. Ther. 2022, 7, 391.
Ukraintseva, S.; Arbeev, K.; Duan, M.; Akushevich, I.; Kulminski, A.; Stallard, E.; Yashin, A. Decline in biological resilience as key manifestation of aging: Potential mechanisms and role in health and longevity. Mech. Ageing Dev. 2021, 194, 111418.
Lau, E.; Venkatraman, V.; Thomas, C.T.; Wu, J.C.; Van Eyk, J.E.; Lam, M.P.Y. Identifying High-Priority Proteins Across the Human Diseasome Using Semantic Similarity. J. Proteome Res. 2018, 17, 4267–4278.
Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.; et al. The STRING database in 2023: Protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023, 51, D638–D646.
Pillich, R.T.; Chen, J.; Churas, C.; Fong, D.; Gyori, B.M.; Ideker, T.; Karis, K.; Liu, S.N.; Ono, K.; Pico, A.; et al. NDEx IQuery: A multi-method network gene set analysis leveraging the Network Data Exchange. Bioinformatics 2023, 39, btad118.
Berginski, M.E.; Moret, N.; Liu, C.; Goldfarb, D.; Sorger, P.K.; Gomez, S.M. The Dark Kinase Knowledgebase: An online compendium of knowledge and experimental results of understudied kinases. Nucleic Acids Res. 2021, 49, D529–D535.
Chen, E.Y.; Tan, C.M.; Kou, Y.; Duan, Q.; Wang, Z.; Meirelles, G.V.; Clark, N.R.; Ma’ayan, A. Enrichr: Interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinform. 2013, 14, 128.
Kuleshov, M.V.; Jones, M.R.; Rouillard, A.D.; Fernandez, N.F.; Duan, Q.; Wang, Z.; Koplev, S.; Jenkins, S.L.; Jagodnik, K.M.; Lachmann, A.; et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016, 44, W90–W97.
Xie, Z.; Bailey, A.; Kuleshov, M.V.; Clarke, D.J.B.; Evangelista, J.E.; Jenkins, S.L.; Lachmann, A.; Wojciechowicz, M.L.; Kropiwnicki, E.; Jagodnik, K.M.; et al. Gene Set Knowledge Discovery with Enrichr. Curr. Protoc. 2021, 1, e90.
Gerstner, N.; Kehl, T.; Lenhof, K.; Müller, A.; Mayer, C.; Eckhart, L.; Grammes, N.L.; Diener, C.; Hart, M.; Hahn, O.; et al. GeneTrail 3: Advanced high-throughput enrichment analysis. Nucleic Acids Res. 2020, 48, W515–W520.
Heberle, H.; Meirelles, G.V.; da Silva, F.R.; Telles, G.P.; Minghim, R. InteractiVenn: A web-based tool for the analysis of sets through Venn diagrams. BMC Bioinform. 2015, 16, 169. [Google Scholar] [CrossRef]
Maudsley, S.; Pierce, K.L.; Zamah, A.M.; Miller, W.E.; Ahn, S.; Daaka, Y.; Lefkowitz, R.J.; Luttrell, L.M. The beta(2)-adrenergic receptor mediates extracellular signal-regulated kinase activation via assembly of a multi-receptor complex with the epidermal growth factor receptor. J. Biol. Chem. 2000, 275, 9572–9580. [Google Scholar] [CrossRef] [PubMed]
van Gastel, J.; Leysen, H.; Santos-Otte, P.; Hendrickx, J.O.; Azmi, A.; Martin, B.; Maudsley, S. The RXFP3 receptor is functionally associated with cellular responses to oxidative stress and DNA damage. Aging 2019, 11, 11268–11313. [Google Scholar] [CrossRef]
Piñero, J.; Ramírez-Anguita, J.M.; Saüch-Pitarch, J.; Ronzano, F.; Centeno, E.; Sanz, F.; Furlong, L.I. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020, 48, D845–D855. [Google Scholar] [CrossRef]
Oughtred, R.; Rust, J.; Chang, C.; Breitkreutz, B.J.; Stark, C.; Willems, A.; Boucher, L.; Leung, G.; Kolas, N.; Zhang, F.; et al. The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci. 2021, 30, 187–200. [Google Scholar] [CrossRef]
Licata, L.; Briganti, L.; Peluso, D.; Perfetto, L.; Iannuccelli, M.; Galeota, E.; Sacco, F.; Palma, A.; Nardozza, A.P.; Santonico, E.; et al. MINT, the molecular interaction database: 2012 update. Nucleic Acids Res. 2012, 40, D857–D861. [Google Scholar] [CrossRef] [PubMed]
Lachmann, A.; Schilder, B.M.; Wojciechowicz, M.L.; Torre, D.; Kuleshov, M.V.; Keenan, A.B.; Ma’ayan, A. Geneshot: Search engine for ranking genes from arbitrary text queries. Nucleic Acids Res. 2019, 47, W571–W577. [Google Scholar] [CrossRef]
Panneerselvam, K.; Porras, P.; Del-Toro, N.; Perfetto, L.; Shrivastava, A.; Ragueneau, E.; Reyes, J.J.M.; Orchard, S.; Hermjakob, H.; IMEx Consortium Curators. IntAct Database for Accessing IMEx’s Contextual Metadata of Molecular Interactions. Curr. Protoc. 2024, 4, e70018. [Google Scholar] [CrossRef] [PubMed]
Greene, C.S.; Krishnan, A.; Wong, A.K.; Ricciotti, E.; Zelaya, R.A.; Himmelstein, D.S.; Zhang, R.; Hartmann, B.M.; Zaslavsky, E.; Sealfon, S.C.; et al. Understanding multicellular function and disease with human tissue-specific networks. Nat. Genet. 2015, 47, 569–576. [Google Scholar] [CrossRef]
Mitchell, D.C.; Kuljanin, M.; Li, J.; Van Vranken, J.G.; Bulloch, N.; Schweppe, D.K.; Huttlin, E.L.; Gygi, S.P. A proteome-wide atlas of drug mechanism of action. Nat. Biotechnol. 2023, 41, 845–857. [Google Scholar] [CrossRef]
Yoshida, S.; Yoshida, K. New insights into the roles for DYRK family in mammalian development and congenital diseases. Genes Dis. 2022, 10, 758–770. [Google Scholar] [CrossRef]
Boni, J.; Rubio-Perez, C.; López-Bigas, N.; Fillat, C.; de la Luna, S. The DYRK Family of Kinases in Cancer: Molecular Functions and Therapeutic Opportunities. Cancers 2020, 12, 2106. [Google Scholar] [CrossRef] [PubMed]
Guo, X.; Williams, J.G.; Schug, T.T.; Li, X. DYRK1A and DYRK3 promote cell survival through phosphorylation and activation of SIRT1. J. Biol. Chem. 2010, 285, 13223–13232. [Google Scholar] [CrossRef]
Wippich, F.; Bodenmiller, B.; Trajkovska, M.G.; Wanka, S.; Aebersold, R.; Pelkmans, L. Dual specificity kinase DYRK3 couples stress granule condensation/dissolution to mTORC1 signaling. Cell 2013, 152, 791–805. [Google Scholar] [CrossRef] [PubMed]
Santos-Durán, G.N.; Barreiro-Iglesias, A. Roles of dual specificity tyrosine-phosphorylation-regulated kinase 2 in nervous system development and disease. Front. Neurosci. 2022, 16, 994256. [Google Scholar] [CrossRef] [PubMed]
Belal, A.; Abdel Gawad, N.M.; Mehany, A.B.M.; Abourehab, M.A.S.; Elkady, H.; Al-Karmalawy, A.A.; Ismael, A.S. Design, synthesis, and molecular docking of new fused 1H-pyrroles, pyrrolo[3,2-d]pyrimidines and pyrrolo[3,2-e][1,4]diazepine derivatives as potent EGFR/CDK2 inhibitors. J. Enzyme Inhib. Med. Chem. 2022, 37, 1884–1902. [Google Scholar] [CrossRef] [PubMed]
Ivanova, E.; Sharma, S.D.; Brichkina, A.; Pfefferle, P.; Keber, U.; Pagenstecher, A.; Lauth, M. DYRK3 contributes to differentiation and hypoxic control in neuroblastoma. Biochem. Biophys. Res. Commun. 2021, 567, 215–221. [Google Scholar] [CrossRef]
Guo, Y.; Cui, Y.; Li, Y.; Jin, X.; Wang, D.; Lei, M.; Chen, F.; Liu, Y.; Xu, J.; Yao, G.; et al. Cytoplasmic YAP1-mediated ESCRT-III assembly promotes autophagic cell death and is ubiquitinated by NEDD4L in breast cancer. Cancer Commun. 2023, 43, 582–612. [Google Scholar] [CrossRef]
Jiang, M.; Wang, J.; Fu, J.; Du, L.; Jeong, H.; West, T.; Xiang, L.; Peng, Q.; Hou, Z.; Cai, H.; et al. Neuroprotective role of Sirt1 in mammalian models of Huntington’s disease through activation of multiple Sirt1 targets. Nat. Med. 2011, 18, 153–158. [Google Scholar] [CrossRef]
Mota-Martorell, N.; Jové, M.; Pamplona, R. mTOR Complex 1 Content and Regulation Is Adapted to Animal Longevity. Int. J. Mol. Sci. 2022, 23, 8747. [Google Scholar] [CrossRef]
Ganner, A.; Gerber, J.; Ziegler, A.K.; Li, Y.; Kandzia, J.; Matulenski, T.; Kreis, S.; Breves, G.; Klein, M.; Walz, G.; et al. CBP-1/p300 acetyltransferase regulates SKN-1/Nrf cellular levels, nuclear localization, and activity in C. elegans. Exp. Gerontol. 2019, 126, 110690. [Google Scholar] [CrossRef]
Ciummo, S.L.; D’Antonio, L.; Sorrentino, C.; Fieni, C.; Lanuti, P.; Stassi, G.; Todaro, M.; Di Carlo, E. The C-X-C Motif Chemokine Ligand 1 Sustains Breast Cancer Stem Cell Self-Renewal and Promotes Tumor Progression and Immune Escape Programs. Front. Cell Dev. Biol. 2021, 9, 689286. [Google Scholar] [CrossRef] [PubMed]
Doshida, Y.; Sano, H.; Iwabuchi, S.; Aigaki, T.; Yoshida, M.; Hashimoto, S.; Ishigami, A. Age-associated changes in the transcriptomes of non-cultured adipose-derived stem cells from young and old mice assessed via single-cell transcriptome analysis. PLoS ONE 2020, 15, e0242171. [Google Scholar] [CrossRef] [PubMed]
Gross, K.M.; Zhou, W.; Breindel, J.L.; Ouyang, J.; Jin, D.X.; Sokol, E.S.; Gupta, P.B.; Huber, K.; Zou, L.; Kuperwasser, C. Loss of Slug Compromises DNA Damage Repair and Accelerates Stem Cell Aging in Mammary Epithelium. Cell Rep. 2019, 28, 394–407.e6. [Google Scholar] [CrossRef] [PubMed]
Siddiqui, S.; Fang, M.; Ni, B.; Lu, D.; Martin, B.; Maudsley, S. Central role of the EGF receptor in neurometabolic aging. Int. J. Endocrinol. 2012, 2012, 739428. [Google Scholar] [CrossRef]
Vermeulen, Z.; Hervent, A.S.; Dugaucquier, L.; Vandekerckhove, L.; Rombouts, M.; Beyens, M.; Schrijvers, D.M.; De Meyer, G.R.Y.; Maudsley, S.; De Keulenaer, G.W.; et al. Inhibitory actions of the NRG-1/ErbB4 pathway in macrophages during tissue fibrosis in the heart, skin, and lung. Am. J. Physiol. Heart Circ. Physiol. 2017, 313, H934–H945. [Google Scholar] [CrossRef]
Akhoon, B.A.; Rathor, L.; Pandey, R. Withanolide A extends the lifespan in human EGFR-driven cancerous Caenorhabditis elegans. Exp. Gerontol. 2018, 104, 113–117. [Google Scholar] [CrossRef]
Won, H.Y.; Lee, J.Y.; Shin, D.H.; Park, J.H.; Nam, J.S.; Kim, H.C.; Kong, G. Loss of Mel-18 enhances breast cancer stem cell activity and tumorigenicity through activating Notch signaling mediated by the Wnt/TCF pathway. FASEB J. 2012, 26, 5002–5013. [Google Scholar] [CrossRef]
Yoon, M.H.; Kang, S.M.; Lee, S.J.; Woo, T.G.; Oh, A.Y.; Park, S.; Ha, N.C.; Park, B.J. p53 induces senescence through Lamin A/C stabilization-mediated nuclear deformation. Cell Death Dis. 2019, 10, 107. [Google Scholar] [CrossRef]
Miki, J.; Fujimura, Y.; Koseki, H.; Kamijo, T. Polycomb complexes regulate cellular senescence by repression of ARF in cooperation with E2F3. Genes Cells 2007, 12, 1371–1382. [Google Scholar] [CrossRef]
Kennerdell, J.R.; Liu, N.; Bonini, N.M. MiR-34 inhibits polycomb repressive complex 2 to modulate chaperone expression and promote healthy brain aging. Nat. Commun. 2018, 9, 4188. [Google Scholar] [CrossRef]
Ito, T.; Teo, Y.V.; Evans, S.A.; Neretti, N.; Sedivy, J.M. Regulation of Cellular Senescence by Polycomb Chromatin Modifiers through Distinct DNA Damage- and Histone Methylation-Dependent Pathways. Cell Rep. 2018, 22, 3480–3492. [Google Scholar] [CrossRef] [PubMed]
Sharma, S.; Mukherjee, A.K.; Roy, S.S.; Bagri, S.; Lier, S.; Verma, M.; Sengupta, A.; Kumar, M.; Nesse, G.; Pandey, D.P.; et al. Human telomerase is directly regulated by non-telomeric TRF2-G-quadruplex interaction. Cell Rep. 2021, 35, 109154. [Google Scholar] [CrossRef] [PubMed]
Xu, H.; Ang, Y.S.; Sevilla, A.; Lemischka, I.R.; Ma’ayan, A. Construction and validation of a regulatory network for pluripotency and self-renewal of mouse embryonic stem cells. PLoS Comput. Biol. 2014, 10, e1003777. [Google Scholar] [CrossRef]
Xu, H.; Baroukh, C.; Dannenfelser, R.; Chen, E.Y.; Tan, C.M.; Kou, Y.; Kim, Y.E.; Lemischka, I.R.; Ma’ayan, A. ESCAPE: Database for integrating high-content published data collected from human and mouse embryonic stem cells. Database 2013, 2013, bat045. [Google Scholar] [CrossRef]
Shi, L.; Tang, X.; Qian, M.; Liu, Z.; Meng, F.; Fu, L.; Wang, Z.; Zhu, W.G.; Huang, J.D.; Zhou, Z.; et al. A SIRT1-centered circuitry regulates breast cancer stemness and metastasis. Oncogene 2018, 37, 6299–6315. [Google Scholar] [CrossRef] [PubMed]
Pan, Y.; Zhang, H.; Zheng, Y.; Zhou, J.; Yuan, J.; Yu, Y.; Wang, J. Resveratrol Exerts Antioxidant Effects by Activating SIRT2 to Deacetylate Prx1. Biochemistry 2017, 56, 6325–6328. [Google Scholar] [CrossRef]
Mo, C.; Guo, J.; Qin, J.; Zhang, X.; Sun, Y.; Wei, H.; Cao, D.; Zhang, Y.; Zhao, C.; Xiong, Y.; et al. Single-cell transcriptomics of LepR-positive skeletal cells reveals heterogeneous stress-dependent stem and progenitor pools. EMBO J. 2022, 41, e108415. [Google Scholar] [CrossRef]
Sun, L.; Han, T.; Zhang, X.; Liu, X.; Li, P.; Shao, M.; Dong, S.; Li, W. PRRX1 isoform PRRX1A regulates the stemness phenotype and epithelial-mesenchymal transition (EMT) of cancer stem-like cells (CSCs) derived from non-small cell lung cancer (NSCLC). Transl. Lung Cancer Res. 2020, 9, 731–744. [Google Scholar] [CrossRef]
Liu, T.; Li, K.; Wang, Y.; Li, H.; Zhao, H. Evaluating the Utilities of Foundation Models in Single-cell Data Analysis. bioRxiv 2024, preprint. [Google Scholar] [CrossRef]
Agathokleous, E.; Saitanis, C.J.; Fang, C.; Yu, Z. Use of ChatGPT: What does it mean for biology and environmental science? Sci. Total Environ. 2023, 888, 164154. [Google Scholar] [CrossRef]
Li, T.; Shetty, S.; Kamath, A.; Jaiswal, A.; Jiang, X.; Ding, Y.; Kim, Y. CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models. arXiv 2023, arXiv:2304.10946v1. [Google Scholar] [CrossRef]
Fawaz, A.; Ferraresi, A.; Isidoro, C. Systems Biology in Cancer Diagnosis Integrating Omics Technologies and Artificial Intelligence to Support Physician Decision Making. J. Pers. Med. 2023, 13, 1590. [Google Scholar] [CrossRef]
Wu, X.; Li, W.; Tu, H. Big data and artificial intelligence in cancer research. Trends Cancer 2023, 10, 147–160. [Google Scholar] [CrossRef] [PubMed]
Vandenberk, B.; Chew, D.S.; Prasana, D.; Gupta, S.; Exner, D.V. Successes and challenges of artificial intelligence in cardiology. Front. Digit. Health 2023, 5, 1201392. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
Chen, Y.; Hua, Z.; Lin, F.; Zheng, T.; Zhou, H.; Zhang, S.; Gao, J.; Wang, Z.; Shao, H.; Li, W.; et al. Detection and classification of breast lesions using multiple information on contrast-enhanced mammography by a multiprocess deep-learning system: A multicenter study. Chin. J. Cancer Res. 2023, 35, 408–423. [Google Scholar] [CrossRef] [PubMed]
Van Meenen, J.; Leysen, H.; Chen, H.; Baccarne, R.; Walter, D.; Martin, B.; Maudsley, S. Making Biomedical Sciences publications more accessible for machines. Med. Health Care Philos. 2022, 25, 179–190. [Google Scholar] [CrossRef]
Gerussi, A.; Verda, D.; Cappadona, C.; Cristoferi, L.; Bernasconi, D.P.; Bottaro, S.; Carbone, M.; Muselli, M.; Invernizzi, P.; Asselta, R.; et al. LLM-PBC: Logic Learning Machine-Based Explainable Rules Accurately Stratify the Genetic Risk of Primary Biliary Cholangitis. J. Pers. Med. 2022, 12, 1587. [Google Scholar] [CrossRef] [PubMed]
Benary, M.; Wang, X.D.; Schmidt, M.; Soll, D.; Hilfenhaus, G.; Nassir, M.; Sigler, C.; Knödler, M.; Keller, U.; Beule, D.; et al. Leveraging Large Language Models for Decision Support in Personalized Oncology. JAMA Netw. Open 2023, 6, e2343689. [Google Scholar] [CrossRef]
Chen, Q.; Sun, H.; Liu, H.; Jiang, Y.; Ran, T.; Jin, X.; Xiao, X.; Lin, Z.; Chen, H.; Niu, Z. An extensive benchmark study on biomedical text generation and mining with ChatGPT. Bioinformatics 2023, 39, btad557. [Google Scholar] [CrossRef]
Joachimiak, M.P.; Caufield, J.H.; Harris, N.L.; Kim, H.; Mungall, C.J. Gene Set Summarization using Large Language Models. arXiv 2023, arXiv:2305.13338v2. [Google Scholar]
Xxxx, S.; Ahmad, M.H.; Rani, L.; Mondal, A.C. Convergent Molecular Pathways in Type 2 Diabetes Mellitus and Parkinson’s Disease: Insights into Mechanisms and Pathological Consequences. Mol. Neurobiol. 2022, 59, 4466–4487. [Google Scholar] [CrossRef] [PubMed]
Argueti-Ostrovsky, S.; Alfahel, L.; Kahn, J.; Israelson, A. All Roads Lead to Rome: Different Molecular Players Converge to Common Toxic Pathways in Neurodegeneration. Cells 2021, 10, 2438. [Google Scholar] [CrossRef] [PubMed]
Tran, A.A.; De Smet, M.; Grant, G.D.; Khoo, T.K.; Pountney, D.L. Investigating the Convergent Mechanisms between Major Depressive Disorder and Parkinson’s Disease. Complex Psychiatry 2021, 6, 47–61. [Google Scholar] [CrossRef] [PubMed]
Lu, G.; Wang, Y.; Shi, Y.; Zhang, Z.; Huang, C.; He, W.; Wang, C.; Shen, H.M. Autophagy in health and disease: From molecular mechanisms to therapeutic target. MedComm 2022, 3, e150. [Google Scholar] [CrossRef]
You, Y.; Liang, W. SIRT1 and SIRT6: The role in aging-related diseases. Biochim. Biophys. Acta Mol. Basis Dis. 2023, 1869, 166815. [Google Scholar] [CrossRef]
Li, K.; Zhao, S.; Karur, V.; Wojchowski, D.M. DYRK3 activation, engagement of protein kinase A/cAMP response element-binding protein, and modulation of progenitor cell survival. J. Biol. Chem. 2002, 277, 47052–47060. [Google Scholar] [CrossRef]
Gallo, R.; Rai, A.K.; McIntyre, A.B.R.; Meyer, K.; Pelkmans, L. DYRK3 enables secretory trafficking by maintaining the liquid-like state of ER exit sites. Dev. Cell 2023, 58, 1880–1897.e11. [Google Scholar] [CrossRef]
Zhang, C.; Rabouille, C. Membrane-Bound Meet Membraneless in Health and Disease. Cells 2019, 8, 1000. [Google Scholar] [CrossRef]
Mediani, L.; Antoniani, F.; Galli, V.; Vinet, J.; Carrà, A.D.; Bigi, I.; Tripathy, V.; Tiago, T.; Cimino, M.; Leo, G.; et al. Hsp90-mediated regulation of DYRK3 couples stress granule disassembly and growth via mTORC1 signaling. EMBO Rep. 2021, 22, e51740. [Google Scholar] [CrossRef]
Roudabush, F.L.; Pierce, K.L.; Maudsley, S.; Khan, K.D.; Luttrell, L.M. Transactivation of the EGF receptor mediates IGF-1-stimulated shc phosphorylation and ERK1/2 activation in COS-7 cells. J. Biol. Chem. 2000, 275, 22583–22589. [Google Scholar] [CrossRef]
Chadwick, W.; Keselman, A.; Park, S.S.; Zhou, Y.; Wang, L.; Brenneman, R.; Martin, B.; Maudsley, S. Repetitive peroxide exposure reveals pleiotropic mitogen-activated protein kinase signaling mechanisms. J. Signal Transduct. 2011, 2011, 636951. [Google Scholar] [CrossRef] [PubMed]
Donlon, T.A.; Morris, B.J.; He, Q.; Chen, R.; Masaki, K.H.; Allsopp, R.C.; Willcox, D.C.; Tranah, G.J.; Parimi, N.; Evans, D.S.; et al. Association of Polymorphisms in Connective Tissue Growth Factor and Epidermal Growth Factor Receptor Genes With Human Longevity. J. Gerontol. A Biol. Sci. Med. Sci. 2017, 72, 1038–1044. [Google Scholar] [CrossRef] [PubMed]
Detienne, G.; De Haes, W.; Ernst, U.R.; Schoofs, L.; Temmerman, L. Royalactin extends lifespan of Caenorhabditis elegans through epidermal growth factor signaling. Exp. Gerontol. 2014, 60, 129–135. [Google Scholar] [CrossRef] [PubMed]
Siebold, A.P.; Banerjee, R.; Tie, F.; Kiss, D.L.; Moskowitz, J.; Harte, P.J. Polycomb Repressive Complex 2 and Trithorax modulate Drosophila longevity and stress resistance. Proc. Natl. Acad. Sci. USA 2010, 107, 169–174. [Google Scholar] [CrossRef]
Chu, L.; Qu, Y.; An, Y.; Hou, L.; Li, J.; Li, W.; Fan, G.; Song, B.L.; Li, E.; Zhang, L.; et al. Induction of senescence-associated secretory phenotype underlies the therapeutic efficacy of PRC2 inhibition in cancer. Cell Death Dis. 2022, 13, 155. [Google Scholar] [CrossRef]
Adibfar, S.; Elveny, M.; Kashikova, H.S.; Mikhailova, M.V.; Farhangnia, P.; Vakili-Samiani, S.; Tarokhian, H.; Jadidi-Niaragh, F. The molecular mechanisms and therapeutic potential of EZH2 in breast cancer. Life Sci. 2021, 286, 120047. [Google Scholar] [CrossRef]
Cao, Y.; Li, L.; Fan, Z. The role and mechanisms of polycomb repressive complex 2 on the regulation of osteogenic and neurogenic differentiation of stem cells. Cell Prolif. 2021, 54, e13032. [Google Scholar] [CrossRef]
Wang, W.; Qin, X.; Wang, R.; Xu, J.; Wu, H.; Khalid, A.; Jiang, H.; Liu, D.; Pan, F. EZH2 is involved in vulnerability to neuroinflammation and depression-like behaviors induced by chronic stress in different aged mice. J. Affect. Disord. 2020, 272, 452–464. [Google Scholar] [CrossRef]

Figure 1. (A) Protein-protein interaction networks for commonly occurring disease processes. The 100 proteins identified in association with each of the defined diseases (coronary heart disease (CHD); cancer; chronic obstructive pulmonary disease (COPD); stroke; Alzheimer’s disease (AD); Type II diabetes mellitus (T2DM); chronic kidney disease (CKD); non-alcoholic fatty liver disease (NAFLD); Long-Covid (also known as post-acute sequelae of SARS-CoV-2 infection (PASC)); major depression (MD)) were clustered into protein-protein interaction (PPI) networks using STRING. For all of the generated networks, the PPI enrichment probability was p < 1.0 × 10⁻¹⁶. The values for the STRING-derived edge enrichment ratio (observed number of PPI network edges/expected number of PPI network edges) are depicted in the figure below the specific disease networks. (B) Protein-protein interaction networks for well-characterized aging pathomechanisms. The 100 proteins identified in association with each of the defined aging pathomechanisms (genomic instability; telomere attrition; disrupted epigenetic regulation; disrupted proteostasis; disrupted nutrient sensing; mitochondrial dysfunction; stem cell depletion; disrupted cell-cell communication; cell senescence; cellular frailty) were clustered into PPI networks using STRING. For all of the generated networks, the PPI enrichment probability was p < 1.0 × 10⁻¹⁶. The values for the STRING-derived edge enrichment ratio are depicted in the figure below the specific pathomechanism networks. (C) Protein-protein interaction networks for random protein lists. Ten lists of one hundred randomly selected proteins were clustered into PPI networks using STRING. These datasets were denoted Random 1–10. The PPI enrichment probability (p) for each of the random networks was as follows: Random 1: 0.484; Random 2: 0.58; Random 3: 0.442; Random 4: 0.895; Random 5: 0.623; Random 6: 0.892; Random 7: 0.915; Random 8: 0.16; Random 9: 0.739; Random 10: 0.32. The values for the STRING-derived edge enrichment ratio are depicted in the figure below the specific random networks. (D) Venn diagram set comparison of the disease-specific datasets generated using PubPular and the LLM. The number of overlapping proteins found in the specific disease definitions (100 proteins each) from PubPular and the LLM is shown. In addition, the level of protein overlap between the LLM and randomly generated protein lists is indicated. The mean percentage overlap of protein identities between the PubPular disease definition and the LLM definition was 19.6 ± 3.03%, while the random protein-LLM overlap mean was 0.34 ± 0.067%. (E) As for the disease terms, the level of protein overlap between the PubPular definitions of disease pathomechanisms and the LLM definitions is shown, compared to the overlap with randomly generated 100-protein datasets. The mean percentage overlap of protein identities between the PubPular pathomechanism definition and the LLM definition was 21.2 ± 2.95%, while the random protein-LLM overlap mean was 0.24 ± 0.026%.
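The overlap statistics reported for panels (D,E) can be illustrated with a minimal Python sketch. It assumes that each percentage is the share of one list's identifiers found in the paired list and that the spread is a standard error of the mean; the function names and the toy gene lists are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev

def percent_overlap(list_a, list_b):
    """Percentage of the unique identifiers in list_a that also occur in list_b."""
    set_a, set_b = set(list_a), set(list_b)
    return 100.0 * len(set_a & set_b) / len(set_a)

def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample of values."""
    return mean(values), stdev(values) / sqrt(len(values))

# Toy four-protein definitions (hypothetical; the study used 100 proteins each).
pubpular = {"cancer": ["TP53", "EGFR", "KRAS", "MYC"],
            "stroke": ["APOE", "NOS3", "F2", "MTHFR"]}
llm = {"cancer": ["TP53", "EGFR", "BRCA1", "PTEN"],
       "stroke": ["APOE", "IL6", "TNF", "MMP9"]}

overlaps = [percent_overlap(pubpular[d], llm[d]) for d in pubpular]
avg, sem = mean_sem(overlaps)
print(f"mean overlap = {avg:.1f} +/- {sem:.1f}%")
```

The same two functions could be applied to the random lists to obtain the comparison value.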

Figure 2. DisGeNET disease pathway enrichment analysis of the LLM-generated disease definitions. Using the DisGeNET curated databases, an enrichment analysis was performed for each of the specific 100 protein disease lists. Each panel denotes the top 5 most significantly enriched unbiased disease definitions. The probability scores are denoted on each panel as negative log10 transforms of the enrichment probability.

Figure 3. Gene Ontology biological process enrichment analysis of the LLM-generated aging pathomechanisms. Using the curated Gene Ontology Biological Process (GO-BP) database, an enrichment analysis was performed for each of the specific 100-protein pathomechanism lists. Each panel denotes the top 5 most significantly enriched GO-BP terms. The probability scores are denoted on each panel as negative log10 transforms of the enrichment probability.
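The negative log10 scoring used in Figures 2 and 3 can be sketched as follows. This is a minimal illustration that assumes enrichment p-values are already available as a mapping from term to probability; the function name, the term labels and the p-values are hypothetical.

```python
import math

def top_enriched(term_pvalues, n=5):
    """Rank terms by -log10(p) and return the n highest-scoring (term, score) pairs."""
    scored = [(term, -math.log10(p)) for term, p in term_pvalues.items() if p > 0]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]

# Hypothetical enrichment p-values for one 100-protein list (not from the paper).
example = {"term_a": 1e-12, "term_b": 1e-3, "term_c": 1e-8,
           "term_d": 0.04, "term_e": 1e-5, "term_f": 0.2}

for term, score in top_enriched(example):
    print(f"{term}\t{score:.2f}")
```

A larger score corresponds to a smaller enrichment probability, so the first entry is the most significantly enriched term.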

Figure 4. Protein distribution patterns for disease or pathomechanisms defined by the LLM. (A) The heatmaps depict the degree of commonality of specific proteins across the diverse disease processes. The leftmost heatmap includes all retrieved proteins, while the rightmost heatmap depicts the cross-disease distribution of proteins found in at least two different disease definitions. (B) As with panel (A), the heatmap on the left shows the total distribution of proteins across different pathomechanisms, while the heatmap on the right depicts the cross-pathomechanism distribution of proteins found in at least two different pathomechanism definitions. (C) Numerical distribution pattern of proteins across the ten LLM disease definitions. The diagram indicates the relative number of proteins that are common to differing numbers of disease signatures. (D) Numerical distribution pattern of proteins across the ten LLM pathomechanism definitions. As with panel (C), the relative number of proteins that are common to differing numbers of pathomechanism signatures is shown. (E) The degree of commonality of proteins across ten individual random lists of 100 protein identities is shown in a comparable manner to panels (C,D). (F) The percentage of unique proteins found in the disease, pathomechanism, or random signatures is shown. Histogram-based data represent the means ± SEM (standard error of the mean). The significance level is indicated in each figure as * p ≤ 0.05; ** p ≤ 0.01.
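The commonality counts summarized in panels (C–F) can be approximated with the short Python sketch below. It assumes that each signature is a simple list of protein identifiers and that a "unique" protein is one occurring in exactly one signature; the function names and the three toy signatures are hypothetical.

```python
from collections import Counter

def membership_counts(signatures):
    """Map each protein to the number of signatures in which it appears."""
    counts = Counter()
    for proteins in signatures.values():
        counts.update(set(proteins))  # set() avoids double counting within a list
    return counts

def commonality_distribution(signatures):
    """Map k to the number of proteins found in exactly k signatures."""
    return Counter(membership_counts(signatures).values())

def percent_unique(signatures):
    """Percentage of all distinct proteins that occur in exactly one signature."""
    counts = membership_counts(signatures)
    return 100.0 * sum(1 for k in counts.values() if k == 1) / len(counts)

# Toy signatures (hypothetical; the study used ten lists of 100 proteins).
toy = {"CHD": ["IL6", "TNF", "APOE"],
       "T2DM": ["IL6", "INS", "TNF"],
       "AD": ["APOE", "APP", "IL6"]}

shared = sorted(p for p, k in membership_counts(toy).items() if k >= 2)
print(dict(commonality_distribution(toy)), percent_unique(toy), shared)
```

Filtering on a membership count of at least two reproduces the kind of reduced dataset shown in the right-hand heatmaps of panels (A,B).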

Figure 5. Venn diagram separation of the disease or pathomechanism LLM datasets. (A) Four-way Venn diagram separation of the proteins comprising the total disease (solid blue line), >2 commonality disease (dashed blue line), total pathomechanism (solid red line), and >2 commonality pathomechanism datasets. To create distinct levels of data analysis, the datasets outlined in the panel were constructed as follows. Level 5 = 21 proteins common to all four of the initial input datasets (totals and >2 commonalities across ten diseases/mechanisms). Level 4 = 78 proteins (Level 5 plus the additional 57 proteins found in at least three out of four of these datasets). Level 3 = 133 proteins (Levels 5 and 4 plus the additional 55 proteins found common between the disease and mechanism datasets that were only found in one category). Level 2 = 206 proteins (Levels 5, 4, and 3 plus the additional 73 proteins found in the datasets of either disease or mechanisms demonstrating at least three category commonalities (>2 SD away from the total disease/mechanism population mean)). Level 1 = 1316 proteins (the total of the LLM-generated data). (B) KEGG pathway ‘Longevity Regulating Pathway’ enrichment analysis of Level 1 to 5 data (left panel). At each analytical level, the percentage of the total input dataset that belonged to the ‘Longevity Regulating Pathway’ was calculated. For each histogram bar, the number of input proteins that populated this pathway is denoted next to the total input protein dataset size (i.e., 14 proteins found in the pathway from the total of 21 input proteins). Panels (C,D) depict similar data for the specific enrichment of the BioCarta Longevity Pathway (C) and the WikiPathways CR and Aging pathway (D). For all three panels, there is a clear increase in the percentage of the input dataset that falls within the specific aging-associated pathway from Level 1 ascending to Level 5.
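The cumulative level sizes and the pathway percentage described in this caption can be checked with a brief Python sketch. The increments (21, 57, 55, 73) and the 14-of-21 example are taken from the caption; the variable and function names are illustrative assumptions.

```python
from itertools import accumulate

# Level 5 core, followed by the proteins added at Levels 4, 3 and 2 (from the caption).
increments = [21, 57, 55, 73]
level_names = ["Level 5", "Level 4", "Level 3", "Level 2"]
level_sizes = dict(zip(level_names, accumulate(increments)))

def pathway_percentage(hits, dataset_size):
    """Percentage of an input dataset that falls within a given pathway."""
    return 100.0 * hits / dataset_size

print(level_sizes)
# Caption example: 14 pathway proteins out of the 21 input proteins.
print(f"{pathway_percentage(14, 21):.1f}%")
```

Each nested level contains all proteins of the levels above it, which is why the sizes accumulate rather than partition the data.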

Figure 6. Interactome-based expansion of the Level 1 disease-pathomechanism core dataset. (A) An expansion dataset of proteins found to be associated (in at least seven different curated protein-protein association databases) with the 21 individual Level 1 proteins was created. The protein distribution pattern, i.e., how often a specific protein is found to be common between the original 21 protein lists, is depicted. In total, 1497 Level 1 core interacting proteins were found. At the most consistently found level, there were 7 proteins found in 19 out of 21 input dataset lists. (B) By generating Proteomics Drug Atlas Drug Response enrichment analysis profiles for the different protein commonality datasets (excluding the >19 common protein lists, as no significant enrichment was found), we found 35 Proteomics Drug Atlas Drug Responses that were common to all the input dataset analysis streams.

Figure 7. Heatmap for the common disease-pathomechanism enriched proteomic drug responses. The heatmap indicates the enrichment probabilities for all 35 completely common drug responses. The associated key indicates the negative log10 transform of the Proteomics Drug Atlas Drug Response enrichment probability score.

Figure 8. Frequency analysis of proteins linked to conserved Proteomics Drug Atlas Drug Response signature. (A) STRING-derived protein networks for the specific disease-pathomechanism proteins that significantly populate the conserved 35 drug response networks (1–35). The frequency of proteins found within all these 35 networks is depicted in panel (B) as a proportional word cloud. The kinase DYRK3 was found in 27 of the 35 drug response networks.

Figure 9. Multifaceted pathway enrichment analysis of the DYRK3-specific disease-pathomechanism dataset. Using the 73 DYRK3-specific proteins extracted from the >2 disease-pathomechanism nexus dataset, the following enrichment paradigms were performed: (A) KEGG pathway analysis; (B) NetPath analysis; (C) SignaLink pathway analysis; (D) PharmGKB pathway analysis. For each analytical output, the top 10 enriched pathways are indicated via a combined metric correlating the negative log10 transform of the enrichment probability value and the specific number of enriching proteins from the input DYRK3-associated dataset.

Figure 10. DarkKinome analysis of the functional sequelae of DYRK3. The knowledge graph, depicted using NDEx v. 2.6, indicates the functional connections between the kinase DYRK3 and multiple forms of functional targets.

Figure 11. EGFR and DYRK3 share a profound functional interaction. (A) A functional EGFR interactome was created using eight independent protein interaction databases. The distribution of individual proteins across multiple databases is indicated in the color-coded proportional diagram. (B) Proportional Venn diagrams indicate the degrees of protein identity overlap between all levels of EGFR interactome protein commonalities with either the full DYRK3 interaction dataset (321 proteins) or the mean of 10 random (321 proteins) datasets. EGFR-DYRK3 intersections are denoted in blue, while the EGFR-Random intersections are denoted in orange. (C) Histogram representation of the degrees of protein overlap between the multiple levels of EGFR dataset interrogation with either the DYRK3 specific (blue bars) or the random (orange bars) datasets. Histogram-based data shown represent the means ± SEM (standard error of the mean). The significance level is indicated in each figure as * p ≤ 0.05; ** p ≤ 0.01; *** p ≤ 0.001.

Figure 12. The EGFR-DYRK3 functional nexus is associated with classical hallmarks of aging. A STRING-based network with signaling pathway overlays was created for the specific 42 proteins identified as forming a functional nexus between EGFR and DYRK3. The network was generated using the ‘evidence’-based setting, allowing data to be collected from: textmining; curated experimental databases; gene neighborhood/fusions; gene co-occurrences; co-expression; protein homology. The color coding of the input nodes identifies the specific pathway enrichments (p < 0.05) found. The pathways employed for this include: KEGG; Reactome; WikiPathways. Protein nodes that are found in multiple pathway enrichments are denoted as multicolored according to the pathways in which they reside.

Figure 13. Multidimensional interrogation of the EGFR-DYRK3 functional nexus. Further enrichment analyses were performed on the 42-protein EGFR-DYRK3 aging nexus signature. Each pie chart depicts the top 5 most significantly enriched molecular pathways or dataset collections. (A) Jensen disease enrichment analysis indicates a broad spectrum of diseases that may be controlled by this aging-regulatory EGFR-DYRK3 nexus, indicating the fundamental nature of this molecular unit. (B) The CGP (Chemical Genetic Perturbation) dataset collection at MSigDB indicated an association of the EGFR-DYRK3 nexus with the Polycomb Repressive Complex 2 subunit SUZ12. (C) ARCHS4 TF co-expression analysis demonstrated a potent association with SNAI2, which regulates cell fate in stressful circumstances. (D) ESCAPE database enrichment analysis reinforced the association of the EGFR-DYRK3 nexus with SUZ12. (E) MSigDB Oncogenic signature analysis revealed the importance of EGFR-DYRK3 with BMI1 and MEL18, which are also associated with SUZ12. (F) Analysis using the TF Perturbations Followed by Expression database revealed the importance of PRRX1 to the EGFR-DYRK3 nexus.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
