Uncovering Potential Roles of Differentially Expressed Genes, Upstream Regulators, and Canonical Pathways in Endometriosis Using an In Silico Genomics Approach

Endometriosis is characterized by ectopic implantation of endometrial tissue, mostly within the peritoneum, and affects women of reproductive age. Many studies have sought to clarify its etiology, but the precise molecular mechanisms and pathophysiology remain unclear. To identify causative genes, risk factors, and potential diagnostic/therapeutic biomarkers, we systematically searched NCBI's Gene Expression Omnibus and downloaded genome-wide mRNA expression and clinicopathological data from multiple independent studies comprising 156 endometriosis patients and 118 controls. Comprehensive gene expression meta-analysis, pathway analysis, and gene ontology analysis were performed using a bioinformatics-based approach. We identified 1590 unique differentially expressed genes (129 upregulated and 1461 downregulated) mapped by IPA as biologically relevant. The top upregulated genes were FOS, EGR1, ZFP36, JUNB, APOD, CST1, GPX3, and PER1, and the top downregulated ones were DIO2, CPM, OLFM4, PALLD, BAG5, TOP2A, PKP4, CDC20B, and SNTN. The most perturbed canonical pathways were mitotic roles of Polo-like kinase, role of Checkpoint kinase proteins in cell cycle checkpoint control, and ATM signaling. Protein–protein interaction analysis showed a strong network association among FOS, EGR1, ZFP36, and JUNB. These findings improve our understanding of the molecular mechanisms of endometriosis, identify candidate biomarkers, and represent a step towards the future development of novel diagnostic and therapeutic options.


Introduction
Endometriosis is a painful gynecological ailment marked by the presence of endometrial tissue outside the uterine cavity, commonly involving the uterus, ovaries, fallopian tubes, and pelvic tissues [1]. It is a complex and chronic estrogen-dependent disorder, wherein abnormal growth of uterine-lining (endometrium) tissue occurs outside the uterus, which can lead to serious complications like diabetes, obesity, mood disorders, dysmenorrhea, chronic pelvic pain, or even fatal endometrial cancer and cardiovascular disorders if left untreated for long. The most common site of endometriosis is the Douglas pouch (rectovaginal region) of the pelvic peritoneum [2]. Common symptoms include agonizing abdominal pain, period cramps (dysmenorrhea), heavy periods, pain with bowel movements or urination, dyspareunia, and infertility [3]. might also predispose to endometriosis. Detailed understanding on the basis of gene expression studies is lacking, and findings are often inconsistent or even contradictory.

Data Retrieval and Sample Description
Our approach integrated publicly available gene expression data generated on different microarray platforms. We first retrieved whole-transcript array datasets (.CEL files), along with the provided clinical details of endometriosis patients, dated up to 30 March 2020 from the Gene Expression Omnibus (GEO, NCBI), a public repository hosting high-throughput genomic data. The present study included the following expression data series, identified by GEO accession number, and their sample information to compare the transcriptomic status of affected and control patients: GSE7846, GSE7305, GSE6364, GSE4888, GSE51981, GSE31683, and GSE25628 (Table 1). The GSE51981 dataset has a total of 148 endometrial samples from patients aged 20 to 50 years. It includes samples from women in different menstrual cycle phases, comprising endometriosis with severe pelvic pain/infertility (n = 77) and normal without endometriosis (n = 71). Normal women with uterine fibroids, adenomyosis, or pelvic organ prolapse were further grouped as normal with uterine/pelvic pathology (n = 37), and the others as normal without uterine pathology (n = 34). The GSE7846 dataset includes five arrays of human endometrial endothelial cells (HEECs) derived from eutopic endometria of patients with endometriosis, and five from patients without endometriosis (controls). GSE7305 includes expression profiles of 10 normal and 10 diseased cases.

Gene Expression Analysis
To generate expression profiles of endometriosis samples, .CEL files were imported into Partek Genomics Suite, version 7.0 (Partek Inc., St. Louis, MO, USA), followed by log-transformation and normalization of the robust background-adjusted array dataset. Principal component analysis (PCA) was performed on the high-dimensional data to assess quality and the overall variance in gene expression of individuals among sample groups. Analysis of variance (ANOVA) was employed to create a list of differentially expressed genes (DEGs) with a cut-off p-value of ≤ 0.05 and a fold change of ± 2. Hierarchical clustering was performed to reveal the pattern of the most differentially expressed (up- and downregulated) genes across samples.
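The DEG selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Partek workflow itself: it assumes the p-values have already been computed by ANOVA, that expression values are group means on a log2 scale, and that "fold change of ± 2" means a signed linear fold change of at least 2 in either direction. The function names and the example values are hypothetical.

```python
def signed_fold_change(mean_case_log2, mean_control_log2):
    """Convert a difference of log2 group means into a signed linear fold change.

    Assumed convention: +2 means twice as high in cases, -2 means half as high.
    """
    diff = mean_case_log2 - mean_control_log2
    ratio = 2.0 ** abs(diff)
    return ratio if diff >= 0 else -ratio


def select_degs(rows, p_cutoff=0.05, fc_cutoff=2.0):
    """Keep genes with p <= p_cutoff and |fold change| >= fc_cutoff.

    rows: iterable of (gene, mean_case_log2, mean_control_log2, p_value).
    """
    selected = []
    for gene, mean_case, mean_control, p_value in rows:
        fc = signed_fold_change(mean_case, mean_control)
        if p_value <= p_cutoff and abs(fc) >= fc_cutoff:
            direction = "up" if fc > 0 else "down"
            selected.append((gene, round(fc, 3), direction))
    return selected


# Hypothetical values for illustration only (not data from the study).
example_rows = [
    ("GENE_A", 9.0, 7.5, 0.001),   # log2 difference +1.5, passes both filters
    ("GENE_B", 6.0, 8.0, 0.020),   # log2 difference -2.0, passes both filters
    ("GENE_C", 7.2, 7.0, 0.0005),  # small p-value, but fold change below 2
    ("GENE_D", 10.0, 8.0, 0.200),  # large fold change, but p-value above 0.05
]
degs = select_degs(example_rows)
```

Applying both filters together is what keeps genes such as the hypothetical GENE_C and GENE_D out of the list: either criterion alone would have admitted one of them.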

Gene Ontology, Pathway, and Upstream Regulators Analysis
The statistically significant DEGs, with their corresponding probe set IDs, p-values, fold-change values, and other relevant data, were uploaded into the Ingenuity Pathway Analysis (IPA, QIAGEN's Ingenuity Systems, Redwood City, CA, USA) software for molecular network and canonical pathway analysis to define interactions among the differentially regulated genes using functional algorithms. The Benjamini-Hochberg method was used to adjust p-values for canonical pathways, and p-values below 0.01 and z-scores of ± 2 were considered significant. Positive and negative z-scores represent activation and inhibition of dysregulated canonical pathways, respectively. Gene ontology analysis was performed to functionally categorize endometriosis-significant genes. All endometriosis-associated DEGs were imported to graphically represent the identified connections and potential relationships among them, in order to identify significant pathways leading to endometriosis initiation and progression.
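As an illustration of the significance rule above, the following sketch applies a Benjamini-Hochberg adjustment and then labels each pathway by the sign of its z-score. It is not IPA's implementation. It assumes that raw p-values and z-scores are already available, and that "z-scores of ± 2" means an absolute z-score of at least 2. The pathway names and numbers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify_pathways(pathways, p_cutoff=0.01, z_cutoff=2.0):
    """Label pathways given as (name, raw_p, z_score) tuples."""
    adjusted = benjamini_hochberg([p for _, p, _ in pathways])
    calls = {}
    for (name, _, z), q in zip(pathways, adjusted):
        if q < p_cutoff and abs(z) >= z_cutoff:
            calls[name] = "activated" if z > 0 else "inhibited"
        else:
            calls[name] = "not significant"
    return calls


# Hypothetical pathways for illustration only (not results from the study).
example = [
    ("pathway_1", 0.0001, -2.7),
    ("pathway_2", 0.0020, 2.6),
    ("pathway_3", 0.0040, -0.6),  # significant p-value, weak z-score
    ("pathway_4", 0.3000, 3.1),   # strong z-score, non-significant p-value
]
adjusted_p = benjamini_hochberg([p for _, p, _ in example])
calls = classify_pathways(example)
```

Requiring both criteria separates statistical enrichment (the adjusted p-value) from a confident direction call (the z-score), so a pathway can be enriched without being labeled activated or inhibited.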

Protein-Protein Interaction Analysis
To examine interactions at the protein level, the STRING v11.0 database (http://string-db.org) was used to search for possible physical and functional associations among proteins encoded by the top DEGs (both up- and downregulated) for a better understanding of disease pathobiology [20]. This prediction provides a visual overview of the possible interconnections between the proteins involved in a specific disease network.
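A query like the one described above could also be issued programmatically. The sketch below only builds the request URL and does not contact the server. It assumes STRING's public REST interface (the `network` method, identifiers separated by a carriage return, and NCBI taxonomy ID 9606 for human), and it targets the current STRING release rather than v11.0. The study does not state whether the web interface or the API was used, and the helper name `build_network_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the STRING REST API (assumption: current release,
# not pinned to the v11.0 release used in the study).
STRING_API = "https://string-db.org/api"


def build_network_url(genes, species=9606, output_format="tsv"):
    """Build a STRING 'network' request URL for a list of gene symbols."""
    params = {
        # STRING expects multiple identifiers separated by a carriage return.
        "identifiers": "\r".join(genes),
        # NCBI taxonomy ID; 9606 is Homo sapiens.
        "species": species,
    }
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


# Top upregulated genes named in the text.
url = build_network_url(["FOS", "EGR1", "ZFP36", "JUNB"])
```

The resulting URL can be opened with any HTTP client; the tab-separated response lists pairwise interactions with their scores.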

Differentially Expressed Genes from Meta-Analysis
The seven GEO data series integrated in the present study comprised a total of 156 endometriosis patients and 118 controls. Data were merged before analysis, as all series had used the same GPL570 platform except GSE25628, which used the GPL571 platform (Table 1). Sample counts by menstrual cycle phase: proliferative (n = 6), proliferative normal (n = 5); early-secretory (n = 6), early-secretory normal (n = 3); mid-secretory (n = 9), mid-secretory normal (n = 8). Principal component analysis showed the grouping of the samples in three-dimensional space according to their whole-genome expression patterns, where each circle represents an individual (Figure 1). Comparing endometriosis with normal non-endometriosis tissue without any pelvic/uterine pathology detected 1590 differentially expressed genes (129 upregulated and 1461 downregulated). The top upregulated genes, including FOS, EGR1, ZFP36, JUNB, APOD, CST1, GPX3, and PER1, are shown in Table 2, and the top downregulated genes, including DIO2, CPM, OLFM4, PALLD, BAG5, TOP2A, PKP4, CDC20B, and SNTN, are shown in Table 3. Hierarchical clustering of DEGs showed a clear difference in the expression pattern of genes between endometriosis cases and controls.
Ingenuity pathway analysis of the endometriosis DEGs revealed altered canonical pathways that were either activated or inhibited (Figure 3, Table 4). Mitotic roles of polo-like kinase (z-score −2.71), aldosterone signaling in epithelial cells (z-score −3.464), and role of CHK proteins in cell cycle checkpoint control (z-score −0.632) were found to be inhibited, while ATM signaling (z-score +1.698) and SUMOylation pathways (z-score +2.668) were activated (Figure 4, Figure 5).
IPA also predicted the activation status of upstream regulators among the identified DEGs of endometriosis. REL (transcription factor, z-score −4.13, p = 0.0002), CTNNB1 (transcription factor, z-score −3.2, p = 0.01), PGR (ligand-dependent nuclear receptor, z-score −2.2, p = 0.0005), and VCAN (proteoglycan, z-score −2.6, p = 0.02) were the top inhibited upstream regulators (Table 5). We also used the STRING biological database to predict functional associations and interactions between the proteins encoded by the identified significant DEGs (the top up- and downregulated ones).

Discussion
Endometriosis, a growth/deposition of endometrial tissue at extra-uterine sites, affects around 10% of women of reproductive age. In addition to abnormal reproductive physiology, cases are increasing drastically due to adverse consequences of treatment with oral contraceptives, GnRH agonists, synthetic progestins, and aromatase inhibitors (letrozole) used to prevent the menstrual cycle and/or pregnancy [1,21]. Understanding the molecular etiology of the origin and progression of endometriosis is necessary to explore therapeutic options and provide better treatment. We therefore conducted a transcriptomic meta-analysis to identify significant endometriosis-associated DEGs and essential pathological pathways.
Combining multiple studies has always been challenging, as different studies use varied protocols, platforms, and analysis methods. We used raw data (.CEL) files to integrate multiple data series to get a bigger cohort and analyzed the data. We identified transcriptomic signatures of endometriosis and evaluated the roles of specific genes, upstream regulators, and dysregulated pathways. Our results provide some insight into the molecular mechanisms underlying endometriosis pathogenesis. Pathogenic genes and pathways may serve as novel targets for diagnostic and prognostic biomarkers and potential therapies for endometriosis. In the present study, we had a long list of genes and pathways, but have restricted our discussion to the most prominent genes and pathways.

Molecular Etiology of Endometriosis
Retrograde or "reverse" menstruation has been suggested as an initial cause of endometriosis, in which menstrual blood flows back into the pelvic cavity outside the uterus instead of flowing out through the cervix. This growth of endometrial tissue outside the uterus is the result of a local estrogen-dependent hormonal imbalance. Higher prevalence has also been seen in women with immune disorders (like rheumatoid arthritis, multiple sclerosis, systemic lupus erythematosus, and hypo- or hyperthyroidism) [17]. Recently, a small-molecule agonist of the G-protein-coupled estrogen receptor, G-1 (Tespria), also showed a reduction in endometrial growth [22].
Unusual transformation of certain abdominal wall cells into endometrial cells has been reported in some women [23] and, interestingly, the same cells are believed to be responsible for the growth of female reproductive organs during embryonic development. Researchers also think that pelvic inflammation, damage, or infection of the cells that line the pelvis, for example after a prior caesarean section, can trigger endometriosis [23][24][25]. The exact pathogenesis remains uncertain. We therefore conducted a transcriptomics study to understand the genetic factors that allow cells to grow as endometrial tissue outside the uterus.
In our results, we found high expression of early and immediate early-response genes such as FBJ murine osteosarcoma viral oncogene homolog or Fos proto-Oncogene (FOS), FosB Proto-Oncogene (FOSB), Early Growth Response 1 (EGR1), ZFP36 Ring Finger Protein (ZFP36), Immediate Early Response 2 (IER2), Immediate Early Response 3 (IER3), Jun B Proto-Oncogene (JUNB), and Transcription Factor SOX-13 (SOX13). The majority of these are DNA-binding proteins that act as transcription factors. Some others, like Dual specificity protein phosphatase 1 (DUSP1) and Receptor-type tyrosine-protein phosphatase O (PTPRO), possess phosphatase activity. c-Fos is a transcription factor of the Fos family, which also includes FosB, Fra-1, and Fra-2 [26]. It is an immediate early-response gene involved in cell proliferation and differentiation of normal tissue after extracellular stress stimuli. Its deregulation has been linked to oncogenic transformation and tumor progression. FOS plays a significant role in the proliferation of endometrial cells, and its overexpression is associated with a poor prognosis in endometrial carcinoma [27]. Fos and Jun family proteins form the heterodimeric AP-1 transcription factor complex, shown to be involved in endometrial carcinogenesis [28]. Upstream regulator analysis, which examines linkage to DEGs that were experimentally shown to affect gene expression, revealed genes such as REL, CTNNB1, PGR, and VCAN [29]. All of these upstream regulators were predicted to be inhibited.

REL: REL (V-Rel Avian Reticuloendotheliosis Viral Oncogene Homolog) encodes the proto-oncogene c-Rel protein, a transcription factor of the NF-κB family that regulates genes involved in B- and T-cell differentiation, immune response, survival, apoptosis, proliferation, and oncogenic processes, including endometrial carcinogenesis [30,31].
CTNNB1: CTNNB1 (Catenin β1) codes for a protein that regulates and coordinates cell-cell adhesion, embryonic development, epithelial-mesenchymal transition, and gene transcription. It is an integral part of the canonical Wnt pathway. Aberrant Wnt/β-catenin signaling is associated with loose cytoskeleton organization and cell-to-cell contacts of epithelial cells, along with a high motility of mesenchymal cells that promotes invasiveness and fibrosis. This might lead to multiple cancers, including endometrial cancer [32][33][34][35][36]. Targeting Wnt/β-catenin signaling was shown to avert fibrogenesis in a xenograft mouse model of endometriosis [35].
VCAN: VCAN (Versican) codes for four isoforms of a large extracellular matrix chondroitin sulfate proteoglycan, found in different tissues and organs, that regulate cell adhesion, proliferation, migration, and survival [2]. Higher expression of VCAN has been reported in angiogenesis, tumor growth, cancer relapse, and inflammatory lung disorders [37][38][39]. Significantly high expression of VCAN was also reported in mid-secretory phase endometrial epithelial cells after combined estrogen/progesterone treatment. The V1 isoform of VCAN was recently reported to facilitate the development of endometrial receptivity and human embryo implantation [40]. Higher expression of VCAN is connected with the pathogenesis of peritoneal endometriosis and seems to be an indicator of poor prognosis in endometrial cancer [2,41].
Aromatase activates local estrogen biosynthesis from androgens, thereby stimulating a positive feedback cycle of prostaglandin E2 production by upregulating cyclooxygenase-2 (COX-2). Because the normal endometrium lacks aromatase (estrogen synthase), androgens cannot be converted into estrogen there [49]. In contrast, numerous studies have described aberrantly high expression of aromatase in eutopic and ectopic endometrium [17]. Increased COX-2 expression in stromal cells and aberrant aromatase overexpression in eutopic endometrium have both been indicated as potential therapeutic biomarkers, and their specific inhibitors are therefore being increasingly employed for therapeutic management [50]. A probable role of Krüppel-like Factor 9 (KLF9) dysregulation has been suggested in both pregnancy failure and endometrial pathogenesis [51]. The progesterone resistance and subsequent infertility seen in endometriosis seem to be associated with KLF9, a progesterone-receptor-interacting protein, as mice null for Klf9 are sub-fertile. KLF9 deficiency has been implicated in the progesterone resistance of eutopic endometrium in patients [52] and is accompanied by simultaneous abrogation of Hedgehog-, Notch-, and steroid-receptor-regulated networks [53].
Based on differential expression in the serum proteome, a possible biomarker panel comprising zinc-alpha-2-glycoprotein, albumin, and complement C3 has been proposed for effective and non-invasive diagnosis of endometriosis [54]. Importantly, the three markers were independent of the endometriosis stage and cycle phase. Brain Derived Neurotrophic Factor (BDNF) has been identified as a potential peripheral early diagnostic marker, as its mean plasma concentrations were twice as high in endometriosis cases as in asymptomatic or healthy controls [9]. Based on this, a nano-chip-based electrochemical detection technique was developed. Its main limitation is non-specificity, as variations in BDNF expression have been reported in numerous unrelated pathologies [55].

Canonical Pathways Involved in Endometriosis
Molecular pathway analysis revealed several significantly altered canonical pathways for the DEGs of endometriosis. Herein, we discuss the role of key pathways, namely Mitotic roles of polo-like kinase, Role of CHK proteins in cell cycle checkpoint control, Aldosterone signaling in epithelial cells, and ATM Signaling, in endometriosis progression.
Mitotic Roles of Polo-Like Kinase Pathway: Polo-like kinases (Plks) are a family of serine/threonine protein kinases (PLK1-5) that regulate the mitotic checkpoint during the M phase of cell division. Plks can act as either oncogenes or tumor suppressors, and have been found to be overexpressed in different cancer types, including endometrial [56] and ovarian [57] cancers. Because of their direct association with increased cellular proliferation and poor prognosis, they are considered bona fide cancer biomarkers [58,59]. The direct association of Plk expression with serum estrogen (ovarian hormone) levels and with abnormal regulation of ectopic endometrial cell proliferation strongly suggests a role in the pathogenesis of endometriosis [60]. Plk inhibitors such as volasertib and rigosertib are in advanced stages of clinical trials and might be used for endometriosis treatment [61].
Role of CHK Proteins in Cell Cycle Checkpoint Control Pathway: Activation of cell cycle checkpoint kinases, including Chk1 and Chk2, is an immediate response to any type of DNA damage [62]. In response to DNA damage, this signaling pathway temporarily delays cell cycle progression, allowing time for DNA repair, or triggers programmed cell death. Activated ATM kinase phosphorylates Chk2, which in turn phosphorylates CDC25C to block progression from G2 to M phase. Chk2 also phosphorylates p53, attenuating p53 binding to MDM2 and activating p21/WAF1 to arrest the cell cycle in G1 phase. Chk1, activated in a Rad3-dependent manner, phosphorylates CDC25A and CDC2 to inhibit their activity and block the G2-M transition. Overall, CHK protein signaling depends on the type of stress and the extent of DNA damage, and is involved in endometrial cancer [63].
Cisplatin exerts an anticancer effect by activating DNA-damage-response genes Chk1/2, which generates both survival (repair) and apoptotic signals that lead to cell death. Cisplatin-resistant cells have dominant repair signaling that allows cells to survive. Chk1/2 inhibitor AZD7762 has been shown to overcome cisplatin resistance in endometriosis-associated ovarian cancer by reducing repair signaling [64].
ATM Signaling Pathway: The ataxia telangiectasia mutated (ATM) gene codes for a serine/threonine protein kinase that participates in cell division and DNA repair. DNA damage induces autophosphorylation of ATM, which activates DNA repair enzymes by phosphorylating Chk1/2 to fix the broken strands [65]. Efficient cross-talk between ATR-Chk1 and ATM-Chk2 leads to the repair of damaged DNA strands, which helps to maintain the cell's genomic stability and integrity [66]. Because of its central role in cell division and DNA repair, the ATM signaling pathway has been a focus of cancer research, especially in endometrial cancer, for exploring novel molecular therapies targeting ATM pathways [67].
Aldosterone Signaling in Epithelial Cells Pathway: Aldosterone is a mineralocorticoid steroid hormone produced by the adrenal cortex. Aldosterone signaling primarily controls blood pressure and inflammation by regulating its target genes (FKBP5, IGF1, KRAS, PKCε, NCOA1, NCOR1, NEDD4L, SGK, and MR/NR3C2 as per RGD, https://rgd.mcw.edu [68] and IPA). Recent studies have shown the possible involvement of aldosterone in multiple gynecological problems and inflammatory disorders [69]. Endometriosis has been associated with intraperitoneal inflammation, with diseases like atherosclerosis and hypertension, and also with autoimmune diseases, diabetes, hypothyroidism, and cancer [9]. A metabolomics-based study revealed high aldosterone levels in endometriosis patients with infertility [70].

Future Directions
The strength of the present work lies in the inclusion of multiple endometriosis-related expression datasets in order to understand endometriosis at the molecular level. However, the absence of a validation study is a limitation. In future, we plan to conduct RT-PCR-based validation studies for the differentially expressed genes on endometriosis samples collected from the Jeddah region. Further, cell cultures and animal models could be used to assess the effect of activated/suppressed genes on molecular pathways and disease phenotypes for potential clinical translation. Virtual screening of potential lead compounds against the identified therapeutic biomarkers will also be carried out for rational drug design. This could facilitate tailor-made, personalized therapies.

Conclusions
Endometriosis is an estrogen-dependent, progesterone-resistant, inflammatory, multifactorial gynecological disorder. Identification of distinct molecular signatures and potential therapeutic molecules corresponding to endometriosis is needed for better diagnosis. The present microarray-based genomics and molecular pathway analysis helped to establish a better understanding of endometriosis at the molecular level, as multiple expression datasets were integrated to determine differentially expressed genes and to identify canonical molecular pathways broadly related to endometriosis. The study identified alterations of gene expression and molecular signaling, including aldosterone signaling, that result in the hormonal imbalances and pathogenesis of endometriosis. An anti-inflammatory diet and increased levels of antioxidants and phytonutrients can be recommended to patients to reverse inflammation and oxidative damage, while also supporting healthy hormone balance.