Identification of Upstream Transcriptional Regulators of Ischemic Cardiomyopathy Using Cardiac RNA-Seq Meta-Analysis

Ischemic cardiomyopathy (ICM), characterized by pre-existing myocardial infarction or severe coronary artery disease, is the major cause of heart failure (HF). Identification of novel transcriptional regulators in ischemic HF can provide important biomarkers for developing new diagnostic and therapeutic strategies. In this study, we used four RNA-seq datasets from four different studies, including 41 ICM and 42 non-failing control (NF) samples of human left ventricle tissues, to perform the first RNA-seq meta-analysis in the field of clinical ICM, in order to identify important transcriptional regulators and their targeted genes involved in ICM. Our meta-analysis identified 911 differentially expressed genes (DEGs) with 582 downregulated and 329 upregulated. Interestingly, 54 new DEGs were detected only by meta-analysis but not in individual datasets. Upstream regulator analysis through Ingenuity Pathway Analysis (IPA) identified three key transcriptional regulators. TBX5 was identified as the only inhibited regulator (z-score = −2.89). F2R and SFRP4 were identified as the activated regulators (z-scores = 2.56 and 2.00, respectively). Multiple downstream genes regulated by TBX5, F2R, and SFRP4 were involved in ICM-related diseases such as HF and arrhythmia. Overall, our study is the first to perform an RNA-seq meta-analysis for clinical ICM and provides robust candidate genes, including three key transcriptional regulators, for future diagnostic and therapeutic applications in ischemic heart failure.


Introduction
Heart failure (HF), leading to considerable mortality and health care costs, is a critical health problem, especially among the people aged ≥65, around the world [1]. Characterized by pre-existing myocardial infarction, hibernating myocardium or severe coronary artery disease, ischemic cardiomyopathy (ICM) accounted for more than 60% of systolic HF cases in industrialized countries [2]. Mild or severe repeated injuries to left ventricle are common in patients with ICM, resulting in cardiac remodeling, chronic myocardial dysfunction and eventual HF [3].
Dysregulated transcriptional hubs, such as transcription factors, non-coding RNAs and chromatin regulators, and their downstream gene expression signatures are representative of genomic mechanisms underlying disease processes. Therefore, detecting such genomic signatures aids in the development

Differentially Expressed Genes
Expression levels of 58,884 coding and non-coding genes based on Homo_sapiens.GRCh38.96.gtf were quantified in this study. Nine hundred and eleven (911) differentially expressed genes (DEGs), including 582 downregulated and 329 upregulated genes, were identified in meta-analysis (Supplementary Table S1). The top 50 DEGs ordered by p-value are presented in Table 2. The number of common DEGs among the meta-analysis and the analyses of four individual studies are presented using Venn diagram ( Figure 1A). A considerable number of DEGs were not consistent among different RNA-seq studies in terms of statistical significance ( Figure 1A). Interestingly, 54 new DEGs were identified in the meta-analysis but not in the individual datasets ( Figure 1A, Table 3). No common genes were identified as statistically significant among the meta-analysis and all four individual analyses ( Figure 1A). However, a heatmap of all meta-analysis identified DEGs showed consistent patterns of up-or down-regulated DEGs among the samples across different studies ( Figure 1B). TNNT2   (Table 1).   (Table 1).

Toxicity Pathway Analysis
The IPA-Tox analysis identified 232 significant toxicity pathways (p-value < 0.05). Activation z-score > 2 was considered as significantly activated. Only two pathways, arrhythmia (z-score = 2.81) and failure of heart (z-score = 2.41), were marked with significant activation status. Interactions of these two pathways with their associated DEGs are shown in Figure 2A,B. To identify shared DEGs between these two pathways, an integrated network was generated ( Figure 2C). Seventeen genes out of 62 DEGs were shared between the two pathways and interestingly, most of them were downregulated in the ICM group ( Figure 2C, Table 4). Among the common DEGs between failure of heart and arrhythmia, DES and TNNT2 were among the top 50 DEGs ( Figure 2C, Table 2). ATP1A1 was among the DEGs that were identified only in meta-analysis but not in the four individual datasets ( Figure 2C, Table 3).  Table 4. Shared DEGs between arrhythmia and failure of heart.
TTN −0.66 Down 1 : Average of log 2 FC from individual studies, FC: fold-change; 2 : "Up" or "Down" indicates whether the gene was upregulated or downregulated.

Canonical Pathway Analysis
Among 122 significant canonical pathways identified by IPA (p-value < 0.05), only four pathways had absolute z-scores more than 2.0 and were marked as significantly inhibited in the ICM group ( Figure 3). The DEGs involved in these pathways were summarized in Supplementary Table S2. Among these DEGs, ATP1A1 was involved in the Superpathway of Inositol Phosphate Compounds and also contributed to both heart failure and arrhythmia (Supplementary Table S2, Figure 2C). ACTC1 and TGFBR2 contributing to arrhythmia were also involved in EIF2 Signaling Pathway and Senescence Pathway, respectively (Figure 2A, Supplementary Table S2). MTOR, EP300 and PPP3CC in the Senescence Pathway were also involved in failure of heart ( Figure 2B, Supplementary

Canonical Pathway Analysis
Among 122 significant canonical pathways identified by IPA (p-value < 0.05), only four pathways had absolute z-scores more than 2.0 and were marked as significantly inhibited in the ICM group ( Figure 3). The DEGs involved in these pathways were summarized in Supplementary Table  S2. Among these DEGs, ATP1A1 was involved in the Superpathway of Inositol Phosphate Compounds and also contributed to both heart failure and arrhythmia (Supplementary Table S2, Figure 2C). ACTC1 and TGFBR2 contributing to arrhythmia were also involved in EIF2 Signaling Pathway and Senescence Pathway, respectively ( Table S2).

Upstream Regulator Analysis
IPA upstream regulator analysis identified 61 significant upstream regulators (p-value < 0.05) that were 38 downregulated and 23 upregulated in the ICM group, respectively (Supplementary Table S3). Among those upstream regulators, only TBX5 was marked as a significantly inhibited regulator (z-score = −2.89); only F2R and SFRP4 were significantly activated regulators (z-scores = 2.56 and 2.00, respectively) (Supplementary Table S3). Figure 4 summarized the targeted genes by these three upstream regulators.  Table S2).

Upstream Regulator Analysis
IPA upstream regulator analysis identified 61 significant upstream regulators (p-value < 0.05) that were 38 downregulated and 23 upregulated in the ICM group, respectively (Supplementary Table  S3). Among those upstream regulators, only TBX5 was marked as a significantly inhibited regulator (z-score = −2.89); only F2R and SFRP4 were significantly activated regulators (z-scores = 2.56 and 2.00, respectively) (Supplementary Table S3). Figure 4 summarized the targeted genes by these three upstream regulators.
Integrating the results of the TBX5-targeted genes and the DEGs in toxicity pathways, we found that the dysregulation of TBX5-targeted genes, TNNT2, NPPA, TTN, ATP2A2, DES and SCN5A, contributed to the development of heart failure and arrhythmia (Figures 2C and 4A). NKX2-5, HSPB7 and BCL2L1 were also involved in failure of heart ( Figures 2B and 4A). TBX5-targeted genes, MYH6, ACTC1 and TPM1, were involved in arrhythmia (Figures 2A and 4A). Moreover, ACTC1 in EIF2 Signaling Pathway was also regulated by TBX5. Among TBX5-targeted genes, MYH6, TNNT2, ECM2, TPM1 and DES, were found in the top 50 DEGs list ( Table 2). The network of TBX5, its targeted genes and the corresponding cardiac disorders via IPA regulator effects analysis further indicated an important role of TBX5 in the development of HF-related dysfunctions and diseases ( Figure 5). Inhibited TBX5 caused dysregulation of several genes such as BCL2L1, HSPB7, NPPA, SCN5A, NKX2-5, TNNT2, ATP2A2, TTN and DES, which are involved in the activation of heart failure ( Figure 5). Inhibited TBX5 also contributed to other cardiac dysfunctions including degeneration of heart (increased), cardiac contractility (decreased), contractility of muscle (decreased) and function of cardiac muscle (decreased) ( Figure 5).  Integrating the results of the TBX5-targeted genes and the DEGs in toxicity pathways, we found that the dysregulation of TBX5-targeted genes, TNNT2, NPPA, TTN, ATP2A2, DES and SCN5A, contributed to the development of heart failure and arrhythmia (Figures 2C and 4A). NKX2-5, HSPB7 and BCL2L1 were also involved in failure of heart ( Figures 2B and 4A). TBX5-targeted genes, MYH6, ACTC1 and TPM1, were involved in arrhythmia (Figures 2A and 4A). Moreover, ACTC1 in EIF2 Signaling Pathway was also regulated by TBX5. Among TBX5-targeted genes, MYH6, TNNT2, ECM2, TPM1 and DES, were found in the top 50 DEGs list ( Table 2). The network of TBX5, its targeted genes and the corresponding cardiac disorders via IPA regulator effects analysis further indicated an important role of TBX5 in the development of HF-related dysfunctions and diseases ( Figure 5). Inhibited TBX5 caused dysregulation of several genes such as BCL2L1, HSPB7, NPPA, SCN5A, NKX2-5, TNNT2, ATP2A2, TTN and DES, which are involved in the activation of heart failure ( Figure 5). Inhibited TBX5 also contributed to other cardiac dysfunctions including degeneration of heart (increased), cardiac contractility (decreased), contractility of muscle (decreased) and function of F2R, as the activated upstream regulator, upregulated PLAT, which was involved in heart failure and arrhythmia (Figures 2C and 4B). Increased CCN2, regulated by F2R, contributed to failure of heart ( Figures 2B and 4B). Dysregulated MMP2, DSP and TGM2 were involved in arrhythmia (Figures 2A and  4B). SFRP4, as the other activated upstream regulator, inhibited TNNT2 and MYH7, which contributed to the development of heart failure and arrhythmia ( Figures 2C and 4C). Upregulation of SFRP4 also inhibited the expression of NKX2-5 and MYH6 ( Figure 4C). NKX2-5 and MYH6 were involved in failure of heart and arrhythmia, respectively ( Figure 2). F2R, as the activated upstream regulator, upregulated PLAT, which was involved in heart failure and arrhythmia ( Figures 2C and 4B). Increased CCN2, regulated by F2R, contributed to failure of heart ( Figures 2B and 4B). Dysregulated MMP2, DSP and TGM2 were involved in arrhythmia (Figures 2A and 4B). SFRP4, as the other activated upstream regulator, inhibited TNNT2 and MYH7, which contributed to the development of heart failure and arrhythmia ( Figures 2C and 4C). Upregulation of SFRP4 also inhibited the expression of NKX2-5 and MYH6 ( Figure 4C). NKX2-5 and MYH6 were involved in failure of heart and arrhythmia, respectively ( Figure 2).
Interestingly, an integrated network of the three upstream regulators and their targeted genes showed that MYH6, NKX2-5 and TNNT2 were regulated by both TBX5 and SFRP4 ( Figure 4D). CCND1 was also regulated by both SFRP4 and F2R ( Figure 4D). As mentioned above, some of these regulated genes were involved in failure of heart and/or arrhythmia ( Figure 2C), indicating that the three key upstream regulators are common hubs regulating the downstream genes importantly contributing to HF-related cardiac disorders.

Discussion
In this study, we performed the first RNA-seq meta-analysis in the field of clinical ICM using four RNA-seq studies to profile gene expression signatures and identify key transcriptional regulators. We applied a consistent bioinformatics pipeline for processing the raw RNA-seq data (FASTQ files) of all four individual datasets to prevent methodological inconsistences in terms of data processing and bioinformatics pipelines among original studies. Our meta-analysis identified a total of 911 differentially expressed genes including 582 downregulated and 329 upregulated genes (Supplementary Table S1).
Among the top 50 significant DEGs (Table 2), several genes, such as OGN and RPL26, were previously reported to be dysregulated in ICM patients [15,16]. Upregulation of OGN, osteoglycin, has been reported to play a role in collagen maturation and deposition in mouse myocardial infarction tissue [15]. Increased circulating OGN has also been observed in ischemic HF patients experiencing myocardial infarction compared to patients with non-ischemic HF and thus it has been proposed as a biomarker for ischemic HF with pre-existing myocardial infarction [15]. Moreover, upregulation of RPL26, ribosomal protein L26, has also been observed in patient with ischemic HF [16]. Abnormal expressions of DES and PTN were also reported in dilated cardiomyopathy (DCM) [17,18]. Interestingly, genetic variants in several genes of the top 50 DEGs were reported to be associated with ICM and other heart diseases. For example, genetic variants of PALLD (palladin, cytoskeletal associated protein), important for organizing actin cytoskeleton, have been reported to be associated with myocardial infarction [19,20]. Genetic variants of MYH6, TNNT2 and TPM1, have Interestingly, an integrated network of the three upstream regulators and their targeted genes showed that MYH6, NKX2-5 and TNNT2 were regulated by both TBX5 and SFRP4 ( Figure 4D). CCND1 was also regulated by both SFRP4 and F2R ( Figure 4D). As mentioned above, some of these regulated genes were involved in failure of heart and/or arrhythmia ( Figure 2C), indicating that the three key upstream regulators are common hubs regulating the downstream genes importantly contributing to HF-related cardiac disorders.

Discussion
In this study, we performed the first RNA-seq meta-analysis in the field of clinical ICM using four RNA-seq studies to profile gene expression signatures and identify key transcriptional regulators. We applied a consistent bioinformatics pipeline for processing the raw RNA-seq data (FASTQ files) of all four individual datasets to prevent methodological inconsistences in terms of data processing and bioinformatics pipelines among original studies. Our meta-analysis identified a total of 911 differentially expressed genes including 582 downregulated and 329 upregulated genes (Supplementary Table S1).
Among the top 50 significant DEGs (Table 2), several genes, such as OGN and RPL26, were previously reported to be dysregulated in ICM patients [15,16]. Upregulation of OGN, osteoglycin, has been reported to play a role in collagen maturation and deposition in mouse myocardial infarction tissue [15]. Increased circulating OGN has also been observed in ischemic HF patients experiencing myocardial infarction compared to patients with non-ischemic HF and thus it has been proposed as a biomarker for ischemic HF with pre-existing myocardial infarction [15]. Moreover, upregulation of RPL26, ribosomal protein L26, has also been observed in patient with ischemic HF [16]. Abnormal expressions of DES and PTN were also reported in dilated cardiomyopathy (DCM) [17,18]. Interestingly, genetic variants in several genes of the top 50 DEGs were reported to be associated with ICM and other heart diseases. For example, genetic variants of PALLD (palladin, cytoskeletal associated protein), important for organizing actin cytoskeleton, have been reported to be associated with myocardial infarction [19,20]. Genetic variants of MYH6, TNNT2 and TPM1, have been found to be associated with hypoplastic left heart, cardiac hypertrophy and DCM, respectively [21][22][23].
Meta-analysis is a more sensitive and reliable approach to identify novel robust DEGs due to its greater power to detect differential expression [9]. Fifty-four new DEGs were discovered through our meta-analysis and they were not detected by analyzing the individual datasets. Most of these genes have not been reported prior as related to ICM. However, some of these DEGs have been found to be associated with HF-related diseases. For example, CRMP1 has been demonstrated as a potential candidate for left-sided congenital heart disease [24]. Lower expression of ATP1A1 has been reported in end-stage HF [25]. Consistently, our IPA pathway analysis showed that ATP1A1 was involved in heart failure ( Figure 2B). Abnormal expressions of AZGP1 in chronic HF [26] and MDFIC in DCM have been previously reported [27]. Moreover, genetic mutation in PKD2 has been reported in idiopathic DCM [28]. Further research is needed to investigate pathophysiological mechanisms of these newly identified genes in our meta-analysis.
ACE2, SP100, CITED2, CEBPD, BCL3, CREB, SMARCA4, NCAM1 and SFRP4 have been previously reported as transcriptional regulators in heart failure [29][30][31][32][33][34]. Although CITED2, CREB5 (belongs to CREB family), CREB3L1 (belongs to CREB family), SMARCA4, NCAM1 and SFRP4 were significant DEGs in our dataset (Supplementary Table S1), only SFRP4 was identified as the significantly activated transcriptional regulator based on significant activation z-score (absolute z-score > 2.0) from our IPA analysis (Figure 4). Upstream regulator analysis also identified two additional transcriptional regulators, TBX5 and F2R (Figure 4). TBX5, a member of the T-box transcription factor family, was the top inhibited regulator. It has been previously reported that malfunction of TBX5 could lead to several cardiovascular diseases during embryonic development and also during adulthood [35]. In our study, inhibition of TBX5 was shown to dysregulate several genes such as DES, NKX2-5, ACTC1, MYH6, ATP2A2 and HSPB7, which further contributed to ICM-related diseases such as failure of heart and degeneration of heart ( Figure 5). Cardiac muscle functions including cardiac contractility, contractility of muscle and function of cardiac muscles were also shown to be influenced due to inhibition of TBX5 ( Figure 5). Consistent with our finding, TBX5 along with MEF2C has been reported to activate the expression of MYH6, which is considered as the building block of cardiomyocytes and plays a crucial role in heart development and function [36]. Moreover, DES expression has also been reported to be regulated by TBX5 [37] and a decreased number of DES-positive myocytes has been found in ischemic heart failure and was associated with reduced cardiac function [38]. Our study also found that dysregulated TBX5 could inhibit the expression of ATP2A2 (Figure 5), an ATPase enzyme that plays an important role in muscle contraction and relaxation, and decreased ATP2A2 has also been observed during human end-stage heart failure [39,40]. TBX5-regulated NKX2-5 is involved in heart formation and development and dysregulated NKX2-5 could lead to heart failure and sudden cardiac death [37,41]. Therefore, our results demonstrate a strong association of TBX5 with heart diseases and further propose its important transcriptional regulatory role in the development of ICM for future mechanistic studies.
F2R and SFRP4 were identified as significantly activated upstream regulators mediating multiple HF-related genes (Figures 2 and 4). SFRP4, secreted frizzled-related protein 4, is a member of the SFRPs family, functioning as soluble modulators in Wnt signaling [20]. Increased SFRP4 has been reported in patients with coronary heart disease and DCM [34,42]. Several SFRP4-targeted genes, such as MYH6, MYH7, TNNT2 and NKX2-5, contributed to different types of heart diseases [22,41,43,44]. F2R, coagulation factor II receptor, is a member of the G-protein coupled receptor family and it is important for regulating the thrombotic response [20]. Genetics variants in F2R have been reported to influence the risk of myocardial infarction and coronary heart disease [45,46]. In our study, activation of F2R was indicated to cause overexpression of several genes, including CCN2, CCND1, CDH11, CASP4, EGR1, MMP2 and PLAT, in the ICM group ( Figure 4B). Activation of SFRP4 and F2R also upregulated CCND1 ( Figure 4D), which was involved in EIF2 Signaling and Senescence Pathways (Supplementary Table S2). CCND1 (Cyclin D1) has been reported to promote cardiomyocyte division in vivo and regulate cardiac function responding to heart failure in a rat myocardial infarction model [47]. Besides TBX5, our study identified F2R and SFRP4 as two important activated transcriptional regulators involved in the development of ICM-related cardiac dysfunctions.
In conclusion, our study, which is the first RNA-seq meta-analysis in the field of clinical ICM, identified multiple novel dysregulated genes and three key transcriptional regulators involved in the development of ischemic cardiomyopathy and its associated cardiovascular diseases. The three transcriptional regulators could be further examined as potential biomarkers for simultaneous regulation of multiple ICM-involved genes to develop novel diagnostic and therapeutic strategies in ischemic heart failure. Table 1 summarizes four RNA-seq studies of ICM using tissue samples from human left ventricle, found in NCBI GEO [48] database (https://www.ncbi.nlm.nih.gov/geo/). Further, we did not include the datasets if they were not collected using the Illumina sequencing platform or they were collected in patients with any treatment of a specific drug or medically implanted device. The clinical information of ICM patients and their controls has been described in Study_1 [49], Study_2 [50], Study_3 [51] and Study_4 [14]. RNA-seq analyses of these four individual studies were previously published [49,50,52,53], thus all four individual datasets have been validated for our current meta-analysis. Detailed information of the data processing and bioinformatics analysis has been described in our recently published paper [32] and is shown in Figure 6. Briefly, FASTQ files were downloaded from the European Nucleotide Archive website (https://www.ebi.ac.uk/ena). Quality control for raw reads and trimmed reads was performed using FastQC [54]. Adaptors and low-quality bases (Phred quality score < 10) were filtered using Cutadapt [55]. A pipeline of HISAT2 [56], Samtools [57] and HTSeq-count [58] was used for aligning the trimmed reads to the human reference genome (GRCh38) and quantifying gene expression. Only uniquely mapped reads were used for expression quantification.

Materials and Methods
important activated transcriptional regulators involved in the development of ICM-related cardiac dysfunctions.
In conclusion, our study, which is the first RNA-seq meta-analysis in the field of clinical ICM, identified multiple novel dysregulated genes and three key transcriptional regulators involved in the development of ischemic cardiomyopathy and its associated cardiovascular diseases. The three transcriptional regulators could be further examined as potential biomarkers for simultaneous regulation of multiple ICM-involved genes to develop novel diagnostic and therapeutic strategies in ischemic heart failure. Table 1 summarizes four RNA-seq studies of ICM using tissue samples from human left ventricle, found in NCBI GEO [48] database (https://www.ncbi.nlm.nih.gov/geo/). Further, we did not include the datasets if they were not collected using the Illumina sequencing platform or they were collected in patients with any treatment of a specific drug or medically implanted device. The clinical information of ICM patients and their controls has been described in Study_1 [49], Study_2 [50], Study_3 [51] and Study_4 [14]. RNA-seq analyses of these four individual studies were previously published [49,50,52,53], thus all four individual datasets have been validated for our current meta-analysis. Detailed information of the data processing and bioinformatics analysis has been described in our recently published paper [32] and is shown in Figure 6. Briefly, FASTQ files were downloaded from the European Nucleotide Archive website (https://www.ebi.ac.uk/ena). Quality control for raw reads and trimmed reads was performed using FastQC [54]. Adaptors and low-quality bases (Phred quality score < 10) were filtered using Cutadapt [55]. A pipeline of HISAT2 [56], Samtools [57] and HTSeq-count [58] was used for aligning the trimmed reads to the human reference genome (GRCh38) and quantifying gene expression. Only uniquely mapped reads were used for expression quantification. DESeq2 [59] was used to perform differential expression analysis. Genes with low read counts were filtered with default parameters in DESeq2. Quantitative meta-analysis was performed through Fisher's combined probability test [60] using metaRNASeq [9]. Raw p-values were adjusted by the Benjamini-Hochberg false discovery rate (FDR) method and the adjusted p-values less than 0.05 were considered as statistically significant. Only DEGs with consistent expression directions among the four individual studies were included in the final DEGs list.

Materials and Methods
To identify enriched canonical pathways, toxicity functions (IPA-Tox) and upstream transcriptional regulators, the Ingenuity Pathway Analysis software (IPA, Qiagen, Redwood City, CA, USA) [61] was used to analyze DEGs identified by meta-analysis. VennDiagram [62] in R was used to generate a Venn diagram of common DEGs among the meta-analysis and individual studies. A heatmap of the DEGs identified by meta-analysis was generated using the heatmap.2 function from the gplots package in R [63]. The compute-intensive tasks were performed using Ohio Supercomputer Center [64].