Co-Expression Network Analysis of AMPK and Autophagy Gene Products during Adipocyte Differentiation

Autophagy is involved in the development and differentiation of many cell types. It is essential for the pre-adipocytes to respond to the differentiation stimuli and may contribute to reorganizing the intracellulum to adapt the morphological and metabolic demands. Although AMPK, an energy sensor, has been associated with autophagy in several cellular processes, how it connects to autophagy during the adipocyte differentiation remains to be investigated. Here, we studied the interaction between AMPK and autophagy gene products at the mRNA level during adipocyte differentiation using public-access datasets. We used the weighted-gene co-expression analysis to detect and validate multiple interconnected modules of co-expressed genes in a dataset of MDI-induced 3T3-L1 pre-adipocytes. These modules were found to be highly correlated with the differentiation course of the adipocytes. Several novel interactions between AMPK and autophagy gene products were identified. Together, it is possible that AMPK-autophagy interaction is temporally and locally modulated in response to the differentiation stimuli.


Introduction
Autophagy is essential for the white adipocyte differentiation. The knockdown of Atg5 and/or Atg7 gene in the 3T3-L1 pre-adipocyte prevents its maturation upon the chemical induction [1,2]. Secondary to that, the pre-adipocyte fails to accumulate triglycerides and to form the fat droplets, which is a characteristic of mature white adipocytes [3]. This is yet to be reconciled with another observation in cells known to contain large quantities of lipids (e.g., hepatocytes) where autophagy takes part in lipid degradation [4]. AMP-activated protein kinase (AMPK), which can be activated at the low level of energy such as starvation, stimulates autophagy through the inhibition of the mTOR activity [5,6] and/or the direct phosphorylation of ULK1 [7,8]. Autophagy and AMPK regulate several aspects of the lipid metabolism and the cell response to changing energy levels. Therefore, the AMPK-autophagy interaction could be consequential in the context of adipocyte differentiation.
3T3-L1 pre-adipocyte is a mouse fibroblast with the potential to differentiate into a mature adipocyte when treated with the MDI differentiation induction medium (160 nM insulin, 250 nM dexamethasone, and 0.5 mM 1-methyl-3-isobutylxanthine) [9]. Upon induction, the pre-adipocyte undergoes multiple metabolic and morphological changes to reach maturation. Evidently, several of these changes can be observed at the transcription level of multiple adipogenic and lipogenic markers [10,11].
The aim of this work is to identify the potential AMPK-autophagy connections, in the broad sense of the pathways, that are both novel and consequential in the adipocyte differentiation, mainly using the weighted-gene co-expression network analysis (WGCNA) [12]. This approach focuses on identifying co-expressed pairs of genes across the differentiation stages, and enables the direct use of similar datasets to test and validate the findings even though they might be performed on different platforms.
In this study, we applied the WGCNA approach to a microarrays dataset of MDI-induced adipocyte at eight different time points corresponding to three differentiation stages. We identified two networks/modules among autophagy and AMPK gene products that were correlated with the differentiation course. By analyzing these modules in one of the datasets, we were able to specify several potential novel AMPK-autophagy interactions and connect these to candidate functions through the known annotations. Finally, we checked these findings in three independent datasets of similar design and found the networks/modules to be well preserved.

Preparing Data and Annotations
First, we retrieved several microarray datasets from the Gene expression omnibus (GEO) [13]. We sought arrays of MDI-induced 3T3-L1 pre-adipocytes at three or more time points. After excluding the ones with varying designs and limited annotations, four datasets were included in the analysis (Table 1); one dataset (GSE34150) was chosen for the main analysis, and the rest were reserved for testing and validation. In GSE34150, the total RNA from 24 samples of MDI-induced pre-adipocytes were collected at eight different time points corresponding to three adipocyte differentiation stages (0 day, undifferentiated; two and four days, differentiating; 6-18 days, maturating). The initial quality assessment included checking the distribution of the intensities from all probes at the log scale, hierarchal clustering and multi-dimensional scaling analysis (MDS). Groups of samples from different stage of differentiation showed similar distributions, appropriate clustering and separation across the two different dimensions of MDS. Furthermore, to ensure the reliability of the analysis, we examined the expression of a number of differentiation and lipogenesis markers (Appendix A). The Pparg, Cebpa and Lpl genes, essential factors for adipocyte differentiation, were highly expressed in differentiating and maturing cells compared to the undifferentiated cells. Expression of most lipogenic genes (Pparg, Cebpa, Lpl, Scd1, Scd2, Dgat1, Dgat2 and Fasn) was correlated with the development of 3T3-L1 pre-adipocytes into mature adipocytes. The gene ontology (GO) terms: AMP-activated protein kinase activity (AMPK) and autophagy were used to identify 14 and 167 genes of known functions in the corresponding biological processes (BP), respectively. A total of 181 genes was used in the downstream analysis to limit the input to WGCNA, over-representation and defining novel interactions between and among AMPK and autophagy pathways. GO was also used to define the terms involving these interacting gene products and link them to known molecular functions (MF) and cellular components (CC). Appendix A contains a detailed discussion for the data inclusion criteria, quality assessment and obtaining the GO annotations.

Detecting Co-Expression Modules of AMPK and Autophagy Genes in Differentiating Adipocytes
Constructing co-expression networks is a multi-step process. First, the Pearson's correlation coefficient was calculated between each pair of the genes of interest (n = 181) across all samples (n = 24). Second, these correlations were raised to the power 5 to obtain an adjacency matrix of all possible pairs. Third, the adjacency matrix was used to calculate the Topological Overlap Matrix (TOM) as a reliable similarity measure. Finally, TOM similarity between pairs of genes were used to calculate the weight of their connection in a network of all possible pairs and a distance (1 -TOM) to cluster the pairs into highly interconnected modules/colors ( Figure 1). Appendix B provides a detailed discussion of the previous steps and the rationale for the different choices that were made in this analysis. Among all possible pairwise correlations between the 181 genes of interest, two groups/modules of highly co-expressed gene products were formed (blue, 42; turquoise, 66), and the rest were unassigned (gray, 10). Genes that code for the subunits of the AMPK complex fell into different modules; four AMPK genes, Prkaa2, Prkab2, Prkag2 and Prkag3, in the blue module along with 38 of the genes involved in autophagy; and two AMPK genes, Prkab2 and Prkag1, in the turquoise module together with 63 of the autophagy genes (Table 2). In the following sections, we describe the significance and the interactions of the individual members of these modules.

Correlating the Detected Modules to the Stage of Differentiation
To establish the biological significance of these modules, we used the expression values of their individual members to calculate a representative summary-the first principal component (PC)-for each module. Then, we calculated the Pearson's correlation coefficient for the first PC with the stage of differentiation (undifferentiated, differentiating or maturating) of all 24 samples. Both modules showed a reasonable correlation with sample stages (>0.8 for the blue and >0.3 for the turquoise module) (Figure 2A). In other words, the expression values of the members of the blue module, and to a less extent the turquoise, capture a lot of the observed differences between the cells as they progress from a differentiation stage to the next.

Testing the Over-Representation of the Modules over the Differentiation Course
Again, we considered the expression values of the individual members of each module to calculate the fraction of the differentially expressed genes (DE) across differentiation stages. Both blue and turquoise modules had a significant fraction of their member genes (>0.5) either up or downregulated at the differentiating or maturating stage compared to the control undifferentiated cell stage ( Figure 2B). These fractions were significantly higher than the expected fractions of DE genes in randomly selected modules of the corresponding sizes. The calculated p-values were adjusted for multiple testing using the False Discovery Rate (FDR). Adjusted p-values less than 0.1 were considered significant.

Visualizing Modules and Identifying Novel AMPK-Autophagy Interactions
To visually explore the detected modules, we treated each of their members as a node in a network graph. Nodes were divided into two networks based on the module to which they belong. Each pair of nodes was connected by an edge that has a weight calculated from the TOM similarity measure between the corresponding pair of genes. Edges with weights less than a minimum threshold (0.1) were excluded to obtain a less condense network ( Figure 3). Evidently, some nodes did not share edges that passes this threshold and were not included in the network graph. In addition, nodes were labeled with the corresponding official gene symbol and colored as AMPK or autophagy genes; and the edges were colored by the novelty of the connection. The latter was determined mainly based on previous reports in the STRING database (textmining evidence), which is evidence extracted from abstracts of scientific literature. By representing the modules in graphs, we were able to calculate different statistics to identify influential genes/nodes and important interactions/edges. Considering various centrality measures, we ranked the genes in each module by their influence on the modules (Table 3). Trp53inp2, Map1lc3a, Wadr45, Pink1 and Dapk1 genes were the most influential nodes in the graph of the blue module with a hub score more than 0.95, while Foxo1, Dcn and Xbp1 genes had the highest scores in the turquoise module. Edges between AMPK and autophagy nodes that were not previously reported in the STRING database (text-mining evidence) were considered novel potential interactions ( Table 4). The protein kinase AMP-activated non-catalytic subunit beta 1 (Prkab1) showed a potential interaction with several autophagy gene products including Bcl2-modifying factor (Bmf), death associated protein kinase 1 (Dapk1), Ras-associated protein Rab8a (Rab8a), SH3 domain, GRB2-like, endophilin B1 (Sh3glb1) and transformation related protein 53 inducible nuclear protein 2 (Trp53inp2) as part of the blue module. Similarly, the gamma subunit 1 (Prkag1) in the turquoise module revealed a novel binding ability to some well-known autophagy-related gene products such as Becn1, Fundc1, Lamp2 or Map1lc3b and also showed a novel interaction with some other gene products including calpain-like cysteine protease (Capn10), cyclin-dependent kinase inhibitor 2a (Cdkn2a) for p16INK4a and p14ARF, and ubiquitin-associated proteins (Trp53inp1, Nbr1, Usp33). In addition, there were some novel interactions between AMPK and autophagy gene products across the two modules, indicated as "inbetween" in Table 4.

Testing for Molecular Functions and Cellular Components Enrichment by the Detected Modules
We used a list-based enrichment to specify the contributions of the modules to the differentiation process. The mouse gene ontology Molecular Function (MF) and Cellular Components (CC) terms were submitted to an enrichment analysis by the gene members of the detected modules (42 for blue and 66 for turquoise). The significant terms (FDR < 0.1) are shown in Figure 4 stratified by the category (MF/CC) and the module (blue/turquoise). As expected, the two modules share a number of MF terms, namely; ubiquitin-like protein binding, ubiquitinyl hydrolase activity and phospholipid binding. At the same time, several terms had significance by mutually exclusive enrichment by the modules. This includes the nucleoside binding and the ubiquitin-like protein transferase activity by the blue module; and a few protein kinase terms by the turquoise module. Similarly, two of the CC terms; extrinsic component of membrane and outer membrane were enriched by both modules, while others related to only one of the two modules. Building on these two pieces of the analysis, the suggested novel AMPK interactions and the gene ontology enrichment by the members of the modules, we set out to specify the kind of functions that the AMPK-autophagy interactions were likely to be involved in. Table 5 shows how AMPK is functionally connected to autophagy gene products through several gene ontology terms such as membrane components, and regulation of kinases, enzymatic and ubiquitin activity. For examples, Prkab1 interacts with Sh3glb1, a Bax-interacting protein at the mitochondrial outer membrane and also with Trp53Inp2 for the uniquitin-like activity. In addition, Prkag1 associates with many autophagy gene products such as Usp33 for the uniquitin activity at the membrane anchoring junction, Cdkn2a for the regulation of kinase activity, and other cellular components for their molecular functions.

Preservation of AMPK-Autophagy Networks across Independent Datasets
Finally, we validated these findings in three independent datasets of similar MDI-induced 3T3-L1 cells at different time points or differentiation stages. Three GEO microarray datasets (GSE15018, GSE20696 and GSE69313) were used to perform this step of the analysis ( Table 1). The average log expression of the 181 genes of interest from the three datasets were first compared to these in the main dataset ( Figure 5). As expected, the averages are highly correlated between the datasets (>0.74), a pre-requisite for the following module preservation analysis. A moderate to high preservation of the modules was observed in the three independent datasets ( Figure 6). Generally, modules with a Z summary values between 5 and 10 are considered moderately preserved and these above 10 are considered highly preserved. In fact, the two modules: blue and turquoise, showed a summary statistics in that first category with at least 6 and 7, respectively, indicating that the interaction modules of the main dataset are well preserved in other independent datasets.  The GSE34150 dataset was used to detect the highly co-expressed modules among AMPK and autophagy genes (42, blue; 66, turquoise; 10, gray, unassigned; and 55, gold , randomly assigned). The detected modules were used as a reference to calculate several preservation statistics in three independent datasets of similar design (GSE15018, GSE20696 and GSE69313). Z summary statistics and sizes of four modules are shown as colored points.

Validation of Selected Gene Products Correlations with Prkab1 and Prkag1
We selected several autophagy gene products that are highly correlated with the AMPK subunits for experimental validation using RT-qPCR ( Figure 7A,B). The relative mRNA level of each group of genes were used to calculate the Pearson's correlation coefficients with two AMPK subunits: Prkab1 and Prkag1. Although the resulting coefficient may vary, these calculated earlier due to the different sensitivities between microarrays and RT-qPCR, the directions of the correlation were the same as ones that we observed in the dataset ( Figure 7C,D). In agreement with the suggested potential interaction of Prkab1 with Wipi1, Rab8a and Trp53inp2, strong correlations were validated. Prkag1 showed strong to moderate correlations with Becn1, Sirt2 and Trim21 as previously predicted by WGCNA.

Discussion
The adipocyte differentiation is a well regulated complex process. On one hand, this complexity allows for flexibility in response to different stimuli. For example, the over-expression of LC3 in 3T3-L1 pre-adipocytes produced a downstream activation of key regulators of adipogenesis and resulted in a differentiation pattern similar to that of the MDI induction [18]. On the other hand, this process likely involves a wide range of changes in transcription, translation and protein modification. In a previous study from our laboratory, we suggested that many autophagy genes were functionally associated with adipocyte differentiation using the RNA-Seq expression data [19]. We also showed that the mRNA level of key autophagy genes is specifically regulated at different time points, and clusters of these genes respond to the differentiation stimulus in a time-dependent manner. In particular, the subsets of organelle specific autophagy (e.g., mitophagy, reticulophagy, etc.) are highly regulated, suggesting a role in reorganizing the interacellulum and removing parts of the cell to adapt the morphological and metabolic changes of the mature adipocyte. Here, we explore the connection between autophagy and AMPK, which was established in conditions such as starvation, as it applies to the pre-adipocytes response to differentiation stimuli.
One aspect of this connection can be deduced from the observed positive correlation of the detected modules with the differentiation course (Figure 2A). In addition, the blue and the turquoise modules scores a significant (p-value < 0.001) protein-protein interaction (PPI) enrichment with an average clustering co-efficient of 0.4 and 0.5, respectively (STRING web interface). Although a detailed molecular link would be less clearer, AMPK gene products were evenly split among these modules, and had multiple edges with highly influential nodes (hubs) of well studied autophagy genes (Tables 3 and 4). Thus, AMPK gene products are part of biologically connected autophagy modules, which seem to be fairly consequential in the process of adipocyte differentiation.
The AMPK complex is formed of one catalytic subunit (α) and two non-catalytic regulatory subunits (β and γ), each has more than one isoform encoded by a separate gene [20]. The different subunits contribute to the stability and activity of the complex, whereas the combinations of the different isoforms give rise to complexes that behave differently and/or are specific to certain tissues [21,22]. Probes corresponding to the genes that code the different isoforms of the subunits were consistently expressed at different levels. Moreover, they showed varying correlations with the cell differentiation stage (data not shown). We considered the subunits and the isoforms of the AMPK complex individually. Prkab1 and Prkag1 were expressed at higher levels, and they, therefore, are the main AMPK side of the reported interaction with the autophagy pathway (Tables 4 and 5).
The adipocyte differentiation is characterized by events of increased lipogenesis and intracellular remodeling. AMPK is known for inhibiting the former and stimulating the latter. The absence of a reliable signal from the catalytic subunits of AMPK in our analysis may only enable a partial view. Nevertheless, we observe enrichment of ubiquitin activity and certain cellular component terms, probably akin to a form of localization, by autophagy genes co-expressed with the regulatory (β and γ) subunits of AMPK (Figure 4). One of these terms, ubiquitin-like protein binding, includes two gene products, Trp53inp2 and Nbr1. Both are known to help the formation of autophagosome and the selective removal of ubiquitinated proteins through binding to LC3 [23][24][25]. Few other terms related to the organelle membranes appear interesting. Perhaps, the localization of AMPK at certain intracellular locations mediates the autophagy selectivity, as suggested before [26]. This is consistent with the description of two emerging mechanisms of AMPK regulation, namely by ubiquitination and sub-cellular distribution [27].
Together, it is possible that AMPK-autophagy connection is determined by the energy supply and demand of the differentiating cells. It is more likely that the two pathways interact dexterously with some temporospatial agility. For example, AMPK activates autophagy early in the differentiation course in response to the differentiation stimulus. In a later stage, autophagy might remove ubiquitinated AMPK to allow the accelerated fat accumulation. Finally, the localization of AMPK to certain intracellular organelles could guide their recycling or removal by selective autophagy. The implications of these features do not escape us, both AMPK and autophagy are involved in disorders such as obesity and diabetes [28,29]. The manipulation of one pathway could affect the outcomes controlled by the other. In addition, the timely intervention during adipocyte differentiation could preferentially favor certain consequences of the pathways' interplay.
Nassiri and colleagues provided a system view of the adipogenesis and ranked the involved biological processes to identify the coordinated activity among them using the NASFinder method of publicly available omics data [30]. According to their analyses, the translational machinery, mitochondrial associated pathways, PPAR signaling, insulin and leptin signaling, and some membrane associated complexes are coordinately upregulated after adipocyte induction. Although they might be associated with AMPK or autophagy at the broad concept, they do not show any specific coordination between AMPK pathway and autophagy process. Here, we used the widely known WGCNA method to detect the conserved genetic networks of AMPK-autophagy gene products that might contribute to the process of differentiation [31,32]. Typically, it needs to input the list of differentially expressed genes among three or more experimental conditions [12]. In this study, we limited the analysis to the probes that mapped uniquely to AMPK and autophagy genes as defined in their gene ontology terms. The downside of limiting the analysis to a predefined set of genes is that the prospective findings would be limited to the available annotation, potential loss of signals from probes that map to genes not in the predefined gene set and the inclusion of probes that map to genes that are not actively changing among the conditions. On the other hand, this approach allows for simplifying the analysis steps and the interpretation of the results. The detected networks are more likely to have biologically meaningful consequences since they are formed of nodes that are known for certain functions in their pathways/gene sets. In addition, this allows for including genes that are highly correlated even though they don't show the highest degree of differentiation among conditions. Certainly, some of these genes are involved in the biology of adipocyte differentiation either by maintaining essential cellular processes or they show subtle changes that wouldn't be typically picked by the differential expression approach.

Gene Ontology
The Gene Ontology (GO) terms AMP-activated protein kinase (AMPK) (GO:0004679) and autophagy (GO:0006914) were used to identify the gene products (14 and 167, respectively) with known functions in the corresponding biological processes [33]. Similarly, GO was used to identify the molecular function (MF) and cellular component (CC) terms containing these gene products. GO was accessed through the GO.db and the mouse organism package org.Mm.eg.db [34,35].

Microarrays Expression Data
To identify the relevant datasets, we queried the NCBI Gene Expression Omnibus (GEO) metadata by GEOmetadb [36]. The term '3T3-L1' was used to search the titles of all entries, the query results were then searched manually and datasets of similar induction time-course design were included. The expression and the annotation data were then obtained using a GEOquery [37]. Table 1 summaries the four datasets that were used in this analysis. GSE34150 consists of 24 samples of MDI-induced 3T3-L1 pre-adipocytes at eight different time points corresponding to three differentiation stages (0 day, undifferentiated; two and four days, differentiating; 6-18 days, maturating).

Protein-Protein Interactions
The STRING database was used to query all possible AMPK-autophagy protein-protein interactions that are reported with different evidence types [38]. The HUGO symbols of 181 genes were mapped to the ENSEMBL IDs before querying the database. STRINGdb was used to do the mapping, construct the query and obtain the results. The interactions were matched against the edges of the co-expression networks of the detected modules to label the edges with the type of evidence when they were previously reported.

Weighted-Gene Co-Expression Network Analysis
The package WGCNA was used to apply most of the necessary steps for weighted-gene co-expression network analysis on the GSE34150 dataset as described in the original publications [39]. Briefly, a co-expression measure (Pearson's correlation coefficient) was calculated between each pair of genes. The coefficients were raised to the power of 5 to form an adjacency matrix. The adjacency matrix was then used to calculate the topological overlap similarity matrix (TOM). To detect modules and assign genes to them, a dissimilarity matrix is obtained (1 − TOM) and used as distances between genes. A hierarchical clustering was then performed and a gene tree is built. Upon cutting the tree at a certain height, genes nearby are assigned to modules, referred to as colors (names are arbitrarily assigned). The detected modules were then used to find the correlation with the phenotype and the preservation in independent datasets. To correlate the modules to the sample phenotypes or to each other, an eigengen or the principal components (PC) were calculated from the expression of their respective members and used as a representative summary. Finally, a module preservation analysis was performed by calculating various summary statistics on the detected modules in the test datasets [40].

Network Visualization and Analysis
The igraph package was used to visualize and analyze the detected modules [41]. The genes of interest were treated as nodes in a network graph and were connected by an edge if its weight-calculated from the TOM similarity between each pair of genes-passed a minimum threshold. Several graph statistics were used to determine the importance/centrality of genes and their interactions.

Gene Modules Over-Representation
The limma package was used to test for the over-representation of the detected modules in the GSE34150 dataset [42]. An index of the modules as gene sets, the expression data and comparison matrix based on the differentiation stage were used as input. A gene set is considered over-represented when it has a significantly higher fraction of differentially expressed genes than a randomly selected module of the same size. The clusterProfiler package was used to apply a similar list-based enrichment of GO terms by the detected modules [43]. Tests were adjusted for multiple testing using the False Discovery Rate (FDR) and a cutoff (0.1) was applied.

Cell Culture and RT-qPCR
3T3-L1 pre-adipocytes were cultured and induced for differentiation using MDI protocol as described before [19]. Total RNA was collected at four different time points corresponding the the major differentiation stages of the adipocytes (−2 day, full confluence; 0 day, undifferentiated; 10 h differentiating; and −6 day, maturating). The list of the primers that were used in the reaction are provided in (Appendix A). The C t values from the RT-qPCR reaction were normalized by a reference gene 18S and calibrated by the confluent samples (∆∆C t ) using the pcr R package [44].

Software Environment and Reproducibility
The data were obtained, processed and analyzed in an R environment and using multiple Bioconductor packages [45,46]. The full analysis was done and reproduced in an isolated environment based on docker (bioconductor/release_base2) [47]. The scripts for reproducing the analysis, figures and tables are available at https://github.com/MahShaaban/aacna. The instructions for reproducing the analysis are described in Appendix C.

Conclusions
In summary, we used the WGCNA to investigate the interactions of AMPK and autophagy gene products in the context of adipocyte differentiation. Two co-expression networks were found to be highly correlated with the time course of differentiation. We were able to validate the case of these networks in other independent datasets of similar experimental designs. These networks appear to be consequential in the response of the pre-adipocyte to the differentiation stimulus. Finally, we present several novel potential interactions between AMPK and autophagy gene products and link them to potential functions and cellular sites.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Datasets and Annotations
This appendix contains details of the datasets and the gene annotation used in the study.

Appendix A.1. Time Point of Samples in the Microarrays Datasets
We queried the metadata of the Gene Expression omnibus (GEO) for microarrays datasets of MDI-induced 3T3-L1 pre-adipocytes at different time points that covers the various differentiation stages. The datasets were then manually checked for containing sufficient phenotype and annotation data. A few datasets were generated using custom microarrays chips of a few thousand probes and were excluded for not containing a sufficient number of the probes of interest. In Table A1, we listed four datasets included in this study and the time points (in hours) of their samples.

Appendix A.2. Comparing the Average Expression in Four Datasets
To ensure that the different datasets are exhibiting comparable probe expression, we compared the average expression of all common probes in three datasets (GSE15018, GSE20696 and GSE69313) with the main (GSE34150) dataset ( Figure A1). The accession number (GSE34150) was used to obtain the expression matrix and the metadata of the dataset. Several quality assessment measures were applied to ensure the suitability of the data for the downstream analysis ( Figure A2).

Appendix A.4. Confirming Differentiation and Lipogenesis
Several adipogenic and lipogenic of markers transcriptional changes are expected during the course of the 3T3-L1 cell differentiation. Figure A3 show the log expression level of some of these markers at different stages in the GSE34150 dataset.
GTGTGTCTAGACGACGAGAATG GACTTCTGAGGTAGGCTTCTTG Appendix A.6. Gene Ontology Annotation To define the sets of genes involved in the AMPK and autophagy pathways, we turned to the gene ontology (GO) annotations. GO identifies the AMP-dependent protein kinase activity (GO:0004679) as catalysis of the reaction: ATP + a protein = ADP + a phosphoprotein. In total, 14 genes were identified to be involved in the pathway and its regulation and are referred to as AMPK genes in the manuscript. Similarly, GO defines the term autophagy (GO:0006914) as the catabolic process in which the cells digest parts of their own cytoplasm. It contains 10 children/subcategory terms and 167 genes. The gene symbols of AMPK and autophagy gene sets are listed in Table A3.

Appendix B.2. Steps of Constructing the Weighed Co-Expression Networks
In this section, we discuss the different steps for calculating the similarity measures that was used in constructing the networks as well as comparing to the intermediary forms in representing the notation of co-expression. Particularly, the issue of penalizing the low correlation values. Three main steps are necessary: • The absolute values of either Pearson's (default) or Spearman's coefficient can be used to provide an initial similarity measure s ij between each pair of nodes (ij) as in: The similarity matrix is then transformed to and adjacency matrix by elevating it to a selected power β as in: This matrix is then used to calculate the connectivity/weight of each pair of nodes as follows: where l ij = ∑ u a iu a uj and k i = ∑ u a iu . These weights (also not shown) are finally used to calculate a dissimilarity measure for clustering and detecting the gene modules as in: d w ij = 1 − w ij . Figure A5A shows the cumulative distribution functions of the final TOM similarity measure and the two intermediaries; Pearson's correlation and the adjacency (after being raised to the power of 5). Figure A5. Gene similarity and node centrality measures. (A) the cumulative distribution function (CDF) of three correlation/similarity measures of AMPK and autophagy genes (n = 181) are shown as colored lines (green, Pearson's correlation coefficients; red, adjacency; and blue, TOM); (B) three centrality measures for all nodes in the two detected modules are as points. The degree centrality on the x-axis, the betweenness centrality on the y-axis and the hub score as the point size.

Appendix B.3. Node Centrality Measures
To determine the importance of each node/gene in the networks, we relied on multiple measures of centrality or node influence. These measures are calculated as follows: • Degree Centrality: the number of edges/connections shared by a node. The Degree of a node v is given by: where a v,j is the adjacency matrix of the network or the sum of the corresponding row.

•
Betweenness Centrality: The number of the occasion of a node falls on the shortest between two other nodes. The Betweenness of a node v is given by: or the fraction of the shortest paths between each pair of nodes (s, t) that passes through v.
• Hub Score/Eigenvector: a measure of the influence of a given node in a network. The Eigenvector of a node v is given by: or the sum of the scores x i of its neighboring nodes M(v) in the network G. Figure A5B shows the different centrality measures and the correlations between them. Specifically, we found that the degree centrality is highly correlated with the hub scores (Pearson's coefficient about 0.73) and less so with the betweenness centrality (Pearson's coefficient about 0.44).

Appendix B.4. Network Preservation
The authors of the WGCNA method suggests using composite preservation summaries to evaluate the evidence of the preservation of the detected modules in the test dataset/s as opposed to using individual statistics as they measure different aspect of the preservation. In the main text, we showed the Z summary for the preservation of the two detected modules in three test datasets. Here, we briefly expand on what this statistics composed of and show additional statistics supporting the preservation of the modules.
The Z summary statistics is given by: Each of the two Z summaries are further composed of several statistics, the details of which are provided in the references.
As opposed to the Z summary as evidence for module preservation, the median rank of the preservation is more robust to the module sizes. However, it is more informative in comparing preservation of modules relative to each other, as it is based on the ranks of the observed preservation statistics of the module. The lower the median rank of the module, the more preservation it exhibits in the test set. Figure A6 shows the median ranks/relative preservation of the modules with their sizes. Figure A6. Module preservation ranks across multiple MDI-induced 3T3-L1 microarrays datasets. The GSE34150 dataset was used to detect the highly co-expressed modules among AMPK and autophagy genes (42, blue; 66, turquoise; 10, gray, unassigned; and 55, gold, randomly assigned). The detected modules were used as a reference to calculate several preservation statistics in three independent datasets of similar design (GSE15018, GSE20696 and GSE69313). The median ranks of the preservation statistics and the sizes of four modules are shown as colored points.
• 01.analysis.R This script loads the required libraries, download the data and run all the steps of the analysis described in the manuscript, • figures/A sub-folder with a separate file for each graph in the manuscript, • tables/A sub-folder with a sepearte file for each table in the manuscript.
The following code clones the repository containing the source code: $ g i t c l o n e h t t p :// github . com/MahShaaban/aacna .