Integrative Analysis of Metabolomic and Transcriptomic Profiles Uncovers Biological Pathways of Feed Efficiency in Pigs

Feed efficiency (FE) is an economically important trait. Thus, reliable predictors would help to reduce the production cost and provide sustainability to the pig industry. We carried out metabolome-transcriptome integration analysis on 40 purebred Duroc and Landrace uncastrated male pigs to identify potential gene-metabolite interactions and explore the molecular mechanisms underlying FE. To this end, we applied untargeted metabolomics and RNA-seq approaches to the same animals. After data quality control, we used a linear model approach to integrate the data and find significant differently correlated gene-metabolite pairs separately for the breeds (Duroc and Landrace) and FE groups (low and high FE) followed by a pathway over-representation analysis. We identified 21 and 12 significant gene-metabolite pairs for each group. The valine-leucine-isoleucine biosynthesis/degradation and arginine-proline metabolism pathways were associated with unique metabolites. The unique genes obtained from significant metabolite-gene pairs were associated with sphingolipid catabolism, multicellular organismal process, cGMP, and purine metabolic processes. While some of the genes and metabolites identified were known for their association with FE, others are novel and provide new avenues for further research. Further validation of genes, metabolites, and gene-metabolite interactions in larger cohorts will elucidate the regulatory mechanisms and pathways underlying FE.


Introduction
Feed represents about 60-70% of total pork production costs in modern pig production. Thus, to decrease costs and increase profitability, it is pivotal to identify feed efficient (FE) animals [1]. However, due to the polygenic architecture of FE, individual pigs in a herd exhibit considerable variation in FE despite belonging to similar genetic background and environment [2]. Considering this variation, different methods have been proposed and widely used to measure the FE, including feed conversion ratio (FCR) and residual feed intake (RFI) [1,3]. FCR is the ratio of feed intake (FI) per unit body weight gain and is affected by many factors such as breed, sex, diet, and environmental conditions [4,5]. Pigs with low FCR are considered high FE and vice-versa. RFI estimates the difference between actual and expected FI predicted on production traits as average daily gain (ADG) [6]. Since FCR considers both FI and weight gain, and FCR is also one of the critical predictors of FE, it suggests that feed efficient pigs may possess different physiological-biochemical profiles compared to the inefficient ones [2].
Based on the advances in omics technologies, several approaches have been adopted to shed light on the genetic mechanisms underlying FE in pigs. Among these omics technologies, transcriptomics and metabolomics have provided tools to elucidate the molecular basis of FE. While transcriptomics allows us to have a transcriptional snapshot of genes underpinning the phenotype under investigation, metabolomics bridges the gap between genotype and phenotype. Recently, an increasing number of metabolomic studies have reported the role of metabolites in economically important traits [7], such as meat quality [8,9], pre-slaughter stress [10], and FE [3]. Likewise, several transcriptomic studies have pointed out candidate genes underpinning FE and other related-traits such as immune response, growth, and metabolism in pigs [11][12][13].
Recently, we have investigated RNA-seq data on the 41 Danish production pigs that underwent feed efficiency and performance testing trials to identify differentially expressed genes and gene networks and reported 13 genes as potential biomarkers for feed efficiency [14]. Despite the new insights into key genes and molecular mechanisms reported in these studies, these approaches rely solely on data from a single biological layer. It has been shown that the integration of transcriptomics datasets with genomic and other omics datasets (systems genomics) increases the power to detect causal and regulatory factors and molecular pathways underlying complex phenotypes or diseases in animals [15,16].
To gain further insights into biochemical aspects of complex traits, data integration analysis has emerged as a fruitful tool, unveiling potential biomarkers via integration of metabolomics and transcriptomics [17]. By the development of analytical technologies for data integration, the assessment of system-wide changes of transcript levels as surrogate measurements of metabolic rearrangements can be widely assessed. Metabolite-transcript co-responses using combined profiling can yield vital information on the complex biological regulation of the trait. Transcriptome-metabolome integration is a powerful combination as the metabolome is affected by the phenotypic measurements to which the global measures of transcriptome can be anchored [18]. Therefore, herein, we integrated data from metabolome-transcriptome approaches to unveil the unique gene-metabolite pairs. To this end, we adopted a two-step framework, as follows: (1) we first employed the numerical integration of gene-metabolite levels to identify gene-metabolite interaction pairs separately for the breeds (Duroc and Landrace) and FE groups (low and high FE) using IntLIM R-package; (2) next, a knowledge-based integration approach based on pathway over-representation analysis was used to reveal the underlying pathways in each group (breed-specific and FE-specific). To the best of our knowledge, this is the first study of its kind to ever combine high throughput metabolomics data with RNA sequence based gene expression data in pigs to unravel the missing links between genes and metabolites and to shed light on the molecular basis that characterizes the specific differences based on breed and feed efficiency.

Descriptive Statistics and Linear Model Analysis for genes and Metabolites
The data on 749 metabolites and 25,880 genes from 40 samples were generated using untargeted metabolomics and transcriptomic approaches, respectively. We utilized data of 405 annotated metabolites (see methodology) for further analysis. For the transcriptomic data generated on 25,880 genes, we analyzed the data for each of the two groups (breed-specific and FE-specific), as described in the methodology. The genes with a gene count <1 were removed, resulting in 20,233 genes for both the groups. The gene count data for each group (breed-specific and FE-specific) was normalized, and the linear model was fitted into the data as given in the methodology. The genes were also screened for their chromosomal information from the Ensembl Sus scrofa database. After normalization, removal of values < 0 and obtaining the gene chromosomal location information, 17,726 (breed-specific), and 17,697 (FE-specific) genes were retained in each group. As a quality control for IntLIM, we filtered out genes with the lowest 5% of the variation, which gave 16,839 genes (breed-specific) and 16,812 genes (FE-specific) that were subjected to IntLIM analysis. A schematic representation of the study design and analysis steps are given in Figure 1. We performed the PCA analysis ( Figure S1) on the filtered metabolome-transcriptome data, which included 405 metabolites, and 16,839 genes (breed-specific), and 16,812 genes (FE-specific). The results of PCA analysis for the metabolites-genes based on breed and FE groups are shown in Figure S1. analysis ( Figure S1) on the filtered metabolome-transcriptome data, which included 405 metabolites, and 16,839 genes (breed-specific), and 16,812 genes (FE-specific). The results of PCA analysis for the metabolites-genes based on breed and FE groups are shown in Figure S1.

Gene Metabolite Interaction of Breed-Specific and FE-Specific Groups
From the IntLIM analysis, we identified gene-metabolite pairs that have a strong association with respect to the breed (Duroc and Landrace) and FE (low and high FE) groups. For the breedspecific groups, all possible combinations of gene-metabolite pairs (6,819,795 model runs) were evaluated, using Duroc and Landrace as the breed-group. Based on this approach, we identified 21 gene-metabolite associations (false discovery rate-FDR adjusted p-value < 0.1 and correlation difference effect size > 0.1) ( Table 1). Clustering these pairs by the direction of association (positive and negative correlations) within each breed group revealed two major clusters (Figure 2a

Gene Metabolite Interaction of Breed-Specific and FE-Specific Groups
From the IntLIM analysis, we identified gene-metabolite pairs that have a strong association with respect to the breed (Duroc and Landrace) and FE (low and high FE) groups. For the breed-specific groups, all possible combinations of gene-metabolite pairs (6,819,795 model runs) were evaluated, using Duroc and Landrace as the breed-group. Based on this approach, we identified 21 gene-metabolite associations (false discovery rate-FDR adjusted p-value ≤ 0.1 and correlation difference effect size > 0.1) ( Table 1). Clustering these pairs by the direction of association (positive and negative correlations) within each breed group revealed two major clusters (Figure 2a Similarly, we used IntLIM for the FE-specific group and evaluated all possible combinations of gene-metabolite pairs (6,808,860 models), with low and high FE as a binary phenotype. With this approach, we identified 12 FE-specific gene-metabolite correlations (FDR adjusted interaction p-value ≤ 0.1, and a Spearman correlation difference > 0.1) involving eight unique gene and metabolites each ( Table 2). The heat map with gene-metabolite Spearman correlation for low and high FE group showed a clear separation between the two groups ( Figure 3a). The high FE-correlated cluster of 12 gene-metabolite pairs (eight unique genes and metabolites with high correlations in high-FE groups) were negatively correlated with the low-FE group. The two gene-metabolite pairs (ranked in descending order of absolute value Spearman correlation difference between high and low FE group) Similarly, we used IntLIM for the FE-specific group and evaluated all possible combinations of gene-metabolite pairs (6,808,860 models), with low and high FE as a binary phenotype. With this approach, we identified 12 FE-specific gene-metabolite correlations (FDR adjusted interaction p-value < 0.1, and a Spearman correlation difference > 0.1) involving eight unique gene and metabolites each ( Table 2). The heat map with gene-metabolite Spearman correlation for low and high FE group showed a clear separation between the two groups ( Figure 3a). The high FE-correlated cluster of 12 gene-metabolite pairs (eight unique genes and metabolites with high correlations in high-FE groups) were negatively correlated with the low-FE group. The two gene-metabolite pairs (ranked in descending order of absolute value Spearman correlation difference between high and low FE group) in high-FE correlated clusters were ENSSSCG00000025106 (THNSL2)-pyrocatechol ( Figure 3b

Pathway and Gene Ontology Over-Representation Analysis
We identified the pathways associated with the unique metabolites in each cluster identified from breed-specific (21) and FE-specific (12) interactions. The three unique metabolites from Duroc correlated/Landrace anti-correlated clusters were associated with arginine and proline metabolism (p-value = 0.02). Furthermore, the ten unique metabolites from Duroc anti-correlated/Landrace correlated cluster were associated with valine-leucine-isoleucine biosynthesis (p-value = 0.01) and valine-leucine-isoleucine degradation (p-value = 0.07) along with arginine and proline metabolism (pvalue = 0.07). The eight unique metabolites from high-FE correlated/low-FE anti-correlated cluster were associated with valine-leucine-isoleucine biosynthesis (p-value = 0.01) and degradation (p-value = 0.07). The pathways associated with the metabolites in breed-specific and FE-specific clusters for unique metabolites are given in Supplementary Table S1a.
Unique and mappable genes from each group (breed-specific-each cluster, and FE-specific) were screened by using GeneMania to generate a composite functional association network that

Pathway and Gene Ontology Over-Representation Analysis
We identified the pathways associated with the unique metabolites in each cluster identified from breed-specific (21) and FE-specific (12) interactions. The three unique metabolites from Duroc correlated/Landrace anti-correlated clusters were associated with arginine and proline metabolism (p-value = 0.02). Furthermore, the ten unique metabolites from Duroc anti-correlated/Landrace correlated cluster were associated with valine-leucine-isoleucine biosynthesis (p-value = 0.01) and valine-leucine-isoleucine degradation (p-value = 0.07) along with arginine and proline metabolism (p-value = 0.07). The eight unique metabolites from high-FE correlated/low-FE anti-correlated cluster were associated with valine-leucine-isoleucine biosynthesis (p-value = 0.01) and degradation (p-value = 0.07). The pathways associated with the metabolites in breed-specific and FE-specific clusters for unique metabolites are given in Supplementary Table S1a.
Unique and mappable genes from each group (breed-specific-each cluster, and FE-specific) were screened by using GeneMania to generate a composite functional association network that includes all the evidence of co-functionality. From the breed-specific group, unique genes (4 genes) from Duroc correlated/Landrace anti-correlated cluster (Table 1) mapped to 20 genes based on co-functionality from GeneMania (Table S1b). The gene-ontology enrichment analysis of the identified 24 genes (unique genes from Table 1 and co-functional genes from GeneMania) revealed enrichment of the regulation of hemopoiesis, response to thyroid hormone, and the sphingolipid catabolic process (Table S1c). These genes were enriched for the sphingolipid metabolism KEGG pathway (adjusted p-value corrected with Bonferroni step down ≤ 0.05) (Supplementary Table S1d). Unique and mappable genes (6 genes) identified from Duroc anti-correlated/Landrace correlated cluster (Table 1) were co-functional with 20 genes based on GeneMania (Table S1b). The gene ontology enrichment analysis of these 26 genes (unique genes from Table 1 and co-functional genes from GeneMania) revealed the ER to Golgi vesicle-mediated transport and membrane fusion (Table S1c) as an enriched biological process. Butanoate metabolism, alanine-aspartate-glutamate metabolism, and valine-leucine-isoleucine degradation were significantly enriched KEGG pathways from the Duroc anti-correlated/Landrace correlated cluster (Supplementary Table S1d). Similarly, from the FE-specific group, unique mapped genes (7 genes) from high-FE correlated/low-FE anti-correlated clusters ( Table 2) were co-functional with 20 genes identified from GeneMania (Table S1b). These genes were involved with the cGMP metabolic process, purine nucleotide biosynthesis, and phosphorus-oxygen lyase activity pathways (Table S1c). The top significant KEGG pathway enriched was the cGMP-PKG signaling pathway (Supplementary Table S1d).

Discussion
FE is an important quantitative trait, which quantifies the efficiency of nutrient conversion from the feed to a tissue that is of nutritional and economic significance [19]. Understanding the molecular mechanisms underlying FE will be advantageous in the efficient selection of pigs and benefit the pig producers. In the Danish pig industry, Duroc is the most popularly used terminal sires in combination with crossbred Landrace X Yorkshire breeds [20], so the selection pressures for FE in Duroc is higher as compared to Landrace. Thus, in the current study, we attempted to identify the gene-metabolite interactions specific to each breed. FE is a complex trait influenced by environmental and health factors and involves many organs. Skeletal muscle, being the largest organ in the body, is an essential location for the metabolism of carbohydrates and lipids [21,22]. It plays a significant role in the utilization and storage of energy acquired from the feed. Thus, understanding the difference in the regulatory processes from a divergent FE group will add a layer of knowledge to the understanding of biological mechanisms involved with this complex trait.
A plethora of metabolome and transcriptome studies for FE in pigs are reported [3,9,10,12]. However, to the best of our knowledge, markers from the integration of metabolome and transcriptome in Duroc and Landrace pigs has not been done before. Herein, we unveiled the gene-metabolite relationships that are phenotype dependent. This approach highlighted molecular functions and pathways that are strongly evidenced by the integration study. Evaluating phenotype-specific relationships between metabolites and genes assists us to elucidate novel co-regulation patterns that would not be identified by single approaches. In the current study, we integrated untargeted metabolomic and transcriptomic data. We used a numerical data integration approach that employed the integration of a linear model (IntLIM package) to predict metabolite levels from gene-expression in a phenotype dependent manner [23].
We attempted to identify the breed-specific and FE-specific gene-metabolite pairs in the current study. The PCA analysis showed a difference in the expression of genes in Duroc and Landrace. However, PCA for the FE group did not exhibit significant clusters between groups, which may be due to the small sample size evaluated here. With our current metabolome-transcriptome analysis, we identified 21 gene-metabolite breed-specific pairs and 12 gene-metabolite FE-specific pairs.

Breed-Specific Pathway Analysis
In the breed-specific analysis, two clusters were identified between Duroc and Landrace breeds. The Duroc correlated/Landrace anti-correlated cluster associated L-glutamic acid 5-phosphate metabolite with the FAM160A2, ENSSSCG00000040110, ETS2, and SGPL1 genes; cystathionine ketimine metabolite with ENSSSCG00000040110 and SGPL1 genes and Rhodamine B with the SNRPN genes. The arginine and proline metabolism pathways were associated with the unique metabolites from this cluster. The gene ontology enrichment analysis of the unique genes identified from this cluster with the co-functional genes found enrichment for the multicellular organismal process, sphingolipid catabolic process, regulation of hemostasis, and coagulation pathways. Sphingolipid metabolism associated with the SGPL1 and GBA genes was identified as the top KEGG pathway. SGPL1 (sphingosine-1-phosphate) catalyzes the final step of the sphingolipid pathway by irreversibly converting sphingosine-1-phosphate (S1P) to its by-products [24], thereby regulating S1P. S1P plays a role as a muscle trophic factor by activating quiescent muscle stem cells (satellite cells) for efficient skeletal muscle repair and regeneration [25]. The role of FE on skeletal muscle mass was well established from previous studies wherein the improvement of muscle properties and an increase in muscle mass is attributed by selection for low RFI in pigs [26]. S1P is also reported to trigger glutamate secretion and potentiates depolarization-evoked glutamate secretion [27]. Glutamic acid has been found to play a crucial role in FE as it improves the FE of weaned piglets [28]. This supports the results in the current study where SGPL1 was associated with L-glutamic acid 5-phosphate as a significant gene-metabolite pair. A brief overview of the role of SGPL1 in sphingolipid-metabolism regulating FE is given in Figure 4. Therefore, further investigation of SGPL1-L-glutamic acid 5-phosphate gene-metabolite pair, which was positively correlated with Duroc while negatively correlated with Landrace concerning FE traits, could be a major avenue for breed-specific research and its effect on FE and meat quality traits.
Metabolites 2020, 10, 275 9 of 19 cells) for efficient skeletal muscle repair and regeneration [25]. The role of FE on skeletal muscle mass was well established from previous studies wherein the improvement of muscle properties and an increase in muscle mass is attributed by selection for low RFI in pigs [26]. S1P is also reported to trigger glutamate secretion and potentiates depolarization-evoked glutamate secretion [27]. Glutamic acid has been found to play a crucial role in FE as it improves the FE of weaned piglets [28]. This supports the results in the current study where SGPL1 was associated with L-glutamic acid 5phosphate as a significant gene-metabolite pair. A brief overview of the role of SGPL1 in sphingolipid-metabolism regulating FE is given in Figure 4. Therefore, further investigation of SGPL1-L-glutamic acid 5-phosphate gene-metabolite pair, which was positively correlated with Duroc while negatively correlated with Landrace concerning FE traits, could be a major avenue for breed-specific research and its effect on FE and meat quality traits. We also identified the SNRPN gene and Rhodamine B relation in the Duroc correlated cluster. A previous study showed the SNRPN gene (small nuclear ribonucleoprotein polypeptide N) was ubiquitously expressed in pigs [29]. Small nuclear ribonucleoproteins and heterogeneous small nuclear riboproteins play roles in nucleolar ribosomal RNA (rRNA) and messenger RNA (mRNA) synthesis in conjunction with spliceosome activity responsible for cleaving on introns from the pre mRNA molecule [30]. Furthermore, in a study of FE in broiler chickens, a high FE phenotype exhibited enrichment of ribosome assembly including small nuclear ribonucleoprotein, as well as nuclear transport and protein translation processes than low FE phenotype [31].
The Duroc anti-correlated/Landrace correlated cluster identified aloesol-ZC2HC1B, Proanthocyanidin a2-ZDHHC22; fenamiphos metabolite with ENSSSCG00000040467, We also identified the SNRPN gene and Rhodamine B relation in the Duroc correlated cluster. A previous study showed the SNRPN gene (small nuclear ribonucleoprotein polypeptide N) was ubiquitously expressed in pigs [29]. Small nuclear ribonucleoproteins and heterogeneous small nuclear riboproteins play roles in nucleolar ribosomal RNA (rRNA) and messenger RNA (mRNA) synthesis in conjunction with spliceosome activity responsible for cleaving on introns from the pre mRNA molecule [30]. Furthermore, in a study of FE in broiler chickens, a high FE phenotype exhibited enrichment of ribosome assembly including small nuclear ribonucleoprotein, as well as nuclear transport and protein translation processes than low FE phenotype [31].
The Duroc anti-correlated/Landrace correlated cluster identified aloesol-ZC2HC1B, Proanthocyanidin a2-ZDHHC22; fenamiphos metabolite with ENSSSCG00000040467, ENSSSCG00000018649 gene; ganoderenic acid e metabolite with ENSSSCG00000037595 and SEC22C genes; FAM163B gene with taraxacolide 1-o-b-d-glucopyranoside, theogallin, and ketoleucine metabolites; LRRTM2 gene with cystathionine ketimine and L-glutamic acid 5-phosphate metabolites and GLS2 gene with L-glutamic acid 5-phosphate, cystathionine ketimine, and paracetamol sulfate metabolites. One of the significant pathways in Landrace correlated cluster was valine, leucine, and isoleucine degradation which included the ABAT and ACADS genes (co-functional genes) identified in the current study. Valine, leucine, and isoleucine are branched-chain amino acids (BCAA) and have a crucial role in protein synthesis and energy production [32]. The degradation of BCAA can be glucogenic (valine), ketogenic (leucine and isoleucine), or both (isoleucine). The end products from their degradation, succinyl-CoA and/or acetyl-CoA can enter the tricarboxylic acid (TCA) cycle for energy generation and gluconeogenesis or may act as precursors for lipogenesis and ketone body production through acetyl-CoA and acetoacetate [33]. Glucose metabolism and the TCA pathway in the skeletal muscle is a key pathway regulating FE traits in pigs [34]. In an interesting proteomic study involving glucose metabolism and the TCA cycle reported earlier, the proteins catalyzing the conversion of glucose to pyruvate and oxaloacetate were up-regulated in high-FE pigs while those involved in the conversion of pyruvate to lactate or acetyl-CoA were down-regulated in high-FE pigs [34]. This resulted in inhibition of the TCA cycle in high-FE pigs due to the down-regulation of key catalytic proteins [34]. Thus, the pathway identified with BCAA in the current study may cause differences in FE concerning specific breed as evident from the indirect link with TCA and FE ( Figure 5). The BCAA also affects protein synthesis, as reported earlier in a study with reduced degradation of rat skeletal muscle proteins [35]. Additionally, leucine activates mTOR signaling, one of the central regulators of cell growth and metabolism along with an increase in fatty acid oxidation. With all these supporting facts of BCAA regulating cellular metabolism, protein, and fatty acid degradation, which are also key factors influencing FE, the role of the valine-leucine-isoleucine pathway in FE cannot be overlooked. Furthermore, this pathway has been found over-represented in a GWAS study for RFI in beef cattle [36] and pigs [3].
Metabolites 2020, 10, 275 10 of 19 [34]. This resulted in inhibition of the TCA cycle in high-FE pigs due to the down-regulation of key catalytic proteins [34]. Thus, the pathway identified with BCAA in the current study may cause differences in FE concerning specific breed as evident from the indirect link with TCA and FE ( Figure  5). The BCAA also affects protein synthesis, as reported earlier in a study with reduced degradation of rat skeletal muscle proteins [35]. Additionally, leucine activates mTOR signaling, one of the central regulators of cell growth and metabolism along with an increase in fatty acid oxidation. With all these supporting facts of BCAA regulating cellular metabolism, protein, and fatty acid degradation, which are also key factors influencing FE, the role of the valine-leucine-isoleucine pathway in FE cannot be overlooked. Furthermore, this pathway has been found over-represented in a GWAS study for RFI in beef cattle [36] and pigs [3]. We identified the alanine-aspartate-glutamate metabolism KEGG pathway involving the GLS2 and ABAT genes within the Duroc anti-correlated/Landrace correlated cluster. GLS2 is a mitochondrial glutaminase that catalyzes glutamine to glutamate which is further converted to aketoglutarate, an important substrate for the citric acid cycle to produce ATP in mitochondria. We identified the alanine-aspartate-glutamate metabolism KEGG pathway involving the GLS2 and ABAT genes within the Duroc anti-correlated/Landrace correlated cluster. GLS2 is a mitochondrial glutaminase that catalyzes glutamine to glutamate which is further converted to a-ketoglutarate, an important substrate for the citric acid cycle to produce ATP in mitochondria. Glutamate is also a precursor of reduced glutathione (GSH), an important antioxidant molecule, and a scavenger for ROS (Reactive oxygen species) [37]. ROS are regulated by FE related traits as reported from the previous studies, higher levels of ROS production and oxidized mitochondrial proteins have been found in the muscle of low FE chickens [38] while in pigs, ROS production in mitochondria was higher in semitendinosus muscle of less efficient pigs selected for high RFI compared to high efficient pigs (low RFI) [39]. GLS2 interacted with L-glutamic acid 5-phosphate and cystathionine ketimine metabolites as identified from Duroc anti-correlated/Landrace correlated clusters ( Figure 5).
The unique metabolites identified in the breed-specific clusters were also previously reported in another study for the metabolomic analysis of FE related traits in Duroc and Landrace [3]. The breed-specific unique metabolites such as aloesol and ketoleucine affected FE in Duroc [3]. In contrast, rhodamine B, taraxacolide 1-o-b-d-glucopyranoside, and ganoderenic acid were underlying testing daily gain (TDG) in Duroc [3]. Theogallin and ketoleucine were involved with TDG and daily gain (DG) in Duroc and Landrace and RFI in Duroc [3]. L-glutamic acid 5-phosphate, cystathionine ketimine, and paracetamol sulfate were associated with FE and RFI in Landrace [3]. It is worth highlighting the interaction of metabolites L-glutamic acid 5-phosphate and cystathionine ketimine identified in this study. While these metabolites interact with the SGPL1 gene as identified in the Duroc correlated cluster, on the contrary, they interact with the GLS2 gene in the Landrace correlated cluster. Both SGPL1 and GLS2 were in the top significant pathways in their respective cluster. Therefore, these gene-metabolite interactions which are highly specific to breed differences open up the avenues for further research to extrapolate differences in FE related traits concerning breeds.

cGMP-PKG Pathway Involved with FE-Specific Analysis
In the FE-specific analysis, we found 12 significant gene-metabolite pairs. The gene-metabolite pairs in high-FE correlated/low-FE anti-correlated cluster were TBXT gene with theogallin and ketoleucine metabolites; THNSL2 gene with pyrocatechol, 2-pyrocatechuic acid, ketoleucine, and theogallin metabolites; TUBAL3 gene with neodispyrin metabolite, RNF145 gene with ketoleucine metabolite, ENAM gene and U2 snRNA with proanthocyanidin a2 metabolite, ENSSSCG00000038441 gene with adrenochrome metabolite, and PRKG2 gene with levulinic acid metabolite. The pathway analysis with the unique metabolites identified from high-FE correlated/low-FE anti-correlated clusters was over-represented for valine-leucine-isoleucine biosynthesis and degradation pathway. The unique mapped genes and co-functional genes were enriched for the following biological processes: lyase activity, cGMP metabolic process, phosphorus-oxygen lyase activity, and cyclic purine nucleotide metabolic process. cGMP-PKG, purine metabolism, and renin secretion were the KEGG pathways identified in this cluster. The cGMP pathways were also identified in the studies reported earlier with FE related traits with pigs [40] and beef cattle [41]. The PRKG2 gene, one of the main predictors for cGMP pathways and also identified in this study, encodes the serine/threonine-protein, which binds to inhibits the activation of several receptor tyrosine kinases and is a regulator of the intestinal secretion, bone growth, and renin secretion (https://www.ncbi.nlm.nih.gov/gene/5593). PRKG2 encodes for CGKII (guanosine 3,5-cyclic monophosphate (cGMP)-dependent protein kinase II) and is abundantly expressed in intestinal epithelium. CGKII relays signaling through a membrane-associated, cGMP-producing enzyme, guanylyl cyclase (GC). The catalytic activity of this receptor-enzyme is triggered by two locally produced ligands, the peptides guanylin and uroguanylin [42]. The GC is activated by nitric oxide (NO) and catalyzes the conversion of intracellular guanosine-5 -triphosphate (GTP) to cyclic guanosine-3',5'-monophosphate (cGMP). This enzyme has two forms: a membrane protein and a soluble form with specific kinetic properties and tissue distributions. The soluble GC (sGC) form is a heterodimeric protein consisting of α (α 1 and α 2 ,) and β (β 1 and β 2 ) subunits encoded by distinct genes [43]. An alpha subunit of guanylyl cyclase, GUCY1A2 and a beta subunit GUCY1B3 was identified as co-functional genes in the current study and were involved with cGMP, phosphorus metabolic process, nitrogen metabolic process pathways as identified in the current study. Uroguanylin is a gastrointestinal hormone primarily involved in fluid and electrolyte handling. It has recently been reported that prouroguanylin, secreted postprandially, is converted to uroguanylin in the brain and activates the receptor guanylate cyclase-C (GC-C) to reduce food intake in mice [44]. Reduced FI is a characteristic feature for the selection of the pigs known for high FE [1]. The overview of the mechanism involving the cGMP-PKG pathway and its role in FE is given in Figure 6.
epithelium. CGKII relays signaling through a membrane-associated, cGMP-producing enzyme, guanylyl cyclase (GC). The catalytic activity of this receptor-enzyme is triggered by two locally produced ligands, the peptides guanylin and uroguanylin [42]. The GC is activated by nitric oxide (NO) and catalyzes the conversion of intracellular guanosine-5′-triphosphate (GTP) to cyclic guanosine-3',5'-monophosphate (cGMP). This enzyme has two forms: a membrane protein and a soluble form with specific kinetic properties and tissue distributions. The soluble GC (sGC) form is a heterodimeric protein consisting of α (α1 and α2,) and β (β1 and β2) subunits encoded by distinct genes [43]. An alpha subunit of guanylyl cyclase, GUCY1A2 and a beta subunit GUCY1B3 was identified as co-functional genes in the current study and were involved with cGMP, phosphorus metabolic process, nitrogen metabolic process pathways as identified in the current study. Uroguanylin is a gastrointestinal hormone primarily involved in fluid and electrolyte handling. It has recently been reported that prouroguanylin, secreted postprandially, is converted to uroguanylin in the brain and activates the receptor guanylate cyclase-C (GC-C) to reduce food intake in mice [44]. Reduced FI is a characteristic feature for the selection of the pigs known for high FE [1]. The overview of the mechanism involving the cGMP-PKG pathway and its role in FE is given in Figure 6.  A positive correlation between FI and plasma cholesterol levels is previously established in pigs [45]. RNF145, which was positively correlated in the high-FE cluster participates in key signaling pathways of cholesterol homeostasis [46]. RNF145 expression is induced in response to LXR activation and high-cholesterol diet feeding [46]. Transduction of RNF145 into mouse liver inhibits the expression of genes involved in cholesterol biosynthesis and reduces plasma cholesterol levels. On the other hand, its inactivation increases cholesterol levels both in the liver and plasma [46]. In this study, RNF145 was identified with ketoleucine as a significant gene-metabolite pair in this cluster ( Figure 6). Ketoleucine is an abnormal metabolite that arises from the incomplete breakdown of branched-chain amino acids (https://hmdb.ca/metabolites/HMDB0000695). Ketoleucine is regulated by branched-chain α-keto acid dehydrogenase. Studies reported that the branched-chain α-keto acid dehydrogenase catalyzes the irreversible oxidative decarboxylation of all three branched-chain keto acids (BCKA) derived from branched-chain amino acids (BCAA), i.e., α-ketoisocaproate (ketoleucine) [47]. They demonstrated changes in BCKA activity that showed a significant alteration in BCAA and protein metabolism during starvation in rats [47]. BCAA also plays a crucial role in FE by regulating energy homeostasis in addition to lipid and protein metabolism as reported in pigs [48].
Apart from RNF145, gene-metabolite interaction of ketoleucine with TBXT and THNSL2 was also identified in this study. THNSL2 was reported among the top 40 significantly differentially expressed genes of characterized proteins between high-and low-ADG steers from a liver transcriptome profiling of beef cattle [49]. This gene has also been associated with abdominal and visceral fat in humans based on GWAS [50]. An interaction of proanthocyanidin a2 with ENSSSCG00000019329 (U2 snRNA) was also identified with low but positive correlation with high-FE cluster while a negative correlation with the low-FE cluster. U2 spliceosomal snRNAs are the molecules found in the major spliceosomal machinery of all eukaryotic organisms and affect their gene expression [51]. U2 snRNA plays a central role in the splicing of mRNA precursors by regulating a dynamic set of RNA-RNA base-pairing interactions [52]. From the previously reported studies, the role of precursor mRNA in gene expression has been established as it removes the intronic sequence from immature RNA, leading to a production of mature mRNA that might differ in function [53]. Regulation of pre-mRNA splicing by nutrients modulates the carbohydrate and lipid metabolism [53]. U2 interacts with proanthocyanidin a2 in the current study. Proanthocyanidin a2 is an antioxidant and has a broad spectrum of biologic properties against oxidative stress [54]. Proanthocyanidin significantly increased the activity of antioxidant enzymes such as superoxide dismutase, glutathione peroxidase, and catalase [54]. The role of antioxidant activity with FE was reported earlier in beef cattle as low feed efficient steers had greater superoxide dismutase and glutathione peroxidase activity than the high feed efficient ones, potentially using a greater proportion of energy [55]. Thus, as evident from all these facts, the potential role of proanthocyanidin a2-U2 interaction in high-FE pigs identified in the current study might be interesting to explore.
The gene-metabolite pairs identified in the present study over-represented some pathways that have been reported to have a role in FE related traits. Some of the genes identified are novel and were not included in the pathway analysis. Since these gene-metabolite pairs selected have a highly significant correlation with respect to each study group, a detailed study of these genes and metabolites are needed to better understand their role in FE related-traits. Further studies on the identified gene-metabolite pairs may assist in the discovery of biomarkers as these significant pairs identified directly reflect the phenotype as revealed by the candidate gene-expression with the downstream metabolite differences in pigs with low and high FE groups.

Data Resource and Phenotype Generation
The pigs used in this experiment were raised at the pig testing station "Bøgildgård" operated by SEGES within Landbrug and Fødevarer (L&F: Danish Agriculture and Food Council). Pigs were ad libitum fed and free water supplied. The authors of this study were not responsible for animal husbandry, diet, and care as the testing station is a facility within the Danish breeding program run by SEGES. The initial bodyweight of the pigs before the testing period was approximately 7 kg, followed by a 5-week acclimatization phase. For the phenotypic traits, we calculated FCR and RFI, as reported in our previous study [40]. We considered the same classification of animals in this study as efficient and inefficient (low and high FCR, respectively), as previously reported [40]. The classification was done by selecting pigs that were one standard deviation above or below the mean FCR for each breed as previously reported.

Gene Expression Profile, Metabolite Profile, and Data Analyses
For transcriptome analysis, we collected psoas major muscle from 40 purebred uncastrated male pigs from two breeds comprising of 12 Danbred Duroc and 28 Danbred Landrace. The tissue samples were preserved in RNAlater (Ambion, Austin, TX, USA) immediately post-slaughter and stored at −25 • C until subsequent analysis. Total RNA isolation and sequencing were carried out by BGI Genomics. Paired-end sequencing (100 bp) was performed on the BGISEQ-500 platform after Oligo dT library preparation. Read quality control, mapping, and gene counts were reported elsewhere [14]. Lowly expressed genes were filtered out, and the gene counts normalization was carried out by applying the variance stabilizing transformation (VST) function from DeSeq2 [56].
To identify significant gene-metabolite pairs, we analyzed the data considering two approaches, i.e., breed-specific (Duroc-Landrace) and FE-specific (low-high FE groups). Thus, we fitted a linear model for adjusting the read counts with the covariates using the Limma R package [57]. For adjusting the data to identify breed-specific differences, we adopted the following model: where y ijkl : is the normalized read counts; µ: is the intercept; F i : is the fixed effect of the FE group (two levels, high and low); R j : is the covariate for the RIN values; A k : is the covariate for the animal's slaughter age, in days; P l : is the fixed effect of the pen (8 levels); ε ijkl : is the random residual effect associated with each observation.
To identify differences between high and low FE groups, the breed effect was added in the linear model, as follows: where y ijkl : is the normalized read counts; µ: is the intercept; B i : is the fixed effect of the breed (two levels, Duroc and Landrace); R j : is the covariate for the RIN values; A k : is the covariate for the animal's slaughter age, in days; P l : is the fixed effect of the pen (eight levels); ε ijkl : is the random residual effect associated with each observation.
Regarding the identification of the metabolites, we used an untargeted metabolomic approach, as reported elsewhere [3]. In summary, 5 mL of blood samples at two-time points were collected from jugular venipuncture of each non-fasted pig into the EDTA tubes and immediately placed on ice. The plasma samples extracted from 109 animals (59 Duroc and 50 Landrace) were subjected to metabolomics analysis, as described in a previous study [3]. The metabolite data from this study were accessed using MetaboLights accession ID MTBLS1384 with a link: https://www.ebi.ac.uk/metabolights/ MTBLS1384. Due to the need for paired data to carry the integrative analysis, only those samples with both metabolite and RNA-Seq data were used herein. The metabolite data from time-point two in 40 pigs were log-normalized before fitting into a linear model. Only those with the relative standard deviation > 0.15 were used based on the raw counts. As proposed for the RNA-seq, we adjusted the log-normalized metabolite data considering both approaches. First, the following model was employed for the breed-specific analysis: where m ijkl : is the is the log-normalized concentration of each metabolite (n = 749); µ: is the intercept; F i : is the fixed effect of the FE group (two levels, high and low); D j : is the fixed effect of the batch for metabolomic analysis (two levels); A k : is the covariate for the sampling age, in days; P l : is the fixed effect of the pen (8 levels); ε ijkl : is the random residual effect associated with each observation.
For the FE-specific group approach, we fitted the data as follows: where m ijkl : is the is the log-normalized concentration of each metabolite (n = 749); µ: is the intercept; B i : is the fixed effect of the breed (two levels, Duroc and Landrace); D j : is the fixed effect of the batch for metabolomic analysis (two levels); A k : is the covariate for the sampling age, in days; P l : is the fixed effect of the pen (8 levels); ε ijkl : is the random residual effect associated with each observation. The metabolites were annotated with HMDB (Human metabolome database) based on library search of the masses in HMDB with a mass uncertainty of 0.005 Da or 5 ppm. Those metabolites that did not correspond to HMDB entries were left unannotated and removed from the analysis.

Integration of Transcriptomic and Metabolomic Data Based on the Linear Model
To uncover the complex relationship between metabolites and genes, we adopted a linear model framework using the IntLIM (Integration of Linear model) R-package (version 0.1.0) (https://github.com/ mathelab/IntLIM) [23]. The IntLIM approach allows us to integrate the metabolomic-transcriptomic data considering a case-control design. Thus, as initially proposed, we compared the breeds (Duroc vs. Landrace) and the FE groups (low and high FCR animals). As a quality control step from IntLIM, we filtered out genes with the lowest 5% of the variation. Gene and metabolite exploratory analyses were performed by applying Principal Component Analysis to identify breed-and FE-specific clusters.
The linear model for data integration is given as described in the following equation: where m: is the log-normalized metabolite abundance; β 1 : is the intercept; β 2 g: is the normalized and adjusted gene expression level; β 3 p: is the phenotype (FE group-high and low FE; or breed-Duroc and Landrace); β 4 (g : p) : is the interaction between gene expression and phenotype; ε: is the residual effect associated with each observation (ε = N(0, σ)).
A statistically significant two-tailed p-value of the gene-phenotype (g-p) interaction indicates the difference in the phenotype of FE groups (high and low) or breed (Duroc and Landrace) calculated by the slope relating gene-expression and metabolite abundance [23]. The two-tailed p-value indicates that the slope relating gene-expression and metabolite abundance is different from one phenotype compared to the other. Thus, it was used to identify gene-metabolite associations that are specific to a particular phenotype (breed-Duroc and Landrace, FE-low and high). We calculated the absolute difference in the Spearman correlation identified from IntLIM between the FE and breed groups to find the significant (p < 10 −7 ) gene-metabolite pairs. The absolute difference between the FE group was estimated as (r Low_cor − r High_cor ) where (r High_cor ) is the correlation given for a gene-metabolite pair in high-FE group ( Table 2) while (r Low_cor ) is the correlation given for a gene-metabolite pair in the low-FE group ( Table 2). The absolute difference in the correlation between the breeds was estimated as (r Landrace_cor − r Duroc_cor ), where, (r Duroc_cor ) is the correlation for a given gene-metabolite pair in Duroc (Table 1) while (r Landrace_cor ) is the correlation for a gene-metabolite pair in Landrace (Table 1).

Pathway Over-Representation Analysis
In the breed-specific and FE-specific group, the same metabolite can be related to more than one gene and vice-versa. So, we screened for the common metabolites and genes in the group and referred them as unique metabolites and unique genes respectively in this study. We analyzed the unique metabolites in each group (breed-specific and FE-specific) using Metaboanalyst 4.0 (www.metaboanalyst.ca) [58]. We used three parameters for the pathway analysis: the pathway library, algorithm for pathway over-representation analysis, and algorithm for topological analysis. For the current study, we selected the Homo sapiens (KEGG) pathway library to estimate the importance of the compound in a given metabolic pathway. For pathway over-representation and topology analysis, we used the hypergeometric test and relative-betweenness centrality algorithm, respectively, to measure the connections with the other nodes, including the number of shortest paths going through the node of interest.
Regarding the unique mapped (with chromosomal location information) genes, we carried out a co-functionality analysis using GeneMANIA [59] (www.genemania.org). GeneMANIA considers our query list of unique genes identified in each cluster (breed-specific and FE-specific) and allows us to predict the co-functional genes underlying similar functions. Thus, we analyzed the unique genes in each cluster (breed-specific-Duroc correlated cluster, and Duroc anti-correlated cluster, FE-specific-High-FE correlated cluster) to identify the co-functional genes in each cluster. Next, we used the unique genes, as well as the co-functional genes in each cluster of each group, to identify the GO terms using GOrilla (Gene ontology enrichment analysis and visualization tool) (http://cbl-gorilla.cs.technion.ac.il/) [60]. To this end, the Homo sapiens were used as the reference, and the entire set of identified and annotated genes in this study (n = 15,187 genes) was used as a background. Over-representation KEGG pathway analysis with the unique and co-functional genes in each cluster was performed using ClueGO version 2.5.4 [61] to cluster redundant terms with a kappa score of 0.4 and S. scrofa annotation as the background. The pathways were selected after filtering for group p-value corrected with Bonferroni step down ≤ 0.05 and those with at-least two over-represented genes.

Conclusions
This study applied a novel approach for metabolome-transcriptome data integration using the linear model unveiling potential gene-metabolite pairs affecting the biological processes related to FE in pigs. To the best of our knowledge, this is the first study to report the gene-metabolite interaction mechanisms that may determine nutrient partitioning and energy utilization and hence affect FE in pigs. The approach followed here provided many interesting genes and metabolites with significant p-values. While some of the metabolites and genes identified were known with their association for FE, others are novel and provide new avenues for further research. The unique metabolites were associated with valine-leucine-isoleucine biosynthesis/degradation and arginine-proline metabolism. The unique genes enriched for sphingolipid metabolism, valine-leucine-isoleucine degradation, alanine-aspartate-glutamate pathway (breed-specific), and cGMP-PKG signaling pathway (FE-specific). Further validation of genes, metabolites, and gene-metabolite interactions in a cohort with more animals with additional features such as alteration in dietary components, farm variations, and other environmental effects would help to establish a framework for future FE prediction using metabolomics biomarker profiles that could be practical to use in large populations other than genomic profiling. More data would also make it possible to model the complex relations in gene-metabolite profiles over time more accurately and will help to elucidate the regulatory mechanisms affecting the pathways underlying FE.
Supplementary Materials: The following are available online at http://www.mdpi.com/2218-1989/10/7/275/s1, Figure S1: The principal component analysis of metabolites and genes. A and B: PCA plot of genes and metabolites, respectively, in (1) Duroc and (2) Landrace; C and D: PCA plot of genes and metabolites, respectively, in high and low feed efficient groups, Table S1: Pathways enriched by unique gene-metabolite pairs. Spreadsheet tabs are divided into (a) metabolite enrichment analysis results by Metaboanalyst; (b) list of the co-functional genes by GeneMania; (c) gene ontology pathways enriched by Gorilla; (d) KEGG pathways enriched by ClueGO.