Multi-Omics Integration and Network Analysis Reveal Potential Hub Genes and Genetic Mechanisms Regulating Bovine Mastitis

Mastitis, inflammation of the mammary gland, is the most prevalent disease in dairy cattle that has a potential impact on profitability and animal welfare. Specifically designed multi-omics studies can be used to prioritize candidate genes and identify biomarkers and the molecular mechanisms underlying mastitis in dairy cattle. Hence, the present study aimed to explore the genetic basis of bovine mastitis by integrating microarray and RNA-Seq data containing healthy and mastitic samples in comparative transcriptome analysis with the results of published genome-wide association studies (GWAS) using a literature mining approach. The integration of different information sources resulted in the identification of 33 common and relevant genes associated with bovine mastitis. Among these, seven genes—CXCR1, HCK, IL1RN, MMP9, S100A9, GRO1, and SOCS3—were identified as the hub genes (highly connected genes) for mastitis susceptibility and resistance, and were subjected to protein-protein interaction (PPI) network and gene regulatory network construction. Gene ontology annotation and enrichment analysis revealed 23, 7, and 4 GO terms related to mastitis in the biological process, molecular function, and cellular component categories, respectively. Moreover, the main metabolic-signalling pathways responsible for the regulation of immune or inflammatory responses were significantly enriched in cytokine–cytokine-receptor interaction, the IL-17 signaling pathway, viral protein interaction with cytokines and cytokine receptors, and the chemokine signaling pathway. Consequently, the identification of these genes, pathways, and their respective functions could contribute to a better understanding of the genetics and mechanisms regulating mastitis and can be considered a starting point for future studies on bovine mastitis.


Introduction
Over the last decade, advances in high-throughput genotyping and sequencing technologies [1], along with progress in developing computational methods [2], have led to a revolution towards a better understanding of the genetic architecture underlying complex traits and diseases, with exceptional depth. To date, several studies have focused on integrating different information sources ("omics" datasets) to create robust insights into complex molecular functional mechanisms by reinforcing complementary evidence from multiple levels [3][4][5]. In this regard, the results of different types of multi-layer studies have been reported, ranging from simple combinations (two different kinds of -omics data) to more comprehensive and computationally demanding ones (multiple kinds of -omics data). Incorporating two layers under a systems biology framework can involve approaches that integrate genomics and transcriptomics [6][7][8], metabolomics and transcriptomics [9][10][11], proteomics and transcriptomics [12,13], and proteomics and metabolomics analyses [14,15] to functionally characterize the interactions at the molecular level for traits of interest in humans and in livestock species.
Bovine mastitis is a common and costly disease, which has a considerable effect on the profitability of the production system, owing to its negative impacts on milk yield, quality, and reproductive performance; early culling; animal welfare issues; and the cost of treatment [16][17][18][19]. The inflammation of the mammary gland occurs in response to infection with pathogenic microorganisms or physiological and metabolic changes [20,21]. Although the heritability of mastitis is low [22], and genetic correlations between mastitis and production traits are unfavorable [23,24], genetic improvement in terms of mastitis resistance is a major breeding goal. It is also known that mastitis is highly genetically correlated with somatic cell count (SCC), which consists of macrophages, lymphocytes, and epithelial cells, and consequently, this can be used as an important indicator of udder health [25]. Furthermore, selection for correlated traits, such as reduced SCC (indicating increased mastitis resistance) could be an interesting alternative, allowing scientists to infer and comprehend the genetic and molecular mechanisms underlying these traits. In other words, the discovery of genomic regions, disease-causing genes, and biomarkers associated with mastitis is of essential importance in improving the diagnosis and treatment of the disease. In the literature, numerous studies have been carried out to identify functional candidate genes associated with mastitis based on genome-wide association studies (GWAS) [26][27][28][29][30] and transcriptome studies [6,16,31,32]. On the other hand, concordance among these studies is low, indicating difficulties in identifying reliable candidate genes for mastitis. New approaches integrating GWAS results with additional sources of information can overcome this challenge. Hence, it is worth investigating the molecular regulatory mechanisms through which mastitis can be developed. Therefore, the objective of this study was to use the integration of previously published RNA-Seq and microarray data with GWAS results to identify and prioritize potential hub genes and create reconstructions of the protein-protein interaction (PPI) and gene regulatory networks, as well as modeling of the three-dimensional hub protein structure involved in pathological processes related to mastitis in dairy cattle.

Materials and Methods
The overall workflow for the data collection and the analysis of relevant genes related to mastitis in dairy cattle is presented in Figure 1.

Data Collection
Collection and evaluation of the available data is the first step in better understanding the reconstruction of molecular networks and the biological basis in terms of the identification of candidate genes, gene regulation, interactions, protein-protein interaction (PPI), and metabolic signaling networks. In this study, the microarray and RNA-sequencing (RNA-Seq) datasets, available in the public repository of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO), were retrieved for samples of mastitis and healthy Bos taurus species. The accession numbers for the RNA-Seq and microarray datasets are shown in Table 1. Six Holstein cows from first to third lactations and days in milk (DIM) ranging from 7 to 236 were included in the GSE131607 dataset. All cows were kept in freestall housing at the University of California-Davis, fed a total mixed ration (TMR), and given ad libitum access to water. Two different samples were taken from each cow after diagnosis using the California Mastitis Test, one sample from the mastitic quarter (n = 6), and the other sample taken diagonally across from the mastitic quarter, which was confirmed as the healthy quarter (n = 6), based on having a somatic cell count (SCC) less than 100,000 cells/mL milk [16]. The GSE15020 and GSE15022 datasets were related to microarray analysis from a study by Mitterhuemer et al. [31]. Fifteen healthy German Holstein Frisian cows in mid-lactation (3 to 6 months postpartum) were included in the study. Quarter milk samples were collected and tested weekly before the trial to ensure that they contained <50,000 somatic cells/mL and were free of mastitis pathogens. The animals were inoculated in one quarter with E. coli and slaughtered after 6 h (n = 5) or 24 h (n = 5) in two different infection methods. Five cows, considered as controls, received no treatment and were slaughtered after 24 h. In total, 89 healthy and 75 diseased German Holstein cows were tested for the GSE93082 dataset. Diseases were distinguished by either systemic (extra-mammary) occurrence or those affecting the mammary gland (mastitis) to account for influences on the milk composition from local inflammatory processes. All cows were examined thoroughly by the dairy herd manager, trained staff, or a veterinarian. Healthy animals (2-4 years old, 1st to 3rd lactation, one animal 4th and one 8th lactation) which had no clinical signs of disease and no abnormalities in the udder or milk, with a reported somatic cell count less than 100,000 cells/mL were chosen Figure 1. Schematic of the workflow used to reconstruct the metabolic pathways of mastitis in dairy cattle. The main gene list was prepared from RNA-Seq and microarray datasets, and literature mining. The protein-protein interaction network (PPI), gene regulatory network (GRN), and interactive bipartite network of gene-miRNA interactions were reconstructed using Cytoscape.

Data Collection
Collection and evaluation of the available data is the first step in better understanding the reconstruction of molecular networks and the biological basis in terms of the identification of candidate genes, gene regulation, interactions, protein-protein interaction (PPI), and metabolic signaling networks. In this study, the microarray and RNA-sequencing (RNA-Seq) datasets, available in the public repository of the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO), were retrieved for samples of mastitis and healthy Bos taurus species. The accession numbers for the RNA-Seq and microarray datasets are shown in Table 1. Six Holstein cows from first to third lactations and days in milk (DIM) ranging from 7 to 236 were included in the GSE131607 dataset. All cows were kept in freestall housing at the University of California-Davis, fed a total mixed ration (TMR), and given ad libitum access to water. Two different samples were taken from each cow after diagnosis using the California Mastitis Test, one sample from the mastitic quarter (n = 6), and the other sample taken diagonally across from the mastitic quarter, which was confirmed as the healthy quarter (n = 6), based on having a somatic cell count (SCC) less than 100,000 cells/mL milk [16]. The GSE15020 and GSE15022 datasets were related to microarray analysis from a study by Mitterhuemer et al. [31]. Fifteen healthy German Holstein Frisian cows in mid-lactation (3 to 6 months postpartum) were included in the study. Quarter milk samples were collected and tested weekly before the trial to ensure that they contained <50,000 somatic cells/mL and were free of mastitis pathogens. The animals were inoculated in one quarter with E. coli and slaughtered after 6 h (n = 5) or 24 h (n = 5) in two different infection methods. Five cows, considered as controls, received no treatment and were slaughtered after 24 h. In total, 89 healthy and 75 diseased German Holstein cows were tested for the GSE93082 dataset. Diseases were distinguished by either systemic (extra-mammary) occurrence or those affecting the mammary gland (mastitis) to account for influences on the milk composition from local inflammatory processes. All cows were examined thoroughly by the dairy herd manager, trained staff, or a veterinarian. Healthy animals (2-4 years old, 1st to 3rd lactation, one animal 4th and one 8th lactation) which had no clinical signs of disease and no abnormalities in the udder or milk, with a reported somatic cell count less than 100,000 cells/mL were chosen as controls. Most of the control samples were taken during early lactation within 10 to 100 days postpartum.
Diseased animals were in the 1st to 8th lactation period from 10 to 220 days postpartum. The milk samples of control animals or cows with extra-mammary diseases were collected and tested from one quarter or a composite milk sample (equal volumes from all 4 quarters mixed) [32]. In the GSE75379 dataset, sixteen healthy primiparous Holstein cows were inoculated with live E. coli into one mammary quarter at four to six weeks after parturition. The cows were housed in straw-bedded tie-stalls, where they were individually fed and given free access to water. The animals were fed using a TMR based on corn silage, minerals, and vitamins for ad libitum intake twice daily. Daily feed intake and milk yield at each milking were recorded. Prior to the start of the study period, cows were considered healthy and free of mastitis-causing pathogens based on body temperature, white blood cell count (WBC), California Mastitis Test, glutaraldehyde test, SCC, and bacteriological examinations of milk samples. Control quarters were selected based on bacteriological tests, in which quarter foremilk SCCs were <181,000 cells/mL at 24 h post-intramammary infection (IMI). Biopsy specimens of healthy and diseased udder tissue were performed 24 h post-IMI in infected and non-infected (control) mammary quarters [33].

Differential Gene Expression Analysis
Microarray data were pre-processed and normalized using the Lumi package [34] and the GCRMA algorithm (GeneChip Robust Multi-array Averaging) method, implemented in the Affy package in R software, to remove the variance and to prepare the data for further analysis [35]. Gene expression analysis was performed in R/Bioconductor software to screen the significant differential expression genes (DEGs) according to the comparison of the test and control data using the packages Limma [36], GEOquary [37], Biobase [38], and Umap [39].
Concerning RNA-Seq data, the quality of the raw data was assessed using FastQC software (v0.11.9) [40]. Then, based on the results of the raw data quality control, the sequences were edited to remove the adapters, PCR primers, and low-quality reads using Trimmomatic software (v0.38.0) [41]. Alignment sequences and mapping of reads were conducted on the Bos taurus reference genome (http://ftp.ensembl.org/pub/release-103/fasta/ bos_taurus/dna/ (accessed on 20 September 2021)) using HISAT2 software (v2.1.0) [42]. For transcript quantification, featureCounts software (v2.0.1) was employed to measure the total raw counts of mapped reads [43]. DESeq2 software (v2.11.40.6) was applied for the measurement of final differences in gene expression [44]. In addition to DEGs, the identification of miRNAs was also performed in the RNA-Seq datasets, simultaneously. Finally, the threshold for statistical significance of the differential expression of each gene was obtained with the criteria of a |log fold-change (FC)| ≥ 2.0 and a false discovery rate (FDR) ≤ 0.05 in accession numbers related to microarray and RNA-Seq datasets. The gene lists from the differential expression related to microarray and RNA-Seq analysis were considered Gene Sets 1 and 2, respectively (Supplementary Materials 1 and 2).

Literature Mining to Discover Candidate Genes for Mastitis
Extensive literature surveys were performed to search for candidate genes using keywords related to bovine mastitis in the PubMed and Google Scholar databases without time limitation. GWAS studies were selected for further detailed review. Then, iHOP (iHOP literature server, http://www.ihop.net.org/ (accessed on 9 October 2021)) was used, which is a web-based tool that allows the exploration of a network of gene and protein interactions by directly navigating the pool of published scientific literature [45]. Finally, the candidate gene list extracted through literature mining was mentioned as Gene Set 3 (Supplementary Material 3).

Determination of Main Gene List
To identify the relevant candidate genes related to mastitis, 3 datasets (microarray, RNA-Seq, and GWAS) from the differential expression and GWAS analyses were integrated. Subsequently, genes that were common to the three gene sets were selected as the main gene list for further analysis. The number of shared DE genes between the 3 datasets was analyzed using the R package VennDiagram v1.6.18 [46].

Functional Enrichment and KEGG Pathway Analysis
Gene ontology (GO) and enrichment analysis were performed using the online programs DAVID [47] (Database for Annotation, Visualization, and Integrated Discovery), PANTHER [48] (Protein ANalysis THrough Evolutionary Relationships), GeneCards (www.genecards.org/ (accessed on 9 October 2021)), g:Profiler [49] (https://biit.cs.ut. ee/gprofiler/gost (accessed on 9 October 2021)), and the STRING database [50] (https: //string-db.org (accessed on 9 October 2021)), which are comprehensive web tools that help to explore the biological process (BP), molecular function (MF), and cellular component (CC) of the mined gene set. The pathway enrichment of the identified genes was provided in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology terms with FDR < 0.05 were considered significantly enriched for the identified genes.

Identification of miRNAs and Target Gene Prediction
The functional annotation of the expressed miRNAs consisted of the functional annotation of their potential target genes. The potentially targeted genes were predicted using miRBase [51] (https://www.mirbase.org/ (accessed on 9 October 2021)) and Targetscan [52]. The predicted target genes were selected and submitted to DAVID, KEGG, Reactome pathways, and the PANTHER database for the enrichment target genes of each miRNA.

Reconstruction of Omics Multi-Layers Networks
The miRNA-gene bipartite network was reconstructed based on the master gene list and the molecular interactions documented in related papers and in online interaction databases. Protein-protein interaction (PPI) data were abstracted from the Biomolecular Interaction Network Database (BIND), the Database of Interacting Proteins (DIP), the Biological General Repository for Interaction Datasets (BioGRID), and the Mammalian Protein-Protein Interactions Database (MIPS). Finally, PPI network analysis was performed using the STRING database to explore interactions between genes, specifically in Bos taurus species. Each miRNA and target gene was entered into the database and resulting interactions were imported into the networks using Cytoscape software v3.8.2. (National Institute of General Medical Sciences, Bethesda Softworks, Rockville, MD, USA) [53]. Genes and miRNAs in generated networks are represented as nodes and the interactions between these nodes as edges. Furthermore, the metabolic-signaling pathway enrichment of the PPI network was reconstructed using ClueGO v.2.5.5 [54].

Modeling of Three-Dimensional (3D) Structure of Hub Proteins
The SWISS-MODEL template-based approach [55] (https://www.swissmodel.expasy. org/interactive (accessed on 15 October 2021)) was used to predict the 3D structures of hub proteins using individual FASTA sequences and reference PDB files. The resulting PDB files are enclosed in Supplementary Material 4.

Transcriptome Analysis for Identifying Differentially Expressed Genes (DEGs)
To obtain better insights into the molecular mechanism and the genetic basis of mastitis, we investigated the pattern of transcriptome profiles of mastitis samples versus healthy samples in dairy cattle. The experimental data used for the study were obtained from the GEO database, consisting of microarray and RNA-Seq datasets, as presented in Table 1. The analysis of differentially expressed genes between mastitis and healthy cows was performed based on a fold change > ±2 and a false discovery rate (FDR) < 0.05. The results of the statistical analysis of the microarray datasets showed a total of 564 significant genes from three datasets as follows. The first dataset (GSE93082): 378 DE genes, the second dataset (GSE15020): 177 DE genes, and the third dataset (GSE15022): nine DE genes. Concerning the analysis of RNA-Seq, a total of 774 genes were differentially expressed in the mastitic versus healthy group comparison, of which 442 and 332 genes were detected from accession numbers GSE131607 and GSE75379, respectively. Gene counts and a detailed summary of the alignments for all GEO accession numbers are provided as Gene Sets 1 and 2 for microarray and RNA-Seq analyses, respectively, in Supplementary Materials 1 and 2.

Literature Mining and Identification of Main Gene List
Literature mining was carried out using the iHOP web tool to increase the study's accuracy and to obtain previous evidence for associations between the identified genes and mastitis in dairy cattle. Based on literature mining, 3217 candidate genes were considered as Gene Set 3 (Supplementary Material 3). The Venn diagram shows the number of genes that are unique or common among the three gene sets for mastitis ( Figure 2). Notably, there is a remarkably small overlap between the three datasets. Overall, 33 genes were common in the Gene Sets 1, 2, and 3 relating to microarray, RNA-Seq, and GWAS datasets, respectively, which were named as the main genes involved in mastitis and were considered for subsequent analysis ( Table 3). The results of differential expression analysis of 33 common genes based on the RNA-Seq datasets are presented in Table 3. When comparing healthy cows to mastitis cows, of 33 DE genes, nine were underexpressed in the mastitis cows and 24 genes were overexpressed in the mastitis cows, based on their fold-change values (fold change > ±2).
of genes that are unique or common among the three gene sets for mastitis ( Figure 2). Notably, there is a remarkably small overlap between the three datasets. Overall, 33 genes were common in the Gene Sets 1, 2, and 3 relating to microarray, RNA-Seq, and GWAS datasets, respectively, which were named as the main genes involved in mastitis and were considered for subsequent analysis ( Table 3). The results of differential expression analysis of 33 common genes based on the RNA-Seq datasets are presented in Table 3. When comparing healthy cows to mastitis cows, of 33 DE genes, nine were underexpressed in the mastitis cows and 24 genes were overexpressed in the mastitis cows, based on their fold-change values (fold change > ±2).     * Information on common differentially expressed genes between the mastitis and healthy samples in dairy cattle provided based on RNA-Seq datasets.

PPI Network and Identification of Hub Genes
Protein-protein interaction (PPI) networks for up-and downregulated genes were reconstructed with the STRING database, which indicated the physical connection between two or more protein molecules related to biochemical functions (Figure 4). Twentyone nodes with 45 connections (edges) were represented in the PPI network, as presented in Figure 4. Moreover, we considered hub genes based on their higher-degree connectivity values in the PPI network. A total of seven hub genes, including MMP9, HCK, GRO1, SOCS3, CXCR1, IL1RN, and S100A9, were identified, all of which were overexpressed genes. All the hub proteins identified are protein-coding genes. The functional enrichment analysis demonstrated that hub genes were involved in the majority of molecular functions and biological processes (Table 4).

PPI Network and Identification of Hub Genes
Protein-protein interaction (PPI) networks for up-and downregulated genes were reconstructed with the STRING database, which indicated the physical connection between two or more protein molecules related to biochemical functions (Figure 4). Twenty-one nodes with 45 connections (edges) were represented in the PPI network, as presented in Figure 4. Moreover, we considered hub genes based on their higher-degree connectivity values in the PPI network. A total of seven hub genes, including MMP9, HCK, GRO1, SOCS3, CXCR1, IL1RN, and S100A9, were identified, all of which were overexpressed genes. All the hub proteins identified are protein-coding genes. The functional enrichment analysis demonstrated that hub genes were involved in the majority of molecular functions and biological processes (Table 4).

Prediction of miRNA-Target Genes and Gene Regulatory Network Reconstruction
We also aimed to determine whether the expression of miRNAs was associated with that of the 33 DE genes in the mastitic and healthy cows. Among the DE miRNAs, btamir-222, bta-mir-27a, bta-mir-23a, and bta-mir-142 suppressed 11 of the identified DE genes as targets of the selected miRNAs. A target gene search using TargetScan demonstrated that bta-mir-142 has seven target genes, namely, RHPN2, LYST, SERPINE1, CD40, SOCS3, TRIB1, and SLC16A3, followed by bta-mir-23a having three target genes, SGK1, TNFAIP6, and TRIB1. In addition, bta-mir-27a suppressed the PSTPIP2 and VAV1 genes, and bta-mir-222 suppressed the SOCS3 gene. The TRIB1 and SOCS3 genes displayed the highest suppression by miRNAs. The identified target genes, associated with their miR-NAs, are visualized in Figure 5. For constructing the gene regulatory network, we compiled a list of DE genes and miRNAs (as nodes) involved in mastitis based on literature mining and PPI resources. Briefly, miRNA-gene bipartite networks are commonly represented in an undirected graph format, with nodes representing miRNAs or genes and edges corresponding to interactions (genes-genes and miRNAs-targeted genes). In this network, we identified 30 nodes (26 genes and four miRNAs), with 57 edges interacting with it ( Figure 5).

Prediction of miRNA-Target Genes and Gene Regulatory Network Reconstruction
We also aimed to determine whether the expression of miRNAs was associated with that of the 33 DE genes in the mastitic and healthy cows. Among the DE miRNAs, bta-mir-222, bta-mir-27a, bta-mir-23a, and bta-mir-142 suppressed 11 of the identified DE genes as targets of the selected miRNAs. A target gene search using TargetScan demonstrated that bta-mir-142 has seven target genes, namely, RHPN2, LYST, SERPINE1, CD40, SOCS3, TRIB1, and SLC16A3, followed by bta-mir-23a having three target genes, SGK1, TNFAIP6, and TRIB1. In addition, bta-mir-27a suppressed the PSTPIP2 and VAV1 genes, and btamir-222 suppressed the SOCS3 gene. The TRIB1 and SOCS3 genes displayed the highest suppression by miRNAs. The identified target genes, associated with their miRNAs, are visualized in Figure 5. For constructing the gene regulatory network, we compiled a list of DE genes and miRNAs (as nodes) involved in mastitis based on literature mining and PPI resources. Briefly, miRNA-gene bipartite networks are commonly represented in an undirected graph format, with nodes representing miRNAs or genes and edges corresponding to interactions (genes-genes and miRNAs-targeted genes). In this network, we identified 30 nodes (26 genes and four miRNAs), with 57 edges interacting with it ( Figure 5).

Figure 5.
Interactive bipartite network (gene-miRNA) which demonstrates the regulatory effect on mastitis in dairy cattle. In this network, the quadrilateral points represent genes, and the octagonal points represent miRNAs. Regarding miRNAs and target genes, the edges indicate the suppressive role of miRNAs. The edges also represent the gene-gene interactions. The green quadrilateral nodes represent the hub genes. The quadrilateral nodes that have purple around them are the genes showing the highest suppression by miRNAs.

Three-Dimensional Modeling of Hub Proteins
In the present study, we also modeled the 3-dimensional protein structure of the seven hub genes identified in the PPI network that had the most interaction with other genes involved in the network ( Figure 6). 3D modeling revealed that the predicted structures of these seven hub proteins were significantly different from each other. Four hub proteins (MMP9, HCK, CXCR1, and S100A9) had the greatest structural complexity compared to the three other proteins.

Three-Dimensional Modeling of Hub Proteins
In the present study, we also modeled the 3-dimensional protein structure of the seven hub genes identified in the PPI network that had the most interaction with other genes involved in the network ( Figure 6). 3D modeling revealed that the predicted structures of these seven hub proteins were significantly different from each other. Four hub proteins (MMP9, HCK, CXCR1, and S100A9) had the greatest structural complexity compared to the three other proteins.

Three-Dimensional Modeling of Hub Proteins
In the present study, we also modeled the 3-dimensional protein structure of the seven hub genes identified in the PPI network that had the most interaction with other genes involved in the network ( Figure 6). 3D modeling revealed that the predicted structures of these seven hub proteins were significantly different from each other. Four hub proteins (MMP9, HCK, CXCR1, and S100A9) had the greatest structural complexity compared to the three other proteins.

Discussion
Mastitis is a complex trait and is prominent among health-related traits in the cattle industry, exerting a severe impact on profitability and animal welfare. The identification of functional candidate genes and molecular mechanisms involved in mastitis is required, given the persistence of the disease on dairy farms. Moreover, understanding the interplay between molecular and cellular components, with each component interacting at different levels that are entangled in several biological pathways, is important. Hence, the present study provides a general framework to investigate and integrate different sources of transcriptome data and previous results from GWAS studies to identify the genetic basis and key pathways associated with bovine mastitis. Numerous studies have demonstrated that the integration of multiple layers of omics data is a powerful strategy to increase the efficiency and accuracy of candidate gene and biomarker discovery, detecting molecular and biochemical interactions, and the relationships between biological variables in different species [4,7,8,[56][57][58]. In this study, the integrative analysis of multiple datasets resulted in prioritizing 33 DE genes as the main gene list, of which nine genes were downregulated and 24 genes were upregulated in the mastitic cows compared with the healthy cows based on their FC values. A list of main detected genes related to mastitis is provided in Table 3, with their main functions (biological process, molecular function, and cellular component) listed in Table 4. Among them, CSN2, CSN3, CSN1S1, CSN1S2, RHPN2, LALBA, ACSS2, RHOU, and KRT7 genes were underexpressed in the mastitic cows, mostly located on chromosomes 6. The most important overexpressed genes in cows with mastitis were GRO1, CXCR1, SOCS3, S100A9, MMP9, HCK, and IL1RN, which were hub genes (highly connected genes) involved in mastitis in this study. The casein cluster is composed of four genes; β-casein (CSN2), κ-casein (CSN3), αs1-casein (CSN1S1), and αs2-casein (CSN1S2), which encode approximately 80% of the protein content of bovine milk [59], and the whey protein gene (LALBA) was downregulated in inflamed mammary glands. LALBA encodes α-lactalbumin and is essential for lactose synthesis, which plays an important role in milk production as an osmotic regulator of milk secretion [60]. A possible explanation for the lower expression levels of these genes could be that the protein content in mastitic milk would decrease due to an antagonistic genetic relationship between mastitis and protein yield [24]. In previous studies, it was also demonstrated that all five genes were observed in enhanced abundance in the mammary glands of lactating dairy cows [61], dairy sheep [62], and lactating dairy goats [63]. Interestingly, as presented in Table 4, these genes were found in a majority of enriched pathways, suggesting possible key regulatory roles for them. Other noteworthy genes (RHPN2, ACSS2, RHOU, and KRT7) showing lower expression in cows with mastitis have critical roles in biological pathways, cellular process, fatty acid synthesis, and metabolism. For instance, ACSS2 (acyl-CoA synthetase short-chain family member 2) is well known to affect mastitis resistance in dairy cows and plays a role in the activation of acetate for de novo fatty acid synthesis [64]. Similarly to our results, Chen et al. [65] reported lower expression for ACSS2 and RHPN2 genes in response to the intramammary infection caused by two different pathogens (Escherichia coli and Streptococcus uberis) in dairy cows. As presented in Table 3, in the mastitis cows, 24 genes were more highly expressed, of which seven genes were considered as hub genes involved in significantly enriched biological processes and KEGG pathways. Subsequently, the PPI networks and gene regulatory networks were constructed based on these hub genes, which showed significant connectivity and which could shed light on the post-transcriptional regulation of gene expression by the identified miRNAs. Furthermore, the functional enrichment analysis resulted in four significant KEGG pathways associated with mastitis, which comprised six hub genes, i.e., GRO1, CXCR1, S100A9, MMP9, HCK, and IL1RN, as presented in Figure 3. Among these genes, GRO1 and CXCR1 were observed in four and three pathways, respectively. GRO1 (melanoma growth stimulating activity, alpha) also known as CXCL1, is a protein-encoding gene and plays an important role in inflammation and immune defense due to the modulation of leukocyte infiltration [66], which has been previously proposed as a biomarker and therapeutic target in mastitis [67]. This gene is also involved in the metabolic pathways of cytokine-cytokine-receptor interaction, the IL-17 signaling pathway, the chemokine signaling pathway, and viral protein interaction with cytokines and cytokine receptors. In the case of the cytokine-cytokine-receptor interaction pathway, other genes, such as CD40, IL1RN, CXCR1, TNFRSF6B, and CCL19, were found to be involved in mastitis defense or immune response, as all these genes were upregulated in the mastitic cows based on their FC values. The significant role of cytokines in the immune response to infectious agents is well known because they are soluble extracellular proteins or glycoproteins that are crucial intercellular regulators and mobilizers of cells engaged in innate as well as adaptive inflammatory host defenses, cell growth, differentiation, cell death, and cell development and repair processes. It was previously reported that cytokines can participate in activation of the host defense mechanisms during mastitis [68,69]. The CXCR1 and CCL19 genes were also enriched in two other pathways of the chemokine signaling pathway and in viral protein interaction with cytokines and cytokine receptors. CXCR1 (chemokine (C-X-C motif) receptor 1), identified as a hub gene, is a protein-encoding gene for major pro-inflammatory cytokine receptors [70] that is introduced as a potential genetic marker for resistance to mastitis in dairy cows [71,72]. In our study, the gene CXCR1 was involved in seven GO terms-response to stimulus, immune response, cellular response to stimulus, neutrophil chemotaxis, biological regulation, cellular process, and cellular response to cytokine stimulus for biological processes (Table 4). In addition, earlier studies have reported that a non-synonymous mutation, c.365C > T, located in exon II of the CXCR1 gene is associated with susceptibility to mastitis in different breeds of cattle [73,74]. The viral protein interaction with cytokines and the cytokine receptor pathway is an immune system pathway which has a key role in the inflammatory responses to infection and may activate or inhibit cytokine signaling and possibly affect different aspects of immunity. Furthermore, the S100A8, S100A9, and MMP9 genes have been recognized as components of the IL-17 signaling pathway. This pathway plays crucial roles in both acute and chronic inflammatory responses. In fact, the interleukin 17 (IL-17) family, as a subset of cytokines, signals via their correspondent receptors and activates downstream pathways that include NF-kappaB, MAPKs, and C/EBPs to induce the expression of antimicrobial peptides, cytokines, and chemokines. S100A9 and MMP9, which were identified as hub genes and which were upregulated in the mastitic cows, play key roles in the regulation of immune response and inflammatory pathways [65,66].
SOCS3 (suppressor of cytokine signalling 3) was another hub gene identified with higher expression levels, which encodes an intracellular inhibitor of cytokine signaling and has a crucial role in the initial steps of the recognition of a pathogen-associated molecular pattern (PAMP) in the innate immune cells [75]. Furthermore, in the regulatory network, SOCS3 is suppressed by bta-mir-142 and bta-mir-222. These two miRNAs showed upregulation and their target gene, SOCS3, showed the lower expression than them in mastitic cows. We characterized bta-mir-222, bta-mir-27a, bta-mir-23a, and bta-mir-142 as the major miRNAs which play a prominent role in regulating this network of genes and these were upregulated in the mastitic cows. The gene regulatory network showed that the greatest target genes (seven genes) were suppressed by bta-mir-142. There is evidence that miRNAs play a critical role in the regulation of inflammation and immune function during infection with mastitis in dairy cattle [76][77][78][79].
Modeling of the 3D protein structure of hub proteins can be an invaluable aid in order to better understand the details of a particular protein because studies of protein structure and function are becoming a promising approach in the field of bioinformatics. Functional characterization of a protein is often facilitated by its 3D structure. Hence, it is necessary that a 3D structure is determined in examining the proteins' function at the molecular level. Sequence identity, as a measure of the expected accuracy of a model represented, >30% indicates a relatively good predictor of the model [80]. When sequence identity drops below 30%, the main problem becomes the identification of related templates and their alignment with the sequence to be modeled. Based on our results, among the seven hub proteins, HCK, CXCR1, S100A9, and MMP9 showed the highest structural complexity, with sequence identities of 94.6%, 75.4%, 69.9%, and 47.5%, respectively, whereas the proteins of SOCS3, IL1RN, and GRO1 had the lowest structural complexity with sequence identities of 92.6%, 80.1%, and 72.5%, respectively ( Figure 6). Consequently, these findings demonstrate the relevance of integrating results from transcriptomic and functional analyses for a better understanding of the function of important genes and molecular mechanisms responsible for mastitis development.

Conclusions
The integration of multi-omics data resulted in the identification of 33 common and relevant genes associated with bovine mastitis. Among these, seven genes (CXCR1, HCK, IL1RN, MMP9, S100A9, GRO1, and SOCS3) were identified as the hub genes and these can be explored as potential candidate genes for mastitis susceptibility and resistance. Functional annotation and enrichment analysis identified 23, 7, and 4 GO terms related to mastitis in the biological process, molecular function, and cellular component categories, respectively. We identified eight differentially expressed miRNAs, of which four suppressed 11 of the identified genes as their targets. Furthermore, the reconstruction of the regulatory network of genes associated with their miRNAs sheds light on the post-transcriptional regulation of this network. Therefore, this study provides a general framework to investigate and incorporate multiple layers of omics data from high-throughput technologies or available pathway annotation databases, which has led to the elucidation of molecular networks, the cellular and molecular-level features, and the genetic and biological basis of mastitis in dairy cattle.

Acknowledgments:
The authors wish to express their thanks to the Gene Expression Omnibus (GEO) open repository database for providing a platform for the availability of molecular data to move forward to more detailed conclusions and interoperations of earlier outcomes.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.