A Multi-Tissue Gene Expression Atlas of Water Buffalo (Bubalus bubalis) Reveals Transcriptome Conservation between Buffalo and Cattle

We generated 73 transcriptomic data of water buffalo, which were integrated with publicly available data in this species, yielding a large dataset of 355 samples representing 20 major tissue categories. We established a multi-tissue gene expression atlas of water buffalo. Furthermore, by comparing them with 4866 cattle transcriptomic data from the cattle genotype–tissue expression atlas (CattleGTEx), we found that the transcriptomes of the two species exhibited conservation in their overall gene expression patterns, tissue-specific gene expression and house-keeping gene expression. We further identified conserved and divergent expression genes between the two species, with the largest number of differentially expressed genes found in the skin, which may be related to structural and functional differences in the skin of the two species. This work provides a source of functional annotation of the buffalo genome and lays the foundations for future genetic and evolutionary studies in water buffalo.


Introduction
Gene expression atlases have been widely used to investigate gene expression in different tissues, cell types, and developmental stages. These resources provide a comprehensive view of gene expression patterns across the genome, which can help improve the functional annotation of the genome and understanding of the molecular mechanisms underlying different tissues and complex biological processes. In humans, the Functional Annotation of the Mammalian Genome Consortium (FANTOM) [1] and the Encyclopedia of DNA Elements project (ENCODE) [2] were proposed to facilitate the elucidation of numerous human disease genes and the identification of functional elements within the human genome. Numerous international consortium projects, such as the Genotype-Tissue Expression (GTEx) [3] and the International Human Epigenome Consortium (IHEC) [4], have been initiated with the objective of establishing the correlation between genetic variation and gene expression in human tissues and deciphering the epigenetic regulation of cell states that are relevant to human health and disease. Recently, with technological advancement and data accumulation, integration of large-scale multi-omics data has gradually been applied in the field of agricultural animals, including the CattleGTEx Project [5], the PigGTEx Project [6] and the construction of multi-tissue gene expression atlases in beef cattle [7] and pigs [8].
The Asian water buffalo (Bubalus bubalis) is a large-bodied member of the Bovini tribe that is an economically important provider of milk, meat, draught power, and leather in at least 67 countries on five continents [9,10]. Due to its natural adaptation to tropical and subtropical environments, the water buffalo has played a key role in the sustainable development of global agriculture [11]. Its global population of 204 million in 2021 showed

RNA-Seq Samples
We collected 73 samples from 19 tissues in four swamp buffalo. Total RNA was prepared using the TRIzol reagent in accordance with the manufacturer's recommendation. RNA sequencing was performed using the Illumina NovaSeq 6000 platform (Illumina, San Diego, CA, USA) with paired-end reads of 150 bp length. These newly generated data were integrated with 282 publicly available buffalo RNA-Seq data downloaded from the European Nucleotide Archive (ENA).
Cattle RNA-seq samples were analyzed uniformly by the CattleGTEx consortium [5], and the normalized gene expression (TPM) data were obtained from https://cgtex.roslin. ed.ac.uk/downloads/, accessed on 12 January 2023. Ultimately, we obtained normalized gene expression values (TPM) for 355 and 4866 RNA-seq samples from 20 common tissues in buffalo and cattle, respectively.

Genome Similarity, Transcriptome Similarity and Homologous Gene Identification between Buffalo and Cattle
The sequence similarity of the genome, as well as transcriptome (coding sequence region, CDS region) between the reference genomes of buffalo (UOA_WB_1) and cattle (ARS-UCD1.2), were analyzed by using MashMap [19] that implements an approximate algorithm. The homologous genes were identified by OrthoFinder, a software that identifies orthologous genes by integrating the bidirectional best-hit principle and analysis of phylogenetic trees of genes [20]. Genes were further classified into four categories: one-to-one homology, complex homology (including one-to-many, many-to-one, or many-to-many), no homology, and non-protein [21].

Sample Clustering
We used the function IntegrateData in the R package Seurat [22] to combine gene expression datasets of buffalo and cattle, following the approach presented in a comparative transcriptome study between humans and cattle [23]. This methodology not only corrects for technical variations but also aligns the shared gene expression features across the datasets [22]. The integrated dataset was used in the subsequent cluster analysis and detection of differentially expressed genes (DEGs) between species. Afterward, we performed the t-distributed stochastic neighbor embedding (t-SNE) using the R package Rtsne [24] with parameter "dim = 2, perplexity = 350, theta = 0.5" to map the samples to a two-dimensional space based on corrected expression values of orthologous genes. We calculated the median gene expression in each tissue in buffalo and cattle separately to represent the "true" expression of the particular tissue in each species. We then performed hierarchical clustering using the R package heatmap to explore the relationship of tissues in buffalo and cattle based on the median gene expression.

Detection of Tissue-Specific Genes (TSGs)
We utilized the R package Limma [25] to identify TSGs through differential expression analysis. Specifically, we employed the functions model. matrix, lmFit, contrasts. fit, eBayes, and topTable to assess the differences in gene expression between samples in a particular tissue and samples in the remaining tissues. To account for multiple testing, p-values were adjusted using the Benjamini and Hochberg method (FDR) [26]. We defined tissue-specific genes as those with log 2 (FC) > 1.5 and FDR < 0.05 [23].

Detection of House-Keeping Genes (HKGs)
We identified preliminary HKGs (pHKGs) by selecting genes that are expressed on average above a threshold of TPM > 1 across all tissues. In order to investigate the changes in the expression of each pHKG within the buffalo and cattle expression profiles, we have utilized the coefficient of variation (CV) to measure the degree of variation of gene expression for each gene [27]. We divided the pHKGs into low variable expression (CV ≤ first quartile), medium variable expression (first quartile < CV < third quartile), and highly variable expression (CV ≥ third quartile) based on the quartiles of the total distribution of CV values [7,8]. To investigate the functional differences between pHKGs with low, medium, and high expression variability, we performed GO enrichment analysis on the pHKGs using the R package clusterProfiler [28]. Moreover, the low variable expression pHKGs were further considered as HKGs and subdivided into three groups based on the average expression across all tissues, including low expression (1 < TPM ≤ 10), medium expression (10 < TPM ≤ 50), and high expression (TPM > 50) [7,8].

Detection of Differentially Expressed Genes (DEGs) between Species
To identify DEGs between species, we utilized the R package Limma [25] and then considered genes with log 2 (FC) > 1.2 and FDR < 0.05 significant. These thresholds were lower than that employed for identifying TSGs as the differences in gene expression within tissues between species are smaller than those between tissues within species [23]. In the differential expression analysis, the upregulated genes refer to genes that were upregulated in buffalo, whereas the downregulated genes were genes that were upregulated genes in cattle. We ranked genes according to their degree of differential expression (−log 10 p) from DEG analysis between buffalo and cattle. Subsequently, we selected the highest and lowest 10% of all orthologous genes as the most divergent and most conserved genes, respectively.

Summary of Gene Expression Profiles in Buffalo
We analyzed 73 newly generated and 282 existing RNA-Seq samples, representing 57 tissues in domestic buffalo. Using a uniform pipeline of analysis, we generated 9.27 billion clean reads. Details of sample information were summarized in Supplemen-tal Table S1. We further divided these tissues into 20 classes following Yao et al. [23]. (Figure 1a). As expected, we observed a clear clustering of these tissues based on their gene expression patterns (Figures 1b and S1). Some tissues, such as the brain, testes, and liver, formed a distinct cluster separating from other tissues (Figure 1c). Tissues with similar physiological functions exhibited greater correlation in their gene expression patterns, such as the small intestine and large intestine (Figure 1c).

Summary of Gene Expression Profiles in Buffalo
We analyzed 73 newly generated and 282 existing RNA-Seq samples, representing 57 tissues in domestic buffalo. Using a uniform pipeline of analysis, we generated ~9.27 billion clean reads. Details of sample information were summarized in Supplemental  Table S1. We further divided these tissues into 20 classes following Yao et al. [23]. ( Figure  1a). As expected, we observed a clear clustering of these tissues based on their gene expression patterns (Figures 1b and S1). Some tissues, such as the brain, testes, and liver, formed a distinct cluster separating from other tissues (Figure 1c). Tissues with similar physiological functions exhibited greater correlation in their gene expression patterns, such as the small intestine and large intestine (Figure 1c).

Sequence Similarity of Genome and Transcriptome between Buffalo and Cattle
Based on the reference genomes of buffalo (UOA_WB_1) and cattle (ARS-UCD1.2), we analyzed the genomic, as well as transcriptomic sequence consistency between these two species. Our findings indicated that the buffalo reference genome had 98.87% of its sequences aligning with the cattle reference genome, with an average consistency of 95.88%. This agrees with a previous study [16]. Moreover, we compared the sequence similarity of CDS regions between the two species, revealing that 87.91% of buffalo CDS regions could be mapped to cattle CDS regions, with an average consistency of 98.12%. These findings demonstrated the relatively strong collinearity between the reference genome sequences of the two species, laying a foundation for subsequent comparative transcriptome analyses.

Sequence Similarity of Genome and Transcriptome between Buffalo and Cattle
Based on the reference genomes of buffalo (UOA_WB_1) and cattle (ARS-UCD1.2), we analyzed the genomic, as well as transcriptomic sequence consistency between these two species. Our findings indicated that the buffalo reference genome had 98.87% of its sequences aligning with the cattle reference genome, with an average consistency of 95.88%. This agrees with a previous study [16]. Moreover, we compared the sequence similarity of CDS regions between the two species, revealing that 87.91% of buffalo CDS regions could be mapped to cattle CDS regions, with an average consistency of 98.12%. These findings demonstrated the relatively strong collinearity between the reference genome sequences of the two species, laying a foundation for subsequent comparative transcriptome analyses.

Conservation of Global Gene Expression Patterns between Buffalo and Cattle
Following a previous study [21], we classified genes into four categories: one-to-one homology, complex homology (including one-to-many, many-to-one, or many-to-many), no homology, and non-protein. In each tissue, we compared the proportion of expressed genes in each category to the total number of expressed genes (TMP > 0.1) ( Figure S2a). In Buffalo, the average proportions of one-to-one homology, complex homology, no homology, and non-protein genes across tissues were 82.62%, 7.94%, 4.07%, and 5.37%, Genes 2023, 14, 890 5 of 12 respectively, while in cattle, they were 85.37%, 10.51%, 1.79%, and 2.33%. Similarly, we compared the summed gene expression (log 2 (TMP)) in these categories to the total expression ( Figure S2b). In buffalo, the average proportions of one-to-one homology, complex homology, no homology, and non-protein genes across tissues were 83.85%, 10.14%, 3.15%, and 2.84%, respectively, while in cattle, they were 83.85%, 12.54%, 1.59%, and 2.01%. Our results revealed that the genes in the four categories exhibited similar patterns in terms of both the number of expressed genes and their expression levels between the two species. Notably, the one-to-one homology genes had the largest number of expressed genes and represented the predominant expression levels in the tissues ( Figure S2). Therefore, our subsequent analyses were based on 16,497 one-to-one orthologous genes for comparing the transcriptomes between the two species.
To assess the conservation of gene expression between buffalo and cattle, we compared the number of expressed genes in each tissue and observed a significant correlation (Spearman's r = 0.59, p = 0.0071) between the two species ( Figure 2a). Notably, the testes exhibited the highest number of expressed genes in both species (n buffalo = 15,042; n cattle = 13,764), while the muscle (n buffalo = 12,285; n cattle = 11,182) and blood/immune tissues (n buffalo = 12,230; n cattle = 10,994) showed the lowest numbers of expressed genes.

Conservation of Global Gene Expression Patterns between Buffalo and Cattle
Following a previous study [21], we classified genes into four categories: one-to-one homology, complex homology (including one-to-many, many-to-one, or many-to-many), no homology, and non-protein. In each tissue, we compared the proportion of expressed genes in each category to the total number of expressed genes (TMP > 0.1) ( Figure S2a). In Buffalo, the average proportions of one-to-one homology, complex homology, no homology, and non-protein genes across tissues were 82.62%, 7.94%, 4.07%, and 5.37%, respectively, while in cattle, they were 85.37%, 10.51%, 1.79%, and 2.33%. Similarly, we compared the summed gene expression (log2(TMP)) in these categories to the total expression ( Figure S2b). In buffalo, the average proportions of one-to-one homology, complex homology, no homology, and non-protein genes across tissues were 83.85%, 10.14%, 3.15%, and 2.84%, respectively, while in cattle, they were 83.85%, 12.54%, 1.59%, and 2.01%. Our results revealed that the genes in the four categories exhibited similar patterns in terms of both the number of expressed genes and their expression levels between the two species. Notably, the one-to-one homology genes had the largest number of expressed genes and represented the predominant expression levels in the tissues ( Figure S2). Therefore, our subsequent analyses were based on 16,497 one-to-one orthologous genes for comparing the transcriptomes between the two species.
To assess the conservation of gene expression between buffalo and cattle, we compared the number of expressed genes in each tissue and observed a significant correlation (Spearman's r = 0.59, p = 0.0071) between the two species ( Figure 2a). Notably, the testes exhibited the highest number of expressed genes in both species (nbuffalo = 15,042; ncattle = 13,764), while the muscle (nbuffalo = 12,285; ncattle = 11,182) and blood/immune tissues (nbuffalo = 12,230; ncattle = 10,994) showed the lowest numbers of expressed genes.  To visualize the variation in gene expression among samples, we used the t-SNE-based method and found that samples from similar tissues clustered together rather than by species, indicating the conservation of gene expression among the species (Figure 2b,c). This observation was further supported by the hierarchical clustering of tissues based on the mean or median gene expression in each tissue ( Figure S3a,b). Additionally, we found that correlations based on gene expression in the same tissue between species were significantly higher than those observed between different tissues in the same species  Figure S4a). Tissues, such as the liver, brain, small intestine, and stomach, exhibited the highest similarity in gene expression between buffalo and cattle, while tissues, such as skin and salivary gland, showed the lowest similarity ( Figure S4b,c). Finally, we observed that buffalo and cattle shared most genes at the top (highest expression) and bottom (lowest expression) 10% of genes sorted by their median level of expression in each tissue (Figure 2d).

Detection and Comparison of Tissue-Specific Genes (TSGs)
By counting the number of tissues where a gene is expressed (TPM > 0.1), we found that genes tend to express ubiquitously (in all tissues) or tissue-specifically (in a few tissues) in both buffalo and cattle (Figure 3a). There was a significant correlation between the number of tissues where each gene was detected as expressed (TPM > 0.1) in each species (Spearman's r = 0.87, p < 2.2 × 10 −16 ), indicating global conservation of tissuespecific expression among orthologous genes. We then identified TSGs using the R package Limma described in a previous study [23], and genes with adjusted p value < 0.05 and log 2 (FC) > 1.5 were considered as TSGs. The number of TSGs in each tissue was significantly correlated between the two species (Spearman's r = 0.48, p = 0.033) (Figure 3b). The testes exhibited the largest number of TSGs in both buffalo and cattle, while the salivary gland and the mammary gland displayed the smallest number in buffalo and cattle, respectively. Moreover, we discovered a significant overlap of TSGs in the same tissues between the two species (Hypergeometric test, FDR < 0.0001) (Figure 3c). Notably, the top 10 TSGs with the highest expression detected in buffalo exhibited a strong tissue specificity in cattle and vice versa for the top 10 TSGs in cattle (Figure 3d). These findings suggested that TSGs were conserved between buffalo and cattle.
To visualize the variation in gene expression among samples, we used the t-SNEbased method and found that samples from similar tissues clustered together rather than by species, indicating the conservation of gene expression among the species (Figure 2b,c). This observation was further supported by the hierarchical clustering of tissues based on the mean or median gene expression in each tissue ( Figure S3a,b). Additionally, we found that correlations based on gene expression in the same tissue between species were significantly higher than those observed between different tissues in the same species ( Figure S4a). Tissues, such as the liver, brain, small intestine, and stomach, exhibited the highest similarity in gene expression between buffalo and cattle, while tissues, such as skin and salivary gland, showed the lowest similarity ( Figure S4b,c). Finally, we observed that buffalo and cattle shared most genes at the top (highest expression) and bottom (lowest expression) 10% of genes sorted by their median level of expression in each tissue (Figure 2d).

Detection and Comparison of Tissue-Specific Genes (TSGs)
By counting the number of tissues where a gene is expressed (TPM > 0.1), we found that genes tend to express ubiquitously (in all tissues) or tissue-specifically (in a few tissues) in both buffalo and cattle (Figure 3a). There was a significant correlation between the number of tissues where each gene was detected as expressed (TPM > 0.1) in each species (Spearman's r = 0.87, p < 2.2 × 10 −16 ), indicating global conservation of tissuespecific expression among orthologous genes. We then identified TSGs using the R package Limma described in a previous study [23], and genes with adjusted p value < 0.05 and log2(FC) > 1.5 were considered as TSGs. The number of TSGs in each tissue was significantly correlated between the two species (Spearman's r = 0.48, p = 0.033) ( Figure  3b). The testes exhibited the largest number of TSGs in both buffalo and cattle, while the salivary gland and the mammary gland displayed the smallest number in buffalo and cattle, respectively. Moreover, we discovered a significant overlap of TSGs in the same tissues between the two species (Hypergeometric test, FDR < 0.0001) (Figure 3c). Notably, the top 10 TSGs with the highest expression detected in buffalo exhibited a strong tissue specificity in cattle and vice versa for the top 10 TSGs in cattle (Figure 3d). These findings suggested that TSGs were conserved between buffalo and cattle.  In testes, TSGs uniquely expressed in buffalo were enriched in functions related to the regulation of response to DNA damage stimulus, regulation of DNA repair, RNA localization, ncRNA processing, and chromatin remodeling. In contrast, TSGs uniquely expressed in cattle were enriched in functions related to cell junction assembly, positive regulation of cell projection organization, axonogenesis, and synapse assembly (Figure 3e, Table S2). Interestingly, the TSGs overlapping between the two species were enriched in functions related to cellular processes involved in reproduction in multicellular organisms, microtubule-based movement, cilium organization, germ cell development, and cilium assembly (Figure 3e, Table S2).

Detection and Comparison of House-Keeping Genes (HKGs)
A method described by Zhang et al. [7] was used to explore HKGs in buffalo and cattle. We identified 8385 and 7923 preliminary HKGs (pHKGs) in buffalo and cattle, respectively, of which the median TPM was >1 in all tissues. Of these preliminary HKGs, 7491 (89.3% in buffalo and 94.5% in cattle) were found to be shared in both species. We further classified these shared pHKGs into three groups (high, medium and low) based on their expression variability across tissues, measured by the coefficient of variation (CV). We found that 1611 (21.5%), 3844 (51.3%) and 2036 (27.2%) pHKGs showed high, medium and low expression variability in buffalo, while 1702 (22.7%), 3821 (51.0%) and 1968 (26.3%) showed high, medium and low expression variability in cattle, respectively (Figure 4a). A total of 1211, 2567, and 1134 pHKGs showed consistently low, medium and high expression variability between buffalo and cattle, respectively (Figure 4a). GO enrichment analysis showed that highly variable genes were related to energy metabolism (e.g., sulfur compound metabolic process, small molecule catabolic process, fatty acid catabolic process and lipid catabolic process etc.), medium variable genes were related to basic biological activities (e.g., macroautophagy, DNA damage repair, peptidyl-lysine modification and stem cell population maintenance) and low variable genes were involved in organelle functions (e.g., mitochondrial translation, ribosome biogenesis, Golgi vesicle transport and protein insertion into membrane) (Figure 4b, Table S3).

Detection of Differentially Expressed Genes (DEGs) between Species
We identified DEGs between buffalo and cattle in each matching tissue. The brain showed the lowest number of DEGs, while the skin had the highest one (Figure 5a). We selected the top and bottom 10% of genes with the smallest and largest p-values from the differential expression analysis as the divergent and conserved genes between species, respectively, and compared these genes with TSGs. We found that TSGs in some tissues tended to be differentially expressed between species, such as skin, while those in other tissues tended to be conserved between species, such as mammary glands (Figure 5b). We conducted GO enrichment analysis for the DEGs in the skin between buffalo and cattle and found that those upregulated in buffalo were associated with epidermis development, skin development, and keratinocyte differentiation, while those upregulated in cattle were associated with collagen fibril organization, cell-substrate adhesion, and collagen metabolic processes (Figure 5c, Table S5). These results presumably reflected differences in the morphology of skin between the two species.

Detection of Differentially Expressed Genes (DEGs) between Species
We identified DEGs between buffalo and cattle in each matching tissue. The brain showed the lowest number of DEGs, while the skin had the highest one (Figure 5a). We selected the top and bottom 10% of genes with the smallest and largest p-values from the differential expression analysis as the divergent and conserved genes between species, respectively, and compared these genes with TSGs. We found that TSGs in some tissues tended to be differentially expressed between species, such as skin, while those in other tissues tended to be conserved between species, such as mammary glands (Figure 5b). We conducted GO enrichment analysis for the DEGs in the skin between buffalo and cattle and found that those upregulated in buffalo were associated with epidermis development, skin development, and keratinocyte differentiation, while those upregulated in cattle were associated with collagen fibril organization, cell-substrate adhesion, and collagen metabolic processes (Figure 5c, Table S5). These results presumably reflected differences in the morphology of skin between the two species.

Discussion
In this study, we integrated transcriptomic data from buffalo, established a comprehensive gene expression atlas, and performed a comparative transcriptomic analysis between buffalo and cattle. Samples in our transcriptomic dataset clustered by tissue in the expression heatmap, despite being generated from various breeds of river and swamp buffalo. This suggested that our integrated data were devoid of any conspicuous batch effects and, furthermore, underscored that expression differences between tissues exceeded those between breeds [29]. We identified TSGs of 20 tissues and found that these TSGs were mainly related to the physiological function of tissues, which also demonstrated the reliability of our results. This resource will enhance our understanding of the genetic and biological processes of complex traits in future studies [30].
Buffalo and cattle exhibited conservation in their overall expression patterns, TSGs, and HKGs. Firstly, the correlation of the numbers of genes expressed in each tissue between the two species is 0.59, which was consistent with the findings of a previous comparative transcriptome study in cattle and humans [23]. The correlation in expression patterns within the same tissue between the two species was much higher than that within the same species in different tissues, as previously reported [23]. This finding further confirmed the conservation of tissue expression patterns between species [29]. We identified a significant overlap of TSGs between species, indicating their conservation. Notably, the testes had the highest number of TSGs in both species and the shared TSGs between species, reflecting a relatively unique expression pattern [31][32][33]. We identified 8385 and 7923 pHKGs in buffalo and cattle, respectively, which was similar in mice [34]. Among these pHKGs, 89.3% in buffalo and 94.5% in cattle were shared between species. Moreover, 65.6% of shared pHKGs showed the same expression variation level, and 76.9% of shared HKGs exhibited the same expression level. This suggested that HKGs were conserved across species in terms of quantity, expression variation, and expression levels [7,35].
Despite the strong conservation of the transcriptome between buffalo and cattle, several genes that were differentially expressed between the two species have been identified.

Discussion
In this study, we integrated transcriptomic data from buffalo, established a comprehensive gene expression atlas, and performed a comparative transcriptomic analysis between buffalo and cattle. Samples in our transcriptomic dataset clustered by tissue in the expression heatmap, despite being generated from various breeds of river and swamp buffalo. This suggested that our integrated data were devoid of any conspicuous batch effects and, furthermore, underscored that expression differences between tissues exceeded those between breeds [29]. We identified TSGs of 20 tissues and found that these TSGs were mainly related to the physiological function of tissues, which also demonstrated the reliability of our results. This resource will enhance our understanding of the genetic and biological processes of complex traits in future studies [30].
Buffalo and cattle exhibited conservation in their overall expression patterns, TSGs, and HKGs. Firstly, the correlation of the numbers of genes expressed in each tissue between the two species is 0.59, which was consistent with the findings of a previous comparative transcriptome study in cattle and humans [23]. The correlation in expression patterns within the same tissue between the two species was much higher than that within the same species in different tissues, as previously reported [23]. This finding further confirmed the conservation of tissue expression patterns between species [29]. We identified a significant overlap of TSGs between species, indicating their conservation. Notably, the testes had the highest number of TSGs in both species and the shared TSGs between species, reflecting a relatively unique expression pattern [31][32][33]. We identified 8385 and 7923 pHKGs in buffalo and cattle, respectively, which was similar in mice [34]. Among these pHKGs, 89.3% in buffalo and 94.5% in cattle were shared between species. Moreover, 65.6% of shared pHKGs showed the same expression variation level, and 76.9% of shared HKGs exhibited the same expression level. This suggested that HKGs were conserved across species in terms of quantity, expression variation, and expression levels [7,35].
Despite the strong conservation of the transcriptome between buffalo and cattle, several genes that were differentially expressed between the two species have been identified.
The largest number of differentially expressed genes was observed in the skin. GO enrichment analysis revealed that the upregulated genes in buffalo were associated with epidermis development, skin development, and keratinocyte differentiation, while those upregulated in cattle were associated with collagen fibril organization, cell-substrate adhesion, and collagen metabolic processes. Collagen fibrils are a critical component of animal skin [36][37][38], and the differential expression of these genes may be related to structural and functional differences between buffalo and cattle skin. For example, buffalo skin had a lower density of sweat glands and thicker skin than cattle [39], which could contribute to the observed differences in gene expression in skin tissue between the two species.

Conclusions
Our study provided a multi-tissue gene expression atlas and identified tissue-specific and housekeeping genes in buffalo. This enriches the functional annotation of the buffalo genome and establishes a foundation for further exploration of its genomic information and biological mechanisms underlying complex traits and adaptive evolution in buffalo. Additionally, we compared the conservation of transcriptomes between buffalo and cattle, which deepens our understanding of gene expression conservation between species and provides future direction for more comprehensive comparative analyses between the two species at a functional level.
Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/genes14040890/s1; Figure S1: Plot of t-SNE of samples based on gene expression; Figure S2: The percentage of expressed genes and summed expression in the four categories; Figure S3: Hierarchical clustering of tissues in buffalo and cattle; Figure S4: Comparison of gene expression between buffalo and cattle tissues; Figure S5: Heatmap of gene expression of top 30 highly expressed HKGs in buffalo and cattle. Table S1: Detailed information of the RNA-seq data; Table S2: Top 10 significantly enriched Gene Ontology terms for three groups of tissue-specific genes; Table S3: Significantly enriched Gene Ontology terms of overlapped preliminary house-keeping genes; Table S4: Significantly enriched Gene Ontology terms of overlapped house-keeping genes; Table S5: Significantly enriched Gene Ontology terms for up-regulated genes in the skin in buffalo and cattle.

Informed Consent Statement: Not applicable.
Data Availability Statement: The multi-tissue gene expression atlas, tissue-specific genes and house-keeping genes of water buffalo generated in this study are publicly available at https:// doi.org/10.6084/m9.figshare.22219327.v1, accessed on 6 March 2023. The raw RNA-Seq data for the 73 swamp buffalo samples have been deposited at the Sequence Read Archive (SRA) with study ID PRJNA951806. The accessions for the previously published datasets can be found in Supplementary Table S1.