Genomic Identification, Evolution and Sequence Analysis of the Heat-Shock Protein Gene Family in Buffalo

Heat-shock proteins (HSP) are conserved chaperones crucial for protein degradation, maturation, and refolding. These adenosine triphosphate dependent chaperones were classified based on their molecular mass that ranges between 10–100 kDA, including; HSP10, HSP40, HSP70, HSP90, HSPB1, HSPD, and HSPH1 family. HSPs are essential for cellular responses and imperative for protein homeostasis and survival under stress conditions. This study performed a computational analysis of the HSP protein family to better understand these proteins at the molecular level. Physiochemical properties, multiple sequence alignment, and phylogenetic analysis were performed for 64 HSP genes in the Bubalus bubalis genome. Four genes were identified as belonging to the HSP90 family, 10 to HSP70, 39 to HSP40, 8 to HSPB, one for each HSPD, HSPH1, and HSP10, respectively. The aliphatic index was higher for HSP90 and HSP70 as compared to the HSP40 family, indicating their greater thermostability. Grand Average of hydropathicity Index values indicated the hydrophilic nature of HSP90, HSP70, and HSP40. Multiple sequence alignment indicated the presence of highly conserved consensus sequences that are plausibly significant for the preservation of structural integrity of proteins. In addition, this study has expanded our current knowledge concerning the genetic diversity and phylogenetic relatedness of HSPs of buffalo with other mammalian species. The phylogenetic tree revealed that buffalo is more closely related to Capra hircus and distantly associated with Danio rerio. Our findings provide an understanding of HSPs in buffalo at the molecular level for the first time. This study highlights functionally important HSPs and indicates the need for further investigations to better understand the role and mechanism of HSPs.


Introduction
Heat-shock proteins (HSP) are well-known and highly conserved protein families responsible for cellular responses to various stresses. These evolutionary conserved molecular chaperones play a vital role in the survival of organisms in response to elevated ambient temperatures. Protection from Genes 2020, 11, 1388 3 of 18

Identification of the HSP Genes in Buffalo
Published sequences from Bos taurus were used as queries for the genome-wide identification of HSP genes in the buffalo, so we used representative sequences of HSP10 (NP_776771.1), HSP40 (XP_005224490.1), HSP70 (NP_976067.3), HSP90 (NP_001012688.1), HSPB (NP_001020740.1), HSPD (NP_001160081.1), and HSPH1 (NP_001068770.1) from Bos taurus. We analyzed HSP protein sequences using Protein-protein basic local alignment (BLAST) search with an E value equal or less than 1.0 × e −5 of the non-redundant protein sequences in buffalo using all the parameters by default. After retrieving the required sequences, redundancy of the sequences was checked to avoid ambiguity. Chromosome locations of each HSP gene were obtained from buffalo genome resources. The corresponding gene positions files were obtained from the buffalo genome annotation file with the general feature format (GFF), which can be set as the input files of the MCScanX program, as reported previously [18]. Moreover, the exon-intron structure was analyzed by the buffalo genome annotation file using the in-house scripts. The conserved protein domains in buffalo HSP families were blasted against the conserved domain database (CDD) from NCBI.

Characterization of Physiochemical Properties of HSP Genes
To characterize the physical and chemical parameters of HSP proteins, ProtParam tool (https: //web.expasy.org/protparam/) was used [19]. Protein sequences were submitted to identify the molecular weight (MW), number of amino acids, isoelectric point (pI), aliphatic index (AI), instability index (II) and grand average of hydropathicity (GRAVY). Moreover, conserved protein motifs of HSP were analyzed by using the Multiple Expectation Maximization for Motif Elicitation (MEME) programs with a maximum of 10 motifs [20]. Three-dimension protein structure for various HSP protein (HSP10, HSP40, HSP70, HSP90, HSPB1 and HSPH1) were constructed by using Phyre2 software (http://www.sbg.bio.ic.ac.uk/phyre2) with fold recognition end homology modeling for the identification of secondary structure attributes.

Phylogenetic Analysis
Results of multiple sequence alignment were further used for phylogenetic analysis using MEGA7 [21]. The evolutionary history was inferred by using the Maximum Likelihood method based on the Jones-Taylor Thornton (JTT) matrix-based model [22]. The tree with the highest log likelihood (−581.71) was selected. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model with 1000 bootstrap resampling, and then selecting the topology with superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 77 amino acid sequences. All positions containing gaps and missing data were eliminated. There was a total of 16 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [21].

Sequence Analysis
Accession no of all sequences which were used for multiple sequence alignment and subsequent analyses are presented in Tables S1-S5. Comparison of buffalo HSP was carried out with other mammalian species including cattle (Bos taurus), goat (Capra hircus), sheep (Ovis aries), horse (Equus caballus), camel (Camelus ferus), mouse (Rattus norvegicus), pig (Sus scrofa), zebra fish (Danio rerio), mouse (Mus musculus), and dog (Canis lupus familiaris). Sequence homology of buffalo HSP family was carried out along with elucidation of phylogenetic relationships using the Geneious version 2020.1 created by Biomatters (https://www.geneious.com).

Identification of Single Nucleotide Polymorphism (SNP)
Multiple sequence alignment of coding regions of all the members of two major HSP gene families (HSP90 and HSP70) in buffalo and Bos taurus were perused in DNASTAR Lasergene software to identify the number of synonymous and non-synonymous SNPs in buffalo [23]. Moreover, we also calculated the number of non-synonymous substitutions per non-synonymous site (Ka)/ the number of synonymous substitutions per synonymous site (Ks) ratio as described previously [24].

Identification of HSP Gene Families and Their Physiochemical Properties
In the present study, we applied a comprehensive strategy to characterize the HSP in the buffalo genome. By using human and mouse sequences as queries, we detected four genes belonging to HSP90, 10 genes to HSP70, 39 genes to HSP40, 8 genes to HSPB, and one gene to each HSPD, HSPH1, and HSP10 family. Buffalo each HSP gene family tree and percentage of identity with other species is represented in Figures S1-S7. The physical and chemical properties of these HSP genes in buffalo are presented in Tables 1-4. The functional diversity of HSP isoforms could be realized from a wide range of their MW (10 kDa to 254 kDa) and the total number of amino acids in different HSP genes that ranged from 102 (HSP10) to 2223 amino acids in DnaJ homolog subfamily C member 13 (DNAJC13). The results of the present study revealed that all the members of the HSP90 family are unstable according to instability index. The isoelectric point indicates the acidic nature of protein members except for TRAP1, which is basic in nature. An AI greater than 65 indicates the thermostability of HSP90 protein members whereas, lower GRAVY values of HSP90 indicates its hydrophilic nature (Table 4). HSP70 members are thermostable, acidic, and hydrophilic in nature. The instability index indicates the unstable nature of HSPA14, HSPA4, and HSPA4L protein members (Table 1). Various physical and chemical parameters of HSP40 protein members are shown in Table 2. HSP40 is the largest HSP gene family, with 39 genes widely distributed over different buffalo chromosomes with MW ranging from 15 kDa to 254 kDa. The largest gene of this family is DNAJC13, which harbors 59 exons, 2243 amino acids, and MW of 254 kDa. The details about the physiochemical nature of small HSPs, including HSP10, HSPB, HSPD, and HSPH1, are given in Table 5. All members of HSPB are thermostable except HSPB7 and 8.

Phylogenetic Analysis
The evolutionary history was inferred by using the Maximum Likelihood method based on JTT matrix-based model. The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the taxa analyzed. Each HSP family genes clustered together in one group, revealing seven groups ( Figure 1). Overall phylogenetic relationships revealed that buffalo HSP gene family is more closely related to Bos taurus, goat and sheep along with the common clade of horse and camel ( Figure 2). Moreover, buffalo is distantly related to other mammals, including pig, dog, rat, mouse, and zebra fish.

Phylogenetic Analysis
The evolutionary history was inferred by using the Maximum Likelihood method based on JTT matrix-based model. The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the taxa analyzed. Each HSP family genes clustered together in one group, revealing seven groups ( Figure 1). Overall phylogenetic relationships revealed that buffalo HSP gene family is more closely related to Bos taurus, goat and sheep along with the common clade of horse and camel ( Figure 2). Moreover, buffalo is distantly related to other mammals, including pig, dog, rat, mouse, and zebra fish.

Gene Structure and Motif Analysis
To gain further insight of HSP genes in buffalo, gene structures and conserved motifs were predicted. The number of exons varies in all genes; for instance, DNAJC13 has a higher number of exons i.e., 59. The exonic pattern of buffalo HSP genes are shown in Figure 3B. Ten putative motifs of heat-shock protein gene family were observed in buffalo (Table 5). MEME motif 1 and 2 (consisting of 41 and 28 amino acid) were annotated as the DnaJ domain in Pfam search, while 3 to 9 motifs were annotated as HSP70. These motifs were also confirmed from NCBI CDD. Moreover, MEME was used to predict conserved motifs present within HSP families in the buffalo ( Figure 3D).     Collinearity analysis showed that HSP genes were distributed over 20 chromosomes in buffalo, while in cattle, these genes were randomly distributed over 25 chromosomes (Figure 4). Most of the genes were present on proximal or distal ends of chromosomes in buffalo. Collinearity analysis showed that HSP genes were distributed over 20 chromosomes in buffalo, while in cattle, these genes were randomly distributed over 25 chromosomes (Figure 4). Most of the genes were present on proximal or distal ends of chromosomes in buffalo.

HSP Protein Structural Configuration
Three-dimensional protein models were also predicted for various heat-shock proteins (HSP10, HSPB1, HSP90, HSP40 and HSP70) in Bos taurus to elucidate comparative structural configuration ( Figure 4). It was observed that the overall structure of HSP10 in both species was unaltered with the same number (102) of amino acid residues. Similarly, in the case of HSPB1, no significant structural variation was observed among both species in 201 amino acid residues. For HSP90, considerable variation in the structure of the protein was observed in both species ( Figure 5). In cattle, 380 residues were present, but in buffalo, only 339 amino acids were observed in final protein structure ( Figure 5). Moreover, secondary structural elements also varied in both species as β-sheets observed in buffalo were absent in the case of Bos taurus. In cattle, protein was majorly consisting of α-helix. Similarly, the HSP40 gene also showed structural variation in buffalo as the final structure comprised of 480 amino acid residues as compared to 453 in cattle ( Figure 5). However, the structure of HSP70 protein was similar in both species having amino acids 641.

HSP Protein Structural Configuration
Three-dimensional protein models were also predicted for various heat-shock proteins (HSP10, HSPB1, HSP90, HSP40 and HSP70) in Bos taurus to elucidate comparative structural configuration ( Figure 4). It was observed that the overall structure of HSP10 in both species was unaltered with the same number (102) of amino acid residues. Similarly, in the case of HSPB1, no significant structural variation was observed among both species in 201 amino acid residues. For HSP90, considerable variation in the structure of the protein was observed in both species ( Figure 5). In cattle, 380 residues were present, but in buffalo, only 339 amino acids were observed in final protein structure ( Figure 5). Moreover, secondary structural elements also varied in both species as β-sheets observed in buffalo were absent in the case of Bos taurus. In cattle, protein was majorly consisting of α-helix. Similarly, the HSP40 gene also showed structural variation in buffalo as the final structure comprised of 480 amino acid residues as compared to 453 in cattle ( Figure 5). However, the structure of HSP70 protein was similar in both species having amino acids 641.

SNP Analysis
Mutation analysis of 14 genes belonging to HSP70 and HSP90 families from buffalo and Bos taurus were compared to detect synonymous and non-synonymous SNPs. We identified a total of 194 SNPs in the HSP90 family genes among which only 19 were non-synonymous ( Table 6). The highest number of SNPs (128) was observed in the TRP1 gene, among which 17 were non-synonymous.

SNP Analysis
Mutation analysis of 14 genes belonging to HSP70 and HSP90 families from buffalo and Bos taurus were compared to detect synonymous and non-synonymous SNPs. We identified a total of 194 SNPs in the HSP90 family genes among which only 19 were non-synonymous ( Table 6). The highest number of SNPs (128) was observed in the TRP1 gene, among which 17 were non-synonymous. Moreover, the TRP1 gene also exhibited a single indel of 3 nucleotides at position 1703-1705. HSP90AA1 showed a total of 23 synonymous SNPs and only one non-synonymous SNP, while HSP90AB1 and HSP90B1 exhibited only 22 and 19 synonymous SNPs, respectively (Table S6). The present study identified 291 SNPs in the HSP70 family genes, among which 59 were non-synonymous. The highest number of SNPs (80) was observed in the HSPA8 gene exhibiting 8 non-synonymous substitutions. The lowest number of SNPs (10) was detected in the HSPA14 gene with two non-synonymous SNPs. HSP70 gene showed highest non-synonymous to synonymous ratio by exhibiting 23 non-synonymous SNPs in a total of 62. Interestingly, the HSPA2 gene of the HSP70 family showed an indel of 15 nucleotides at position 777-791 (Table S6).

Discussion
HSP play crucial functions, including stress tolerance, cell survival, protein refolding, and prevention of protein denaturation. Moreover, HSP also facilitate the regulation of specific cellular processes such as signal transduction, protein synthesis, protein trafficking, and DNA replication [4,25,26]. It is well established that HSP90 plays pivotal role in controlling cell cycle, transport, folding, and degradation of proteins as well as thermotolerance and induction of signal transduction. Studies have revealed that HSP90 is present in various organisms and this HSP family is highly conserved [27]. HSPs are the molecular chaperones that help to maintain activity of cells in different stressful states. Some of these proteins are produced due to different stressful stimuli (inducible), while others are expressed constitutively. Cellular metabolism, ultraviolet radiations, alteration in protein structures, cytotoxic drugs, and heat are some of the stress stimuli that usually induce HSP expression. These proteins are highly involved in facilitating the degradation of misfolded proteins and regulating proper protein folding [28]. In this study, we analyzed a total of 64 HSP genes in buffalo. Members of HSP90 are found to be present on four chromosomes in buffalo similar to Bos taurus [29], suggesting that these protein molecules are highly conserved and share a common ancestor ( Figure S3).

Physiochemical Properties of HSP Gene Families
The characterization of the physiochemical properties of proteins of various gene families is essential to identify the functions and properties of proteins encoded by these genes. Isoelectric point indicates that members of HSP90 in buffalo are acidic in nature (pI < 7.0), except for the TRAP1 gene, which was basic in nature similar to human HSP90 [30].
AI of the globular proteins is associated with their thermostability [31]. The AI, the respective volume of a protein occupied by aliphatic side chains (alanine, valine, isoleucine, and leucine), may be interpreted as a positive factor for increasing thermostability of globular proteins [31]. As the AI is associated with the thermostability of globular proteins, so a higher value of the AI reveals the high thermostability of protein. Hence, a higher AI of HSP90 and HSP70 showed that members of these proteins are more thermostable compared to HSP40 members. Similar results have also been observed in cattle genome [29]. In the present study, the HSP90 gene family of buffalo had AI greater than 65, indicating the stability of proteins at high temperatures. Similarly, the higher expression of HSP90 has been found in goat during the peak summer season, revealing the extent of thermostability of HSPs [32].
The GRAVY value is used to predict water and protein interaction. The GRAVY value for a protein or a peptide is defined as the sum of hydropathy values of all amino acids divided by the protein length. GRAVY qualifies and quantifies the hydrophobic or hydrophilic property of a protein.
A positive GRAVY score indicates a globally hydrophobic protein, whereas a negative GRAVY score is related to hydrophilic protein [33]. Hence, more the negative GRAVY score, more soluble the protein is. All members of HSP40, HSP70, and HSP90 had a negative GRAVY score in the present study, suggesting these all members are soluble proteins. However, highly negative GRAVY values of HSP90 observed in the present study indicate the greater hydrophilic nature of its members. It can help to enhance the functional properties of protein binding and oligomerization [34,35]. Among heat-shock proteins, HSP90 is considered an essential and highly conserved molecular chaperone widely present in eukaryotic cells [36]. HSP90 in buffalo share high sequence similarities with other species (Figure S3). The presence of conserved motifs helps interact with other proteins and increases their chaperone function [37,38].
A total of 10 HSP70 members were observed in buffalo. Interestingly, all members of HSP70 showed acidic nature and almost similar isoelectric points indicating the possibility of functional similarity among buffalo HSP70 members. It has been suggested earlier that HSP70 has conserved biological functions [39]. GRAVY values of HSP70 protein of buffalo indicates their hydrophilic nature and high values of AI revealed their thermal stability in a wide range of temperatures. These findings confirm the consistency of its chaperoning role in protection against specific cellular and systemic stresses that can lead to protein denaturation [40]. In evolution, HSP70 is the most conserved protein, and its conserved domains and functions are considered responsible for across species conservation [11,41]. In the present study, a tree with an alignment view of HSP70 showed the presence of conserved domains. Buffalo is closely related to Bos taurus, followed by goat and sheep ( Figure S2). The comparative analysis of the HSP70 protein in water buffalo revealed that most of the sequences were conserved across species. Comparative sequence analysis for bubaline HSP70.1 indicated the high similarity of genes among mammalian species, such as cattle, goat, human, mouse, and pig, respectively. Furthermore, phylogenetic analysis of bovine and bubaline HSP70.1 gene with other species revealed Indian zebu cattle grouped close to Bos taurus followed by buffalo and goat, while porcine, ruminants, and murine formed distinct cluster [42,43].
Heat-shock protein 40 (HSP40/DnaJ) is evolutionary conserved and thought to play an essential role in translocation, degradation, translation, folding, and unfolding of proteins by stimulating the ATPase activity of HSP70 protein chaperones [44,45]. All members of HSP40 have domain J, which helps in binding with HSP70. Also, 41 DnaJ domains have been reported in humans [46]. In the present study, 39 putative members of HSP40 were found. GRAVY value of HSP40 members represents the hydrophilic nature of these proteins. Furthermore, pI values of HSP40 protein members suggest that some of these proteins possess acidic nature while other proteins appear to be basic. It may explain the different functional properties of these proteins and provides insights for designing further experiments to elucidate their functional diversity. HSP40 represents a large and diverse protein family and is usually divided into three classes [47]. Type I HSP40s is known to have a glycine and phenylalanine (G/F)-rich region and a cysteine-rich, zinc finger-like region (ZFLR). Type II HSP40s only contain a G/F-rich region, while type III HSP40s have neither a G/F-rich region nor a ZFLR [48]. Comparative sequence and phylogenetic analyses revealed that the HSP40 family of buffalo is closely related to Bos taurus ( Figure S1). Another study conducted on the bovine genome also indicated the presence of evolutionary conserved residues in HSP40 [29].

Sequence Analyses
Data of protein sequences can provide useful information regarding evolutionary conserved regions important for various biological functions. Multiple sequence alignment is one of the ways to identify these conserved constraints and to compile data for structural and functional analysis of proteins [49]. To investigate the protein sequence features of the heat-shock proteins, 10 motifs of HSP gene families in buffalo were predicted by the MEME tool, and the regular expression levels of the conserved motifs are listed in Table 5. These conserved motifs can play a crucial role in the structural and functional organization of proteins. Similarly, another study on Ciona savignyi, identified conserved motifs in HSP60, HSP70, and HSP90 by using MEME [50].

Phylogenetic Analysis
Molecular phylogenetic tree of heat-shock proteins (HSP40, HSP70, HSP90, HSPB, HSPD, HSP10, and HSPH1) was constructed by using the Maximum Likelihood method based on the JTT matrix-based model [22]. The bootstrap consensus tree inferred from 1000 replicates was taken to represent the evolutionary history of the taxa analyzed ( Figure 1). Overall phylogenetic relationships revealed that buffalo is more closely related to goat than Bos taurus, and sheep along with horse and camel ( Figure 2). Moreover, buffalo is distantly related to other mammals, including pig, dog, rat, mouse and zebra fish. A study conducted on cattle also revealed similar results as phylogenetic analysis of HSPA8 and HSPA1A indicted the high similarity between buffalo and sheep [51]. Previously, molecular characterization of HSP70.1 of goat have shown that buffalo, cattle, sheep, and goat share a common ancestor whereas human, horse and pig showed dissimilarities thus having different ancestors [52]. Moreover, Karan Fries cattle was evolutionary related to Bos taurus, Bos indicus, and buffalo, whereas distinctly related to rat and mouse [53].

Protein Structural Configuration
Three-dimensional (3-D) protein configurations of HSP in Bos taurus and buffalo was predicted. The analysis of 3-D secondary protein structures revealed the structural variations only in HSP90 and HSP40 of Bos taurus and buffalo, while other HSPs were similar or structurally related in both species. These findings confirm the results obtained in gene structure and phylogenetic analysis that HSPs are conserved proteins. Eukaryotic HSP90 proteins had a modular structure with an N-terminal ATP binding domain connected to a middle domain by a 'linker' of variable length. The N-terminal and middle domains together form a "split" ATPase site that is also the binding pocket for GA [54]. HSP40 can function as a molecular chaperone to bind non-native polypeptides. HSP40 pairs with HSP70 to achieve its function of protein folding, transport, and degradation functions. The J-domain of HSP40 may stimulate the ATPase activity of HSP70 and the C-terminal peptide-binding fragment of HSP40 can load the bound non-native polypeptide to HSP70 [55]. HSP70 and HSP40 are essential chaperons affecting the protein folding in the cell. Three-dimensional structural configurations of HSP40 in cattle and buffalo varied in number of amino acid residues and overall secondary structural attributes that may be associated with differential chaperon functions and stress responses in both species. Further investigations at the transcriptome level are required to corroborate these findings.

SNP Analysis
The present study revealed that HSP90AA1, HSP90AB1 and HSP90B1 are more conserved genes than TRP1 due to a higher number of non-synonymous SNPs in TRP1. Similarly, our findings indicate that HSP70.1, HSPA14, and HSPA1L are more conserved than other genes of HSP70 family. We observed a relatively higher ratio of non-synonymous to synonymous SNPs in the HSP70 gene (0.59) as compared to other genes of the HSP70 family. It indicates that HSP70 gene is under selection owing to its strong association with cellular thermotolerance [1]. Moreover, the expression dynamics of the HSP70 gene are used as a potential biomarker to reveal the level of heat stress and the ability of animals to withstand climate stress [56]. It is mainly attributed to the fact that HSP expression indicates cellular response and intensity of stress in animals [57]. At the transcription level, persuaded expression of HSP70 genes is coordinated as upstream elements at promoter region directly control the expression profiles of HSP70. Studies have shown a higher expression of HSP70 in Murrah and Tarai buffaloes [58] during the summer compared to other seasons (or thermoneutral zone). Moreover, increased expression of HSP70 in bovine lymphocytes has also been observed after heat stress [59].
Due to subsequent induction of HSP70 after heat shock and its association with physiological parameters (rectal temperature, pulse, and respiratory rates) of animals [59], serum HSP70 is considered a sensitive biomarker for heat stress management in livestock [58]. In addition to thermotolerance, studies have also reported other roles of the HSP family in animal physiology. In particular, SNPs identified in the 5 -UTR regions of HSP70 gene have been linked with different performance traits of livestock such as milk production, thermal stress and disease vulnerability [1,43]. Moreover, SNPs have also been identified in the HSP70 gene of beef cattle breeds (Angus, Brahman, and their crossbreds) with little effect on milk protein and milk fat [60]. Polymorphism (SNP) at the promoter region has been shown to alter the expression level of bovine HSP70.1 gene, and exhibited its significant association with heat tolerance and higher milk production in Frieswal cattle [43]. These findings indicate the putative effects of selective pressure (natural or artificial selection) on the HSP70 gene to ultimately favor animals with better thermotolerance, performance, and stress resilience. Our findings are in line with recent whole genome-sequencing studies on buffalo that reported evidence for the selection of genes related to social behavior, digestive physiology, fitness, and milk production [61].
HSPs are highly conserved in nature and play crucial role in stress stimulus and in several normal cellular functions. The genomic characterization of the HSP gene family and elucidation of their physiochemical properties performed in the present study would help in better understanding HSPs in buffalo. HSPs are highly conserved proteins and important from evolutionary point of view, so further investigations are warranted to study expression dynamics under various physiological conditions to elucidate their putative roles in thermotolerance and stress physiology in buffalo. In particular, non-synonymous SNPs and indels observed in the present study need to be further explored to detect their effect on protein function and subsequent physiological manifestations in buffalo. These findings might explain the better thermotolerance and fitness of buffalo compared to Bos taurus, particularly under tropical climate.

Conclusions
The present study revealed that the buffalo genome contains 64 genes of the HSP family in buffalo. These genes are widely dispersed across the buffalo genome. Multiple sequence alignment and phylogenetic analysis indicated the highly conserved consensus sequences and evolutionary relatedness of the HSP family. Sequence analysis revealed that the HSP90 gene family buffalo is more conserved in nature as compared to the HSP70 family that exhibited a quite higher ratio of non-synonymous substitutions. Structural variations in secondary structures of HSP40 and HSP90 were observed in buffalo and cattle. The findings of the present study provide insights into the genomic variations in the HSP gene family that could be used for a better understanding of the function of these genes and their further use for selective breeding of buffalo for better thermotolerance and stress resilience.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2073-4425/11/11/1388/s1, Figure S1: Tree with alignment view of HSP40. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Figure S2: Tree with alignment view of HSP70. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Figure S3: Tree with alignment view of HSP90. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Figure S4: Tree with alignment view of HSPD. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Figure S5: Tree with alignment view of HSP10. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Figure S6: Tree with alignment view of HSPH1. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Figure S7: Tree with alignment view of HSPB. Percentage of sequence is represented with different colors in Identity bar at top of the sequences green color indicates 100% identity, green-brown shows at least 30 and under 100% while red representing below 30 identity. Table S1: Accession no. of different heat-shock protein in different organisms used in phylogenetic analysis, Table S2: Accession no. of buffalo HSP90, Table S3: Accession no. of buffalo HSP70, Table S4: Accession no. of buffalo HSP40, Table S5: Accession no. of buffalo HSP family B, HSP10, HSPD, and HSPH1, Table S6