Genomic Survey of Heat Shock Proteins in Liriodendron chinense Provides Insight into Evolution, Characterization, and Functional Diversities

Heat shock proteins (HSPs) are conserved molecular chaperones whose main role is to facilitate the regulation of plant growth and stress responses. The HSP gene family has been characterized in most plants and elucidated as generally stress-induced, essential for their cytoprotective roles in cells. However, the HSP gene family has not yet been analyzed in the Liriodendron chinense genome. In current study, 60 HSP genes were identified in the L. chinense genome, including 7 LchiHSP90s, 23 LchiHSP70s, and 30 LchiHSP20s. We investigated the phylogenetic relationships, gene structure and arrangement, gene duplication events, cis-acting elements, 3D-protein structures, protein–protein interaction networks, and temperature stress responses in the identified L. chinense HSP genes. The results of the comparative phylogenetic analysis of HSP families in 32 plant species showed that LchiHSPs are closely related to the Cinnamomum kanehirae HSP gene family. Duplication events analysis showed seven segmental and six tandem duplication events that occurred in the LchiHSP gene family, which we speculated to have played an important role in the LchiHSP gene expansion and evolution. Furthermore, the Ka/Ks analysis indicated that these genes underwent a purifying selection. Analysis in the promoter region evidenced that the promoter region LchiHSPs carry many stress-responsive and hormone-related cis-elements. Investigations in the gene expression patterns of the LchiHSPs using transcriptome data and the qRT-PCR technique indicated that most LchiHSPs were responsive to cold and heat stress. In total, our results provide new insights into understanding the LchiHSP gene family function and their regulatory mechanisms in response to abiotic stresses.


Introduction
Plant growth and development is to a greater extend affected by various abiotic stresses, including cold, heat, salt, and drought stress [1][2][3]. Nonetheless, plants have evolved complex molecular mechanisms using a wide range of specialized genes and other transcription factors to adapt to prevailing adverse environmental factors [4,5]. These include the heat shock proteins (HSPs), a large family of molecular chaperones [6], highly conserved and widely distributed in prokaryotes and eukaryotes. Research has shown the HSP gene family to play important roles in regulating plant growth and enhancing abiotic stress resistance [7,8]. Under normal conditions, the HSP content has been estimated to constitute about 5% of the total cell protein, and synthesized in large quantities under environmental stress as a response mechanism especially under high-temperature stress, amounting up to 15% of the total cell protein [9]. In detail, the heat shock proteins participate in the folding and unfolding of polypeptide chains, assembly, transport, and the degradation of proteins [10]. Biochemical research on HSPs has shown a classification of five families based on the molecular weight, that is: HSP20, HSP60/GroEL, HSP70/DnaK, and quantitative real-time PCR (qRT-PCR) were used to analyze the expression patterns of LchiHSPs under heat and cold stress. This study will provide a firm base for further experimentations, a theoretical reference, and the comprehension of the biological functions of LchiHSPs in abiotic stresses.

Phylogenetic Relationship of the LchiHSP Genes in L. chinense
To better understand the evolution and the phylogenetic relationship of the HSP genes in L. chinense, A. thaliana, and O. sativa, we constructed a rooted phylogenetic tree using the full protein sequences ( Figure 1, Table S1). For better presentation purposes, we classified the HSP protein sequences according to previous classification methods. LchiHSP90 was divided into three subfamilies, of which group I contained three members (LchiHSP90-3, -4, -5), two in group II (LchiHSP90-1 and LchiHSP90-2), and group III (LchiHSP90-6 and LchiHSP90-7) ( Figure 1A). Twenty-three HSP70 sequences were classified into five groups (C, P, M, SSE, and ER). Group ER and group C had the least members (2 proteins) and the most members (12 proteins), respectively. Interestingly, group P, M, and SSE had the same number of proteins, three proteins, ( Figure 1B). Thirty LchiHSP20s were classified into 12 subfamilies based on the phylogeny and subcellular localization analyses. In detail, 10 genes localized in the cytoplasm (CIs), were classified as: 3 CIIs, 3 CIIIs, 0 CIVs, 1 CV, 1 CVI, and 0 CVII; 2 mitochondria (MIs) as 1 MIIs; 2 in the peroxisomes (Pos); 2 in the endoplasmic reticulum (ER); and 2 in the plastids (Ps). Surprisingly, three L. chinense HSP20 genes (HSP20-17, -18, and -19) did not cluster into any subfamily ( Figure 1C). In total, we concluded that most LchiHSP20s (18/28) belonged to CI-CVII, except three unclassified LchiHSP20s. These findings indicated that the cytoplasm might be the main functional region of the LchiHSP20s.

Evolution of HSP Genes from Algae to Angiosperms
To have a deeper understanding of the HSP gene family evolution, we searched the HSP gene family in 32 plants, including 2 algae, 2 mosses, 1 fern, 1 gymnosperm, 2 basal angiosperms, 17 dicotyledons, and 7 monocotyledons using the HMM and BLASTP search ( Figure 2, Table S1). We identified a total of 2483 HSP genes (312 HSP90s, 945 HSP70s, and 1226 HSP20s). Among the identified species, G. hirsutum and V. carteri had the most (160) and the least (24) HSP genes, respectively. Overall, the number of HSPs in the higher plants was higher than that in the lower plants. In contrast, algae and D. carota contained the HSP20 family more than the HSP70 and HSP90 family. In the magnolia, the protein sequences of the C. kanehirae (97) were greater than those of the L. chinense (60), to which we related to the fact that the C. kanehirae underwent two whole genome duplication events during evolution as compared to the L. chinense.

Evolution of HSP Genes from Algae to Angiosperms
To have a deeper understanding of the HSP gene family evolution, we searched the HSP gene family in 32 plants, including 2 algae, 2 mosses, 1 fern, 1 gymnosperm, 2 basal angiosperms, 17 dicotyledons, and 7 monocotyledons using the HMM and BLASTP search ( Figure 2, Table S1). We identified a total of 2483 HSP genes (312 HSP90s, 945 HSP70s, and 1226 HSP20s). Among the identified species, G. hirsutum and V. carteri had the most (160) and the least (24) HSP genes, respectively. Overall, the number of HSPs in the higher plants was higher than that in the lower plants. In contrast, algae and D. carota contained the HSP20 family more than the HSP70 and HSP90 family. In the magnolia, the protein sequences of the C. kanehirae (97) were greater than those of the L. chinense (60), to which we related to the fact that the C. kanehirae underwent two whole genome duplication events during evolution as compared to the L. chinense.

Gene Structure and Motif Analysis of the LchiHSP Genes
Gene structure analysis provides important evidence that supports the evolution of gene families. We investigated the gene structures and motif arrangements of the identified LchiHSPs (Figure 3). In the LchiHSP90 class, different genes showed large differences

Gene Structure and Motif Analysis of the LchiHSP Genes
Gene structure analysis provides important evidence that supports the evolution of gene families. We investigated the gene structures and motif arrangements of the identified LchiHSPs (Figure 3). In the LchiHSP90 class, different genes showed large differences in the number of introns (2-18) ( Figure 3A). The intron number varied greatly among different subfamilies. Group I had 2-3 introns, while groups II and groups III contained 14-18 introns. There were 0-3 introns in the cytoplasmic subfamily of L. chinense HSP90 members and 8-13 introns in the SSE subfamily, and two members (LchiHSP70-14 and LchiHSP70-19) contained no intron, and 9 had only 1 intron ( Figure 3B). The gene structure of LchiHSP20s was relatively conserved compared to LchiHSP90s and LchiHSP70s. Nonetheless, LchiHSP20-24 (4) and LchiHSP20-30 (12) had one to two introns differing from the rest of the LchiHSP20 members ( Figure 3C). having a motif range between 7 and 10. It is worth mentioning that in the cytoplasm subfamily, the member of LchiHSP all contained 10 motifs except LchiHSP70-3, 4, 21, and 22 ( Figure 4B). A further analysis of the 10 conserved motifs revealed that motifs 2, 3, 4, 6, 7, and 9 were the N-terminal ATP binding domains of LchiHSP70s, motifs 1, 8, and 10 were the C-terminal domains, and motif 5 was the subordinate binding domain. In the LchiHSP20 family, motifs 1, 2, and 5 were present in most members. The complete sequences of motifs 1, 2, and 3 together formed a highly conserved whole ACD domain that exhibits vital biological functions ( Figure 4C).  We used the online tool, MEME, to identify the conserved motif arrangement in LchiHSPs ( Figure 4). The motifs from the same subfamily were identical in number, type, and order. Ten conserved motifs were identified in each LchiHSP superfamily (Table S2). Motif six was assigned as the HATPase_c domain, based on the annotation information from the Pfam database. The other motifs were also shown to belong to the HSP90 conservative domain. In the LchiHSP90 family, motifs 1, 2, 4, and 6 were highly conserved as compared to the other classes ( Figure 4A). Protein motif 3 was contained in all 23 LchiHSP70 proteins. The subfamily SSE had the least number of motifs (4)(5), with the rest having a motif range between 7 and 10. It is worth mentioning that in the cytoplasm subfamily, the member of LchiHSP all contained 10 motifs except LchiHSP70-3, 4, 21, and 22 ( Figure 4B). A further analysis of the 10 conserved motifs revealed that motifs 2, 3, 4, 6, 7, and 9 were the Nterminal ATP binding domains of LchiHSP70s, motifs 1, 8, and 10 were the C-terminal domains, and motif 5 was the subordinate binding domain. In the LchiHSP20 family, motifs 1, 2, and 5 were present in most members. The complete sequences of motifs 1, 2, and 3 together formed a highly conserved whole ACD domain that exhibits vital biological functions ( Figure 4C).

Analysis of Cis-Element in LchiHSP Gene Promoters
The cis-acting elements in the promoter regions play an important role in the plant's response to abiotic stress. To gain further insight into the role of the cis-regulatory elements in LchiHSPs, we used the PlantCARE website to predict the 2 kb sequence of cisregulatory elements upstream of LchiHSPs (Figures 7 and S2, and Table S5). Here, we focused on hormones and abiotic stress-related cis-acting elements, including MeJA-responsive (CGTCA-motif/TGACG-motif), auxin-responsive (AuxRE-core/TGA-element), ethylene-responsive (ERE), gibberellin-responsive (TATC-box/P-box/GARE-motif), abscisic acid-responsive (ABRE), salicylic acid-responsive (TCA-element), wound responsive (WUN-motif), drought inducibility (MBS), anaerobic inducibility (ARE/GC-motif), low temperature-responsive (LTR), and defense and stress-responsive (W-box/STRE/TC-rich repeats). The analysis showed that the promoters of the LchiHSP genes contained different numbers of hormone and abiotic stress-related cis-acting elements. Among them, the ABRE, CGTCA-motif, and TGACG-motif accounted for the largest proportion in the hormone response category, while ARE and STRE accounted for the largest proportion in the stress response category. All the analyzed LchiHSPs except for LchiHSP90-1, LchiHSP70-7, -11, and -21 were shown to carry at least three cis-elements associated with the stress responses, illustrating that LchiHSPs are associated with the regulation of abiotic stresses. In addition, LchiHSP20-22 had the highest number of cis-acting elements (32) in responding to hormones and stress, which leads to the speculation that it may be involved in regulating multiple abiotic stresses (Figure 7).

Analysis of Cis-Element in LchiHSP Gene Promoters
The cis-acting elements in the promoter regions play an important role in the plant's response to abiotic stress. To gain further insight into the role of the cis-regulatory elements in LchiHSPs, we used the PlantCARE website to predict the 2 kb sequence of cis-regulatory elements upstream of LchiHSPs (Figures 7 and S2, and Table S5). Here, we focused on hormones and abiotic stress-related cis-acting elements, including MeJAresponsive (CGTCA-motif/TGACG-motif), auxin-responsive (AuxRE-core/TGA-element), ethylene-responsive (ERE), gibberellin-responsive (TATC-box/P-box/GARE-motif), abscisic acid-responsive (ABRE), salicylic acid-responsive (TCA-element), wound responsive (WUN-motif), drought inducibility (MBS), anaerobic inducibility (ARE/GC-motif), low temperature-responsive (LTR), and defense and stress-responsive (W-box/STRE/TC-rich repeats). The analysis showed that the promoters of the LchiHSP genes contained different numbers of hormone and abiotic stress-related cis-acting elements. Among them, the ABRE, CGTCA-motif, and TGACG-motif accounted for the largest proportion in the hormone response category, while ARE and STRE accounted for the largest proportion in the stress response category. All the analyzed LchiHSPs except for LchiHSP90-1, LchiHSP70-7, -11, and -21 were shown to carry at least three cis-elements associated with the stress responses, illustrating that LchiHSPs are associated with the regulation of abiotic stresses. In addition, LchiHSP20-22 had the highest number of cis-acting elements (32) in responding to hormones and stress, which leads to the speculation that it may be involved in regulating multiple abiotic stresses (Figure 7).

Three-Dimensional Structures and Protein-Protein Network Analysis of LchiHSP Proteins
The analysis of the three-dimensional structure of the proteins plays a crucial role in understanding their biological functions, thus we predicted the three-dimensional struc ture of the identified LchiHSP proteins using SWISS-MODEL homology modeling. We selected 52 LchiHSP models based on a target consistency and template greater than 30% ( Figure 8, Table S6). All the proteins were modeled and visualized using the Chimera soft ware based on the template similarity, that is: LchiHSP90 (5uls.1.A), LchiHSP70 (7sqc.391.A), and LchiHSP20 (1gme.2.A). It was found that the structure of the proteins in the same template were identical, except for the fat that LchiHSP90-4 had a large differ ence compared to the LchiHSP90 family. Additionally, the obtained GMQE values ranged from 0.26 (LchiHSP20-15) to 0.82 (LchiHSP70-23).
To further investigate the biological functions of the LchiHSP genes, we constructed a protein interaction network using the Online String database, based on the homologou relationship between LchiHSP proteins and P. trichocarpa. We noted interactions among 33 LchiHSPs, collectively forming 213 interactions (Table S7). In detail, LchiHSP70-7 (31 had the largest number of interacting proteins HSP, followed by LchiHSP90-4 (21) and LchiHSP90-7 (19), implying that LchiHSP70-7, LchiHSP90-4, and LchiHSP90-7 interac with various protein to fully exhibit their functions (Figure 9).

Three-Dimensional Structures and Protein-Protein Network Analysis of LchiHSP Proteins
The analysis of the three-dimensional structure of the proteins plays a crucial role in understanding their biological functions, thus we predicted the three-dimensional structure of the identified LchiHSP proteins using SWISS-MODEL homology modeling. We selected 52 LchiHSP models based on a target consistency and template greater than 30% (Figure 8, Table S6). All the proteins were modeled and visualized using the Chimera software based on the template similarity, that is: LchiHSP90 (5uls.1.A), LchiHSP70 (7sqc.391.A), and LchiHSP20 (1gme.2.A). It was found that the structure of the proteins in the same template were identical, except for the fat that LchiHSP90-4 had a large difference compared to the LchiHSP90 family. Additionally, the obtained GMQE values ranged from 0.26 (LchiHSP20-15) to 0.82 (LchiHSP70-23).
To further investigate the biological functions of the LchiHSP genes, we constructed a protein interaction network using the Online String database, based on the homologous relationship between LchiHSP proteins and P. trichocarpa. We noted interactions among 33 LchiHSPs, collectively forming 213 interactions (Table S7). In detail, LchiHSP70-7 (31) had the largest number of interacting proteins HSP, followed by LchiHSP90-4 (21) and LchiHSP90-7 (19), implying that LchiHSP70-7, LchiHSP90-4, and LchiHSP90-7 interact with various protein to fully exhibit their functions (Figure 9). The α-helix, β-strand, and random coil are marked by red, yellow, and green, respectively. The structural images were generated using the Chimera software. The lower right corner of each superfamily is the overlay of all proteins in the same template.  The α-helix, β-strand, and random coil are marked by red, yellow, and green, respectively. The structural images were generated using the Chimera software. The lower right corner of each superfamily is the overlay of all proteins in the same template.

Expression Analysis of LchiHSPs under Heat and Cold Stress
To further understand functions and the expression patterns of the LchiHSPs during abiotic stresses, we used the transcriptome data from the leaf tissue under heat and cold stress (Figures 10 and 11, Table S8). Generally, many stress-related genes were differentially expressed at different time points, suggesting the importance of the HSPHSPs during abiotic stresses. In detail, all the LchiHSPSP90 genes were significantly up-regulated under heat stress and reached an expression peak during the first and third hour ( Figure 10A). Most of the LchiHSP70s were significantly expressed at 1 h of heat stress treatment in varying degrees, whereas LchiHSP70-1, -2, and -14 were significantly down-regulated ( Figure 10B). Furthermore, all the LchiHSP20s showed a higher gene expression under a heat stress treatment as compared to LchiHSP90s and LchiHSP70s. Specifically, six genes (LchiHSP20-1, -17, -18, -19, -24, and -30) were expressed the highest after 6 h of the heat treatment ( Figure 10C). in varying degrees, whereas LchiHSP70-1, -2, and -14 were significantly down-regulated ( Figure 10B). Furthermore, all the LchiHSP20s showed a higher gene expression under a heat stress treatment as compared to LchiHSP90s and LchiHSP70s. Specifically, six genes (LchiHSP20-1, -17, -18, -19, -24, and -30) were expressed the highest after 6 h of the heat treatment ( Figure 10C).
In cold stress, the LchiHSP90 genes were slightly upregulated at different time periods ( Figure 11A). In detail, 18 LchiHSP70s were responsive to the cold treatment. Specifically, 16 genes had elevated expression patterns while the remaining had not. Nearly 50% of the LchiHSP70s genes were expressed the highest at 3 d, indicating that the LchiHSP70s are long period cold regulatory genes ( Figure 11B). Likewise, most of the LchiHSP20 genes (18/30) reached their highest expression levels at 3 d. However, during the 0-12 h time period, the expression patterns of LchiHSP20-3 and LchiHSP20-6 I were marked by an expression upregulation initially and a downregulation later, reaching their expression peak at 1 h of cold stress treatment ( Figure 11C).
In conclusion, the majority of the LchiHSP genes responded to cold and heat stress at different time points and different levels. Based on the transcriptome results, we selected some LchiHSP genes for a further validation by qRT-PCR. The results showed that the LchiHSP90-5, LchiHSP70-3, and LchiHSP70-4 were induced to different degrees under cold and heat stress. Furthermore, we noted that LchiHSP20-5 and LchiHSP20-13 were significantly up-regulated after a heat stress treatment and much lower in a cold stress treatment ( Figure S3).  In cold stress, the LchiHSP90 genes were slightly upregulated at different time periods ( Figure 11A). In detail, 18 LchiHSP70s were responsive to the cold treatment. Specifically, 16 genes had elevated expression patterns while the remaining had not. Nearly 50% of the LchiHSP70s genes were expressed the highest at 3 d, indicating that the LchiHSP70s are long period cold regulatory genes ( Figure 11B). Likewise, most of the LchiHSP20 genes (18/30) reached their highest expression levels at 3 d. However, during the 0-12 h time period, the expression patterns of LchiHSP20-3 and LchiHSP20-6 I were marked by an expression upregulation initially and a downregulation later, reaching their expression peak at 1 h of cold stress treatment ( Figure 11C).
In conclusion, the majority of the LchiHSP genes responded to cold and heat stress at different time points and different levels. Based on the transcriptome results, we selected some LchiHSP genes for a further validation by qRT-PCR. The results showed that the LchiHSP90-5, LchiHSP70-3, and LchiHSP70-4 were induced to different degrees under cold and heat stress. Furthermore, we noted that LchiHSP20-5 and LchiHSP20-13 were significantly up-regulated after a heat stress treatment and much lower in a cold stress treatment ( Figure S3).

Discussion
Recent studies have shown that the heat shock proteins are abundantly expressed under stress conditions such as a high temperature, drought, low temperature, and salt stress [24,29,33,34]. They maintain plant homeostasis by participating in various biological functions such as protein synthesis, folding, cellular, and cellular localization, and protein degradation [6,10,17]. In terms of regulation, heat shock proteins are involved in many biological functions, such as protein transmembrane transport and target protein degradation. However, with the continuous progress and wide application of genome sequencing technology, more and more plant HSPs genome data have been published recently, including Arabidopsis [19], rice [20], tea plant [35], and P.trichocarpa [21]. Nonetheless, the L. chinense HSP gene family and its biological functions had not yet been characterized.
In this study, a total of 60 LchiHSP genes were identified in the L. chinense genome, including 7 LchiHSP90s, 23 LchiHSP70s, and 30 LchiHSP20s (Table 1). The L. chinense is considered to be a basal angiosperm [33], suggesting the reason of its small size LchiHSP gene family (Figure 2). It is also possible that the differences in the number of HSP genes are due to differences in the genome size and gene duplication during the plant evolution. Gene duplication events play an important role in the expansion of plant gene families [12,15,16]. The number of HSP genes in the L. chinense genome was lower than that in C. kanehirae due to the fact that there was only a single WGD event in L. chinense and two in C. kanehirae [36,37]. Previous studies have demonstrated that the naming of HSPs is based on their subcellular localization and clustering depending phylogenetic relationship [35,38]. In this research, we observed that the classification of the LchiHSPs was to a greater extent consistent with the protein localization, and proteins in the same family were also clustered together by the phylogenetic analysis ( Figure 1). Interestingly, we also noted that the M subfamily members were adjacent to the P subfamily members in LchiHSP20s, implying that they may have undergone a tighter differentiation period. Additionally, as there were no LchiHSP20s in the CIV and CVII subfamilies, we related this phenomenon to gene loss during evolution ( Figure 1C).
Gene structure plays an important role in the evolution of multigene families, in which the development of introns is an important process in genome evolution and an adaptive measure in species evolution [39]. The lower the number of introns, the greater

Discussion
Recent studies have shown that the heat shock proteins are abundantly expressed under stress conditions such as a high temperature, drought, low temperature, and salt stress [24,29,33,34]. They maintain plant homeostasis by participating in various biological functions such as protein synthesis, folding, cellular, and cellular localization, and protein degradation [6,10,17]. In terms of regulation, heat shock proteins are involved in many biological functions, such as protein transmembrane transport and target protein degradation. However, with the continuous progress and wide application of genome sequencing technology, more and more plant HSPs genome data have been published recently, including Arabidopsis [19], rice [20], tea plant [35], and P.trichocarpa [21]. Nonetheless, the L. chinense HSP gene family and its biological functions had not yet been characterized.
In this study, a total of 60 LchiHSP genes were identified in the L. chinense genome, including 7 LchiHSP90s, 23 LchiHSP70s, and 30 LchiHSP20s (Table 1). The L. chinense is considered to be a basal angiosperm [33], suggesting the reason of its small size LchiHSP gene family (Figure 2). It is also possible that the differences in the number of HSP genes are due to differences in the genome size and gene duplication during the plant evolution. Gene duplication events play an important role in the expansion of plant gene families [12,15,16]. The number of HSP genes in the L. chinense genome was lower than that in C. kanehirae due to the fact that there was only a single WGD event in L. chinense and two in C. kanehirae [36,37]. Previous studies have demonstrated that the naming of HSPs is based on their subcellular localization and clustering depending phylogenetic relationship [35,38]. In this research, we observed that the classification of the LchiHSPs was to a greater extent consistent with the protein localization, and proteins in the same family were also clustered together by the phylogenetic analysis ( Figure 1). Interestingly, we also noted that the M subfamily members were adjacent to the P subfamily members in LchiHSP20s, implying that they may have undergone a tighter differentiation period. Additionally, as there were no LchiHSP20s in the CIV and CVII subfamilies, we related this phenomenon to gene loss during evolution ( Figure 1C).
Gene structure plays an important role in the evolution of multigene families, in which the development of introns is an important process in genome evolution and an adaptive measure in species evolution [39]. The lower the number of introns, the greater the ability of the plant to adapt to different developmental processes and environmental stimuli. The analysis of the gene structure of the LchiHSP90 family members showed differences which we classified into three subfamilies and were consistent with the phylogenetic evolution results ( Figure 3A). There was an intronic deletion in LchiHSP20 genes that has been shown to associate with the induction and rapid processing of proteins after a stress response, and this deletion has also been shown to be responsible for the rapid expression of the LchiHSP20 genes under various stress conditions ( Figure 3C). Gene families and classes are also clustered based on the similarity in the distribution of the conserved motif and research has shown that the proteins in the same group display similar functions. In the research, we noted that the HATPase_c domain composed of motif 1 as the ADP/ATP binding site and had ATPase activity [19,40]. The conserved domain of LchiHSP90, composed of the remaining nine motifs, we speculated to have a key role in maintaining the full movement of the HATPase_c domain. Three highly conserved protein motifs formed the ACD domain [35] and the structural basis for the biological function of LchiHSP20s were detected in most LchiHSP20s, consistent with earlier findings in pepper and switchgrass. We concluded that the diversity of the LchiHSP family is likely driven by environmental selection pressures and the continued evolution of plants ( Figure 4).
The cis-regulatory elements are important molecular switches involved in the transcriptional regulation of the gene expression and the control of various biological processes, including the stress responses, hormonal responses, and developmental processes. The cis-regulatory element prediction has been widely used to explore the function of HSPs in several species. In this study, various stress-responsive and phytohormone-responsive related cis-elements were found in the promoter region of the LchiHSPs (Figure 7), in which the phytohormone-responsive classes accounted for a higher proportion. Phytohormones are a large group of endogenous small, low molecular weight molecules that not only regulate plant growth and development at low concentrations but can also act as signaling molecules involved in plant responses to environmental stresses. Phytohormones such as auxin, ethylene, jasmonic acid, salicylic acid (SA), abscisic acid (ABA), cytokinins (CK), and gibberellins (GAs) play important roles in plant development and stress responses. In higher plants, phytohormones are involved in heat stress signal transduction and regulating the LchiHSP expression under heat and cold stress. In addition, the cis-regulatory elements in the promoters are involved in the crosstalk of different stress signals in the gene expression, constituting a gene expression cascade in the abiotic stress response, controlling the molecular processes of the stress response and stress tolerance. Therefore, we observed an interaction between LchiHSPs and phytohormones during environmental stresses, exhibiting important roles in regulating the growth, development, and pressure responses in L. chinense. However, future analysis will require further analysis of how the L. chinense HSP genes exert their functions. Previous studies have proved that HSP protein interactions regulates plant growth, development, and stress responses [19,40]. HSP90 and HSP70 interact with each other depending on their co-chaperones to form a complex that regulates external stress. In the present study, we also found that interactions between the LchiHSP90 and LchiHSP70 family proteins are ubiquitous and may play an important role in L. chinense to high-and low-temperature stress. In addition, some LchiHSP20s were also found to interact with some LchiHSP70s, similar to previous reports on pea and tobacco [41,42]. Therefore, they may act synergistically in stress responses in L. chinense.
Gene expression patterns are highly correlated with the function. Therefore, to understand the role of the LchiHSPs under heat and cold stress, we selected some genes from each superfamily for a further analysis based on the transcriptome (Figures 10 and 11). The results showed that the expression of all the LchiHSPs was significantly increased after the heat stress treatment and gradually decreased with the extension of the treatment period, in which LchiHSP20s showed a higher level of gene expression in response to the heat stress treatment compared to the other two superfamilies. Under cold stress, most genes reached the gene expression peak at 3 d, indicating that 3 d is an important time point for cold stress acclimation in L. chinense. Furthermore, these gene resources can be provided for plants through gene editing and genetic transformation [43].

Identification and Characterization of HSP Genes in L. chinense
Genome and chromosome annotation information of L. chinense were download from the genome database (https://hardwoodgenomics.org/Genome-assembly/263042 0, accessed on 15 September 2022), the protein sequences of Arabidopsis thaliana HSPs were extracted from TAIR (https://www.arabidopsis.org/, accessed on 15 September 2022) [19,44,45]. Three major heat shock protein family genes were proposed for this study. HSP90, HSP70, and HSP20 family genes were directly extracted from the Pfam database (http://pfam.xfam.org/, accessed on 15 September 2022) using identifier numbers PF00183, PF00012, and PF00011 and applied the E-value < 10 −5 as the criteria for selection. In addition, the BLASTP search of HSPs using the A. thaliana HSPs amino acid sequence and the L. chinense genome database was performed to identify the L. chinense HSP gene family members. The NCBI-CDD database (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi, accessed on 15 September 2022) and SMART website (http://smart.embl-heidelberg.de/, accessed on 15 September 2022) were used to predict the conserved domain of HSPs in L. chinense [46].

Systematic Evolution of HSP Gene Family
To investigate the evolutionary patterns of the HSP gene members in L. chinense, the L. chinense, Arabidopsis, and rice HSP family protein sequences were collected, and multiple sequence alignments were performed in CLUSTALW (https://www.genome.jp/ tools-bin/clustalw, accessed on 15 September 2022). A phylogenetic tree was constructed using MEGA-X software with the neighbor-joining method, and the parameter bootstrap value was set to 1000. The evolutionary tree is beautified through the EvolView website (http://www.evolgenius.info/evolview, accessed on 15 September 2022) [48]. OrthoFinder (v2.5.4) was used to explore the HSP gene evolutionary status of all the 32 species with the FASTA format full-length sequence folder as the parameter '-f' and used the MSA gene tree inference and DIAMOND software to blast quickly. Then, we use MAFFT to perform the multiple sequence alignment, and the IQ-TREE program was used to construct a species tree with the maximum inference likelihood [49].

Gene Structure and Conserved Motif Analysis of LchiHSPs
The exon and intron structure information of the members of the LchiHSPs gene family was obtained from the GFF3 annotation file of the L. chinense genome, and the conserved protein motifs were analyzed using the online website MEME tool (http://meme-suite. org/tools/meme, accessed on 15 September 2022). The LchiHSPs gene structures and conserved motifs were visualized by the TBtools software [50]; the MEME analysis parameters included a minimum width set to 5 and a full set to 50, the number of conserved motifs was set to 10, and the other parameters were set to the default values.

Chromosome Distribution, Gene Duplication Events, and Collinearity Analysis
The chromosomal location information of all the members of the L. chinense HSP gene family was downloaded from the L. chinense genome database. Except for 4 members (LchiHSP70-23, LchiHSP20-28, -29, -30) that are not identified on the chromosome, the remaining 56 members can be used to make a chromosome distribution map through the TBtools tool according to their positioning information.
MCscanX was used to study the tandem duplication and segmental duplication events of LchiHSPs between different chromosomes of L. chinense. The Ka/Ks Calculator function in the TBtools software was used to calculate the duplicate gene pairs' Ka/Ks value. The L. chinense genome, Arabidopsis genome, rice genome, maize genome, and Populus genome were analyzed and visualized by TBtools for interspecies collinearity analysis.

Cis-Regulatory Element Analysis of LchiHSP Gene Promoters
The upstream 2000 bp promoter sequences of the LchiHSPs were extracted from the L. chinense genome, and the cis-acting elements were analyzed by the PlantCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html, accessed on 15 September 2022), and the results were visualized by TBtools. The number of cis-acting factors detected is displayed using a heatmap.

Three-Dimensional (3D) Protein Structures Prediction of LchiHSPs
The three-dimensional structure of the LchiHSP proteins determine the specific functions a protein and is essential for understanding the protein functions. The PDB file of the LchiHSP protein was downloaded from the SWISS-MODEL website (https://swissmodel. expasy.org/, accessed on 15 September 2022) based on the homology modelling method according to the sequences of LchiHSP. The PDB files were visualized using Chimera software (http://www.cgl.ucsf.edu/chimera/, accessed on 15 September 2022).

Protein Interaction Network Analysis and Visualization
The protein interaction network of the LchiHSP gene family was predicted in the database STRING (http://string-db.org, accessed on 15 September 2022) based on the protein interaction network information of P. trichocarpa and visualized by Cytoscape software (v 3.9.1).

Expression Analysis of LchiHSP Genes
This study cultured 3-month-old somatic embryo seedlings in the greenhouse (16 h day/8 h night, 24 • C, and 70% relative humidity). Heat temperature (40 • C) and low temperature (4 • C) were used to simulate heat and cold stress, respectively. The control group received an equal amount of water in the matrix at 22 • C. Five biological replicates were set per treatment at each time point. After treatment for 1 h, 3 h, 6 h, 12 h, 1 day, and 3 days, respectively, referred to in the previous reports by Li et al. [51], the qRT-PCR primers were designed for selected sequences using the Primer3plus online website (https: //www.primer3plus.com/, accessed on 15 September 2022) (Table S9). Using LchiActin as the internal reference, each PCR was set to three biological replicates. The relative expression was calculated according to using the 2 -∆∆Ct method, and the significance analysis (* p < 0.05, ** p < 0.01) was performed using Student's t-test in SPSS 26 software, and the graphs of the data were constructed using GraphPad Prism.

Conclusions
In conclusion, identification and bioinformatics analysis of the LchiHSP genes were performed. The physicochemical properties, comparative phylogeny, gene duplication events, gene structure, cis-acting elements, and protein structure function of LchiHSPs were systematically revealed. The expression patterns of LchiHSPs under cold and heat stress were explored, and this study provides a good basis for the subsequent exploration of the functional properties of the LchiHSPs gene family members.