In Silico Characterization and Expression Proﬁles of Heat Shock Transcription Factors (HSFs) in Maize ( Zea mays L.)

: Heat shock transcription factors (HSFs) regulate many environmental stress responses and biological processes in plants. Maize ( Zea mays L.) is a major cash crop that is grown worldwide. However, the growth and yield of maize are affected by several adverse environmental stresses. Therefore, investigating the factors that regulate maize growth and development and resistance to abiotic stress is an essential task for developing stress-resilient maize varieties. Thus, a comprehensive genome-wide identiﬁcation analysis was performed to identify HSFs genes in the maize genome. The current study identiﬁed 25 ZmHSFs , randomly distributed throughout the maize genome. Phylogenetic analysis revealed that ZmHSFs are divided into three classes and 13 sub-classes. Gene structure and protein motif analysis supported the results obtained through the phylogenetic analysis. Segmental duplication is shown to be responsible for the expansion of ZmHSFs . Most of the ZmHSFs are localized inside the nucleus, and the ZmHSFs which belong to the same group show similar physio-chemical properties. Previously reported and publicly available RNA-seq analysis revealed a major role of class A HSFs including ZmHSFA-1a and ZmHSFA-2a in all the maize growth stages, i.e., seed, vegetative, and reproductive development. Under abiotic stress conditions (heat, drought, cold, UV, and salinity), members of class A and B ZmHSFs are induced. Gene ontology and protein–protein interaction analysis indicated a major role of ZmHSFs in resistance to environmental stress and regulation of primary metabolism. To summarize, this study provides novel insights for functional studies on the ZmHSFs in maize breeding programs.


Introduction
In recent years, there has been an increasing trend to focus on the responses of plants to abiotic stresses due to global climate change [1]. Particularly, research has been focused on plant heat stress (HS) tolerance mechanisms, as higher temperatures have a negative effect on plant growth and production [2,3]. Although plants are susceptible to HS throughout their lifespan, reproductive tissues are specifically characterized by the sustained activation of several HS-memory genes through methylation of the target genomic region [32]. The transcripts of HSFA2 are undetectable under normal conditions. However, after HS, the HSFA2 becomes the most strongly induced HSF in plants [18,33]. Yoshida et al. reported that the overexpression of HSFA3 improves plant thermotolerance, while the T-DNA mutants showed reduced thermotolerance [34]. Lin et al. reported that HSFA2, HSFA4a, and HSFA7a are essential for HSR and cytoplasmic protein response. HSFs have also been characterized in several crop plants [35]. It was reported that OsHSFA4a and its homolog in wheat TaHSFA4a, confers cadmium tolerance to plants [36]. The expression of TaHSFA2-10 is induced in response to HS, oxidative stress, salicylic acid, and its overexpression improves plant thermotolerance [37]. In addition, HSFs are also involved in the regulation of growth and development in Arabidopsis thaliana [16]. For example, HSFA9 is expressed specifically during embryogenesis and maturation in Arabidopsis seeds [38]. Albihlal et al. [39] reported that in Arabidopsis, at least 85 development-associated genes are controlled by HSFA1b [39]. The authors proposed that the HSFA1b allows plants to adjust growth and development under continuously varying environments by transducing external stimuli to stress-associated and development-related genes.
The HSF gene family has been characterized in several plant species, including Arabidopsis thaliana [20], Oryza sativa [40], Zea mays [41], Glycine max [42], Populus trichocarpa [43], Solanum lycopersicum [44], Brachypodium distachyon [45], and Triticum aestivum [46]. However, the role of HSFs in plant growth and development and in responses to stresses other than HS, is not very well understood in maize. Computational biology approaches provide a convenient and reliable platform upon which further wet-lab studies could be carried out. Here, we perform an extensive in silico analysis of maize HSFs to gain better insights into the genomic distribution, phylogeny, gene duplication history, gene structure and protein motif, physio-chemical properties, gene annotation, protein networks, and expression profiling of maize HSFs in growth and development and tolerance to multiple abiotic stresses.

Sequence Retrieval
The protein sequences of Zea mays HSFs were extracted from PLAZA 4.5 (https: //bioinformatics.psb.ugent.be/plaza/versions/plaza_v4_5_monocots/, accessed on 1 August 2021) and Ensembl Plants (https://plants.ensembl.org/index.html, accessed on 1 August 2021) databases. For this, the BLASTP search was performed using the Arabidopsis (AT4G17750, AT2G26150, AT5G03720, AT4G36990, and AT3G24520) and Zea mays (GRMZM2G115456, GRMZM2G002131, GRMZM2G086880) HSFs against maize genome Zm-B73-REFERENCE-NAM-5.0, as a query to obtain a putative list of HSFs in the maize genome using default parameters (e-value ≤ 10 −5 and identity % = 80%). These sequences were checked for the presence of DNA binding domain (DBD) and oligomerization domain (OD) through EMBL-EBI employing the hidden Markov model (HMM) (https://www.ebi.ac.uk/Tools/hmmer/search/phmmer, accessed on 1 August 2021) and SMART (http://smart.embl-heidelberg.de/, accessed on 1 August 2021) tools [47]. The coiled-coil structure, which is a property of OD was predicted using MARCOIL [48]. After carefully analyzing the sequence, the proteins that lacked DBD and/or OD (HR-A/B region) were removed. In addition, the redundant proteins that were the product of a single gene were also discarded from further analysis. Finally, a total of 25 maize HSFs genes were selected for further analysis.

Sequence Alignment and Phylogenetic Analysis
The protein sequences of Arabidopsis thaliana, Oryza sativa, Brachypodium distachyon, and Sorghum bicolor were aligned with Zea mays HSFs through Clustal W [53]. Phylogenetic analysis was performed with the neighbor-joining method implemented in MEGA7.0 and tests were carried out with 1000 bootstrap replicates [54].

Chromosomal Distribution
Based on their initial positions on the maize genome, the HSF genes were named, and a chromosomal graph was constructed using Tbtools.

Expression Profiling of HSF Genes
The RNA-seq data utilized in the current study was retrieved from the maize MaizeGDB database (https://qteller.maizegdb.org/genes_by_name_B73v4.php, accessed on 1 August 2021). Previously, a comprehensive maize gene expression analysis was performed by Stelpflug et al. [59], used in the current study. We analyzed HSF profiles in 3 different growth stages (seed, vegetative, and reproductive) and across 20 different tissues (embryo, endosperm, whole seed, primary root, tap root, whole root, stem and shoot apical meristem, immature leaves, tip of stage 2 leaf, mature leaf tissue, pooled leaves, topmost leaves, vegetative meristem and surrounding tissues, immature tassel, meiotic tassel, anthers, mature pollen, mature female spikelet, pre-pollination cob, immature cob, and silks) (Table S4). Furthermore, the expression patterns were investigated at different timescales in a particular developmental stage to get an overview of the spatiotemporal expression of HSFs. In addition, the expression of ZmHSFs was also analyzed under abiotic (heat, drought, salinity, cold, UV) stress conditions. For the construction of the heatmap, FPKM values were used, which are already available on MaizeGDB. The heatmap was constructed using Tbtools.

Protein 3D Structure, Network Interaction, and Gene Ontology Analysis
The three-dimensional (3D) structure of maize HSFs was predicted through AlphaFold (https://www.alphafold.ebi.ac.uk/, accessed on 1 August 2021) [60]. For this, the protein IDs were entered into the search bar and the structures were obtained. The protein interaction network analysis was performed using the STRING database (https://string-db.org/, accessed on 1 August 2021) using default parameters, i.e., sequences showing more than 40% identity in the database were included for interaction networking [61]. The net-  [62]. Gene ontology annotation analysis was performed by uploading the gene IDs of ZmHSFs to the GENE ONTOLOGY RESOURCE (http://geneontology.org/, accessed on 1 August 2021) [63].

Identification and Chromosomal Distribution of Maize HSFs
With the availability of the genomic sequences of the number of plant species, including maize, it is now possible to obtain the protein sequences of all the HSF members. In the present study, a total of 25 HSFs were identified from the maize genome ( Figure 1; Table 1). All the HSF proteins were surveyed for the presence of DBD and OD through EMBL-EBI, employing HMM. Furthermore, SMART was used to search the HSF-DBD to check the accuracy of the results. After discarding redundant sequences, 25 ZmHSFs were selected for analysis. These HSFs were named based on their chromosomal locations (ZmHSF-01 to ZmHSF-25) ( Table S1). The characteristics of maize HSF genes are presented in Table 1. All the HSFs were mapped on the chromosomes of maize ( Figure 1). The maize genome was shown to possess HSF genes on all of its chromosomes, though the number of HSFs between different chromosomes varied considerably. Chromosome 1 had a maximum of 6 HSFs genes, whereas a single HSF gene copy was localized on each of chromosomes 4, 6, and 10. On the other hand, chromosomes 2, 3, 7, and 9 harbor two gene copies each, while three gene copies were recognized on each of chromosomes 5 and 8. Except for ZmHSF01, ZmHSF02, ZmHSF12, ZmHSF16, and ZmHSF23, all the other HSF genes were present on the lower arm of the chromosomes.

Protein 3D Structure, Network Interaction, and Gene Ontology Analysis
The three-dimensional (3D) structure of maize HSFs was predicted through Al-phaFold (https://www.alphafold.ebi.ac.uk/, accessed on 1 August 2021) [60]. For this, the protein IDs were entered into the search bar and the structures were obtained. The protein interaction network analysis was performed using the STRING database (https://stringdb.org/, accessed on 1 August 2021) using default parameters, i.e., sequences showing more than 40% identity in the database were included for interaction networking [61]. The network interaction file was downloaded and visualized using Cytoscape V. 3.8.2 (https://cytoscape.org/, accessed on 1 August 2021) [62]. Gene ontology annotation analysis was performed by uploading the gene IDs of ZmHSFs to the GENE ONTOLOGY RE-SOURCE (http://geneontology.org/, accessed on 1 August 2021) [63].

Identification and Chromosomal Distribution of Maize HSFs
With the availability of the genomic sequences of the number of plant species, including maize, it is now possible to obtain the protein sequences of all the HSF members. In the present study, a total of 25 HSFs were identified from the maize genome ( Figure 1; Table 1). All the HSF proteins were surveyed for the presence of DBD and OD through EMBL-EBI, employing HMM. Furthermore, SMART was used to search the HSF-DBD to check the accuracy of the results. After discarding redundant sequences, 25 ZmHSFs were selected for analysis. These HSFs were named based on their chromosomal locations (ZmHSF-01 to ZmHSF-25) ( Table S1). The characteristics of maize HSF genes are presented in Table 1. All the HSFs were mapped on the chromosomes of maize ( Figure 1). The maize genome was shown to possess HSF genes on all of its chromosomes, though the number of HSFs between different chromosomes varied considerably. Chromosome 1 had a maximum of 6 HSFs genes, whereas a single HSF gene copy was localized on each of chromosomes 4, 6, and 10. On the other hand, chromosomes 2, 3, 7, and 9 harbor two gene copies each, while three gene copies were recognized on each of chromosomes 5 and 8. Except for ZmHSF01, ZmHSF02, ZmHSF12, ZmHSF16, and ZmHSF23, all the other HSF genes were present on the lower arm of the chromosomes.

Phylogenetic Analysis and Classification of Maize HSFs
In present study, the evolutionary relationship among AtHSFs, OsHSFs, SbHSFs, BdHSFs, and ZmHSFs was explored. A total of 118 HSFs were divided into three classes and 15 sub-classes based on the phylogenetic tree ( Figure 2; Table S2). Variation in HSF gene number was observed among different plant species and between sub-groups ( Table 2). For example, Arabidopsis thaliana contains 21 HSFs (15 HSFAs, 5 HSFBs, and 1 HSFC), the Oryza sativa possess 25 HSFs (13 HSFAs, 8 HSFBs, and 4 HSFCs), Zea mays harbors 25 HSFs in its genome (15 HSFAs,7 HSFBs, and 3 HSFCs), Brachypodium distachyon 24 HSFs (14 HSFAs, 7 HSFBs, and 4 HSFCs), while Sorghum bicolor contains 23 HSFs (12 HSFAs, 7 HSFBs, and 4 HSFCs) ( Table 2). Results indicated that maize HSFs were close to sorghum HSFs. Similarly, HSFs of rice were closer to Brachypodium HSFs, which is in line with the botanical classification among monocots. In contrast to Arabidopsis, sub-class A1 contains fewer HSFs in monocots ( Table 2). On the other hand, sub-class A2 appears to have expanded in monocots. There was no ortholog of Arabidopsis HSFA9, HSFB3 in monocots that might reflect the loss of these sub-groups in family Poaceae ( Figure 2, Table 2). Contrarily, this could also signify the gain of these sub-groups in dicots. A higher number of class C HSFs was observed in monocots.

Gene Duplication Analysis and Evolutionary Rate Calculation
In the present study, a total of 18 (18/25; 72%) maize HSF genes were shown to be duplicated (Table 3). Further, only one pair of a gene (ZmHSF-01/Zm-HSF-04) appeared to be tandemly duplicated, which was recognized on chromosome number 1 ( Figure 3). The rest of duplicated genes were all segmentally duplicated, with eight different clusters containing 16 genes. These genes were recognized on chromosomes 1-9. Moreover, the molecular evolutionary rate of tandemly and segmentally duplicated HSF genes was calculated to gain insights into the selective constraints on the duplicated HSF genes. The ratio of Ka and Ks substitution rate is an effective method to investigate the selective constraint among duplicated gene pairs [64]. Hence, in the present study, Ka, Ks, and Ka/Ks values for each paralogous gene pair were calculated (Table 3). Here, 18 ZmHSF genes were shown to be duplicated. The Ka/Ks ratio for duplicated ZmHSF genes ranged from 0.3415 to 0.7572. All the HSF genes in the present study have Ka/Ks value < 1. Table 3. Duplicated gene pairs, non-synonymous substitution rate (Ka), synonymous substitution rate (Ks), nonsynonymous to synonymous substitution rate ratio (Ka/Ks), estimated time of duplication event in a million years ago (MYA), and mode of gene duplication.  The outcome suggests that these genes were under strong purifying selection pressure by natural selection during the course of evolution. Further, the divergence periods of tandemly and segmentally duplicated ZmHSF genes were estimated to range from 5.96 to 38.04 with an average of 20.89 million years ago (MYA). Some paralogous gene pairs (ZmHSF-02/ZmHSF-24, ZmHSF-09/ZmHSF21, and ZmHSF-16/ZmHSF20) appeared to be recently duplicated ( Table 3). The grass species are estimated to have diverged around 56-73 MYA [65,66]. In the present analysis, all the HSF genes in maize appeared to have duplicated after the divergence of grass species. Further, most of the HSF genes in maize are paralogs, and it can be concluded that duplication events (primarily segmental) played a significant role in the expansion of the HSF gene family in maize. The outcome suggests that these genes were under strong purifying selection pressure by natural selection during the course of evolution. Further, the divergence periods of tandemly and segmentally duplicated ZmHSF genes were estimated to range from 5.96 to 38.04 with an average of 20.89 million years ago (MYA). Some paralogous gene pairs (ZmHSF-02/ZmHSF-24, ZmHSF-09/ZmHSF21, and ZmHSF-16/ZmHSF20) appeared to be recently duplicated ( Table 3). The grass species are estimated to have diverged around 56-73 MYA [65,66]. In the present analysis, all the HSF genes in maize appeared to have duplicated after the divergence of grass species. Further, most of the HSF genes in maize are paralogs, and it can be concluded that duplication events (primarily segmental) played a significant role in the expansion of the HSF gene family in maize.

Gene Structure and Protein Motif Analysis
To investigate the structural relationship among the HSF genes, the intron-exon organization of all the targeted HSFs was analyzed using GSDS software. The intron-exon structure and number play a key role in the evolution of gene families [67,68]. The gene structure analysis was in line with the phylogenetic relationship among maize HSFs ( Figure 4). In general, the intron and exon numbers were shown to be highly consistent. Particularly, 92% (23/25) HSFs contain only one intron except for HSF-02 and HSF-24 ( Figure 4). Similarly, HSF-02 and HSF-24 were shown to contain three and five exons. In contrast, the rest of the HSF genes contained two exons. Further, 17 HSFs contained 5 UTR and 3 UTR. The HSF genes belonging to the same class and sub-class showed a similar intron-exon pattern in terms of intron number, exon length, intron phase, and overall gene length ( Figure 4).

Gene Structure and Protein Motif Analysis
To investigate the structural relationship among the HSF genes, the intron-exon organization of all the targeted HSFs was analyzed using GSDS software. The intron-exon structure and number play a key role in the evolution of gene families [67,68]. The gene structure analysis was in line with the phylogenetic relationship among maize HSFs (Figure 4). In general, the intron and exon numbers were shown to be highly consistent. Particularly, 92% (23/25) HSFs contain only one intron except for HSF-02 and HSF-24 ( Figure  4). Similarly, HSF-02 and HSF-24 were shown to contain three and five exons. In contrast, the rest of the HSF genes contained two exons. Further, 17 HSFs contained 5′ UTR and 3′ UTR. The HSF genes belonging to the same class and sub-class showed a similar intronexon pattern in terms of intron number, exon length, intron phase, and overall gene length ( Figure 4). MEME was used to identify the conserved motifs/regions responsible for DNA-binding, oligomerization, nuclear localization, nuclear export, and biological activation of HSFs ( Figure 5). In total, 20 motifs designated as motifs 1-20 were identified among maize HSF proteins ( Table 4). The highly conserved DBD is represented by motifs 1, 2, and 4. Motif 3 corresponds to OD of class A and C HSFs. The HR-A core region of OD is represented by motif 5 and is present in all HSFs. OD of class B HSFs is depicted by motif 7.  MEME was used to identify the conserved motifs/regions responsible for DNAbinding, oligomerization, nuclear localization, nuclear export, and biological activation of HSFs ( Figure 5). In total, 20 motifs designated as motifs 1-20 were identified among maize HSF proteins ( Table 4). The highly conserved DBD is represented by motifs 1, 2, and 4. Motif 3 corresponds to OD of class A and C HSFs. The HR-A core region of OD is represented by motif 5 and is present in all HSFs. OD of class B HSFs is depicted by motif 7. Motif 6 constitutes the NLS of class A HSFs and is present in eight members. The AHA motif is shown by motif 8 and is present in 11 class A HSFs. The NES of class A HSFs is represented by motif 16. Motif 15 constitutes the NLS of Class B2 HSFs. Certain HSFs also possess class-specific conserved motifs i.e., ZmHSF-02 and ZmHSF-24 contain motifs 10 and 13, HSFs of subclass B1 harbor motif 14, sub-class C1 HSFs motif 19, the sub-class B2 HSFs motif 11, and sub-class A4 members (ZmHSF-16, ZmHSF-20) contain motif 18. Motifs 9 and 12 are present in the same members of class A HSFs. Four members of class A HSFs contain motif 17. Motif 18 is found in two members of class A HSFs. Finally, motif 20 is found in members of sub-class A2 and A6 HSFs. HSFs contain motif 17. Motif 18 is found in two members of class A HSFs. Finally, motif 20 is found in members of sub-class A2 and A6 HSFs.

Domain Analysis and Physio-Chemical Properties
The modular structure and the functional domains of HSFs have been studied and described extensively [22]. The HSF-type DBD was highly conserved and consisted of approximately 100 amino acids ( Figure 6). The locations of DBD and OD were predicted using SMART and MARCOIL ( Table 5). The DBD of most maize HSFs was located at the beginning of the N-terminal. Few exceptions were ZmHSF-03, ZmHSF-10, ZmHSF-14, and ZmHSF-15. As expected, the linker length between DBD and OD of HSFBs was larger than HSFAs and HSFCs.
The physio-chemical properties of HSFs such as amino acid length, Mw, and pI were investigated using Expasy (Table 5). In addition, the amino acid composition of each group was analyzed using the online tool CoPId (  -21). In general, the pI of class A HSFs was in acidic ranges except for ZmHSF-10, which was shown to be in a slightly basic range. The pI of class B was in slightly acidic and basic ranges. Finally, for class C the pI was in similar ranges as class B HSFs. The average amino acid composition of ZmHSFs ranged from 1.1 (cysteine) to 10.0 (alanine) ( Figure 7A). The average amino acid composition of class A HSFs ranged from 0.8 (cysteine) to 8.7 (alanine) ( Figure 7B). In contrast, the average amino acid composition of class B ranged from 1.3 (tryptophan) to 12.6 (alanine) ( Figure 7C). Finally, the class C amino acid composition ranged from 1.1 (tryptophan) to 11.2 (alanine) ( Figure 7D).

Domain Analysis and Physio-Chemical Properties
The modular structure and the functional domains of HSFs have been studied and described extensively [22]. The HSF-type DBD was highly conserved and consisted of approximately 100 amino acids ( Figure 6). The locations of DBD and OD were predicted using SMART and MARCOIL ( Table 5). The DBD of most maize HSFs was located at the beginning of the N-terminal. Few exceptions were ZmHSF-03, ZmHSF-10, ZmHSF-14, and ZmHSF-15. As expected, the linker length between DBD and OD of HSFBs was larger than HSFAs and HSFCs. Figure 6. Multiple sequence alignment of maize HSFs. The highly conserved DBD consists of four antiparallel β-stranded sheets (β1, β2, β3, β4) and a bundle of three α-helices (α1, α2, α3) which form the secondary structure. Rectangular boxes represent α-helices while square boxes represent β-stranded sheets.

Proteins Structure and Sub-Cellular Localization of Maize HSFs
The protein structures of maize HSFs were predicted through Alphafold. The predicted models were downloaded to view their 3D structure (Figure 8). The highly conserved DBD is represented by α-helices and β-sheets. The OD can be seen to be linked with DBD through linker residues. Most maize HSFs were predicted to be localized inside the nucleus (Table 5). Exceptions were ZmHSF-10 and ZmHSF-23 (HSFA7a and HSFA7b). This indicates that class 7 HSFs might possess distinct properties.   Table 5.

Expression Profiles of Zea mays HSFs during Different Developmental Stages
The expression patterns of Zea mays HSF genes were investigated during different growth stages and across different time scales. Although gene expression pattern is not always directly related to protein abundance, the transcriptome profiles can still provide insights into the probable role of genes in particular biological processes [69]. The transcriptome data utilized in the present study was downloaded from the maize genome database [59].

Expression Profiles of Zea mays HSFs during Different Developmental Stages
The expression patterns of Zea mays HSF genes were investigated during different growth stages and across different time scales. Although gene expression pattern is not always directly related to protein abundance, the transcriptome profiles can still provide insights into the probable role of genes in particular biological processes [69]. The transcriptome data utilized in the present study was downloaded from the maize genome database [59].
The ZmHSF-04, ZmHSF-05, and ZmHSF-06 are shown to be the most highly induced HSFs across all the tissues during different growth stages ( Figure 9A-C

Expression Pattern of Zea mays HSFs under Abiotic Stresses
Maize growth, development, and yield is adversely affected by several abiotic stresses [70]. Therefore, to examine maize HSFs expression under different abiotic stress events, RNAseq data was analyzed and a heat map was constructed ( Figure 10; Table S5). In response to HS, members of class A and B HSFs showed the highest expression. Interestingly, under the drought stress, only the transcript of ZmHSF-05 was moderately overexpressed.

Expression Pattern of Zea mays HSFs under Abiotic Stresses
Maize growth, development, and yield is adversely affected by several abiotic stresses [70]. Therefore, to examine maize HSFs expression under different abiotic stress events, RNA-seq data was analyzed and a heat map was constructed ( Figure 10; Table S5). In response to HS, members of class A and B HSFs showed the highest expression. Interestingly, under the drought stress, only the transcript of ZmHSF-05 was moderately overexpressed. Under cold stress, three members of class A HSFs (ZmHSF-05, ZmHSF-06

Functional Annotation of Maize HSFs
HSFs have been reported to play a major role not only under stressful conditions but also in plant growth and development [16,69]. Therefore, the regulatory functions of maize HSFs were predicted through GO annotation investigation based on the biological process (BP), molecular function (MF), and cellular component (CC) classes ( Figure 11; Table S6). The BP annotation analysis indicated that maize HSFs are mainly involved in cellular response to heat (GO:0034605), response to heat (GO:0009408), response to temperature stimulus (GO:0009266), regulation of transcription by RNA polymerase II (GO:0006357), response to abiotic stimulus (GO:0009628), cellular response to stress

Functional Annotation of Maize HSFs
HSFs have been reported to play a major role not only under stressful conditions but also in plant growth and development [16,69]. Therefore, the regulatory functions of maize HSFs were predicted through GO annotation investigation based on the biological process (BP), molecular function (MF), and cellular component (CC) classes ( Figure 11; Table S6). The BP annotation analysis indicated that maize HSFs are mainly involved in cellular response to heat (GO:0034605), response to heat (GO:0009408), response to temperature stimulus (GO:0009266), regulation of transcription by RNA polymerase II (GO:0006357), response to abiotic stimulus (GO:0009628), cellular response to stress (GO:0033554), regulation of RNA biosynthetic process (GO:2001141), etc. (Figure 11). With regard to MF annotation analysis, it was revealed that maize HSFs are mostly involved in RNA polymerase II cis-regulatory region sequence-specific DNA binding (GO:0000978), cis-regulatory region sequence-specific DNA binding (GO:0000987), RNA polymerase II transcription regulatory region sequence-specific DNA binding (GO:0000977), transcription regulatory region nucleic acid binding (GO:0001067), transcription cis-regulatory region binding (GO:0000976), sequence-specific double-stranded DNA binding (GO:1990837), double-stranded DNA binding (GO:0003690), etc. (Figure 11). The CC annotation study showed that ZmHSFs are majorly involved in the nucleus (GO:0005634), intracellular membrane-bounded organelle (GO:0043231), membrane-bounded organelle (GO:0043227), intracellular organelle (GO:0043229), organelle (GO:0043226), etc. (Figure 11). To conclude, the GO annotation study confirms the role of maize HSFs in regulating abiotic stresses and plant metabolism.  Figure 11). With regard to MF annotation analysis, it was revealed that maize HSFs are mostly involved in RNA polymerase II cis-regulatory region sequence-specific DNA binding (GO:0000978), cis-regulatory region sequence-specific DNA binding (GO:0000987), RNA polymerase II transcription regulatory region sequence-specific DNA binding (GO:0000977), transcription regulatory region nucleic acid binding (GO:0001067), transcription cis-regulatory region binding (GO:0000976), sequence-specific double-stranded DNA binding (GO:1990837), double-stranded DNA binding (GO:0003690), etc. (Figure 11). The CC annotation study showed that ZmHSFs are majorly involved in the nucleus (GO:0005634), intracellular membrane-bounded organelle (GO:0043231), membrane-bounded organelle (GO:0043227), intracellular organelle (GO:0043229), organelle (GO:0043226), etc. ( Figure  11). To conclude, the GO annotation study confirms the role of maize HSFs in regulating abiotic stresses and plant metabolism.

Protein-Protein Interaction Network Analysis
The protein network interaction analysis can help understand protein biological function's and mechanisms [71]. Since both the RNA-seq data and GO annotation analysis suggested the role of HSFs in stress conditions and normal growth, we performed network analysis to predict the interacting partners of ZmHSFs (Table S7). The results showed that maize HSFs interact with themselves and a range of proteins with wellknown functions in cellular growth and stress responses ( Figure 12). For example, HSFs

Protein-Protein Interaction Network Analysis
The protein network interaction analysis can help understand protein biological function's and mechanisms [71]. Since both the RNA-seq data and GO annotation analysis suggested the role of HSFs in stress conditions and normal growth, we performed network analysis to predict the interacting partners of ZmHSFs (Table S7). The results showed that maize HSFs interact with themselves and a range of proteins with well-known functions in cellular growth and stress responses ( Figure 12). For example, HSFs were shown to interact with molecular chaperons HSP101, HSP82 (belongs to HSP90 family), HSBP-2, and DnaJ-like protein (belongs to the HSP40 family). It was reported that HSP101 and HSA32 interact with each other and promote acquired thermotolerance in Arabidopsis [72]. The HSP82 was reported to be induced by higher temperatures. A higher concentration of HSP82 is required for normal cellular growth in yeast at higher temperatures [73]. Gu et al. reported that maize HSBP-2 and HSFA2 interact with each other and modulate raffinose biosynthesis [74]. HSFA2 was shown to bind to the promoter sequence of HSBP-2 and activate its expression. Higher raffinose synthesis improved HS tolerance of Arabidopsis thaliana. The DnaJ-like proteins are molecular co-chaperones that interact with HSP70s and control protein homeostasis [75]. DnaJ proteins have been reported to play a critical role in plant growth, development, and HS tolerance [75][76][77]. ZmHSFs also interact with two major proteins, i.e., multi-protein bridging factor 1c (MBF1c) and DREB2A. Both these proteins have been shown to accumulate under diverse abiotic stress conditions. DREB2A is a major protein, and its overexpression improves plant HS, drought stress, cold stress, etc., tolerance [78]. MBF1c is a transcriptional co-activator that modulates the expression of DREB2A, some HSFs, and phytohormones [3]. Interestingly, MBF1c is necessary for basal thermotolerance but not for acquired thermotolerance [79]. In addition, MBF1c is also shown to be required for plant developmental responses [80].

Discussion
Maize (Zea mays) is a major cereal crop that is widely cultivated worldwide for food feed, fiber, and fuel. Maize is also considered a model plant for basic and applied researc in plant science [83]. Unraveling the factors regulating the growth and stress resistanc would contribute significantly to the development of climate-smart, stress-resilient maiz cultivars with higher agricultural productivity. The sequencing of the maize genome (B7 inbred line) in 2009 opened a plethora of opportunities to identify, analyze, and charac terize stress-associated genes in maize [84]. To provide food security in the scenario o climate change and ever-growing world population, it is imperative to understand th Maize HSFs are also predicted to interact with SUMO proteins. SUMOylation is a post-translational phenomenon where SUMO proteins are covalently attached and detached to target proteins [81]. This process affects several biological processes inside the cell, including transcriptional regulation of gene expression, apoptosis, programmed cell death, cellular response to stress, stability of proteins, etc. [81]. Rytz et al. reported that SIZI, a SUMO protein, targets multiple TFs, chromatin remodelers, transcriptional co-activators/repressors connected to abiotic and biotic stress responses [82]. This suggests maize HSFs may also be SUMOylated under diverse biological conditions and stress responses. To conclude, PPI analysis aligned with the RNA-seq and GO annotation analysis, which indicated that HSFs of Zea mays play an important role in abiotic stress conditions and in maize growth and metabolism.

Discussion
Maize (Zea mays) is a major cereal crop that is widely cultivated worldwide for food, feed, fiber, and fuel. Maize is also considered a model plant for basic and applied research in plant science [83]. Unraveling the factors regulating the growth and stress resistance would contribute significantly to the development of climate-smart, stress-resilient maize cultivars with higher agricultural productivity. The sequencing of the maize genome (B73 inbred line) in 2009 opened a plethora of opportunities to identify, analyze, and characterize stressassociated genes in maize [84]. To provide food security in the scenario of climate change and ever-growing world population, it is imperative to understand the molecular mechanisms behind plant stress resistance and explore genetic resources associated with the higher crop yield [3,15]. HSFs have been identified in several plant species, including important crops. The Arabidopsis thaliana, Oryza sativa, Zea mays, Glycine max, Populus trichocarpa, Solanum lycopersicum, Brachypodium distachyon, Sorghum bicolor, and Triticum aestivum contain 21,25,25,38,28,26,24,23, and 61 HSFs in their genomes, respectively [20,[40][41][42][43][44][45]71,85]. Following the sequencing of several plants, it is found out that the number of HSFs may be independent of the genome size [71]. For example, Arabidopsis thaliana (135 Mb) contains 21 HSFs [20,86], while Medicago truncatula (375 Mb) harbors 15 HSFs [43,87]. Similarly, 25 HSFs are found in Oryza sativa (430 Mb) [40,88] and an equal number of HSFs are also present in Zea mays (2.4 Gb) [41,84]. Even though the HSF gene family was previously characterized by Lin et al. [41], our work differs from theirs in multiple aspects. Their research was mostly restricted to the identification and classification of ZmHSFs. On the other hand, this comprehensive study particularly focused on the evolutionary analysis, expression profiling, GO, and PPI networks to explore the probable regulatory role played by ZmHSFs under benign and stress conditions.
The distribution of the HSF gene family in maize was analyzed by constructing a chromosomal map (Figure 1). The fact that all the chromosomes harbor at least one HSF gene suggests that Zea mays' most recent common ancestor has HSF genes distributed widely in its genome. Phylogenetic analysis indicated the AtHSFA2, AtHSFC1 did not align with sub-class A2, C1 in the present study, which aligns with the results reported by Lin et al. [41]. Maize HSFs are divided into three classes and further into 13 sub-classes which is consistent with the HSF class number observed in other monocots. For example, the HSFs of Oryza sativa, Brachypodium distachyon, Sorghum bicolor, and Triticum aestivum are also divided into three classes and 13 sub-classes [40,45,46,85]. Despite that, differences among HSF numbers were observed in different sub-classes between monocots ( Figure 2, Table 2). For example, compared to rice, Brachypodium, and Sorghum, the sub-families B4, A2, and C2 contain fewer HSF members in maize. On the other hand, the sub-classes A1, A4, A8, B1, and B2 are expanded in Zea mays ( Figure 2, Table 2). Gene duplications generate new genes and provide novel possibilities for evolutionary success [89,90]. In fact, it has been proposed that tandem and segmental duplications have been the primary driving source of evolution as these events lead to expansion of gene families and generation of proteins with novel functions [91]. Tandem duplication involves the duplication of two or more genes located on the same chromosome, while segmental duplication refers to the phenomenon when genes belonging to the same clade but located on different chromosomes are duplicated [92]. In the present analysis, nine pairs of ZmHSFs were shown to be paralogs (Table 3, Figure 3). The results indicate that segmental duplication events have played a major role in the expansion of the HSF gene family in maize. An increase in gene regulatory repertoire such as transcriptional regulators, developmental regulators, signal transducers, etc., is a prerequisite for the evolution of complex systems in different organisms [22,93]. Since the gene duplication events result in the doubling of a single gene which cannot account for such large expansions, it has been suggested that whole-genome duplication (WGD) events have been instrumental in expanding the regulatory repertoire of plants [90]. It is assumed that the Arabidopsis genome experienced two rounds of WGD in the past 60-70 million years [94,95]. More than 90% increase in regulatory genes has been caused by duplication events Arabidopsis in the past 150 million years [94]. This suggests that an increase in the HSF gene members in plants accounts for WGDs. Additionally, segmental duplications occur in gene families, which evolve at a slower rate [91]. It is thought that the increase or decrease in exon number plays an important role in the evolution of a gene family [96]. Therefore, we investigated the number and distribution of introns and exons in ZmHSF genes. Our results showed that the ZmHSF genes contain 2 exons and 1 intron except for ZmHSF-02 and ZmHSF-24 ( Figure 4; Table 1). Moreover, the length and position of exons and introns were well conserved in the same sub-classes but varied considerably between different sub-classes.
The previous investigations showed that HSFs play an important role in plant growth [39]. Therefore, we investigated the tissue-specific expression of ZmHSFs in 20 different developmental tissues using RNA-seq data ( Figure 9A-C). Several genes showed an enhanced expression that reflects their role under various developmental stages. In particular, ZmHSF-05 (A-2a) and ZmHSF-06 (A-1a) were highly expressed almost across all the growth phases. The hsfa1abde quadruple mutants displayed abnormal phenotype and growth retardation, implying HSFA1s is also involved in developmental processes [29]. Interestingly, HSFA2 could rescue the developmental defects of hsfa1abde quadruple mutants [97]. This further supports the result obtained from our analysis and provides a strong base for further wet-lab studies to characterize the function of ZmHSF-05 and ZmHSF-06 in plant growth and development. Similarly, HSFs have been reported to play a key role in plant acclimation to abiotic stress conditions. Kumar et al. reported that TaHSFA6e modulates tolerance of wheat to HS and drought stress during pollination and grain filling stages [98]. Yokotani et al. reported that OsHSFA2e improves Arabidopsis tolerance to HS and salinity stress by activating the expression of HSPs [99]. Thus, the expression patterns of ZmHSFs were evaluated under abiotic stress conditions. Most of ZmHSFs displayed stress-specific expression, with some HSFs showing upregulation only under particular stress events ( Figure 10). Jiang et al. reported that ZmHSF-04 improves plant tolerance to HS, salinity stress and increases the sensitivity to abscisic acid [100]. Similarly, ZmHSF-12 overexpression improves plant basal thermotolerance and AT [101]. These results are in line with our analysis which showed a higher transcript accumulation of these TFs under respective stress conditions ( Figure 10).
The PPI analysis indicated that maize HSFs interact with molecular chaperones and stress-associated proteins ( Figure 12). Molecular chaperones are present inside the cells and are constitutively expressed under normal conditions or are induced under specific developmental stages or stress conditions [102]. These chaperons perform various functions under physiological conditions inside the cells, such as signaling, folding, and stabilization, translocation, and degradation of proteins [11,102]. Under harsh environmental conditions, molecular chaperones act as powerful buffers to limit protein misfolding/unfolding and prevent protein aggregate formation that might be otherwise toxic to plant cells [25]. Here, the ZmHSFs were shown to interact with chaperons belonging to different families (HSP101, HSP90, HSP40) and genes with a well-known role in thermotolerance (HSA32, HSP82, HSBP-2). DREB2A is a major transcriptional activator that functions downstream of HSFA1s dependent transcriptional cascade in Arabidopsis thaliana [3,12,15,29]. Similarly, MBF1c is a major protein characterized by its role in regulating abiotic stress responses and growth in plants [3,82,83]. SUMO proteins are attached to their target proteins and modify their biological activities under various physiological and stress conditions [84]. In Arabidopsis, SUMOylation has been proposed as one of the molecular mechanisms that are responsible for the activation of HSFA1s [12]. Many HSFs in Arabidopsis such as HSFA1d, HSFA2, and HSFB2B have the potential to be SUMOylated [103]. In tomatoes, the knockout of SIZI (a SUMO ligase) reduces plant thermotolerance [104]. Here, ZmHSFs were shown to interact with all these proteins (DREB2A, MBF1c, and SUMO proteins), which further confirms their role in regulating the abiotic stress response.
Taken together, our present analysis provides strong support for the positive role of HSFs in the growth and development of maize by the regulation of primary metabolism. Furthermore, HSFs of maize interact with the major stress-responsive proteins and confer abiotic stress resistance.

Conclusions
Here, we identified a total of 25 HSFs from the maize genome through genomewide investigation analysis. To better understand the roles of HSF genes in the maize genome, comprehensive in silico analysis was performed, including phylogenetic analysis, gene structure, and conserved protein motif analysis, gene duplication and evolutionary analysis, domain analysis, and physio-chemical properties, protein 3-D structure, GO and PPI network. Further, the expression profiles of ZmHSFs under various developmental stages, and stress conditions were studied. The results indicate that ZmHSFs play a major role in plant growth and stress responses. These discoveries will lay the basis for studying the roles of ZmHSFs genes in maize developmental processes and response to several stresses using different functional validation options, such as overexpression, knockout via CRISPR/Cas9 systems, etc.