Genome-Wide Identification, Structure Characterization, and Expression Profiling of Dof Transcription Factor Gene Family in Wheat (Triticum aestivum L.)
Agronomy 2020, 10, 294

Abstract: DNA binding with one finger (Dof) proteins are plant-specific transcription factors with crucial roles in plant growth and stress response. Even so, little is known about them in wheat. In this study, 108 wheat Dof (TaDof) genes across 21 chromosomes were detected. Although variable in sequence length, molecular weight, and isoelectric point, all TaDof proteins contained conserved zinc-finger structures and were phylogenetically divided into 7 sub-groups. Exon/intron and motif analyses suggested that TaDof structures and conserved motifs were similar within sub-groups but diverse among sub-groups. Many segmental duplications were identified, and Ka/Ks and inter-species syntenic analyses indicated that polyploidization was the main reason for the increased number of TaDofs. Prediction and experimental confirmation revealed that TaDofs functioned as transcription factors in the nucleus. Expression pattern profiling showed that TaDofs specifically affected growth and development, and biotic and abiotic stress responses. Wheat miRNAs and cis-regulators were predicted to be essential players in molding TaDof expression patterns. qRT-PCR analysis revealed that TaDofs were induced by salt and drought stresses. Customized annotation revealed that TaDofs were widely involved in phytohormone response, defense, growth and development, and metabolism. Our study provides a comprehensive understanding of wheat TaDofs.

in different growth and development stage, biotic stress, and abiotic stress situations, and integration analysis with miRNAs and the cis-regulators, the possible transcriptional regulatory networks of TaDofs were further explored, which provide valuable information for further experimental studies. qRT-PCR analysis revealed that the expression levels of selected TaDofs were significantly altered by salt and drought stress. Through BLASTp searches, the possible roles of TaDofs were manually annotated by alignment to Dofs with experimentally supported functions in other species. Our results provide valuable clues for further analysis of the regulatory mechanisms and specific functions of TaDof genes in wheat, especially their roles in stress response.


Introduction
Transcriptional regulation plays a crucial role in many biological processes, such as signal transduction and response to abiotic and biotic stress in plants [1][2][3]. A typical transcription factor contains four functional regions: a DNA binding region, a transcriptional regulatory region (including activation and inhibition domains), an oligomerization site, and a nuclear localization signal region [4]. Transcription factors enter the nucleus at specific times to interact with the cis-acting elements of gene

Analysis of Dof Motifs and Gene Structures
Annotation information related to TaDofs was examined using the web tool GSDS (http://gsds.cbi.pku.edu.cn/index.php) to predict TaDof gene structure, intron and exon distribution, and intron and exon boundaries. Conserved sequences in the TaDof genes were identified using MEME Suite and MAST Primer Search software tools [28]. The parameters were established using known Dof protein sequences from Arabidopsis, rice and maize, and were then applied to identify conserved sequences in TaDofs as follows: each sequence could contain any number of non-overlapping occurrences of each motif, the total number of different motifs was 20, and motif length ranged from 6 to 50 aa. The functions of the predicted motifs were analyzed using InterPro (http://www.ebi.ac.uk/interpro) and SMART (http://coot.embl-heidelberg.de/SMART), and TBtools software (https://github.com/CJ-Chen/TBtools) was used for graphical visualization [29].
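The motif-search settings above can be written down as a command line. The following is a minimal sketch that only assembles the arguments for the MEME command-line program; it assumes the local `meme` executable was used (the text does not say whether the web server or the command line was run), and the file and directory names are hypothetical.

```python
from typing import List


def build_meme_command(fasta_path: str, out_dir: str,
                       n_motifs: int = 20,
                       min_width: int = 6,
                       max_width: int = 50) -> List[str]:
    """Assemble a MEME command line matching the stated settings.

    "-mod anr" allows any number of non-overlapping occurrences of a
    motif per sequence; "-protein" treats the input as amino acids.
    The command is only built here, not executed.
    """
    if min_width > max_width:
        raise ValueError("min_width must not exceed max_width")
    return [
        "meme", fasta_path,
        "-protein",
        "-oc", out_dir,          # output directory (overwritten if present)
        "-mod", "anr",
        "-nmotifs", str(n_motifs),
        "-minw", str(min_width),
        "-maxw", str(max_width),
    ]


# Hypothetical input and output names, for illustration only.
command = build_meme_command("tadof_proteins.fasta", "meme_out")
print(" ".join(command))
```

The list form can be passed directly to `subprocess.run` if MEME is installed locally.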

Gene Duplication and Ka/Ks Analyses
Gene duplications were classified into tandem duplication and segmental duplication events. Tandem duplication was determined using the following criteria: (1) length of the aligned region > 80%; (2) identity > 80%; (3) threshold ≤ $10^{-10}$; (4) only one duplication can be admitted when genes are closely linked; and (5) intergenic distance < 25 kb. Gene pairs that met criteria (1), (2), and (3) and were located on different chromosomes were considered segmental duplications [29]. After identification of a duplication, the Ks value and Ka/Ks ratio were calculated, and selection pressure and selection mode were analyzed. The IDs of the duplicated gene pairs and the corresponding CDS sequences were collected, converted into the format required by TBtools, and used to calculate Ka, Ks, and Ka/Ks with the built-in Simple Ka/Ks Calculator, which uses MUSCLE for codon alignment. The formula $T = Ks/(2\lambda) \times 10^{-6}$ Mya was used to estimate the time (T) of duplication in millions of years (Mya), where $\lambda = 6.5 \times 10^{-9}$ represented the rate of replacement of each locus per herb plant year [30]. The reference genomes of Aegilops tauschii (DD), T. urartu (AA), and T. dicoccoides (AABB) were downloaded from NCBI, and Dof gene identification followed the same procedure as for TaDof genes. The duplicated gene pairs between species were identified and used to carry out the inter-species syntenic analysis using the R package "circlize".
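The classification criteria and the dating formula can be sketched in a few lines of Python. This is an illustration, not the authors' pipeline. It assumes that the threshold in criterion (3) is a BLAST E-value, and it does not implement criterion (4). Same-chromosome pairs that are 25 kb or more apart are returned as "unclassified" because the text does not say how they were handled. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SUBSTITUTION_RATE = 6.5e-9  # lambda, as given in the text


@dataclass
class PairHit:
    aligned_fraction: float          # aligned length / gene length, 0..1
    identity: float                  # sequence identity, 0..1
    evalue: float                    # assumed meaning of "threshold"
    same_chromosome: bool
    intergenic_distance_bp: Optional[int] = None


def classify_duplication(hit: PairHit) -> str:
    """Apply criteria (1)-(3) and (5); criterion (4) is not modelled."""
    passes = (hit.aligned_fraction > 0.80
              and hit.identity > 0.80
              and hit.evalue <= 1e-10)
    if not passes:
        return "none"
    if not hit.same_chromosome:
        return "segmental"
    if (hit.intergenic_distance_bp is not None
            and hit.intergenic_distance_bp < 25_000):
        return "tandem"
    return "unclassified"


def duplication_time_mya(ks: float, rate: float = SUBSTITUTION_RATE) -> float:
    """T = Ks / (2 * lambda) * 1e-6, in millions of years."""
    return ks / (2.0 * rate) * 1e-6


def selection_mode(ka: float, ks: float) -> str:
    """Interpret Ka/Ks: < 1 negative, = 1 neutral, > 1 positive."""
    ratio = ka / ks
    if ratio < 1.0:
        return "negative"
    if ratio > 1.0:
        return "positive"
    return "neutral"


print(duplication_time_mya(0.13))  # a Ks of 0.13 gives about 10 Mya
```

With this rate, the reported range of 1.55 to 12.92 Mya corresponds to Ks values of roughly 0.02 to 0.17.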

Functional Annotation, Sub-Cellular Localization Prediction, and Experimental Confirmation
Multiple databases, including GO (Gene Ontology), were used to perform functional annotation of TaDof genes [31,32]. For experimental confirmation, total RNA of wheat was extracted using a Plant RNA Kit (Omega, London, UK) and cDNA was synthesized using a RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, Madison, WI, USA). Full-length TaDofs were amplified with Phanta HS Master Mix (Vazyme, Nanjing) using the corresponding primers (Additional file 1: Table S1) (synthesized by Sangon Biotech, Shanghai, China). After linearization with XhoI (NEB, Nanjing, China), the plant expression vector pART27:GFP was purified with a Cycle-pure Kit (Omega). PCR products were inserted into the linearized pART27:GFP using a ClonExpress II One Step Cloning Kit (Vazyme). Positive clones were transformed into Agrobacterium tumefaciens for transient expression in leaves of Nicotiana benthamiana. A fluorescence microscope (Olympus FV3000, Tokyo, Japan) was used to observe sub-cellular localization after three days.

Expression Pattern Analysis of TaDofs Using Large-Scale Transcriptome Data
The original transcriptome data were obtained from a comprehensive study of wheat expression profiles covering 337 different situations [33]. The expression profiles of TaDof genes, represented as TPM (transcripts per million), were manually extracted and used to generate the expression heatmap and overview boxplot using the R package "pheatmap" and the R function "boxplot".

Identification of miRNA and Cis-Regulatory Element
To identify miRNAs targeting TaDofs, mature miRNA sequences from wheat and TaDof sequences were submitted to the online tool psRNATarget set with default parameters [34]. For promoter analysis, 1.5 kb sequences upstream of the transcription start site of TaDof genes were retrieved and subjected to search for cis-regulatory elements by the CARE program (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) in the PlantCARE database [35].
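Retrieving the 1.5 kb upstream region depends on the strand of the gene. The sketch below shows one way to do it in Python. It assumes a chromosome sequence held as a string and a 1-based transcription start site (TSS) coordinate. The function names are hypothetical, and the text does not describe how the authors extracted the sequences.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def upstream_region(chrom_seq: str, tss: int, strand: str,
                    length: int = 1500) -> str:
    """Return up to `length` bases upstream of a 1-based TSS.

    For "+" genes the region lies at lower coordinates. For "-" genes
    it lies at higher coordinates and is reverse-complemented so that
    the result always reads 5' to 3' toward the gene. The region is
    truncated at chromosome ends.
    """
    if strand == "+":
        start = max(0, tss - 1 - length)
        return chrom_seq[start:tss - 1]
    if strand == "-":
        end = min(len(chrom_seq), tss + length)
        return reverse_complement(chrom_seq[tss:end])
    raise ValueError("strand must be '+' or '-'")


# Toy chromosome for illustration only.
toy = "AAAACCCCGGGGTTTT"
print(upstream_region(toy, tss=9, strand="+", length=4))
print(upstream_region(toy, tss=8, strand="-", length=4))
```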

Real Time-Quantitative PCR (RT-qPCR)
Two-leaf wheat seedlings (cultivar Emai 170) were treated with mannitol (15.03 g/L), NaCl (2.41 g/L), and PEG with an osmotic potential of −0.5 MPa (84.36 g/L). Seedlings grown under normal conditions were used as controls. Leaf and root tissues were harvested after 2, 4, 8, 12, 24, 72, and 120 h. Total RNA was extracted using a Plant RNA Kit (Omega) and cDNA was synthesized from 1 µg RNA using a RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific). To measure the expression levels of TaDof genes, qRT-PCR was performed on a CFX 96 Real-Time PCR system (Bio-Rad, Hercules, CA, USA) in a 20 µL reaction containing 10 µL of SYBR Green Master Mix (Vazyme), 1 µL of each primer (10 µM), 2 µL of template (about 100 ng/µL), and 6 µL of ddH2O. The protocol was as follows: pre-denaturation at 95 °C for 30 s (step 1), denaturation at 95 °C for 5 s (step 2), and primer annealing/extension and collection of fluorescence signal at 60 °C for 20 s (step 3), with steps 2 and 3 repeated for 40 cycles. ADP-ribosylation factor Ta2291 (Forward: GCTCTCCAACAACATTGCCAAC, Reverse: GCTTCTGCCTGTCACATACGC) was used as the housekeeping gene for qRT-PCR analysis. Relative quantities were calculated using the $2^{-\Delta\Delta Ct}$ method [36]. Each sample was assayed in three replicates, each with two technical repeats. Primers used for real-time PCR are listed in Additional file 1: Table S2 (synthesized by Sangon Biotech, Shanghai, China).
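The relative quantification step can be illustrated with a short calculation. The sketch assumes that the method cited as [36] is the standard $2^{-\Delta\Delta Ct}$ calculation, with the housekeeping gene as the reference and untreated seedlings as the calibrator. The Ct values are invented for illustration and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float,
                        ct_ref_control: float) -> float:
    """Fold change of a target gene by the 2^(-ddCt) calculation."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


def mean_ct(technical_repeats: Sequence[float]) -> float:
    """Average the Ct values of technical repeats of one replicate."""
    return mean(technical_repeats)


# Invented Ct values: two technical repeats per measurement.
fold = relative_expression(
    ct_target_treated=mean_ct([22.1, 21.9]),
    ct_ref_treated=mean_ct([18.0, 18.0]),
    ct_target_control=mean_ct([24.0, 24.0]),
    ct_ref_control=mean_ct([18.0, 18.0]),
)
print(round(fold, 2))
```

A value above 1 indicates up-regulation relative to the control and a value below 1 indicates down-regulation.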

Identification of 108 TaDofs in Seven Sub-Groups
Dof family members (TaDofs) in wheat were identified by a genome-wide search using Hidden Markov Model (HMM) and BLASTp analyses with Dof genes from Arabidopsis (36), maize (47), and rice (30) as queries (Additional file 1: Table S3). One hundred and eight non-redundant full-length TaDofs were identified (Additional file 2: File S1). To examine their evolutionary relationships in wheat and the other plant species, a phylogenetic tree was constructed by multiple sequence alignment of all 108 Dof proteins using the Maximum Likelihood (ML) method (Figure 1). The predicted TaDof genes were classified into seven sub-groups (I, II, III, IV, V, VII, and VIII) based on the phylogenetic tree and earlier reports [9]. Each TaDof was renamed based on its phylogenetic relationship with AtDofs [9]. Sub-group III contained the largest number of TaDofs (29 genes, 26.85%), followed by sub-groups VII (22, 20.37%) and VIII (22, 20.37%). Sub-group VI had only four members, all from Arabidopsis, indicating that this sub-group may be dicot-specific. Alternative splicing of isoforms was predicted for TaDof3.1-1B, TaDof3.1-1D, TaDof3.5-5A, TaDof3.5-5B, TaDof3.5-5D, and TaDof8.5-3D (Table 1).
Generally, Dof proteins have a DNA-binding domain of 40-60 amino acid residues at the N-terminus. Multiple sequence alignment analysis revealed that all TaDofs contained a typical ZF-Dof domain, including a highly conserved repeat containing four cysteines that bind zinc ions. This domain has a highly conserved CX2CX21CX2C single structure that is essential for the zinc finger configuration and loop stability. Multiple protein sequence alignments of Dof DNA-binding domains revealed that all included 17 highly conserved amino acids "CPRC-S-T-FCY-NNY-QPR-C-C" in the 29 amino acid residues containing the CX2CX21CX2C single zinc-finger structure (Figure 2).
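The CX2CX21CX2C spacing translates directly into a regular expression that spans 29 residues. The sketch below scans a protein sequence for that cysteine pattern. It checks only the spacing of the four cysteines, not the other conserved residues, and the example sequence is synthetic rather than a real TaDof protein.

```python
import re
from typing import Optional, Tuple

# C, any 2 residues, C, any 21 residues, C, any 2 residues, C (29 aa).
DOF_ZINC_FINGER = re.compile(r"C.{2}C.{21}C.{2}C")


def find_dof_zinc_finger(protein: str) -> Optional[Tuple[int, str]]:
    """Return (0-based start, matched segment) of the first match."""
    match = DOF_ZINC_FINGER.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group()


# Synthetic sequence built to contain the pattern, for illustration only.
toy_protein = "MK" + "C" + "PR" + "C" + "A" * 21 + "C" + "GG" + "C" + "LL"
hit = find_dof_zinc_finger(toy_protein)
print(hit)
```

A match is necessary but not sufficient for a Dof domain; a profile search such as the HMM-based search described above is the more reliable test.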

Gene Structure and Conserved Motifs Are Similar Intragroup but Diverse Intergroup
Exon-intron structural diversity often has a role in the evolution of gene families and could provide information regarding duplication events and evolutionary patterns within gene families [37,38]. The phylogenetic relationships of TaDofs are shown in Figure 3A, and the intron/exon distribution patterns of the TaDof genes are shown in Figure 3B. Most of them had zero to two introns, but TaDof4.1 had seven introns. Sub-group I and VII genes had no introns. Among the 13 sub-group II genes, 11 contained one intron, and 2 (TaDof2.1-2A.2 and TaDof2.2-2D.1) had none. The numbers of introns in sub-groups III and VIII varied from 0 to 2; TaDof3.3-3A.1 and TaDof8.8-4B contained two, whereas TaDof3.3-3A.3 and the remaining 11 contained none. Most sub-group IV genes had one intron, except for TaDof4.1, which contained seven. Six sub-group V genes contained one intron and six had none. Generally, there were similar numbers and lengths of exons and introns in the same sub-group (Figure 3B). Divergence between groups suggests that TaDof genes are evolving into more diverse exon-intron structures as a means of functional diversification.

The locations of conserved domains were determined by SMART and were visualized by the MEME program to reveal TaDof gene diversification. Twenty conserved motifs, namely 1 to 20, were identified (Figure 3C and Additional file 1: Table S4). The number of motifs in each TaDof protein ranged from two to 12; all TaDofs contained motif 1 and most of them had motif 15 (Additional file 3: Figure S1). Some sub-group-specific motifs, such as motif 10 and motifs 4 and 9, were identified only in sub-groups II and III, respectively. TaDofs that clustered together in the phylogenetic tree usually contained similar motifs, suggesting similar functions of TaDofs within the same sub-group.
The results of gene structures and motif locations thus indicated that most members were conserved within sub-groups, but showed divergence between sub-groups.

Polyploidization is the Main Basis of Member Expansion of TaDofs
Distribution and synteny of family members were analyzed to better understand the chromosomal locations and duplication events in TaDofs. According to the genomic location of each TaDof family member, we used the R package "LinkageMapView" to draft a chromosome distribution map of TaDof genes (Figure 4A, Additional file 1: Table S5). TaDof genes were distributed across all 21 wheat chromosomes, with numbers on each chromosome ranging from one to 11. The distribution was not random; there were several regions with three to five TaDofs clustered together. For example, chromosome 2B had four TaDofs in a short chromosome region (about 200 kb), and chromosome 3A had three TaDofs in a 300 kb region. Only one TaDof gene, a member of Group VII, was located on each homoeologous group 7 chromosome. The density of TaDof genes was highest (29, 26.85%) in homoeologous group 3 (Figure 4A). Gene duplication analysis detected 63 pairs, including three tandem and 60 segmental duplications. Tandem duplications caused gene clusters or hotspot regions, e.g., the cluster on homoeologous group 2 chromosomes (A, B, and D). Segmental duplications resulted in homologous genes that potentially expanded the number of TaDof gene groups. For example, TaDof7.8-7A, TaDof7.9-7B, and TaDof7.8-7D on chromosomes 7A, 7B, and 7D are homologous to each other and represent segmental duplications (Figure 4A, Additional file 1: Table S6).
The non-synonymous substitution rate (Ka), synonymous substitution rate (Ks), and Ka/Ks for the 63 duplicated pairs were calculated to reveal evolutionary constraints acting on all duplicated TaDof genes (Figure 4C, Additional file 1: Table S6). The ratio of non-synonymous to synonymous mutation rates (Ka/Ks) is generally used to evaluate the selection force acting on a coding sequence [30]. The Ka/Ks ratios for these 63 duplicated pairs were less than 1 (Figure 4C), implying that all duplicated gene pairs tended to be under negative selection pressure. The Ks values, which were used to estimate the time of occurrence of duplication events, indicated that the 63 duplications occurred about 1.55 to 12.92 million years ago (average 7.17; 48 of the 63 values earlier than 5.5), mostly before the wheat polyploidization event (Additional file 1: Table S6). Syntenic analysis was then extended to common wheat (AABBDD) and the progenitor species Aegilops tauschii (DD), T. urartu (AA), and T. dicoccoides (AABB). Identical orthologues were found in similar genomic regions among the A, B, and D sub-genomes and their corresponding progenitors (Figure 4B, Additional file 1: Table S7). Taken together, the duplication events, Ka/Ks ratios, and intra- and inter-species syntenic analyses suggested that expansion of TaDof genes occurred with polyploidization, and hence that polyploidization was the main reason for the high number of TaDof genes.
implying multiple functions of TaDofs (Figure 5A, Additional file 1: Table S8). Sub-cellular localizations of TaDofs were predicted using the online tool Plant-mPLoc, and all were predicted to localize in the nucleus, suggesting that they function as transcription factors in the same cell compartment. To confirm the nuclear localization, ten selected genes were cloned into the plant expression vector pART27:NGFP and transiently expressed in Nicotiana benthamiana leaves. Confocal microscopy showed that all ten were localized in the nucleus, consistent with their roles as transcription factors (Figure 5B).

Transcriptome Analysis Revealed Diverse Expression Patterns of TaDofs
Expression profiles were mined from a comprehensive profiling study on wheat samples from 337 different situations including different growth and development stages, and multiple biotic and abiotic stresses (Additional file 1: Table S9) [33]. The expression heatmap and overview boxplot of the TaDof genes were drawn using the R package "pheatmap" and R function "boxplot" (Figures 6 and 7). As shown in Figures 6 and 7, almost all TaDof genes were expressed in at least one of the 337 situations, except for TaDof3.3-3A.3 (cutoff value: TPM > 0.1). TaDofs could be divided into five classes based on expression patterns. The 1st class contained 27 members with average expression levels in TPM ranging from 1.39 to 20.23 (average value, 4.90). These genes were widely expressed across hundreds of situations, but the expression levels varied among different tissues at different growth stages, and under different biotic and abiotic stresses. The 2nd and 3rd classes contained 21 and 22 members and showed relatively low expression levels (average values, 1.21 and 1.05) in most conditions. Genes belonging to the 2nd class were highly induced under certain conditions (for example, TaDof2.1-2D.1). Members of the 4th class were barely expressed in most situations, but were specifically induced in anthesis-associated processes (average value, 4.46). The 5th class contained 35 members that were barely expressed under most situations (Figure 6).
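The expression calls above rest on two simple summaries per gene: whether TPM exceeds the 0.1 cutoff in at least one situation, and the mean and standard deviation of TPM across situations. The sketch below shows that computation on invented values. The gene names and numbers are hypothetical, and the authors' own analysis was done in R.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_expression(tpm_by_gene: Dict[str, List[float]],
                         cutoff: float = 0.1) -> Dict[str, dict]:
    """Per gene: expressed anywhere above the cutoff, mean TPM, SD."""
    summary = {}
    for gene, values in tpm_by_gene.items():
        summary[gene] = {
            "expressed": any(v > cutoff for v in values),
            "mean": mean(values),
            # stdev needs at least two values; report 0.0 otherwise.
            "sd": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Invented TPM values over three situations, for illustration only.
toy_tpm = {
    "geneA": [0.0, 0.05, 0.02],   # never above the cutoff
    "geneB": [1.0, 3.0, 2.0],     # expressed, moderate spread
}
for gene, stats in summarize_expression(toy_tpm).items():
    print(gene, stats)
```

A low mean with a large SD would flag a gene that is usually silent but strongly induced in a few situations.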

miRNAs and Cis-Regulators Are Essential Players in Molding TaDofs Expression Patterns
MicroRNAs are widely accepted as essential players in posttranscriptional gene regulation through mRNA transcript cleavage and/or inhibition of protein translation [39]. Using TaDof genes and mature wheat miRNA sequences as queries, the miRNAs targeting TaDofs were predicted using the online tool psRNATarget. A total of 63 TaDof genes were targets of 43 wheat miRNAs, with one to seven miRNAs for each targeted TaDof. Thirty-one TaDofs were targeted by one miRNA, 16 by two miRNAs, 9 by three miRNAs, 1 by four miRNAs, 5 by six miRNAs, and 1 by seven miRNAs (Figure 7, Additional file 1: Table S10). Integration of the numbers of miRNAs with TaDof expression patterns (represented by TPM means and SD) suggested that the barely expressed TaDofs (e.g., TaDof3.3-3A.1, TaDof4.1, and TaDof3.3-3B.2 in class 5 with small TPM mean values) were commonly targeted by more miRNAs, whereas the stably expressed ones (e.g., TaDof8.1-1A, TaDof2.1-2A.2, and TaDof7.8-7D in class 2 with low TPM SD values) were not. Strongly induced TaDofs (e.g., TaDof4.2-5A and TaDof4.3-6A with large TPM SD values) were usually targeted by miRNAs, implying that miRNAs act as modulators of TaDof expression patterns (Figure 7).

Figure 7. Integration of expression profiles, miRNAs, and cis-elements. Box "All" contained all 337 profiling situations; "growth and development stage" contained 211 conditions, "biotic stress" contained 97 conditions, and "abiotic stress" contained 29 conditions. t-tests were used to analyze the significance of expression pattern differences among "growth and development stage", "biotic stress", and "abiotic stress" (shortened as Growth/Biotic, Growth/Abiotic, and Biotic/Abiotic) (*, p < 0.05). The numbers of miRNAs targeting corresponding TaDofs are shown in the "miRNA" panel. The numbers of cis-elements in TaDof promoters are shown in the "Growth and development", "Phytohormone response", and "Biotic/abiotic stress" panels.

Gene functions can be predicted by bioinformatic analyses of cis-regulatory elements. This was performed by retrieving 1.5 Kb upstream sequences from transcription start sites of TaDof genes. PlantCARE database searches identified a large number of cis-regulators (2492 representing 26 kinds) (Figure 7, Additional file 1: Table S9), including elements responsible for 'plant development and growth' (7 kinds), 'phytohormone response' (8 kinds), and 'biotic/abiotic stresses' (11 kinds) in the promoter regions of TaDofs, suggesting roles in multiple metabolic and response processes. Among the cis-regulatory elements, CAAT-box (307 individuals, sequence pattern: GGGTCAATCT) and TATA-box (284 individuals, sequence pattern: TATAATAAT) for 'plant development and growth', ABRE (266 individuals, ABA responsive element, sequence pattern: ACGTC/GCCGCTGGC) for 'phytohormone responses', G-box (247 individuals, light-responsive element, sequence pattern: CACGTC), and TCT-motif (211 individuals, light-responsive element, sequence pattern: TCTTAC) for 'biotic/abiotic stresses' were generally present in the promoters of most TaDofs. Integration analysis of the cis-regulators with the expression patterns of TaDofs revealed that TaDofs with significantly different expression patterns between 'growth and development stage' and 'biotic stress' (growth/biotic) or between 'growth and development stage' and 'abiotic stress' (growth/abiotic) usually contained distinctive cis-regulator contents (Figure 7), suggesting specific roles of the elements in modulating TaDof expression.
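Counting a cis-element in a promoter reduces to counting occurrences of its sequence pattern. The sketch below counts exact, possibly overlapping matches of a pattern on both strands of a promoter. Scanning both strands and requiring exact matches are assumptions made for illustration; PlantCARE's own matching rules may differ. The promoter sequence is synthetic and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_overlapping(seq: str, pattern: str) -> int:
    """Count occurrences of pattern in seq, allowing overlaps."""
    count, start = 0, seq.find(pattern)
    while start != -1:
        count += 1
        start = seq.find(pattern, start + 1)
    return count


def count_cis_element(promoter: str, pattern: str) -> int:
    """Count exact matches of a pattern on both strands."""
    promoter, pattern = promoter.upper(), pattern.upper()
    total = count_overlapping(promoter, pattern)
    rc = reverse_complement(pattern)
    if rc != pattern:  # avoid double-counting palindromic patterns
        total += count_overlapping(promoter, rc)
    return total


# Synthetic promoter with one G-box pattern on each strand.
toy_promoter = "TT" + "CACGTC" + "AA" + "GACGTG" + "TT"
print(count_cis_element(toy_promoter, "CACGTC"))
```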

qRT-PCR Confirmed the Response Capability of TaDofs to Stress Conditions
To explore the functions of TaDof genes in the regulation of stress response, 11 TaDof genes (Figure 8, Additional file 1: Table S2) from the seven groups were used to examine expression patterns under salt, PEG, and mannitol stresses using qRT-PCR. The expression patterns of these TaDofs varied between tissues and stress treatments (Figure 8). All 11 TaDofs responded to all three stress conditions, but the response speeds and intensities differed. qRT-PCR results showed considerable variation in the expression patterns of the TaDof genes over time. The majority of genes were down-regulated in leaves at the beginning of NaCl treatment, but began to be up-regulated after 4 h, except for TaDof8.2-1D and TaDof8.4-2A, whose expression levels were lower than the control (CK) at 2 h and 4 h. After NaCl treatment for 12 h, eight of the Dof genes were significantly down-regulated, including TaDof4.2-5A, TaDof7.4-5D, TaDof8.5-3B, TaDof2.2-2A.1, TaDof3.4-4A, TaDof5.2-2B, TaDof7.6-6A, and TaDof8.4-2A. After NaCl treatment for 24 h and 120 h, most of the genes examined were greatly down- and up-regulated, respectively. Under mannitol stress, all tested genes were down-regulated, but 5 of them (TaDof2.2-2B.1, TaDof4.2-5A, TaDof2.2-2A.1, TaDof5.2-2B, and TaDof7.6-6A) were up-regulated at 4 h and 8 h. The expression levels of 8 and 7 genes were decreased at 24 h and 72 h, respectively, whereas 7 were up-regulated at 72 h. Since both mannitol and PEG induce osmotic stress, it was not surprising that similar expression patterns were identified for most of the 11 genes across most treatment time points. For example, the expression levels of TaDof2.2-2B.1 and TaDof5.2-2B in the leaves, and of TaDof1.2-2B, TaDof7.4-5D, TaDof7.6-6A, and TaDof8.4-2A in the roots, increased or decreased synchronously in response to mannitol and PEG across time points, but with different response amplitudes between the two treatments. However, there were exceptions; for example, several genes, including TaDof4.2-5A, TaDof7.4-5D, TaDof8.5-3B, and TaDof8.4-2A in leaves and TaDof8.5-3B, TaDof4.2-5A, TaDof8.2-1D, and TaDof5.2-2B in roots, showed differential expression levels at the initial and/or final stages of treatment with mannitol and PEG stress.

A Large Number of TaDof Genes Are Present in Common Wheat
Numerous physiological and biochemical processes in plants are regulated by transcription factors. Knowledge of the structure and function of transcription factors at the whole-genome level will help in understanding gene regulatory networks in individual species [4,29]. As plant-specific transcription factors, Dof (DNA binding with one finger) genes have important roles in plant growth and development [3,6,7]. However, the specific function(s) of most Dof genes remain unknown. Genome-wide prediction of Dof genes became possible with the completion of genome sequencing. In this study, we analyzed the structure, phylogenetic relationships, chromosomal locations, sub-cellular localization, gene duplication events, cis-elements, and expression patterns of Dof genes in wheat (TaDofs); 108 TaDof genes were identified. Dof genes were previously identified in several species, including Arabidopsis (36 genes) [9], rice (30) [9], tomato (34) [11], soybean (78) [40], corn (46) [13], potato (35) [41], chrysanthemum (20) [42], and foxtail millet (36) [43]. A larger number of Dof genes was detected in wheat, most likely because of its allohexaploid genome.
Analysis of protein sequences encoded by the 108 TaDof genes showed the presence of many conserved domains, consistent with the classification of Dof transcription factor family genes ( Figure 2) [7]. Basic amino acids and hydrophobic amino acid residues such as Pro, Lys, Ile, and Leu were present in the N-terminus, whereas Arg, Tyr, Thr, Gly, and Trp were common in the C-terminus.
Systematic classification of TaDof proteins should help to understand this transcription factor family. Yanagisawa [44] classified the Dof family into seven distinct sub-populations. Lijavetzky et al. [9] collected Dof transcription factors of Arabidopsis and rice, and divided them into four sub-families, named Aa, Bb, Cc, and Dd. They further divided each sub-family into groups based on the number of introns. Moreno-Risueno et al. [10] conducted a more comprehensive classification using 116 Dof genes from seven species and divided the Dof family into seven sub-groups, A to G. The members of each sub-group had high similarity in terms of conserved amino acid species and their numbers, and number of introns; for example, sub-groups A, F, and G contained introns whereas sub-groups B, C, and E did not. Compared with these earlier classification rules, the wheat Dof family was consistent with the classifications by Yanagisawa [44] and Lijavetzky et al. [9]. However, the number of introns in the wheat sub-groups did not follow the classification rules proposed by Moreno-Risueno et al. [10]. For example, TaDof2.1-2A.2 and TaDof2.2-2D.2 in sub-group II do not contain introns, whereas other TaDof2 genes contain one intron; and most TaDofs in sub-group III had one intron, except for TaDof3.3-3A.3, which contained no intron. TaDofs mostly followed the conserved structures of Dof genes reported by Yanagisawa [44] and Lijavetzky et al. [9], but also included some species-specific structures. Our phylogenetic tree likewise divided TaDof genes into 7 sub-groups. In common with maize and rice, no wheat Dof gene clustered with Arabidopsis sub-group VI, suggesting that the Dof genes in graminaceous plants had diverged. Apparently, some TaDof genes evolved independently after differentiation of monocots and dicots.

Negatively Selected TaDof Genes Expand by Polyploidization
Duplication is an important evolutionary process in gene family expansion, and duplicated genetic material provides opportunities for functional differentiation [29]. Therefore, gene duplication analysis can help us to better understand the evolution of genes and species. Sixty pairs of segmentally duplicated and three pairs of tandemly duplicated TaDof genes were identified. Most of the segmentally duplicated gene pairs (60/63) were from the same phylogenetic group, with high sequence similarities. Most TaDofs in segmentally duplicated pairs were located at similar positions in homoeologous chromosomes, supporting polyploidization as the main cause of expansion in TaDof number. Two analyses were performed for confirmation. Firstly, non-synonymous substitution rates (Ka), synonymous substitution rates (Ks), and Ka/Ks ratios were calculated for 63 pairs of duplicated genes (Figure 4C, Additional file 1: Table S6). The Ka/Ks ratios were less than 1 (Figure 4C), implying that all duplicated gene pairs were negatively selected. Ks values used to estimate the time of occurrence of duplication events revealed that the 63 duplications occurred about 1.55 to 12.92 million years ago (average 7.17; 48 of the 63 values earlier than 5.5), mostly before the wheat polyploidization event [45]. Secondly, syntenic analysis was expanded to the wheat progenitor species Aegilops tauschii (DD), T. urartu (AA), and T. dicoccoides (AABB) for comparison with common wheat (AABBDD). Identical orthologues were found in the corresponding genomic regions among the A, B, D sub-genomes of wheat and the progenitor species (Figure 4B, Additional file 1: Table S7). Considering that duplicated TaDof gene pairs were under negative selection, it is apparent that polyploidization was the main reason for the higher number of TaDof family genes.

Integration of Expression Profiles, miRNA, and Cis-Elements Uncovers Complex Regulatory Patterns of TaDofs
Data from a comprehensive study of wheat expression profiles covering 337 different conditions were analyzed to reveal the expression patterns of the 108 TaDof genes (Additional file 1: Table S9) [33]. The results revealed variable regulation patterns of TaDof genes across plant development stages and environmental conditions (Figure 6). However, it is still unclear how the expression of TaDofs is affected by specific conditions. Gene promoters and miRNAs are essential factors in modulating gene expression patterns, and they have been shown to regulate gene expression at both the transcriptional and post-transcriptional levels [31,32,39]. We analyzed the cis-regulatory elements of TaDofs and the wheat miRNAs that target TaDofs (Additional file 1: Tables S10 and S11). Integrating the expression profiles with the miRNAs and cis-elements of TaDofs led to some interesting results. For example, TaDof4.2-5A, TaDof4.2-5B, and TaDof4.2-5D, which were specifically induced during anthesis, were targeted by two or three miRNAs; they had numerous 'growth and development' associated cis-regulators in their promoter regions, whereas cis-regulators related to 'phytohormone response' and 'biotic/abiotic stress' were limited in number. Based on these results, it can be speculated that cis-regulators are responsible for the anthesis-specific induction and that miRNAs are responsible for suppressing expression during other growth and development stages. Seven to eight G-box motifs (cis-regulators of the 'biotic/abiotic stress' category) were detected in the promoter regions of TaDof5.3-3A, TaDof5.3-3B, and TaDof5.3-3D, and the expression of these genes was highly induced by biotic stress but not by abiotic stress. This implies a regulatory role for cis-regulators in the response of these three genes to biotic stress. Additionally, TaDof5.3-3A and TaDof5.3-3D were not targeted by any miRNA, suggesting that miRNAs play no role, or an undetected role, in modulating the expression patterns of these two genes.
TaDof5.4-6A, which was targeted by tae-miR398 and had diverse cis-regulators, was highly expressed in various growth and development conditions. Its expression level was further induced by biotic and abiotic stresses, suggesting that tae-miR398 releases TaDof5.4-6A from suppression under stress conditions. In summary, miRNAs and cis-elements are likely involved in the complex regulation of TaDof expression. Besides miRNAs and cis-elements, there are many other regulators of gene expression, such as DNA methylation and transcription factors. Thus, more experimental evidence is needed to detail the regulation mechanisms of TaDofs. Our results provide valuable clues for experimental validation of regulation patterns by miRNAs and cis-elements.
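The integration described in this section combines three evidence layers per gene: expression across conditions, predicted miRNA targeting, and cis-element counts by category. The following minimal Python sketch shows one way such layers could be combined. It is not the authors' pipeline. The function name, the median-based induction rule, the fold threshold, and all example values (including the gene and miRNA identifiers) are hypothetical.

```python
from statistics import median


def summarize_gene(gene, tpm, mirna_targets, cis_counts, fold=2.0):
    """Combine expression, miRNA, and cis-element evidence for one gene.

    tpm:           {condition: TPM value} for this gene
    mirna_targets: {gene: [miRNA identifiers predicted to target it]}
    cis_counts:    {gene: {cis-element category: count in promoter}}
    fold:          ASSUMED threshold; a condition counts as "induced" when
                   its TPM is at least `fold` times the gene's median TPM.
    """
    baseline = median(tpm.values())
    induced = sorted(
        cond for cond, value in tpm.items()
        if baseline > 0 and value >= fold * baseline
    )
    categories = cis_counts.get(gene, {})
    dominant = max(categories, key=categories.get) if categories else None
    return {
        "gene": gene,
        "induced_in": induced,
        "mirnas": sorted(mirna_targets.get(gene, [])),
        "dominant_cis_category": dominant,
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only (not from Tables S9 to S11).
    tpm = {"seedling": 1.0, "tillering": 1.2,
           "anthesis": 15.0, "grain_filling": 0.8}
    mirna_targets = {"TaDofX": ["miR-hypothetical-1", "miR-hypothetical-2"]}
    cis_counts = {"TaDofX": {"growth and development": 12,
                             "phytohormone response": 2,
                             "biotic/abiotic stress": 1}}
    print(summarize_gene("TaDofX", tpm, mirna_targets, cis_counts))
```

A summary of this kind only reports co-occurrence of evidence. As the text notes, whether a given miRNA or cis-element actually drives an expression pattern still requires experimental validation.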

A Large Number of TaDof Genes Are Present in Common Wheat
Numerous physiological and biochemical processes in plants are regulated by transcription factors. Knowledge of the structure and function of transcription factors at the whole-genome level will help in understanding gene regulatory networks in individual species [4,29]. As plant-specific transcription factors, Dof (DNA binding with one finger) genes have important roles in plant growth and development [3,6,7]. However, the specific functions of most Dof genes remain unknown. Genome-wide prediction of Dof genes became possible with the completion of genome sequencing. In this study, we analyzed the structure, phylogenetic relationships, chromosomal locations, sub-cellular localization, gene duplication events, cis-elements, and expression patterns of Dof genes in wheat (TaDofs); 108 TaDof genes were identified. Dof genes have previously been identified in several species, including Arabidopsis (36 genes) [9], rice (30) [9], tomato (34) [11], soybean (78) [40], corn (46) [13], potato (35) [41], chrysanthemum (20) [42], and foxtail millet (36) [43]. The larger number of TaDof genes detected here most likely reflects the allohexaploid genome of wheat: 108 genes across three sub-genomes corresponds to about 36 genes per sub-genome, which is comparable to the numbers reported for diploid species such as Arabidopsis and rice.

Functional Clues of Nuclear-Localized TaDofs Were Uncovered by Customized Annotation
As transcription factors, Dof proteins are believed to function in the nucleus to regulate the expression of target genes [16]. All TaDofs were predicted to be localized in the nucleus. Localization of 10 TaDofs was confirmed through confocal laser scanning microscopy following transient expression in Nicotiana benthamiana leaves (Figure 5B). We then annotated TaDofs with multiple public databases, and Gene Ontology (GO) enrichment analysis showed that most TaDofs (98) were annotated under the GO terms 'regulation of transcription, DNA-templated' (GO:0006355) and 'DNA binding' (GO:0003677). Several TaDofs were annotated under 'response to heat' (GO:0009408), 'response to high light intensity' (GO:0009644), 'response to hydrogen peroxide' (GO:0042542), and 'response to chitin' (GO:0010200), implying multifaceted roles (Figure 5A, Additional file 1: Table S8). To further decipher the roles of TaDofs, 42 Dofs with experimentally supported functions were collected (Additional file 1: Table S12). BLASTp searches showed that 25 of them were hit by 62 TaDofs (Table 2, Additional file 1: Table S13). According to our customized functional annotations, TaDofs appear to be widely involved in phytohormone response, defense, growth and development, and metabolism. For example, TaDof4.2-5A, TaDof4.2-5B, and TaDof4.4-5D were specifically induced in samples at anthesis (Additional file 1: Table S9). These three genes hit barley BPBF, which has a role in endosperm specificity [46], implying specific roles at the wheat anthesis stage. OsDof3 is controlled by gibberellin to regulate the expression of Type III carboxypeptidase (CPD3) in rice [47]. Our BLASTp search showed that TaDof4.3-6B and TaDof4.3-6D hit OsDof3, suggesting that these wheat Dofs may be involved in the gibberellin response through regulation of wheat CPD3. Rdd1 is a regulator of photoperiodic flowering in rice and plays a role in grain size determination [48].
BLASTp searching showed that TaDof8.5-3B hit Rdd1, implying a role for TaDof8.5-3B in photoperiodic flowering and grain development. SRF1 has a role in modulating carbohydrate metabolism in the storage root of sweet potato through negative regulation of Ibβfruct2, which encodes an isoform of vacuolar invertase [49]. TaDof8.1-1A hit SRF1, suggesting that TaDof8.1-1A may take part in carbon metabolism (Table 2).
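The homology-based annotation described above can be sketched in Python as selecting, for each query protein, its best-scoring hit among the functionally characterized Dofs. The sketch assumes BLAST tabular output with the default 12 columns (-outfmt 6), in which column 11 is the E-value and column 12 is the bit score. The E-value cutoff is an assumption, since this section does not report the thresholds used. The function name and all sequence identifiers are hypothetical.

```python
def best_hits(lines, max_evalue=1e-5):
    """Return {query: best subject} from BLAST tabular (-outfmt 6) lines.

    Default outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    max_evalue is an ASSUMED cutoff, not a value reported in the paper.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak to support annotation transfer
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


if __name__ == "__main__":
    # Hypothetical hits for illustration only (not rows from Table S13).
    example = [
        "query1\tknownDofA\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t150",
        "query1\tknownDofB\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-40\t200",
        "query2\tknownDofA\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20",
    ]
    print(best_hits(example))
```

A best hit supports a functional hypothesis only. As the hedged wording in the text indicates ("may be involved", "may take part"), sequence similarity to a characterized Dof does not by itself establish the same function in wheat.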

Conclusions
A large number of TaDof genes were identified in common wheat and separated into seven sub-groups. Although variable in length, molecular weight (MW), and isoelectric point (pI), all TaDof proteins contain a conserved zinc-finger structure. Exon/intron and motif analyses showed that TaDof gene structures and conserved motifs were similar within sub-groups but diverse between sub-groups. Ka/Ks analysis and inter-species syntenic analysis suggested that polyploidization was the main cause of the high number of TaDofs. Based on GO annotation, sub-cellular localization prediction, and experimental confirmation, TaDofs were shown to function as transcription factors in the nucleus. Large-scale expression profiling revealed variable regulation patterns of TaDof genes across growth and development stages and under biotic and abiotic stress. Integrating these profiles with miRNAs and cis-regulators allowed the possible transcriptional regulatory networks of TaDofs to be explored further, which provides valuable information for further experimental studies. qRT-PCR analysis revealed that the expression levels of selected TaDofs were significantly altered by salt and drought stress. Through BLASTp searches, the possible roles of TaDofs were manually annotated by alignment to Dofs with experimentally supported functions in other species. Our results provide valuable clues for further analysis of the regulatory mechanisms and specific functions of TaDof genes in wheat, especially their roles in stress response.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4395/10/2/294/s1, Figure S1: Sequence logos for 2 motifs in Dof domains, Table S1: Primers used for TaDof gene cloning, Table S2: qRT-PCR primers for TaDof genes, Table S3: Dof genes in Arabidopsis, Oryza sativa, and Zea mays, Table S4: Multilevel consensus sequences in TaDof genes identified by MEME, Table S5: Chromosome locations of TaDof genes in the Chinese Spring wheat reference genome, Table S6: Ka/Ks analysis and estimated divergence time for duplicated TaDof genes, Table S7: Duplication of gene pairs among wheat and three progenitor species, Table S8: Functional annotation of wheat Dof genes, Table S9: TaDof gene expression levels represented by TPM values, Table S10: Wheat miRNAs targeting TaDofs, Table S11: The cis-elements in upstream sequences of TaDof genes, Table S12: Function prediction of TaDofs by BLASTp searching, Table S13: BLASTp search results, File S1: Sequences of Dof genes and coding proteins used in this research, File S2: Sequences of Dof genes with experimentally supported functions.