Genetic Analysis of Hexaploid Wheat (Triticum aestivum L.) Using the Complete Sequencing of Chloroplast DNA and Haplotype Analysis of the Wknox1 Gene

The aim of the present study is the genetic characterization of the hexaploid wheat Triticum aestivum L. Two approaches were used for the genealogical study of hexaploid wheats: complete sequencing of chloroplast DNA (cpDNA) and PCR-based haplotype analysis of the fourth intron of Wknox1d and of the fifth-to-sixth-exon region of Wknox1b. The complete cpDNA sequences of 13 hexaploid wheat samples were determined. Free-threshing: T. aestivum subsp. aestivum, one sample; T. aestivum subsp. compactum, two samples; T. aestivum subsp. sphaerococcum, one sample; T. aestivum subsp. carthlicoides, four samples. Hulled: T. aestivum subsp. spelta, three samples; T. aestivum subsp. vavilovii Jakubz., two samples. A comparative analysis of the complete cpDNA sequences of 20 hexaploid wheat samples (13 samples from this article plus 7 samples sequenced in this laboratory in 2018) was carried out, together with PCR-based haplotype analysis of the fourth intron of Wknox1d and of the fifth-to-sixth-exon region of Wknox1b of all 20 samples. The 20 hexaploid wheat samples can be divided into two groups: the three T. aestivum subsp. spelta samples together with T. aestivum subsp. vavilovii collected in Armenia, and the remaining 16 samples, including T. aestivum subsp. vavilovii collected in Europe (Sweden). Taking the cpDNA of Chinese Spring as a reference, 25 SNPs can be identified. In T. aestivum subsp. spelta and subsp. vavilovii (Vav1), 13–14 SNPs can be identified; in the other samples, up to 11 SNPs were detected. In total, 22 SNPs are found in intergenic regions, 2 in introns, and 10 in genes, of which seven are synonymous. The PCR-based haplotype analysis suggests that the hexaploid wheats T. aestivum subsp. macha var. palaeocolchicum and var. letshckumicum differ from other macha samples by the absence of a 42 bp insertion in the fourth intron of Wknox1d. One possible explanation for this observation is that two forms of Aegilops tauschii Coss., (A) and (B), participated in the formation of hexaploids through the D genome: Ae. tauschii (A) for macha (1–5, 7, 8, 10–12), and Ae. tauschii (B) for macha M6 and M9 and for T. aestivum subsp. aestivum cv. ‘Chinese Spring’ and cv. ‘Red Doly’.


Introduction
Wheat is the leading grain crop in the world. It originated in the Fertile Crescent approximately 10,000 years ago and has since spread worldwide. There are two biological species of hexaploid wheat: T. aestivum (genome BBAᵘAᵘDD) and Triticum zhukovskyi Menabde & Ericz. (genome GGAᵘAᵘAᵐAᵐ). T. zhukovskyi and its predecessors (Triticum monococcum L. and Triticum timopheevii (Zhuk.) Zhuk.) form a separate lineage unrelated to the evolution of the principal wheat lineage, which is formed by T. aestivum and its predecessors Aegilops tauschii Coss. and Triticum turgidum L. [1]. The T. aestivum lineage is divided into the domesticated, hulled lineage and the free-threshing lineage. The free-threshing lineage includes T. aestivum subsp. aestivum, T. aestivum subsp. compactum (Host) Mackey, T. aestivum subsp. sphaerococcum (Percival) Mackey, and T. aestivum subsp. carthlicoides nom. nud. (Tables 1 and 2). The hexaploid wheat T. aestivum subsp. carthlicoides was found by Kuckuck [3] near the border of Turkey, Armenia and West Georgia. This hexaploid wheat showed a subsp. carthlicum-like spike morphology.
According to Dekaprelevich, Georgia is characterized by the largest number of cultivated wheat species in the world; altogether, 12 species of wheat are found here. Only three narrowly endemic species, Triticum abyssinicum Vav., T. sphaerococcum and T. spelta, are absent [4]. All these cultivated species and subspecies have been found only in West Georgia, except subsp. carthlicum, which has been distributed in East Georgia as well. All these species and subspecies grew in the territory of Georgia until the middle of the last century.
The "Wheat Enigma" is a term for the observation that the wild predecessors of five Georgian endemic wheat subspecies are found in the Fertile Crescent, quite far from the South Caucasus [8,9]. One possible explanation of the "Wheat Enigma" is that speakers of the Proto-Georgian language, after migrating from Africa to the Arabian Peninsula, could have moved to Mesopotamia, where wheat was domesticated. They could then have migrated to the South Caucasus together with the domesticated wheat subspecies [9,10].
The examination of genealogical data provides insights into the evolutionary history of a species. The wide application of gene genealogies for evolutionary studies in plants involves identifying DNA sequences with levels of ordered variation within chloroplast, mitochondrial, or nuclear genomes [11]. Traditionally, extranuclear DNA, such as chloroplast DNA (cpDNA), has been considered an effective tool for genealogical studies [12–14]. The sequences of wheat plasmons B and G (complete cpDNA sequences) of the genus Triticum were determined in our laboratory [8,15,16]. Plasmon B is detected in the polyploid species Triticum turgidum and T. aestivum. The plasmon of T. zhukovskyi belongs to the G type.
Another effective tool for gene genealogy studies in Triticum species is the set of three homoeologous loci of the wheat Wknox1 gene, which functions at shoot apical meristems (SAM) [17]. A comparative study of the three Wknox1 genomic sequences revealed an accumulation of numerous mutations, particularly in the fourth intron and the 5′-upstream region. Later, Takumi and Morimoto [18] reported the discovery of a new allele for the fifth-to-sixth-exon region of the Wknox1b KNOTTED1-type homeobox gene in a common wheat subspecies (T. aestivum subsp. carthlicoides).
The second aim of the present investigation was to carry out a PCR-based haplotype analysis of the fourth intron of Wknox1d and of the fifth-to-sixth exon region of Wknox1b of all 20 hexaploid wheat samples.

Complete cpDNA Sequence of Hexaploid Wheats
For comparative analysis of the chloroplast DNA of hexaploid wheats, 20 hexaploid wheat samples were selected. The cpDNA of 13 of them was sequenced in this study, and that of the remaining 7 was sequenced earlier [15] (Table 3).
To illustrate the evolutionary relationship among the studied cultivars, a phylogenetic tree was constructed based on the complete cpDNA nucleotide sequences of the 20 hexaploid wheat samples (Figure 1). Taking the cpDNA of Chinese Spring as a reference, 25 SNPs can be identified. In T. aestivum subsp. spelta and subsp. vavilovii (Vav1) (collected in Armenia), 13–14 SNPs can be identified (Table 4). In the other samples, up to 11 SNPs were detected. In total, 22 SNPs are found in intergenic regions, 2 in introns, and 10 in genes, of which seven are synonymous and do not alter the amino acids. Indels specific to the 20 cpDNA sequences are given in Table 5.
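The distinction between synonymous and nonsynonymous SNPs in genes can be illustrated with a minimal sketch. It assumes that the standard codon-to-amino-acid assignments apply to plastid genes and ignores RNA editing; the function name `classify_snp` and the toy coding sequence are hypothetical and are not taken from the study.

```python
from itertools import product

# Standard codon assignments, listed in NCBI order (bases T, C, A, G).
# Assumption: these assignments are also valid for plastid genes.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(ref_cds, pos, alt_base):
    """Classify a single-base substitution in an in-frame coding sequence.

    ref_cds  : reference coding sequence (A/C/G/T), starting in frame
    pos      : 0-based position of the SNP within ref_cds
    alt_base : base observed in the sample at that position
    """
    ref_cds = ref_cds.upper()
    start = pos - pos % 3                      # first base of the affected codon
    ref_codon = ref_cds[start:start + 3]
    if len(ref_codon) < 3:
        raise ValueError("SNP falls in an incomplete codon")
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


if __name__ == "__main__":
    toy_cds = "ATGGCTGAA"                      # hypothetical: Met-Ala-Glu
    print(classify_snp(toy_cds, 5, "C"))       # GCT -> GCC, both alanine
    print(classify_snp(toy_cds, 6, "C"))       # GAA -> CAA, Glu -> Gln
```

In practice the codon frame would come from the cpDNA annotation, so each genic SNP would first have to be mapped to its position within the annotated coding sequence.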
In the present investigation, the genomic sequences of the Wknox1d fourth intron regions were amplified by PCR using the primer pair of Takumi and Morimoto [18] (Table 7). A common wheat, T. aestivum subsp. aestivum cv. 'Chinese Spring' (CS), and a durum wheat, T. turgidum subsp. durum cv. 'Langdon' (Ldn), were used as PCR controls, and the amplified DNA fragments were visualized on an agarose gel. In the fourth intron of Wknox1d in common wheat, a 122-bp MITE insertion has been reported [17]. The MITE-containing band (411 bp) was missing in all tetraploid wheat accessions and was observed in all subspecies of common wheat. In the case of subsp. macha (11 samples), a 453 bp band was observed. In two macha samples (M6 and M9), the MITE-containing band typical of other hexaploid subspecies (411 bp) was detected. On a 2% agarose gel, the PCR-amplified fourth intron region of Wknox1d of hexaploid wheat gives four bands: N1 (280 bp), N2 (375 bp), N3 (411 bp), and N4 (453 bp). Bands N3 and N4 were cut out and sequenced. In band N4 (453 bp), a 42 bp insertion was detected at position 7009: AGTTTGCACACCTGAACATTTTGCATTATGTTCGGGAGCCTA (Figure 2).
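The relationship between bands N3 and N4 can be checked with simple arithmetic: the reported insertion is 42 bp long, and 411 bp + 42 bp = 453 bp. The sketch below only illustrates this check. The function `classify_band` and its 10 bp tolerance are assumptions introduced here, not part of the published protocol.

```python
# Insertion sequence reported for band N4 of the Wknox1d fourth intron.
INSERTION = "AGTTTGCACACCTGAACATTTTGCATTATGTTCGGGAGCCTA"

BAND_N3_BP = 411  # MITE-containing band, typical of most hexaploid subspecies
BAND_N4_BP = 453  # band observed in most subsp. macha samples


def classify_band(size_bp, tolerance_bp=10):
    """Assign an estimated gel band size to N3 or N4.

    tolerance_bp is a hypothetical allowance for sizing error on a gel;
    it must stay below half the 42 bp difference between the two bands.
    """
    if abs(size_bp - BAND_N4_BP) <= tolerance_bp:
        return "N4"
    if abs(size_bp - BAND_N3_BP) <= tolerance_bp:
        return "N3"
    return "unassigned"


if __name__ == "__main__":
    print(len(INSERTION))                    # length of the reported insertion
    print(BAND_N3_BP + len(INSERTION))       # expected size of band N4
    print(classify_band(450))                # a band estimated at ~450 bp
```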
Hexaploid wheats with matching values of the fourth intron of Wknox1d and the fifth-to-sixth-exon region of Wknox1b are given in Table 9.

Discussion
It is widely believed that the birthplace of T. aestivum is in a region between Transcaucasia and southwestern Caspian Iran [19]. The first step in the evolution of cultivated wheat was the formation in northern Mesopotamia of a tetraploid species with an AᵘAᵘBB genome [20]. Approximately 7000 years ago, the hexaploid bread wheat T. aestivum L. (BBAᵘAᵘDD) arose in southwestern Caspian Iran and Transcaucasia by allopolyploidization of the cultivated emmer wheat Triticum dicoccum Schrank with the Caucasian Ae. tauschii [19,21–24].
It should be noted that Ae. tauschii was found in the western Caucasus as well [25,26]. According to the authors, "Ae. squarrosa (Ae. tauschii) grows in lowlands and mountain foothills, rarely in dry and humid silty areas up to the middle of the mountain, mainly in desert, semi-desert and field vegetation, as well as weeds in Abkhazia, Samegrelo, Imereti, Guria, Adjara, Kartli, in outer Kakheti, Gardabani" [25]. It can be assumed that two forms of Ae. tauschii, (A) and (B), participated in the formation of hexaploid wheats through the D genome (Tables 3 and 7).
T. aestivum subsp. carthlicoides most likely originated in western Transcaucasia [3]. In 1967, Kuckuck found populations with carthlicum and carthlicoides types near the border of East Turkey (Ardahan and Kars Provinces) and West Georgia. This hexaploid wheat accession showed a subsp. carthlicum-like morphology. Subsp. carthlicum was proposed to have originated from spontaneous hybridization between subsp. carthlicoides and cultivated emmer wheat, T. turgidum subsp. dicoccon (Schrank) Thell. Subsp. carthlicoides should be considered the original and older genotype from which the genes for this particular ear morphology were transferred, together with the Q-factor, to T. carthlicum [3,18]. According to Kuckuck, this region was distinguished by tremendous genetic variation in wheat, including T. dicoccum, T. carthlicum and T. aestivum subsp. macha [3]. It can be assumed that subsp. carthlicoides took part in the formation of subsp. macha as an ancestor.
One of the four subspecies of wheat detected in Georgia is hexaploid, domesticated, hulled wheat T. aestivum subsp. macha [15]. This subspecies was detected in West Georgia in 1928 and described by Dekaprelevich and Menabde [27]. It is endemic to Georgia and is cultivated along with tetraploid West Georgian wheat (T. turgidum subsp. palaeocolchicum) [28].
It is proposed that macha and West Georgian wheats are sibling cultivars that arose in a hybrid swarm involving T. aestivum and wild emmer wheat [29]. It is accepted that spelt wheat is derived from free-threshing hexaploid wheat (T. aestivum subsp. aestivum) by hybridization with hulled emmer (T. turgidum). Experimental data suggest that European and Asian spelt may be polyphyletic. Free-threshing, hexaploid wheat seems to precede spelt. At least some European spelt originated from hybridization of club wheat (T. aestivum subsp. compactum) with emmer (domesticated hulled Triticum turgidum). Free-threshing hexaploid wheat was an ancestor of not only European spelt but also of some of the Asian forms of spelt, although the exact role free-threshing wheat has played is debatable [1].
The comparative analysis of the complete cpDNA nucleotide sequences of the 20 hexaploid wheat samples shows that the samples are divided into two groups: the three Triticum aestivum subsp. spelta samples together with T. aestivum subsp. vavilovii (Vav1) (collected in Armenia), and the remaining 16 samples, including T. aestivum subsp. vavilovii (Vav2) (collected in Sweden).
Hirosawa et al. [30] found two cpDNA SSR lineages in T. aestivum: Plastogroup I and Plastogroup II. Most subsp. aestivum, compactum, macha, and sphaerococcum accessions belonged to Plastogroup I. Subspecies spelta was split between Plastogroup I (70% of the accessions examined) and Plastogroup II (30%). Given the numbers of subsp. spelta accessions used (27 accessions in Hirosawa et al. and 3 in our work), the two cpDNA trees (Hirosawa et al.'s and ours) appear quite consistent. Plastogroup I (and our major lineage) might represent a major lineage that was transmitted from cultivated T. turgidum via a hybrid cross with Ae. tauschii and subsequent allopolyploidization. Plastogroup II (and our minor lineage) might have resulted from introgression between hexaploid and tetraploid wheats, because subsp. spelta has emmer wheat ancestry in Europe and Asia [1].
PCR-based haplotype analysis of the fourth intron of Wknox1d and the fifth-to-sixth-exon region of Wknox1b was carried out for all 20 hexaploid wheat samples. The analysis suggests that the hexaploid wheats T. aestivum subsp. macha var. palaeocolchicum and var. letshckumicum differ from other macha samples by the absence of a 42 bp insertion in the fourth intron of Wknox1d.
In recent years, wheat yields per hectare have not increased, which is a concern given the growing world population. Transferring agronomically important genes from wild relatives to common wheat has been shown to be an effective approach to hexaploid wheat improvement. Advances in new technologies have made the complete wheat reference genome available, which offers a promising basis for the study of wheat improvement needed to meet global food demand [31].

Materials and Methods
The seeds of the hexaploid wheat samples were received from the seed bank of the Agricultural University of Georgia (the late Prof. P. Naskidashvili) and the Institute of Botany (Ilia State University, Tbilisi, Georgia) (ISU); the U.S. National Plant Germplasm System (GRIN-Global); the National BioResource Project-WHEAT Centre (Graduate School of Agriculture, Kyoto University, Kyoto, Japan); the Scientific Research Center of Agriculture (SRCA) (Georgia); and Tel Aviv University (Israel). The seeds were germinated in water at room temperature. Total genomic DNA extraction from young leaves, the construction of genomic DNA libraries and the assembly of cpDNA have been described earlier [15,31]. Automatic annotation of cpDNA was performed with CpGAVAS [32]. For the detection of single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) and for phylogenetic tree construction, the programs MAFFT and BLAST were used [33,34].
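How SNPs and indels can be read off an alignment is sketched below. This is not the authors' pipeline: it assumes an aligned FASTA file of the kind a multiple aligner such as MAFFT writes (equal-length sequences, gaps as "-"), and the function names, record names and toy sequences are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def find_variants(ref_aln, alt_aln):
    """Compare two aligned sequences and list SNPs and indels.

    Positions are 1-based reference coordinates. An insertion is reported
    at the last reference base before it; a deletion at its first deleted
    reference base.
    """
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    snps, indels = [], []
    ref_pos = 0          # reference bases consumed so far
    in_gap = False       # True while extending the current indel
    for r, a in zip(ref_aln.upper(), alt_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == "-" and a == "-":
            continue     # column gapped in both rows of a multiple alignment
        if r == "-" or a == "-":
            kind = "insertion" if r == "-" else "deletion"
            if in_gap and indels[-1][1] == kind:
                pos, _, length = indels[-1]
                indels[-1] = (pos, kind, length + 1)
            else:
                indels.append((ref_pos, kind, 1))
            in_gap = True
        else:
            in_gap = False
            if r != a:
                snps.append((ref_pos, r, a))
    return snps, indels


if __name__ == "__main__":
    toy = ">reference\nACGT--ACGTA\n>sample\nACCTGGAC-TA\n"   # hypothetical data
    seqs = parse_fasta(toy)
    print(find_variants(seqs["reference"], seqs["sample"]))
```

Using one sequence (here, the reference) as the coordinate system mirrors the way SNPs are counted relative to Chinese Spring in the text.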

Construction of Shotgun Genomic DNA Libraries
Construction of 13 shotgun genomic libraries and sequencing on the NovaSeq 6000 was carried out at the Roy J. Carver Biotechnology Center, University of Illinois at Urbana-Champaign (UIUC). The shotgun genomic DNA libraries were constructed from 50 ng of DNA after sonication with a Covaris ME220 (Covaris, MA, USA) to an average fragment size of 400 bp with the Hyper Library Preparation Kit from Kapa Biosystems (Roche, CA, USA). To prevent index switching, the libraries were constructed using unique dual-indexed adaptors from Illumina (San Diego, CA, USA). The individually barcoded libraries were amplified with 6 cycles of PCR and run on a Fragment Analyzer (Agilent, CA, USA) to confirm the absence of free primers and primer dimers and to confirm the presence of DNA of the expected size range. Libraries were pooled in equimolar concentration and the pool was further quantitated by qPCR on a BioRad CFX Connect Real-Time System (Bio-Rad Laboratories, Inc., Hercules, CA, USA).
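Equimolar pooling can be illustrated with the usual conversion from mass concentration to molarity for double-stranded DNA, which assumes an average mass of about 660 g/mol per base pair. The sketch below is a generic illustration and not the sequencing facility's procedure; the function names and all numeric inputs are hypothetical.

```python
def library_molarity_nM(conc_ng_per_ul, mean_size_bp):
    """Convert a dsDNA library concentration to nanomolar.

    Assumes an average mass of 660 g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_size_bp)


def equimolar_volumes(molarities_nM, fmol_per_library):
    """Volume (in microliters) of each library that contributes the same
    molar amount to the pool; 1 nM equals 1 fmol per microliter."""
    return {name: fmol_per_library / nM for name, nM in molarities_nM.items()}


if __name__ == "__main__":
    # Hypothetical concentrations (ng/uL) and mean library sizes (bp).
    libraries = {"lib_A": (6.6, 500), "lib_B": (3.3, 500)}
    molarities = {n: library_molarity_nM(c, s) for n, (c, s) in libraries.items()}
    print(molarities)
    print(equimolar_volumes(molarities, fmol_per_library=40.0))
```

A pool assembled this way would still be checked by qPCR, as described above, because mass-based estimates do not distinguish adaptor-ligated fragments from other DNA.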

Sequencing of Libraries in the NovaSeq
The pooled, barcoded, shotgun libraries were loaded on a NovaSeq SP lane for cluster formation and were sequenced for 150 cycles from each side of the DNA fragments. The fastq read files were generated and demultiplexed with the bcl2fastq v2.20 Conversion Software (Illumina, San Diego, CA, USA).

PCR Analysis of Three Homoeologous Wheat Wknox1 Gene
The genomic sequences of the Wknox1d 4th intron regions were amplified by PCR using the primer pair of Takumi and Morimoto [18]: 5′-AAAAAAAAGGTTAAATGGAC-3′ and 5′-ACCTTATACATGATTGGGAA-3′. The Wknox1b 5th-to-6th exon region was amplified by PCR using the primer pair 5′-GCTGAAGCACCATCTCCTGA-3′ and 5′-CATGTAGAAGGCGGCGTTAG-3′. The DNA fragments (PCR products) were excised from the agarose gel. DNA extraction and purification from the agarose gel were performed with the QIAquick Gel Extraction Kit (QIAGEN). PCR products were purified with the GenElute PCR Clean-Up Kit (Sigma-Aldrich), dye labeled using a Big Dye Terminator Kit (Applied Biosystems), and sequenced on an Applied Biosystems 3700 genetic analyzer (Laboratory Services Division of the University of Guelph, Guelph, ON, Canada).
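The expected size of a PCR product can be estimated in silico by locating the forward primer and the reverse complement of the reverse primer in a template. The sketch below uses the Wknox1d primer pair quoted above; the template is a hypothetical toy sequence, exact primer matching is assumed, and the function names are introduced here for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product amplified from one strand of `template`.

    Assumes exact matches: the forward primer is searched as written, and
    the reverse primer as its reverse complement downstream of it.
    Returns None if either site is missing.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


if __name__ == "__main__":
    fwd = "AAAAAAAAGGTTAAATGGAC"      # Wknox1d 4th intron, forward primer
    rev = "ACCTTATACATGATTGGGAA"      # Wknox1d 4th intron, reverse primer
    # Hypothetical template: primer sites separated by 50 filler bases.
    toy_template = "GG" + fwd + "C" * 50 + reverse_complement(rev) + "TT"
    print(amplicon_length(toy_template, fwd, rev))
```

Applied to sequences with and without the 42 bp insertion, such a calculation would give products differing by 42 bp.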

Conclusions
Based on complete chloroplast DNA sequences, the 20 hexaploid wheat samples can be divided into two groups: the three T. aestivum subsp. spelta samples together with T. aestivum subsp. vavilovii collected in Armenia, and the remaining 16 samples, including T. aestivum subsp. vavilovii collected in Europe (Sweden).
Based on the fourth intron of Wknox1d and the fifth-to-sixth-exon region of Wknox1b, the hexaploid wheats T. aestivum subsp. macha var. palaeocolchicum and var. letshckumicum were found to differ from other macha samples by the absence of a 42 bp insertion in the fourth intron of Wknox1d. Two