Cyanogenesis in the Sorghum Genus: From Genotype to Phenotype

Domestication has resulted in a loss of genetic diversity in our major food crops, leading to susceptibility to biotic and abiotic stresses linked with climate change. Crop wild relatives (CWR) may provide a source of novel genes potentially important for regaining climate resilience. Sorghum bicolor is an important cereal crop with wild relatives that are endemic to Australia. Sorghum bicolor is cyanogenic, but the cyanogenic status of wild Sorghum species is not well known. In this study, leaves of wild species endemic to Australia are screened for the presence of the cyanogenic glucoside dhurrin. The direct measurement of dhurrin content and the potential for dhurrin-derived HCN release (HCNp) showed that all the tested Australian wild species were essentially phenotypically acyanogenic. The unexpectedly low dhurrin content may reflect the variable and generally nutrient-poor environments in which they grow in nature. Genome sequencing of six CWR and PCR amplification of the CYP79A1 gene from additional species showed a high conservation of the key amino acids required for correct protein function and dhurrin synthesis, pointing to the transcriptional regulation of the cyanogenic phenotype in wild sorghum, as previously shown in elite sorghum.


Introduction
Sorghum (Sorghum bicolor (L.) Moench) is the fifth most important cereal crop worldwide. As a C4 plant, sorghum has several advantages over wheat and rice under harsh growing conditions as a result of increased photosynthetic efficiency and a higher tolerance to drought and elevated temperatures [1,2]. Sorghum produces the cyanogenic glucoside dhurrin in all vegetative tissues. Cyanogenesis describes the process whereby cyanogenic glucosides are hydrolyzed by specific β-glucosidases to release hydrogen cyanide (HCN) [3]. Plants avoid autotoxicity by the spatial separation of cyanogenic substrate and enzyme at the cellular or subcellular level, so HCN is only released following tissue disruption [3–5]. This binary system has been demonstrated to provide plants with an immediate targeted response to herbivore attacks [6–9]. However, cyanogenesis also limits the use of sorghum as livestock feed and forage [10].
Cyanogenic glucosides are widespread throughout the plant kingdom [9,11–17], yet these compounds occur at a disproportionately high frequency in cultivated plants [18]. It is currently unclear why so many crop plants, including S. bicolor, are cyanogenic. An increased production of cyanogenic glucosides may have been indirectly selected for during domestication as a form of natural pesticide [3,18,19]. Cyanogenesis is an effective deterrent against generalist herbivores, but some of the most common and damaging insect pests of S. bicolor (e.g., cotton bollworm (Helicoverpa armigera), sorghum midge (Stenodiplosis sorghicola)) feed mainly on the acyanogenic mature grains, rather than the cyanogenic vegetative tissues, thereby avoiding any potential HCN toxicity effects [20,21]. The initial reasons for the domestication and early cultivation of S. bicolor must also be considered. Sorghum was likely cultivated primarily for the production of the acyanogenic grain, rather than as an animal feed [22–25]. The presence of dhurrin in all vegetative parts of sorghum may therefore have provided benefits in terms of deterring herbivores, without any impact of cyanide toxicity on human health.
The prevalence of cyanogenesis in domesticated and undomesticated plants has generally been studied independently [3,18]. Direct comparisons of cyanogenic traits between major crop species and their genetically isolated wild relatives have rarely been performed. For example, Nassar and Fichtner [26,27] assessed the quantitative HCN content of six undomesticated cassava (Manihot) species but did not examine domesticated varieties (M. esculenta) in the same study. Variation in cyanogenic capacity has been investigated more thoroughly in naturalized populations of domesticated species, for example in lima bean (Phaseolus lunatus) [9], white clover (Trifolium repens) [28,29], legumes (Lotus spp.) [30], Macadamia spp. [31] and the rubber tree (Hevea spp.) [32].
S. bicolor was originally domesticated in northeastern Africa, including Ethiopia, Sudan and East Africa, around 6000 years ago [44,45], and most Eusorghum are also native to the African continent. S. halepense is thought to have originated in the Mediterranean, but has become widespread across the world and is considered an invasive weedy species [46]. The hybrid S. × almum was developed in Argentina from a cross between S. bicolor and S. halepense and is widely grown in Argentina for animal fodder, but it is less popular elsewhere due to high HCNp [47]. A large proportion of undomesticated Sorghum species (15 of 19) are distributed exclusively in the remote, relatively undisturbed regions of northern Australia [36,48,49]. The only species that is found as far south as Victoria is S. leiocladum, which is thought to have been traded by Australia's First Peoples. Due to the varied geographic distribution and likely extensive period of genetic isolation between species of the Sorghum genus [23], the endemic Australian species provide a unique opportunity to investigate the evolutionary drivers for the deployment of cyanogenic glucosides, i.e., differences in composition that may have arisen as a result of natural selection rather than anthropogenic artificial selection.
Figure 1. Phylogenetic tree of the Sorghum genus, modified from Dillon et al. [33]. Letters in parentheses after each species indicate taxonomic subgenera, where S = Stiposorghum, P = Parasorghum, E = Eusorghum, C = Chaetosorghum, H = Heterosorghum. This is the strict consensus tree of 46 equally parsimonious trees of 1666 steps (CI = 0.873) for the combined Adh1, ITS1 and ndhF sequence data under maximum parsimony analysis. Numbers above branches are percentages of 10,000 bootstrap replicates in which the clade was recovered. Trees were rooted using Zea mays and Hordeum vulgare. Chromosome numbers sourced from [42] except those denoted with *, which were sourced from [43].
Resource allocation theories predict that the synthesis and maintenance of nitrogen-based cyanogenic glucosides must come at a cost to plant growth [50,51]. Accordingly, plants distributed in natural, resource-limited environments, such as undomesticated Sorghum, might show a reduced capacity for cyanogenic glucoside production and subsequent release of HCN. In this study, a phenotype-genotype approach is employed with the aim of constructing the first profile of cyanogenesis across the undomesticated species of the Sorghum genus. Phenotypic variation in cyanogenic status is assessed through the analysis of differences in two distinct parameters: firstly, the quantitative potential to release hydrogen cyanide from leaf tissue is measured at different developmental stages in species of the five Sorghum subgenera; secondly, the identity and relative content of any cyanogenic glucosides present are determined by LC-MS. Phenotypic analyses are conducted at pre-flowering stages, as dhurrin concentration peaks during seedling development in S. bicolor [52]. Genotypic differences are also investigated by analyzing structural variation in genes known or thought to be involved in one of several cyanogenic pathways, including dhurrin biosynthesis and bio-activation [25,53,54], HCN detoxification and endogenous recycling of dhurrin [25,55]. In addition, the sequence of CYP79A1, the key gene in dhurrin biosynthesis, is analyzed in greater detail by isolation of the gene from the majority of CWR.

Hydrogen Cyanide Potential in the Wild Relatives of Sorghum
Hydrogen cyanide potential (HCNp), as a proxy for cyanogenic glucoside concentration, was determined for leaf tissue harvested from S. bicolor and 18 related Sorghum species. Hydrogen cyanide release was measured in the leaves of all species at three time points during plant development (Figure 2). Foliar HCNp was extremely low in all wild species from the Chaetosorghum, Heterosorghum, Parasorghum and Stiposorghum subgenera compared with the three Eusorghum species at 2, 4, and 6 weeks post-germination (p < 0.001). The overall average HCNp across the tertiary species was similar at each harvest point, with an average of 0.33 ± 0.04 µg g⁻¹ at 2 weeks post-germination, 0.37 ± 0.03 µg g⁻¹ at 4 weeks, and 0.33 ± 0.03 µg g⁻¹ at 6 weeks (Figure 2). This was up to three orders of magnitude lower than the HCNp in the three Eusorghum species, which together had an overall average of 880 ± 70 µg g⁻¹ at 2 weeks, 660 ± 50 µg g⁻¹ at 4 weeks, and 520 ± 0.50 µg g⁻¹ at 6 weeks (Figure 2). HCNp did not vary significantly among the Eusorghum species at the different time points, except at 2 weeks, when S. bicolor had significantly higher HCNp (1200 ± 110 µg g⁻¹) than both S. halepense (670 ± 8 µg g⁻¹) and S. × almum (790 ± 90 µg g⁻¹) (p < 0.05). There was a general trend towards a decrease in HCNp over time in each Eusorghum species, though the differences were not statistically significant in either S. halepense or S. × almum (p > 0.05). While HCNp varied significantly among the 16 CWR at the µg scale, the differences were minute in terms of final HCN concentration.
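The scale of the difference between the two groups can be checked directly from the group means quoted above. The following is a minimal sketch in Python, not part of the original analysis; the dictionary and function names are illustrative, and only the reported means (not the errors or the underlying replicate data) are used.

```python
import math

# Mean HCNp values (µg HCN per g dry weight) as quoted in the text.
WILD_MEAN = {2: 0.33, 4: 0.37, 6: 0.33}           # tertiary wild species
EUSORGHUM_MEAN = {2: 880.0, 4: 660.0, 6: 520.0}   # three Eusorghum species


def fold_difference(week):
    """Return (ratio, log10 ratio) of Eusorghum to wild mean HCNp at a harvest week."""
    ratio = EUSORGHUM_MEAN[week] / WILD_MEAN[week]
    return ratio, math.log10(ratio)


for week in (2, 4, 6):
    ratio, orders = fold_difference(week)
    print(f"week {week}: {ratio:,.0f}-fold ({orders:.1f} orders of magnitude)")
```

At every harvest point the log10 ratio lies between 3 and 4, which is consistent with the statement that the wild species are roughly three orders of magnitude lower.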

Relative Cyanogenic Glucoside Content
LC-MS analysis of leaf tissue of a subset of five wild species (Chaetosorghum: S. macrospermum; Stiposorghum: S. brachypodum, S. interjectum; Parasorghum: S. purpureosericeum, S. stipoideum) and two Eusorghum species (S. halepense and S. propinquum) identified the cyanogenic glucoside present in the wild sorghum species as dhurrin, the same cyanogenic glucoside as in the cultivated sorghum S. bicolor (Figure 3). The two Eusorghum species showed significantly higher relative dhurrin concentrations than any of the wild species, by up to two orders of magnitude in some cases (p < 0.05) (Figure 3A). Amongst the wild species, S. macrospermum had the lowest relative dhurrin concentration, although the differences were not statistically significant (Figure 3A). The pattern seen in the LC-MS results was similar to the HCNp detected in leaves from the same individual plants of each species (Figure 3B). Both Eusorghum species showed significantly higher HCNp, as determined by the colorimetric assay, than the wild species.

Variation within Cyanogenesis Related Genes in Sorghum Detected by Genome Sequencing
Access to preliminary genome sequence data for six CWR (S. laxiflorum, S. macrospermum, S. brachypodum, S. leiocladum, S. matarankense and S. purpureosericeum [35]) enabled the analysis of the variation present in selected genes involved in cyanogenesis and related pathways. Briefly, the trimmed reads were mapped to the genomic sequences of the 18 selected S. bicolor genes (Supplementary Table S2). The alignment of the reads identified numerous single nucleotide variants (SNVs) in all selected genes in the six species (Table 1). In CYP79A1, the gene encoding the enzyme catalyzing the rate-limiting step in dhurrin biosynthesis, S. macrospermum (Chaetosorghum) had the fewest SNVs (146) and S. leiocladum (Parasorghum) the most (406). This is not an unexpected result, as these are the most closely related and one of the more distantly related species, respectively, to S. bicolor and the Eusorghum according to the current phylogeny (Figure 1) [33]. In general, the sequences of the genes more directly involved in cyanogenic metabolism, including dhurrin biosynthesis (CYP79A1, CYP71E1, UGT85B1, POR) and bioactivation (DHRs, HNL), varied the most across all six species. Genes encoding enzymes that function in more fundamental metabolic processes, such as ethylene synthesis (ACC), were generally more highly conserved across all species. Interestingly, the glutathione S-transferase (GST) family genes thought to be involved in the endogenous recycling and detoxification of cyanogenic glucosides [55] were also relatively conserved compared to the biosynthetic and bioactivation genes. The nitrilase 4 class (NIT4) genes, likely involved in recycling and in the general HCN detoxification pathway, varied substantially in all species relative to S. bicolor.
Table 1. Number of single nucleotide variants (SNVs) called for the examined genes in the six wild Sorghum species, S. laxiflorum (lax), S. macrospermum (mac), S. brachypodum (bra), S. leiocladum (lei), S. matarankense (mat) and S. purpureosericeum (pur), when mapped to the genomic S. bicolor sequence of the selected genes.
CYP79A1 was analyzed in greater detail by examining the predicted effects of all SNVs on the protein sequence. The majority of SNVs were predicted to result in a synonymous variant in the amino acid sequence and to have little or no impact on CYP79A1 function. The remaining SNVs were predicted to have a moderate impact on the protein as a result of missense mutations, although the majority of amino acid changes occurred within the hydrophobic class and are likely to have a lower impact on protein function [56]. Overall, 45% of this subset of SNVs resulted in changes among positively charged amino acids.

Variation within the Key Biosynthesis Gene, CYP79A1
To investigate the sequence of CYP79A1 in greater detail, PCR was used to amplify the gene from the majority of the wild sorghum species (Table 2). Primers were designed to the coding region of CYP79A1 as it was expected that untranslated regions would be more variable. PCR was successful in amplifying the full-length CYP79A1 gene from 11 species; for 3 species, data were available for both genome and PCR approaches (Table 2). Alignment of the CYP79A1 amino acid sequences obtained indicates that, overall, there is a high degree of conservation across the sorghum genus (approx. 85–95% identity; Supplementary Figure S1). The PCR sequence data show that the coding region of the CYP79A1 sequence in the wild sorghum species varies between 550 and 559 amino acid residues compared to 559 amino acids in S. bicolor. The length of the single intron in CYP79A1 varied from 81 to 195 nucleotide residues (Table 2). Significant divergence in the sequence of the introns and other non-coding regions may be less easily detected when mapping Illumina reads to the S. bicolor reference, as highly divergent reads may not map. Homology models of CYP79A1 were made using the solved crystal structures of relevant P450s as a template [57,58]. CYP79A1 contains 12 conserved major α-helices and 6 β-strands forming 2 highly conserved β-sheets (Figure 4A). The correct folding of the CYP79A1 protein is important for functionality, ensuring heme-binding and correct docking of the substrate, tyrosine. Modelling and mutation studies also identified additional key amino acids for CYP79A1 activity; the R152 residue (S. bicolor sequence used for numbering) is involved in positioning the tyrosine substrate [57,58] whilst R411 is part of the E-R-R triad locking the heme pockets into position and stabilizing the protein core structure by formation of a salt bridge with E408 and R411 of the XEXXR sequence and R460 in the PERF motif (a P450 signature sequence) [59].
The heme-binding domain (WXXXR) is also important for correct folding and stability of the enzyme. Site-directed mutagenesis and the generation of sorghum mutants also identified specific amino acids that impact on enzyme function and the synthesis of dhurrin [58,60]. The mutations E145K, R152A and T534A resulted in reduced dhurrin synthesis [58], whilst sorghum lines carrying the mutations P414L and C493Y were acyanogenic [60,61]. The analysis of the sequence data obtained by PCR from the wild species indicated that these identified motifs and key amino acids are conserved in all the wild species investigated (Figure 4B). The high conservation of the coding sequences in the CYP79A1 gene may reflect that this gene has an essential conserved function and supports the results of the cyanide assays, indicating that dhurrin is synthesized to some degree in all species.
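The two checks described above, overall identity between aligned sequences and conservation of residues at positions numbered on the S. bicolor sequence, can be sketched as follows. This is an illustrative Python sketch only: the function names are hypothetical, the short sequences are toy data rather than CYP79A1, and identity is computed here over columns without gaps, which is one of several common conventions and may differ from the definition used for Supplementary Figure S1.

```python
def percent_identity(ref_aligned, query_aligned):
    """Percent identical residues over alignment columns where neither sequence has a gap."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(r, q) for r, q in zip(ref_aligned, query_aligned)
             if r != "-" and q != "-"]
    matches = sum(1 for r, q in pairs if r == q)
    return 100.0 * matches / len(pairs)


def residue_at(ref_aligned, query_aligned, ref_position):
    """Return the query residue aligned to a 1-based position of the ungapped reference."""
    count = 0
    for r, q in zip(ref_aligned, query_aligned):
        if r != "-":
            count += 1
            if count == ref_position:
                return q
    raise IndexError("reference position beyond end of sequence")


# Toy alignment (NOT real CYP79A1 sequence): one insertion in the query.
ref = "MKR-TE"
query = "MKRATD"
print(percent_identity(ref, query))   # identity over the five ungapped columns
print(residue_at(ref, query, 3))      # query residue at reference position 3
```

With real data, `residue_at` would be called with the reference positions named in the text (for example 152 or 411) against each aligned wild-species sequence to confirm that the expected residue is present.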

Discussion
The current study investigates cyanogenesis in most of the currently known tertiary wild relatives of S. bicolor for the first time. The cyanogenic glucoside present in the wild species and elite S. bicolor is dhurrin, and the genomic machinery encoding the biosynthetic enzymes catalyzing dhurrin production and bioactivation was highly conserved among all tested species across the Sorghum genus. CYP79A1 has a very high substrate specificity with tyrosine being the only amino acid used as substrate [63]. With the highly conserved CYP79A1 sequences found in the wild compared to elite sorghums, it is not surprising that the wild sorghums also produce dhurrin. However, the phenotypic expression of hydrogen cyanide potential (HCNp) was substantially reduced in all species from the four undomesticated subgenera compared to S. bicolor and the domesticated Eusorghum. While a lower cyanogenic capacity in the tertiary Sorghum species was not unexpected, the order of magnitude differences between cultivated and tertiary Sorghum was greater than anticipated. Moreover, because of the high sequence identity of CYP79A1 between the elite and wild sorghum lines, this is not likely to result from the lost catalytic capacity of the CYP79A1 enzyme in the wild species.

Phenotypic Variation of Cyanogenesis in Sorghum
In the current study, S. bicolor and the other tested Eusorghum species (S. × almum, S. halepense and S. propinquum) showed high HCNp and dhurrin content. The production of specialized metabolites, such as cyanogenic glucosides, has long been thought to come at a metabolic cost to plants, tying up resources that could otherwise be utilized in growth and development [50,51]. Under the relatively controlled environmental conditions characterizing cultivated systems, these costs may be partially offset by a more stable uptake of essential resources, such as water and soil nutrients. However, production costs are likely to be more keenly felt in highly variable natural environments. This may be reflected in the highly reduced dhurrin content and negligible HCNp detected in the undomesticated species of the Chaetosorghum, Heterosorghum, Parasorghum and Stiposorghum subgenera. A large proportion of species in these groups are endemic to northern Australia, with the main center of diversity extending from the northerly monsoonal tropics to the arid and semi-arid regions of Central Australia. Soils in these regions are typically characterized by low concentrations of available nitrogen [23], the signature element and part of the fundamental nitrile group present in cyanogenic glucosides. This could suggest that the endemic wild Sorghum species with their low HCNp prioritize the allocation of available nitrogen to general metabolic processes in growth and development rather than to the production of dhurrin. In future, a comparative study of the cyanogenic glucoside content in the elite and wild Sorghum species from different habitats and following subjection to different levels of nitrogen deficiency might provide insights into the importance of these parameters in individual species belonging to the same genus [39,40].
Leaf tissue of wild species expressed cyanide potential at a scale lower than 1 µg g⁻¹, translating to dhurrin concentrations of less than 1 ppm. For feeding herbivores, plant tissue containing dhurrin at this low scale would be virtually indistinguishable from tissue with no capacity to release HCN, i.e., functionally acyanogenic [64]. Unless cyanogenesis is specifically induced by insect or pest attacks, this suggests that the wild, undomesticated Sorghum species do not utilize cyanogenic glucosides as part of their chemical defense, particularly as leaves are considered the plant organ most vulnerable to predation. Production of dhurrin in S. bicolor is developmentally regulated, with concentrations often found to be highest in young, developing tissues, such as newly formed leaves [52,65–69]. In terms of cyanogenesis, this pattern is consistent with the optimal resource allocation theory, in that leaves are generally the most heavily defended organ as they house the photosynthetic apparatus [70,71]. With the potentially high costs of dhurrin synthesis in nitrogen-poor environments, the wild Sorghum species may instead place greater emphasis on other defense mechanisms to deter herbivores, including the production of less costly carbon-based physical structures, such as trichomes [72–74]. Indeed, a higher density of epidermal trichomes has been documented in S. brachypodum and S. macrospermum [40]. It is currently unclear which specific herbivores feed on the endemic Australian species, although marsupials have been seen feeding on some species in the field (pers. comm. Dr Sally Norton).
Recent evidence indicates that cyanogenic glucosides possess additional physiological functions beyond defense, in particular acting as storage compounds for reduced nitrogen that can be recovered for use in general plant metabolism upon demand [25,55,67,75]. In the wild Sorghum species, dhurrin may be turned over immediately after production to release freely available reduced nitrogen. Future studies could explore this by comparing the transcript levels of the genes encoding the recycling pathway and by measuring the activity of the biosynthetic enzymes in microsomes isolated from the wild and domesticated sorghum lines, also considering the importance of diurnal rhythms [55,75,76].
An extremely low concentration of dhurrin and the resulting minute HCN potential were consistently observed across all wild Sorghum species examined in the current study. In previous studies, high quantitative and qualitative intraspecific variation of cyanogenesis has been recorded within species of other plant genera [3]. Some species of Eucalyptus show high quantitative variation for cyanogenic traits both within and between different populations [64,77–79]. In several species of Lotus, the potential for HCN release has been observed to vary both quantitatively between individual plants, and qualitatively at the population scale [29,80–82]. Within natural populations of white clover (Trifolium repens), individual plants can be either cyanogenic or completely acyanogenic, representing the true polymorphism of the trait [28,80,83–85]. In cassava (Manihot esculenta), another major cyanogenic crop, genetically isolated wild relatives (equivalent to the tertiary Sorghum species) showed quantitative variation in their potential for HCN release in tubers under stable environmental conditions [26,27]. Such levels of intraspecific variation were not apparent in the wild Sorghum species. However, different populations may vary substantially in the capacity to produce dhurrin. In our current study, only a single accession of each species, i.e., a wild population from a single locality, was examined.

Genomic Variation of Cyanogenesis in Sorghum
The preliminary genome sequencing analysis reported in this study focused on the genetic variation in cyanogenic metabolism within six wild relatives of Sorghum. The structure of the CYP79A1 gene was further investigated by isolating and sequencing the gene from the majority of sorghum CWRs. These results and the single nucleotide variants (SNVs) analysis of the genome data suggest that the majority of amino acid changes identified in the wild species do not have a major effect on the catalytic activity of CYP79A1, the important initial step of dhurrin biosynthesis.
In the analysis of selected cyanogenic genes in wild Sorghum, differences in gene copy number and ploidy levels may have major effects. For example, a CYP79A1-like gene positioned at chromosome 10 shows 75% identity to CYP79A1 positioned at chromosome 1 in S. bicolor [86]. When full genome sequences of the wild Sorghum species become available, more detailed analysis of the CYP79 sequences present should be performed to examine whether the sequence reads have been assigned to the correct gene. The ploidy level of the different Sorghum species also varies with the chromosome number for S. bicolor 2n = 20, whilst for the wild species it varies with 2n = 10-40, potentially affecting the synthesis and/or recycling of dhurrin.
The PCR and genomic sequencing results analyzed to date show that the dhurrin biosynthetic genes are present and largely intact in the geographically and genetically isolated tertiary Sorghum species. However, there were substantial differences in HCNp and concentration of dhurrin between wild and domesticated Sorghum. Therefore, the expression of the key dhurrin biosynthesis gene CYP79A1 is likely to be controlled by regulatory mechanisms. It has previously been shown that the biosynthesis of dhurrin in S. bicolor is regulated at the transcriptional level [52]. Ehlert et al. [12] also found evidence of transcriptional regulation of the cyanogenic glucoside epiheterodendrin in barley. Wild almond species accumulate the phenylalanine-derived bitter and toxic cyanogenic glucoside amygdalin [87]. The sweet kernels of domesticated almond resulted from a single amino acid substitution (L346P) in the transcription factor bHLH2 controlling the transcription of the CYP79- and CYP71-encoding genes in the amygdalin pathway. This nonsynonymous point mutation in the dimerization domain of bHLH2 prevented the formation of a functional dimer and the transcription of the two biosynthetic genes [13]. The absence of dhurrin formation in the leaves of the wild sorghum species is thus also likely to reflect the lack of transcription of the CYP79A1 gene. Future studies should include transcriptomic analysis of the tertiary wild Sorghum species at different stages of plant development, in different tissue types, and in plants grown under different environmental conditions to further understand the regulation of dhurrin. The genomic and phenotypic variation apparent in this one functional trait, cyanogenesis, suggests a high degree of genetic diversity in the wild Sorghum germplasm. These species are therefore a valuable genetic resource for the breeding of more climate-resilient Sorghum crops in the future [23,88].
Table S1). Sorghum bicolor (BTx623) seeds were supplied by the Queensland Alliance for Agriculture and Food Innovation (QAAFI), University of Queensland (UQ).

Plant Material and Growing Conditions
Seeds of the wild sorghums used for all experiments were germinated as detailed in Cowan et al. 2020 [41]. Briefly the caryopsis was removed from the seed covering and

Hydrogen Cyanide Assays
The youngest fully emerged leaf was harvested from each plant at 2, 4 and 6 weeks for analysis of cyanogenic glucoside concentration. The leaf tissue was freeze dried and ground to a fine powder using a MM 300 MixerMill (Retsch, Haan, Germany), and 10 mg of tissue was used to determine the evolved HCN, following the detection method of Gleadow et al. [89]. The hydrogen cyanide potential (HCNp) is the total amount of HCN evolved from hydrolysis of the entire content of endogenous cyanogenic glucosides. It is used as a proxy for dhurrin, such that each mg of HCN is equivalent to 11.5 mg of dhurrin in the plant tissue. All assays were performed in triplicate, and NaCN standards were included on each plate to create a standard curve. Data are expressed as cyanide potential (HCNp, mg CN g⁻¹ dry weight), that is, the maximum cyanide that can be released from the cyanogenic glucosides in the tissue, including any free cyanide that may be present. Depending on the amount of HCNp detected, the samples were either diluted 1 in 10 or assayed without dilution.
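The arithmetic behind the assay (reading HCN from a standard curve, correcting for dilution, normalizing to tissue mass and converting to dhurrin equivalents) can be sketched as below. This is a simplified Python illustration, not the published protocol of Gleadow et al. [89]: the function names, the linear standard curve and the example absorbance values are assumptions, and only the 11.5 conversion factor and the 10 mg tissue mass come from the text.

```python
# Conversion factor stated in the text: 1 mg HCN corresponds to 11.5 mg dhurrin.
DHURRIN_PER_HCN = 11.5


def fit_standard_curve(amounts, absorbances):
    """Ordinary least-squares line: absorbance = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, absorbances))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hcn_from_absorbance(absorbance, slope, intercept, dilution=1):
    """HCN amount (same unit as the standards), corrected for sample dilution."""
    return (absorbance - intercept) / slope * dilution


def hcnp_per_gram(hcn_ug, tissue_mg):
    """HCNp in µg HCN per g dry weight from µg HCN evolved and mg tissue assayed."""
    return hcn_ug / (tissue_mg / 1000.0)


def dhurrin_equivalent(hcnp):
    """Dhurrin concentration in the same mass-per-mass unit as the HCNp supplied."""
    return hcnp * DHURRIN_PER_HCN


# Hypothetical standards (µg HCN) and absorbance readings, for illustration only.
slope, intercept = fit_standard_curve([0, 1, 2, 4], [0.0, 0.1, 0.2, 0.4])
hcn_ug = hcn_from_absorbance(0.12, slope, intercept, dilution=10)
hcnp = hcnp_per_gram(hcn_ug, tissue_mg=10)
print(hcnp, dhurrin_equivalent(hcnp))
```

The `dilution` argument corresponds to the choice described above between assaying a sample undiluted (1) or diluted 1 in 10 (10).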

LC-MS Analysis and Identification of Cyanogenic Glucoside(s)
The youngest fully emerged leaf from plants of individual accessions of Sorghum species (Chaetosorghum: S. macrospermum; Stiposorghum: S. brachypodum, S. interjectum; Parasorghum: S. purpureosericeum, S. stipoideum; Eusorghum: S. halepense, S. propinquum) (n = 3) was removed at the ligule at six weeks post-germination, snap frozen and shipped on dry ice to the University of Copenhagen for analysis. The presence and identification of cyanogenic glucosides were analyzed using tandem mass spectrometry similar to Montini et al. 2020 [90]. Briefly, chromatography was performed on an Advance UHPLC system (Bruker, Bremen, Germany). Separation was achieved on a Zorbax XDB-C18 column (3.0 × 100 mm, 1.8 µm, Agilent Technologies). Formic acid (0.05%, v/v) in water and acetonitrile (supplied with 0.05% formic acid, v/v) were employed as mobile phases A and B, respectively. Mobile phase flow rate was 500 µL min⁻¹, and the elution profile was as follows: 0-0.3 min, 2% B; 0.3-0.9 min, 2-15% B; 0.9-1.4 min, 15-60% B; 1.4-3.3 min, 60-100% B; 3.3-3.9 min, 100% B; 3.9-4.0 min, 100-2% B; and 4.0-5.0 min, 2% B. The column temperature was maintained at 40 °C. The liquid chromatograph was coupled to an EVOQ Elite Triple Quadrupole mass spectrometer (Bruker, Bremen, Germany) equipped with an electrospray ion source (ESI) operated in positive ion mode. The ion spray voltage was maintained at 5000 V. The cone temperature was set to 300 °C and cone gas to 20 psi. The heated probe temperature was set to 200 °C and probe gas flow set to 50 psi. The nebulizing gas was set to 60 psi and collision gas to 1.6 mTorr. Multiple reaction monitoring (MRM) was used to monitor analyte precursor ion → fragment ion transitions. MRM transitions and corresponding collision energies (CE) were determined from direct infusion experiments of a reference standard.
Dhurrin was detected as the sodium adduct [M+Na]+ with the following MRM transitions: m/z 334.1 → 145.0 (CE −15 eV), m/z 334.1 → 185.0 (CE −15 eV), m/z 334.1 → 307.0 (CE −10 eV). Both Q1 and Q3 quadrupoles were maintained at unit resolution. The Bruker MS Workstation software (Version 8.2.1) was used for data acquisition and processing. The relative quantities of dhurrin were calculated as the ratio of the base peak area (m/z 185) to the sample weight (50 mg). Chemically synthesized dhurrin was used as a standard [91].
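The elution profile and the relative quantification described above can be written out as a small calculation. The sketch below is illustrative Python, not instrument or vendor code: the breakpoints are transcribed from the text, the ramps between them are assumed to be linear, and the function names and the example peak area are hypothetical.

```python
# (time in min, % mobile phase B) breakpoints transcribed from the elution profile.
GRADIENT = [
    (0.0, 2.0), (0.3, 2.0), (0.9, 15.0), (1.4, 60.0),
    (3.3, 100.0), (3.9, 100.0), (4.0, 2.0), (5.0, 2.0),
]


def percent_b(t_min):
    """Percentage of mobile phase B at time t_min, assuming linear ramps between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-5 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the full run")


def relative_dhurrin(base_peak_area, sample_mg=50.0):
    """Relative dhurrin quantity: base peak area (m/z 185) divided by sample weight in mg."""
    return base_peak_area / sample_mg


print(percent_b(0.6))             # midway through the 2-15% B ramp
print(relative_dhurrin(1.0e6))    # hypothetical peak area, arbitrary units
```

Because the result is a peak area per mg of tissue, it supports comparisons between samples run under the same conditions rather than absolute concentrations.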

Variant Analysis within Selected Genes Involved in Cyanogenesis and Related Pathways
The available genome sequences of six wild sorghum species [35] were used to analyze variation in 18 genes selected for their known or putative roles in cyanogenesis (dhurrin biosynthesis, bioactivation, recycling) as well as in the synthesis of tyrosine (the amino acid substrate for the first step in dhurrin biosynthesis) and ethylene (Supplementary Table S2). The S. bicolor genomic sequences for these genes were downloaded to CLC from Phytozome (https://phytozome.jgi.doe.gov, accessed on 10 January 2022) and used as references to map the reads generated from the partial sequencing of the six wild Sorghum species. Single nucleotide variants (SNVs) for each gene were called using the basic variant detection tool in CLC Genomics Workbench 12.0 (CLC Bio, Aarhus, Denmark) with a minimum coverage of 10, a minimum read count of 2 and a minimum allele frequency of 10%. Multi-allelic nucleotide variations were excluded from this study, as these were likely to be false positives caused by sequencing errors or errors in variant detection. The predicted effects of all SNVs on the protein sequence were investigated for all species in CYP79A1, the gene encoding the enzyme that catalyzes the rate-limiting step of dhurrin biosynthesis [52], using the online Ensembl Variant Effect Predictor software [92]. The impact of these changes on the amino acid sequence and protein function of CYP79A1 was determined using the online software program SNAP2 within PredictProtein (https://www.predictprotein.org/, accessed on 10 January 2022) [56].
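The filtering thresholds above can be illustrated with a minimal Python sketch. It is not the CLC Genomics Workbench implementation: the `SiteCall` record and the `call_snv` function are hypothetical, and a site is treated as multi-allelic when more than one alternate allele passes the thresholds, which is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MIN_COVERAGE = 10      # minimum read depth at the site
MIN_READ_COUNT = 2     # minimum reads supporting the variant allele
MIN_FREQUENCY = 0.10   # minimum variant allele frequency (10%)


@dataclass
class SiteCall:
    """Hypothetical per-site pileup summary (not a CLC data structure)."""
    position: int
    coverage: int
    alt_counts: Dict[str, int]  # alternate allele -> supporting read count


def call_snv(site: SiteCall) -> Optional[str]:
    """Return the single alternate allele that passes all filters, else None."""
    if site.coverage < MIN_COVERAGE:
        return None
    passing = [
        allele
        for allele, count in site.alt_counts.items()
        if count >= MIN_READ_COUNT and count / site.coverage >= MIN_FREQUENCY
    ]
    # Assumption: sites with several passing alternate alleles are treated
    # as multi-allelic and excluded, mirroring the exclusion in the text.
    if len(passing) != 1:
        return None
    return passing[0]
```

Under these assumptions, a site with 20 reads of which 5 support a single alternate allele is retained, whereas a site with 8 reads, or one with two passing alternate alleles, is discarded.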

Detailed Sequence Analysis of CYP79A1 across the Sorghum Phylogeny
To further analyze the CYP79A1 sequence variation, polymerase chain reaction (PCR) was used to amplify the gene from additional wild sorghum species. DNA was extracted from leaf tissue using the CTAB protocol [93], and its concentration and integrity were determined by gel electrophoresis and with a NanoDrop spectrophotometer (ND-1000, Thermo Fisher Scientific). The S. bicolor sequence was used to design primers to amplify the full coding region of CYP79A1; the primers were synthesized by Sigma (https://www.sigmaaldrich.com/AU/en, accessed on 10 January 2022) (Supplementary Table S3). High-fidelity Pfu DNA polymerase (Promega) was used in the PCR (according to the manufacturer's instructions) with the following touchdown cycling conditions: 95 °C for 4 min; 9 cycles of 95 °C for 30 s, annealing for 30 s starting at 62 °C and decreasing by 0.5 °C per cycle, and 72 °C for 2.5 min; 29 cycles of 95 °C for 30 s, 58 °C for 30 s and 72 °C for 2.5 min; 72 °C for 10 min; hold at 4 °C. PCR products were purified using the Wizard gel purification kit (Promega) and sequenced by Micromon (Monash University). Additional internal primers were used for complete sequencing of the 2 kb region (Supplementary Table S3). Sequences were analyzed using SnapGene, Clustal Omega, NCBI BLAST and Phytozome.
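The annealing temperatures implied by the touchdown program can be listed with a short Python sketch. It assumes that the first touchdown cycle anneals at 62 °C and that each subsequent cycle is 0.5 °C lower, so that the ninth cycle reaches 58 °C; the function name `annealing_schedule` is hypothetical.

```python
from typing import List


def annealing_schedule(start_c: float = 62.0,
                       step_c: float = -0.5,
                       touchdown_cycles: int = 9,
                       final_c: float = 58.0,
                       final_cycles: int = 29) -> List[float]:
    """Annealing temperature (deg C) for every PCR cycle, in order.

    Assumption: the first touchdown cycle uses start_c and each later
    touchdown cycle changes by step_c.
    """
    touchdown = [start_c + step_c * i for i in range(touchdown_cycles)]
    return touchdown + [final_c] * final_cycles


schedule = annealing_schedule()
```

With the default values, `schedule` holds 38 entries: nine touchdown cycles stepping from 62 °C down to 58 °C, followed by 29 cycles at 58 °C.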

Statistical Analysis
All quantitative data were analyzed using GraphPad Prism version 7.02 for Windows (GraphPad Software, San Diego, CA, USA). Ordinary one-way analysis of variance (ANOVA) followed by Tukey's multiple comparisons test was used to compare HCNp between and within species across different time points. A 95% confidence level was set for all statistical tests.
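For readers who wish to reproduce the first step of this analysis outside GraphPad Prism, the F statistic of an ordinary one-way ANOVA can be computed as in the following Python sketch. This is an illustration under stated assumptions, not the Prism implementation: the function name `one_way_anova_f` is hypothetical, the post hoc Tukey test is not included, and no data from this study are used.

```python
from statistics import fmean
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for an ordinary one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicated observations")
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

The returned F value would then be compared with the critical value of the F distribution at the chosen significance level (here α = 0.05) for the two degrees of freedom.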

Conclusions
Cyanogenesis is a highly variable functional trait, controlled by a complex interaction of internal and external factors [3]. The current study took the fundamental initial step of characterizing cyanogenesis in the undomesticated species of the genus Sorghum. While the structural genetic variation of the cyanogenic machinery was limited, all tested species of Chaetosorghum, Heterosorghum, Parasorghum and Stiposorghum showed negligible potential for HCN release in leaf tissues. In simple terms, this low phenotypic expression of cyanogenesis might reflect the conditions in their natural environments, such as limited access to nutrient resources and/or differences in herbivore pressure [79]. The reality is likely to be much more complex: in S. bicolor, the regulation of dhurrin and HCNp also depends on plant ontogeny and tissue type [65,67,94]. To better understand the potential utilization of cyanogenic glucosides in general plant metabolism and the associated growth-defense trade-offs, a detailed analysis of tissue- and age-dependent variation of HCNp in wild species in comparison to S. bicolor is required.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/genes13010140/s1. Figure S1: CLUSTAL O (1.2.4) multiple sequence alignment of the amino acid sequences of CYP79A1 obtained by PCR from wild sorghum species; Table S1: Accession and provenance details of the sorghum crop wild relatives; Table S2: Genes selected for variant analysis; Table S3: Sequence of the primers used to amplify CYP79A1 from the wild sorghum species.

Conflicts of Interest:
The authors declare no conflict of interest.