Article

Mining Indole Alkaloid Synthesis Gene Clusters from Genomes of 53 Claviceps Strains Revealed Redundant Gene Copies and an Approximate Evolutionary Hourglass Model

1
Ottawa Research & Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada
2
Department of Agricultural Biology, Colorado State University, Fort Collins, CO 80523, USA
3
USDA, Agricultural Research Service, National Center for Agricultural Utilization Research, Mycotoxin Prevention and Applied Microbiology Research Unit, 1815 N. University St., Peoria, IL 61604, USA
4
Institute of Microbiology of the Czech Academy of Sciences CAS, 14220 Prague, Czech Republic
5
Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB R6M 1Y5, Canada
*
Author to whom correspondence should be addressed.
Toxins 2021, 13(11), 799; https://doi.org/10.3390/toxins13110799
Submission received: 21 October 2021 / Revised: 9 November 2021 / Accepted: 10 November 2021 / Published: 13 November 2021
(This article belongs to the Special Issue Global Impact of Ergot Alkaloids)

Abstract

Ergot fungi (Claviceps spp.) are infamous for producing sclerotia containing a wide spectrum of ergot alkaloids (EA) toxic to humans and animals, making them nefarious villains in the agricultural and food industries, but also treasures for pharmaceuticals. In addition to three classes of EAs, several species also produce paspaline-derived indole diterpenes (IDT) that cause ataxia and staggers in livestock. Furthermore, two other types of alkaloids, i.e., loline (LOL) and peramine (PER), found in Epichloë spp., close relatives of Claviceps, have shown beneficial effects on host plants without evidence of toxicity to mammals. The gene clusters associated with the production of these alkaloids are known. We examined genomes of 53 strains of 19 Claviceps spp. to screen for these genes, aiming to understand the evolutionary patterns of these genes across the genus through phylogenetic and DNA polymorphism analyses. Our results showed (1) varied numbers of eas genes in C. sect. Claviceps and sect. Pusillae, none in sect. Citrinae, six idt/ltm genes in sect. Claviceps (except four in C. cyperi), zero to one partial (idtG) in sect. Pusillae, and four in sect. Citrinae, (2) two to three copies of dmaW, easE, easF, idt/ltmB, and idt/ltmQ in sect. Claviceps, (3) frequent gene gains and losses, and (4) an evolutionary hourglass pattern in the intra-specific eas gene diversity and divergence in C. purpurea.
Key Contribution: Indole alkaloid gene clusters from a wide range of Claviceps spp. were identified through genome screening. Six indole diterpene/lolitrem genes, idt/ltmP, Q, B, C, S, and M, were commonly present in various species in C. sect. Claviceps. Micro-evolution of eas genes within Claviceps purpurea revealed that their evolutionary rates fit an hourglass model.

1. Introduction

Fungi in the genus Claviceps (Clavicipitaceae, Hypocreales, Sordariomycetes) infect the florets of cereal crops, nonagricultural grasses (Poaceae), sedges (Cyperaceae), and rushes (Juncaceae) [1], followed by occupying the unfertilized ovaries and eventually replacing the seeds with fungal resting bodies, i.e., sclerotia, known as ergots [2]. In light of molecular phylogenetics, 63 named species [3,4] are classified into four sections, i.e., Claviceps sect. Claviceps, C. sect. Citrinae, C. sect. Paspalorum, and C. sect. Pusillae, on the basis of morphological, ecological, and alkaloid-producing features [3]. Ergot bodies or sclerotia contain a wide spectrum of alkaloids toxic to humans and animals, making them unwelcome pathogens in agricultural and food production [5,6], but also important resources for pharmaceuticals [7,8]. Among the alkaloids produced by Claviceps, ergot alkaloids (EAs) are the major culprit for mass food/feed poisoning in humans and livestock, as well as a number of tragedies in human history [9,10]. EAs are indole compounds characterized by a tricyclic or tetracyclic ring system. Over 80 different EAs found in nature fall into three structural groups: clavines, lysergic acid amides, and ergopeptines [8,11], corresponding to their structural complexity. Clavines are the intermediates or derivatives of the intermediates in the lysergic acid amide pathway, whereas ergopeptines are the most complex group [11]. Intensive investigations on biochemistry and molecular genetics have elucidated the EA biosynthetic pathways in EA producers, especially Claviceps spp. [12,13]. A cluster of 12 functioning EA synthesis (eas) genes (cloA, dmaW, easA, easC–H, and lpsA–C) in C. purpurea strain 20.1 was considered to encode all the enzymes needed for the end products ergotamine and ergocryptine [14].
The four early steps, requiring dmaW, easF, easC, and easE, are responsible for the closure of the third ring resulting in chanoclavine, followed by middle steps, requiring easD, easA, easG, and cloA, for forming tetracyclic clavines, and later steps for producing the lysergic acid amides, dihydroergot alkaloids, and complex ergopeptines [13] (Figure 1). Among the 12 genes, the homologs of nine were found in C. fusiformis in a cluster. In C. paspali, two additional genes (easP and easO) were found; however, easE was defective. The presence or absence of eas genes has proven to be correlated with EA profiles in several Claviceps spp. and strains [13,14]. However, the investigation of eas gene clusters in a wide range of Claviceps spp. is lacking, and little has been reported about the evolution of the individual genes in these clusters among and within species.
Indole diterpenes (IDTs) are another large group of bioactive compounds with diverse structural variations, triggering toxicity in animals and insects by interfering with ion channels [16,17]. In the literature, there are copious reports that certain species in Claviceps (i.e., C. paspali and C. cynodontis) and close relatives in Epichloë (Neotyphodium as the asexual name before implementation of the International Code of Nomenclature for algae, fungi, and plants (Shenzhen Code) [18]) produce the paspaline-derived IDTs, such as paspalitrem, lolitrem, and paxilline, causing ataxia and staggers in livestock that feed on the grasses infected by those fungal species [19,20,21]. Biosynthetic pathways and associated gene clusters of these paspaline-derived IDTs have been investigated [22,23,24], resulting in the discovery of at least 10 genes involved in IDT production in Epichloë spp. and the prediction that ltmG, M, C, and B were responsible for the synthesis of paspaline, the basic structural backbone of IDTs, whereas ltmP and Q were essential for the production of lolitrem, and ltmF, J, K, and E were required for more complex structures [25,26]. The proposed scheme for the biosynthesis of paspalitrem in C. paspali involved seven genes including the initial formation of paspaline through ltmG, M, C, and B, followed by the sequential functioning steps of ltmP, Q, and F [22]. Recently, the pre-paspaline steps were further resolved as three sequential steps: starting from ltmG converting farnesyl diphosphate (FPP) to geranylgeranyl diphosphate (GGPP), followed by ltmC transferring GGPP to 3-geranylgeranylindole, and finally through ltmM and B yielding paspaline [27]. In addition to C. paspali and C. cynodontis, other Claviceps spp., i.e., C. arundinis, C. humidiphila, and C. purpurea, could also produce indole diterpenes or paspaline-like compounds [28,29,30]. The genome investigation of C.
purpurea 20.1 revealed the presence of ltmM, C, B, P, Q, and an extra gene ltmS [14]. It is not known whether these genes are consistently present in various strains of C. purpurea and other Claviceps species. In addition, two other classes of alkaloids, i.e., lolines (LOL) and peramines (PER), produced by Epichloë spp., are known to function as insecticides, but are not associated with any toxicity symptoms in grazing mammals [31,32]. Given the close relationship between Epichloë and Claviceps, it is reasonable to raise the question of whether any of the loline or peramine gene homologs are present in any of the Claviceps spp. even though those two classes of alkaloids have not been reported in Claviceps.
The ‘hourglass model’, borrowed from ontogeny, refers to the pattern in which the mid-development stages of an embryo are morphologically more conserved than earlier and later stages, resembling an hourglass with a narrow waist but broad ends [33,34]. Before the hourglass model (HGM) was proposed in the 1990s, the early conservation model (ECM) was widely accepted, which echoed von Baer’s third law [35], i.e., embryos progressively diverge in morphology during ontogeny. The debates about these two, along with other models, i.e., the adaptive penetrance model [36] and the unconstrained (random) model [37], are still ongoing, although recent evidence at molecular and genomic levels has provided support for the presence of the phylotypic stage (the waist stage of development) in fungi, insects, plants, and vertebrates [38,39,40]. According to Haeckel’s biogenetic hypothesis, ontogeny recapitulates phylogeny [41]. The evident similarities between the development of an individual and the evolution of the whole biological system have been addressed by many generations [42] to verify that these models in ontogeny are recapitulated in other evolutionary processes. For example, studies on gene evolution in Drosophila spp. recaptured the hourglass model in that the early maternal genes showed a higher level of diversity than zygotic genes [43]. Here, we propose the biosynthesis of complex biological compounds as an analogy of the development of an organism, and ask whether any of the models fit the evolution of the genes involved in the biosynthesis.
The objectives of the present study were to shed light on the presence of four classes of alkaloid genes (clusters) in 53 strains of 19 Claviceps species, and to understand the evolutionary patterns of these genes at inter- and intraspecific levels. This information helps build the foundation for future studies on chemo- and genotype associations and for developing gene-based chemotyping and toxin detection.

2. Results

2.1. Genome Assemblies

The 37 genome sequences assembled in this study comprised 1362 to 2581 contigs, with N50 values ranging from 19,946 to 55,909 bp; completeness measured by Benchmarking Universal Single-Copy Orthologs (BUSCO) against the fungal database (fungi_odb10) ranged from 97% to 99.1% (Table 1; assemblies are deposited in GenBank, https://www.ncbi.nlm.nih.gov/ (accessed on 9 November 2021), as accessions JAIURI000000000–JAIUSS000000000, to be released upon publication of the article). The quality of the assemblies was equivalent to that of the 17 genome assemblies from previous studies (Table 1) [44,45]. Overall, the 54 assemblies of 53 strains (two versions of the assembly for CCC1102 were included because certain genes were obtained from one or the other assembly) of 19 Claviceps species included in this study belong to three sections: C. sect. Citrinae, C. sect. Claviceps, and C. sect. Pusillae, from six continents (Africa, Asia, Australia, Europe, North America, and South America) and on host plants in 26 genera (Table S1).
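For readers less familiar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. The function name and the toy contig lengths are hypothetical and are not taken from the study's assemblies.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 for a similar total length indicates a less fragmented assembly, which is why it is reported alongside the contig count and BUSCO completeness.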

2.2. Presence of Four Classes of Alkaloid Genes in 53 Genomes

One thousand sequences of 19 loci were extracted from the 53 genome assemblies as detailed below. The DNA sequences of each gene were submitted to GenBank under the following accession numbers: cloA (49 sequences) MZ882098–MZ882146, dmaW (118 sequences) MZ871640–MZ871757, easA (51 sequences) MZ851397–MZ851447, easC (50 sequences) MZ851807–MZ851856, easD (51 sequences) MZ871767–MZ871817, easE (66 sequences) MZ877968–MZ878033, easF (88 sequences) MZ881959–MZ882046, easG (50 sequences) MZ882047–MZ882096, easH1 (50 sequences) MZ934760–MZ934809, easH2 (32 sequences) MZ934810–MZ934841, lpsB (48 sequences) MZ934842–MZ934889, lpsC (44 sequences) MZ934890–MZ934933, idt/ltmB (55 sequences) MZ935033–MZ935087, idt/ltmC (47 sequences) MZ935088–MZ935134, idt/ltmG (three sequences) MZ935227–MZ935229, idt/ltmM (47 sequences) MZ935135–MZ935181, idt/ltmP (46 sequences) MZ934987–MZ935032, idt/ltmQ (60 sequences) MZ934934–MZ934986, and idt/ltmS (45 sequences) MZ935182–MZ935226.
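The tally above can be cross-checked with a few lines of arithmetic. The sketch below sums the stated per-locus counts and compares each count with the span of its listed accession range. It assumes that each range is a contiguous block of accessions, which is an assumption made here for illustration; where a stated count and a range span differ, the check cannot say which of the two is correct.

```python
import re

# (locus, stated number of sequences, first accession, last accession),
# transcribed from the list above.
LOCI = [
    ("cloA", 49, "MZ882098", "MZ882146"),
    ("dmaW", 118, "MZ871640", "MZ871757"),
    ("easA", 51, "MZ851397", "MZ851447"),
    ("easC", 50, "MZ851807", "MZ851856"),
    ("easD", 51, "MZ871767", "MZ871817"),
    ("easE", 66, "MZ877968", "MZ878033"),
    ("easF", 88, "MZ881959", "MZ882046"),
    ("easG", 50, "MZ882047", "MZ882096"),
    ("easH1", 50, "MZ934760", "MZ934809"),
    ("easH2", 32, "MZ934810", "MZ934841"),
    ("lpsB", 48, "MZ934842", "MZ934889"),
    ("lpsC", 44, "MZ934890", "MZ934933"),
    ("idt/ltmB", 55, "MZ935033", "MZ935087"),
    ("idt/ltmC", 47, "MZ935088", "MZ935134"),
    ("idt/ltmG", 3, "MZ935227", "MZ935229"),
    ("idt/ltmM", 47, "MZ935135", "MZ935181"),
    ("idt/ltmP", 46, "MZ934987", "MZ935032"),
    ("idt/ltmQ", 60, "MZ934934", "MZ934986"),
    ("idt/ltmS", 45, "MZ935182", "MZ935226"),
]


def range_size(first, last):
    """Number of accessions in an inclusive range, assuming it is contiguous."""
    prefix_a, num_a = re.fullmatch(r"([A-Z]+)(\d+)", first).groups()
    prefix_b, num_b = re.fullmatch(r"([A-Z]+)(\d+)", last).groups()
    if prefix_a != prefix_b:
        raise ValueError("accession prefixes differ")
    return int(num_b) - int(num_a) + 1


total_stated = sum(count for _, count, _, _ in LOCI)
mismatches = [
    (locus, count, range_size(first, last))
    for locus, count, first, last in LOCI
    if count != range_size(first, last)
]
print(total_stated, mismatches)
```

Any locus reported in `mismatches` is worth checking against the GenBank records rather than being resolved from the text alone.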

2.2.1. Ergot Alkaloid Genes (eas)

More consistency in the presence/absence of eas genes was observed in C. sect. Claviceps than in C. sect. Pusillae. The results from BLASTn searches using an in-house script (see Section 4.2 for details) and Geneious mapping (https://www.geneious.com, accessed on 9 November 2021) with reference genes showed that the genomes of all isolates from C. sect. Claviceps contained at least 10 eas genes matching the C. purpurea 20.1 reference sequences (Table 2). The lpsA1 and lpsA2 genes were excluded from the analyses because they were heavily fragmented owing to their considerable length; a study by Hicks et al. in this Special Issue, based on long-read sequencing of several selected strains, focused on these two genes. The 10–12 genes were assembled on two to three contigs. For most strains, nine genes (lpsC, easA, lpsB, cloA, and easC–G) were on the same contig. Genes after dmaW, i.e., easH1, easH2, and fragments of lpsA, were on different contigs. The easH2 gene was either not detected or on a separate contig, possibly due to the long length of lpsA1, because it was located between lpsA1 and A2 in the reference genome C. purpurea 20.1. The exceptions were C. humidiphila LM576, C. spartinae CCC535, C. purpurea LM461, and C. ripicola LM220 and LM454, in which lpsC was on a different contig, or lpsC along with the next three to four genes were on the same contig separated from other genes (Table 2).
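The in-house script is described in Section 4.2 and is not reproduced here. As a rough illustration of how presence/absence calls of this kind can be derived, the sketch below parses BLAST+ tabular output (`-outfmt 6`, default columns) and labels each reference gene as full, partial, or absent in one assembly. The coverage and E-value thresholds, the reference lengths, and the example hits are all hypothetical and are not the settings used in the study.

```python
import csv
import io

# Column order of BLAST+ tabular output (-outfmt 6) with default fields.
BLAST6_FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def classify_hits(blast6_text, ref_lengths, full_cov=0.9, max_evalue=1e-50):
    """Label each reference gene 'full', 'partial', or 'absent'.

    full_cov and max_evalue are illustrative thresholds only.
    """
    best_cov = {}
    for row in csv.reader(io.StringIO(blast6_text), delimiter="\t"):
        if len(row) != len(BLAST6_FIELDS):
            continue  # skip blank or malformed lines
        hit = dict(zip(BLAST6_FIELDS, row))
        gene = hit["qseqid"]
        if gene not in ref_lengths or float(hit["evalue"]) > max_evalue:
            continue
        # Fraction of the reference gene covered by this single alignment.
        span = abs(int(hit["qend"]) - int(hit["qstart"])) + 1
        best_cov[gene] = max(best_cov.get(gene, 0.0), span / ref_lengths[gene])
    labels = {}
    for gene in ref_lengths:
        if gene not in best_cov:
            labels[gene] = "absent"
        elif best_cov[gene] >= full_cov:
            labels[gene] = "full"
        else:
            labels[gene] = "partial"
    return labels


# Hypothetical reference lengths and hits, for illustration only.
ref_lengths = {"dmaW": 1000, "easE": 1500, "lpsC": 5000}
rows = [
    ["dmaW", "contig12", "96.5", "980", "30", "2", "1", "980",
     "5000", "5979", "0.0", "1500"],
    ["easE", "contig40", "88.0", "300", "30", "3", "1", "300",
     "120", "419", "1e-80", "300"],
]
blast6_text = "\n".join("\t".join(r) for r in rows)
labels = classify_hits(blast6_text, ref_lengths)
print(labels)
```

This sketch scores only the single best alignment per gene, so a gene split across contigs or interrupted by large indels would be labeled partial; combining alignments per gene would be needed to handle those cases.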
Both inter- and intraspecific variation was observed despite the general consistency in the presence of eas genes. As a species-specific feature, all three strains of C. occidentalis had two partial copies of dmaW (~658 bp and ~641 bp, composed of a partial exon 1 and full-length exons 2 and 3) and a single copy of all other eas genes except easH. Of note, all partial genes detected in the present study were located at the end of contigs. Moreover, all three strains of C. quebecensis had a second partial nonfunctioning copy of easE (275, 275, and 1208 bp) and two partial copies of easF with good open reading frames (ORFs), and they lacked easH2 (Table 2).
Intraspecific variation among the 27 strains of C. purpurea was evident as most strains contained one copy of lpsC, easA, lpsB, cloA, easC, D, G, H1, and H2, and two copies of dmaW. However, three strains (LM65, LM72, and LM582) lacked easH2. Eleven strains had a second copy of easE (easE2), six full- or near-full-length and five partial, but these gene fragments contained indels of various sizes and internal stop codons (Table 2). This would indicate that they may not be functional genes unless those variations were caused by sequencing or assembly errors. In contrast, the second copy of full-length easF (easF2) from LM72 (MZ881984) and LM461 (MZ881981) had good ORFs. The easF2 gene of the other six strains was split on two contigs with gaps in the middle. Most of these fragments, except the second exon at the 3′ end of Clav26, Clav55, and LM470, were free of internal stop codons. Four strains had a full length (or close to full length), and one strain (LM469 652 bp) had a partial third copy, easF3, yet these gene fragments had a number of indels and internal stop codons (Table 2). The intraspecific variations were also found in C. arundinis and C. ripicola (Table 2).
The six genomes from C. sect. Pusillae had more variable numbers of the eas genes observed, but all six genomes lacked lpsC and easH2 (Table 2). The strain C. lovelessii CCC647 had the highest number of matches, i.e., 10 full- or near-full-length matches (cloA 1788 bp, easD had an 8 bp short gap at split region), while all but easH1 and lpsB had good ORFs. In contrast, C. digitariae CCC659 had only two gene matches: dmaW and easA, but both were full-length with good ORFs. C. maximensis CCC398 and C. citrina CCC265 (C. sect. Citrinae) had no matches for any eas genes (Table 2).
Considering each eas gene in turn, easA was the most consistently present, occurring in 51 of 53 genomes as a single copy with a good ORF, except for the copy in C. pusilla CCC602, which had an internal stop codon. Similarly, lpsB, cloA, easC, easD, and easG were present as a single copy in all species of sect. Claviceps and two to four species in sect. Pusillae (Table 2).
For easE, all species in sect. Claviceps contained at least one copy, six strains of C. purpurea (LM39, LM63, LM72, LM461, LM469, and LM474) had a full-length second copy (easE2), and another five strains of C. purpurea (Clav04, Clav46, Clav52, Clav55, and LM470), all three C. quebecensis, one C. spartinae, and one C. monticola had a second partial copy. Compared with the C. purpurea 20.1 easE1 reference sequence, all the easE2 sequences contained a large number of deletions (gaps) of various sizes in exon and intron regions, internal stop codons, and no start codon, indicating that they are likely not functional. For species in sect. Pusillae, one copy of easE with a good ORF was found in four species (C. africana CCC489, C. lovelessii CCC647, C. pusilla CCC602, and C. sorghi CCC632).
For easF, all species in sect. Claviceps contained at least one copy; however, two strains of C. purpurea (Clav55 and LM470) had internal stop codons near the 3′ end. Twenty-three strains of seven species (C. arundinis, C. humidiphila, C. monticola, C. pazoutovae, C. purpurea, C. quebecensis, and C. spartinae) had a second full-length or partial copy, among which 19 strains had good ORFs. In addition, a third copy was found in some C. purpurea strains in full length (LM39, LM63, and LM65) or partial (LM461 and LM469). Despite 77–93% similarity to C. purpurea 20.1 easF1, none of the third copies had a correct open reading frame, suggesting that they are not functional (Table 2). Three species in sect. Pusillae (C. africana, C. lovelessii, and C. sorghi) had one functioning copy.
For dmaW, most species (strains) in sect. Claviceps contained two full-length copies or copies split on two contigs with gaps. Six strains of C. purpurea (Clav26, Clav52, LM223, LM232, LM4, and LM470) had a partial second copy, but all three strains of C. occidentalis had partial sequences (~650 bp) for both copies. One strain of C. arundinis (CCC1102) had a third copy in full length, with 81% and 83% similarities with dmaW1 and dmaW2, but frameshifts and internal stop codons were present. Five species in sect. Pusillae, except C. maximensis, had one copy.
Interestingly, the additional copies of easE, easF, and dmaW were more or less clustered together, such that the second copies of all three genes were present on the same contig in C. monticola CCC1483 and C. spartinae CCC535 (Figure 2A). Alternatively, the easF2 sequence was split on two contigs, which were located with easE2 on one contig and dmaW2 on the other, i.e., C. purpurea Clav55, and C. quebecensis Clav32, Clav50, and LM458 (Figure 2B). More commonly, easE2 was on the same contig as easF2, whereas dmaW was on another contig, such as in seven strains of C. purpurea (Clav46, Clav52, LM470, LM474, and LM72; Table 2), or easF2 co-located with dmaW when easE was a single copy (LM583; Figure 2C). In cases when the third copy of easF was present, they were often on the same contig with dmaW2, i.e., C. purpurea LM39, LM63, LM65, and LM469 (Figure 2D). The arrangement in LM461 was more peculiar in that the second copies of easE and easF were on the same contig with dmaW1 and easG (a single-copy gene), which indicates that they may all be on the primary ergot alkaloid gene cluster (Figure 2E). The third dmaW from CCC1102 (from SW assembly) was not connected to other eas genes (Table 2).
For easH, easH1 was present in 50 genomes, the exceptions being C. citrina, C. digitariae, and C. maximensis; however, the genes of the four species (CCC489, CCC602, CCC632, and CCC647) in C. sect. Pusillae had numerous indels of various sizes throughout the sequence, causing frameshifts and internal stop codons. Further validation of the sequences is needed to confirm whether these are functioning. The easH2 gene was present in 32 strains of six species (C. arundinis, C. humidiphila, C. occidentalis, C. perihumidiphila, C. purpurea, and C. ripicola). The reference sequence of easH2 from C. purpurea 20.1 was 840 bp, which is about 100 bp shorter than easH1 (945 bp), and it was considered a pseudogene. Our results showed that the 32 easH2 sequences had variable lengths and high levels of nucleotide variation (see more notes in later sections: phylogenies and gene diversity). Most of these sequences appeared not to be functional; however, the sequences from two strains of C. ripicola (LM218 and LM220) were 954 bp long and contained full-length ORFs, indicating that they are likely functioning genes.
For lpsC, at least one strain per species in sect. Claviceps (except C. perihumidiphila) showed one copy of lpsC, i.e., in total, 43 out of 46 strains contained a single copy of lpsC, among which three strains of C. purpurea, i.e., Clav26, LM4, and LM232, had a single internal stop codon; otherwise, the full range of sequences aligned very well with the reference. It is possible that the single internal stop codon could be a sequencing error. Another five strains/species, including C. capensis CCC1504, C. cyperi CCC1219, C. humidiphila LM576, C. monticola CCC1483, C. purpurea LM223, and C. spartinae CCC535 had partial sequences 1000–5000 bp long. These sequence fragments contained several indels and internal stop codons, and they are apparently not functional genes. Only one strain of C. perihumidiphila lacked lpsC.
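Several of the calls above rest on whether a copy has a "good ORF", i.e., no frameshifts or internal stop codons. A minimal sketch of such a check is given below. It assumes the input is a spliced coding sequence (introns already removed) read in frame from its first base and uses the three stop codons of the standard genetic code; the function name and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def orf_problems(cds):
    """Return a list of reasons why a coding sequence is not a clean ORF.

    An empty list means: starts with ATG, length is a multiple of three,
    ends with a stop codon, and has no stop codon before the last codon.
    """
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length not a multiple of 3 (possible frameshift)")
    if not cds.startswith("ATG"):
        problems.append("no start codon")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    internal = [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
    if internal:
        problems.append(f"internal stop codon(s) at codon index {internal}")
    if codons and codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    return problems


# Hypothetical toy sequences, for illustration only.
print(orf_problems("ATGAAATGA"))
print(orf_problems("ATGTAAAAATGA"))
```

A single internal stop codon flagged this way cannot by itself distinguish a true pseudogene from a sequencing or assembly error, which is why the text treats such cases cautiously.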

2.2.2. Indole-Diterpene/Lolitrem (idt/ltm) Genes

Compared with eas genes, the presence/absence and copy numbers of idt/ltm genes were less variable. Through mapping genome assemblies to the reference genes, all members in sect. Claviceps had one copy of ltmC, M, P, and S and one or two copies of idt/ltmB and Q, except C. cyperi CCC1219 that lacked ltmQ and S. All members in C. sect. Pusillae had no matches to any ltm genes, whereas members of sect. Citrinae (C. citrina CCC265) had full-length matches with ltmB, C, G, and M.
Notable species-specific features were that all three strains of C. occidentalis (LM77, LM78, and LM84) had two partial copies of ltmQ (1517–1518 bp), and that C. arundinis (CCC1102, LM583), C. perihumidiphila (LM81), C. ripicola (LM218, LM219, LM220, and LM454), and C. spartinae (CCC535) had two functional copies of ltmB (Table 3). The translated sequences of ltmS from three strains of C. occidentalis (LM77, LM78, and LM84) and three strains of C. quebecensis (Clav32, Clav50, and LM458) were 14 amino-acid residues longer than those of other species, and those 14 amino acids were identical among the six strains.
Intraspecific variations were observed in C. purpurea; four out of 27 strains showed a second copy of ltmQ (Table 3). In the strain Clav04, the fragment on the primary cluster (contig 130), ltmQ1, was a partial copy, whereas another copy on contig 637 was a full-length copy (ltmQ2) with a good ORF. Clav46 had two partial copies; unexpectedly, the copy on contig 43 (where all other ltm genes co-located) had a number of short deletions causing frameshifts and internal stop codons, whereas the copy on contig 229 had good ORFs, except that the first 243 bp (including 53 residues in exon 1 and partial exon 2) were missing. On the other hand, some of the single-copy ltmQ sequences, such as in C. arundinis CCC1102, C. pazoutovae CCC1485, C. perihumidiphila LM81, C. purpurea LM72, C. quebecensis Clav32 and LM458, C. ripicola LM218 and LM219, and C. spartinae CCC535, had varied numbers of indels causing frameshifts and internal stop codons; however, phylogenetically, they still belonged to copy 1 (more details in Section 2.3).
All six genes were clustered on the same contig in 29 strains of the 12 species in sect. Claviceps; otherwise, at least three genes were on the same contig. The clustered six ltm genes were arranged in the same order as in C. purpurea 20.1 [14] (Table 3; gene coordinates are not shown). In C. citrina, ltmB and C were on the same contig (1947), whereas ltmM and G were on separate contigs. It could not be determined whether they were in one cluster. In general, the intergenic sequences ranged from 500 to 1200 bp; however, several strains had very long spaces between ltmP and B, such as 4 kb in C. ripicola LM220 and over 2 kb in LM218 and C. arundinis CCC1102 and LM583 (results not shown).
Through the additional BLAST searches with lower stringency (E-value < E−50), fragments of 483 and 501 bp of ltmG from C. maximensis CCC398 and C. digitariae CCC659, respectively, were pulled out by using ltmG from C. paspali RRC-1481. They were 76% and 78% similar, respectively, to the reference sequence in the coverage (comparable to the 74% similarity between C. citrina CCC265 and C. paspali RRC-1481). Running BLAST searches of these two fragments to the NCBI database indicated that 60 bp of the 483 bp from C. maximensis matched with Beauveria bassiana ARSEF 2860 geranylgeranyl pyrophosphate synthetase; 279 of 501 bp from C. digitariae matched with idtG (geranylgeranyl diphosphate synthase) from Periglandula ipomoeae strain IasaF13.

2.2.3. Loline Alkaloid (lol) and Peramine (per) Genes

All the searches with lol and per reference genes resulted in no hits, except for the low-stringency BLAST with lolC, which yielded short fragments (~150–180 bp) matching the start of the fifth exon in seven species (strains): C. africana (CCC485), C. citrina (CCC265), C. digitariae (CCC659), C. lovelessii (CCC647), C. maximensis (CCC398), C. pusilla (CCC602), and C. sorghi (CCC632). These fragments matched with 80% to 92% identity to O-acetylhomoserine from Purpureocillium lilacinum (XM_018324292), Drechmeria coniospora (XM_040800194), and Verticillium dahliae (XM_009654023) in the NCBI database (https://blast.ncbi.nlm.nih.gov/Blast.cgi, accessed in August 2021). These sequences were not submitted to GenBank because of their short length.

2.3. Phylogenies of eas and idt/ltm Genes

The individual phylogenetic trees of 11 eas genes all agreed on the long-branched separation between C. sect. Pusillae and sect. Claviceps, which was congruent with the pattern inferred by the previous multigene analyses combined with morphological, ecological, and metabolic features [3] and supported by the phylogenomic analyses [44] (Figure 3a). In C. sect. Pusillae, all genes agreed on the close proximity of C. fusiformis, C. lovelessii, and C. pusilla, as well as of C. africana and C. sorghi. The main incongruence among the gene trees appeared in the uncertain placements of C. digitariae and C. paspali, as well as the variant relationships among C. fusiformis, C. lovelessii, and C. pusilla, which could be a result of insufficient sampling (see further explanation in Section 3; Figure 3b–d and Figure S1).
In terms of the species relationships in the sect. Claviceps, considering single-copy genes, a majority of gene trees agreed on the grouping of the four major clades inferred by the previous phylogenomic study [44]. For communication convenience, we named them as four Batches to avoid confusion with species level and general use of clades: Batch humidiphila including C. arundinis, C. humidiphila, and C. perihumidiphila, Batch purpurea including C. capensis, C. monticola, C. pazoutovae, and C. purpurea (previously designated as Clade purpurea by Píchová et al. [3]), Batch occidentalis including C. occidentalis and C. quebecensis, and Batch spartinae including C. ripicola and C. spartinae (Figure 3a and Figure S1). The exceptions were C. perihumidiphila and C. cyperi that had uncertain placement on different gene trees (Figure S1b,d,f,g). The more notable disparities among the gene trees appeared in the order of divergence of the four Batches from C. sect. Pusillae or sect. Paspalorum (Figure 4 and Figure S1 and Figure S2). Previous phylogenomic analyses resulted in the topology of a twice bifurcate pattern, ((Batch humidiphila)(Batch spartinae); (Batch occidentalis)(Batch purpurea)) [44], and this pattern was only supported by easG (Figure 4a). A slight variation of the easA tree appeared in that Batch humidiphila was an earlier diverged lineage than Batch spartinae, and these two formed a paraphyletic group instead of a monophyletic group (Figure 4b). All other genes supported the derived position of Batch humidiphila and Batch spartinae (Figure 4c–e). Furthermore, eight genes (cloA, dmaW1, easC, easE1, easH1, lpsC, and ltmB1) placed Batch purpurea at a more ancestral position than Batch occidentalis, whereas six genes (easF1, lpsB, ltmM, ltmP, ltmS, and ltmQ1) reversed the divergence order of these two Batches (Figure 4c,d). The other three genes (easD, lpsC, and ltmC) showed an unresolved order of divergence (Figure 4e).
As for genes with multiple copies, the most complex was dmaW. The dmaW2 sequences were separated into two groups. Group I included 16 strains of eight species (all non-C. purpurea dmaW2 except C. monticola CCC1483 and C. pazoutovae CCC1485), forming a parallel lineage with their dmaW1 counterpart and representing one gene duplication at node ① (Figure 5a and Figure S2a). Group II included C. purpurea, C. monticola CCC1483, and C. pazoutovae CCC1485, as well as one strain of each dmaW1 (LM60) and dmaW3 (CCC1102). This group diverged from C. purpurea dmaW1, representing the second duplication at node ②. Within group II, the otherwise consistent close relationship between C. monticola and C. pazoutovae was broken by seven strains of C. purpurea. This can be explained by a third duplication at node ③. The presence of dmaW3 of C. arundinis CCC1102 and dmaW1 C. purpurea LM60 in group II indicated extra duplication events at nodes ④ and ⑤ (Figure 5a).
The second and third copies of easF (easF2, easF3) grouped in one clade that diverged from C. cyperi easF1. Within this clade, C. purpurea easF2 (14 strains) appeared as a paraphyletic group, from which diverged a clade composed of C. purpurea easF3 (five strains) and a subclade of easF2 from C. quebecensis, C. humidiphila, C. arundinis, C. spartinae, C. pazoutovae, and C. monticola. From this tree topology, at least two gene duplication events were inferred (Figure 5b and Figure S2b).
The second copy of easE (easE2) from 16 samples grouped into one clade, which diverged from easE1 of C. occidentalis. However, within the easE2 clade, C. purpurea samples were separated into two subclades. The sample Clav04 appeared as an orphan clade located close to C. quebecensis easE2, and another 10 samples grouped together and had affinity with C. monticola easE2, indicating that the historical gene duplications possibly occurred twice at nodes ① and ② (Figure 5c and Figure S2c).
The second copies of easH (easH2) fell into three groups that diverged three times independently. Group I includes two strains of C. ripicola (LM218 and LM220) that diverged from easH1 of the clade composed of C. capensis, C. monticola, and C. pazoutovae. As noted earlier, the easH2 sequences from these two strains are similar in length to easH1 and contain good ORFs, indicating that they likely arose from a very recent gene duplication. Group II, including three strains of C. occidentalis, one strain each of C. arundinis, C. humidiphila, and C. perihumidiphila, and 15 strains of C. purpurea, diverged from the easH1 clade composed of eight species in sect. Claviceps (C. occidentalis, C. cyperi, C. quebecensis, C. perihumidiphila, C. ripicola, C. spartinae, C. arundinis, C. humidiphila, and C. purpurea). Group III, including nine strains of C. purpurea and the reference sequence of C. purpurea 20.1 easH2, diverged within the clade of C. purpurea easH1 (Figure 5d and Figure S2d).
For idt/ltm genes, the second copies of ltmB can be considered as one group arising from one gene duplication, except that ltmB1 of C. humidiphila LM576 was placed in this group. This sequence was the only copy detected in LM576 and, therefore, labeled as copy one. However, it was on a separate contig (contig 478), clustered with neither ltmP and ltmQ (contig 945, Table 3) nor ltmC, ltmS, and ltmM (contig 745). It is very likely that this represents the second copy of this gene, and copy one was either lost or not detected (Figure 5e and Figure S2e).
The three partial ltmQ2 genes from three strains of C. occidentalis grouped closely with a clade composed of four strains of C. purpurea ltmQ1 (Clav04, Clav46, LM71, and LM72) and two ltmQ2 (Clav55 and LM461) (Figure 5f and Figure S2f). As noted earlier in Section 2.2.2, ltmQ1 of Clav04 and Clav46 was either a partial gene or a nonfunctional gene, respectively, whereas the second copies were functioning genes. Here, ltmQ2 of Clav04 and Clav46 grouped in C. purpurea ltmQ1 clade 1. This situation can be explained by a scenario in which these two copies might have switched locations due to assembly errors. For the other two sequences, ltmQ1 of LM71 was on a different contig from the other ltm genes, and in LM72, the gene was split across two contigs, where one half was connected with ltmP, while the other half was independent. Overall, these four sequences appeared to be the same copy as C. purpurea ltmQ2 (Clav55 and LM461). If that is the case, one gene duplication event possibly happened at node ①. Alternatively, the ltmQ2 of Clav04 and Clav46, as well as the two ltmQ2 groups, could have resulted from independent gene duplications (Figure 5f and Figure S2f). Long-read sequencing, e.g., Nanopore or PacBio, could bring more insight by ruling out possible assembly errors.

2.4. Intraspecific Genetic Variation within C. purpurea

Overall, the haplotype diversities (Hd) of eas genes ranged from 0.936 to 1 (close to saturation), except for easH2, which had a lower value, 0.858. Nucleotide diversity (Pi) of eas genes ranged from 0.08 (easD) to 0.168 (easH2), the average number of nucleotide differences (K) ranged from 7.1510 (easD) to 212.238 (easE2), tree-based divergence from COT ranged from 0.06 (easA and easD) to 0.150 (easH2), and tree-based diversity ranged from 0.01 (easD) to 0.219 (easE2). In general, easD and easA had lower values for divergence and/or diversity. The second copies of dmaW, easE, easF, and easH had much higher values of the four parameters. Some of those genes may not function and, therefore, had fewer functional constraints. If only the first copy of the genes was considered, the genes with the highest diversity and divergence values were Pi 0.03 (dmaW1), K 92.379 (lpsC), tree-based divergence from COT 0.0025 (dmaW1), and tree-based diversity 0.038 (dmaW1). The two genes functioning in the middle of the pathway, i.e., easA and easD, were observed to be the most conserved genes compared with the other genes in the earlier or later steps (Table 4, Figure 6a).
1 Sequences with large gaps causing a significant reduction in the number of sites were excluded from the analyses. 2 Tree-based divergence from the center of tree (COT) and diversity were estimated by DIVEIN; other parameters were estimated by DnaSP.
Compared with the first copy of eas genes, idt/ltm genes had a similar level of the highest diversity and divergence. Pi ranged from 0.007 (ltmM and ltmS) to 0.02 (ltmQ1), the average number of nucleotide differences (K) ranged from 6.839 (ltmS) to 41.486 (ltmQ1), tree-based divergence from COT ranged from 0.005 (ltmM) to 0.066 (ltmB1), and tree-based diversity ranged from 0.009 (ltmM) to 0.04 (ltmQ) (Table 4, Figure 6b).
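For readers unfamiliar with the summary statistics reported above, the following minimal Python sketch shows how K, Pi, and Hd can be computed from an alignment of equal-length sequences. It is an illustration only and not the DnaSP implementation: the function names are hypothetical, and the handling of gaps (dropping every alignment column that contains a gap or ambiguous base) is an assumption that may differ from the settings used in this study.

```python
from collections import Counter
from itertools import combinations


def count_differences(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def diversity_stats(alignment):
    """Return K, Pi, Hd, and the number of sites analysed.

    Assumption (not from the article): columns containing anything other
    than A, C, G, or T in any sequence are removed before the calculation.
    """
    seqs = [s.upper() for s in alignment]
    n = len(seqs)
    if n < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")

    keep = [i for i in range(len(seqs[0])) if all(s[i] in "ACGT" for s in seqs)]
    if not keep:
        raise ValueError("no gap-free columns left in the alignment")
    trimmed = ["".join(s[i] for i in keep) for s in seqs]

    # K: average number of nucleotide differences over all sequence pairs.
    pairs = list(combinations(trimmed, 2))
    k = sum(count_differences(a, b) for a, b in pairs) / len(pairs)

    # Pi: K expressed per site.
    pi = k / len(keep)

    # Hd: haplotype diversity, n/(n-1) * (1 - sum of squared haplotype frequencies).
    counts = Counter(trimmed)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    return {"K": k, "Pi": pi, "Hd": hd, "sites": len(keep)}


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]  # toy data, not from the study
    print(diversity_stats(toy))
```

Because Pi is K divided by the number of sites analysed, long genes can show a large K while Pi stays modest, which is one reason the two statistics rank genes differently.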

3. Discussion

3.1. Correlations between the Presence/Absence of Alkaloid Genes and Alkaloid Production

Attempts to induce EA production for pharmaceutical purposes (see review by Flieger [46]) have shown that different ergot species produce different types of ergot alkaloids. Simultaneously, mycologists explored the use of alkaloid chemistry for characterizing Claviceps species [47,48]. Pažoutová and colleagues [49] differentiated chemoraces using the qualitative and quantitative features of EA production. A systematic study on EA production in 43 Claviceps species confirmed that ergopeptides were produced only by the members in C. sect. Claviceps, whereas dihydroergot alkaloids (DH-ergot alkaloids) were produced only by certain members of C. sect. Pusillae, i.e., C. africana, C. gigantea, and C. eriochloe. Sixteen out of 28 species in C. sect. Pusillae were shown not to produce any EAs, including C. maximensis, C. pusillae, and C. sorghi. Species only producing clavines included C. fusiformis, C. lovelessii, and three other species [3]. More recent studies demonstrated that the indole alkaloid profiles supported the recognition of new species based on molecular and ecological data [29,30].
The EA genes detected in the present study were consistent with the known EA production of the included species, for the most part. For example, C. africana CCC489 had eight genes detected (lacking cloA, easH2, lpsB, and lpsC), and all appeared to be functional, consistent with its production of DH-ergot alkaloids. Similarly, in C. lovelessii CCC647, ten EA genes were detected (lacking lpsC and easH2); however, easH1 and lpsB had mutations resulting in a number of internal stop codons, which is consistent with the production of clavines, a product of the early pathway [3]. A lack of EA production corresponded to no matches for any EA genes in C. maximensis CCC398 and C. citrina CCC265 (C. sect. Citrinae). However, for C. pusillae and C. sorghi, several functional genes were detected even though no EA production was reported [3]. In C. pusillae CCC602, eight genes had full-length matches (dmaW1, easA, C, D, E, G, and H1, and lpsB) and one partial match (cloA 332 bp), but only dmaW1, easC, and easE had ORFs. The lack of easF, the second step in the pathway encoding dimethylallyltryptophan N-methyltransferase, might explain the lack of production of EAs. C. sorghi CCC632 had seven full-length matches (dmaW1 and easA, C, E, F, G, and H1) and two partial (cloA 435 bp and easD 653 bp). Except for cloA and easH1, all other genes had good ORFs. Theoretically, at least chanoclavine should be produced unless those genes were not expressed possibly due to a lack of triggers from physical or environmental conditions [50].
Only the members in C. sect. Claviceps had lpsC and easH2, although C. perihumidiphila LM81, one strain of C. ripicola (LM454), and C. arundinis (LM583) lacked lpsC, and C. capensis, C. cyperi, C. humidiphila, and C. monticola had a partial lpsC. Moreover, three C. purpurea strains (LM65, LM72, and LM582) and three C. quebecensis strains (Clav32, Clav50, and LM458) lacked easH2. Whether the absence of these genes causes variations in their EA profiles requires a systematic investigation of the associations between eas genes and products in those species. It is worth noting, however, that the possibility of false negatives in genome screening cannot be ruled out. For instance, for C. arundinis CCC1102, lpsC was detected in the WF version of the genome assembly (created in the present study), but not in the previous version (SW [44], Table 2). The opposite also occurred in that a full-length dmaW3 was detected in SW assemblies, but only a partial one (360 bp) in WF assemblies (this study).
The production of indole diterpenoid compounds in ergot fungi was reported in a small number of species, i.e., C. arundinis, C. cynodontis, C. humidiphila, C. paspali, and C. purpurea [21,28,29,30]. Our genome mining showed that ltmQ, P, B, C, M, and S were present in all species in C. sect. Claviceps except C. cyperi. Furthermore, ltmB, C, and M and a nonfunctioning ltmG were detected in C. citrina, while a partial ltmG was detected in C. maximensis CCC389 and C. digitariae CCC659. According to the proposed pathway, to produce paspaline, the first step requires ltmG, followed by ltmC, ltmM, and ltmB [27]. The absence of ltmG could stop production unless GGPP is present through other resources. This might be the case in the producers of indole diterpenoid compounds listed above. In the same way, it is very likely that most of the species in C. sect. Claviceps and the three species in sect. Citrinae and sect. Pusillae could also produce some forms of indole diterpenoid compounds.

3.2. Macro-Evolution of the Gene Clusters—Frequent Gene Duplications and Losses

Ergot alkaloid diversity among diverse producers, i.e., species in Hypocreales, Eurotiales, and Xylariales, was formed by three major processes: gene gains, gene losses, and gene sequence changes [13,14]. This is true within the genus Claviceps. A recent genus-level genome comparison hypothesized that unconstrained tandem gene duplications were caused by putative loss of repeat-induced point mutations in C. sect. Claviceps [44]. This pattern of duplication was confirmed here by the presence of a cluster of second or third copies of easE, easF, and dmaW, as well as second copies of ltmQ and B (Table 2 and Table 3). Moreover, easE2 and F2 of C. purpurea LM461 were on the same contig as easG and partial dmaW1, suggesting that the second copies of easE and F were arranged on the primary cluster possibly as a result of tandem gene duplication. None of the extra gene copies were found in C. sect. Pusillae or sect. Citrinae, consistent with a previous observation that the genomes of sect. Pusillae and sect. Citrinae had much fewer gene duplication events predicted [44]. According to the phylogenies of multicopy genes, one to five gene duplications can be inferred for individual genes. The dmaW gene, encoding the enzyme for the first and determinant step of EA production, had the highest number of potential gene duplications. Even though the presence of dmaW was conserved across various EA producers and proven to be a monophyletic group [51], its evolutionary rate was faster than genes in the middle steps of the EA pathway.
Gene losses can be inferred through the discrepant placement of certain gene copies on the phylogenies. For instance, one copy of ltmB in C. humidiphila LM576 was detected; however, this copy grouped with ltmB2. It is very likely that this was the second copy of the ltmB gene, and the first copy was either lost or not detected (Figure 5e, see also Section 2.2.2 and Section 2.2.3). The ltmQ1 from four strains of C. purpurea (LM71, LM72, Clav04, and Clav46) was placed in the ltmQ2 clade. For LM71 and LM72, there was only one copy detected (ltmQ1); the scenario is likely similar to ltmB of LM576, where this single copy was the second copy, and the original gene was either lost or not detected (Figure 5f). On a related note, ltmQ2 of Clav04 and Clav46 was located in the ltmQ1 clade. An intuitive explanation would be that the identities of the two copies switched due to assembly artefacts (Figure 5f). Lastly, the incongruent order of divergence of the four Batches of species in C. sect. Claviceps inferred by single-copy genes could be explained as lineages sorted during the frequent gains and losses of the ancestral genotypes (Figure 4). Unlike C. sect. Claviceps, the phylogeny incongruence in C. sect. Pusillae was mainly caused by the uncertain placement of C. digitariae and C. paspali. In light of the genome structure, this was likely caused by insufficient sampling instead of gene lineage sorting.

3.3. Micro-Evolution of eas Genes within C. purpurea—An Approximate Hourglass Model

The inter- and intraspecific variations of the secondary metabolite gene clusters in fungi are typically reported as variations in structures, gene contents, copy numbers, null alleles, and nonhomologous clusters (see review by Rokas [52]). Fewer studies have focused on the DNA sequence variations in each of the gene members. Lorenz et al. [53] identified the sequence differences in lpsA between two C. purpurea strains (P1 and ECC93) that were associated with the different alkaloid types; however, they could not find differences in cloA between C. fusiformis and C. hirtella that could explain why this gene was functional in the former but not in the latter. Phylogenetic analyses of DNA sequences of four core genes (dmaW, easF, easC, and easE) from selected samples across Clavicipitaceae (with emphasis on Epichloë) uncovered extensive gene losses, and the origin of EA clusters in Clavicipitaceous fungi was determined to be direct descent rather than horizontal transfer [13].
The present study is the first, to our knowledge, to examine the variations of each gene on a fine scale, i.e., among 28 strains of C. purpurea. Both DNA polymorphism analyses of the DNA sequence alignments through DnaSP and tree-based diversity and divergence analyses using the DIVEIN software indicated that the evolutionary rate of early step genes, i.e., dmaW and easF, is much higher than that of the middle step genes, i.e., easA, C, D, and E (Figure 6, Table 4). The pattern matches the hourglass model in ontogeny, which was also evidenced in genomic studies [39]. The hourglass model (HGM) and early conservation model (ECM) in ontogeny are explained by developmental constraints. HGM considers that, at the middle stage, the meta- and cis-interactions reach the highest complexity, posing constraints for development [54,55], whereas ECM considers the constraints at the early stage to be critical because any alterations at the early stage would cause cascading effects [56]. The EA pathway was reported as an unusually inefficient one, such that certain intermediates accumulated in higher volumes than needed for producing the end-products [57]. This may impose less selective pressure on the middle steps. The sclerotia of C. purpurea from tall fescue contained chanoclavine (4 ± 3 µg/g) and agroclavine (2 ± 1 µg/g) in addition to the end-products, i.e., ergopeptines and ergonovine [57]. The extra amount of chanoclavine coincides with the lowest evolutionary rates of easD and easA inferred in the present study (Figure 6). The role of easD is to oxidize chanoclavine to chanoclavine aldehyde, followed by the reactions of easA and easG to yield agroclavine. It is likely that easD is under less selective pressure because plenty of supplies are available. Alternatively, it might be under a high level of functional constraints because of its pivotal position in the pathway (first step of closure of the D-ring). A different isoform of easA in C. africana and C. gigantea reacts differently, creating a shunt yielding dihydroergot alkaloids (Figure 1). This diversification may result from the change in ecological niches. Nevertheless, the rates of diversity and divergence of easA were the second lowest after easD, even though it is physically located in between lpsB and lpsC. Both of these later step genes had much higher rates than easA, possibly due to fewer constraints or more direct positive selection, as they are involved in the final steps. The cloA gene represents another point of the pathway where shunts may take place. Presumably depending on the different isoforms of cloA, varied levels of oxidation occur, resulting in different end-products [13,15]. The high rates of diversity and divergence of cloA may reflect a high level of positive selection.
The signatures of selective pressure in DNA sequences could be detected through neutrality tests. For instance, if the value of Tajima’s D significantly deviates from zero, it indicates the presence of selective pressures, i.e., negative values suggest positive selection, whereas positive values indicate balancing selection [58]. We conducted neutrality tests and found that none of the genes departed significantly from neutrality (results not shown). These results contradict those of Liu et al. [59], who found that easE and easA were under positive selection in Canadian and western USA C. purpurea populations. We speculate here that the small sample size in the present study (28 sequences versus 200–300 in the previous study) might be the factor limiting the ability of the Tajima’s D test to detect selective pressures.
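To make the neutrality test above concrete, the sketch below computes Tajima’s D from three quantities: the number of sequences (n), the number of segregating sites (S), and the average number of pairwise nucleotide differences (K). It follows the standard published formula for the statistic; the function name and the toy input values are illustrative assumptions and are not taken from this study, and the significance assessment that DnaSP performs is not reproduced here.

```python
import math


def tajimas_d(n, segregating_sites, k):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences k."""
    if n < 4:
        raise ValueError("Tajima's D is not informative for fewer than four sequences")
    s = segregating_sites
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # Harmonic-type sums over 1..n-1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    # Constants of the variance estimator.
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy values only: 28 sequences, 40 segregating sites, K = 9.5.
    print(round(tajimas_d(28, 40, 9.5), 3))
```

D is zero when K equals S divided by a1, negative when there is an excess of rare variants, and positive when intermediate-frequency variants are in excess. With small samples the variance term is large, which is consistent with the limited power discussed above.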
Compared with eas gene pathways, it is difficult to evaluate whether or not the evolutionary pattern of ltm genes conformed with the hourglass model because the sequential order of steps was uncertain. Even if we assume that paspaline-derived compounds are the main products, in the absence of ltmG, there are only two to three sequential steps to paspaline. Nevertheless, ltmM had the lowest rate of divergence and diversity compared with earlier (ltmC) and later steps (ltmP and Q).
Our results provide evidence for the first time that eas gene evolution follows the hourglass model. Whether this pattern exists in other metabolic gene pathways and the mechanisms that underpin this or other patterns are questions to be answered in future work.

4. Materials and Methods

4.1. Genome Acquisition

Fifty-four genomes of 19 Claviceps spp. were studied. The assemblies of 17 genomes and the raw reads of another 34 genomes were from previous studies (Table 1) [44,45], which outlined the protocols for the DNA extraction, library preparation, and sequencing platforms. In the present study, three additional genomes were sequenced (LM63, LM65, and LM72) using a protocol similar to that described in [44]. Briefly, the gDNA samples were normalized to 300 ng and sheared to 350 bp fragments using an M220 Covaris Focused-Ultrasonicator instrument (Covaris, Woburn, MA, USA). The obtained inserts were used as a template to construct PCR-free libraries using the NxSeq AmpFREE Low DNA Library kit (LGC, Biosearch Technologies, Middleton, WI, USA) following LGC’s library protocol. Balanced libraries in equimolar ratios were pooled, and paired-end sequencing was carried out on a NextSeq500/550 (Illumina, San Diego, CA, USA) using a 2 × 150 bp NextSeq Mid Output Reagent Kit (Illumina, San Diego, CA, USA) according to the manufacturer’s recommendations.
The new assemblies of 37 genomes were achieved using the following protocols: raw reads were trimmed using BBDuk, a component of BBTools downloaded from the Joint Genome Institute website (https://jgi.doe.gov/data-and-tools/bbtools/ accessed on 9 November 2021). Both quality-trim and kmer-trim were applied using the parameters qtrim = rl, trimq = 20, forcetrimleft = 10, minlength = 36, ftm = 5, ref = adapters/adapters.fa, ktrim = r, k = 22, mink = 11, hdist = 1, tbo tpe. The qualities of initial reads and post-trimming reads were assessed using FastQC version 0.11.9, setting parameters as quiet, noextract. Pairs of trimmed reads for each strain were assembled using the SPAdes version 3.14.0 genome assembly toolkit with the default parameters [60]. QUAST version 5.0.2 was used to evaluate the resulting assemblies and to obtain statistics about the assembled contigs [61]. To assess the completeness of the genome assemblies, BUSCO 4.1.4 was run on the contigs using the fungal database (fungi odb10) (Creation date: 10 September 2020, number of species: 549, number of BUSCOs: 758) [62].
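The trimming, quality-check, and assembly steps described above can be sketched as a small Python helper that builds the three command lines for one strain. The trimming parameters and the FastQC options are those listed in the text, and SPAdes is called with default parameters as stated. The file names, the output directory name, and the in1/in2/out1/out2 arguments for the paired files are assumptions added for illustration; the authors' actual invocation is not given in the article.

```python
import shlex


def read_processing_commands(strain, read1, read2, adapters="adapters/adapters.fa"):
    """Build BBDuk, FastQC, and SPAdes command lines for one paired-end library."""
    # Hypothetical output names; the article does not specify a naming scheme.
    trimmed1 = f"{strain}_trimmed_R1.fastq.gz"
    trimmed2 = f"{strain}_trimmed_R2.fastq.gz"

    bbduk = [
        "bbduk.sh",
        f"in1={read1}", f"in2={read2}",
        f"out1={trimmed1}", f"out2={trimmed2}",
        # Quality trimming parameters as listed in the text.
        "qtrim=rl", "trimq=20", "forcetrimleft=10", "minlength=36", "ftm=5",
        # Kmer (adapter) trimming parameters as listed in the text.
        f"ref={adapters}", "ktrim=r", "k=22", "mink=11", "hdist=1", "tbo", "tpe",
    ]
    # Quality reports for the reads before and after trimming.
    fastqc = ["fastqc", "--quiet", "--noextract", read1, read2, trimmed1, trimmed2]
    # Assembly of the trimmed pairs with default SPAdes parameters.
    spades = ["spades.py", "-1", trimmed1, "-2", trimmed2, "-o", f"{strain}_spades"]
    return [bbduk, fastqc, spades]


if __name__ == "__main__":
    for command in read_processing_commands("LM63", "LM63_R1.fastq.gz", "LM63_R2.fastq.gz"):
        print(shlex.join(command))
```

The helper only prints the commands; running them (for example with subprocess.run) requires the three tools to be installed and is left to the reader.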

4.2. Alkaloid Gene Screening and Extraction

To investigate the presence/absence of the four classes of alkaloid synthesis genes in 54 genomes, BLAST searches were conducted to interrogate the genomes with the reference genes of interest using an in-house Perl script (running blastn with an E-value of E−99 as the cutoff). Alternatively, each individual genome assembly was mapped onto the reference genes using the ‘Map to Reference’ function in Geneious Prime 2020.1.2 (https://www.geneious.com, accessed on 9 November 2021). The reference gene clusters were downloaded from GenBank and applied as follows: the clusters of 14 ergot alkaloid synthesis (eas) genes and six indole-diterpene/lolitrem genes (IDT/ltm) from C. purpurea strain 20.1 (JN186799 containing cloA, dmaW, easA, C–G, easH1, easH2, lpsA1, lpsA2, lpsB, and lpsC; JX402756 containing idt/ltmB, C, M, P, Q, and S) and C. paspali RRC-1481 JN186800 (easO) were first applied as queries to interrogate each genome. In addition, the cluster from C. fusiformis PRL1980 EU006773 (10 genes: cloA, dmaW, easA, C–H, and lpsB) was applied to further interrogate genomes in C. sect. Pusillae and C. citrina. For the IDT/ltm genes that were not previously reported in Claviceps purpurea 20.1, the reference sequences from C. paspali JN613321 (ltmF and ltmG) and Epichloë (ltmE and J on JN613318, and K on JN613320) were used to conduct lower stringency megablast searches (https://www.geneious.com, accessed on 9 November 2021) with E-values E−50 and E−20. Megablast searches were also conducted for loline alkaloid genes (lolA, D, E, M–P, T, and U on JF830816, lolC FJ464781, and lolF FJ594413) and peramine (perA JN640287) in all 54 genomes. Genes that were present in genomes were extracted manually. Split fragments of a single gene on different contigs were concatenated on the basis of reference sequences. DNA sequences of genes extracted from the new genomes were submitted to GenBank.
When multiple copies of certain genes were present (such as dmaW, easE, easF, ltmB, and ltmQ), the copy on the main cluster was designated as copy 1, as determined by examining the contig numbers. The exception was easH, for which copies were designated on the basis of similarity to the two copies determined by previous studies [14]. Disconnected fragments shorter than 300 bp were not considered.
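The screening criteria described in this section (an E-value cutoff and a minimum fragment length) can be illustrated with a short Python sketch that filters blastn hits in the standard 12-column tabular format. The in-house Perl script is not reproduced in the article, so this is not that script: the function name, the use of tabular output, the use of alignment length as the fragment length, and the example hit lines are all assumptions made for illustration.

```python
from collections import defaultdict


def filter_hits(tabular_text, max_evalue=1e-99, min_length=300):
    """Group blastn hits by reference gene, keeping hits that pass both thresholds.

    Expects the 12-column tabular layout: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    Here the query is a reference gene and the subject is a genome contig.
    """
    hits = defaultdict(list)
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        gene, contig = fields[0], fields[1]
        length = int(fields[3])
        evalue = float(fields[10])
        if evalue <= max_evalue and length >= min_length:
            hits[gene].append((contig, length, evalue))
    return dict(hits)


if __name__ == "__main__":
    # Invented example hits, not data from the study.
    example = "\n".join([
        "\t".join(["dmaW", "contig_12", "98.5", "1350", "20", "0", "1", "1350", "500", "1849", "0.0", "2400"]),
        "\t".join(["dmaW", "contig_478", "91.0", "250", "22", "1", "1", "250", "10", "259", "1e-101", "350"]),
        "\t".join(["easF", "contig_12", "85.0", "900", "130", "5", "1", "900", "4000", "4899", "1e-60", "600"]),
    ])
    for gene, gene_hits in filter_hits(example).items():
        print(gene, gene_hits)
```

Hits for the same gene on more than one contig would then need manual inspection, as in the text, to decide whether they are split fragments of one gene or separate copies.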

4.3. Phylogenetic Analyses

The extracted sequences for each gene were aligned individually through the Geneious Prime (https://www.geneious.com, accessed on 9 November 2021) Align/Assemble function using Global alignment with free end gaps, 93% similarity (5.0/−9.026168) as the cost matrix, a gap open penalty of 12, a gap extension penalty of 3, and two refinement iterations. This protocol is particularly suitable for aligning sequences with large gaps or shorter fragments to full-length sequences. Maximum likelihood phylogenetic trees were developed using the PhyML 3.3.20180621 [63] plugin of Geneious Prime (https://www.geneious.com, accessed on 9 November 2021). Both GTR and HKY substitution models were attempted; branch supports were evaluated through bootstrapping analyses of 100 replicates. The reference sequences of lpsB of C. paspali have only 52% similarity with those of C. purpurea, causing spurious alignment and a significantly long branch; therefore, they were not included in the analyses.

4.4. Intraspecific Gene Diversity and Divergence Analyses

Population demographic parameters are suitable for investigating genetic differentiation and gene evolution at an intraspecific level. We investigated the DNA polymorphisms, nucleotide diversity (Pi), and average number of nucleotide differences (K) among 27 strains of C. purpurea using DnaSP [64]. Another reason for choosing this subset of data, instead of all 53 samples, is that all but three strains (LM65, LM72, and LM582, which lacked easH2) contained all 12 genes, making the results more comparable. Nonetheless, the sequences with long gaps causing a significant reduction in alignment length in dmaW and easF were excluded from the DnaSP analyses. In addition, the tree-based diversity and divergence from the center of the tree (COT) were calculated through the web-based DIVEIN software (https://indra.mullins.microbiol.washington.edu/DIVEIN/diver.html, accessed on 9 November 2021) [65]. The following parameters were applied: GTR substitution model, optimized equilibrium frequencies, the best of NNI and SPR tree improvement, and topology + branch length tree optimization algorithm. For multicopy genes (dmaW, easE, easF, and easH), we calculated the parameters both for each individual copy and for all copies combined as one gene (Table 4).

Supplementary Materials

The following are available online at https://www.mdpi.com/article/10.3390/toxins13110799/s1, Figure S1: The phylogenetic trees developed by PhyML for each individual single-copy eas and idt/ltm gene; thickened branches indicate bootstrapping values >80%. Figure S2: The phylogenetic trees developed by PhyML for each individual multi-copy eas and idt/ltm gene. Table S1: The collection information for 53 strains of Claviceps spp.

Author Contributions

Conceptualization, M.L.; methodology, K.D., P.S. and S.A.W.; software (pipeline), W.F.; formal analysis, M.L., W.F., P.S. and A.B.; resources, J.D., M.K., J.G.M., S.A.W. and K.B.; data curation, W.F., P.S., S.A.W. and A.B.; writing—original draft preparation, M.L.; writing—review and editing, W.F., J.D., S.A.W., K.B., P.S., K.D., M.K. and J.G.M.; funding acquisition, M.L., J.D. and K.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Agriculture and Agri-Food Canada’s Growing Forward 2 for a research network on Emerging Mycotoxins (EmTox, project # J-000048), STB fungal and bacterial biosystematics J-002272, the Agriculture and Food Research Initiative (AFRI) National Institute of Food and Agriculture (NIFA) Fellowships Grant Program: Predoctoral Fellowships grant no. 2019-67011-29502/project accession no. 1019134 from the United States Department of Agriculture (USDA), and the American Malting Barley Association grant no. 17037621. Additional funding was provided by Agriculture and Agri-Food Canada grant J-001564, Biological Collections Data Mobilization Initiative (BioMob, Work Package 2). This research was supported in part by the U.S. Department of Agriculture, Agricultural Research Service.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The genome and gene data presented in this study are openly available in NCBI upon publication of this article https://www.ncbi.nlm.nih.gov/ (accessed on 9 November 2021). Accession numbers are detailed in the text Section 2.2 and Table 1.

Acknowledgments

We thank the Molecular Technologies Laboratory (MTL) at the Ottawa Research & Development Centre of Agriculture and Agri-Food Canada and Kassandra R. Bisson for technical assistance, Chunfang Zheng and Frank You for bioinformatics assistance, Christopher Schardl for advice during the early stages of the study, and two anonymous reviewers for reviewing the manuscript.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. The USDA is an equal opportunity provider and employer.

References

  1. Tenberge, K.B. Biology and life strategy of the ergot fungi. In Ergot: The Genus Claviceps; Kren, V., Cvak, L., Eds.; Harwood Academic Publishers (republished 2006 by Taylor and Francis e-Library): Amsterdam, The Netherlands, 1999; pp. 25–56.
  2. Luttrell, E.S. Host-parasite relationships and development of the ergot sclerotium in Claviceps purpurea. Can. J. Bot. 1980, 58, 942–958.
  3. Píchová, K.; Pažoutová, S.; Kostovčík, M.; Chudíčková, M.; Stodůlková, E.; Novák, P.; Flieger, M.; van der Linde, E.; Kolařík, M. Evolutionary history of ergot with a new infrageneric classification (Hypocreales: Clavicipitaceae: Claviceps). Mol. Phylogenetics Evol. 2018, 123, 73–87.
  4. Liu, M.; Overy, D.P.; Cayouette, J.; Shoukouhi, P.; Hicks, C.; Bisson, K.; Sproule, A.; Wyka, S.A.; Broders, K.; Popovic, Z.; et al. Four phylogenetic species of ergot from Canada and their characteristics in morphology, alkaloid production, and pathogenicity. Mycologia 2020, 112, 974–988.
  5. Campbell, W.P.; Freisen, H.A. The control of ergot in cereal crops. Plant Dis. Rep. 1959, 43, 1266–1267.
  6. European Food Safety Authority. Scientific opinion on ergot alkaloids in food and feed. EFSA J. 2012, 10, 2798.
  7. Tfelt-Hansen, P.; Saxena, P.R.; Dahlöf, C.; Pascual, J.; Láinez, M.; Henry, P.; Diener, H.-C.; Schoenen, J.; Ferrari, M.D.; Goadsby, P.J. Ergotamine in the acute treatment of migraine: A review and European consensus. Brain 2000, 123, 9–18.
  8. Schiff, P.L., Jr. Ergot and its alkaloids. Am. J. Pharm. Educ. 2006, 70, 98.
  9. Barger, G. Ergot and Ergotism: A Monograph Based on the Dohme Lecture Delivered in Johns Hopkins University, Baltimore; Gurney and Jackson: London, UK, 1931.
  10. Belser-Ehrlich, S.; Harper, A.; Hussey, J.; Hallock, R. Human and cattle ergotism since 1900: Symptoms, outbreaks, and regulations. Toxicol. Ind. Health 2013, 29, 307–316.
  11. Schardl, C.L.; Panaccione, D.G.; Tudzynski, P.; Geoffrey, A.C. Ergot alkaloids – Biology and molecular biology. In The Alkaloids: Chemistry and Biology; Academic Press: Cambridge, MA, USA, 2006; Volume 63, pp. 45–86.
  12. Tudzynski, P.; Holter, K.; Correia, T.; Arntz, C.; Grammel, N.; Keller, U. Evidence for an ergot alkaloid gene cluster in Claviceps purpurea. Mol. Gen. Genet. 1999, 261, 133–141.
  13. Young, C.A.; Schardl, C.L.; Panaccione, D.G.; Florea, S.; Takach, J.E.; Charlton, N.D.; Moore, N.; Webb, J.S.; Jaromczyk, J. Genetics, genomics and evolution of ergot alkaloid diversity. Toxins 2015, 7, 1273–1302.
  14. Schardl, C.L.; Young, C.A.; Hesse, U.; Amyotte, S.G.; Andreeva, K.; Calie, P.J.; Fleetwood, D.J.; Haws, D.C.; Moore, N.; Oeser, B.; et al. Plant-symbiotic fungi as chemical engineers: Multi-genome analysis of the clavicipitaceae reveals dynamics of alkaloid loci. PLoS Genet. 2013, 9, e1003323.
  15. Robinson, S.L.; Panaccione, D.G. Diversification of ergot alkaloids in natural and modified fungi. Toxins 2015, 7, 201–218.
  16. Knaus, H.G.; McManus, O.B.; Lee, S.H.; Schmalhofer, W.A.; Garcia-Calvo, M.; Helms, L.M.; Sanchez, M.; Giangiacomo, K.; Reuben, J.P.; Smith, A.B., 3rd; et al. Tremorgenic indole alkaloids potently inhibit smooth muscle high-conductance calcium-activated potassium channels. Biochemistry 1994, 33, 5819–5828.
  17. Smith, M.M.; Warren, V.A.; Thomas, B.S.; Brochu, R.M.; Ertel, E.A.; Rohrer, S.; Schaeffer, J.; Schmatz, D.; Petuch, B.R.; Tang, Y.S.; et al. Nodulisporic acid opens insect glutamate-gated chloride channels: Identification of a new high affinity modulator. Biochemistry 2000, 39, 5543–5554.
  18. Turland, N.J.; Wiersema, J.H.; Barrie, F.R.; Greuter, W.; Hawksworth, D.L.; Herendeen, P.S.; Knapp, S.; Kusber, W.-H.; Li, D.-Z.; Marhold, K. International Code of Nomenclature for Algae, Fungi, and Plants (Shenzhen Code) Adopted by the Nineteenth International Botanical Congress Shenzhen, China, July 2017. Regnum Vegetabile 159; Koeltz Botanical Books: Glashütten, Germany, 2018; Volume 159.
  19. Botha, C.J.; Kellerman, T.S.; Fourie, N. A tremorgenic mycotoxicosis in cattle caused by Paspalum distichum (L.) infected by Claviceps paspali. J. S. Afr. Vet. Assoc. 1996, 67, 36–37.
  20. Prestidge, R.A. Causes and control of perennial ryegrass staggers in New Zealand. Agric. Ecosyst. Environ. 1993, 44, 283–300.
  21. Uhlig, S.; Botha, C.J.; Vrålstad, T.; Rolén, E.; Miles, C.O. Indole-diterpenes and ergot alkaloids in Cynodon dactylon (bermuda grass) infected with Claviceps cynodontis from an outbreak of tremors in cattle. J. Agric. Food Chem. 2009, 57, 11112–11119.
  22. Kozák, L.; Szilágyi, Z.; Tóth, L.; Pócsi, I.; Molnár, I. Functional characterization of the idtF and idtP genes in the Claviceps paspali indole diterpene biosynthetic gene cluster. Folia Microbiol. 2020, 65, 605–613.
  23. Saikia, S.; Takemoto, D.; Tapper, B.A.; Lane, G.A.; Fraser, K.; Scott, B. Functional analysis of an indole-diterpene gene cluster for lolitrem B biosynthesis in the grass endosymbiont Epichloë festucae. FEBS Lett. 2012, 586, 2563–2569.
  24. Young, C.A.; Bryant, M.K.; Christensen, M.J.; Tapper, B.A.; Bryan, G.T.; Scott, B. Molecular cloning and genetic analysis of a symbiosis-expressed gene cluster for lolitrem biosynthesis from a mutualistic endophyte of perennial ryegrass. Mol. Genet. Genom. MGG 2005, 274, 13–29.
  25. Young, C.A.; Felitti, S.; Shields, K.; Spangenberg, G.; Johnson, R.D.; Bryan, G.T.; Saikia, S.; Scott, B. A complex gene cluster for indole-diterpene biosynthesis in the grass endophyte Neotyphodium lolii. Fungal Genet. Biol. 2006, 43, 679–693.
  26. Young, C.A.; Tapper, B.A.; May, K.; Moon, C.D.; Schardl, C.L.; Scott, B. Indole-diterpene biosynthetic capability of Epichloë endophytes as predicted by ltm gene analysis. Appl. Environ. Microbiol. 2009, 75, 2200–2211.
  27. Jiang, Y.; Ozaki, T.; Harada, M.; Miyasaka, T.; Sato, H.; Miyamoto, K.; Kanazawa, J.; Liu, C.; Maruyama, J.-I.; Adachi, M.; et al. Biosynthesis of indole diterpene lolitrems: Radical-induced cyclization of an epoxyalcohol affording a characteristic lolitremane skeleton. Angew. Chem. Int. Ed. 2020, 59, 17996–18002.
  28. Uhlig, S.; Egge-Jacobsen, W.; Vrålstad, T.; Miles, C.O. Indole-diterpenoid profiles of Claviceps paspali and Claviceps purpurea from high-resolution Fourier transform Orbitrap mass spectrometry. Rapid Commun. Mass Spectrom. RCM 2014, 28, 1621–1634.
  29. Negård, M.; Uhlig, S.; Kauserud, H.; Andersen, T.; Høiland, K.; Vrålstad, T. Links between genetic groups, indole alkaloid profiles and ecology within the grass-parasitic Claviceps purpurea species complex. Toxins 2015, 7, 1431–1456.
  30. Uhlig, S.; Rangel-Huerta, O.D.; Divon, H.H.; Rolén, E.; Pauchon, K.; Sumarah, M.W.; Vrålstad, T.; Renaud, J.B. Unraveling the ergot alkaloid and indole diterpenoid metabolome in the Claviceps purpurea species complex using lc–hrms/ms diagnostic fragmentation filtering. J. Agric. Food Chem. 2021, 69, 7137–7148. [Google Scholar] [CrossRef] [PubMed]
  31. Schardl, C.L.; Grossman, R.B.; Nagabhyru, P.; Faulkner, J.R.; Mallik, U.P. Loline alkaloids: Currencies of mutualism. Phytochemistry 2007, 68, 980–996. [Google Scholar] [CrossRef]
  32. Tanaka, A.; Tapper, B.A.; Popay, A.; Parker, E.J.; Scott, B. A symbiosis expressed non-ribosomal peptide synthetase from a mutualistic fungal endophyte of perennial ryegrass confers protection to the symbiotum from insect herbivory. Mol. Microbiol. 2005, 57, 1036–1050. [Google Scholar] [CrossRef] [PubMed]
  33. Duboule, D. Temporal colinearity and the phylotypic progression: A basis for the stability of a vertebrate Bauplan and the evolution of morphologies through heterochrony. Development 1994, 1994, 135–142. [Google Scholar] [CrossRef]
  34. Slack, J.M.W.; Holland, P.W.H.; Graham, C.F. The zootype and the phylotypic stage. Nature 1993, 361, 490–492. [Google Scholar] [CrossRef] [PubMed]
  35. Von Baer, K.E. Über Entwickelungsgeschichte der Thiere: Beobachtung und Reflexion; Bei den gebrüdern Bornträger: Königsberg, Russia, 1828; Volume 1. [Google Scholar]
  36. Richardson, M.K. Vertebrate evolution: The developmental origins of adult variation. Bioessays 1999, 21, 604–613. [Google Scholar] [CrossRef]
  37. Poe, S.; Wake, M.H. Quantitative tests of general models for the evolution of development. Am. Nat. 2004, 164, 415–422. [Google Scholar] [CrossRef]
  38. Cheng, X.; Hui, J.H.; Lee, Y.Y.; Wan Law, P.T.; Kwan, H.S. A “developmental hourglass in fungi”. Mol. Biol. Evol. 2015, 32, 1556–1566. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Prud‘homme, B.; Gompel, N. Genomic hourglass. Nature 2010, 468, 768–769. [Google Scholar] [CrossRef]
  40. Quint, M.; Drost, H.-G.; Gabel, A.; Ullrich, K.K.; Bönn, M.; Grosse, I. A transcriptomic hourglass in plant embryogenesis. Nature 2012, 490, 98–101. [Google Scholar] [CrossRef]
  41. Haeckel, E. Generelle Morphologie der Organismen. Allgemeine Grundzüge der Organischen Formen-Wissenschaft, Mechanisch Begründet Durch Die von Charles Darwin Reformirte Descendenztheorie; G. Reimer: Berlin, Germany, 1866; Volume 1. [Google Scholar]
  42. Gould, S.J. Ontogeny and Phylogeny; Harvard University Press: Cambridge, MA, USA, 1985. [Google Scholar]
  43. Cruickshank, T.; Wade, M.J. Microevolutionary support for a developmental hourglass: Gene expression patterns shape sequence variation and divergence in Drosophila. Evol. Dev. 2008, 10, 583–590. [Google Scholar] [CrossRef]
  44. Wyka, S.A.; Mondo, S.J.; Liu, M.; Dettman, J.; Nalam, V.; Broders, K.D. Whole-genome comparisons of ergot fungi reveals the divergence and evolution of species within the genus Claviceps are the result of varying mechanisms driving genome evolution and host range expansion. Genome Biol. Evol. 2021, 13, evaa267. [Google Scholar] [CrossRef] [PubMed]
  45. Wingfield, B.D.; Liu, M.; Nguyen, H.D.T.; Lane, F.A.; Morgan, S.W.; De Vos, L.; Wilken, P.M.; Duong, T.A.; Aylward, J.; Coetzee, M.P.A.; et al. Nine draft genome sequences of Claviceps purpurea s.lat., including C. arundinis, C. humidiphila, and C. cf. spartinae, pseudomolecules for the pitch canker pathogen Fusarium circinatum, draft genome of Davidsoniella eucalypti, Grosmannia galeiformis, Quambalaria eucalypti, and Teratosphaeria destructans. IMA Fungus 2018, 9, 401–418. [Google Scholar] [CrossRef] [PubMed]
  46. Flieger, M.; Wurst, M.; Shelby, R. Ergot alkaloids--sources, structures and analytical methods. Folia Microbiol. 1997, 42, 3–29. [Google Scholar] [CrossRef]
  47. Tanda, S. Mycological studies on ergot in Japan (Part 9). Distinct variety of Claviceps purpurea Tul. on Phalaris arundinacea L. and P. arundinacea var. picta L. J. Agric. Sci. Tokyo Nogyo Daigaku 1979, 24, 67–95. [Google Scholar]
  48. Pažoutová, S.; Parbery, D.P. The taxonomy and phylogeny of Claviceps. In Ergot: The Genus Claviceps; Kren, V., Cvak, L., Eds.; Hardwood Academic Publishers (republished 2006 by Taylor and Francise e-Library): Amsterdam, The Netherlands, 1999; pp. 57–77. [Google Scholar]
  49. Pažoutová, S.; Olsovská, J.; Linka, M.; Kolínská, R.; Flieger, M. Chemoraces and habitat specialization of Claviceps purpurea populations. Appl. Envron. Microbiol. 2000, 66, 5419–5425. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  50. Tudzynski, P.; Correia, T.; Keller, U. Biotechnology and genetics of ergot alkaloids. Appl. Microbiol. Biotechnol. 2001, 57, 593–605. [Google Scholar] [CrossRef] [PubMed]
  51. Liu, M.; Panaccione, D.G.; Schardl, C.L. Phylogenetic analyses reveal monophyletic origin of the ergot alkaloid gene dmaW in fungi. Evol. Bioinform. 2009, 5, EBO–S2633. [Google Scholar] [CrossRef]
  52. Rokas, A.; Wisecaver, J.H.; Lind, A.L. The birth, evolution and death of metabolic gene clusters in fungi. Nat. Rev. Microbiol. 2018, 16, 731–744. [Google Scholar] [CrossRef]
  53. Lorenz, N.; Haarmann, T.; Pazoutová, S.; Jung, M.; Tudzynski, P. The ergot alkaloid gene cluster: Functional analyses and evolutionary aspects. Phytochemistry 2009, 70, 1822–1832. [Google Scholar] [CrossRef]
  54. Raff, R.A. The Shape of Life: Genes, Development, and the Evolution of Animal Form; University of Chicago Press: Chicago, IL, USA, 2012. [Google Scholar]
  55. Galis, F.; van Dooren, T.J.; Metz, J.A. Conservation of the segmented germband stage: Robustness or pleiotropy? Trends Genet. TIG 2002, 18, 504–509. [Google Scholar] [CrossRef] [Green Version]
  56. Schlosser, G.; Wagner, G.P. Modularity in Development and Evolution; University of Chicago Press: Chicago, IL, USA, 2004. [Google Scholar]
  57. Panaccione, D.G. Origins and significance of ergot alkaloid diversity in fungi. FEMS Microbiol. Lett. 2005, 251, 9–17. [Google Scholar] [CrossRef] [Green Version]
  58. Tajima, F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 1989, 123, 585–595. [Google Scholar] [CrossRef] [PubMed]
  59. Liu, M.; Shoukouhi, P.; Bisson, K.R.; Wyka, S.A.; Broders, K.D.; Menzies, J.G. Sympatric divergence of the ergot fungus, Claviceps purpurea, populations infecting agricultural and nonagricultural grasses in North America. Ecol. Evol. 2021, 11, 273–293. [Google Scholar] [CrossRef]
  60. Bankevich, A.; Nurk, S.; Antipov, D.; Gurevich, A.A.; Dvorkin, M.; Kulikov, A.S.; Lesin, V.M.; Nikolenko, S.I.; Pham, S.; Prjibelski, A.D.; et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol. 2012, 19, 455–477. [Google Scholar] [CrossRef] [Green Version]
  61. Gurevich, A.; Saveliev, V.; Vyahhi, N.; Tesler, G. QUAST: Quality assessment tool for genome assemblies. Bioinformatics 2013, 29, 1072–1075. [Google Scholar] [CrossRef]
  62. Simão, F.A.; Waterhouse, R.M.; Ioannidis, P.; Kriventseva, E.V.; Zdobnov, E.M. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 2015, 31, 3210–3212. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  63. Guindon, S.; Dufayard, J.F.; Lefort, V.; Anisimova, M.; Hordijk, W.; Gascuel, O. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst. Biol. 2010, 59, 307–321. [Google Scholar] [CrossRef] [Green Version]
  64. Rozas, J.; Ferrer-Mata, A.; Sanchez-DelBarrio, J.C.; Guirao-Rico, S.; Librado, P.; Ramos-Onsins, S.E.; Sanchez-Gracia, A. DnaSP 6: DNA Sequence Polymorphism Analysis of Large Data Sets. Mol. Biol. Evol. 2017, 34, 3299–3302. [Google Scholar] [CrossRef] [PubMed]
  65. Deng, W.; Maust, B.S.; Nickle, D.C.; Learn, G.H.; Liu, Y.; Heath, L.; Kosakovsky Pond, S.L.; Mullins, J.I. DIVEIN: A web server to analyze phylogenies, sequence divergence, diversity, and informative sites. Biotechniques 2010, 48, 405–408. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The ergot alkaloid biosynthetic pathway in Claviceps spp. Modified from Young et al. [13] and Robinson and Panaccione [15].
Figure 2. Schematic arrangement of the multiple copies of easE, easF, and dmaW in relation to the primary cluster of other eas genes. The dark solid bars denote contigs, and the gray boxes represent genes, labeled accordingly, with the ranges underneath. The lengths of genes and spaces are approximately to scale. The dashed bars and the genes on them indicate that those genes are on the same contig; however, the details are not displayed. (A–E) represent different patterns of locations (see Section 2.2.1 for details).
Figure 3. (a) The hypothetical species relationships of Claviceps spp. inferred from orthologous genes by Wyka et al. [44]. (b–d) Variant species relationships in sect. Pusillae summarized from the phylogenies inferred from the individual eas gene trees (Supplementary Figures S1 and S2). The thickened branches denote bootstrap values >80%. The letters next to thick branches denote the genes supporting the grouping, abbreviated as A, CH1 = easA, easCH1; cl = cloA; W1 = dmaW1. Dashed branches indicate that the taxon was present on the gene trees listed after the species name. lpsC and lpsB are not listed here because only one or three sequences were available on the trees. DNA sequences of C. fusiformis and C. paspali were from GenBank EU006773 and JN613321.
Figure 4. (a–e) Varied species relationships in sect. Claviceps summarized from phylogenetic trees of eas and ltm genes by PhyML analyses (the full trees are provided in Supplementary Figures S1 and S2). The thick branches denote bootstrap values >80%. The letters beside the thick branches indicate that those genes had strong support for those branches; otherwise, all genes listed below the figure had strong support.
Figure 5. Simplified phylogenies of individual multicopy genes showing potential duplication events. The unedited trees generated by PhyML are presented in Supplementary Figure S2. (a) dmaW, (b) easF, (c) easE, (d) easH, (e) ltmB, and (f) ltmQ. The thickened branches indicate bootstrap values ≥80%; dashed and hatched branches are drawn shorter than their real length. The lineages that are not shaded gray are the first copies of each gene.
Figure 6. Nucleotide diversity and tree-based diversity and divergence for individual eas genes (a) and idt/ltm genes (b). Error bars denote the standard deviation for Pi and standard error for the other two parameters. The genes are arranged from top to bottom according to their order in the biosynthetic pathway. ltmS is not included in the chart as its function is unknown.
Table 1. Statistics of genome assemblies screened.

| Species | Strain | BioSample | WGS # | Contigs | Total Length (bp) | Largest Contig (bp) | N50 (bp) | L50 | GC (%) | Coverage (x) | Complete BUSCOs (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| C. arundinis | CCC1102 | SAMN11159893 | JAIUSP000000000 | 1406 | 29,878,863 | 375,533 | 55,909 | 156 | 51.42 | 61x | 98.1 |
| C. capensis | CCC1504 | SAMN11159898 | JAIUSL000000000 | 1497 | 27,462,555 | 202,637 | 39,758 | 198 | 51.69 | 66x | 98.8 |
| C. cyperi | CCC1219 | SAMN11159895 | JAIUSO000000000 | 2467 | 26,149,012 | 130,032 | 19,946 | 386 | 51.72 | 56x | 98.3 |
| C. monticola | CCC1483 | SAMN11159896 | JAIUSN000000000 | 1787 | 27,131,110 | 129,905 | 29,639 | 279 | 51.6 | 58x | 98.8 |
| C. occidentalis | LM77 | SAMN11159879 | JAIURJ000000000 | 2285 | 28,557,246 | 118,746 | 21,556 | 410 | 51.37 | 58x | 97.8 |
| C. occidentalis | LM84 | SAMN11159876 | JAIURI000000000 | 2119 | 28,639,296 | 133,797 | 23,641 | 389 | 51.39 | 164x | 97.0 |
| C. pazoutovae | CCC1485 | SAMN11159897 | JAIUSM000000000 | 1619 | 27,544,752 | 151,196 | 35,477 | 229 | 51.7 | 61x | 98.3 |
| C. purpurea s.s. | Clav04 | SAMN11159846 | JAIUSJ000000000 | 2581 | 30,594,081 | 349,533 | 29,378 | 296 | 51.69 | 46x | 98.8 |
| C. purpurea s.s. | Clav26 | SAMN11159847 | JAIUSI000000000 | 1821 | 30,253,558 | 299,368 | 36,369 | 242 | 51.48 | 59x | 98.8 |
| C. purpurea s.s. | Clav46 | SAMN11159848 | JAIUSG000000000 | 1887 | 30,292,940 | 231,314 | 36,582 | 246 | 51.08 | 58x | 99.1 |
| C. purpurea s.s. | Clav52 | SAMN11159849 | JAIUSE000000000 | 1714 | 29,291,845 | 175,165 | 35,956 | 250 | 51.42 | 60x | 98.9 |
| C. purpurea s.s. | Clav55 | SAMN11159850 | JAIUSD000000000 | 2023 | 30,195,775 | 203,523 | 33,461 | 261 | 51.55 | 59x | 98.4 |
| C. purpurea s.s. | LM14 | SAMN11159853 | JAIUSC000000000 | 1888 | 30,259,282 | 163,532 | 32,812 | 268 | 51.74 | 49x | 97.9 |
| C. purpurea s.s. | LM207 | SAMN11159861 | JAIUSB000000000 | 1910 | 30,165,540 | 260,847 | 31,428 | 273 | 51.74 | 53x | 98.7 |
| C. purpurea s.s. | LM223 | SAMN11159862 | JAIURY000000000 | 1894 | 30,223,423 | 195,661 | 31,693 | 291 | 51.73 | 74x | 98.4 |
| C. purpurea s.s. | LM232 | SAMN11159863 | JAIURX000000000 | 1911 | 30,304,653 | 216,996 | 33,376 | 265 | 51.73 | 53x | 98.8 |
| C. purpurea s.s. | LM233 | SAMN11159864 | JAIURW000000000 | 1928 | 30,249,987 | 183,378 | 33,023 | 273 | 51.74 | 49x | 98.3 |
| C. purpurea s.s. | LM30 | SAMN11159855 | JAIURV000000000 | 1816 | 30,203,936 | 160,353 | 35,005 | 265 | 51.75 | 64x | 98.4 |
| C. purpurea s.s. | LM33 | SAMN11159856 | JAIURU000000000 | 2011 | 30,162,301 | 157,176 | 28,954 | 306 | 51.75 | 45x | 97.9 |
| C. purpurea s.s. | LM39 | SAMN11159857 | JAIURT000000000 | 1797 | 30,183,718 | 168,047 | 34,902 | 258 | 51.75 | 81x | 98.3 |
| C. purpurea s.s. | LM4 | SAMN11159851 | JAIURS000000000 | 1866 | 30,197,808 | 200,831 | 31,054 | 281 | 51.74 | 64x | 98.3 |
| C. purpurea s.s. | LM46 | SAMN11159858 | JAIURR000000000 | 1842 | 30,109,785 | 205,399 | 32,503 | 270 | 51.76 | 79x | 98.6 |
| C. purpurea s.s. | LM461 | SAMN11159865 | JAIURQ000000000 | 2041 | 30,157,824 | 190,247 | 28,928 | 307 | 51.74 | 37x | 97.7 |
| C. purpurea s.s. | LM469 | SAMN11159866 | JAIURP000000000 | 1836 | 30,218,091 | 199,880 | 34,408 | 269 | 51.74 | 75x | 98.3 |
| C. purpurea s.s. | LM470 | SAMN11159867 | JAIURO000000000 | 2482 | 30,086,038 | 123,014 | 23,231 | 384 | 51.75 | 26x | 97.9 |
| C. purpurea s.s. | LM474 | SAMN11159868 | JAIURN000000000 | 1917 | 30,149,711 | 232,504 | 30,855 | 283 | 51.75 | 64x | 98 |
| C. purpurea s.s. | LM5 | SAMN11159852 | JAIURM000000000 | 1817 | 30,171,863 | 188,144 | 34,174 | 271 | 51.74 | 67x | 98.4 |
| C. purpurea s.s. | LM60 | SAMN11159859 | JAIURL000000000 | 1871 | 30,274,458 | 180,242 | 31,977 | 275 | 51.73 | 81x | 98.6 |
| C. purpurea s.s. | LM63 | SAMN20436330 | JAIUSS000000000 | 1674 | 30,276,205 | 210,630 | 40,954 | 218 | 51.79 | 68x | 98.4 |
| C. purpurea s.s. | LM65 | SAMN20436331 | JAIUSR000000000 | 1822 | 30,277,382 | 206,609 | 37,976 | 241 | 51.78 | 71x | 98.4 |
| C. purpurea s.s. | LM71 | SAMN11159860 | JAIURK000000000 | 1919 | 30,241,564 | 172,997 | 32,324 | 282 | 51.76 | 168x | 98 |
| C. purpurea s.s. | LM72 | SAMN20436332 | JAIUSQ000000000 | 1986 | 30,160,156 | 282,506 | 36,805 | 249 | 51.81 | 63x | 98.4 |
| C. quebecensis | Clav32 | SAMN11159882 | JAIUSH000000000 | 1362 | 28,435,427 | 248,888 | 42,252 | 192 | 51.61 | 64x | 99.0 |
| C. quebecensis | Clav50 | SAMN11159881 | JAIUSF000000000 | 1404 | 28,499,699 | 294,425 | 47,797 | 178 | 51.6 | 59x | 98.8 |
| C. ripicola | LM219 | SAMN11159874 | JAIUSA000000000 | 1847 | 30,428,256 | 154,690 | 34,898 | 254 | 51.39 | 55x | 97.1 |
| C. ripicola c.f. | LM220 | SAMN11159873 | JAIURZ000000000 | 1662 | 30,409,961 | 205,881 | 43,971 | 211 | 51.43 | 91x | 97.7 |
| C. spartinae | CCC535 | SAMN11159888 | JAIUSK000000000 | 2017 | 28,974,645 | 142,723 | 28,332 | 300 | 51.38 | 60x | 98.1 |
| Assemblies from previous studies | | | | | | | | | | | |
| C. arundinis | CCC1102 | SAMN11159893 | SRPS01 | 1406 | 29,878,863 | 375,533 | 55,909 | 156 | 51.42 | 61x | 97.7 * |
| C. africana | CCC489 | SAMN11159887 | SRPY01 | 5329 | 31,933,801 | 98,049 | 12,225 | 752 | 44.68 | 56x | 95 * |
| C. arundinis | LM583 | SAMN08798359 | QEQZ01 | 1613 | 30,055,381 | 164,904 | 39,306 | 223 | 51.42 | 69x | 96.9 |
| C. citrina | CCC265 | SAMN11159885 | SRQA01 | 4830 | 25,056,896 | 81,802 | 8747 | 871 | 47.57 | 64x | 92.2 * |
| C. digitariae | CCC659 | SAMN11159892 | SRPT01 | 3821 | 31,170,596 | 116,859 | 16,077 | 572 | 45.5 | 57x | 95.9 * |
| C. humidiphila | LM576 | SAMN08798355 | QERB01 | 1831 | 30,488,243 | 190,085 | 34,787 | 261 | 51.51 | 77x | 97.9 |
| C. lovelessii | CCC647 | SAMN11159891 | SRPU01 | 8201 | 34,575,813 | 65,439 | 5747 | 1781 | 43.61 | 53x | 91.6 * |
| C. maximensis | CCC398 | SAMN11159886 | SRPZ01 | 2317 | 29,114,417 | 192,851 | 37,101 | 230 | 46.66 | 58x | 98.3 * |
| C. occidentalis | LM78 | SAMN08800200 | QEQY01 | 2321 | 28,571,683 | 125,459 | 21,416 | 422 | 51.37 | 64x | 97.3 |
| C. perihumidiphila | LM81 | SAMN08800226 | QEQX01 | 1423 | 30,694,913 | 232,029 | 46,526 | 192 | 51.51 | 40x | 96.9 |
| C. purpurea s.s. | LM28 | SAMN08797627 | QERD01 | 1930 | 30,251,797 | 260,842 | 31,815 | 274 | 51.74 | 49x | 97.9 |
| C. purpurea s.s. | LM582 | SAMN08798357 | QERA01 | 2207 | 30,199,509 | 132,072 | 27,199 | 334 | 51.74 | 89x | 98.6 |
| C. pusilla | CCC602 | SAMN11159889 | SRPW01 | 9171 | 37,319,484 | 83,555 | 5659 | 1917 | 41.84 | 52x | 90.9 * |
| C. quebecensis | LM458 | SAMN08851611 | QEQW01 | 1700 | 35,882,593 | 1,850,351 | 41,784 | 191 | 51.87 | 78x | 98.0 |
| C. ripicola | LM454 | SAMN08798353 | QERC01 | 2108 | 30,692,668 | 189,162 | 28,587 | 314 | 51.37 | 156x | 97.9 |
| C. ripicola | LM218 | SAMN08798202 | QERE01 | 1630 | 30,598,250 | 206,723 | 39,763 | 232 | 51.41 | 46x | 97.6 |
| C. sorghi | CCC632 | SAMN11159890 | SRPV01 | 7206 | 31,897,900 | 112,296 | 6643 | 1389 | 45.24 | 60x | 89.9 * |
* BUSCO completeness for these strains was based on the Dikaryon fungal database; see Wyka et al. [44] for details.
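The N50 and L50 columns in Table 1 follow the usual assembly-statistics convention: sort the contigs from longest to shortest, accumulate their lengths, and stop at the first contig where the running total reaches half of the assembly length. That contig's length is N50, and the number of contigs counted so far is L50. A minimal Python sketch of this convention is given here; the function name `n50_l50` and the toy contig lengths are illustrative assumptions, not the authors' pipeline.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    Conventional definition (assumed here): walk the contigs from longest
    to shortest and stop once the running total reaches half of the
    assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy assembly: five contigs, 30 bp in total.
toy_n50, toy_l50 = n50_l50([2, 10, 6, 8, 4])
print(toy_n50, toy_l50)
```

A larger N50 together with a smaller L50 indicates a more contiguous assembly, which is how the two columns can be read when comparing strains in Table 1.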
Table 2. The eas gene copies and their locations in 18 species in Claviceps sect. Claviceps and sect. Pusillae.
Section | Organism | Assembly * | Sample | lpsC | easA | lpsB | cloA | easC | easD | easE1 | easE2 | easF1 | easF2 | easF3 | easG | dmaW1 | dmaW2 | dmaW3 | easH1 | easH2
ClavicepsC. arundinisWFCCC1102418/411411411411411411411 411273 411411/583273707583
SWCCC1102 157157157157157157 15773 15715773305157458
BWLM583 455455455455455455 455187 455/805805187 805822
C. capensisWFCCC1504173173173173173173173 173 1731354 347
C. cyperiWFCCC1219277277277277277277277 277 277277/20372094/696 525
C. humidiphilaBWLM576599259259259259259259 259390 259259390 259701
C. monticolaWFCCC1483367367367367367367367986367986 367367/1745986/966 494
C. occidentalisWFLM77202202202202202202202 202 2021262702 18871693
BWLM78192192192192192192192 192 1921273722 18711675
WFLM84290290290290290290290 290/1715 17151340721 17791618
C. pazoutovaeWFCCC1485307307307307307307307 307767 307307/1479767/622 430
C. perihumidiphilaBWLM81 114114114114114114 114 114/604604359 604710
C. purpureaWFClav52131131131131131131131739131/1395739 13951395/923879 923835
WFClav04717171717171711407711514/2293 7171/9932459/594 993874
WFClav26105105105105105105105 105816/1596 105105/14531596 759836
WFClav462012012012012012012011255201/13581255/1668 13581358/9281481/1294 9281043
WFClav554194194194194194194191226419/14161226/1797 14161416/13161797/1399 13161168
WFLM1485858585858585 85 8585/6991391/539 699716
WFLM207374374374374374374374 374 374374/11691625/1506 11691183
WFLM223444444444444444444444 444/15561027/1718 15561556/7831460 783762
WFLM232112112112112112112112 112 1121121843 1148908
WFLM23389898989898989 89 89/658658533 658692
BWLM28126126126126126126126 126 1261261874/563 1261208
WFLM3088888888888888 88 88/10221022610 10221180
WFLM33106106106106106106106 106 106106/13431781/649 13431210
WFLM39100100100100100100100855100/1391855/1766134013911391/7531340/546 753703
WFLM4152152152152152152152 152 152152/15701797 857726
WFLM46209209209209209209209 209 2092091386 2091025
WFLM461121121121121121121/14431443100614431006164310061006/15051846/1490 15051402
WFLM4699191919191919198091/1419980/1609172714191419/10181727/419 1018680
WFLM4702942942942942942942942080294/14282080/2165 142814282358 949736
WFLM474158158158158158158158829158/1493829/1365 14931493/7881782/41 788719
WFLM5121121121121121121121 121 121121/12171546/679 12171198
BWLM58298989898989898 98 98981036 800
WFLM60333333333333333333333 333 333/17911790/12661480/676 12661242
WFLM71159159159159159159159 159 655655931 655727
WFLM6316016016016016016016011781601178985621/160621985 621918
WFLM6521212121212121 2110182542121254 21
WFLM7219019019019019019019011561901156 55912 5
C. quebecensisWFClav32308308308308308308308753308/201753/730 201201730 201
WFClav50227227227227227227227731227/231731/679 231231679 231
BWLM45816516516516516516516514461651446/1061 4984981061 498
C. ripicolaBWLM21881818181818181 81 8181527 81820
WFLM21978787878787878 78 7878533 78
WFLM220777777588588588588 588 58858895 588285
BWLM454 120120120623623623 623 623623107 623
C. spartinaeWFCCC5351156/13758538538536806806802768027 68068027 680
PusillaeC. africanaSWCCC489 424 424424424 424 424424 424/1076
C. digitariaeSWCCC659 403 403
C. lovelessiiSWCCC647 1632163228852885/4933556/4933556 556 556556 143
C. maximensisWFCCC398
C. pusillaSWCCC602 3688143536883968/389139683968/3180 31801918 1809
C. sorghiSWCCC632 2692 24752475186186 186 186186 3370
* Assembly versions: BW is from Wingfield et al. [45], SW is from Wyka et al. [44], and WF was generated in the present study. Values in the cells denote contig numbers; when a fragment spans two contigs, the second contig number follows a slash (/). Green represents full-length genes, light orange represents partial or gapped sequences, and no fill represents no gene matches; hatching denotes fragments containing frameshifts or internal stop codons. None of the genes were detected in C. citrina (C. sect. Citrinae, not listed).
Table 3. The idt/ltm gene copies and their locations in C. sect. Claviceps and sect. Citrinae.
Section | Organism | Assembly * | ltmQ1 | ltmQ2 | ltmP | ltmB1 | ltmB2 | ltmC | ltmS | ltmM | ltmG
CitrinaeC. citrinaWFCCC265 1947 1947 5822211
ClavicepsC. arundinisWFCCC110250 5050332505050
BWLM583158 158158124158158158
C. capensisWFCCC150429 2929 292929
C. cyperiWFCCC1219 2525 25 25
C. humidiphilaBWLM576945 945478 745745745
C. monticolaWFCCC1483568 568591 591591591
C. occidentalisWFLM771456189814561538 15381538657
BWLM789851877985985 985985691
WFLM843761789376376 376376376
C. pazoutovaeWFCCC1485225 225185 185185185
C. perihumidiphilaBWLM8127 27277272727
C. purpureaWFClav52174 174174 1741741230
WFClav04130637130130 130130130
WFClav26116 116116 116116116
WFClav46432294343 434343
WFClav551358/1838/128614441286557 557557557
WFLM14243 243243 243243243
WFLM207255 255255 255255255
WFLM223327 327327 327327327
WFLM232315 315315 315315315
WFLM2337 77 777
BWLM28258 258258 258258258
WFLM3087 8787 878787
WFLM3351 5151 515151
WFLM39192 192192 192192192
WFLM4361 361361 36112201220
WFLM4629 2929 292929
WFLM461529/651592965965 9659651500
WFLM46937 3737 373737
WFLM470646 787787 787787787
WFLM474243 243243 243243243
WFLM517 1717 171717
BWLM582112 112112 112112112
WFLM60765 765765 765765440
WFLM71393 1283977 977977977
WFLM63433 433433 433433433
WFLM65406 406406 406406406
WFLM72549/1151 1151361 361361361
C. quebecensisWFClav3256 5656 565656
WFClav5091 9191 919191
BWLM458536 536/1563475 475475475
C. ripicolaBWLM218191 191191136191191191
WFLM219395 395638589638638638
WFLM220368 368368591368368368
BWLM454138 138764949764764764
C. spartinaeWFCCC535225 2252254722512121212
* Assembly versions: BW is from Wingfield et al. [45], SW is from Wyka et al. [44], and WF was generated in the present study. Values in the cells denote contig numbers; two values connected by a slash (/) indicate that the fragment spans two contigs. Green represents full-length genes, light orange represents partial or gapped sequences, and no fill represents no gene matches; hatching denotes fragments containing frameshifts or internal stop codons. None of the idt/ltm genes were detected in C. sect. Pusillae, except for two short fragments of ltmG from C. maximensis CCC398 and C. digitariae CCC659 found by a low-stringency search, which are not listed (see also Section 2.2.2).
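Table 4. Nucleotide polymorphism, tree-based divergence, and diversity of ergot alkaloid (eas) and indole-diterpene/lolitrem (idt/ltm) synthesis genes in C. purpurea.

| Biosynthesis Genes | # of Sequences ¹ (N) | Total # of Sites | # of Sites, Excluding Indels (n) | Segregating Sites (s) | Ratio (s/n) | # of Haplotypes (h) | Haplotype (Gene) Diversity (Hd) | Nucleotide Diversity (Pi) | Pi Std Dev | Average Number of Nucleotide Differences (K) | Tree-Based Divergence from COT ², Mean | Divergence Std Error | Tree-Based Diversity, Mean | Diversity Std Error |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ergot Alkaloid (eas) Genes | | | | | | | | | | | | | | |
| dmaW | 35 | 1516 | 921 | 196 | 0.213 | 32 | 0.995 | 0.055 | 0.004 | 50.523 | 0.062 | 0.004 | 0.086 | 0.002 |
| dmaW1 | 21 | 1480 | 938 | 154 | 0.164 | 19 | 0.99 | 0.030 | 0.006 | 27.886 | 0.025 | 0.003 | 0.038 | 0.002 |
| dmaW2 | 14 | 1516 | 923 | 102 | 0.111 | 13 | 0.989 | 0.042 | 0.004 | 39.11 | 0.042 | 0.008 | 0.065 | 0.004 |
| easF | 32 | 1232 | 630 | 121 | 0.192 | 24 | 0.972 | 0.058 | 0.015 | 36.548 | 0.049 | 0.011 | 0.081 | 0.003 |
| easF1 | 25 | 1232 | 642 | 40 | 0.062 | 17 | 0.953 | 0.016 | 0.001 | 10.54 | 0.017 | 0.002 | 0.029 | 0.001 |
| easF2 | 8 | 1232 | 634 | 101 | 0.159 | 8 | 1 | 0.048 | 0.020 | 30.464 | 0.031 | 0.015 | 0.055 | 0.010 |
| easE | 34 | 2283 | 1607 | 577 | 0.359 | 28 | 0.979 | 0.085 | 0.021 | 135.938 | 0.085 | 0.028 | 0.147 | 0.008 |
| easE1 | 28 | 1921 | 1891 | 112 | 0.059 | 22 | 0.968 | 0.013 | 0.001 | 24.526 | 0.013 | 0.001 | 0.020 | 0.000 |
| easE2 | 7 | 2283 | 1614 | 536 | 0.332 | 7 | 1 | 0.132 | 0.034 | 212.238 | 0.115 | 0.042 | 0.218 | 0.032 |
| easC | 28 | 1508 | 1503 | 89 | 0.059 | 23 | 0.979 | 0.013 | 0.001 | 10.034 | 0.010 | 0.001 | 0.017 | 0.000 |
| easD | 28 | 851 | 846 | 37 | 0.044 | 22 | 0.976 | 0.008 | 0.001 | 7.1510 | 0.006 | 0.001 | 0.010 | 0.000 |
| easA | 28 | 1143 | 1143 | 51 | 0.045 | 21 | 0.966 | 0.009 | 0.001 | 10.286 | 0.006 | 0.000 | 0.011 | 0.000 |
| easG | 28 | 1151 | 923 | 57 | 0.062 | 20 | 0.974 | 0.010 | 0.001 | 9.6720 | 0.014 | 0.001 | 0.023 | 0.001 |
| cloA | 28 | 2754 | 2110 | 173 | 0.082 | 23 | 0.979 | 0.017 | 0.001 | 35.796 | 0.016 | 0.001 | 0.030 | 0.000 |
| easH | 51 | 1195 | 569 | 247 | 0.434 | 23 | 0.936 | 0.132 | 0.014 | 75.151 | 0.102 | 0.013 | 0.154 | 0.004 |
| easH1 | 28 | 947 | 943 | 64 | 0.068 | 15 | 0.944 | 0.012 | 0.004 | 11.053 | 0.010 | 0.003 | 0.014 | 0.001 |
| easH2 | 23 | 1195 | 571 | 232 | 0.406 | 15 | 0.858 | 0.168 | 0.017 | 95.85 | 0.150 | 0.033 | 0.188 | 0.010 |
| lpsB | 28 | 4006 | 3961 | 255 | 0.064 | 23 | 0.979 | 0.012 | 0.001 | 48.484 | 0.010 | 0.001 | 0.017 | 0.000 |
| lpsC | 27 | 5431 | 5416 | 421 | 0.078 | 25 | 0.994 | 0.017 | 0.001 | 92.379 | 0.017 | 0.001 | 0.029 | 0.001 |
| Indole-Diterpene/Lolitrem (idt/ltm) Genes | | | | | | | | | | | | | | |
| ltmC | 28 | 1249 | 1246 | 60 | 0.048 | 25 | 0.992 | 0.008 | 0.000 | 10.299 | 0.007 | 0.001 | 0.012 | 0.000 |
| ltmB1 | 28 | 871 | 868 | 38 | 0.044 | 21 | 0.971 | 0.014 | 0.001 | 12.516 | 0.066 | 0.010 | 0.024 | 0.001 |
| ltmM | 28 | 1766 | 1731 | 74 | 0.043 | 25 | 0.992 | 0.007 | 0.001 | 12.68 | 0.005 | 0.000 | 0.009 | 0.000 |
| ltmQ | 25 | 2180 | 2119 | 293 | 0.138 | 24 | 0.997 | 0.019 | 0.006 | 41 | 0.026 | 0.010 | 0.040 | 0.004 |
| ltmQ1 | 24 | 2180 | 2119 | 292 | 0.138 | 23 | 0.996 | 0.020 | 0.006 | 41.486 | 0.025 | 0.007 | 0.034 | 0.003 |
| ltmP | 28 | 1949 | 1895 | 111 | 0.059 | 24 | 0.981 | 0.010 | 0.001 | 19.479 | 0.008 | 0.000 | 0.014 | 0.000 |
| ltmS | 28 | 955 | 924 | 34 | 0.037 | 24 | 0.989 | 0.007 | 0.001 | 6.839 | 0.008 | 0.001 | 0.015 | 0.000 |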
¹ Sequences with large gaps that would significantly reduce the number of sites were excluded from the analyses. ² Tree-based divergence from the center of the tree (COT) and tree-based diversity were estimated with DIVEIN; the other parameters were estimated with DnaSP.
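The DnaSP-derived columns of Table 4 are linked by simple arithmetic: s/n is the number of segregating sites divided by the number of sites compared (indel columns excluded), K is the mean number of differences over all pairs of sequences, and Pi is K divided by the number of sites. For dmaW, for example, 50.523/921 ≈ 0.055, which matches the reported Pi. The Python sketch here computes these uncorrected quantities for a toy alignment. The function name, the input sequences, and the rule of dropping every column that contains a gap are assumptions made for illustration; the sketch does not reproduce DnaSP's options or the authors' exact settings.

```python
from itertools import combinations


def polymorphism_summary(aligned_seqs):
    """Uncorrected s, s/n, K and Pi for equal-length aligned sequences.

    Assumption for this sketch: any alignment column containing a gap
    character ('-') is excluded before counting.
    """
    if len(aligned_seqs) < 2:
        raise ValueError("at least two sequences are required")
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # Keep only gap-free columns (sites excluding indels).
    columns = [col for col in zip(*aligned_seqs) if "-" not in col]
    n_sites = len(columns)
    if n_sites == 0:
        raise ValueError("no gap-free columns to compare")

    # A site is segregating when more than one base is observed in it.
    seg_sites = sum(1 for col in columns if len(set(col)) > 1)

    # Rebuild gap-free sequences and average the pairwise differences.
    cleaned = ["".join(bases) for bases in zip(*columns)]
    pairs = list(combinations(cleaned, 2))
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    k = total_diffs / len(pairs)

    return {
        "n": n_sites,
        "s": seg_sites,
        "s_over_n": seg_sites / n_sites,
        "K": k,
        "Pi": k / n_sites,
    }


# Hypothetical toy alignment: four sequences, six columns, one gapped column.
summary = polymorphism_summary(["ACGT-A", "ACGTTA", "ATGTTA", "ATGATA"])
print(summary)
```

Reading Table 4 with these relations in mind makes the contrast between gene copies easier to follow, for example the much higher s/n and Pi of easE2 and easH2 compared with easE1 and easH1.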
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Liu, M.; Findlay, W.; Dettman, J.; Wyka, S.A.; Broders, K.; Shoukouhi, P.; Dadej, K.; Kolařík, M.; Basnyat, A.; Menzies, J.G. Mining Indole Alkaloid Synthesis Gene Clusters from Genomes of 53 Claviceps Strains Revealed Redundant Gene Copies and an Approximate Evolutionary Hourglass Model. Toxins 2021, 13, 799. https://doi.org/10.3390/toxins13110799
