Trichoderma: Population Structure and Genetic Diversity of Species with High Potential for Biocontrol and Biofertilizer Applications

Abstract: Certain Trichoderma isolates provide biofertilizer, biocontrol, and other plant-beneficial activities while inhabiting the soil or internal plant tissue, and their use in agricultural systems can contribute to sustainable food production. It is thought that colonization of soil or internal plant tissue is fundamental for biocontrol and biofertilizer applications. Our collective analyses of prior surveys, where the tef1α sequence was almost exclusively used to identify Trichoderma species, showed that isolates from the Harzianum complex clade, the T. asperellum/T. asperelloides group, T. virens, T. hamatum, and T. atroviride were prevalent in soil and/or as endophytes. Population structure and genetic diversity based on the genetic markers tef1α, rpb2, and ITS were investigated, and new lineages with statistical bootstrap support within T. atroviride, T. asperellum, T. hamatum, and T. virens populations were found. The nearest relatives of some of these species were also revealed. Choosing isolates from among more than 500 known Trichoderma species for use in non-targeted evaluation screens for biocontrol or biofertilizer applications is time-consuming and expensive. Preferentially selecting isolates from T. atroviride, T. asperellum/T. asperelloides, T. hamatum, the T. harzianum complex clade, T. virens, and possibly their nearest relatives may speed the identification of candidates for commercialization due to the demonstrated ability of these species to successfully inhabit the soil and internal plant tissue. To our knowledge, this is the first report where dominant soil and endophytic Trichoderma species were identified from past survey data and population structure and genetic diversity analyses were conducted.


Introduction
Alternatives to synthetic fertilizers and pesticides must be considered in agricultural production systems if global food production is to be increased in a sustainable manner. Trichoderma spp. (kingdom: Fungi; division: Ascomycota; family: Hypocreaceae) are promising alternatives to synthetic fertilizers and pesticides due to their demonstrated commercial successes and desirable traits such as direct and/or indirect negative effects on many plant pathogens, nematodes, and insects; multiple capabilities for crop protection in a single product due to this broad-spectrum activity against plant pathogens and pests; protection of the plant against abiotic stressors; stimulation of plant growth; and improvement in soil nutrient availability to plants. Certain Trichoderma isolates are also dominant in soil or establish endophytic relationships with plants, traits thought to be fundamental for providing these plant-beneficial activities [1,2].
Unfortunately, the genus Trichoderma is taxonomically complex, containing more than 500 species, with its taxonomy still evolving due to the use of molecular taxonomic approaches [2]. To aid in selecting Trichoderma isolates for commercial development as biocontrol agents or biofertilizers (BCBFs), it would be helpful to narrow the more than 500 species of Trichoderma to a few known to have commercially desired attributes [3,4]. Therefore, we narrowed the number of species to be considered for commercial development by surveying the literature to identify Trichoderma species that were prevalent soil inhabitants and/or plant endophytes. We then used translation elongation factor 1α (tef1α), RNA polymerase subunit II (rpb2), and the ribosomal internal transcribed spacer (ITS) sequences to assess genetic diversity and population structure within each of those species. The purpose of these phylogenetic analyses was to reveal lineages that may qualify as new species for consideration as BCBFs and to identify lineages that may have useful adaptations to specific geographic regions.

Trichoderma in Soil and Endophytes
Data from 23 publications on surveys of Trichoderma in soil from different regions of five continents were collected to determine the dominant species in soil (Table 1). Also, data from 13 publications on surveys of endophytic Trichoderma from five continents were used (Table 2). These investigations were chosen based primarily on the reliability of the methods used in species identification, specifically using tef1α sequence data. The only two exceptions where tef1α was not used are indicated in Table 2. Surveys where the tef1α sequence was employed were selected since the use of the tef1α sequence is a powerful means for identification of Trichoderma to the species level [5,6] and it is the most prevalent locus reported in the literature for identification of Trichoderma species. Trichoderma species or species groups were considered most prevalent in soil if they were detected in at least 50% of the soil surveys and represented at least 5% of all isolates collected from the surveys.

a Compilation of surveys published in the past 20 years that used molecular phylogeny for Trichoderma species identification. Surveys were specific for Trichoderma isolates, and all species listed in column headings are from the genus Trichoderma. Complete isolate information from these surveys is in Supplementary Materials, Table S3.
b Total number of Trichoderma isolates (of all species) identified in this study.
c Isolates from T. asperellum and T. asperelloides are grouped together because T. asperelloides is often misidentified as T. asperellum due to highly similar DNA sequences and identical morphology. Additionally, many strains of T. asperelloides are incorrectly deposited in GenBank as T. asperellum.
d Harzianum complex clade. Isolates from species within the Harzianum complex clade are not broken down into individual species as isolates are often misidentified as T. harzianum and deposited in GenBank as T. harzianum.
e Reference for the information in this row.
g Totals for information in each respective column.
h Frequency of detection of this species in the different studies collectively: (number of studies where this species was isolated)/(total number of studies) × 100.
i N/A, not applicable.
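The two prevalence criteria and the detection-frequency formula in footnote h can be written out as a short calculation. The following is only an illustrative sketch: the function names and the counts for "species A" to "species C" are hypothetical and are not taken from Table 1.

```python
def detection_frequency(n_surveys_detected: int, n_surveys_total: int) -> float:
    """Percentage of surveys in which a species (or species group) was isolated."""
    return n_surveys_detected / n_surveys_total * 100


def is_prevalent(n_surveys_detected: int, n_surveys_total: int,
                 n_isolates: int, n_isolates_total: int,
                 min_frequency: float = 50.0, min_share: float = 5.0) -> bool:
    """Apply both criteria: detected in >= 50% of surveys AND >= 5% of all isolates."""
    frequency = detection_frequency(n_surveys_detected, n_surveys_total)
    share = n_isolates / n_isolates_total * 100
    return frequency >= min_frequency and share >= min_share


# Hypothetical counts for illustration only (not values from Table 1):
# name -> (number of surveys where detected, number of isolates)
surveys_total = 20
isolates_total = 1000
examples = {
    "species A": (15, 120),  # frequent and abundant
    "species B": (12, 30),   # frequent but only 3% of isolates
    "species C": (4, 200),   # abundant but found in only 20% of surveys
}

results = {
    name: is_prevalent(detected, surveys_total, isolates, isolates_total)
    for name, (detected, isolates) in examples.items()
}
```

Requiring both criteria guards against two kinds of bias: a species recovered in large numbers from a single survey, and a species that is widespread but always rare.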

Harzianum Complex Clade Species
Literature was compiled on Trichoderma species from the Harzianum complex clade [3,42] using GenBank hits as the metric for intensity of study for individual species. Species with 20 hits or higher in GenBank are tabulated in Table 3. It was assumed that each GenBank submission represented a different Harzianum complex clade strain.

a Compilation of species from the Harzianum complex clade with 20 or more GenBank hits when using the species names in the search. GenBank hits was the metric used to indicate the degree to which a species has been studied.
b All species listed are from the genus Trichoderma.
c Number of sequence deposits from isolates of this species in GenBank.
d Literature describing this species.
e N/A, not applicable. For T. harzianum, the number of hits is not accurate, as newly classified species from the Harzianum complex clade were previously deposited as T. harzianum.

Phylogenetic Analysis
Population structure and genetic diversity of T. atroviride, T. asperelloides, T. asperellum, T. hamatum, and T. virens were inferred from DNA sequences of tef1α, rpb2, and ITS. These three loci have been suggested for phylogenetic analysis of Trichoderma for the purpose of species identification and the description of new species [3]. For the analyses, the tef1α sequences of the type or ex-type strains of T. atroviride (GenBank accession: AY376051), T. asperellum (GenBank accession: AY376058), T. asperelloides (GenBank accession: GU198294), T. hamatum (GenBank accession: AF456911), and T. virens (GenBank accession: AY750891) were obtained from GenBank. Each tef1α sequence was separately subjected to a Basic Local Alignment Search Tool (BLAST) search at the NCBI website, and the first 100 hits were downloaded as an alignment file in FASTA format. Within these 100 sequences, we searched GenBank for strains that also had sequences of rpb2 and ITS. Isolates that had a tef1α sequence and at least one of the rpb2 and ITS sequences were selected, as shown in Supplementary Materials, Table S1. The sequences for each locus were downloaded for each species (listed in Table S1) and aligned using Clustal Omega (https://www.ebi.ac.uk/jdispatcher/msa/clustalospecies, accessed on 14 March 2024) with default settings. Alignments were visually improved using Mesquite software (http://www.mesquiteproject.org, accessed on 20 March 2024), and the ends of the sequences were trimmed. For each species, the alignment files of tef1α, rpb2, and ITS in nexus format were combined using Mesquite software and then used to reconstruct phylogenetic trees using two methods, as follows. (1) Maximum likelihood in MEGA X with the substitution model predetermined using MEGA X [45]. Support for the clades was assessed with 1000 bootstrap replicates. (2) Parsimony criterion in PAUP version 4.0a (http://phylosolutions.com/paup-test/, accessed on 21 February 2024). The most parsimonious tree was obtained with a heuristic search with starting trees obtained via random stepwise addition (100 replicates) and with TBR as the branch-swapping algorithm. Support for branches was assessed with 1000 bootstrap replicates.
Phylogenetic analyses for the Harzianum complex clade were conducted as follows. Sequences of the three loci (tef1α, rpb2, ITS) for two or three strains of the dominant species (Table 3) were obtained from GenBank. Sequences of additional strains that belonged to the Harzianum complex clade from two investigations [36,37] were also included in the analyses. The phylogenetic tree was subsequently constructed using the parsimony method described above. Support for branches was assessed with 1000 bootstrap replicates. The tree was rooted to T. pleurotum and T. pleuroticola; both species are positioned outside the Harzianum complex clade [42].
Phylogenetic trees were also constructed by both methods described above to reveal the nearest phylogenetic relatives of T. atroviride, T. asperellum, T. asperelloides, and T. hamatum based on DNA sequence data of the tef1α, rpb2, and ITS loci. The nearest relatives were chosen based on previous reports of phylogenies [6,46,47]. A total of 29 strains plus an outgroup were used in the analyses. Trees were rooted to the type strain of T. evansii. In all cases, the trees produced by both methods were essentially identical in topology, and thus only one tree is presented.
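The isolate-selection rule (a tef1α sequence plus at least one of rpb2 or ITS) and the combination of per-locus alignments into one matrix can be sketched as follows. This is a minimal illustration of the idea, not a reproduction of what Mesquite does internally; the locus key `tef1a`, the isolate names, the toy sequences, and the use of `?` for a missing locus are all assumptions made for the example.

```python
LOCI = ("tef1a", "rpb2", "ITS")  # fixed concatenation order (ASCII key for tef1-alpha)


def select_isolates(records: dict) -> dict:
    """Keep isolates that have tef1a and at least one of rpb2 or ITS."""
    return {
        isolate: loci
        for isolate, loci in records.items()
        if "tef1a" in loci and ("rpb2" in loci or "ITS" in loci)
    }


def concatenate(records: dict, loci=LOCI, missing: str = "?") -> dict:
    """Join aligned per-locus sequences; pad a locus an isolate lacks with `missing`."""
    lengths = {}
    for seqs in records.values():
        for locus, seq in seqs.items():
            # All sequences of one locus must share the aligned length.
            if lengths.setdefault(locus, len(seq)) != len(seq):
                raise ValueError(f"{locus}: sequences are not aligned to equal length")
    return {
        isolate: "".join(
            seqs.get(locus, missing * lengths.get(locus, 0)) for locus in loci
        )
        for isolate, seqs in records.items()
    }


# Toy, already-aligned sequences (hypothetical isolates, not GenBank data).
records = {
    "iso1": {"tef1a": "ACGT", "rpb2": "GGC", "ITS": "TTAA"},
    "iso2": {"tef1a": "ACGA", "ITS": "TTAG"},   # no rpb2: still selected
    "iso3": {"tef1a": "ACGC"},                  # tef1a only: excluded
    "iso4": {"rpb2": "GGA", "ITS": "TTAC"},     # no tef1a: excluded
}

selected = select_isolates(records)
matrix = concatenate(selected)
```

Padding a missing locus keeps every row of the combined matrix the same length, which tree-building programs generally require.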
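The bootstrap support values reported for the trees come from resampling alignment columns with replacement and asking how often a grouping is recovered. The sketch below shows that resampling step only. The `a_b_closest` criterion is a deliberately simple stand-in for tree inference (it is not what MEGA X or PAUP compute), and the four-sequence alignment is invented for illustration.

```python
import random


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def hamming(a: str, b: str) -> int:
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def a_b_closest(alignment: dict) -> bool:
    """Toy stand-in for tree inference: is the A-B pair closer than any other pairing of A or B?"""
    d_ab = hamming(alignment["A"], alignment["B"])
    others = [name for name in alignment if name not in ("A", "B")]
    return all(
        d_ab < hamming(alignment["A"], alignment[o])
        and d_ab < hamming(alignment["B"], alignment[o])
        for o in others
    )


def support(alignment: dict, grouping_found, n_replicates: int = 1000, seed: int = 0) -> float:
    """Percentage of bootstrap replicates in which the grouping is recovered."""
    rng = random.Random(seed)
    hits = sum(
        grouping_found(bootstrap_replicate(alignment, rng)) for _ in range(n_replicates)
    )
    return 100 * hits / n_replicates


# Invented alignment: A and B differ at one site and both differ strongly from C and D.
toy = {
    "A": "AAAAAAAAAA",
    "B": "AAAAAAAAAT",
    "C": "TTTTTTTTTT",
    "D": "TTTTTTTTTA",
}
ab_support = support(toy, a_b_closest)
```

A grouping backed by many sites survives most resamplings, which is why values above 70% are treated in this study as evidence for a lineage.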

Trichoderma Soil and Endophyte Survey Compilation
Results from 23 Trichoderma-specific soil surveys conducted worldwide over the past 20 years were compiled, where molecular phylogeny was used for identification of Trichoderma isolates to the species level (Table 1). Table 1 shows the dominant species of Trichoderma from these soil surveys, and Supplementary Materials Table S2 shows all the Trichoderma species identified in these soil surveys. Collectively, there were 42 distinct species identified in this compilation, plus the T. asperellum/T. asperelloides group, the Harzianum complex clade, and Trichoderma isolates not identified to the species level. Isolates from the T. asperellum/T. asperelloides group and the Harzianum complex clade were not listed as separate species, as isolates falling within these two species groupings are often misidentified and/or incorrectly deposited within GenBank (Ismaiel, unpublished). Collectively, there were 4709 isolates when considering all species and all species groupings from these surveys. Species or species groups were considered most prevalent if they were detected in at least 50% of the soil surveys (at least 12 of the 23 surveys) and represented at least 5% of all isolates collected from the 23 surveys (at least 235 isolates). The most prevalent species/species groups were the Harzianum complex clade species, the T. asperellum/T. asperelloides group, T. virens, T. hamatum, and T. atroviride. Isolates from these species or species groups ranged from 42% to 100% of all isolates detected in each individual study in Table 1, and collectively they represent 74% of all isolates detected from all studies listed in Table 1. Isolates from these species or species groupings were also found to be dominant in surveys of endophytic Trichoderma conducted worldwide on different plant species. Table 2 shows the most prevalent endophytic Trichoderma species, and Supplementary Materials Table S3 shows all the endophytic species identified in the studies. The total isolates from the most prevalent species were 281 out of 429, representing 66% of the total strains isolated as endophytes. The species from the Harzianum complex clade and species from the T. asperellum/T. asperelloides group represented the top two groups, respectively.
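For reference, the thresholds and the endophyte share quoted above follow from the stated totals by simple arithmetic. The working below uses only the numbers given in the text; the rounding shown is an assumption about how the reported figures were obtained.

```latex
\begin{align*}
\text{minimum number of surveys} &= \lceil 0.50 \times 23 \rceil = \lceil 11.5 \rceil = 12,\\
\text{minimum number of isolates} &= 0.05 \times 4709 = 235.45 \approx 235,\\
\text{endophyte share} &= \frac{281}{429} \times 100 \approx 65.5\% \approx 66\%.
\end{align*}
```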

Population Structure and Genetic Diversity of T. atroviride
The tef1α sequence of the T. atroviride type strain CBS 142.95 was used for a BLAST search, and a total of 100 sequences of T. atroviride strains were obtained, including 11 from Italy, 11 from Canada, 9 from the US, 9 from Poland, and 8 from China. Very low numbers of sequences were from strains isolated in South America, India, and Indonesia. The phylogenetic tree in Figure 1, inferred from 41 strains with DNA sequences of the three loci (tef1α, rpb2, ITS), shows three lineages with high bootstrap values (>70%). One of the lineages (C2) is dominant, containing most of the strains, including the type strain of T. atroviride (AY376051). Another lineage (C1) consisted of five strains from China, indicating biogeographic restriction of these isolates. The other two lineages (C2, C3) are cosmopolitan.

Population Structure and Genetic Diversity of the T. asperellum/asperelloides Species Group
The tef1α sequence of the T. asperellum type strain CBS 433.97 was used for a BLAST search, and a total of 100 sequences were obtained. Most of the isolates were from Asia and South America. Countries represented by the most isolates were India (17 isolates), China (14 isolates), Malaysia (13 isolates), and Brazil (12 isolates). The phylogeny of the 52 analyzed strains of the T. asperellum population, inferred from combined sequence data of tef1α, rpb2, and ITS with the type strain of T. asperelloides as an outgroup (Figure 2), showed the presence of two lineages (C1, C2) with high bootstrap values. The strains in both clades are cosmopolitan and may qualify as new species. This tree contained a major unresolved cluster that included the type strain CBS 433.97, with the tef1α accession number AY376090.

Figure 2 (beginning of caption not recovered). The numbers above the branches are bootstrap values obtained with 1000 bootstrap replicates. Sequences are identified by tef1α GenBank accession number followed by the country of isolation; C1 and C2 refer to lineages with bootstrap values above 70%. The scale bar indicates the number of nucleotide changes. The tree is rooted to the type strain of T. asperelloides. The type strain and bootstrap-supported clades are highlighted in colors.
The tef1α sequence of the T. asperelloides type strain CBS 125398 was used for a BLAST search, and a total of 100 sequences were obtained. Of these, approximately 30% were deposited in GenBank under the wrong identity, mostly as T. asperellum, with a few as T. pseudoasperelloides and T. yunnanense (Ismaiel, unpublished). The most prevalent countries of origin for the isolates of T. asperelloides were Malaysia (21 isolates), Brazil (21 isolates), China (11 isolates), and India (9 isolates). These were also the most prevalent countries for T. asperellum. The phylogenetic tree for 48 strains of the T. asperelloides population based on DNA sequences of three loci (tef1α, rpb2, ITS), with T. yunnanense as the outgroup, is presented in Figure 3. The population of T. asperelloides showed very low diversity. Most isolates had identical sequences at these three loci, and there were no lineages with bootstrap values above 70%, except for the two strains from India (C1).

Population Structure and Genetic Diversity of T. hamatum
The tef1α sequence of the T. hamatum type strain DAOM 167057 (CBS 102160) was used for a BLAST search, and a total of 100 sequences were obtained. The first 100 sequences were from T. hamatum strains originating on five continents (i.e., Africa, North and South America, Asia, Europe), as well as Oceania (New Zealand), showing the cosmopolitan nature of this species. The countries Ethiopia, Italy, and Brazil were highly represented. Forty-seven strains were phylogenetically analyzed based on sequence data of the three loci tef1α, rpb2, and ITS, with the type strain of T. pubescens as the outgroup taxon (Figure 4). The majority of the strains, including the type strain of T. hamatum (AY750893) from Canada, clustered in one large clade (C1) that had high bootstrap values (70%). There are three other smaller but highly supported lineages: C2, C3, and C4. These three lineages may qualify as new cryptic species within the Hamatum clade. Only one of these clades (C4) showed biogeographic restriction, as its three isolates were obtained from the Far East countries of China and South Korea.
Figure 3 (beginning of caption not recovered). Phylogenetic tree of the T. asperelloides population based on the DNA sequences of tef1α, rpb2, and ITS. The tree was produced using parsimony in PAUP. The numbers above the branches are bootstrap values obtained with 1000 bootstrap replicates. Sequences are identified by GenBank accession numbers followed by the country of isolation; C1 refers to lineages with bootstrap support or geographic significance. The tree was rooted to the T. yunnanense type strain from China. The type strain and bootstrap-supported clades are highlighted in colors. The scale bar indicates the number of nucleotide changes.


Genetic Diversity of Harzianum Complex Clade Species
Table 3 lists the most studied species from the Harzianum complex clade using GenBank hits as the metric when searching with species names in the clade. Table 3 also lists geographic regions with references. Species with fewer than 20 GenBank hits were not included. Using this metric, T. afroharzianum, T. lentiforme, T. atrobrunneum, and T. guizhouense were the most studied species, each with over 200 GenBank hits. From Table 3, it is evident that there are some species with worldwide distribution (T. guizhouense, T. afroharzianum) and others that have not been detected in some regions. For example, T. camerunense, T. botryosum, T. pseudopyramidale, and T. afarasin were detected only in Africa.
The phylogeny of species within the Harzianum complex clade segregated the dominant species into two clades. Clade I was highly supported, with a bootstrap value of 91%, whereas Clade II was moderately supported, with a bootstrap value of 66% (Figure 5). Clade I included T. camerunense, T. rifaii, T. harzianum, T. simmonsii, T. endophyticum, T. neotropicale, T. afarasin, T. botryosum, and T. lixii. Clade II included the species T. lentiforme, T. inhamatum, T. afroharzianum, T. atrobrunneum, T. pyramidale, T. pseudopyramidale, and T. guizhouense. The two main clades were not separated based on biogeographic restriction or habitat, as endophytes and soil inhabitants are present in both clades. The tree had five lineages (L1-L5) that were not strongly associated with any of the identified species and could represent new species. The strains representing these lineages were isolated as endophytes from plants in Malaysia and Ethiopia, suggesting a higher chance of finding new species among endophytic strains.

Population Structure and Genetic Diversity of T. virens
The tef1α sequence of the type strain of T. virens (GLI39), with the GenBank accession number GU591800, was BLAST searched, and the first 100 hits were obtained. Within the first 100 hits, T. virens strains were highly represented from China (27 isolates), Brazil (11 isolates), Malaysia (8 isolates), and Hungary (7 isolates). T. virens is truly cosmopolitan, having been isolated from South America, North America, Europe, Africa, and Asia. Africa had the fewest GenBank hits, with only two: one from Cameroon and the other from the Ivory Coast. A phylogenetic tree was constructed for the T. virens population based on DNA sequences of three loci (tef1α, rpb2, and ITS). The tree included 53 strains, with the type strain of T. crassum as an outgroup taxon (Figure 6). The population of T. virens is highly variable compared to the other Trichoderma species analyzed in this study, containing seven lineages (C1-C7) with strong bootstrap values. Three of the lineages, C1, C2, and C3, had biogeographic restrictions, as they had strains only from China, Mexico, and China, respectively. Based on this analysis, the population of T. virens could be split into seven different cryptic species, potentially with different plant-beneficial activities.
Appl. Microbiol. 2024, 4

Nearest-Relative Analysis
The phylogenetic tree resolving relatives of T. atroviride, T. asperellum, T. asperelloides, and T. hamatum is shown in Figure 7. In clade C1 of the resulting tree, the T. atroviride type strain clustered with two strains of T. atroviride B [46] from New Zealand. The other species in clade C1 was T. nordicum from China. These two species in clade C1 were the nearest relatives of T. atroviride and together formed a highly supported clade (C1). The next closest relatives of T. atroviride were the species in clade C2: T. uncinatum, T. paratroviride, and T. obovatum. Clade C2 also contained isolates incorrectly identified in GenBank as T. atroviride (accession numbers KJ634780 and KJ634765), as they clustered with T. paratroviride and T. obovatum, but not with T. atroviride.
Figure 6 (beginning of caption not recovered). Sequences are identified by tef1α GenBank accession numbers followed by the country of origin; C1-C7 refer to lineages with bootstrap support or geographic significance. The type strain and bootstrap-supported clades are highlighted in colors. The tree is rooted to the T. crassum type strain. The scale bar indicates the number of nucleotide changes.

The nearest relative of T. asperellum was T. yunnanense (Figure 7, clade C3), while the nearest relative of T. asperelloides was T. pseudoasperelloides. Clade C3 also contained the sequence of an isolate from India incorrectly identified in GenBank as T. yunnanense, as it did not cluster with the type strain of T. yunnanense. The nearest relatives of T. hamatum (C4) were the two species T. insigne and T. anisohamatum, followed by T. pubescens. The four species formed the highly supported clade C4.
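The misidentified GenBank deposits noted above were detected here by their position in the phylogenetic trees. A much cruder first-pass screen, shown below purely as an illustration, compares each deposit with the type-strain sequences and flags labels that disagree with the closest type. It swaps tree-based clustering for a pairwise p-distance, so it is a simplification and not the method used in this study; the accession labels `X1` to `X3` and all sequences are invented.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-?" and y not in "-?"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_type(seq: str, type_seqs: dict) -> str:
    """Name of the species whose type-strain sequence is closest to `seq`."""
    return min(type_seqs, key=lambda species: p_distance(seq, type_seqs[species]))


def flag_mislabeled(deposits: dict, type_seqs: dict) -> dict:
    """Map accession -> suggested species for deposits whose label disagrees."""
    flagged = {}
    for accession, (label, seq) in deposits.items():
        suggestion = nearest_type(seq, type_seqs)
        if suggestion != label:
            flagged[accession] = suggestion
    return flagged


# Invented, pre-aligned toy sequences (not real tef1-alpha data).
type_seqs = {
    "T. asperellum": "ACGTACGTAC",
    "T. asperelloides": "ACGTTCGTTC",
}
deposits = {
    "X1": ("T. asperellum", "ACGTACGTAC"),     # label agrees with nearest type
    "X2": ("T. asperellum", "ACGTTCGTTC"),     # identical to the T. asperelloides type
    "X3": ("T. asperelloides", "ACGTTCGTTA"),  # one mismatch, still nearest to its label
}
flagged = flag_mislabeled(deposits, type_seqs)
```

Any deposit flagged this way would still need confirmation by multilocus phylogeny, since closely related species can differ at very few sites in a single locus.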

Discussion
The genus Trichoderma was first recognized by Persoon in 1794 [48].However, the taxonomy of the genus remained obscure until 1969, when Rifai [44] proposed nine species or species aggregates based mainly on morphological characteristics of conidiophores and phialides.Approximately 20 years later, Bissett revised Rifai's proposal and replaced the nine aggregate species by formally recognizing five sections comprising 27 species [49][50][51][52].In the late 1990s, molecular identification based on DNA sequence data started and showed inaccuracies in the taxonomy based on morphological characteristics, primarily due to the homoplasy and plasticity of the characteristics [53].As a result, taxonomy based on molecular data of three loci (tef1α, rpb2, and ITS) was adopted and used for the classification and identification of Trichoderma spp.[3].Although the ITS sequence was informative for differentiation at the genus level, it was found to be the least informative of the three loci for species determination [3,6,42].Today, there are about 500 species of Trichoderma based on legitimate names available in MycoBank (https://www.mycobank.org,accessed on 20 February 2024).
A strategy to enhance the successful commercial development of Trichoderma and other microbes for agriculture is to preferentially seek isolates of species that have been demonstrated to have desired traits and/or are adapted to local crops, soils, and farming practices [48,54].To use this strategy to aid in selecting Trichoderma isolates for development as biocontrol agents or biofertilizers (BCBFs), we reduced the more than 500 species of Trichoderma to a few that were prevalent soil inhabitants and/or endophytes, as persistence in soil and within plant tissue is important for microbes to function as BCBFs [1,2].For this, prior Trichoderma-specific soil surveys and endophyte surveys were analyzed.Even though many survey studies are available in the literature, to our knowledge, this is the first study to summarize the results of those studies and reveal the dominant species of Trichoderma in the soil as well as the dominant endophytic species.
This analysis revealed that isolates from T. atroviride, T. hamatum, T. virens, the T. asperellum/asperelloides grouping, and the species in the Harzianum complex clade were prevalent soil inhabitants.Isolates from these species and species groupings were also often detected as endophytes in various plants worldwide.Corroborating our finding, these Trichoderma species have demonstrated importance in commercial products used in several countries.In a prior analysis, 51 of 56 commercial products had at least one of these species as active ingredients [48].Further, in a compilation of biocontrol investigations directed at combating various diseases of crops, 21 of 28 isolates, or 75% of isolates, were from one of the above Trichoderma species or species groupings [55].Further, in an extensive study in China where 1308 strains of Trichoderma were evaluated for disease control and plant promotion using different techniques such as dual plate assay, seed germination, height and weight of plants, and cell wall degrading enzymes, 13 strains were selected as the best candidates, with 12 identified as T. asperellum and 1 as T. afroharzianum.This extensive study led to two species that are also included in our top species for BCBF activities [24].
Each of these species or species groupings has been shown to induce systemic disease resistance in various crop plants, an important trait for biological control [2,56,57].Various BCBF Trichoderma isolates also produce or induce plant growth hormones and volatile compounds, and are involved in promoting the uptake of macro-and micronutrients by crops [58].
Population structure and genetic diversity analyses were performed to further characterize T. atroviride, T. hamatum, T. virens, T. asperellum, and T. asperelloides and to determine whether their populations contained lineages that could represent distinct species that may also have BCBF activity. The multilocus phylogenetic analyses (tef1α, rpb2, and ITS) revealed possible new lineages for T. atroviride, T. asperellum, T. hamatum, T. virens, and the Harzianum complex clade. Some of these lineages may qualify as new species. This is consistent with prior studies [47,59,60], where the authors showed diversity within T. atroviride and T. hamatum. No prior studies have shown diversity within T. virens and T. asperellum, and this study is the first to suggest that the populations of these species could be split into additional species. Unfortunately, a formal description of the lineages as new species was not possible, as we lacked the physical cultures for these putative new species lineages. However, this opens the door for groups who handle large numbers of isolates of one of these species to conduct multilocus phylogeny, possibly find these lineages within their own collections, and describe them. It should also be noted that phylogeny based on the whole-genome sequence (WGS) is not common, and the WGS for a large number of strains may not be available for the diverse population of each species used in this study [61]. However, whole-genome analyses of a few strains may reveal differences in genes or gene groups relative to strains that do not have BCBF activities.
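For groups wishing to repeat a multilocus analysis on their own collections, the data-preparation step can be sketched as follows. This is a minimal illustration only: it assumes that the three markers are combined by concatenating per-locus alignments into a single matrix, which this section does not state explicitly, and the function name, isolate labels, and sequences are all hypothetical.

```python
def concatenate_loci(alignments, locus_order, gap="-"):
    """Join per-locus aligned sequences into one row per isolate.

    alignments: {locus name: {isolate label: aligned sequence}}
    Isolates missing a locus are padded with gap characters (an assumption
    made here for illustration, not a procedure taken from the study).
    """
    lengths = {}
    for locus in locus_order:
        # Every sequence within one locus must already be aligned to equal length.
        seq_lengths = {len(seq) for seq in alignments[locus].values()}
        if len(seq_lengths) != 1:
            raise ValueError(f"{locus}: sequences are not aligned to one length")
        lengths[locus] = seq_lengths.pop()

    # Collect every isolate that appears in at least one locus.
    isolates = sorted({iso for locus in locus_order for iso in alignments[locus]})

    return {
        iso: "".join(
            alignments[locus].get(iso, gap * lengths[locus]) for locus in locus_order
        )
        for iso in isolates
    }


# Hypothetical toy alignments (not sequences from this study).
toy_alignments = {
    "tef1a": {"iso_A": "ACGT-A", "iso_B": "ACGTTA"},
    "rpb2": {"iso_A": "GGC", "iso_B": "GGT", "iso_C": "GGC"},
    "ITS": {"iso_A": "TTAA", "iso_C": "TTAG"},
}
matrix = concatenate_loci(toy_alignments, ["tef1a", "rpb2", "ITS"])
for isolate, row in matrix.items():
    print(isolate, row)
```

The resulting matrix would then be exported to whatever format the chosen phylogenetics program expects; padding missing loci with gaps keeps every isolate in the matrix at the cost of missing data.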
Several lineages and clades reported here have distinct biogeographic restrictions. It is possible that populations within these lineages may have evolved separately due to distinct regional environmental conditions or the endophytic lifestyle. It is also possible that such lineages have better adaptations to the regions where they were isolated and for plants indigenous to these regions. The coevolution of plants and Trichoderma species has been postulated, as has been demonstrated for plant-pathogen interactions [55–57]. These regional adaptations may make these isolates ideal candidates for commercial development in these parts of the world. Also, this study revealed the closest phylogenetic relatives of T. atroviride, T. asperellum, T. asperelloides, and T. hamatum. Certain characteristics differentiate efficient biocontrol strains isolated from nature from less effective strains, possibly within each species, where only specific lineages are effective as BCBFs [62–64]. Testing different lineages identified in this study may reveal lineages containing effective BCBF strains for commercialization.
Taking all things into consideration, microbes are proving to be important in many applications in sustainable agriculture, including their use as BCBFs [2,53,57,65]. For example, some Trichoderma spp. are useful biocontrol agents of postharvest and foodborne pathogens [65–69], influencers of rhizospheric and plant microbiomes [70–73], and effective neutralizers of mycotoxins in food grains [74–79]. Unfortunately, there has been limited commercialization of microbial agricultural products relative to the volume of research on plant-beneficial microbes [53]. The strategy of preferential selection of isolates from species known to have beneficial properties that are also compatible with commercialization is more robust than prior, non-preferential approaches where hundreds to thousands of randomly selected isolates may need to be screened to identify a few strains with desired characteristics [80]. Consistent with this strategy, we propose narrowing the search for BCBF microbes within the increasingly complex genus Trichoderma, with over 500 species, to a subset of Trichoderma species (T. asperellum, T. asperelloides, T. virens, T. atroviride, T. hamatum, and some species in the Harzianum complex clade), as well as their nearest relatives, based on their established persistence in soil, endophytic characteristics, and prior commercialization. Although persistence in soil or as an endophyte does not guarantee effectiveness as a BCBF [2], use of these attributes could speed the selection of candidate isolates for downstream, in-depth screening.

Figure 1. Phylogenetic tree revealing the genetic diversity of the T. atroviride population based on the DNA sequences of tef1α, rpb2, and ITS. Sequences are identified by tef1α GenBank accession number followed by the country of isolation. The scale bar indicates the number of nucleotide changes. Numbers on the branches represent bootstrap values greater than 70%. The type species and bootstrap-supported clades are highlighted in colors.

Figure 2. Phylogenetic tree revealing the diversity of the T. asperellum population based on the DNA sequences of tef1α, rpb2, and ITS. The tree was generated using parsimony in PAUP. The numbers above the branches are bootstrap values obtained with 1000 bootstrap replicates. Sequences are identified by tef1α GenBank accession number followed by the country of isolation; C1 and C2 refer to lineages with bootstrap values above 70%. The scale bar indicates the number of nucleotide changes. The tree is rooted to the type species of T. asperelloides. The type species and bootstrap-supported clades are highlighted in colors.

Figure 3. Phylogenetic tree revealing the genetic diversity of the T. asperelloides population based on the DNA sequences of tef1α, rpb2, and ITS. The tree was produced using parsimony in PAUP. The numbers above the branches are bootstrap values obtained with 1000 bootstrap replicates. Sequences are identified by GenBank accession numbers followed by the country of isolation; C1 refers to lineages with bootstrap support or geographic significance. The tree was rooted to the T. yunnanense type species from China. The type species and bootstrap-supported clades are highlighted in colors. The scale bar indicates the number of nucleotide changes.

Figure 4. Phylogenetic tree revealing the genetic diversity of the T. hamatum population based on the DNA sequences of tef1α, rpb2, and ITS. The tree was produced using parsimony in PAUP. The numbers above the branches are bootstrap values obtained with 1000 bootstrap replicates. Tree leaves are marked by GenBank accession number followed by the country of isolation. The tree is rooted to the T. pubescens type species. C1-C4 are lineages with bootstrap support of 70% or greater and are highlighted in colors. The scale bar indicates the number of nucleotide changes.

Figure 5. One of the most parsimonious trees obtained via PAUP based on sequences of tef1α, rpb2, and ITS resolving the relationship of Trichoderma species within the Harzianum complex clade. Tree leaves are labeled with tef1α GenBank accession numbers for Trichoderma species. Numbers above the branches indicate bootstrap support of 70% or greater; E at the end of the accession number indicates that the strain was isolated as an endophyte; T at the end of the accession number indicates a type species. Clades are marked with vertical lines, and numbers 1-14 represent identified species. The scale bar indicates the number of nucleotide changes. Lineages marked with vertical lines (L1-L5) represent unidentified lineages. The color highlights represent the two main clades. The tree was rooted to T. pleurotum and T. pleuroticola.

Figure 6. Phylogenetic tree revealing the genetic diversity of the T. virens population based on the DNA sequences of tef1α, rpb2, and ITS. The tree was produced using parsimony in PAUP. The numbers above the branches are values obtained with 1000 bootstrap replicates.

Figure 7. One of the most parsimonious trees obtained via PAUP based on the DNA sequences of tef1α, rpb2, and ITS showing the nearest relatives to the Trichoderma species T. atroviride, T. asperellum, T. asperelloides, and T. hamatum. Sequences are identified by GenBank accession number for tef1α

Table 1. Most prevalent Trichoderma species/groups from soil surveys from different geographic regions a.

Columns: asperellum/asperelloides c; harzianum d; virens; atroviride; hamatum; Total (% of all) isolates e; Reference f.
a Compilation of surveys published in the past 20 years that used molecular phylogeny for Trichoderma species identification. Surveys were specific for Trichoderma isolates and all species listed in column headings are from the genus Trichoderma.
b Total number of Trichoderma isolates (of all species) identified in this study. For complete information on species isolated, see Supplementary Materials, Table S2.
c Isolates from T. asperellum and T. asperelloides are grouped together because T. asperelloides is often misidentified as T. asperellum due to highly similar DNA sequences and identical morphology. Additionally, many strains of T. asperelloides are incorrectly deposited in GenBank as T. asperellum.
d Harzianum complex clade species. Isolates from the different Harzianum complex clade species are not broken down into individual species, as isolates are often misidentified as T. harzianum and deposited in GenBank as T. harzianum.
e Total most prevalent (% of all) isolates. Total most prevalent isolates: sum of all isolates of the most prevalent species listed in this table. (% of all): percentage of all isolates in this study represented by isolates from these most prevalent species.
f Reference for the information in this row.
g Totals for information in each respective column.
h Frequency of detection of this species in the different studies collectively: (number of studies where this species was isolated)/(total number of studies) × 100.
i N/A; not applicable.
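The two derived quantities defined in footnotes e and h can be illustrated with a short sketch. The function names and the survey counts below are hypothetical and serve only to show the arithmetic; they are not values from Table 1.

```python
def detection_frequency(surveys, species):
    """Footnote h: (studies where the species was isolated) / (total studies) x 100."""
    if not surveys:
        raise ValueError("at least one survey is required")
    hits = sum(1 for counts in surveys if counts.get(species, 0) > 0)
    return 100.0 * hits / len(surveys)


def prevalent_share(counts, total_isolates, prevalent_species):
    """Footnote e: isolates of the most prevalent species as a % of all isolates."""
    if total_isolates <= 0:
        raise ValueError("total_isolates must be positive")
    prevalent_total = sum(counts.get(sp, 0) for sp in prevalent_species)
    return 100.0 * prevalent_total / total_isolates


# Hypothetical isolate counts per survey (illustrative only).
example_surveys = [
    {"T. virens": 4, "T. hamatum": 0, "T. atroviride": 7},
    {"T. virens": 0, "T. atroviride": 2},
    {"T. virens": 3, "T. hamatum": 5},
    {"T. atroviride": 1},
]

freq_virens = detection_frequency(example_surveys, "T. virens")
freq_atroviride = detection_frequency(example_surveys, "T. atroviride")

# One hypothetical survey with 20 isolates in total across all species.
share = prevalent_share(example_surveys[0], 20, ["T. virens", "T. atroviride", "T. hamatum"])
print(freq_virens, freq_atroviride, share)
```

A species recorded with zero isolates in a survey is treated the same as one that is absent from that survey, which matches the "number of studies where this species was isolated" wording of footnote h.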

Table 2. Isolation of endophytic species of Trichoderma from plants in different geographic regions a.

Table 3. Most studied Trichoderma species from the Harzianum complex clade a.