Population Genetics and Anastomosis Group’s Geographical Distribution of Rhizoctonia solani Associated with Soybean

Rhizoctonia solani is a species complex composed of many genetically diverse anastomosis groups (AG) and their subgroups. It causes economically important diseases of soybean worldwide. However, the global genetic diversity and distribution of R. solani AG associated with soybean are unknown to date. In this study, the global genetic diversity and distribution of AG associated with soybean were investigated based on rDNA-ITS sequences deposited in GenBank and published literature. The most prevalent AG was AG-1 (40%), followed by AG-2 (19.13%), AG-4 (11.30%), AG-7 (10.43%), AG-11 (8.70%), AG-3 (5.22%) and AG-5 (3.48%). Most of the AG were reported from the USA and Brazil. Sequence analysis of internal transcribed spacers of ribosomal DNA separated AG associated with soybean into two distinct clades. Clade I corresponded to distinct subclades containing AG-2, AG-3, AG-5, AG-7 and AG-11. Clade II corresponded to subclades of AG-1 subgroups. Furthermore, AG and/or AG subgroups clustered closely together irrespective of their geographical origin. Moreover, AG or AG subgroups within the same clade or subclade shared higher percentages of sequence similarity. The principal coordinate analysis also supported the phylogenetic and genetic diversity analyses. In conclusion, AG-1, AG-2, and AG-4 were the most prevalent AG in soybean. The clades and subclades corresponded to AG or AG subgroups and did not correspond to the AG’s geographical origin. The information on global genetic diversity and distribution will be helpful if novel management measures are to be developed against soybean diseases caused by R. solani.


Introduction
Soybean (Glycine max L.) is one of the world's most significant oilseed crops, accounting for 25% of all edible oil production [1]. About 176.6 million tons of soybeans are produced on 75.5 million hectares of fertile land each year [2,3]. R. solani Kühn [teleomorph, Thanatephorus cucumeris (Frank) Donk] is a serious threat to soybean production worldwide. The fungus causes foliar and web blights, pre- and post-emergence damping-off, and root and hypocotyl rot of soybean [4,5]. These diseases cause massive yield losses in soybean all over the world. For example, in Brazil and the United States, foliar blight alone has resulted in 30 to 69% yield losses [6-9]. Moreover, these diseases are difficult to control because of the soil-borne nature of R. solani and its broad host range [4]. Fungicides have been widely used to manage these diseases [10,11]. However, fungicides have raised severe environmental and health concerns. The most cost-effective and environmentally sustainable option for managing R. solani is breeding resistant cultivars [12]. However, understanding the genetic diversity of R. solani is critical if novel management measures, such as Rhizoctonia-resistant cultivars, are to be developed. R. solani exhibits tremendous genetic diversity and is classified into different anastomosis groups (AG). To date, 14 anastomosis groups (AG-1 to AG-13 and AG-BI) have been identified based on hyphal fusion, morphology, virulence (pathogenicity), physiology, and DNA homology [13,14]. Some of the AG have been further divided into subgroups based on anastomosis frequency, physiological and morphological features, and pathogenic, biomolecular, biochemical, genetic, and DNA homology characteristics [15,16]. For example, AG-1 has been divided into six subgroups: IA, IB, IC, ID, IE, and IF [17].
Similarly, AG-4 has been divided into three subgroups: HGI, HGII, and HGIII [18], and AG-2 has been divided into nine subgroups: 1, 2, t, Nt, 2IIIB, 2IV, 2LP, 3, and 4 [18]. AG-2, AG-4, AG-5, AG-3, AG-7, and AG-11 cause damping-off and root and hypocotyl rot, whereas AG-1 is responsible for foliar and web blight of soybean [5,19-22].
Information on the genetic diversity and distribution of R. solani AG associated with soybean in particular countries is available [1,42-44]. However, no attempt has been made to investigate the global genetic diversity and distribution of R. solani AG associated with soybean. Considering the genetic diversity and the different disease-causing abilities of R. solani AG on soybean mentioned above, the current study aimed (1) to determine the most frequently reported and dominant AG associated with soybean; (2) to explore the genetic diversity of AG based on rDNA ITS1-5.8S-ITS2 sequence analysis; and (3) to determine the relationship between geographical origin and genetic diversity of AG.
15 November 2021) using the main keywords "soybean-R. solani" and "soybean-anastomosis groups (AG)", alone and in combination. The literature search was limited to the period from January 2001 to October 2021. To be included in this study, the literature had to meet the following criteria: (i) articles published in peer-reviewed journals; (ii) articles reporting the accession numbers of AG whose sequence data were publicly available in GenBank; (iii) articles reporting the geographical origin, isolates, and AG that could cause symptoms on soybean; and (iv) articles reporting pathogen isolation from the soil (e.g., rhizosphere soil, topsoil), roots, or shoots of symptomatic soybean plants. The articles from the databases mentioned above were imported into EndNote X9 to extract information on AG, isolates, geographical origin, and isolation sources, which was compiled in Table S1. The primary isolation sources included diseased roots and shoots of soybean and soil (e.g., rhizosphere soil, topsoil) surrounding symptomatic soybean plants. To create a dataset of all publicly available sequences from the rDNA ITS1-5.8S-ITS2 region associated with R. solani AG, we queried the National Center for Biotechnology Information (NCBI) GenBank (https://www.ncbi.nlm.nih.gov/genbank; accessed on 12 November 2021) and downloaded all of the sequences for the following analyses.

Characterization of the Distribution and Frequency of Anastomosis Groups Associated with Soybean
The frequency of AG with known sequences in GenBank from the published literature was calculated using a formula modified for this study: relative frequency (F) = 100 × (n/N), in which n = the number of reports of each AG/AG subgroup in the published literature and N = the total number of all AG/AG subgroups reported in the published literature [19,45]. AG or AG subgroups with higher frequencies were considered the most frequently reported or most widely distributed.
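The relative frequency calculation can be illustrated with a minimal Python sketch. The function name `relative_frequency` and the toy list of reports are hypothetical and are not the study's data; the sketch only applies F = 100 × (n/N) to a list of AG labels, one label per literature report.

```python
from collections import Counter


def relative_frequency(ag_labels):
    """Return {AG: F} where F = 100 * (n / N), rounded to two decimals.

    n is the number of reports of one AG (or AG subgroup) and
    N is the total number of reports across all AG.
    """
    counts = Counter(ag_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ag: round(100 * n / total, 2) for ag, n in counts.items()}


# Hypothetical toy records (one entry per literature report), not study data.
reports = ["AG-1", "AG-1", "AG-2", "AG-4", "AG-1"]
freqs = relative_frequency(reports)

# List AG from most to least frequently reported.
for ag, f in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{ag}: {f}%")
```

Sorting by F in descending order gives the ranking from the most to the least frequently reported AG.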

Phylogenetic Analysis
Before conducting phylogenetic analysis, the best-fit substitution model for the aligned sequences was selected using jModelTest v. 2.1.6 [49], with model selection based strictly on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) estimates [50,51]. jModelTest v. 2.1.6 suggested the Tamura-Nei model [52]. The best-fit substitution models for the phylogenetic trees are listed in Table S2. Phylogenetic trees were constructed from the multiple alignments in MEGA v. 7.0.26 using Maximum Likelihood (ML) [53], Neighbor-Joining (NJ) [54] and Maximum Parsimony (MP) [55]. Rates among sites were set to G (γ distributed) for both ML and NJ. For ML and NJ, gaps and missing data were treated by partial deletion with a 95% site coverage cut-off, and Nearest-Neighbor-Interchange (NNI) was selected as the heuristic method. The MP tree was obtained using the Close-Neighbor-Interchange algorithm [55]. The robustness of each phylogenetic tree was tested with 1000 bootstrap replicates. Gaps and missing data were removed from all positions. Only nodes with bootstrap values of 70% or higher were shown in the phylogenetic trees. Phylogenetic trees were visualized using the Interactive Tree of Life (iTOL) v. 6 (http://itol.embl.de/; accessed on 16 November 2021) [56] and
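As a rough illustration of how the two criteria rank candidate substitution models, the sketch below computes AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of free parameters, ln L the maximized log-likelihood, and n the number of alignment sites. The log-likelihoods, the alignment length, and the model list are invented for illustration and are not jModelTest output. As a simplifying assumption, k counts only substitution-model parameters, leaving out branch lengths, which are shared by all candidates.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n_sites):
    """Bayesian Information Criterion: k ln(n) - 2 ln L (lower is better)."""
    return k * math.log(n_sites) - 2 * log_likelihood


# Hypothetical (ln L, k) pairs; values are made up for this example.
candidates = {
    "JC": (-5230.4, 0),
    "HKY": (-5105.9, 4),
    "TN93": (-5098.2, 5),
    "GTR": (-5096.8, 8),
}
n_sites = 650  # hypothetical alignment length

scores = {
    name: (aic(lnl, k), bic(lnl, k, n_sites))
    for name, (lnl, k) in candidates.items()
}
best_aic = min(scores, key=lambda m: scores[m][0])
best_bic = min(scores, key=lambda m: scores[m][1])
print("Best by AIC:", best_aic)
print("Best by BIC:", best_bic)
```

BIC penalizes extra parameters more heavily than AIC once the alignment has more than about seven sites, so the two criteria can disagree; they agree in this toy example.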

Principal Coordinate Analysis (PCoA) and Sequence Similarities
Pairwise percentages of sequence similarity for all isolates, within and among AG and AG subgroups, were calculated with MatGAT v. 2.0 [57]. Principal coordinate analysis (PCoA) was conducted on the pairwise sequence similarity matrix to investigate clustering of AG and AG subgroups using the paleontological statistics software package for education and data analysis (PAST v. 4.03) with the Gower similarity index [58].
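The within-group and between-group similarity summary can be sketched as follows. This is a simplified stand-in, not MatGAT's procedure: MatGAT aligns each pair of sequences itself, whereas this sketch assumes the sequences are already aligned to equal length. The isolate names, sequences, and group labels are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identical positions between two pre-aligned sequences.

    Columns where both sequences have a gap are skipped; a gap opposite
    a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similarity_ranges(aligned, groups):
    """Return (min, max) percent identity within and between groups."""
    within, between = [], []
    for i, j in combinations(sorted(aligned), 2):
        pid = percent_identity(aligned[i], aligned[j])
        (within if groups[i] == groups[j] else between).append(pid)

    def span(values):
        return (min(values), max(values)) if values else None

    return {"within": span(within), "between": span(between)}


# Hypothetical toy alignment, not sequences from the study.
aligned = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGAAC-TTT",
}
groups = {"iso1": "AG-1", "iso2": "AG-1", "iso3": "AG-2"}
ranges = similarity_ranges(aligned, groups)
print(ranges)
```

A full pairwise matrix of these percentages is also the kind of input an ordination such as PCoA would start from.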

Genetic Diversity of Anastomosis Groups
Initially, 115 R. solani AG associated with soybean were collected from published literature in this study (Table 1). Only 102 AG with known isolate names, pathogenicity and geographic origins were used to explore genetic diversity and phylogeny (Figures
a F (relative frequency) = 100 × (n/N), in which n = the number of each AG/AG subgroup and N = the total number of all AG/AG subgroups.

Relationship between Genetic Diversity of Anastomosis Groups and Their Geographic Origin
Closely related AG or AG subgroups associated with soybean clustered together regardless of the geographical origin from which they had been identified (Figures 2-4). For example, isolates of AG-1-IA in subclade IIb from Brazil, Japan, and the USA clustered together (Figures 2-4). Similarly, isolates of AG-5 from the USA and Japan clustered together in subclade Ia-2, and isolates of AG-2-2 from the USA and Brazil clustered together in subclade Ib-2 (Figures 2-4). Thus, clustering reflected AG identity rather than geographical origin.

Discussion
R. solani is a soil-borne pathogen that affects soybean worldwide and has a significant economic impact in all soybean-growing countries [1,7,11,19,21,59-61]. All AG associated with soybean reported in this study belonged to R. solani. The frequency of AG varied substantially with geography. Most of the AG were reported from diseased soybean plants in the USA and Brazil. The larger number of AG reported from soybean in the USA may imply an expansion of the host range and genetic diversity of R. solani [4]. Furthermore, AG associated with soybean might have been studied more intensively in the USA than in other countries because of their greater relative importance as pathogens of soybean. In the USA, foliar blight caused by AG-1 and hypocotyl and root rot caused by AG-2-2IIIB, AG-4, AG-5, AG-3, AG-7, and AG-11 caused soybean yield losses as high as 45% [4,21,61-63]. In this study, the small number of AG reported from Japan, Canada, Taiwan, and India was probably due to limited sampling or to the isolation methods used. In Brazil, foliar blight, damping-off, and root rot caused by AG-1 and AG-2 resulted in an estimated 31 to 60% soybean yield loss [8,32,64]. In Canada, root rot ranked fourth among 22 diseases causing severe losses in soybean [65,66]. In India, foliar blight caused by AG-1 caused an average yield loss of 40% to 50% [1,43,67,68]. In addition, AG-2 and AG-5 have been reported to cause hypocotyl rot of soybean in Japan [44], and AG-7 is responsible for damping-off of soybean seedlings in Taiwan [42]. In recent years, the increased frequency of legumes in crop rotations and the intensive cultivation of soybean might be further reasons for the increasing frequency of R. solani AG [4,5,21,42,61-63].
Our study also identified the AG most frequently reported from soybean. AG-1 was the most frequently reported, followed by AG-2, AG-4, AG-7, AG-11, AG-3, and AG-5. How frequently an AG is reported does not indicate how pathogenic it is on soybean. For example, AG-1-IA is highly pathogenic on soybean in Brazil, whereas AG-2-2IIIB, AG-4, and AG-5 are highly pathogenic on soybean in the USA [5,19-22]. Hence, AG diversity, frequency, and distribution could be influenced by the dynamics of the host-pathogen relationship, genetic flexibility, and degree of adaptation [69]. Furthermore, crop rotation, soil types, soybean cultivars, cropping patterns, and climatic conditions of a particular region may favor the presence of specific AG over others [70]. In addition, root-associated microbial communities also influence AG distribution [32].
The most reliable approach for assessing the phylogeny and genetic diversity of AG and AG subgroups of R. solani is molecular characterization using sequences of the rDNA ITS1-5.8S-ITS2 region [5,37,38]. Using sequences of this region from NCBI GenBank, we were able to draw conclusions about the phylogenetic relationships among AG and AG subgroups. In this study, phylogenetic analyses based on MP, NJ and ML showed that the AG form two distinct clades. Clade I included isolates of AG-2, AG-3, AG-4, AG-5, AG-7, and AG-11, whereas clade II included isolates of AG-1. Each AG formed a distinct subclade within its clade, except AG-5 and AG-11, which together formed a single subclade (Ia). This suggests that isolates of AG-5 and AG-11 may be more closely related to each other than to other AG. Previous studies have shown that isolates of AG-5 from soybean even clustered with isolates of AG-11 from other legumes such as lupins [37]. Our study also showed that AG subgroups form distinct subclades. For example, isolates of AG-1-IC and AG-1-IB form a sister subclade to isolates of AG-1-IA within clade IIa (AG-1 isolates). Sequence analysis in previous studies revealed that AG-1-B was genetically distinct from AG-1-IA and AG-1-IB [69]. Likewise, within subclade Ib (AG-2 isolates), isolates of AG-2-1 form a sister subclade to isolates of AG-2-2 and AG-2-2IIIB. Previous studies considered AG-2 polyphyletic, with subgroups consistently forming different clades or subclades [71].
AG-2 is a highly heterogeneous AG with substantial genetic diversity and is further divided into nine subgroups (1, 2, t, Nt, 2IIIB, 2IV, 2LP, 3, and 4) that cause rots and damping-off in soybean [37]. Moreover, within subclade Id (AG-4 isolates), a few AG-4-HGIII isolates form a sister clade to isolates of AG-4-HGII and AG-4-HGI. In addition, most of the isolates of AG-4-HGII and AG-4-HGI clustered together. This indicates that subgroups HGI and HGII are more closely related to each other than to subgroup HGIII [15,17,32]. Furthermore, our study revealed that AG did not show a preference for geographical origin. Most clades or subclades with high bootstrap support include AG and AG subgroups from the USA, Brazil, and other countries. In a previous study, the authors of reference [34] analyzed sequences of AG from Europe, North America, Australia and Asia associated with legumes, cereals and vegetables and found that AG did not have a preference for a geographic origin; however, some AG were found to be host-specific [72]. The authors of references [73-78] showed that isolates of AG from different countries are categorized under the same AG or AG subgroups. Furthermore, the pairwise distance matrix based on sequence similarities revealed that isolates of AG within the same clades and subclades shared high sequence similarity, whereas isolates of AG from different clades and subclades were less similar. Sequence similarity was higher than 87.2% within an AG subgroup, 81.7 to 100% among different subgroups within an AG, and 69.5 to 92.5% among different AG. These results are consistent with a previous study of ITS sequence similarity [15], which found that sequence similarity was above 96% within the same AG subgroup, 66-100% among different subgroups within an AG, and 55-96% among different AG. In addition, PCoA revealed that AG and/or AG subgroups formed groups separate from each other.
Previous reports showed that sequence homology in the ITS regions was higher among isolates of the same subgroup than among isolates of different subgroups within an AG or isolates of different AG [38]. Our study revealed that the rDNA-ITS sequences clustered consistently according to their known AG and not according to geographical origin. Cluster analyses based on rDNA-ITS sequences of R. solani AG and AG subgroups associated with soybean from a specific geographical origin have already been reported [1,7,19,21,61].

Conclusions, Limitations and Future Directions
In conclusion, this study provides the first documentation of the global genetic diversity and distribution of R. solani AG associated with soybean. AG-1, AG-2, and AG-4 were the most prevalent and widely documented AG in soybean. AG-1 was responsible for foliar and web blight of soybean, whereas the remaining AG caused damping-off and root and hypocotyl rot. Across geographical origins, most of the AG were reported from the USA, followed by Brazil. Phylogenetic and genetic diversity analyses revealed that AG and/or AG subgroups formed distinct clades and subclades that did not correspond to geographical origin. Pairwise percentages of sequence similarity within AG and subgroups and the principal coordinate analysis also support the phylogenetic and genetic diversity analyses. The rDNA ITS1-5.8S-ITS2 region has been successfully sequenced and phylogenetically analyzed to reliably separate R. solani isolates into several groups and subgroups that correspond to the various AG [5,15,17,18,35]. However, sequence analysis of the rDNA ITS1-5.8S-ITS2 region has limitations. Although the differences in the rDNA ITS1-5.8S-ITS2 region are sufficiently large to differentiate the AG reliably, they cannot differentiate isolates of the same AG [75]. Furthermore, sequences deposited in databases and repositories are not verified or validated, so the deposition of incorrectly named AG is almost inevitable, and complete information about isolate name, host, or geographical origin may not be included. In addition, the rDNA ITS1-5.8S-ITS2 region may not always be ideal because of its high mutation rate. Furthermore, R. solani is multinucleate; therefore, numerous nucleotide variations in this region are possible even within a single strain of R. solani [76-78].
Hence, analyses of the genetic diversity and phylogeny of AG must be augmented with additional sequences, such as the large-subunit rRNA (LSU) region, β-tubulin, the largest (RPB1) and second-largest (RPB2) subunits of RNA polymerase, translation elongation factor (tef-1α), the mini-chromosome maintenance protein (mcm7), calmodulin (CaM), and topoisomerase I (top1) genes. Furthermore, studies involving genomic, transcriptomic, proteomic and mitogenomic analyses may provide insights into the phylogeny and genetic diversity of R. solani AG.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/genes13122417/s1, Table S1: GenBank accession numbers of DNA sequences from the rDNA ITS1-5.8S-ITS2 region of R. solani AG used to determine the phylogenetic relationships and genetic diversity of AG associated with soybean; Table S2: The best models used for ML and NJ analysis in this study.