The Comparative Genomics of Botryosphaeriaceae Suggests Gene Families of Botryosphaeria dothidea Related to Pathogenicity on Chinese Hickory Tree

Trunk canker poses a major threat to the production of Chinese hickory tree (Carya cathayensis Sarg.), which is primarily determined by Botryosphaeriaceae. In our previous work, we identified Botryosphaeria dothidea as the predominant pathogen of this disease. However, it is still unclear about corresponding gene families and mechanisms associated with B. dothidea’s pathogenicity on Chinese hickory tree. Here, we present a comparative analysis of high-quality genome assemblies of Botryosphaeria dothidea and other isolated pathogens, showing highly syntenic relationships between B. dothidea and its closely related species and the conservative evolution of the Botryosphaeriaceae family. Higher GC contents were found in the genomes of B. dothidea and three other isolated pathogens (Botryshaeria cortices, Botryshaeria fabicerciana, and Botryshaeria qingyuanensis) compared to Macrophomina phaseolina, Neofusicoccum parvum, Diplodia corticola, and Lasiodiplodia theobromae. An investigation of genes specific to or expanded in B. dothidea revealed that one secreted glucanase, one orsellinic acid biosynthesis enzyme, and two MFS transporters positively regulated B. dothidea’s pathogenicity. We also observed an overrepresentation of viral integrase like gene and heterokaryon incompatibility proteins in the B. dothidea’s genome. In addition, we observed one LRR-domain-containing protein and two Sec-domain-containing proteins (Sec_1 and Sec_7) that underwent positive selection. This study will help to understand B. dothidea’s pathogenicity and potential influence on the infection of Chinese hickory, which will help in the development of disease control and ensure the security of Chinese hickory production.


Introduction
The Botryosphaeriaceae family are worldwide pathogens that cause a range of disease symptoms, including leaf spots, fruit and root rots, dieback, and cankers, in multiple woody hosts [1][2][3].They have diverse ecological roles as endophytic, saprobic, and plant pathogenic species [4].Botryosphaeria is the largest genus of Botryosphaeriaceae, with Botryosphaeria dothidea being a typical species, followed by other genera including Diplodia, Dothiorella/Spencermartinsia, Lasiodiplodia, and Neofusicoccum [5].The symbolic life cycle of B. dothidea is characterized by an endophytic/latent infection phase [4,6].It exists as an endophyte within the host plant, and its virulence is triggered by the host plant's physiological status in response to environmental stressors.This stimulation leads B. dothidea to transition into a pathogenic state.Thus, the manifestation of symptoms in B. dothidea infections appears to be influenced by the occurrence of a single environmental stress event or a combination of environmental stress events [7,8].This fungus can parasitize living host plants and gains infection status under the stimulation of host or environmental strains were cultivated on potato dextrose agar (PDA) at 25 • C, and 10-day-old mycelium was harvested.Genomic DNA and messenger RNA were extracted using a genome DNA kit (FastPure Microbiome DNA Isolation Kit, Vazyme Biotech Co., Ltd, Nanjing, China) and an Oligotex-dT mRNA kit (Qiagen Inc., Valencia, CA, USA), respectively.The genomic DNA and mRNA were further purified and sequenced using Oxford Nanopore Technology (ONT) and Illumina HiSeq4000 platforms owned by Biomarker Technologies (www.biomarker.com.cn)(Beijing, China), respectively.Apart from the aforementioned species, genome assemblies and corresponding annotations of Macrophomina phaseolina, Neofusicoccum parvum, Diplodia corticola, and Lasiodiplodia theobromae were retrieved from the NCBI or JGI genome websites.

Genome Assembly, Gene Prediction, and Functional Annotation
The genome assemblies of BDLA, BQTK, BCTK, and BFLG had been completed in our previous work [23].Here, we evaluated the completeness of the genome assemblies using BUSCO v5.12 at the Ascomycota level.RepeatMasker v4.1.2and RepeatModeler v2.0.1 were employed to detect, categorize, and mask repeats in the genome assemblies.The BRAKER2 toolkit v2.1.5 was used to identify a gene model based on repeat-masked genome assemblies, coupled with the support of external protein (fungi_odb10) and transcriptome data evidence [24].eggNOG v5 [25] was used to perform the functional annotation of coding genes with an E-value threshold of 1 × 10 −5 .

Orthologs Identification and Species Phylogenetic Tree Construction
Ortholog construction was conducted using OrthoFinder v2.5.5 [26] with an E-value threshold of 1 × 10 −5 and an MCL inflation index of 2.0.Multiple alignments of single-copy orthologs were conducted using MAFFT v7.490, and a maximum-likelihood tree of the species in this study was constructed using IQ-TREE v2.0.7 with the default parameters using the auto-detect model setting.

Identification of Rapidly Evolving Ortholog Families
Based on the ortholog families identified using OrthoFinder, CAFE (Computational Analysis of Gene Family Evolution, version 5.0) was used to compare the sizes of the ortholog families of each species to their respective most recent common ancestor (MRCA) using fossil data for species divergence times obtained from the Timetree website (https: //timetree.org/,accessed on 8 October 2023).Rapidly evolving ortholog families were estimated with the best-fit global birth/death parameter (λ) set at 0.0001 and a p-value threshold of 0.01.The ultra-metric tree of each ortholog family was generated using r8s v1.81.

Positive Selection Analysis
Protein codon alignments for each single-copy ortholog were generated using pal2nal v14, and TrimAl v1.4 was applied to remove gaps within the codon alignments.dN/dS ratios (ω) were calculated using the CodeML program of PAML v4.9e utilizing the branch model (model = 2 and NSsites = 0).To pinpoint which B. dothidea genes underwent positive selection pressure, branches represented by BDLA genes were designated as foreground branches.Likelihood ratio tests (LRTs) were employed to compare the null hypothesis (fix_omega = 1 and omega = 1) with the alternative hypothesis (fix_omega = 0 and omega = 0.7).The significance of LRT statistics was determined using a χ 2 distribution.The BDLA genes that underwent positive selection pressure were defined by false discovery rate (FDR)-corrected p-values of <0.05 and dN/dS ratios (ω) higher than 1.

Detection of Horizontal Gene Transfer (HGT)
Genome-wide predictions of horizontal gene transfer (HGT) events in BDLA were conducted using the AvP toolkit (https://github.com/GDKO/AvP,accessed on 30 October 2023) [27], which performed BLASTp searches against the Uniref100 database with an E-value threshold of 1 × 10 −5 .Subsequently, the hits for each BDLA gene were compiled and categorized based on homologous similarity and the corresponding taxon, which was used for the calculation of the Alien Index (AI) value.The software then utilized IQ-TREE (with the default parameters) to construct phylogenetic trees, identifying candidates similar to hits from distant taxa that might have been acquired through horizontal gene transfer (HGT) events.

Synteny Analysis
DIAMOND v0.8.25 was employed for all vs. all BLASTp searches to detect homologous gene pairs between two corresponding genomes (with an E-value threshold of 1 × 10 −5 and a -max-target-seqs value of 5).Subsequently, MCScanX was utilized to explore intra-or inter-genomic collinearity, and genome annotations (density of proteincoding genes, LS genes, and repeats) were also input.The visualization of the synteny results was conducted using circus v0.69.

Plant Materials, Fungal Materials, and Inoculation Assays
Botryosphaeria dothidea BDLA16-7 was isolated from Chinese hickory trunk trees in Zhejiang Province, China, showing typical symptoms of canker disease.BDLA16-7 was cultured on potato dextrose agar (PDA) plates at 25 • C in the dark for three days.Then, the mycelial blocks measuring approximately 5 mm × 5 mm were cut for inoculation assays.Shallow incisions, about 2 mm in depth and 5 mm in length, were made on the trunks of two-year-old hickory seedlings using a hole puncher.A single mycelial block was gently pressed against each incision, and then the area was sealed with sterilized tape.

Validation of Gene Expression Using Quantitative qRT-PCR
RNA was isolated using the Qiagen RNeasy Mini Kit (Qiagen Inc., Valencia, CA, USA), followed by cDNA synthesis with the Superscript IV Reverse Transcriptase cDNA Synthesis Kit (TB Green ® Premix Ex Taq™ II, Takara Bio Inc., Kusatsu, Japan) using 2 µg of template RNA.All cDNA samples were subsequently diluted to a concentration of 20 ng −1 in preparation for qRT-PCR.Gene expression levels were assessed via qRT-PCR employing a Bio-Rad Real-Time PCR System and SYBR Green as the fluorescent dye.The β-tubulin gene from B. dothidea (BDLA_00007187) served as the internal reference gene.Primers were generated using Primer3 (https://primer3.ut.ee/, accessed on 5 March 2024), and the specificity of the sequences for the target genes was verified using the NCBI BLASTN web platform (https://blast.ncbi.nlm.nih.gov/Blast.cgi,accessed on 5 March 2024), with the low complexity filter disabled.The aforementioned internal reference gene was utilized to normalize the expression levels of the selected candidate genes.The CDS sequences of β-tubulin gene, candidate genes, and their corresponding primers are provided in Table S9.

Construction of Deletion Mutants and Complemented Strains
Double-joint PCR was utilized for constructing the deletion vectors of candidate genes.Genomic DNA of wild-type strain BDLA16-7 served as the template for amplifying the sequences flanking the candidate genes.Concurrently, hygromycin resistance (HPH) cassettes were cloned from the pKHT plasmid through amplification with HPH-F/R primers.The amplicons were merged in the second round of PCR, of which the product was used as the template for the final construct's amplification using nested primers.PEG-mediated protoplast transformation was applied to introduce the vectors into the wild-type BDLA16-7 strain.Selection of potential gene-deletion mutants was conducted on PDA medium augmented with 100 µg/mL of hygromycin.For the complementation assay, segments comprising the gene promoters, green fluorescent protein (GFP), and the coding regions of the candidate genes were amplified and fused by double-joint PCR.The resulting PCR products were transformed with XhoI-digested pYF11-RFP plasmid into Saccharomyces cerevisiae XK1-25 employing the Alkali-Cation Yeast Transformation Kit (MP Biomedicals, Solon, OH, USA).Recombinant plasmids were extracted from the transformed yeast cells using the Yeast Plasmid Extract Kit (Solarbio, Beijing, China) and subsequently propagated in the Escherichia coli strain DH5α.These vectors were then reintroduced into the gene-deletion mutants through PEG-mediated transformation.

Genome Assemblies of Eight Botryosphaeriaceae Species and Features of Note
The genomes of B. dothidea strain BDLD16-7 (BDLA), B. cortices strain 16-35 (BCTK), B. fabicerciana strain 18-2 (BFLG), and B. qingyuanensis strain 16-30 (BQTK) were sequenced using an Oxford Nanopore Technologies (ONT) platform with a sequencing depth higher than 100× and were reported in our previous work [29].Meanwhile, four species of Botryosphaeriaceae with distinct lineages (M.phaseolina (MP); N. parvum (NP); D. corticola (DC); and L. theobromae (LT)) were combined for a comprehensive comparison.The similar genome sizes of six Botryosphaeriaceae species, except MP and DC, are provided in Table 1, ranging from 43.69 to 45.98 Mb, and are close to the k-mer analysis results (Figure S1).These results indicate larger genomes for these Botryosphaeriaceae species compared to the average genome size of Ascomycota (36.9 Mb) [30].The assembly quality of BDLA, BQTK, BCTK, and BFLG varied greatly compared to that of MP, NP, DC, and LT due to different sequencing technologies and the assembly toolkit.For example, contig numbers ranging from 13 to 15 were found for BDLA, BCTK, BQTK, and BFLG, with N 50 values ranging from 3.86 to 3.96 Mb (Table 1).In contrast, MP, NP, DC, and LT were composed of 18 to 296 contigs, with N 50 values ranging from 0.46 to 4.83 Mb.Obviously higher GC contents of BDLA, BQTK, BCTK, and BFLG (51.90% to 52.81%) were presented relative to MP, NP, DC, and LT (36.40% to 48.42%).The completeness and accuracy of the genome assemblies and gene model predictions within the species were determined via a Benchmarking Universal Single-copy Orthologs (BUSCO) analysis with the ascomycota_odb10 database (1706 genes) as the reference.In a genome-level BUSCO analysis, 1655 to 1674 complete genes of the 1706 single-copy genes were predicted (Figure 1A), which was consistent with the result of a protein-level BUSCO analysis (Figure 1B).This revealed the high completeness of the genome assemblies of the species in this study.
cies were determined via a Benchmarking Universal Single-copy Orthologs (BUSCO) analysis with the ascomycota_odb10 database (1706 genes) as the reference.In a genomelevel BUSCO analysis, 1655 to 1674 complete genes of the 1706 single-copy genes were predicted (Figure 1A), which was consistent with the result of a protein-level BUSCO analysis (Figure 1B).This revealed the high completeness of the genome assemblies of the species in this study.

Identification of Repeat Sequences and Protein-Coding Genes
We detected and subsequently masked repetitive elements within the genomes of the eight Botryosphaeriaceae species.Similar repetitive contents were observed in BDLA, BCTK, QTK, and BFLG (6.25% to 8.46%), but quite different to those of MP (17.92%),LT (5.06%), and DC (2.72%) (Table 1, Figure 1C).Tandem repeats, low-complexity sequences, and dispersed transposable elements are the main categories of masked repeats (Table S1).DNA/TcMar-Fot1 is the dominant subclass family of the class II DNA transposon across BDLA, BCTK, BQTK, BFLG, MP, and NP (Figure 1D).For class I retrotransposons, the subclasses of LTR/Copia, LTR/Gypsy, and LINE/Tad1 are overrepresented across all species (Figure 1E,F).Furthermore, we found that DNA/hAT-Charlie (class II DNA) transposon elements are unique to the BDLA genome.
An ab initio pipeline, coupled with a known gene model dataset retrieved from Swiss-Prot and transcriptome data, was employed for the genome annotations of BDLA, BQTK, BCTK, and BFLG.The protein-coding gene counts per species ranged from 10,366 to 14,845, with similar average length (from 1530 to 1757 bp) and gene density (0.24 to 0.31 genes per kb) values (Table 2).Moreover, lower ratios of exons and introns in BDLA, BQTK, BCTK, and BFLG indicate simpler gene structures compared to MP, NP, DC, and LT.

Synteny Analysis of BDLA Genome Reveals Two Accessory Contigs and Lineage-Specific (LS) Genes
We investigated the extent of the co-linearity of BDLA, BQTK, BCTK, and BFLG, among which the BDLA genome showed high degrees of sequence identity and synteny with BQTK, BCTK, and BGLG (Figure 2A).On the other hand, an intraspecies synteny analysis showed only two pairs of syntenic blocks (contig Bd-9 and contig Bd-3; and contig Bd-10 and contig Bd-4) in the BDLA genome (Figure 2B).This indicates a conservative evolutionary history among these four Botryosphaeria species and limited gene duplication events occurring after their divergence from the ancestral lineage.Moreover, two BDLA contigs, Bd-12 and Bd-13, presented no syntenic blocks with closely related Botryosphaeria species, which suggested that Bd-12 and Bd-13 might be accessory contigs of the BDLA genome.
Through syntenic comparison, Ref. [31] distinguished lineage-specific (LS) regions in the selected genome, which included genes that play an important role in fungal pathogenicity.Inspired by this, we extracted LS regions from the BDLA genome based on the genome alignment between BDLA, BQTK, BCTK, and BFLG.In Figure 2B, it is evident that the LS regions in BDLA are mainly distributed on the accessory contigs of Bd-12 and Bd-13.Notably, these LS regions tend to cluster towards the contig terminals, exhibiting a high repeat density but a lower GC content.In BDLA, a total of 501 genes exist in LS regions (LS genes), with the highest count in Bd-13 (83 genes), followed by Bd-14 (59 genes) and Bd-04 (57 genes) (Figure 2C).

Ortholog Construction of Botryosphaeriaceae Species Uncovered Rapid Expanded Ortholog in BDLA
We established orthologous relationships between BDLA, BCTK, BFLG, BQTK, MP, NP, DC, and LT using OrthoFinder v2.5.5, and 96,176 (96.11%) of the 100,070 total genes were assigned to 15,050 orthologs (Table S3), which ranged in size from 2 to 38 genes.Overall, 2122 genes (2.12%) were assigned to 657 species-specific orthologs (Figure 3A, Table S3).Among them, 868 genes in MP were assigned to 261 species-specific orthologs, with LT, NP, and DC following closely (Table S4).In contrast, only 21 to 37 genes were assigned to species-specific orthologs in BDLA, BQTK, BCTK, and BFLG.In total, 6785 core orthologs, comprising genes from each species in this study, were detected, and 6064 of them were identified as single-copy orthologs (Figure 3A, Table S3).In addition to

Ortholog Construction of Botryosphaeriaceae Species Uncovered Rapid Expanded Ortholog in BDLA
We established orthologous relationships between BDLA, BCTK, BFLG, BQTK, MP, NP, DC, and LT using OrthoFinder v2.5.5, and 96,176 (96.11%) of the 100,070 total genes were assigned to 15,050 orthologs (Table S3), which ranged in size from 2 to 38 genes.Overall, 2122 genes (2.12%) were assigned to 657 species-specific orthologs (Figure 3A, Table S3).Among them, 868 genes in MP were assigned to 261 species-specific orthologs, with LT, NP, and DC following closely (Table S4).In contrast, only 21 to 37 genes were assigned to species-specific orthologs in BDLA, BQTK, BCTK, and BFLG.In total, 6785 core orthologs, comprising genes from each species in this study, were detected, and 6064 of them were identified as single-copy orthologs (Figure 3A, Table S3).In addition to species-specific and core orthologs, we also investigated multiple-species-involved (MSI) orthologs, which included members of two to seven species.For instance, 378 two-species MSI orthologs of BDLA suggest that genes from another species, apart from BDLA, were arranged in these orthologs (Figure 3B).As a result, the ratios of the four-species (891), five-species (975), six-species (1141), and seven-species (1758) MSI orthologs were higher than those of the two-species (378) and three-species (375) MSI orthologs in BDLA.Similar conditions were also found in other Botryosphaeriaceae species, which indicates conservative orthologous relationships between these species and supports the synteny analysis results.
We established the phylogenic relationships of the Botryosphaeriaceae species in this study using the detected single-copy orthologs.The phylogenetic tree was divided into two clades: one clade included BDLA, BCTK, BQTK, and BFLG, indicating close genetic relatedness, but displayed evolutionary distance from MP, NP, DC, and LT.In detail, BDLA exhibited a close phylogenetic relationship with BQTK, while BCTK was grouped alongside BFLG.
arranged in these orthologs (Figure 3B).As a result, the ratios of the four-species (891), five-species (975), six-species (1141), and seven-species (1758) MSI orthologs were higher than those of the two-species (378) and three-species (375) MSI orthologs in BDLA.Similar conditions were also found in other Botryosphaeriaceae species, which indicates conservative orthologous relationships between these species and supports the synteny analysis results.We established the phylogenic relationships of the Botryosphaeriaceae species in this study using the detected single-copy orthologs.The phylogenetic tree was divided into two clades: one clade included BDLA, BCTK, BQTK, and BFLG, indicating close genetic relatedness, but displayed evolutionary distance from MP, NP, DC, and LT.In detail, BDLA exhibited a close phylogenetic relationship with BQTK, while BCTK was grouped alongside BFLG.
To explore B. dothidea's pathogenicity in Chinese hickory, genes arranged in rapidly expanded orthologs in BDLA were subjected to a function enrichment analysis, and a total of 296 BDLA genes were detected (Table S5).Twenty of these genes carry the domain of Chromo (CHRromatin Organization MOdifier) (PF00385); eight genes are involved in orsellinic acid biosynthesis (OrsD: PF12013); seven encode heterokaryon incompatibility proteins (HET: PF06985); six encode cytochrome P450 (p450: PF00067); and five encode major facilitator superfamily proteins (MFS_1: PF07690) (Figure 3D).A GO enrichment analysis of these genes revealed that the enriched GO terms were mainly divided into three categories (Figure 3E): (i) DNA metabolic processes (such as 'DNA biosynthetic process' and 'DNA polymerase activity'); (ii) nitrogen compound metabolic processes; and (iii) aromatic or organic cyclic compound biosynthesis processes.

Positively Selected Genes (PSGs) of BDLA may Be Involved in Response to Xenobiotic Stimulus
Positive selection genes (PSGs), with a non-synonymous/synonymous substitutions ratio (ω = dN/dS) higher than 1, may contribute to the emergence of novel biological functions [32].To explore the PSGs of B. dothidea's pathogenicity, we calculated the dN/dS ratio for 6064 single-copy orthologs.The calculations were performed using the branch model of the PAML (phylogenetic analysis maximum likelihood)-CODEML algorithm.In this model, the genes of BDLA, BQTK, BCTK, and BFLG within each single-copy ortholog were identified as the foreground branch, respectively.The PSGs were characterized by a dN/dS ratio exceeding 1 and a chi2 value below 0.05.As a result, 32, 48, 25, and 27 PSGs were identified in BDLA, BQTK, BCTK, and BFLG, constituting 0.53%, 0.79%, 0.41%, and 0.45% of the total single-copy genes (Figure 4A, Table S6).
major facilitator superfamily proteins (MFS_1: PF07690) (Figure 3D).A GO enrichment analysis of these genes revealed that the enriched GO terms were mainly divided into three categories (Figure 3E): (i) DNA metabolic processes (such as 'DNA biosynthetic process' and 'DNA polymerase activity'); (ii) nitrogen compound metabolic processes; and (iii) aromatic or organic cyclic compound biosynthesis processes.

Positively Selected Genes (PSGs) of BDLA may Be Involved in Response to Xenobiotic Stimulus
Positive selection genes (PSGs), with a non-synonymous/synonymous substitutions ratio (ω = dN/dS) higher than 1, may contribute to the emergence of novel biological functions [32].To explore the PSGs of B. dothidea's pathogenicity, we calculated the dN/dS ratio for 6064 single-copy orthologs.The calculations were performed using the branch model of the PAML (phylogenetic analysis maximum likelihood)-CODEML algorithm.In this model, the genes of BDLA, BQTK, BCTK, and BFLG within each single-copy ortholog were identified as the foreground branch, respectively.The PSGs were characterized by a dN/dS ratio exceeding 1 and a chi2 value below 0.05.As a result, 32, 48, 25, and 27 PSGs were identified in BDLA, BQTK, BCTK, and BFLG, constituting 0.53%, 0.79%, 0.41%, and 0.45% of the total single-copy genes (Figure 4A, Table S6).We compared orthologs containing the PSGs of BDLA, BQTK, BCTK, and little intersection was observed (Figure 4B), which suggests their obvious sequence divergency.Therefore, we subsequently focused on BDLA-specific PSGs.These PSGs were predominantly enriched in GO terms such as 'response to chemical', 'cellular response to chemical stimulus', and 'response to xenobiotic stimulus', with an average dN/dS ratio ranging from 592.06 to 726.35 (Figure 4C).Notably, one of the BDLA PSGs, BDLA_00004043-RA assigned to OG0003604, carries the LRR_8 (PF13855) domain (Figure 4D, Table S6), which is consistent with research on the NBS-LRR genes of Arachis duranensis and Arachis ipaënsi [33].In addition, BDLA_00009721-RA and BDLA_00011736-RA, assigned to OG0005150 and OG0007268, respectively, underwent positive selection pressure (dN/dS ratio of 1.76 and 70.2467) and carry Sec_1 and Sec7-N domains, which were reported to be involved in SNARE complex assembly and vesicle trafficking regulation [34,35].Moreover, coding genes of one chitinase (OG0005721: BDLA_00008528-RA-containing domain of Glyco_hydro_18) and one chitin biosynthesis enzyme (OG0004492: BDLA_00004853-RAcontaining domain of Chitin_synth_2) also had a dN/dS ratio of 999, which implies that positive selection may affect the chitin metabolic process of B. dothidea.

Predicted Secretome Comparison Gives Insight into BDLA-Specific Small Secret Cysteine-Rich Proteins (Effectors)
Plant pathogens deploy an arsenal of secreted proteins to interfere with the plant immune system, enabling the successful colonization of the host and infection [36].BDLA, BCTK, BFLG, and BQTK exhibited similar numbers of secreted proteins, ranging from 997 to 1049, and 807 to 1153 secreted proteins were found in MP, NP, DC, and LT (Table S7, Figure 5A).The lengths of the identified secreted proteins in all species were primarily distributed in the ranges of 100 to 400 amino acids (aa) and 400 to 1000 aa (Figure 5A).Among these secreted proteins, there were fewer putative effectors in BDLA (83), BCTK (83), BQTK (75), and BFLG (87) compared to the four other Botryosphaeriaceae species, ranging from 78 to 130 (Table S7).A large proportion of the putative effectors were apoplastic (Figure 5B), and almost all of them were small cysteine-rich proteins (SCPs) (Figure 5C, Table S8), which suggests that these cysteine-rich effectors may confer B. dothidea's ability to overcome the immune system of its host [37].The GO enrichment analysis of B. dothidea's putative cysteine-rich effectors provided the insight that these putative effectors are mainly enriched in carbohydrate metabolism (such as pectin, galacturonan, and polysaccharide) and the xenobiotic stimulus response (Figure 5D).Homology searches were conducted for the secretomes of B. dothidea and other Botryosphaeriaceae species.As shown in Figure 5E, the BDLA secretome exhibited a significantly higher identity with those of BCTK, BFLG, and BQTK compared to MP, NP, DC, and LT.Interestingly, we noticed that there were two groups of BDLA-specific secreted proteins, of which 22 were cysteine-rich and 6 were putative effectors.Through a BLAST search against the Pathogen Host Interactions (PHI) database (http://www.phi-base.org/,accessed on 10 November 2023), it was found that eight BDLA-specific SCPs may affect Homology searches were conducted for the secretomes of B. dothidea and other Botryosphaeriaceae species.As shown in Figure 5E, the BDLA secretome exhibited a significantly higher identity with those of BCTK, BFLG, and BQTK compared to MP, NP, DC, and LT.Interestingly, we noticed that there were two groups of BDLA-specific secreted proteins, of which 22 were cysteine-rich and 6 were putative effectors.Through a BLAST search against the Pathogen Host Interactions (PHI) database (http://www.phi-base.org/,accessed on 10 November 2023), it was found that eight BDLA-specific SCPs may affect the pathogen's virulence and five of them carry domains of Glyco_hydro_18, Glyco_hydro_61, Glyco_hydro_16, Glyco_hydro_10, and Glycos_transf_1 (Table 3), which may play a role in carbohydrate metabolism and host cell wall degradation.

Identification of Horizontal Gene Transfer (HGT) Events in BDLA Genome
Horizontal gene transfer is a process where genomic DNA segments are exchanged between organisms that are not in a parent-offspring relationship, which may introduce novel functions and improve the adaptive ability of the recipient organism [38].We performed a genome-wide HGT event prediction for BDLA to explore its pathogenicity influenced by HGTs.The Alien Index (AI), comparing the best hits from closely related taxa (ingroup) and distantly related taxa (donor) based on their BLAST E-values [39], was calculated for each protein-coding gene of BDLA using the AvP toolkit [27], coupled with support from phylogenetic evidence.
The PHI base annotations revealed that BDLA_00008949-RA (HGT_3), BDLA_00010862-RA (HGT_11), BDLA_00001191-RA (HGT_12), and BDLA_00010741-RA (HGT_19) might affect pathogenicity.Notably, BDLA_00011030-RA (HGT_17) is a putative avirulence effector and carries a domain of serine aminopeptidase, which suggests that this gene might be involved in the inhibition of the host immune system through its protease activity.

Secreted Glucanase, Orsellinic Acid Biosynthesis Enzyme, and MFS Transporters Play an Important Role in Pathogenicity of B. dothidea
The bioinformatics analysis above identified several gene families that might be related to B. dothidea's pathogenicity.Here, we performed qRT-PCR assays to validate the expression level of genes assigned to these families.We found that one secreted glucanase (BDLA_00012709), two secreted chitinase (BDLA_00008528 and BDLA_00004853), two orsellinic acid biosynthesis enzymes (BDLA_00007003 and BDLA_00010036), three major facilitator superfamily proteins (BDLA_00004985, BDLA_00007512 and BDLA_00010781), one viral integrase (BDLA_00005840), and one LRR-containing protein (BDLA_00004043) significantly upregulated during the infection stage.As Figure 6 shows, BDLA_00012709, BDLA_00010036, BDLA_00007003, BDLA_00008528, and BDLA_00010781 were specifically upregulated at 8 days post-inoculation (dpi).Meanwhile, BDLA_00004985, BDLA_00007512, BDLA_00004853, BDLA_00004043, and BDLA_00015840 exhibited a consistent upregulation trend in expression.Among them, one glucanase (BDLA_00012709), one orsellinic acid biosynthesis enzyme (BDLA_00007003), and two MFS transporters (BDLA_00010781 and BDLA_00004985) showed the highest upregulated expression level and were chosen to evaluate their pathogenic ability on hickory trunk.Disease lesions caused by knockout strains of these four pathogenic candidates were obviously smaller than those caused by wild-type and their corresponding complemented strains (Figure 7).These results suggest that glucanase, orsellinic acid biosynthesis enzymes, and MFS transporters play an important role in B. dothidea's pathogenicity.According to the bioinformatics analysis above, BDLA_00007003 and BDLA_00010781 were located in the lineage-specific (LS) regions of B. dothidea (Table S2).BDLA_00004985 is assigned to ortholog group OG0000329, which is specifically expanded in B. dothidea (Table S5).Moreover, BDLA_00012709 is one small secret cysteine-rich protein specific in B. dothidea (Table 3) and is a putative virulence protein according to the annotation based on the PHI database (Table S7).
BDLA_00010036, BDLA_00007003, BDLA_00008528, and BDLA_00010781 were specifically upregulated at 8 days post-inoculation (dpi).Meanwhile, BDLA_00004985, BDLA_00007512, BDLA_00004853, BDLA_00004043, and BDLA_00015840 exhibited a consistent upregulation trend in expression.Among them, one glucanase (BDLA_00012709), one orsellinic acid biosynthesis enzyme (BDLA_00007003), and two MFS transporters (BDLA_00010781 and BDLA_00004985) showed the highest upregulated expression level and were chosen to evaluate their pathogenic ability on hickory trunk.Disease lesions caused by knockout strains of these four pathogenic candidates were obviously smaller than those caused by wild-type and their corresponding complemented strains (Figure 7).These results suggest that glucanase, orsellinic acid biosynthesis enzymes, and MFS transporters play an important role in B. dothidea's pathogenicity.According to the bioinformatics analysis above, BDLA_00007003 and BDLA_00010781 were located in the lineage-specific (LS) regions of B. dothidea (Table S2).BDLA_00004985 is assigned to ortholog group OG0000329, which is specifically expanded in B. dothidea (Table S5).Moreover, BDLA_00012709 is one small secret cysteine-rich protein specific in B. dothidea (Table 3) and is a putative virulence protein according to the annotation based on the PHI database (Table S7).

Discussion
Trunk cankers pose a significant threat to Chinese hickory production in the Lin'an district of Zhejiang Province, China [21].In our previous research, the majority of the pathogenic isolates belonged to B. dothidea wild-type strain BDLA16-7.This highlighted B. dothidea's enhanced pathogenicity and host colonization abilities, particularly in Chinese hickory.This motivated us to explore the potential mechanisms underlying this phenomenon through comparative genomics between B. dothidea BDLA16-7 (BDLA) and other closely related species.
Higher GC contents were observed in the genomes of BDLA, BCTK, BQTK, and BFLG compared to MP, NP, DC, and LT.According to [40], increased GC contents in ge-

Discussion
Trunk cankers pose a significant threat to Chinese hickory production in the Lin'an district of Zhejiang Province, China [21].In our previous research, the majority of the pathogenic isolates belonged to B. dothidea wild-type strain BDLA16-7.This highlighted B. dothidea's enhanced pathogenicity and host colonization abilities, particularly in Chi-nese hickory.This motivated us to explore the potential mechanisms underlying this phenomenon through comparative genomics between B. dothidea BDLA16-7 (BDLA) and other closely related species.
Higher GC contents were observed in the genomes of BDLA, BCTK, BQTK, and BFLG compared to MP, NP, DC, and LT.According to [40], increased GC contents in genomic regions promote transcriptional activity.We speculated that in BDLA, BCTK, BQTK, and BFLG, higher GC contents might be associated with genes related to their infection of Chinese hickory.
Expectedly high degrees of sequence identity and synteny between BDLA, BQTK, BCTK, and BFLG were described, which was consistent with [22].This suggests a conserved genomic structure and content among Botryosphaeria species, indicating gradual evolution.This result contrasts with that of destructive filamentous fungi, such as Phytophthora infestans and Magnaporthe oryzae, which can rapidly kill host plants, prompting a rapid evolutionary 'arms race' and shaping divergent genomes among closely related species [41,42].However, we still found two specific contigs and other lineage-specific (LS) regions in the BDLA genome, which include genes that may be related to B. dothidea's stronger pathogenicity in Chinese canker.A significant proportion of BDLA genes in LS regions were occupied by genes encoding virus integrase, which is responsible for the viral replication that catalyzes the covalent integration of viral cDNA into the host genome [43].It is well known that mycoviruses can attenuate the growth and virulence of B. dothidea [44].However, our HGT analysis revealed that these virus integrases were not acquired from viruses or other distant species.One possible explanation is that these virus integrases originated from B. dothidea's ancestors and might indicate to B. dothidea's pathogenicity by shaping the endophytic lifestyle of B. dothidea.Such integrases may be found in retrotransposons [45,46].Interestingly, we found that heterokaryon incompatibility (HI) proteins were expanded in BDLA as well.Heterokaryon incompatibility (HI) is a non-self-recognition phenomenon in filamentous fungi, and it plays an important role in limiting resource plundering and restricting viral transfer between strains [47,48].Therefore, B. dothidea also possesses sufficient capability to restrict viral transfer.Taken together, we speculate that there is a 'viral-balance' system in B. dothidea's genome, which suggests that B. dothidea may control the orderly integration of viruses or retrotransposons into its genome for environmental adaptation.However, detailed mechanisms still need further research in the future.Additionally, we also found that two MFS transporter coding genes exists in LS regions and deleting one of them (BDLA_00010781) reduced B. dothidea's pathogenicity.Combined with previous studies [49,50], we proposed two possibilities: (i) BDLA_00010781 is involved in fungal toxin secretion; (ii) BDLA_00010781 is important for hyphal morphology and conidiation, which affect fungal pathogenicity.
An overrepresentation of genes was involved in orsellinic acid biosynthesis and carried the orsD domain (PF12013) in the BDLA-specific gene set.Endophytic fungi can generate a range of bioactive metabolites to protect the hosts against other pathogens and herbivores in harsh environments for the purpose of nutrient absorption from host plants [51][52][53].For instance, interactions with bacteria induce Aspergillus nidulans to produce the archetypal polyketide orsellinic acid [54].In [55], the authors utilized orsellinic acid, a compound isolated from the endophytic fungus Epicoccum Nigrum, to produce biocompatible green silver nanoparticles.These nanoparticles exhibited significant antifungal activity against Alternaria solani.In this study, we found BDLA_00007003, containing the orsD domain, significantly upregulated since 3 dpi, and positively regulated B. dothidea's pathogenicity.It will be interesting to infer that this orsellinic acid biosynthesis coding gene might be induced to inhibit its fungal competitors in diseased Chinese hickory samples.
An analysis of positively selected genes (PSGs) in BDLA revealed that these PSGs are mainly involved in the response to xenobiotic stimuli.Specifically, BDLA_00003994-RA, assigned to OG0003633, encodes one LRR-containing protein.In plants, this type of protein is well known for regulating the immune system [56].However, LRR-containing proteins are also involved in development and pathogenicity in Phytophthora sojae [57,58].Thus, the LRR-containing protein of BDLA may undergo selection pressure because it affects B. dothidea's development and pathogenicity.Meanwhile, one Sec_1-containing protein and one Sec_2-containing protein of BDLA were also putative PSGs, and they may affect protein trafficking.
The putative secretome of B. dothidea shows higher similarity with BCTK, BQTK, and BFLG compared to MP, NP, DC, and LT.Small secreted cysteine-rich proteins (SCPs) have been implicated as key virulence factors in fungal pathogens, contributing to the establishment of colonization in host plants [59].In the current study, we observed that some of the BDLA-specific SCPs encode chitinase and have undergone positive selection.This association may be related to the degradation of the cell wall in fungal competitors of B. dothidea, considering the absence of chitin content in the host plants.We speculate that this result may partially explain the dominant role of B. dothidea in diseased trunk samples.Furthermore, we also noticed that one secreted glucanase, BDLA_00012709, might enhance fungal pathogenicity by degrading the cell wall of the host.
In this study, we presented a high-quality genome assembly of B. dothidea and conducted an in-depth analysis of the relationship between B. dothidea's genome and its pathogenicity in Chinese hickory.Using reported genomic data of B. dothidea [22], we identified and verified one glucanase and several chitinase that are pathogenicity-associated.On the other hand, we noticed that orsellinic acid biosynthesis-associated enzymes of B. dothidea may affect its competitiveness in the fungal community of the diseased samples, which was not reported before.Two MFS transporters were also validated for their function during the procession of B. dothidea's infection.Moreover, we speculate that the viral integrase and heterokaryon incompatibility proteins of B. dothidea may collaborate to regulate the integration of viruses into the B. dothidea genome.This coordination could potentially influence B. dothidea's pathogenicity and affect shaping its endophytic lifestyle.
This study shed light on the evolutionary process of B. dothidea and other closely related species isolated from diseased Chinese hickory trees, and several candidates might be related to B. dothidea's pathogenicity and colonization ability.However, the limitations of this comparative analysis are (1) it will be difficult to predict genes located in complex or high-repeat regions without the support of chromosome-level genome assemblies; (2) functional annotations for each gene are based only on sequence similarity to known domains, but they cannot identify key sites without in-depth experimental validation, which sometimes is more critical to gene functions.Hence, in-depth elucidation of the mechanisms is essential for a comprehensive understanding of B. dothidea's pathogenicity in the future.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/jof10040299/s1, Figure S1: genome size estimation by k-mer analysis; Table S1: Distribution of repeat sequences in each Botryosphaeriaceae species.Table S2: Genome locations and annotations of genes within BDLA LS regions.Table S3: Statistic of orthologous analysis.Table S4: Orthologs distribution in each Botryosphaeriaceae species.Table S5: Annotations of genes assigned to rapid-evolving orthologs in BDLA.Table S6: Annotations of positively selection genes of BDLA.Table S7: Information of secretome of each Botryosphaeriaceae species.Table S8: information of putative effectors or SCPs identified in each Botryosphaeriaceae species.Table S9: Primers used for qRT-PCR assay.

Figure 1 .
Figure 1.Genome assembly assessment and repeat sequence detection of Botryosphaeriaceae species.Genome assembly assessment performed by BUSCO at genome level (A) and protein level (B).(C-F) provide distributions of repeat sequences across Botryosphaeriaceae species: (C) total repeat

Figure 1 .
Figure 1.Genome assembly assessment and repeat sequence detection of Botryosphaeriaceae species.Genome assembly assessment performed by BUSCO at genome level (A) and protein level (B).(C-F) provide distributions of repeat sequences across Botryosphaeriaceae species: (C) total repeat sequences; (D) class II DNA transposons; (E) class I retrotransposons of LTR subclass; and (F) class I retrotransposons of LINE subclass.

Figure 2 .
Figure 2. Synteny analysis of BDLA and investigation of genes within BDLA-specific regions.(A) Inter-species synteny analysis between BDLA, BQTK, BCTK, and BFLG; (B) intra-species synteny analysis and genome content annotations of BDLA: red bar lineage-specific (LS) regions, earthy yellow heatmap means GC content, blue peak diagram represents sequence density, and green means gene density.(C) Distribution of genes within LS regions on different contigs in BDLA genome.(D) GO enrichment analysis of BDLA genes within LS regions; (E) Pfam domain annotation of BDLA genes within LS regions.

Figure 2 .
Figure 2. Synteny analysis of BDLA and investigation of genes within BDLA-specific regions.(A) Inter-species synteny analysis between BDLA, BQTK, BCTK, and BFLG; (B) intra-species synteny analysis and genome content annotations of BDLA: red bar lineage-specific (LS) regions, earthy yellow heatmap means GC content, blue peak diagram represents sequence density, and green means gene density.(C) Distribution of genes within LS regions on different contigs in BDLA genome.(D) GO enrichment analysis of BDLA genes within LS regions; (E) Pfam domain annotation of BDLA genes within LS regions.

Figure 3 .
Figure 3. Ortholog analysis and identification of rapidly evolving orthologs.(A) Orthologous relationships between BDLA, BCTK, BFLG, BQTK, MP, NP, DC, and LT.(B) Analysis of species associated with orthologs, which was used for their conservation.(C) Phylogenic analysis and rapidly evolving orthologs of BDLA, BCTK, BFLG, BQTK, MP, NP, DC, and LT.Blue numbers represent counts of expanded orthologs, and red numbers represent contracted orthologs.(D) GO enrichment analysis of genes assigned to expanded orthologs in BDLA.(E) Pfam domain annotation of genes assigned to expanded orthologs in BDLA.

Figure 3 .
Figure 3. Ortholog analysis and identification of rapidly evolving orthologs.(A) Orthologous relationships between BDLA, BCTK, BFLG, BQTK, MP, NP, DC, and LT.(B) Analysis of species associated with orthologs, which was used for their conservation.(C) Phylogenic analysis and rapidly evolving orthologs of BDLA, BCTK, BFLG, BQTK, MP, NP, DC, and LT.Blue numbers represent counts of expanded orthologs, and red numbers represent contracted orthologs.(D) GO enrichment analysis of genes assigned to expanded orthologs in BDLA.(E) Pfam domain annotation of genes assigned to expanded orthologs in BDLA.

Figure 4 .
Figure 4. Investigation of positively selected genes in BDLA, BQTK, BCTK, and BFLG.(A) Proportion of positively selected genes among all single-copy orthologs.Blue bars represent numbers of positively selected genes, and orange bars indicate their corresponding proportions among all single-copy orthologs.(B) Venn diagram showing overlaps of positively selected genes in BDLA, BQTK, BCTK, and BFLG.(C) Pfam annotations and dN/dS ratios of positively selected genes of

Figure 4 .
Figure 4. Investigation of positively selected genes in BDLA, BQTK, BCTK, and BFLG.(A) Proportion of positively selected genes among all single-copy orthologs.Blue bars represent numbers of positively selected genes, and orange bars indicate their corresponding proportions among all single-copy orthologs.(B) Venn diagram showing overlaps of positively selected genes in BDLA, BQTK, BCTK, and BFLG.(C) Pfam annotations and dN/dS ratios of positively selected genes of BDLA and their corresponding ortholog IDs.(D) GO enrichment analysis of positively selected genes of BDLA and their corresponding average dN/dS ratios.

Figure 5 .
Figure 5. Secretome analysis of Botryosphaeriaceae species.(A) Counts of secreted proteins in each Botryosphaeriaceae species and their corresponding length distribution.Different colors represent sequence length intervals.(B) Distribution of lengths and categories of putative effectors in each Botryosphaeriaceae species.The numbers on the right represent sequence length intervals.(C) Proportion of small secreted cysteine-rich proteins (SCPs) and their corresponding length distribution.The numbers on the right represent sequence length intervals.(D) GO enrichment analysis of B. dothidea's putative cysteine-rich effectors.(E) Homologous search.Red dashed box marked secret proteins specific in BDLA.

Figure 5 .
Figure 5. Secretome analysis of Botryosphaeriaceae species.(A) Counts of secreted proteins in each Botryosphaeriaceae species and their corresponding length distribution.Different colors represent sequence length intervals.(B) Distribution of lengths and categories of putative effectors in each Botryosphaeriaceae species.The numbers on the right represent sequence length intervals.(C) Proportion of small secreted cysteine-rich proteins (SCPs) and their corresponding length distribution.The numbers on the right represent sequence length intervals.(D) GO enrichment analysis of B. dothidea's putative cysteine-rich effectors.(E) Homologous search.Red dashed box marked secret proteins specific in BDLA.

Figure 7 .
Figure 7. BDLA_00012709, BDLA_00007003, BDLA_00010781, and BDLA_00004985 are required for BDLA16-7's pathogenicity.(A) Disease symptoms on Chinese hickory trunk caused by each strain for 15 days post-inoculation (dpi); (B) lesion size on Chinese hickory trunk caused by each strain.a, b, c, and d represent the significant difference (one-way ANOVA test, p < 0.01).

Figure 7 .
Figure 7. BDLA_00012709, BDLA_00007003, BDLA_00010781, and BDLA_00004985 are required for BDLA16-7's pathogenicity.(A) Disease symptoms on Chinese hickory trunk caused by each strain for 15 days post-inoculation (dpi); (B) lesion size on Chinese hickory trunk caused by each strain.a, b, c, and d represent the significant difference (one-way ANOVA test, p < 0.01).

Author Contributions:
Data curation and analysis, D.L.; funding acquisition, C.Z.; methodology, D.L.; resources, Y.J., Y.Z. and C.M.; writing-original draft, D.L.; writing-review and editing, T.M. and C.Z.All authors have read and agreed to the published version of the manuscript.Funding: This work was supported by grants from the Key Research and Development Project of Zhejiang Province, China (2020C02005), and Natural Science Foundation of Fujian Province, China (2019J01385).Institutional Review Board Statement: Not applicable.Informed Consent Statement: Not applicable.

Table 1 .
Genome features of genome assemblies of Botryosphaeriaceae species.

Table 2 .
Statistic of protein-coding genes in each Botryosphaeriaceae species.

Table 4 .
Annotations of HGT genes identified in BDLA.