The Glyphosate Target Enzyme 5-Enolpyruvyl Shikimate 3-Phosphate Synthase (EPSPS) Contains Several EPSPS-Associated Domains in Fungi

5-Enolpyruvylshikimate 3-phosphate synthase (EPSPS) is the central enzyme of the shikimate pathway, which synthesizes three aromatic amino acids in fungi, plants and prokaryotes. This enzyme is the target of the herbicide glyphosate. In most plants and prokaryotes, the EPSPS protein consists of a single domain, whereas in fungi, it contains several EPSPS-associated domains. Here, we perform a comprehensive analysis of 390 fungal EPSPS proteins to determine the distribution and the evolution of the EPSPS-associated domains. The results of this study will be useful to determine the potential differential impact of glyphosate on alternative domain architectures in fungi.


Introduction
Glyphosate, the most used herbicide against weeds, and glyphosate-based products (GBPs) target the enzyme 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) [1,2]. EPSPS (also known as aroA) is the central enzyme in the shikimate pathway for the synthesis of three essential amino acids [3]. The enzyme is present in plants and prokaryotes as a single-domain protein, and in fungi as a multi-domain protein [4,5]. As the enzyme is not found in animals, the use of glyphosate is presumed to be safe for human health. Even if this is the case, the herbicide may still affect the biodiversity of free-living and host-associated microorganisms [6-10]. Thus, the study of the glyphosate target enzyme will provide clues about the potential effect of the herbicide. In recent studies, we analyzed the distribution of the EPSPS protein across organisms and estimated the differential sensitivity of organisms to the herbicide [7,8].
In this study, we analyze the distribution and evolution of the EPSPS and EPSPS-associated domains in fungi. In plants and prokaryotes, EPSPS has a single domain; thus, the potential sensitivity to glyphosate depends on the type of amino acids in the EPSPS protein, in addition to levels of gene expression and other cell factors [4,11-13]. However, the effect of the herbicide on the multi-domain structure of EPSPS in fungi is still unclear. This survey of the EPSPS and EPSPS-associated domains will be helpful to determine the potential effect of glyphosate on fungal species.

Material and Methods
We obtained protein data of the EPSPS and EPSPS-associated domains from the Pfam database (http://pfam.xfam.org (accessed on 1 September 2019)) [14]. First, we analyzed the frequencies of EPSPS-associated domains across 390 fungal species, and then we summarized the main domain functions in a Venn diagram. We also analyzed the distribution of the associated domains by taxonomic group in fungi. A phylogenetic tree of the EPSPS domain was built with the program FastTree2 [15] after aligning the protein sequences with MUSCLE [16] and trimming the alignment with Gblocks [17]. The program Cytoscape [18] was used to reconstruct a bipartite network of protein domains and fungal species. The resultant network was used to visualize the presence and absence of domains and the distribution of the different architectures of the EPSPS multi-domain. The evolution of the EPSPS-associated domains in fungi was analyzed by Dollo parsimony with the program Count [19].
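As an illustration of the domain-frequency count and the bipartite species-domain network described above, the following is a minimal sketch, not the pipeline used in the study. The species names and architectures are hypothetical toy data, the domain labels are assumed spellings of the domains named in the text, and the function names are ours. The edge list is written in Simple Interaction Format (SIF), a plain-text format that Cytoscape can import.

```python
from collections import Counter

# Hypothetical toy input (not the study's data): species -> domains found
# in its EPSPS protein. Domain labels are assumed spellings.
architectures = {
    "species_A": ["DHQ_synthase", "EPSPS", "SKI", "DHquinase_I",
                  "Shikimate_dh_N", "Shikimate_DH"],
    "species_B": ["DHQ_synthase", "EPSPS", "SKI", "DHquinase_I",
                  "Shikimate_dh_N"],
    "species_C": ["EPSPS", "HTH_3"],
}


def domain_frequencies(archs):
    """Count in how many species each domain occurs (presence/absence only)."""
    counts = Counter()
    for domains in archs.values():
        counts.update(set(domains))  # set(): ignore repeated copies
    return counts


def bipartite_edges(archs):
    """Return sorted (species, domain) pairs: one edge per domain present."""
    return sorted((sp, d) for sp, domains in archs.items() for d in set(domains))


def to_sif(edges, interaction="has_domain"):
    """Format edges as tab-separated SIF lines: source, interaction, target."""
    return "\n".join(f"{sp}\t{interaction}\t{d}" for sp, d in edges)


freqs = domain_frequencies(architectures)
edges = bipartite_edges(architectures)
sif_text = to_sif(edges)
```

Writing `sif_text` to a file with the `.sif` extension gives a network in which every edge joins a species node to a domain node, which is what makes the network bipartite.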

Functional and Domain Characterization
The EPSPS domain is approximately 450 amino acids long (~1350 nucleotides) and is present (by definition) in all EPSPS proteins [7]. The EPSPS-associated domains can be classified into four partially overlapping groups (Figure 1): shikimate (shikimate pathway proteins), enzymes (proteins with catalytic function), expression (domains whose products are involved in controlling gene expression) and structural function (proteins that do not have a catalytic function). EPSPS is a transferase of alkyl or aryl groups other than methyl groups: it transfers the enolpyruvyl group from phosphoenolpyruvate to 3-phosphoshikimate [20]. The analysis of the EPSPS-associated domains across fungal species shows that they are mostly involved in the shikimate pathway (e.g., SKI, DHQ_synthase, DHquinase_I) and in the synthesis of aromatic amino acids (e.g., Shikimate_dh_N, PDH). There are also some promiscuous domains (e.g., HTH_3) associated with EPSPS in some fungal species. The least frequent domains, mainly involved in DNA modification and gene expression, are widely distributed across fungal species.
The bipartite network (Figure 2) connects fungal species, grouped by taxon (Ascomycota on the left; Basidiomycota, Mucoromycota, Chytridiomycota, Blastocladiomycota, Zoopagomycota and unknown on the right), to the domains they contain. EPSPS and the most common EPSPS-associated domains are located in the center of the figure, in the same order as in the domain architecture. Other domains were placed on the outside of the network. Ascomycota seems to be the most variable phylum in terms of EPSPS-associated domains, and contains several infrequent domains. It is also the phylum with the fewest shikimate DH domains; for example, the domain is infrequent in Eurotiomycetes (APE), Dothideomycetes (APD) and Leotiomycetes (APL). The majority of multi-domain architectures (n = 220) are five domains long, with all five domains involved in the shikimate pathway, and approximately one third of the sequences (n = 134) have all six domains of the shikimate pathway. Overall, the bipartite network shows that the EPSPS domain in fungi is mostly associated with other domains of the shikimate pathway. However, there are a few exceptions (e.g., Dothideomycetes).
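The tallies of architecture length reported above can be reproduced from a species-to-domains table along the following lines. This is a sketch under stated assumptions: the toy architectures are hypothetical, the domain labels are assumed spellings of the six most abundant domains, and the denominator of 390 assumes that the reported counts refer to the full set of 390 proteins.

```python
from collections import Counter

# Assumed labels for the six most abundant domains named in the text.
CORE = frozenset({"DHQ_synthase", "EPSPS", "SKI", "DHquinase_I",
                  "Shikimate_dh_N", "Shikimate_DH"})

# Hypothetical toy architectures (not the study's data).
toy = {
    "sp1": CORE,
    "sp2": CORE - {"Shikimate_DH"},
    "sp3": CORE - {"Shikimate_DH"},
    "sp4": {"EPSPS", "HTH_3"},
}


def size_distribution(archs):
    """Map number of distinct domains -> number of architectures of that size."""
    return Counter(len(set(domains)) for domains in archs.values())


def count_with_all(archs, required):
    """Count architectures that contain every domain in `required`."""
    return sum(1 for domains in archs.values() if set(required) <= set(domains))


sizes = size_distribution(toy)
n_full = count_with_all(toy, CORE)

# Shares implied by the counts reported in the text, assuming 390 proteins.
share_five = 220 / 390  # five-domain architectures
share_six = 134 / 390   # architectures with all six shikimate domains
```

Under that assumption, `share_six` is about 0.34, in line with the "approximately one third" stated in the text, and `share_five` is about 0.56.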

Maximum Parsimony Analysis of EPSPS-Associated Domains
Given that the shikimate DH domain is found in most species, it is likely that it was present in the common ancestor of fungi and lost in some lineages. Moreover, rarer domains are likely late inclusions in the multi-domain structure. We analyzed the evolution of all EPSPS-associated domains with Dollo parsimony (Figure 3). The shikimate DH domain has been lost multiple times in different branches of the phylogenetic tree (Figure 3). In addition, many domains have been lost during the evolution of fungi; some new domains have been acquired recently, and independently, in a few species, while others were acquired in internal lineages and subsequently lost in a few branches. Note that our analysis does not take domain duplications into account. The conclusion from this Dollo parsimony analysis is that a six-domain structure (i.e., a protein sequence with all six most abundant domains) was the original EPSPS sequence in fungi (Table 1). Moreover, we hypothesize that the five-domain sequences originated by loss of shikimate DH independently in different branches of the evolutionary tree. These losses occurred at early and late stages in the evolution of fungi. In some EPSPS proteins, the shikimate DH domain (located at the C-terminus of the protein) is truncated. It is possible that this domain was lost in a crossing-over event without affecting the shikimate pathway; however, we do not know whether the fragment is found in any other part of the genome. Additionally, smaller multi-domains originated from the loss of other shikimate domains, usually in higher branches. The inclusion of rare domains also occurred in higher branches.
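To make the logic of this analysis concrete, the sketch below applies the Dollo principle to a single presence/absence character on a rooted tree: the domain is gained once, at the most recent common ancestor of all species that carry it, and every absence below that node is explained by a loss. It is a minimal illustration and not the algorithm implemented in the program Count; the tree, the leaf names and the function names are hypothetical.

```python
# Hypothetical rooted tree: internal node -> list of children.
# Nodes that never appear as keys are leaves (species).
tree = {
    "root": ["n1", "n2"],
    "n1": ["A", "B"],
    "n2": ["C", "n3"],
    "n3": ["D", "E"],
}


def leaves_below(tree, node):
    """Return the set of leaves in the subtree rooted at `node`."""
    children = tree.get(node, [])
    if not children:
        return {node}
    found = set()
    for child in children:
        found |= leaves_below(tree, child)
    return found


def find_root(tree):
    """Return the only node that is not a child of any other node."""
    all_children = {c for kids in tree.values() for c in kids}
    roots = [n for n in tree if n not in all_children]
    if len(roots) != 1:
        raise ValueError("tree must have exactly one root")
    return roots[0]


def dollo_reconstruct(tree, present):
    """Return (gain_node, loss_edges) for one domain under the Dollo principle.

    `present` is the set of leaves that carry the domain. The single gain is
    placed at the most recent common ancestor of those leaves; a loss is
    recorded on each edge leading to a subtree with no carrier.
    """
    present = set(present)
    if not present:
        return None, []
    # Walk down from the root while only one child lineage carries the domain.
    node = find_root(tree)
    while True:
        carriers = [c for c in tree.get(node, [])
                    if leaves_below(tree, c) & present]
        if len(carriers) != 1:
            break
        node = carriers[0]
    gain = node
    # Below the gain, every edge into a carrier-free subtree is one loss.
    losses = []
    stack = [gain]
    while stack:
        parent = stack.pop()
        for child in tree.get(parent, []):
            if leaves_below(tree, child) & present:
                stack.append(child)
            else:
                losses.append((parent, child))
    return gain, sorted(losses)


gain, losses = dollo_reconstruct(tree, {"A", "C", "D"})
```

In this toy case the domain is present in A, C and D, so the gain is placed at the root and two independent losses are inferred, on the branches leading to B and E. A domain found in a single species is instead placed as a recent gain on that species' own branch, with no losses.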

Conclusions
The EPSPS multi-domain structure in fungi ranges from two to eight domains. An evolutionary analysis shows that the ancestral state of the EPSPS protein included six domains (DHQ synthase, EPSPS, SKI, DHquinase I, shikimate DH N and shikimate DH). The current multi-domain structure of EPSPS in fungi is the product of multiple independent domain gains and losses throughout evolution. Further analyses should determine the influence of the EPSPS-associated domains on the sensitivity or resistance of the EPSPS enzyme to glyphosate.
Funding: This research was supported by funds from the Turku Collegium for Science and Medicine (PP).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Protein sequence data were obtained from the publicly accessible repository http://pfam.xfam.org.