The Proteogenome of Symbiotic Frankia alni in Alnus glutinosa Nodules

Omics are the most promising approaches for investigating microbes for which no genetic tools exist, such as the nitrogen-fixing symbiont Frankia. A proteogenomic analysis of symbiotic Frankia alni was carried out by comparing the proteins that were more or less abundant in Alnus glutinosa nodules relative to N-replete pure cultures with propionate as the carbon source and ammonium as the nitrogen source. There were 250 proteins that were significantly overabundant in nodules at a fold change (FC) ≥ 2 threshold, and 1429 with the same characteristics in in vitro nitrogen-replete pure culture. Nitrogenase, Suf (Fe–S cluster biogenesis) and hopanoid lipid synthesis determinants were the most overabundant proteins in symbiosis. Nitrogenase was found to constitute 3% of all Frankia proteins in nodules. Superoxide dismutase (Sod) was overabundant, indicating continued oxidative stress, while catalases (Kat) were not. Several transporters were overabundant, including one for dicarboxylates and one for branched amino acids. The present results confirm the centrality of nitrogenase in the actinorhizal symbiosis.


Introduction
Frankia is a genus containing soil actinobacteria that establish a nitrogen-fixing root nodule symbiosis with 24 genera and hundreds of species of dicotyledons [1]. The best known of these is Alnus glutinosa L., the designated type, which is representative of the genus both for morphological characters and evolutionary rate [2].
Hundreds of Frankia strains have been isolated from nodules of the different species and classified according to their physiology [3], host specificity [4] or phylogeny [5]. More recently, many species have been described, the first of which was Frankia alni [6], which can nodulate Alnus spp. and Myrica spp. Its genome has been deciphered and compared to those of other species [5] and found to contain no symbiosis island, with all genes related to symbiosis present in several unlinked clusters [7].
This symbiosis between F. alni and Alnus spp. is initiated by Ca-spiking [8], followed by the synthesis of a root-hair-deforming factor [9] that allows root-hair deformation, entrapment of hyphae and penetration into the cortical cells [10], and then by the formation of a pre-nodule and the emergence of a mature nitrogen-fixing nodule. On the plant side, the mechanisms involved appear similar to those present in legumes, with several elements of the common symbiotic cascade involved, such as the nuclear transcription factor NIN (nodule inception) [11] and the SymRK (symbiosis receptor kinase) [12]. The genomics of symbiotic and non-symbiotic plant phylogenetic neighbors has shown that the common symbiotic determinants were present in all leguminous and actinorhizal lineages and that loss of symbiotic capacity was accompanied by the loss of NIN and RPG (rhizobium-directed polar growth) [13].
On the microbial side, less is known since no genetic tools could be developed despite several attempts, hampering mutagenesis and complementation studies. The genomes revealed no canonical nod genes [5] except in two cl2 strains [14,15] and one cl3 strain [16], where two gene clusters are found, one with nodAB1 and the other with nodB2CIJ; however, these were not detected by proteomics of F. soli exposed to root exudates from compatible Elaeagnus angustifolia and incompatible hosts [17], even though the mRNAs had been detected in Datisca nodules [14]. Transcriptomics of Frankia in Alnus nodules [7] has identified nif (nitrogenase), hop (hopanoid synthesis), suf (FeS cluster synthesis) and hup (uptake hydrogenase) as upregulated but has shed no light on the molecular dialogue between the two partners. The determinants and structure of the root-hair-deforming factor [10] thus remain elusive. We also know through omics of early rhizospheric interactions [18] that a cellulase/cellulose synthase cluster is upregulated even though F. alni cannot grow using glucose as a carbon source, which suggests a local weakening of the hair cell wall and a hardening of the hyphal tip to facilitate entry into root tissues. We also know that the auxin phenylacetic acid (PAA) is synthesized in nodules and in pure culture; when applied onto roots at 10⁻⁵ M, this auxin causes the emergence of stunted, swollen secondary roots that are similar to nodules [19]. It is known that the plant synthesizes peptides that bind to vesicles and modify their porosity [20], and a non-comparative survey of field nodules has shown nitrogenase (the protein that reduces dinitrogen into ammonium) and tricarboxylic acid (TCA) cycle proteins to be abundant [21].
Other in vitro proteomic studies of Frankia have been carried out with or without ammonium [22] or under osmotic stress [23] and have shown the involvement of the TCA cycle, stress determinants and various regulators.
We undertook the present study to characterize the proteomic response of Frankia alni as it forms mature 21 dpi nodules on A. glutinosa roots.

Plant and Bacterial Material
Frankia alni strain ACN14a [24] cells were maintained in BAP medium [25] with 5 mM propionate as the carbon source and 5 mM NH₄⁺ as the nitrogen source, buffered to pH 6.5. Alnus glutinosa seeds were harvested from a tree growing in Lyon, France, used previously [7]. They were grown as before with some modifications: seedlings were transferred to Fahraeus's solution [26] in opaque plastic pots (eight seedlings/pot) and grown for four weeks with 0.5 g·L⁻¹ KNO₃, followed by one week without KNO₃ before inoculation with F. alni [24]. To inoculate seedlings, Frankia cells were grown in BAP-PCM medium (4 × 250 mL) until log-phase [27]. Cells were collected by centrifugation, washed twice with sterile ultra-pure water and resuspended in 500 mL of Fahraeus's solution without KNO₃. Cell cultures were homogenized by syringing through a 21G needle. The Frankia cell suspensions were applied onto plant roots (symbiotic condition). After 21 days, mature nodules were harvested and ground in liquid nitrogen.
As a reference, F. alni cells were inoculated after syringing with a series of needles (21G, 23G, 25G, 27G) and grown for 10 days (corresponding to the end of the exponential phase) in 250 mL of BAP medium with ammonium (5 mM) in agitated 500 mL Erlenmeyer flasks [25] buffered to pH 6.5. No vesicles could be found.

Proteome Characterization
Each sample was dissolved in LDS 1X buffer (Invitrogen, Carlsbad, CA, USA) with 100 µL of LDS 1X per 30 mg of pellet. The solutions were warmed at 99 °C for 5 min and subjected to sonication in an ultrasonic bath for 5 min. Each sample was transferred into a 2 mL Precellys (Bertin Technologies, Montigny-le-Bretonneux, France) tube containing 200 mg of glass beads and subjected to three cycles of grinding for 20 s, followed by 30 s pauses. Samples were centrifuged for 40 s at 16,000 × g. The resulting supernatants were transferred into Eppendorf tubes and heated for 10 min at 99 °C. Samples were subjected to a short SDS-PAGE migration and processed as previously described [28]. Briefly, the whole proteome was extracted as a single polyacrylamide band, reduced with dithiothreitol, treated with iodoacetamide and proteolyzed with sequencing-grade trypsin (Roche, Basel, Switzerland) in the presence of 0.01% of ProteaseMAX detergent (Promega, Madison, WI, USA). The resulting peptides were analyzed with an ESI-Q Exactive HF mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) coupled to an Ultimate 3000 RSL Nano LC System (Thermo). A volume of 10 µL of peptides was injected onto a reverse-phase Acclaim PepMap 100 C18 column (3 µm, 100 Å, 75 µm id × 500 mm) and resolved at a flow rate of 0.2 µL/min with a 60 min gradient of CH₃CN (2.5% to 40%) in the presence of 0.1% HCOOH. The tandem mass spectrometer was operated with a Top20 strategy in data-dependent mode. Only peptide molecular ions with double or triple positive charges were selected for fragmentation with a dynamic exclusion of 10 s as previously described [29]. Tandem mass spectrometry spectra were interpreted using the MASCOT 2.2.04 software (Matrix Science, London, UK) with standard parameters.
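The trypsin proteolysis step can be illustrated with a minimal in silico digestion. This is only a sketch: it assumes the usual trypsin rule (cleavage after K or R, except before P) and no missed cleavages. The function names `tryptic_digest` and `detectable`, the toy sequence and the 7 to 30 residue length window are all hypothetical choices made for illustration, not parameters taken from this study.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end with K or R
        peptides.append(sequence[start:])
    return peptides


def detectable(peptides, min_len=7, max_len=30):
    """Keep peptides inside an ASSUMED detectable length window (illustrative only)."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Hypothetical toy sequence, not a Frankia protein.
toy_protein = "MAKRPGKLLEDVTAAGR"
peptides = tryptic_digest(toy_protein)
print(peptides)
print(detectable(peptides))
```

On the toy sequence, only the 10-residue peptide falls inside the assumed window, which illustrates how proteins with few or unevenly spaced lysine and arginine residues can yield few observable peptides.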
Proteins were quantified based on their spectral counts, and their abundances were normalized with total spectral counts of all proteins identified as belonging to Frankia for comparing distinct conditions. Proteome comparison between conditions was done according to the TFold module from the PatternLab software (www.patternlabforproteomics.org/; last access 7 January 2022).
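The normalization and comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the PatternLab TFold procedure: it only normalizes each replicate by its total Frankia spectral count and takes the ratio of the mean normalized abundances, without the statistical testing that TFold performs. The function names and the spectral counts are hypothetical.

```python
import math


def normalize(counts):
    """Divide each protein's spectral count by the replicate's total Frankia count."""
    total = sum(counts.values())
    return {protein: n / total for protein, n in counts.items()}


def mean_abundance(replicates):
    """Average the normalized abundance of each protein over the replicates."""
    normalized = [normalize(rep) for rep in replicates]
    proteins = set().union(*normalized)
    return {
        p: sum(rep.get(p, 0.0) for rep in normalized) / len(normalized)
        for p in proteins
    }


def fold_changes(nodule_reps, culture_reps):
    """Fold change = mean nodule abundance / mean pure-culture abundance."""
    nodule = mean_abundance(nodule_reps)
    culture = mean_abundance(culture_reps)
    result = {}
    for p in set(nodule) | set(culture):
        n, c = nodule.get(p, 0.0), culture.get(p, 0.0)
        # A protein absent from the reference has no finite fold change.
        result[p] = math.inf if c == 0 else n / c
    return result


# Hypothetical spectral counts (one replicate per condition for brevity).
nodule_reps = [{"NifH": 30, "KatA": 10, "Other": 60}]
culture_reps = [{"NifH": 1, "KatA": 20, "Other": 79}]
for protein, fc in sorted(fold_changes(nodule_reps, culture_reps).items()):
    print(protein, round(fc, 2))
```

Normalizing by the Frankia total rather than by all spectra matters for nodules, where a large share of the spectra comes from the plant.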

Proteome Data
The mass spectrometry proteomics data were deposited in the ProteomeXchange Consortium via the PRIDE partner repository (www.ebi.ac.uk/pride/archive/; last access 16 December 2021) with the dataset identifier PXD030468 and Project DOI 10.6019/PXD030468.

Results
The three biological replicates of symbiotic Frankia alni overproduced 250 proteins at a fold change of ≥2 (Supplementary Table S1) using a nitrogen-replete propionate-fed pure culture as reference, of which 100 had an FC ≥ 4.38 (Table 1). These 100 proteins, which mark the specificity of the symbiotic state, account for 17.9% of the total Frankia proteins in nodules, based on their cumulated normalized spectral abundance factors (NSAF). Conversely, there were 1489 under-produced proteins at an FC ≤ 0.5. NifH (FRAAL6813) was the most overabundant protein in nodules, with an FC of 291.3. The most underabundant was a cAMP-binding membrane-bound transcriptional regulator (FRAAL6506) at an FC of 0.01 (Supplementary Table S1), in synton (set of genes with a conserved order) with a geosmin synthase gene. There were 2983 proteins detected that belonged to Frankia alni. There were also 68 overabundant proteins in nodules and 30 overabundant ones in pure cultures that did not meet the p-value criterion (≤0.05). There were 729 proteins that were not certified by mass spectrometry with the validation of at least two distinct peptides in this dataset. The overabundant protein coding genes were scattered around the genome (Figure 1).
Table 1. List of the 100 most over-abundant Frankia proteins in nodules with the NCBI accession, the FRAAL Id, the gene name, the protein name, the fold change, the COG (clusters of orthologous groups), the distribution among phylogenetic clusters and the NSAF (normalized spectral abundance factor). The COGs are according to [30]. Distribution codes: A, present in all Frankia and in other actinomycetes; F, present in all Frankia strains; f, present in some Frankia strains; S, present in symbiotic Frankia strains (C1, C1c, C2 and C3 but not C4). C1, C1c, C2, C3 and C4 indicate presence in that cluster based on a threshold of 35% identity as seen on MaGe [31]. Hypothetical genes have been removed from the list; they are listed in Table S1.
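The cumulated NSAF share quoted above can be illustrated with a short sketch. It assumes the usual definition of the normalized spectral abundance factor, in which each protein's spectral count is divided by its length and the result is divided by the sum of these ratios over all proteins; the text does not spell out the formula, so this is an assumption. The function names, protein names, counts and lengths are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j), the usual definition (assumed here)."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def cumulative_share(nsaf_values, subset):
    """Percentage of the total protein abundance carried by a subset of proteins."""
    return 100.0 * sum(nsaf_values[p] for p in subset)


# Hypothetical spectral counts and protein lengths (in residues).
counts = {"ProtA": 100, "ProtB": 50, "ProtC": 50}
lengths = {"ProtA": 200, "ProtB": 100, "ProtC": 500}

values = nsaf(counts, lengths)
print({p: round(v, 3) for p, v in values.items()})
print(round(cumulative_share(values, ["ProtA", "ProtB"]), 1))
```

Dividing by protein length keeps long proteins, which generate more peptides, from appearing more abundant than they are.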

[Table 1, partial row: P_011601417. | FRAAL0156 | gdhB | Glutamate dehydrogenase GdhB | FC 4.47 | COG E]
Among F. alni proteins, the nitrogenase proteins were the most overabundant [...] among the 10 highest using as reference a nitrogen-fixing pure culture.
A COG analysis (Figure 2) revealed that "C" (energy production) and "J" (translation) were the most represented in the nodule overabundant proteins, while "E" (amino acid transport and metabolism) and "K" (transcription) were the most represented in the underabundant proteins. Many COGs were very rare in the nodule (N, cell motility; U, intracellular trafficking; V, defense mechanisms and Q, secondary metabolites). A schematic of the major overabundant proteins in nodules shows nitrogenase together with other energy-generating, transporter and stress-coping determinants (Figure 3).

Discussion
Proteomics has the potential to decipher the physiological changes occurring in cells upon ecological transitions [32]. Establishment of the actinorhizal symbiosis is a drastic change for both partners that cannot be analyzed through negative genetics. Omics thus offer an interesting approach to pinpoint the key molecular players from both organisms. Proteomics in particular has been used to study Frankia either in early interaction with host plants or in the mature field nodules of a range of actinorhizal plants [21]. Early-interaction studies showed, among others, cellulose synthase and a potassium transporter over-detected upon contact with Alnus [18]; studies with Elaeagnus angustifolia, Ceanothus thyrsiflorus and Coriaria myrtifolia [17] showed nitrogen fixation and assimilation proteins over-detected upon contact with Elaeagnus, as well as stress response and respiration proteins.
Nitrogenase is the most abundant protein complex in symbiotic Frankia alni. The corresponding gene cluster comprises 17 genes that had previously been shown to be the most transcribed in mature nodules [7]. The present study confirms our view that symbiotic Frankia is essentially a nitrogenase machine. Associated proteins were seen here, such as the Suf Fe–S cluster assembly proteins and the hopanoid biosynthesis proteins (which protect nitrogenase from oxygen), but not the hydrogenase proteins, even though their genes were among those highly expressed [33]. Frankia hydrogenase proteins seen previously in field nodules [21] were not detected.
The high overabundance of 2-oxoglutarate ferredoxin oxidoreductase (KorAB) supports the role of 2-oxoglutarate as a primary electron source for nitrogenase, a role initially hypothesized from the position of the korAB genes next to the nif genes [21]. This position and the duplication of korAB in symbiotic strains only are probably the result of a specialization of one copy for symbiosis (FRAAL6798-6790, FC = 91.5) and of the other for saprophytic life (FRAAL1050-1051, FC = 0.80), as seen before for Hup [34].
Nitrogen fixation is an energy-demanding process that consumes eight ATP molecules per molecule of NH₄⁺ produced. It is thus expected that the TCA cycle will be running at full force, transforming the photosynthates given by the plant into energy. The transcriptome of symbiotic F. alni was seen to contain several TCA genes upregulated at a high level [7], as was the case for the transcriptome of F. soli upon early contact [17].
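The figure of eight ATP per ammonium is consistent with the commonly cited overall stoichiometry of molybdenum nitrogenase, given here as a sketch for ideal conditions (actual costs in vivo may be higher):

```latex
\[
\mathrm{N_2} + 8\,\mathrm{H^+} + 8\,e^- + 16\,\mathrm{ATP}
\longrightarrow 2\,\mathrm{NH_3} + \mathrm{H_2} + 16\,\mathrm{ADP} + 16\,\mathrm{P_i}
\]
\[
\frac{16\ \mathrm{ATP}}{2\ \mathrm{NH_3}} = 8\ \mathrm{ATP}\ \text{per}\ \mathrm{NH_3}
\]
```

The obligate release of one H₂ per N₂ reduced is also why an uptake hydrogenase is of interest in nitrogen-fixing cells.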
There is an important specialization in nodules, with only a twentieth of the genome identified (250 proteins with an FC ≥ 2), relative to a pure culture, where five times as many proteins were identified at the same threshold. A similar specialization was noted in Sinorhizobium meliloti nodules [35].
The oxidative stress proteins catalase and superoxide dismutase are essential for the nodulation of Sinorhizobium [36] to cope with the oxidative burst induced as an early plant defense response against avirulent pathogens [37]. Nitroreductases play a similar role in Sinorhizobium [38]. In actinorhizal nodules, there is also an oxidative burst with which Frankia must cope [39]. Pure cultures of F. alni have been shown to have a basal expression of their two catalases, which are then markedly upregulated upon contact with H₂O₂ or with methyl viologen [40]. However, it appears that this upregulation is not maintained in mature nodules.
GABA is considered a nutrient as well as an effector to trigger plant responses to heat, salt, herbivory and other stresses [41]. GABA has been detected in the metabolome of Alnus and Casuarina nodules and roots at very high levels [42,43]. GABA was also seen to improve nitrogenase and respiration in pure culture [43]. In leguminous nodules, GABA has been suggested to be a nutrient fed to the symbiont since it accumulated labeled ¹⁵N₂ and contributed to the N-nutrition [44]. In Alnus, labeled ¹⁵NH₄⁺ was recovered as alanine (Ala), γ-amino butyrate (GABA), glutamine (Gln), glutamate (Glu), citrulline (Cit) and arginine (Arg) [45]. GABA was late to appear but then continued at a fast pace. Since GABA synthesis involves the utilization of protons and releases CO₂, it has been suggested as a means to reduce acidity [46].
The photosynthates fed by the plant to the bacterium have been the subject of several hypotheses. It has long been known that Alnus-infective strains could not use sugars [3], which is why organic acids are routinely used to grow most strains. The identification of a dicarboxylate transporter among those genes specific to nodules [47] indicated that dicarboxylates were good candidates. Several of these were measured in nodules and roots, and fumarate, succinate, malate and 2-oxoglutarate were found to be overabundant in nodules, although this approach is biased by the presence of mitochondria, where citrate is also very abundant even though it is toxic to Frankia [48]. The overabundance in the present study of the C4-dicarboxylate transporter FRAAL1390 confirms that dicarboxylates play a role in trophic exchanges between symbionts, although its modest overabundance (FC = 3.42) points either to other transporters or to other photosynthates being involved.
Chaperones such as GroEL have been shown to be essential to nodulation in Sinorhizobium meliloti [49]. They are also involved in response to other stresses such as heat [50] or salt [51] or simply for the assembly of very complex proteins such as nitrogenase [52]. Such chaperones were previously identified in the transcriptome of symbiotic Frankia alni [7].
Proteomics complements the picture provided by transcriptomics. The two approaches have different strengths and weaknesses. Shotgun proteomics carried out in the standard mode (i.e., based on trypsin proteolysis) may miss small proteins, membrane proteins and polypeptides with rare combinations of arginine and lysine residues [53]. Transcriptomics misses unstable messengers or those with secondary structures. Furthermore, proteomics and transcriptomics address molecular entities with different half-lives and dynamics and are thus highly complementary. The present study reinforces the picture of a microbe highly specialized for the energy-demanding nitrogenase and the accompanying stresses. It sheds no light, however, on the molecular processes involved in communication between the two partners.
Supplementary Materials: The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/microorganisms10030651/s1: Table S1: List of Frankia proteins identified in nodules and in a nitrogen-replete pure culture and their spectral counts.
Author Contributions: P.P., N.A. and P.N. conceived the experiment; P.P., N.A. and P.F. grew the plant and the bacterium; G.M. and J.A. extracted the proteins and ran the proteomic analysis; D.A. and P.N. did the analyses and P.P. and P.N. wrote the paper. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The mass spectrometry proteomics data were deposited in the ProteomeXchange Consortium via the PRIDE partner repository (www.ebi.ac.uk/pride/archive/; last access 16 December 2021) with the dataset identifier PXD030468 and Project DOI 10.6019/PXD030468.