Differentiation of Closely Related Oak-Associated Gram-Negative Bacteria by Label-Free Surface Enhanced Raman Spectroscopy (SERS)

Due to the harmful effects of chemical fertilizers and pesticides, an eco-friendly solution for improving soil fertility has become a necessity, and microbial biofertilizer research is therefore on the rise. Plant endophytic bacteria inhabiting internal tissues represent a novel niche for research into new biofertilizer strains. However, the number of species and strains that need to be differentiated and identified to facilitate faster screening in future plant-bacteria interaction studies is enormous. Surface enhanced Raman spectroscopy (SERS) may provide a platform for bacterial discrimination and identification that, compared with traditional methods, is relatively rapid, uncomplicated and highly specific. In this study, we attempted to differentiate 18 bacterial isolates from two oaks via morphological, physiological and biochemical tests and SERS spectra analysis. Previous 16S rRNA gene fragment sequencing showed that three isolates belong to the genus Paenibacillus, three to Pantoea and 12 to Pseudomonas. Additional tests were not able to further sort these bacteria into strain-specific groups. However, the obtained label-free SERS bacterial spectra, along with high-accuracy principal component analysis (PCA) and discriminant function analysis (DFA), demonstrated the possibility of differentiating these bacteria into variant strains. Furthermore, we collected information about the biochemical characteristics of selected isolates. The results of this study suggest a promising application of SERS in combination with PCA/DFA as a rapid, inexpensive and sensitive method for the detection and identification of plant-associated bacteria.

Vibrational spectroscopy is a technique that has been used for the analysis of various chemicals and in recent years has been successfully adapted for microbial research, showing great promise of becoming a novel diagnostic system in this field [4-6,8,9,15-18].
This technique stems from the fact that, under excitation by light, analyte molecules experience observable photon scattering. Raman scattering occurs when the energy of the scattered photon differs from the excitation energy after interaction with the analyte molecule. Due to the low emission rate of scattered photons, an integration time of minutes
Microorganisms 2021, 9, 1969

Materials and Methods
Eighteen bacterial samples isolated from two English oaks (Quercus robur) were chosen for this study from our previously created library [35]: 12 isolates from oak α and six from oak β. Both trees are from the same site, located in Lithuania (Table 1).

Morphological, Physiological and Biochemical Analysis
Morphological, physiological and biochemical tests were performed in triplicate, each time using fresh colonies grown on low-salt lysogeny broth (LB) agar medium (pH 7.2 throughout the experiments) (Duchefa Biochemie, Haarlem, the Netherlands). Bacteria were grown at 25 °C. All media were autoclaved at 121 °C for 15 min prior to use. Aseptic techniques were employed throughout the experiments.

Colony Morphology
Colony morphology was observed. Colony form, elevation, margin, color, opacity, smoothness, consistency and overall appearance on LB medium after 2 days of incubation were determined. Additionally, bacteria samples from overnight liquid LB cultures were visualized using 0.1% Gentian violet dye under 10,000× magnification. Bacteria shape, arrangement and size (average from three biological replicates and 10 technical replicates each) were observed.

Biofilm Formation
Bacterial ability to form biofilms was tested using a modified tissue culture plate method [43]. Bacteria were grown overnight in liquid LB. The next day, 2 µL of this suspension was pipetted into a sterile flat-bottomed 96-well polystyrene tissue culture plate, and each well was then filled with 198 µL of LB medium supplemented with 1% glucose. As a control, 200 µL of LB medium supplemented with 1% glucose was used. The plate was incubated overnight. After incubation, the plate was washed three times, each time in a new container of sterile water, and then left to air dry. Subsequently, the biofilm layer was dyed using 0.1% Gentian violet solution for 15 min. Afterward, the plate was washed and dried as previously described. After the fixation step, the biofilm layer was solubilized in ethanol (95%) for 30 min. Optical density (OD) was measured at 630 nm using a Synergy HT Multi-Mode Microplate Reader (Biotek Instruments Inc., Bad Friedrichshall, Germany), with 95% ethanol as control. The optical density cut-off (ODc) was calculated as ODc = average OD of control + 3 × the standard deviation of control. Biofilm formation capabilities were evaluated as follows: weak biofilm, approximately ODc; moderate biofilm, 2-4 × ODc; strong biofilm, more than 4 × ODc.
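The cut-off rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names are hypothetical, the sample standard deviation is assumed, and the exact class boundaries (no biofilm at or below ODc, weak between ODc and 2 × ODc) are an assumption, since the text only states "approximately ODc", "2-4 × ODc" and "more than 4 × ODc".

```python
from statistics import mean, stdev


def od_cutoff(control_ods):
    """ODc = mean OD of the control wells + 3 * their standard deviation.

    Assumption: the sample standard deviation is used (the source does not say).
    """
    return mean(control_ods) + 3 * stdev(control_ods)


def classify_biofilm(od, odc):
    """Assign a biofilm class from an OD reading and the cut-off ODc.

    Assumed boundaries (not spelled out in the source):
    od <= ODc -> none, <= 2*ODc -> weak, <= 4*ODc -> moderate, else strong.
    """
    if od <= odc:
        return "none"
    if od <= 2 * odc:
        return "weak"
    if od <= 4 * odc:
        return "moderate"
    return "strong"


# Hypothetical control readings, for illustration only.
controls = [0.10, 0.12, 0.11]
odc = od_cutoff(controls)
labels = [classify_biofilm(od, odc) for od in (0.10, 0.20, 0.40, 0.60)]
```

With these made-up control wells the cut-off is about 0.14, so readings of 0.20, 0.40 and 0.60 fall into the weak, moderate and strong classes, respectively.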

Carbohydrate Use
A modified phenol red test was used to determine which carbohydrates these isolates could use as a carbon source, and how [44]. Lactose (L), fructose (F) (Merck, Darmstadt, Germany), maltose (M) (Avantor, Radnor, PA, USA), sucrose (Su) and glucose (G) (Duchefa Biochemie) were tested. Liquid LB medium supplemented with 1% of one selected carbohydrate per tube and 0.0018% phenol red dye (Merck) was used. Carbohydrate solutions were filter sterilized and added to the medium after autoclaving. To check for gas production, an upside-down Durham tube was placed in each test tube. The tubes were then inoculated, gently mixed and incubated overnight in a stationary position. This allowed for the positive identification of isolates capable of anaerobically fermenting the tested carbohydrates. To discern whether the bacteria were capable of aerobic use of carbohydrates, samples were also placed in a thermal shaker overnight. In both cases, color changes from red to yellow were observed. Bubbles in the Durham tubes were indicative of gas production, and color changes (from red to yellow) indicated a pH change due to acid production, hence the capacity for carbohydrate use.

Antibiotic Susceptibility
Bacterial susceptibility to various antibiotics was determined by using a modified Kirby-Bauer disk diffusion test [45]. Ampicillin (AM), cefotaxime (CTX), chloramphenicol (C), streptomycin (STP), ticarcillin (TIC) (Duchefa Biochemie) and kanamycin (K) (Panpharma, La Selle-en-Luitré, France) were used. Bacteria were grown overnight in liquid LB medium. The next day, the bacterial suspension was adjusted to approximately 1.5 × 10⁸ cfu/mL. The suspension was cross-streaked on Mueller-Hinton agar (Condalab, Madrid, Spain) using sterile cotton swabs. Then sterile 0.5 mm paper disks were placed on top (6 disks per Ø 9 cm plate, equally spaced). Filter-sterilized antibiotic solutions were then pipetted onto the disks so that each disk contained the desired amount of antibiotic (10 µg of AM and STP; 30 µg of CTX, C and K; 75 µg of TIC per disk). The plates were incubated overnight in the dark. Inhibition zones were measured the next day, and bacterial susceptibility was determined using antibiotic susceptibility charts [46][47][48].

SERS Analysis
We used SERS for bacterial vibrational fingerprinting. This method allowed us to sort bacterial isolates into groups using their vibrational patterns and to tentatively ascertain their molecular composition.

Experimental Set Up for SERS Spectra Acquisition
SERS spectra were recorded using a Raman spectrometer (NTEGRA Spectra, NT-MDT Inc., Moscow, Russia) in an "upright" configuration with a 532 nm laser as the excitation source. All spectra were calibrated to the first-order silicon longitudinal-optical (LO) phonon peak at 520 cm⁻¹. The laser power at the sample was 2 mW, and a 100× objective (NA: 0.7) was used. A thermoelectrically cooled (−60 °C) charge-coupled device (CCD) was used as the detector. The spectral resolution was 1.1 cm⁻¹.
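As an illustration of what calibrating to the silicon peak can mean in practice, the sketch below applies a single constant offset to the Raman shift axis. The function name and the one-point, constant-offset correction are assumptions made for this example; the source does not describe how the instrument software performs the calibration.

```python
SILICON_REFERENCE_CM1 = 520.0  # first-order silicon LO phonon peak, as stated in the text


def calibrate_axis(shifts, measured_si_peak, reference=SILICON_REFERENCE_CM1):
    """Shift every Raman shift value by the offset between the measured and
    reference silicon peak positions.

    Assumption: a single constant offset is adequate over the whole axis.
    """
    offset = reference - measured_si_peak
    return [x + offset for x in shifts]


# Hypothetical reading: the silicon peak was found at 521.5 cm^-1.
calibrated = calibrate_axis([500.0, 521.5, 1000.0], measured_si_peak=521.5)
```

Here the whole axis moves down by 1.5 cm⁻¹, so the measured silicon peak lands on 520 cm⁻¹.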

SERS Substrate Preparation
The preparation of SERS substrates was based on the direct reduction of silver ions by elemental silicon [49]. Silicon slides were cut into small pieces (1.5 × 1.5 cm). They were then polished (2 min) with glass paper to roughen the silicon surface. The slides prepared in this way, with etched 100 mm deep wells, were washed with pure ethanol, dried under nitrogen flow and kept in closed Petri dishes until use.
Pre-prepared HF (24%) and AgNO₃ (20 mM) solutions were mixed in a ratio of 1:1 v/v. Polished silicon slides were immersed in the reaction mixture for 2 s, then immediately transferred to a container with distilled water (dH₂O) and finally dried under nitrogen flow. The dried substrate slides were immediately used for SERS spectra measurements.

Bacteria Sample Preparation for SERS
Bacteria from our library were transferred into liquid LB using a plastic loop. Overnight cultures (~10⁶ cfu/mL) were centrifuged at 3500 × g and washed 3 times with 0.9% NaCl solution. After the last wash, the bacteria were placed in 200 µL of 0.9% NaCl [12,24]. Using a sterile pipette, 20 µL of the suspension was then placed on the substrate silicon slide and immediately transferred to the Raman microscope for data acquisition. It should be noted that the SERS spectra measurements were performed with the bacteria in liquid suspension and by scanning the sample, thus reducing the thermal effect of the laser on the bacteria, i.e., scanning live samples [50].

SERS Spectra Acquisition
To ensure the reproducibility of the SERS spectra, 50 spectra from each bacterium were obtained from the suspension drop on the SERS substrate. Each single spectrum was acquired as a summary spectrum by scanning a 100 × 100 µm area during the SERS spectra acquisition to optimize the Raman signal strength. The bacterial spectra dataset was collected from 50 randomly selected spots in the sample. The acquisition time of the Raman scattering signal was 20 s. From the 50 acquired spectra, 16 spectra were selected for processing based on their signal-to-noise ratio. The resulting SERS spectra were analyzed and edited using Nova 1.1.0.1840 (NT-MDT Inc., Moscow, Russia) and SpectraGryph 1.2.14 software (Dr. Friedrich Menges, Obersdorf, Germany) with cropping to 600-1800 cm⁻¹, removal of background fluorescence, normalization to the intensity of maximum amplitude and baseline correction (5% coarseness) [51].
Vibrational bands were noted (peak finding threshold: 0.5%; position tolerance: 0.4%) and tentative band assignments were made based on literature sources.
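The processing chain described above (crop, baseline removal, normalization to the maximum) can be sketched roughly as follows. This is only a simplified stand-in: the function name is hypothetical, and a straight line through the two ends of the cropped window is used as the baseline, which is not the adaptive baseline algorithm of SpectraGryph.

```python
def preprocess(shifts, intensities, lo=600.0, hi=1800.0):
    """Crop a spectrum to [lo, hi] cm^-1, subtract a baseline and normalize.

    Assumption: the baseline is a straight line through the first and last
    cropped points, a crude substitute for the software's adaptive baseline.
    """
    points = [(x, y) for x, y in zip(shifts, intensities) if lo <= x <= hi]
    if len(points) < 2:
        raise ValueError("need at least two points inside the crop window")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    # Straight-line baseline between the window ends.
    slope = (ys[-1] - ys[0]) / (xs[-1] - xs[0])
    corrected = [y - (ys[0] + slope * (x - xs[0])) for x, y in zip(xs, ys)]

    # Lift the spectrum so that its minimum is zero, then scale the maximum to 1.
    floor = min(corrected)
    corrected = [c - floor for c in corrected]
    peak = max(corrected)
    if peak == 0:
        raise ValueError("spectrum is flat after baseline subtraction")
    return xs, [c / peak for c in corrected]


# Tiny made-up spectrum: two points fall outside the 600-1800 cm^-1 window.
x_out, y_out = preprocess([500, 600, 1000, 1800, 1900], [9, 1, 5, 3, 9])
```

In this toy case only the three points at 600, 1000 and 1800 cm⁻¹ survive the crop, and the middle point becomes the normalized maximum.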

Multivariate Cluster Analyses
Bacterial differentiation was done based on cluster map methodology using PCA and DFA with Raman processing software [52] in the MATLAB (2012) environment (MathWorks, Inc., Natick, MA, USA). PCA was employed for this study to highlight the variability existing in the spectral data set. The reference spectrum for a single isolate used for DFA was produced as an average spectrum of the 16 experimentally acquired spectra. In DFA a leave-one-out cross-validation method was used.
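For readers who want to see the shape of such an analysis, the following is a minimal NumPy sketch of PCA followed by a Fisher-type discriminant function analysis with leave-one-out cross-validation. It is not the MATLAB software used in the study [52]: the function names, the nearest-centroid assignment in discriminant space, the choice to refit PCA inside every fold and the synthetic spectra are all assumptions made for illustration.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the training mean and the first n_components principal axes."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def fit_dfa(Z, y):
    """Fisher discriminant functions: maximize between-group scatter
    relative to within-group scatter of the PC scores Z."""
    classes = np.unique(y)
    grand_mean = Z.mean(axis=0)
    d = Z.shape[1]
    s_within = np.zeros((d, d))
    s_between = np.zeros((d, d))
    for c in classes:
        zc = Z[y == c]
        mu = zc.mean(axis=0)
        s_within += (zc - mu).T @ (zc - mu)
        diff = (mu - grand_mean)[:, None]
        s_between += len(zc) * (diff @ diff.T)
    evals, evecs = np.linalg.eig(np.linalg.pinv(s_within) @ s_between)
    order = np.argsort(evals.real)[::-1][: len(classes) - 1]
    W = evecs.real[:, order]
    centroids = {int(c): (Z[y == c] @ W).mean(axis=0) for c in classes}
    return W, centroids


def loo_accuracy(X, y, n_components):
    """Leave-one-out cross-validation; PCA and DFA are refit in every fold
    (an assumption, the source does not describe the fold structure)."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        mean, axes = fit_pca(X[keep], n_components)
        W, centroids = fit_dfa((X[keep] - mean) @ axes.T, y[keep])
        score = ((X[i] - mean) @ axes.T) @ W
        # Assign the held-out spectrum to the nearest group centroid.
        predicted = min(centroids, key=lambda c: np.linalg.norm(score - centroids[c]))
        hits += int(predicted == int(y[i]))
    return hits / len(y)


# Synthetic stand-in data: 3 groups x 16 noisy "spectra", not measured values.
rng = np.random.default_rng(0)
templates = np.zeros((3, 60))
templates[0, 10] = templates[1, 30] = templates[2, 50] = 5.0
X = np.vstack([t + 0.1 * rng.standard_normal((16, 60)) for t in templates])
y = np.repeat(np.arange(3), 16)
accuracy = loo_accuracy(X, y, n_components=5)
```

Refitting PCA inside each fold keeps the held-out spectrum from influencing the projection it is later scored in; whether the original software does the same is not stated.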

Morphological, Physiological and Biochemical Analysis
Eighteen bacterial isolates were studied: twelve from oak α and six from oak β. The 16S rRNA gene fragments had previously been sequenced successfully for all the isolates [35], and all of them were identified to genus level (Table 2). Colony morphology and DNA sequencing results allowed us to presumptively divide the 18 isolates into four morphotypes, identified as A-D (Table 2, Figure 1). Morphotypes A and D were isolated from both trees, while morphotypes B and C were each isolated from a single, different tree. NCBI BLAST results showed that morphotype A was from the Paenibacillus genus and was closely related to Paenibacillus tundrae, morphotype B was closely related to Pantoea agglomerans, morphotype C to Pseudomonas brenneri/proteolytica and morphotype D to Pseudomonas azotoformans. The bacteria were all similar in diameter (0.28-0.45 µm). Results from morphological, physiological and biochemical tests also divided the isolates into four distinct groups that coincided with the previously described morphotypes (Table 3).
Morphotype A was sensitive to AM, C and K, and capable of fermenting all the carbohydrates tested. Morphotype B was resistant to TIC and capable of fermenting all the carbohydrates tested. Morphotype C formed biofilms, was sensitive to K and capable of using G as a nutrient. Morphotype D was sensitive to K and capable of using G as a nutrient.

Structural Analysis Based on SERS Spectra
Eight isolates were selected for SERS analysis. As Pseudomonas spp. are difficult to differentiate to species level via 16S rRNA gene sequencing, and since isolates 24 and 29 are of the same origin, we treated them as equal. Thus, for further analysis, isolate 24 and isolates 37 and 49 from the pseudomonad group were chosen. Isolates 37 and 49 were highly homologous and from different sources. Isolates 33.1 and 35 were selected from the Paenibacillus sp. group because, based on genetic tests and additional experiments, they were identical but of different origins. Moreover, isolates 27, 30 and 34, representing Pantoea agglomerans, were selected for vibrational analysis. They were all sourced from the same tree; however, they exhibited differences in production of the plant hormone indole-3-acetic acid (IAA) in previous studies [35]. To determine the efficacy of the proposed bacterial differentiation methodology, the focus was put on within-group differences of isolates 33.1/35, 27/30/34 and 37/49.
Band peaks and their respective intensities are used to sort bacteria in relation to one another [7,29,50]. Representative spectra acquired during this study are shown in Figure 2. The stacked mean spectra of all 8 isolates are shown in Figure 3.
As mentioned previously, peaks in the SERS spectra are linked with functional groups [6,7,20,22,26]. These groups represent components of bacterial cells [8,17], most often either extracellular polymers [17] or, more likely, degradation metabolites [50,53] or parts of the outer membrane in Gram-negative bacteria [13]. We present tentative SERS spectra band assignments in Table 4.
Table 4 (fragment): ... [7,10,29,67] or C-C/C-O stretching in membrane proteins [10]; 966.22, C-N stretch [26].
Minor Raman shifts between the various references listed in Table 4 are due to methodological variations [7,15,17] and are also indicative of molecular differences [50], which facilitates successful differentiation.
As can be seen from Figure 4a,b, isolate 27 diverged greatly from the other tested isolates. It exhibits several peaks that were not observed in the other isolates (peaks at 563, 1005, 1592, 1647, 1703 and 1750 cm⁻¹). While the peaks at 563, 1005, 1592 and 1647 cm⁻¹ are likely indicative of a shift, the bands in the 1700 cm⁻¹ range, linked with C=O deformation, are wholly unique to this isolate. Isolates 30 and 34 from the Pantoea agglomerans group did not show such differences; however, they diverged by the absence of peaks at 688, 858 and 958 cm⁻¹. Additionally, isolate 30, alongside isolate 27, exhibited a peak at 1131 cm⁻¹, related to deformations of C-C and C-N in carbohydrates or =C-C= in lipids, while this peak was absent from the spectra of isolate 34.
Isolates 33.1 and 35 from the Paenibacillus sp. group exhibited similarities in their spectra and were grouped close together in the PCA and DFA score maps (Figures 4a and 6a). However, as can be seen from their raw data DFA (Figure 6b), there were differences. Isolate 33.1 has a notable peak at 1398 cm⁻¹, linked with COO⁻ or CH₃ deformations, and lacks notable peaks at 688, 804 and 1383 cm⁻¹, while isolate 35 has a notable band at 1116 cm⁻¹, potentially related to Trp. The peak at 1398 cm⁻¹ is likely related to peaks at ~1388 cm⁻¹, indicating a shift rather than an absence. Moreover, the peak at 1116 cm⁻¹ is unique to isolate 35, since isolate 33.1 does not exhibit a peak in that area.
Isolate 24 exhibits a peak at 1421 cm⁻¹, linked with lipids or carbohydrates, that the other two isolates from the pseudomonad group, 37 and 49, lack, while it is missing a peak at 1383 cm⁻¹. Isolate 37 has peaks at 966 cm⁻¹ and 1119 cm⁻¹ and lacks a peak at 1679 cm⁻¹ (all linked with proteins) that isolate 49 exhibits. The peak at 966 cm⁻¹ can potentially be related to the peak at ~958 cm⁻¹; both, based on past studies, are linked with C-N deformations. Furthermore, isolate 49 does not show notable peaks at 622, 858 or 882 cm⁻¹, while exhibiting a peak at 1501 cm⁻¹, which, according to our findings, is linked with various organic compounds, carotenoids among them. Most of these peaks are unique to isolate 49, except for the peak near 1500 cm⁻¹, which may be a shift from the carotenoid band at ~1510 cm⁻¹.
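The distinction drawn above between a shifted peak and a peak that is genuinely absent can be made explicit with a simple tolerance-based comparison of two peak lists. The sketch below is illustrative only: the function name and the 10 cm⁻¹ tolerance are assumptions, and the two lists are made-up examples loosely built from wavenumbers mentioned in the text, not the measured peak tables of any isolate.

```python
def compare_peaks(peaks_a, peaks_b, tol=10.0):
    """Pair peaks from two spectra that lie within tol cm^-1 of each other.

    Returns (shared, only_a, only_b); shared holds (peak_a, peak_b) pairs,
    which may represent the same band with a small shift.
    Assumption: tol = 10 cm^-1 is an arbitrary illustrative tolerance.
    """
    remaining_b = sorted(peaks_b)
    shared, only_a = [], []
    for p in sorted(peaks_a):
        candidates = [q for q in remaining_b if abs(q - p) <= tol]
        if candidates:
            nearest = min(candidates, key=lambda q: abs(q - p))
            remaining_b.remove(nearest)
            shared.append((p, nearest))
        else:
            only_a.append(p)
    return shared, only_a, remaining_b


# Hypothetical peak lists (cm^-1) for two spectra.
spectrum_a = [730, 966, 1119]
spectrum_b = [730, 958, 1501, 1679]
shared, only_a, only_b = compare_peaks(spectrum_a, spectrum_b)
```

With these inputs the 966/958 cm⁻¹ pair is treated as one shifted band, whereas 1119 cm⁻¹ on one side and 1501 and 1679 cm⁻¹ on the other are reported as unmatched.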

Differentiation via Multivariate Cluster Analyses
Cluster analysis methods were used for bacterial differentiation. Figure 4a shows the PCA scatter plot of the eight bacterial isolates based on 14 principal components (PCs) accounting for 95.1% of the variance. Here, the scatter plot presents the ability of the SERS spectral analysis to differentiate among the different types of bacteria.
SERS data of the tested isolates were also classified using DFA. Eleven PC results were used as independent input variables in DFA (Figure 4b), which further reduced the spectral dimension; however, the groupings remained similar to those of the PCA. Furthermore, DFA from raw spectra data is presented in Figure 4c. Each isolate is sorted out as an individual, but close relationships within and between groups are noticeable. Two DF scores were calculated for each spectrum for the three bacterial cell types.
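A statement such as "14 principal components accounting for 95.1% of the variance" comes from the cumulative explained variance of the PCA. The sketch below shows one common way to obtain such a count with NumPy; the function name, the 95% default target and the toy matrix are assumptions for illustration and are unrelated to the study's data.

```python
import numpy as np


def components_for_variance(X, target=0.95):
    """Return the smallest number of principal components whose cumulative
    explained variance reaches the target fraction, plus the cumulative curve."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, target)) + 1
    return min(k, len(cumulative)), cumulative


# Toy matrix whose two columns carry 90% and 10% of the total variance.
toy = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
k85, cumulative = components_for_variance(toy, target=0.85)
k95, _ = components_for_variance(toy, target=0.95)
```

For the toy matrix, one component is enough for an 85% target, while a 95% target needs both.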
For in-depth within-group separation, DFA was used further (Figures 5-7). PCA of group 27/30/34 shows a clear dissociation of isolate 27 (Figure 5a). DFA from PC scores of isolates 34, 30 and 27 correctly classified 100% of each group's subjects (Figure 5b). DFA based on raw data with cross-validation between isolates 30 and 27 correctly classified 100% of subjects in both groups, and DFA based on raw data with cross-validation between isolates 30 and 34 correctly classified 100% of subjects in group 34 and 93.8% of subjects in group 30 (Figure 5c). Thus, the results show that isolate 27 can be effectively differentiated from the other two isolates in the group.
DFA based on PC scores with cross-validation between isolates 33.1 and 35 correctly classified 87.5% of the group 33.1 subjects and 79.1% of the group 35 subjects (Figure 6a). The DFA using raw data with cross-validation between isolates 33.1 and 35 correctly classified 100% of subjects in both groups (Figure 6b). DFA based on PC scores and on raw data with cross-validation between isolates 37, 49 and 24 correctly classified 100% of subjects in all the groups (Figure 7).
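The per-group percentages quoted above are simply the share of each isolate's spectra that cross-validation assigned to the correct group. A minimal sketch of that bookkeeping follows; the function name and the label lists are hypothetical, and the example assumes 16 spectra per isolate (as in the Methods) with one spectrum of isolate 30 assigned to isolate 34, which is one plausible reading of the 93.8% figure that the source does not confirm.

```python
def per_class_accuracy(true_labels, predicted_labels):
    """Percentage of correctly classified spectra for each true group."""
    totals, hits = {}, {}
    for truth, predicted in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        hits[truth] = hits.get(truth, 0) + int(truth == predicted)
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


# Hypothetical cross-validation outcome: one of 16 spectra of isolate 30
# is assigned to isolate 34, all spectra of isolate 34 are correct.
truth = ["30"] * 16 + ["34"] * 16
predicted = ["30"] * 15 + ["34"] + ["34"] * 16
accuracy_by_group = per_class_accuracy(truth, predicted)
```

Under these assumptions group 30 scores 15/16 = 93.75%, which rounds to the reported 93.8%, and group 34 scores 100%.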

Discussion
This study shows that SERS coupled with multivariate cluster analyses can serve as an effective means of bacterial differentiation in plant-associated species, as opposed to standard 16S rRNA gene sequencing and additional antibiotic susceptibility, carbohydrate use, biofilm formation and phenotyping tests.
Additional tests performed during this experiment were able to account for genus-level separation. Colony morphology and antibiotic susceptibility tests were more effective than biofilm formation and carbohydrate use studies. Antibiotic susceptibility is considered a strain-level response [80]. Bacterial strains have also been shown to adapt to utilizing new carbohydrates through mutation [81]. Pseudomonas spp. are known biofilm producers [82], and indeed the biofilm formation test was able to separate two pseudomonads capable of this. Biofilms are extracellular structures often containing polysaccharides as well as other compounds [3]; thus, it is possible that evidence of them may be noted in the SERS data, as was shown previously with several species [83][84][85]. However, it is worth noting that the methodology chosen in this study is not ideal for yielding data on biofilms created by isolate 24.
While comparable studies showed that SERS works for pathogenic, medical and food-related strains [1,8,9,20], based on available information, the usefulness of this technique has not been widely studied for plant-associated bacteria [29] or, more specifically, for endophytes.
The complexity of the SERS spectrum makes interpretation of the data challenging. Statistical procedures or chemometric multivariate analyses are designed to improve the use and interpretation of experimental data [22]. Chemometrics is defined as a mathematical method used to extract useful information from measured physicochemical data [86].
In this study, PCA and DFA, both multivariate data analysis techniques, were applied to the SERS spectral data. PCA is one of the multidimensional descriptive methods in chemometrics and is particularly suited to the study of spectral data. This technique provides a synthetic image by presenting factor maps (2D or 3D), in which each spectrum is represented by a dot. The primary variables are replaced with synthetic ones (principal components), which contain all the information [15,87]; thus, PCA maps make it relatively easier to understand the structure of the spectral data [88].
DFA allows for the rapid sorting/grouping of unknown spectra based on between-group variability while minimizing within-group variability [6,52]. It also facilitates immediate validation of spectral reproducibility, as very similar spectra should have very similar discriminant function scores and should consequently be closely grouped in DFA. All in all, DFA and PCA are similar in that they both reduce the dimension of the data, but DFA provides better separation between groups of experimental data than PCA.
Additionally, while DFA may need a certain level of a priori knowledge about the spectra, PCA is used to examine raw data [87].
It is noteworthy that the label-free SERS spectra acquisition protocol presented in this study is an easily replicated approach for procuring bacterial spectra, as the bacteria are in an aqueous solution, requiring minimal preparation. Most often, comparable procedures either use colloidal solutions or dry out the sample and hence face difficulties with thermal damage [9,15], which may affect spectra acquisition, as carbon-associated peaks often arise in the biologically relevant range [9].
Based on current knowledge, the methodology used in this study likely showcases the metabolic degradation of the tested bacteria, as they are in a no-nutrient environment (salt solution). Nevertheless, as this is linked with the specific enzymes each strain may produce and with unique metabolic pathways, it too ultimately relates to biochemical differences and thus to strain-specific differentiation [50,53]. Another recent study demonstrates that SERS peaks may also derive from the constituents of the bacterial outer membrane (Gram-negative) [13].
Moreover, nucleotides are rarely seen in extracellular regions, but notable bands for them have been found in various studies [9,10,15,26,29]. Bands in the same regions have been observed in this study as well. For example, an intense peak at~730 cm −1 is commonly assigned to adenine-type compounds [90]. Furthermore, adenine molecules are part of adenosine triphosphate (ATP), which bacteria use for energy, hence it is possible that their degradation metabolites would end up outside of the bacterial cells, as have been shown with E. coli placed in starvation mode [91] and other studies [50,92].
In this study, bacteria from three different genera were examined. Although certain similarities can be observed in all the tested subjects, i.e., the aforementioned adenosine band at~730 cm −1 , unique vibrational signatures for all of them were successfully obtained. Similar results have been presented previously, whereby E. coli, Listeria monocytogenes and B. subtilis strains were shown to have different SERS spectra [11,67,89,93]. Furthermore, Premasiri et al. state that mutations and even genealogy may be observed in their SERS data [11].
Although the exact origin of bacterial Raman/SERS bands is difficult to assign without studies on mutant bacteria, owing to peak overlap and minor shifts, bacterial differentiation is still possible [7,15].
Bacterial strains genetically homologous with Pantoea agglomerans have been investigated in previous Raman studies [94]. The P. agglomerans vibrational fingerprint reported by Guicheteau et al. resembles those obtained in this study. There were differences, however: for example, the peak at 536 cm⁻¹, which the authors attributed to cysteine, asparagine or glutamine, was shifted in the spectrum of isolate 27 and absent from the other two tested isolates. Moreover, isolate 34 did not have a notable peak at 958 cm⁻¹. Several peaks demonstrated shifts (e.g., at ~1004, 1142 and 1544 cm⁻¹), while others are absent from the spectra of the strain tested in the cited work [94]. Other species from the Pantoea genus have been studied as well [30,77]. A study on IAA-producing Pantoea sp. has shown that some notable bands are produced by carotenoids, namely bands at 1002, 1158 and 1520 cm⁻¹, analogs of which were found in our research as well (~1005, 1159, 1570 cm⁻¹). Additionally, the authors of that study discuss the possibility that the IAA production capacity of this strain may also have been observable through Trp peaks, as IAA and Trp have similar chemical structures (in fact, Trp is a precursor to IAA) [95]. Several Trp-associated peaks were noted in our research as well, which may likewise be linked to IAA production [35].
The most widely researched Pseudomonas species are P. aeruginosa and P. fluorescens. To our knowledge, pseudomonads homologous to those analyzed in this work have not yet been characterized using Raman techniques. However, data on P. aeruginosa show some similarities to the pseudomonad spectra examined in this work. The spectra reported by Yang et al. are similar, sharing some peaks with very minor shifts (e.g., 655, 730, 920, 1328 and 1470 cm⁻¹). A peak near 854 cm⁻¹ (related to G) was not observed in isolate 49, while peaks at 957 and 1091 cm⁻¹, linked with hypoxanthine, A, G and guanosine, were noted only in isolate 24. Furthermore, the authors report peaks at 518, 1219 and 1528 cm⁻¹, which were not obtained in this study [63]. Another study on P. aeruginosa also demonstrates peaks that are comparable to those found in this study [70]. However, most of them are expressed differently in different isolates: peaks at ~690 and 832 cm⁻¹ (related to G, C and C-O-C stretching according to the authors) are observed only in the data of isolates 37 and 49, while peaks near 958 and 1421 cm⁻¹ (linked with phospholipids or C-O deformations) were noted only in isolate 24. A study on P. fluorescens reported a SERS peak at 1495 cm⁻¹ [78]. During our research, this peak was found to be one of the discerning factors of isolate 49, homologous with P. azotoformans. This peak has been linked with lipids, proteins and carotenoids (see Table 4).
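The peak-by-peak comparisons with literature spectra described above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `match_peaks`, the 5 cm⁻¹ tolerance and the `observed` peak list are assumptions chosen purely for illustration, whereas the reference positions are the P. aeruginosa values quoted in the text [63].

```python
def match_peaks(reference, observed, tol=5.0):
    """Pair each reference peak with the nearest observed peak.

    A pair is accepted only if the two positions differ by at most
    `tol` (cm^-1); otherwise the reference peak is mapped to None.
    The tolerance is an illustrative assumption, not a value from the study.
    """
    matches = {}
    for ref in reference:
        # Nearest observed peak, or None if no peaks were observed at all.
        nearest = min(observed, key=lambda p: abs(p - ref), default=None)
        if nearest is not None and abs(nearest - ref) <= tol:
            matches[ref] = nearest
        else:
            matches[ref] = None
    return matches


# Reference positions (cm^-1) quoted in the text for P. aeruginosa [63].
reference = [518, 655, 730, 920, 1219, 1328, 1470, 1528]

# HYPOTHETICAL peak list for one isolate, invented for illustration only.
observed = [657, 731, 918, 1330, 1468, 1495]

matches = match_peaks(reference, observed)

# Shift (observed minus reference) for every matched peak.
shifts = {ref: obs - ref for ref, obs in matches.items() if obs is not None}

# Reference peaks with no counterpart within the tolerance.
missing = [ref for ref, obs in matches.items() if obs is None]

print("shifts:", shifts)
print("missing:", missing)
```

With this hypothetical peak list, the shared bands are matched with shifts of 1 to 2 cm⁻¹, while the reference peaks at 518, 1219 and 1528 cm⁻¹ are reported as missing. The tolerance would need to be chosen according to the spectral resolution of the instrument.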

Conclusions
A rapid SERS strategy for bacterial differentiation was successfully established using low-cost AgNP/Si substrates, where other methods (biochemical and genetic testing) failed to differentiate the isolates. The AgNP 3D film on the Si surface was found to interact with the bacteria, resulting in strong and reproducible SERS spectra. Surface enhanced Raman spectroscopy coupled with advanced statistical techniques (PCA and DFA) was used to discriminate between different plant bacterial strains of the Paenibacillus, Pseudomonas and Pantoea genera by probing the molecular components of their cells. This is the first time that SERS peaks characteristic of bacteria closely related to Pseudomonas azotoformans and P. brenneri/proteolytica have been obtained. Moreover, so far as we were able to determine, this is one of the first studies on the SERS spectral characteristics of Paenibacillus sp. This work advances the current knowledge of bio-spectroscopy and may help with the introduction of a SERS-based bacterial identification technique as a standard method of analysis in plant-associated bacteriology.

Data Availability Statement: Detailed data concerning this study are available upon request.