Differentiating between Enterococcus faecium and Enterococcus lactis by Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometry

Unlike Enterococcus faecium strains, some Enterococcus lactis strains are considered potential probiotic strains as they lack particular virulence and antibiotic resistance genes. However, these closely related species are difficult to distinguish via conventional taxonomic methods. Here, for the first time, we used matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) with BioTyper and in-house databases to distinguish between E. faecium and E. lactis. A total of 58 reference and isolated strains (89.2%) were correctly identified at the species level using MALDI-TOF MS with in-house databases. However, seven strains (10.8%) were not accurately differentiated, as a single colony was matched to both species with similar score values. Specific mass peaks were identified by analyzing reference strains: a mass peak at 10,122 ± 2 m/z was unique to E. faecium reference strains, and mass peaks at 3650 ± 1 m/z and 7306 ± 1 m/z were unique to E. lactis reference strains. These mass peaks were reproducible in 60 isolates and showed 100% specificity, whereas 16S rRNA sequencing identified two different candidates for some isolates (E. faecium and E. lactis). Our specific mass peak method differentiated the two species with high accuracy and high throughput and provides a viable alternative to 16S rRNA sequencing.


Introduction
Enterococci belong to the lactic acid bacteria and are usually present in plant materials, vegetables, and especially raw milk or dairy products [1]. Previous microbiota studies in fermented foods reported that enterococci have important roles in fermentation and contribute to the unique taste and flavor of fermented foods [1]. Moreover, enterococci can also improve hygiene and safety in some foods as they produce antimicrobial substances such as bacteriocins (enterocins) or lactic acid [1]. Enterococci, especially Enterococcus faecalis and Enterococcus faecium, have great potential as probiotics, yet some strains are associated with human infection, virulence factors, and antibiotic resistance, including resistance to vancomycin [1,2]. However, E. lactis, which is closely related to E. faecium, lacks hospital infection-associated markers, such as insertion sequence IS16 and glycosyl hydrolase hylEfm, suggesting E. lactis complies with European Food Safety Authority guidelines [3]. Therefore, E. lactis displays a higher potential as a probiotic strain than E. faecium, as the absence of transferable virulence and antibiotic resistance genes is an important prerequisite when screening probiotic strains [4].
Scientists have conventionally relied on physiological and biochemical properties to identify lactic acid bacteria [5]. However, Enterococcus species share many characteristics, making conventional identification methods not only inaccurate but also time-consuming. Recently, whole-genome sequencing was applied to bacterial taxonomy and successfully discriminated closely related species, including E. faecium and E. lactis [3]. However, this technique is time-consuming, expensive, and requires additional analysis steps, such as average nucleotide identity or digital DNA-DNA hybridization. Therefore, routine use in the laboratory is difficult [6]. Currently, 16S rRNA sequencing is a commonly used molecular method to classify bacteria. Strains showing more than 98.7% sequence similarity in 16S rRNA genes are considered the same species [7]. Unfortunately, poor discrimination has been reported for Enterococcus due to high sequence similarities (99%) in 16S rRNA [3]. By contrast, protein-coding genes provide higher taxonomic resolution and could serve as alternatives to 16S rRNA sequencing in discriminating closely related species [6,8].
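The 98.7% criterion described above can be illustrated with a minimal sketch. It assumes two 16S rRNA sequences that have already been aligned to equal length (gaps written as "-"); the function names and the toy sequences are hypothetical and are not part of the study's workflow.

```python
SPECIES_THRESHOLD = 98.7  # percent 16S rRNA similarity cited in the text [7]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def same_species_by_16s(aligned_a: str, aligned_b: str,
                        threshold: float = SPECIES_THRESHOLD) -> bool:
    """True when similarity is more than the threshold, as stated in the text."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Toy example: 5 mismatches in 1000 positions gives 99.5% identity,
# which exceeds 98.7% and would therefore be called the same species.
seq_a = "A" * 1000
seq_b = "A" * 995 + "C" * 5
print(percent_identity(seq_a, seq_b), same_species_by_16s(seq_a, seq_b))
```

Under this rule, any pair of strains sharing about 99% of their 16S rRNA sequence falls on the same-species side of the threshold, which is why the marker alone cannot separate E. faecium from E. lactis.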
Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is often used to identify and differentiate microorganisms [9,10]. The approach is rapidly replacing analytical phenotypic and conventional biochemical identification methods, especially in clinical microbiology laboratories [11,12]. This method has been successfully used in clinical diagnostic settings and has been expanded into food safety, fermented food monitoring, biodiversity, and gut microbiota research [13-16]. Generally, MALDI-TOF MS discriminates at the species level, with taxonomic resolution observed at the subspecies or serovar level when combined with specific mass peaks [17,18]. Importantly, MALDI-TOF MS accuracy depends on the reference microorganism database. However, commercial databases are mainly designed for routine clinical diagnostics; therefore, adding entries to such databases is important to increase identification rates.
In this study, we used MALDI-TOF MS to identify and discriminate between E. faecium and E. lactis. The BioTyper database currently lacks E. lactis reference spectra. Thus, we constructed an in-house database coupled with specific mass peaks to compare data with 16S rRNA sequencing.
To verify main spectrum profiles, bacteria from fermented foods, such as soybean paste, soy sauce, sikhae, and raw milk were isolated according to a previous study [22]. Briefly, 25 g of each food sample was homogenized in 225 mL sterile phosphate-buffered saline and serially diluted. Then, 0.1 mL of each dilution was spread onto MRS agar plates and incubated at 37 °C for 48 h. Isolates were identified using 16S rRNA sequencing via the 27F/1492R primer set. Isolates other than E. faecium and E. lactis were excluded from the research.

16S rRNA Sequencing
The 16S rRNA sequencing of isolates was performed to compare the MALDI-TOF MS results. Genomic DNA of isolates was extracted using the G-spin genomic DNA extraction kit (Intron Biotechnology, Seongnam, Korea). The amplification was carried out in a 25 µL mixture containing 2.5 mM dNTPs (Takara, Tokyo, Japan), 10× buffer (Takara, Tokyo, Japan), 0.5 units Ex Taq polymerase (Takara, Tokyo, Japan), 20 ng of template, and 400 nM of the 27F/1492R primer set. The PCR thermal profile consisted of 95 °C for 5 min, followed by 30 cycles of 95 °C for 1 min, 58 °C for 1 min, and 72 °C for 2 min, and a final elongation at 72 °C for 10 min. The PCR product was purified using the QIAquick PCR purification kit (Qiagen, Hilden, Germany) and sequenced. The 16S rRNA sequences of isolates were analyzed using the BLAST program (NCBI, Bethesda, MD, USA).
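As a rough worked example of the time that the PCR step alone adds to sequencing-based identification, the thermal profile above can be summed as follows. This is a sketch that counts programmed hold times only; ramping between temperatures, DNA extraction, purification, and sequencing are not included, and the variable names are illustrative.

```python
# Hold times in minutes, taken from the thermal profile described in the text.
INITIAL_DENATURATION_MIN = 5
CYCLES = 30
CYCLE_STEPS_MIN = {
    "denaturation_95C": 1,
    "annealing_58C": 1,
    "extension_72C": 2,
}
FINAL_ELONGATION_MIN = 10


def total_hold_time_min() -> int:
    """Sum of programmed hold times; instrument ramp times are ignored."""
    per_cycle = sum(CYCLE_STEPS_MIN.values())
    return INITIAL_DENATURATION_MIN + CYCLES * per_cycle + FINAL_ELONGATION_MIN


print(total_hold_time_min())  # 5 + 30 * 4 + 10 = 135 minutes
```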

Sample Preparation for MALDI-TOF MS
Protein from reference strains was extracted using an existing ethanol/formic acid protocol [23]. Briefly, 10 µL fresh culture was suspended in 300 µL water and mixed with 900 µL ethanol to inactivate the bacteria. The cell suspension was then centrifuged at 13,600 × g for 10 min and the supernatant was removed. Once dry, the pellet was resuspended in 20 µL 70% formic acid and 20 µL acetonitrile, and centrifuged at 13,600 × g for 5 min. After this, 1 µL extract was spotted onto an MSP 96 polished steel target plate (Bruker Daltonics, Bremen, Germany) and air dried for 10 min. Spots were overlaid with 1 µL α-cyano-4-hydroxycinnamic acid (CHCA) matrix solution (Bruker Daltonics, Bremen, Germany), and air dried for sample/matrix cocrystallization.

MALDI-TOF MS Analysis
Analyses were performed via a Microflex LT bench-top mass spectrometer (Bruker Daltonics, Bremen, Germany) with FlexControl software version 3.4. Data were obtained in automatic mode by collecting 240 laser shots with 40% laser intensity. Spectra were recorded in a positive linear mode (ion source one voltage = 18.00 kV; ion source two voltage = 16.38 kV; lens voltage = 5.40 kV; laser frequency = 60 Hz; and mass range = 2000-20,000 Da). Calibration and quality control steps before strain identification were performed via a bacterial test standard (Bruker Daltonics, Bremen, Germany) which consisted of an Escherichia coli DH5-α protein extract.
To identify isolates with specific mass peaks, raw spectra were normalized, and strain peak areas and intensities were analyzed via FlexAnalysis software version 3.4 (Bruker Daltonics, Bremen, Germany). Then, isolates were identified by comparing the presence or absence of species-specific mass peaks. A main spectrum profile dendrogram and principal component analyses (PCA) for reference and isolate strains were performed via MALDI BioTyper software version 3.1 (Bruker Daltonics, Bremen, Germany) as per standard operating procedures.

Creating an In-House Database
A representative E. lactis reference strain was used to construct an in-house database. Main spectra were generated as described in Section 2.2 identifying specific mass peaks. In total, 30 replicates for the E. lactis strain were incorporated. Raw spectra quality was evaluated using FlexAnalysis software version 3.4 (Bruker Daltonics, Bremen, Germany), whereby spectra displaying high background noise were deleted [24] according to the manufacturer's instructions. After baseline subtraction and smoothing, >20 high-quality spectra were selected and transferred to create the main spectrum profile which was used for in-house database supplementation. The 25 reference strains were blindly evaluated to determine mass spectra reproducibility with MALDI-TOF MS identification. The in-house database was assessed based on 60 isolate measurements.

Identifying Isolates Using Specific Peaks
To identify isolates, proteins were extracted using the extended direct transfer extraction protocol [25]. Briefly, a single bacterial colony was spotted on the MSP 96 polished steel target plate and overlaid with 1 µL 70% formic acid. After drying, the area was covered with 1 µL CHCA matrix solution (Bruker Daltonics, Bremen, Germany). The plate was loaded into the Microflex LT bench-top mass spectrometer which contained the BioTyper database version 3.4 (5627 reference spectra) and the in-house database, and then analyzed as described. MALDI-TOF MS results are expressed as a score value that indicates how well the sample spectrum matches the reference spectra in the database. Score values range from 0 to 3. The identification criteria were as follows: a score of ≥2.300 was considered a highly probable species identification; 2.000-2.299, a probable species identification; 1.700-1.999, a probable genus identification; and <1.700, no reliable identification.
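The identification criteria above amount to a simple threshold lookup. The following sketch encodes them; the function name and the returned labels are illustrative and do not correspond to any BioTyper software interface.

```python
def classify_score(score: float) -> str:
    """Map a MALDI-TOF MS score value (0 to 3) to the categories given in the text."""
    if not 0.0 <= score <= 3.0:
        raise ValueError("score values are expected to lie between 0 and 3")
    if score >= 2.300:
        return "highly probable species identification"
    if score >= 2.000:
        return "probable species identification"
    if score >= 1.700:
        return "probable genus identification"
    return "no reliable identification"


for value in (2.413, 2.205, 1.850, 1.200):
    print(value, "->", classify_score(value))
```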

Identifying Specific Mass Peaks
It was previously reported that reliance on commercial databases could yield ambiguous results for closely related bacterial species, such as Lactobacillus johnsonii and Lactobacillus gasseri; Lactiplantibacillus plantarum and Lactiplantibacillus paraplantarum; and Bacillus pumilus and Bacillus safensis [25-27]. Importantly, MALDI-TOF MS combined with specific mass peaks was successfully used to discriminate between closely related species or subspecies, including Lactobacillus paracasei subspecies, Bifidobacterium animalis subspecies, Streptococcus species, and Lactiplantibacillus species [6,18,25,28,29]. Therefore, the characterization of specific mass peaks for species identification is accepted. In the present study, we observed inaccurate or ambiguous identification between E. faecium and E. lactis in the MALDI database.
Mass spectra showed similar patterns between E. faecium and E. lactis (Figure 1). The mass spectra of each analyzed strain for non-target species are shown in Figure S1. Discrimination ability at the species level was evaluated by analyzing mass peaks from five reference strains and 20 reference strains comprising 13 different species. In total, 192 mass peaks were extracted from the mass spectra of five reference E. faecium and E. lactis strains and analyzed for peak values according to species. Moreover, specific mass peaks were compared with 943 mass peaks from 13 other species to confirm they were unique peaks and not found in other species.
In E. faecium, a mass peak at 10,122 ± 2 m/z was common to all E. faecium strains; it was present in the two E. faecium reference strains but absent in other Enterococcus species, including E. lactis (Table 2). In total, 15 mass peaks were common to E. lactis strains; of these, the mass peaks at 3650 ± 1 m/z and 7306 ± 1 m/z characterized E. lactis and were not identified in the other 14 species, including E. faecium (Table 2). Therefore, the mass peak at 10,122 ± 2 m/z was unique to E. faecium, whereas the mass peaks at 3650 ± 1 m/z and 7306 ± 1 m/z were unique to E. lactis (Figure 2).
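The presence/absence logic behind these marker peaks can be sketched as a tolerance match against a list of detected peak positions. The peak centers and tolerances below come from the text; the decision rule (all markers of exactly one species must be present) and the function names are assumptions made for illustration, not the FlexAnalysis procedure used in the study.

```python
# Species-specific marker peaks as (center m/z, tolerance), from the text.
MARKER_PEAKS = {
    "E. faecium": [(10122.0, 2.0)],
    "E. lactis": [(3650.0, 1.0), (7306.0, 1.0)],
}


def has_peak(peaks, center, tolerance):
    """True if any detected peak lies within center ± tolerance."""
    return any(abs(mz - center) <= tolerance for mz in peaks)


def identify_by_marker_peaks(peaks):
    """Return the single species whose markers are all present.

    Assumed rule: if no species or more than one species matches,
    the spectrum is reported as 'undetermined'.
    """
    hits = [
        species
        for species, markers in MARKER_PEAKS.items()
        if all(has_peak(peaks, center, tol) for center, tol in markers)
    ]
    return hits[0] if len(hits) == 1 else "undetermined"


# Hypothetical peak lists (m/z values of detected peaks).
print(identify_by_marker_peaks([3650.4, 5000.0, 7305.2]))
print(identify_by_marker_peaks([4500.0, 10123.5]))
print(identify_by_marker_peaks([4500.0, 6200.0]))
```

Requiring both E. lactis markers is the stricter of two possible readings; a laboratory could instead accept either marker, at the cost of a higher risk of false positives.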

Evaluating Commercial and In-House Databases
We used the BioTyper database to evaluate species differentiation between E. faecium and E. lactis. Five reference strains and 60 isolates were tested via the BioTyper and in-house databases. Reference strains included two E. faecium (KACC 11954 and KCTC 13225) and three E. lactis strains (KACC 15681, KACC 14552, and KACC 21015). With the BioTyper database, all 65 strains were assigned to E. faecium. Of these, six strains (9.2%) were identified at the highly probable species level (score ≥ 2.300), 53 strains (81.5%) were identified at the probable species level (2.000-2.299), and the remaining six (9.2%) were identified at the probable genus level (1.700-1.999) (Table 3). E. lactis was absent from the BioTyper database and was therefore added via the in-house database. After generating E. lactis strain spectra, the five reference strains and 60 isolates were reidentified. In total, 59 strains (90.8%) were identified with a high score value (≥ 2.300), and six strains (9.2%) were identified at the probable species level (2.000-2.299) (Table 3). The in-house database, with added E. lactis mass spectra, accurately identified 58 strains (89.2%), an improved identification rate compared with the BioTyper database, but seven strains (10.8%) had unreliable results due to spectral similarity with E. faecium. At the species level, E. faecium (5/12, 41.7%) and E. lactis (53/53, 100%) strains were correctly identified, whereas the remaining E. faecium strains (7/12, 58.3%) generated unreliable results (Tables 3 and 4). Seven isolates were identified as E. lactis in the first match, with score values between 2.205 and 2.413, but as E. faecium in the second match, with score values between 2.160 and 2.370. Therefore, these isolates could not be differentiated by either the BioTyper or the in-house database.
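The ambiguity described above, where the first and second matches name different species with similar scores, can be flagged automatically. The sketch below uses an assumed rule (the two best matches disagree on species and both reach the probable species cutoff of 2.000); the `Match` type, the function name, and the rule itself are hypothetical and are not a feature of the BioTyper software.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    species: str
    score: float


def is_ambiguous(matches: List[Match], species_cutoff: float = 2.000) -> bool:
    """Flag a spectrum whose two best matches name different species
    while the second-best score still reaches the species-level cutoff."""
    ranked = sorted(matches, key=lambda m: m.score, reverse=True)
    if len(ranked) < 2:
        return False
    first, second = ranked[0], ranked[1]
    return first.species != second.species and second.score >= species_cutoff


# Score pair taken from the low end of the ranges reported in the text.
print(is_ambiguous([Match("E. lactis", 2.205), Match("E. faecium", 2.160)]))
# Both top matches agree on the species, so no flag is raised.
print(is_ambiguous([Match("E. lactis", 2.413), Match("E. lactis", 2.370)]))
```

Spectra flagged this way would be candidates for follow-up with the species-specific mass peaks rather than being reported from the database match alone.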
BioTyper database limitations were also previously observed for Lactiplantibacillus species, Salmonella species, and some anaerobic bacteria [23,25,30,31]. These species are phylogenetically closely related and have similar protein mass spectra; therefore, they could not be accurately distinguished by this database. A previous study also reported that an improved commercial database facilitated the accurate identification of microorganisms from a single colony [18]. In our study, however, the expanded database improved the identification rate for the two species but could not clearly distinguish all strains due to high similarity between protein mass spectra. To differentiate these spectra, re-identification is required using additional tests based on isolate characteristics [24].
Five reference strains and 60 isolates were used to evaluate MALDI-TOF MS robustness. The main spectrum profile dendrogram and PCA clustering are practical for differentiating between closely related strains and determining associations between isolated strains [32]. Dendrogram and PCA clustering were performed to confirm the power of mass spectra to discriminate between the two species; all E. faecium and E. lactis strains were classified into two distinct groups in the dendrogram (Figure 3). The first cluster included E. faecium species, and two clusters included E. lactis. PCA clustering was performed using intensities and mass values and showed that both species were separated (Figure 4). This result suggests that the two species may be differentiated by mass spectra obtained with MALDI-TOF MS.

Identifying Isolates Using Specific Mass Peaks
To validate our E. faecium and E. lactis identification approach, 60 isolates were identified using specific mass peaks; the peaks found in type strains were consistently identified in isolates. Ten isolates were identified as E. faecium via mass peak analysis (Table 4). The peak at 10,122 ± 2 m/z, specific to E. faecium, was present in these isolates, whereas the E. lactis mass peaks were absent. These results were consistent with 16S rRNA sequencing (Table 5), which identified the isolates as E. faecium (accession no. FJ378693.1, MN401132.1, or MH473158.1). The remaining 50 isolates were identified as E. lactis; the mass peaks at 3650 ± 1 m/z and 7306 ± 1 m/z, specific to E. lactis, were present in these isolates, but the specific E. faecium peak was absent. These isolates were then compared with 16S rRNA sequencing results. Isolates were correctly identified as one species using mass peak analysis, whereas 16S rRNA sequencing generated two different species candidates, E. faecium (accession no. MT597585.1 or MT378127.1) and E. lactis (MG948154.1 or CP082267.1), instead of one species. These data were consistent with previous studies showing that 16S rRNA sequence analyses had limited discriminatory power between E. faecium and E. lactis, as both exhibited >99% sequence homology [1,3]. Therefore, three mass peaks were specific for identifying and discriminating between E. faecium and E. lactis.
MALDI-TOF MS is a cost-efficient and rapid identification method when compared to other techniques [33]. The approach was used to identify ten strains within 15 min in a colony selection study [6]. The higher the sample throughput, the lower the analysis cost per isolate [34]. MALDI-TOF MS identification costs do not exceed $0.2 per strain, whereas other identification approaches, such as polymerase chain reaction-based methods, are more expensive [6,33].
Our identification method rapidly and accurately detected E. faecium and E. lactis from MALDI-TOF MS-specific mass peaks. Some E. faecium strains were not correctly identified at the species level using the in-house database; however, they were confirmed and identified by mass peak analysis. Although the in-house database misidentified a considerable proportion of strains (10.8%), peak analyses may facilitate correct species assignment. The approach may also save on sequencing costs, as it does not require genomic DNA extraction or sequence amplification, thereby reducing costs, time, and labor for final strain identification [6]. In addition, the extended direct transfer extraction protocol was used to reduce protein extraction times and shorten turnaround times. The specific mass peaks identified in this study were successfully used to identify E. faecium and E. lactis strains; therefore, this approach could be considered more efficient and accurate than 16S rRNA sequencing, which lacks discriminating power for these species.

Conclusions
MALDI-TOF MS is a powerful tool for distinguishing between E. faecium and E. lactis. Moreover, identification of the two species based on mass spectrometric data, combining an in-house database with MALDI-TOF MS-specific mass peak data, showed better discrimination power than 16S rRNA sequencing. This approach can be used for the accurate and rapid identification and discrimination of E. faecium and E. lactis and could be applied in quality control protocols in the probiotic industry.