Pan-Proteomic Analysis and Elucidation of Protein Abundance among the Closely Related Brucella Species, Brucella abortus and Brucella melitensis

Brucellosis is a zoonotic infection caused by bacteria of the genus Brucella. The species B. abortus and B. melitensis, major causative agents of human brucellosis, share remarkably similar genomes, but they differ in their natural hosts and in their phenotypic, antigenic, immunogenic, proteomic and metabolomic properties. In the present study, label-free quantitative proteomic analysis was applied to investigate differences in protein expression levels. Type strains and field strains were each cultured six times, cells were harvested at the mid-logarithmic growth phase and proteins were extracted. Following trypsin digestion, the peptides were desalted, separated by reverse-phase nanoLC, ionized using electrospray ionization and transferred into a linear trap quadrupole (LTQ) Orbitrap Velos mass spectrometer to record full scan MS spectra (m/z 300–1700) and tandem mass spectrometry (MS/MS) spectra of the 20 most intense ions. Database matching with the reference proteomes resulted in the identification of 826 proteins. Clustering of the identified proteins by Gene Ontology revealed differences in biomolecular transport and protein synthesis mechanisms between these two species. Among the differentially expressed proteins, antifreeze proteins, Omp10, superoxide dismutase and 30S ribosomal protein S14, among several others, were predicted as potential virulence factors. All mass spectrometry data are available via ProteomeXchange with identifier PXD006348.


Introduction
Brucella is a Gram-negative bacterial genus of the α-2 subgroup of Proteobacteria. Brucellae are highly adapted to their intracellular lifestyle and are the causative agents of human and animal brucellosis ("undulant fever", "Malta fever", "Mediterranean fever" or "Bang's disease") [1]. They are highly infective: as few as 10-100 bacteria can cause human infection [2,3]. The genus Brucella currently includes 12 accepted species that have been named according to their host specificity. To date, the mechanism behind the host specificity is not clear [4]. The classification of Brucella species is under debate due

Brucella Culture
Brucella type strains and field isolates, as listed in Table 1, were obtained from the culture collection of the Friedrich-Loeffler-Institut (FLI), Federal Research Institute for Animal Health, Institute of Bacterial Infections and Zoonoses (IBIZ), Jena, Germany. Each strain was independently cultivated six times in 50 mL of Tryptic Soy Broth at 37 °C in the presence of 5% CO2 with shaking until the culture reached approximately 5 × 10^8 CFU/mL. The cells were harvested by centrifugation at 11,290× g for 5 min and, after washing twice with phosphate-buffered saline, were inactivated and fixed by resuspending the cell pellets in 300 µL of high-performance liquid chromatography (HPLC)-grade distilled water and 900 µL of absolute ethanol.

Whole-Cell Protein Extraction
To extract proteins from the ethanol-fixed cells, the cells were centrifuged at 11,290× g for 2 min, the supernatant was discarded and the resultant cell pellets were air-dried for 20 min to remove ethanol traces. The cell pellet was then reconstituted in 250 µL of lysis buffer (20 mM HEPES, pH 7.4), sonicated on ice for 1 min (duty cycle: 1.0, amplitude: 100%, UP100H; Hielscher Ultrasound Technology, Teltow, Germany) and centrifuged at 11,290× g for 5 min at 4 °C, and the supernatant was collected. The protein content was measured using a modified Bradford method (Bio-Rad, Munich, Germany). The values obtained were checked for consistency by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) [35]. A volume of the whole-cell extract containing 10 µg of protein was subjected to acetone precipitation, reconstituted in 10 µL of sample loading buffer, heated for 5 min at 60 °C and subjected to gel electrophoresis (4% acrylamide in the stacking gel and 12% acrylamide in the separating gel); the protein bands were visualized using Coomassie staining [36].

In Solution Trypsin Digestion
The protein extract containing 10 µg of protein was subjected to acetone precipitation and trypsin digestion as described elsewhere [37]. In brief, following acetone precipitation, the precipitate was reconstituted with 10 µL of denaturation buffer (6 M urea/2 M thiourea in 10 mM HEPES, pH 8.0). Unless stated otherwise, the steps of the in-solution digestion were carried out at room temperature. Reduction was carried out for 30 min by adding 0.2 µL of 10 mM dithiothreitol in 50 mM ammonium bicarbonate (ABC). Subsequently, alkylation was performed for 30 min by adding 0.2 µL of 55 mM iodoacetamide in 50 mM ABC. Then, 0.4 µL of LysC (0.5 µg/µL; Wako, Neuss, Germany) in ABC solution was added and the mixture was incubated overnight at room temperature. Next, 75 µL of ABC was added to decrease the urea concentration to <2 M to enable trypsin digestion. Trypsin digestion was carried out overnight at 37 °C after adding 0.4 µL of 0.5 µg/µL trypsin in 50 mM ABC, and the reaction was stopped by adding 100 µL of 5% acetonitrile in 3% trifluoroacetic acid.
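The dilution step can be checked with a short calculation. The sketch below assumes that the only liquids present are those named in the protocol (10 µL of denaturation buffer plus the 0.2, 0.2, 0.4 and 75 µL additions) and that volumes are additive; the function name is illustrative and not part of the published method.

```python
def diluted_molarity(stock_molar, stock_volume_ul, added_volumes_ul):
    """Concentration after diluting a stock with further additions (C1*V1 = C2*V2)."""
    total_volume_ul = stock_volume_ul + sum(added_volumes_ul)
    return stock_molar * stock_volume_ul / total_volume_ul


# Additions before trypsin: DTT, iodoacetamide, LysC, then ABC (all in µL).
additions_ul = [0.2, 0.2, 0.4, 75.0]

urea_before_abc = diluted_molarity(6.0, 10.0, additions_ul[:3])
urea_after_abc = diluted_molarity(6.0, 10.0, additions_ul)
thiourea_after_abc = diluted_molarity(2.0, 10.0, additions_ul)

print(f"urea before ABC dilution: {urea_before_abc:.2f} M")
print(f"urea after ABC dilution:  {urea_after_abc:.2f} M")
print(f"thiourea after dilution:  {thiourea_after_abc:.2f} M")
```

Under these assumptions the urea concentration falls from roughly 5.6 M to about 0.7 M, comfortably below the 2 M limit stated above.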

Liquid Chromatography-Electrospray Ionization-Tandem Mass Spectrometry (LC-ESI-MS/MS)
The trypsin-digested peptides were first desalted by solid-phase extraction using the stage-tip procedure [38]. Nano liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis was carried out using a Dionex Ultimate 3000 nanoLC system (Dionex, Germering, Germany) coupled with an LTQ Orbitrap Velos mass spectrometer (Thermo Fisher Scientific, Bremen, Germany), operated in data-dependent acquisition mode with the Xcalibur software (version 21.0.1140, Thermo Fisher Scientific). The nanoLC system was used to load the peptides in 0.1% formic acid onto a C18 PepMap trap column (75 µm ID × 2 cm, Dionex). Separation was then achieved with a 5-60% acetonitrile gradient (90 min) in 0.1% formic acid at a flow rate of 350 nL/min through a 25 cm fritless C18 microcolumn packed in-house with ReproSil-Pur C18-AQ 3 µm resin (Dr. Maisch GmbH, Entringen, Germany). Online electrospray ionization with an electrospray voltage of 2 kV was used for direct ionization of the eluted peptides. The ions were then transferred into the LTQ Orbitrap Velos, operated in positive mode, to record full scan MS spectra (m/z 300-1700) at a resolution of R = 60,000, followed by isolation and fragmentation of the 20 most intense ions by collision-induced dissociation.
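The "top 20" selection in data-dependent acquisition can be illustrated with a toy example. This is only a sketch of the selection idea, assuming a full scan is available as a list of (m/z, intensity) pairs; the instrument software additionally applies criteria such as dynamic exclusion and charge-state filtering that are not described here, and the function name is hypothetical.

```python
def select_top_n_precursors(peaks, n=20, mz_min=300.0, mz_max=1700.0):
    """Return the n most intense peaks inside the recorded m/z window."""
    in_window = [(mz, intensity) for mz, intensity in peaks if mz_min <= mz <= mz_max]
    return sorted(in_window, key=lambda peak: peak[1], reverse=True)[:n]


# Toy full scan: one intense peak outside the window and 30 peaks inside it.
toy_scan = [(250.0, 1e9)] + [(300.0 + i, float(i)) for i in range(1, 31)]
selected = select_top_n_precursors(toy_scan)
print(len(selected), selected[0], selected[-1])
```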

Protein Identification
All raw MS files were combined and processed with the MaxQuant software (version 1.6.0.16; Max-Planck-Institute of Biochemistry, Martinsried, Germany) [39,40]. The following parameters were set for protein identification: a minimum required peptide length of seven amino acids; LysC and trypsin as enzymes, each with up to two missed cleavages; cysteine carbamidomethylation as a fixed modification; and oxidation of methionine and protein N-terminal acetylation as variable modifications. The initial precursor and fragment ion maximum mass deviations were set to 7 ppm and 0.5 Da, respectively, for the search against forward and reversed protein sequences of a combined Brucella database (B. abortus 2308 and B. melitensis M28) downloaded from the UniProt Knowledgebase. The target-decoy-based false discovery rate (FDR) for peptide and protein identification was set to 0.01 to ensure that the proteins identified with the lowest score had a probability of ≤1% of being a false identification. The most frequently observed laboratory contaminants were eliminated from the list of identified proteins, and proteins with at least one peptide unique to the protein sequence were considered valid identifications. MS-based quantification of proteins was performed using the label-free quantification algorithm of the MaxQuant software package [41,42].
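The target-decoy idea behind the 1% FDR cut-off can be sketched as follows. This is a simplified illustration, not MaxQuant's actual procedure: it assumes each identification carries a single score and a flag saying whether it matched a reversed (decoy) sequence, and it estimates the FDR at each score threshold as decoys divided by targets. All names are hypothetical.

```python
def accept_at_fdr(identifications, max_fdr=0.01):
    """identifications: list of (score, is_decoy). Returns accepted target scores.

    Walk down the score-ranked list and keep the longest prefix whose
    estimated FDR (decoys / targets) does not exceed max_fdr.
    """
    ranked = sorted(identifications, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of ranked entries inside the accepted prefix
    for position, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        estimated_fdr = decoys / targets if targets else 1.0
        if estimated_fdr <= max_fdr:
            keep = position
    return [score for score, is_decoy in ranked[:keep] if not is_decoy]


# Toy data: 100 high-scoring targets, two decoys, then 50 weaker targets.
toy_ids = [(float(s), False) for s in range(101, 201)]
toy_ids += [(100.5, True), (100.4, True)]
toy_ids += [(float(s), False) for s in range(51, 101)]
accepted = accept_at_fdr(toy_ids)
print(len(accepted), min(accepted))
```

In the toy data the second decoy pushes the estimated FDR above 1%, so only the 100 targets scoring above both decoys are retained.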

Data Analysis
The data analysis was carried out using the freely available software Perseus (version 1.4.1.3; Max-Planck-Institute of Biochemistry, Martinsried, Germany), after importing the label-free quantification (LFQ) intensities of the proteins from the MaxQuant analysis. The intensities were first transformed to a logarithmic scale with base two, and missing values were replaced (imputed) with the lowest value observed in the dataset. Statistical analysis was carried out using a two-sided Student's t-test (p < 0.05), and FDR correction of the alpha error was carried out by the method of Benjamini-Hochberg [43]. The comparisons between the four different datasets (type and field strains of each species) were carried out in different pairs. The heat map and hierarchical clustering of proteins (Euclidean distance and linkage) were calculated using z-score normalized and logarithmized intensities of the identified proteins. For further visualization, volcano plots and principal component analysis (PCA) were generated. All proteins that showed a fold-change of at least 1.5 and met p < 0.05 were considered differentially expressed.
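The processing steps described above (log2 transformation, imputation with the dataset minimum, Benjamini-Hochberg adjustment and the fold-change/p-value filter) can be sketched in plain Python. This is a minimal illustration rather than the Perseus implementation: the t-test itself is omitted, p-values are taken as given, missing values are assumed to be encoded as None or 0, and applying the 0.05 threshold to the adjusted p-value is an assumption. All function names are hypothetical.

```python
import math


def log2_with_min_imputation(intensity_rows):
    """Log2-transform LFQ intensities; replace missing values by the dataset minimum."""
    logged = [[math.log2(v) if v else None for v in row] for row in intensity_rows]
    floor = min(v for row in logged for v in row if v is not None)
    return [[floor if v is None else v for v in row] for row in logged]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def is_differential(log2_fold_change, p_value, min_fold=1.5, alpha=0.05):
    """Fold-change of at least min_fold (either direction) and p below alpha."""
    return abs(log2_fold_change) >= math.log2(min_fold) and p_value < alpha


imputed = log2_with_min_imputation([[8.0, None], [2.0, 4.0]])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(imputed)
print([round(p, 3) for p in adjusted])
```

Note that a 1.5-fold change corresponds to an absolute log2 fold-change of about 0.585, which is the threshold used on the x-axis of a volcano plot.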
The mass spectrometry data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository [44,45], with the dataset identifier PXD006348.

Functional Categorization and Pathways Analysis
The UniProt FASTA files of protein sequences were analyzed using http://eggnogdb.embl.de (accessed in February 2017) to obtain the functional annotation of the identified proteins in terms of clusters of orthologous groups (COGs) [46]. The canonical pathways were also analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) tool [47,48].

Screening for Virulence-Associated Proteins
Protein sequences downloaded from UniProt in FASTA format were used to predict virulence association using the VirulentPred online analysis tool (http://bioinfo.icgeb.res.in/virulent/, accessed in September 2019) [49].

Mass Spectrometry Data
The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium [50] via the PRIDE partner repository with the dataset identifier PXD006348.

Results and Discussion
In the present study, the reference strains B. abortus (strain 544, ATCC 23448) and B. melitensis (strain 16M, ATCC 23456) [10,11,26] were chosen in an attempt to understand the proteomic differences between strains of closely related Brucella species. The strains isolated from infected animals that were previously used to demonstrate the existence of protein expression level differences [23] were also included for comparison. Differences at the proteome level may also underlie differences in phenotype, pathogenicity and host specificity [22,51,52]. The vaccine strain Rev. 1 and the laboratory strain B115 of B. melitensis had comparable two-dimensional gel electrophoresis (2DE) protein patterns. However, the reference strain 16M displayed 50% fewer protein spots [53]. Strains of the same species possessing homologous genomes and displaying comparable phenotypes or biochemical reactions [54] also displayed different proteomes, as shown for the B. abortus virulent strain 2308 and the vaccine strain S19 [55]. Earlier genome-based suppressive subtractive hybridization studies had identified species-specific deletions which 2DE-based investigations could not confirm [27,28,56] due to limitations in protein identification coverage and in the protein entries available in the database. Factors such as heat, oxidative and acidic pH stress and culture media have influenced the 2DE-based protein coverage of B. abortus and B. melitensis [32,57]. The application of LC-MS has enhanced protein coverage, as demonstrated for the reference strain B. abortus 2308, for which a dataset of 621 proteins was created, among which 300 had not been reported earlier and five were attributed to pseudogenes [29]. LC-MS-based quantitative proteomic comparison of the outer membrane fraction of virulent and avirulent strains of B. abortus indicated that Brucella virulence is based on extensive cell envelope modifications [26,30,55].
Therefore, the present study involving LC-MS-based quantitative proteomic analysis of whole-cell protein extracts was initiated to identify protein expression level differences between these two closely related bacterial species.

Brucella Whole-Cell Protein Extraction
Preliminary analyses using SDS-PAGE separation (Figure 1) revealed that B. abortus and B. melitensis express very similar sets of proteins with regard to protein pattern and band intensities, with a few differences such as a distinct band around 35 kDa (B. abortus) and one below 20 kDa (B. melitensis). The type strains and field strains displayed comparable bands but with varying intensities.

Databases and Protein Sequences
Protein identification by matching mass spectra against a database of known sequences is an important tool in proteomics. Despite the availability of a list of differentially expressed proteins and the influences of the physicochemical environment, missing or unknown sequences remain a limiting factor. UniProt listed as many as 458 proteome entries for Brucella; however, 329 of these entries were redundant and were moved to the UniProt Archive (UniParc) database. The remaining nonredundant entries represent 10 Brucella species and remain active in the UniProtKB database. UniProt introduced the new term "pan proteome" to describe the entire set of proteins thought to be expressed by a group of highly related organisms, e.g., multiple strains of a species. The pan proteome entries also include all sequences within a taxonomical group as well as unique sequences not found in the reference proteome [58]. Consequently, 25 proteomes representing 10 species of Brucella were used to create the B. abortus 2308 pan proteome, and the B. abortus (strain 2308) proteome remained the reference proteome. An analysis of the protein IDs of the pan proteome and five strains of Brucella sp. using the online software tool InteractiVenn [59] revealed that the complete list of the Brucella reference proteome (except three entries) forms 47% of the pan proteome (Figure 2). Moreover, the bulk of the protein entries in the pan proteome were from B. abortus, B. melitensis, B. suis, B. ovis and B. vulpis. The Brucella pan proteome contains 7266 protein entries. Their existence was mostly predicted (77%) or inferred from homology (21.6%), whereas evidence at the protein level (1%) and transcript level (0.1%) was scarce. Manual curation was reported for 528 protein entries, which corresponds to approximately 7.3% of the Brucella-specific protein entries.
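The kind of ID overlap reported from InteractiVenn can be reproduced in principle with ordinary set operations. The sketch below is illustrative only: the accession strings are made-up placeholders, not real UniProt entries, and the function name is hypothetical. The final line simply rechecks the curated fraction quoted above (528 of 7266 entries).

```python
def overlap_summary(reference_ids, pan_ids):
    """Summarize how a reference proteome's IDs sit inside a pan proteome."""
    reference, pan = set(reference_ids), set(pan_ids)
    shared = reference & pan
    return {
        "shared": len(shared),
        "reference_only": len(reference - pan),
        "share_of_pan": len(shared) / len(pan),
    }


# Placeholder accessions for illustration (not real UniProt IDs).
toy_reference = {"REF1", "REF2", "REF3", "REF4"}
toy_pan = {"REF1", "REF2", "REF3", "PAN5", "PAN6", "PAN7"}
summary = overlap_summary(toy_reference, toy_pan)
print(summary)

curated_percent = round(528 / 7266 * 100, 1)
print(curated_percent)
```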
For MS-based proteome analysis of two species, using the pan proteome or including all 10 Brucella proteomes for protein identification might be inconvenient due to the differences in the entry IDs for each species. Therefore, for the sake of effective protein identification, the following two proteomes were combined: (1)

Protein Identification
A MaxQuant-Andromeda-based search against the combined database of B. abortus (strain 2308) and B. melitensis biotype 1 (strain 16M) resulted in the identification of 1202 proteins with at least one unique peptide specific for a protein. A total of 826 proteins remained after applying the filter that the label-free quantification (LFQ) intensity of a protein be present in at least four out of six replicates in each sample dataset and after removal of proteins matched to reverse sequences and of proteins identified only "by site" (Supplementary Table S1). Among these identified proteins, 478 protein IDs belonged to the B. abortus reference proteome and the remaining 348 protein IDs could be allocated to the B. melitensis proteome. The distribution of the identified proteins with respect to the chromosomes was 360 on chromosome I (43.6%) and 118 on chromosome II (14%) of B. abortus, and 161 on chromosome I (19.5%) and 56 on chromosome II (6.8%) of B. melitensis. The remaining 131 proteins (15.7%) belonged to B. melitensis unassembled WGS sequences, which are also part of the B. abortus 2308 pan proteome. Later, a database update resulted in the addition of one protein in chromosome I of B. abortus (strain 2308, update 05 December 2016). The proteome UP000008511 was moved to UniParc as it was identified as redundant (update 18 November 2016), and the majority of its protein entries were found to match proteome UP000000419 (update 9 October 2016). As a result, 131 protein entries (belonging to proteome UP000008511: B. melitensis biotype 1) of the 826 identified proteins were found to be redundant or obsolete (marked as removed in Supplementary Table S1) and were deposited in the UniParc database. Consequently, the remaining 695 identified proteins were considered for further analysis.
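The replicate filter and the bookkeeping of protein counts can be sketched as follows. The filter function is a hypothetical illustration that assumes missing LFQ intensities are encoded as None or 0 and that each sample dataset has six replicate values; the counts are the ones reported in the paragraph above.

```python
def passes_replicate_filter(lfq_by_dataset, min_valid=4):
    """True if every sample dataset has at least min_valid quantified replicates."""
    return all(
        sum(1 for value in replicates if value) >= min_valid
        for replicates in lfq_by_dataset.values()
    )


# Toy protein: quantified in 5/6 and 4/6 replicates of two datasets.
kept = passes_replicate_filter({
    "dataset_a": [1.2e6, 1.1e6, None, 1.3e6, 1.0e6, 1.4e6],
    "dataset_b": [9.0e5, None, None, 8.5e5, 9.2e5, 8.8e5],
})
# Toy protein: only 3/6 replicates quantified in one dataset.
dropped = passes_replicate_filter({
    "dataset_a": [1.2e6, 1.1e6, 1.3e6, 1.0e6, 1.4e6, 1.5e6],
    "dataset_b": [9.0e5, None, None, None, 9.2e5, 8.8e5],
})

# Counts reported in the text.
counts = {
    "B. abortus chr I": 360,
    "B. abortus chr II": 118,
    "B. melitensis chr I": 161,
    "B. melitensis chr II": 56,
    "B. melitensis unassembled WGS": 131,
}
total_identified = sum(counts.values())
remaining = total_identified - counts["B. melitensis unassembled WGS"]
print(kept, dropped, total_identified, remaining)
```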

Comparative Proteomics of B. abortus and B. melitensis
Visualization through unsupervised hierarchical clustering of the proteomic data recapitulated the similarities between strains and species (Figure 3). All six replicates of each isolate clustered together and the species displayed a clear clustering. For comparative proteomics analysis, pairwise comparisons were carried out in six categories. As shown in Figure 3, the volcano plot displays the negative log10 of the t-test p-value against the log2 fold-change. Proteins lying above the dotted line (p < 0.05) were considered to be differentially expressed between the two groups. The left side of the plot represents the downregulated proteins and the right side represents the upregulated proteins. Table 2 lists the differentially expressed proteins in each of the above-described categories. The proteins identified as significantly regulated in at least three of the four comparison categories (I-IV) were considered to play a crucial role. As a result, 109 and 104 proteins were identified as up- or downregulated, respectively, in B. melitensis when compared to B. abortus (Supplementary Table S2).
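The volcano-plot coordinates and the "at least three of four comparisons" rule can be expressed compactly. In this sketch the per-comparison calls are assumed to be recorded as "up", "down" or None, and a protein is assumed to count only if the calls agree in direction; both the encoding and that interpretation are assumptions, and the function names are hypothetical.

```python
import math


def volcano_point(log2_mean_reference, log2_mean_test, p_value):
    """Return (log2 fold-change, -log10 p) for one protein."""
    return (log2_mean_test - log2_mean_reference, -math.log10(p_value))


def consistent_direction(calls, min_hits=3):
    """Return 'up' or 'down' if at least min_hits comparisons agree, else None."""
    if calls.count("up") >= min_hits:
        return "up"
    if calls.count("down") >= min_hits:
        return "down"
    return None


point = volcano_point(10.0, 11.0, 0.01)
print(point)
print(consistent_direction(["up", "up", None, "up"]))
print(consistent_direction(["up", "down", "up", "down"]))
```

A point is drawn on the right-hand side of the plot when its first coordinate is positive and above the dotted line when its second coordinate exceeds -log10(0.05), which is about 1.3.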

Gene Ontology and Clusters of Orthologous Groups
UniProt was used to cluster the proteins in accordance with their Gene Ontologies (GO) to understand the functional role of the identified proteins. The GO mapping was possible only for those proteins that remained active in the UniProt database at the time of the analysis. The GO results presented in Figure 4 include 695 identified proteins, among which 66 and 94 were up- or downregulated, respectively, in B. melitensis compared to B. abortus, while Clusters of Orthologous Groups (COGs) are indicated for all 826 identified proteins including the 131 redundant proteins. Catalytic activity, binding properties, transporter and antioxidant activity, as well as structural constituents of the ribosome, differed between these two species. Macromolecular, membrane and cellular components also varied. The major metabolic and cellular processes appeared to be similar between these species. The prediction of COGs revealed the distribution within four functional groups: cellular processes and signaling (D, M, O, T and U), information storage and processing (J, K and L), metabolism (C, E, F, G, H, I, P and Q) and poorly characterized (S). Analysis of about 20% of the identified proteins did not identify any COG. Based on the COG clustering, membrane proteins and proteins involved in biomolecular transport and protein synthesis mechanisms were found to be enhanced in B. abortus in comparison to B. melitensis.
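The grouping of COG letters into the four functional groups can be written as a small lookup table. The sketch assumes each protein carries a single COG letter, or None when no COG was found; the letter Q (secondary metabolites) is taken from the standard COG scheme for the metabolism group. The function name and the toy input are illustrative.

```python
from collections import Counter

COG_GROUPS = {
    "cellular processes and signaling": "DMOTU",
    "information storage and processing": "JKL",
    "metabolism": "CEFGHIPQ",
    "poorly characterized": "S",
}

# Invert the table so that each single letter maps to its functional group.
LETTER_TO_GROUP = {
    letter: group for group, letters in COG_GROUPS.items() for letter in letters
}


def count_by_group(cog_letters):
    """Count proteins per functional group; unknown or missing letters are pooled."""
    counts = Counter()
    for letter in cog_letters:
        counts[LETTER_TO_GROUP.get(letter, "no COG assigned")] += 1
    return counts


toy_counts = count_by_group(["J", "M", "C", "Q", "S", None])
print(dict(toy_counts))
```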

Bioinformatics Annotation of Differentially Expressed Proteins
As shown in Table 3, the DAVID analysis of differentially expressed proteins revealed the involvement of several Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Notably, histidine metabolism appeared to be different in all compared groups. As shown in Figure 5, among the top 10 pathways, the observed differences in the categories ABC transporters, aminoacyl-tRNA biosynthesis and oxidative phosphorylation merit further investigation to clarify whether these pathways influence biochemical diagnosis or host specificity. The role of ABC transporters in intracellular survival and virulence of Brucella was demonstrated in B. ovis [60]. It was also shown that about 9% of the coding capacity of Brucella is devoted to ABC transporters, although differences between these two Brucella species were reported [61]. Analysis using DAVID indicated that the proteins downregulated in B. abortus in comparison to B. melitensis occurred within three biological pathways, each with four of the proteins identified: pyruvate metabolism, histidine metabolism, arginine and proline metabolism and lysine degradation, and tryptophan metabolism. In contrast, the upregulated proteins were present in three other pathways: 11 proteins in metabolic pathways, four proteins in carbon metabolism and three proteins in pyrimidine metabolism.

Predicted Virulence-Associated Proteins
The prediction of protein pathogenicity was carried out using the freely available online Support Vector Machine (SVM)-based tool VirulentPred [49]. Of the 826 proteins identified, 102 proteins (12%) were predicted to be potentially virulence-associated (Supplementary Table S3). Apart from the proteins listed in Table 4, all other differentially expressed proteins predicted as virulence-associated were listed as uncharacterized proteins. The upregulation of potentially virulence-associated proteins and the downregulation of ribosomal proteins indicate different degrees of control operations that prepare the bacterial agent for infection [62]. The identified proteins can be explored for further application in diagnostics.

Table 4 (recovered table fragments): role in cellular adhesion and virulence in Candida albicans [68]; ribosomal protein L7/L12-based subunit vaccines [69,70].

Table 4 notes: Acc. No is the UniProt ID; protein IDs marked with * were moved to UniParc as they were found to be redundant proteins. Reg: status of protein regulation; (+) denotes upregulation and (-) denotes downregulation of proteins when B. abortus is compared with B. melitensis.

Field Strains and Host Adaptability
Within the same species, the type and field strains also displayed significant variations in protein abundance. As listed in Supplementary Table S1, the type strain and field strain of B. abortus showed differences in the expression of 180 proteins, among which 97 and 83 proteins were up- and downregulated, respectively, in the field strain when compared to the type strain. B. melitensis, on the other hand, displayed differences in 224 proteins, among which 129 and 95 were up- and downregulated, respectively, in the field strain. Among these differentially expressed proteins, as listed in Table 5, 10 proteins, including several binding proteins, appeared to be highly abundant among the type strains of both species, while 19 proteins, mostly belonging to the Type IV secretion system (T4SS), were highly abundant among the field strains of B. abortus and B. melitensis. T4SS has been associated with increased host adaptability and described as an essential pathogenicity factor in several pathogens including Brucella spp., Helicobacter pylori, Legionella pneumophila and Bartonella spp. [71,72]. T4SS influences intracellular survival in the host [73,74]. Nine proteins identified as upregulated in the B. abortus field strain were found to be downregulated in the field strain of B. melitensis. Conversely, seven proteins identified as upregulated in the B. melitensis field strain were found to be downregulated in the B. abortus field strain. These proteins are worth further analysis as they might play a role in the known host-species specificity and might be useful for designing species-specific diagnostic tools.
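Finding proteins that are regulated in opposite directions in the two field strains amounts to comparing two sets of calls. The sketch below assumes that each species' field-versus-type comparison is available as a mapping from a shared protein key to "up" or "down"; a shared key presupposes that entries from the two proteomes have already been matched, which is an assumption here. The keys and function name are placeholders.

```python
def opposite_between_species(abortus_calls, melitensis_calls):
    """Return (up in B. abortus / down in B. melitensis, and the reverse)."""
    shared = abortus_calls.keys() & melitensis_calls.keys()
    up_abortus = sorted(
        p for p in shared
        if abortus_calls[p] == "up" and melitensis_calls[p] == "down"
    )
    up_melitensis = sorted(
        p for p in shared
        if melitensis_calls[p] == "up" and abortus_calls[p] == "down"
    )
    return up_abortus, up_melitensis


# Placeholder keys standing in for matched protein entries.
toy_abortus = {"prot_a": "up", "prot_b": "down", "prot_c": "up", "prot_d": "up"}
toy_melitensis = {"prot_a": "down", "prot_b": "up", "prot_c": "up", "prot_e": "down"}
up_in_abortus, up_in_melitensis = opposite_between_species(toy_abortus, toy_melitensis)
print(up_in_abortus, up_in_melitensis)
```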

Conclusions
In conclusion, our quantitative proteomic analysis of reference and field-isolated strains of B. abortus and B. melitensis confirms, not surprisingly, the existence of proteome-level differences between the strains. Besides differences in metabolic pathways, B. abortus and B. melitensis displayed differences in ABC transporters, which were shown to play a role in intracellular survival and virulence of Brucella. Field isolates displayed enhanced abundance of several binding proteins and of Type IV secretion system (T4SS) proteins, which have been associated with host adaptability and described as essential pathogenicity factors. The B. abortus field strain displayed a high abundance of 10 proteins, and seven proteins were of high abundance in B. melitensis. These proteins might play a role in host specificity. With the exception of seven proteins, all other proteins (n = 15) identified as potentially virulence-associated were uncharacterized proteins. Problems arise from the existence of multiple redundant reference proteomes for different Brucella species. The benefits of the recently introduced pan proteome concept based on protein entries of 10 known Brucella species also remain limited as long as the majority of protein entries in the UniProt database remain unreviewed or uncurated. Establishing a species-specific proteome would be useful for understanding the host specificity and infection of Brucella species. We suggest that, in addition to improvements in the reference database, further in-depth proteomic analyses be performed on these strains cocultured with their respective host cell lines. This might lead to an understanding of the mechanisms underlying the described host specificity and pathogenicity.