Identification of Pathogenic Bacteria from Public Libraries via Proteomics Analysis

Hazardous organisms may thrive on surfaces that are often exposed to human contact, including children’s library books. In this study, swab samples were taken from 42 children’s books collected from four public libraries in Texas and California. Samples were cultivated first in brain–heart infusion (BHI) medium and then in Luria broth (LB) medium containing either ampicillin or kanamycin. All 42 samples (100%) were positive for bacterial growth in normal BHI medium. Furthermore, 35 samples (83.3%) and 20 samples (47.6%) were positive in LB medium containing ampicillin or kanamycin, respectively. Bacterial populations in the samples were then identified using an Orbitrap Fusion™ Tribrid™ mass spectrometer, a state-of-the-art proteomic analysis tool. Bacterial genera identified in ampicillin-containing medium included Bacillus, Acinetobacter, Pseudomonas, Staphylococcus, Enterobacter, Klebsiella, Serratia, Streptococcus, Escherichia, Salmonella, and Enterococcus. In contrast, genera identified in kanamycin-containing medium included Staphylococcus, Streptococcus, Enterococcus, and Bacillus. The presence of pathogenic bacterial species was also confirmed. The results of this study warrant follow-up studies to assess the potential health risks of the identified pathogens. This study demonstrates the utility of proteomics in identifying environmental pathogenic bacteria for specific public health risk evaluations.


Introduction
Children's library books may serve as a vector of contagious organisms because they circulate throughout communities without established sterilization procedures. In 1985, concerns about disease transmission through libraries were raised by McClary in his article "Beware the Deadly Books" [1]. In 1994, Brook et al. reported that Staphylococcus epidermidis was recovered from four out of 15 public library books [2]. More recently, Rafiei et al. demonstrated that 20.8% of returned books from the Al-Zahra Hospital Library and the Library of Sciences Faculty of Isfahan University were culture-positive [3]. Identified bacteria included Enterobacteriaceae and coagulase-negative Staphylococcus. Gamlale et al. reported that, while airborne fungi are found throughout the city of São Paulo, Brazil, they are present in higher concentrations in libraries; in a follow-up study, 49% of 314 interviewed librarians reported asthmatic or rhinitis symptoms [4]. In the United States, there are currently two reports demonstrating the presence of bacteria in library books [5,6]. However, these reports originate from university course assignments or high school science competitions; thus, full descriptions of methods and results are not available in the scientific literature. In addition, the samples in these reports consist of university library books and chapter books rather than children's books. To our knowledge, identification of organisms in children's books in the United States has not been reported. Furthermore, the organisms that children carry can differ from those of adults [7].
Identification of bacterial species using mass spectrometry-based proteomics has a history of over 40 years. In 1970, pyrolysis-gas chromatography mass spectrometry was used to identify two microorganisms, Micrococcus luteus and Bacillus subtilis var. niger [8]. Since then, over 200 papers per year have been published on bacterial identification by mass spectrometry. Advances in mass spectrometry techniques and instrument development have improved the detection of bacterial species; however, most studies utilize matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF-MS) [9–19]. In 2013, the U.S. Food and Drug Administration (FDA) approved two MALDI-TOF-MS-based platforms for bacterial identification [20]. Although MALDI-TOF-MS-based identification is a well-established technique, it can only identify bacteria at the genus level, not at the species or subspecies level [20]. Liquid chromatography-electrospray ionization-tandem mass spectrometry (LC-ESI-MS/MS) is a more recently developed mass spectrometry system with higher sensitivity and reliability than MALDI-TOF, making it a superior platform for protein identification. The high sensitivity of LC-ESI-MS/MS is achieved through (1) effective peptide concentration on the HPLC column (50- to 200-fold before MS detection), (2) independent sequencing of peptides, and (3) use of almost all of the sample during the electrospray process, whereas MALDI-based MS analysis uses only between 0.01% and 0.1% of the sample loaded on a MALDI plate. However, these advanced techniques have not been used to evaluate the pathogenic potential of library books.
In this paper we used a state-of-the-art mass spectrometry instrument, the Thermo™ Fusion™ Tribrid™ Orbitrap mass spectrometer (San Jose, CA, USA), to identify bacteria at the species level. This study aims to identify organisms in children's books through mass spectrometry and proteomic analysis using an LC-ESI-MS/MS instrument.

Sample Collection
Ten books from each of two public libraries in Houston, Texas, and 11 books from each of two public libraries in Northern California (one in the Central Valley and one in the East Bay) were swabbed in the winter of 2018. The front and back covers and the top and sides of each book were swabbed. Samples were collected using cotton fiber-tipped sterile swabs (Fisher, Cat. No. 14-959-96B). Immediately prior to sample collection, swabs were dipped into sterile brain-heart infusion (BHI) medium to ensure bacterial recovery from environmental surfaces [21]. After the book surfaces were sampled, the swab was immediately placed into a 13 mL bacterial culture tube (Sarstedt, 62.515.006) and returned to the laboratory. As negative controls, separate clean sterile swabs were dipped into the sterile BHI medium only and placed in 13 mL bacterial culture tubes before and after the sample collection steps at each library.

Bacterial Cultures, Collection of Bacteria
Cultivation of each sample began by adding 5 mL of BHI medium to the culture tube containing the swab. The tubes were then incubated for 12 h in a 37 °C shaking incubator at a speed of 1500 rpm. After initial cultivation, 10 µL of the bacterial culture was inoculated into 5 mL of Luria broth (LB) medium containing 100 µg/mL of ampicillin or 100 µg/mL of kanamycin to select for bacteria able to grow in ampicillin- or kanamycin-containing medium. After 12 h of shaking incubation, the bacteria were pelleted by centrifugation at 2000× g and washed with PBS.

Proteomics Analysis of Bacteria
The harvested bacteria grown in ampicillin- or kanamycin-containing medium were digested and analyzed by LC-ESI-MS/MS as described in a previous publication [22]. Briefly, PBS-washed bacteria were suspended in 100 µL of lysis buffer (50 mM ammonium bicarbonate, 1 mM CaCl2) and snap-frozen in liquid nitrogen. The bacteria were then lysed by three cycles of boiling at 95 °C and snap freezing in liquid nitrogen. The protein amount was measured by the Bradford method, and 10 µg of total protein was digested with trypsin overnight. The digested peptides were vacuum-dried and then dissolved in 30 µL of 5% methanol containing 0.1% formic acid, and one-fifth of each reconstituted sample was loaded onto an nLC-1000 (Thermo Fisher Scientific, Waltham, MA, USA) coupled to an Orbitrap Fusion™ Tribrid™ mass spectrometer (Thermo Fisher Scientific). An in-house trap column (2 cm × 100 µm, Reprosil-Pur Basic C18, 3 µm) was used to enrich peptides. The trap column was then switched in-line with an in-house 5 cm × 150 µm capillary column packed with 1.9 µm Reprosil-Pur Basic C18 beads. Peptides were separated with a 75-min discontinuous gradient of 4-24% acetonitrile in 0.1% formic acid at a flow rate of 800 nL/min and then electrosprayed into the mass spectrometer. The instrument was operated using Xcalibur software ver. 4.0 (Thermo Fisher Scientific) in data-dependent mode, acquiring fragmentation spectra of the 50 most intense ions. Parent MS spectra were acquired in the Orbitrap with a full MS range of 300-1400 m/z at a resolution of 120,000. HCD-fragmented MS/MS spectra were acquired in the ion trap in rapid scan mode.

Data Analysis with Commercial and In-House Computer Software
The obtained MS/MS spectra were searched against a target-decoy bacterial ribosome database (48,718 protein sequence entries) in the Proteome Discoverer 1.4 interface (ThermoFisher, San Jose, CA, USA) with the Mascot algorithm (Mascot 2.4, Matrix Science, Boston, MA, USA). Oxidation of methionine and protein N-terminal acetylation were allowed as variable modifications. Mass tolerance was 20 ppm for precursor ions and 0.5 Da for fragment ions. A maximum of two missed trypsin cleavages was allowed. Assigned peptides were filtered at a 1% false discovery rate (FDR). The number of peptide-spectrum matches (PSMs) was used to identify the bacterial species present.
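The idea behind filtering at a 1% FDR with a target-decoy search can be illustrated with a minimal Python sketch. This is not the Proteome Discoverer or Mascot procedure; the `Psm` record, the score scale, and the simple decoys/targets estimate are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Psm:
    """Hypothetical peptide-spectrum match record (not a vendor format)."""
    peptide: str
    score: float     # assumed: higher score means a better match
    is_decoy: bool   # True if the match is to a decoy sequence


def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target PSMs from the longest score-ranked prefix whose
    estimated FDR (decoys / targets) does not exceed max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    best_cut = 0  # number of top-ranked PSMs to keep
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best_cut = i
    # Decoy hits are only used for the estimate and are dropped here.
    return [p for p in ranked[:best_cut] if not p.is_decoy]
```

In this sketch the decoy matches serve only to estimate how many target matches above a score cutoff are likely to be false; production tools typically refine this with q-values and score-tie handling.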
A Python script was written to process the bacterial protein FASTA file. A list of species-dependent unique peptides was created for ribosomal proteins to support identification of bacteria at the species and strain level. Briefly, the script works as follows. The SeqIO module from the Biopython package was used to read the protein FASTA file. Unique proteins were selected by counting protein accessions, so that each protein has only one identifier. Proteins were cleaved in silico into tryptic peptide sequences using the parser module from the pyteomics package [23]. Peptides were then mapped to their corresponding protein accessions. A list of unique peptides was created from the peptides that mapped to only one protein accession. The script is available at https://github.com/bbhatt1789/library-germs.git.
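The steps described above can be sketched in a few lines of Python. The following is not the authors' script: to stay self-contained it replaces Biopython's SeqIO and the pyteomics parser with a small standard-library FASTA reader and a simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages by default). The function names and the `min_length` cutoff are assumptions for illustration.

```python
import re
from collections import defaultdict


def read_fasta(text):
    """Yield (accession, sequence) pairs from FASTA-formatted text.
    The accession is assumed to be the first token of the header."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header.split()[0], "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header.split()[0], "".join(chunks)


def tryptic_peptides(sequence, missed_cleavages=0, min_length=6):
    """In silico trypsin digest: cleave after K/R unless followed by P."""
    bounds = [0] + [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    if bounds[-1] != len(sequence):
        bounds.append(len(sequence))
    peptides = set()
    for i in range(len(bounds) - 1):
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptide = sequence[bounds[i]:bounds[j]]
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return peptides


def unique_peptides(fasta_text, min_length=6):
    """Map each peptide found in exactly one accession to that accession."""
    owners = defaultdict(set)
    for accession, sequence in read_fasta(fasta_text):
        for peptide in tryptic_peptides(sequence, min_length=min_length):
            owners[peptide].add(accession)
    return {p: next(iter(a)) for p, a in owners.items() if len(a) == 1}
```

Because leucine and isoleucine have the same mass, a fuller implementation might treat peptides that differ only by I/L as identical before testing uniqueness; that refinement is omitted here.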

Data Availability
The mass spectrometry data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the MassIVE repository (MSV000083354) with the dataset identifier PXD012473.

Growth of Bacteria in Antibiotic Medium
In this study, 42 samples were analyzed: 20 from the Houston libraries (Houston 1 and Houston 2) and 22 from a public library in the Central Valley (CV) and a public library in the East Bay (EB) in Northern California. All 42 samples were positive for bacterial growth in normal BHI medium (Figure 1c). The BHI-grown bacterial culture was inoculated into either ampicillin- (Figure 1d) or kanamycin-containing (Figure 1e) LB medium to select for antibiotic-resistant bacteria. After 12 h of incubation in a 37 °C shaking incubator, 33 samples (79%) and 23 samples (55%) were positive for bacterial growth in ampicillin- or kanamycin-containing LB medium, respectively. Culture results and the library of origin for each sample are shown in Figure 1a. Overall, growth in ampicillin-containing medium was much more frequent than growth in kanamycin-containing medium.

Identified Bacteria Grown in Ampicillin or Kanamycin-Containing Medium
Bacteria grown in ampicillin- or kanamycin-containing medium were digested with trypsin and analyzed by LC-ESI-MS/MS as shown in Figure 2a. Acquired MS raw files were searched against a bacterial ribosomal protein FASTA database extracted from the NCBI non-redundant RefSeq protein database (NCBInr) using all known bacterial taxa and "ribosome" as keywords. Because the entire bacterial protein database was two-thirds the size of the NCBInr RefSeq database (552,817,090 entries), searching the raw MS files against it was impractical: over 12 h would be required to search one MS file. We therefore created a far smaller protein database to keep data analysis time manageable. Ribosomal proteins are good candidates for such an approach because they are essential proteins regardless of taxon. Although most ribosomal proteins are highly conserved among bacteria, some vary between species, and they have proven useful for classifying bacterial isolates to the sub-species or strain level [24–28]. Using the bacterial ribosomal protein database, which contains 48,718 protein sequences, reduced the average search time for one MS file to 45 min. Fifteen bacterial genera were discovered from ampicillin-containing culture medium and four bacterial genera were identified from kanamycin-containing culture medium (Figure 2b). Bacillus and Staphylococcus were the most common genera from ampicillin- and kanamycin-containing media, respectively. The full list of bacteria grown is summarized in Table S1. The identified protein and peptide lists for bacteria from ampicillin medium (Table S2) and kanamycin medium (Table S3) are also provided.
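Reducing a large protein FASTA file to ribosomal entries can be sketched as a streaming keyword filter. The exact query the authors ran against NCBI is not specified beyond the keywords, so this Python sketch assumes a local FASTA file whose header lines carry protein descriptions; the function name and the default keyword are hypothetical.

```python
def filter_fasta_by_keyword(lines, keyword="ribosomal"):
    """Yield only the FASTA records whose header contains the keyword.

    `lines` can be any iterable of text lines (for example an open file),
    so a very large database is never held in memory at once.
    """
    wanted = keyword.lower()
    keep = False
    for line in lines:
        if line.startswith(">"):
            # Decide once per record, based on its header line.
            keep = wanted in line.lower()
        if keep:
            yield line
```

A plain substring match also retains proteins that merely mention ribosomes in their description (for example, ribosomal RNA methyltransferases), so a stricter pattern may be preferable in practice.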

[Figure 2b, data partially recovered. Per-genus totals across the libraries: Bacillus, 125; Acinetobacter, 36; Pseudomonas, 35; Staphylococcus, 25; Enterobacter, 24; Klebsiella, 12; Streptococcus, 11. The Serratia total and the per-library column assignments are not recoverable.]

Identification of Bacteria at the Species Level
We further investigated the peptides recovered from bacteria grown in ampicillin- or kanamycin-containing media to identify bacteria at the species and subspecies level. Assigning bacteria at the genus level based on peptides carries some uncertainty, since a given peptide may occur in different genera and species. Detection of unique peptides from species-specific ribosomal proteins is therefore a promising approach for unambiguous identification of bacteria at the species level. We developed a Python-based script that allows simple and highly efficient detection of such unique peptides. The species-specific unique peptide list generated by the Python script was compared to the recovered peptide list. Any recovered peptide that matches a unique peptide from the Python script indicates the presence of that bacterial species in the sample. Figure 3a shows the workflow of the species-specific peptide identification steps. As shown in Figure 3b, 26 and eight types of bacteria were identified at the species or sub-species level from ampicillin- and kanamycin-containing medium, respectively. The species-specific unique peptides identified are listed in Table S4. Streptococcus pneumoniae was the most common bacterium from ampicillin-containing medium. A few Bacillus species, including B. cereus and B. subtilis, were also found as major bacteria (Figure 3b). In kanamycin-containing medium samples, Staphylococcus haemolyticus appeared to be the most common species, followed by Enterococcus asini, as well as a few other Staphylococcus species such as S. lentus, S. warneri, and S. xylosus.
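The comparison of recovered peptides against the species-specific unique peptide list amounts to a dictionary lookup, sketched below in Python. The function name, the input shapes, and the example species labels are assumptions for illustration; the authors' script may organize this step differently.

```python
from collections import Counter


def species_evidence(recovered_peptides, unique_peptide_to_species):
    """Count, per species, how many distinct recovered peptides match
    a peptide that is unique to that species."""
    hits = Counter()
    for peptide in set(recovered_peptides):  # count each peptide once
        species = unique_peptide_to_species.get(peptide)
        if species is not None:
            hits[species] += 1
    return hits
```

For example, with a hypothetical map `{"LLLLLLEK": "Species A"}`, a recovered list containing `"LLLLLLEK"` twice yields one supporting peptide for Species A, while peptides shared between species are absent from the map and contribute nothing.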

Discussion
Microorganisms exist in every environment where people are active and are generally more beneficial than harmful to humans. The purpose of this study was to investigate the presence of harmful bacteria in children's books from public libraries by mass spectrometry analysis. Two antibiotics, kanamycin, which mainly works against Gram (-) bacteria, and ampicillin, selective against Gram (+) bacteria, were used to enrich antibiotic-resistant bacteria from the expected large numbers of antibiotic-sensitive bacteria on the books. The bacteria grown in kanamycin-containing medium were all Gram (+) bacteria (Figures 2b and 3b).
S. haemolyticus is the second most frequently clinically isolated opportunistic pathogen (S. epidermidis is the first); it can cause meningitis, skin or soft tissue infections, prosthetic joint infections, or bacteremia [29], and has the highest level of antimicrobial resistance [30]. S. aureus is one of the leading pathogens for nosocomial infections showing multi-drug resistance (MDR) [31,32]. All these strains are associated with the human skin, gastrointestinal tract, and urogenital tract [33,34].
Both Gram (+) and Gram (-) bacteria were identified in ampicillin-containing medium. The Gram (+) bacterium S. pneumoniae was another commonly identified species in this study. S. pneumoniae is particularly dangerous for young children, older adults, and persons with underlying comorbidities [35,36]. The Gram (-) bacterium A. baumannii is known to be one of the most severe MDR pathogens [37]. It often causes problems in immunocompromised individuals, particularly those who have experienced a prolonged (over 90 days) hospital stay [38]. It is known to spread through the skin as well as the respiratory and oropharyngeal secretions of infected individuals [39]. Because it has an exceptional ability to develop resistance to all currently available antibiotics, it has been designated a "red-alert" human pathogen [40]. S. aureus and A. baumannii are members of ESKAPE (an acronym of Enterococcus faecium, S. aureus, Klebsiella pneumoniae, A. baumannii, P. aeruginosa, and Enterobacter), a group of pathogens commonly associated with MDR [41]. Bacillus species are ubiquitous in nature and are non-pathogenic except for two species: Bacillus anthracis, which causes anthrax [42], and B. cereus, which causes food poisoning [43]. Most Pseudomonas spp. are naturally resistant to β-lactam antibiotics such as ampicillin [44]; therefore, many Pseudomonas species were identified in this study, although the well-known opportunistic human pathogen P. aeruginosa was not identified.
Although some differences were observed depending on the library and area (Texas vs. California), more samples are required to claim any significant differences in the presence of pathogenic bacteria (Figures 2b and 3b).
In addition to the pathogenic bacteria identified in this study, other unidentified pathogenic bacteria could be present in the same sample because pathogenic bacteria were identified on the basis of resistance to ampicillin and kanamycin (Supplemental Table S1).
Species of Staphylococcus, Bacillus, Enterobacteriaceae, and Pseudomonas are the most common bacteria identified in the hospital environment [3]. All of these species are also identified in children's books in this study, suggesting that public library books could be responsible for bacterial transmission among children. Our results emphasize the importance of hand sanitizing after reading a book in the library and periodic sterilization of library books.
In this work we applied LC-ESI-MS/MS to detect bacteria at the species level. The advantages of deeper peptide coverage in LC-ESI-MS/MS compared to the previously established MALDI-TOF method are well documented in numerous global proteome profiling studies [45,46]. Despite its advantages in cost effectiveness and shorter turnaround time, the MALDI-TOF method is only able to detect abundant proteins and identify bacteria down to the genus level [20]. For example, Balazova et al. could only identify Mycobacterium at the genus level in their study testing the influence of culture conditions using a mixed culture of two known Mycobacterium species [47]. By comparison, our current LC-ESI-MS/MS-based method can detect 10,000 peptides within 1 h of MS instrument time, with protein coverage spanning a range on the order of 10⁷ [22].

Conclusions
This study describes a simple and rapid method for the direct identification of bacteria from environmental media, which has the potential to discover and identify bacteria from various samples. This technique could be applied to species- or sub-species-level identification of bacteria directly from clinical samples such as blood cultures. Compared to conventional MALDI-TOF-based detection methods, the deeper coverage of bacterial peptides achieved by this method enables identification of bacteria at the species or sub-species level.
Supplementary Materials: The following are available online at http://www.mdpi.com/1660-4601/16/6/912/s1, Table S1. List of bacteria from each book sample grown in ampicillin or kanamycin medium. Table S2. Protein and peptide list detected from bacteria grown in ampicillin medium. Table S3. Protein and peptide list detected from bacteria grown in kanamycin medium. Table S4. Bacteria species-specific unique peptide list.
Author Contributions: R.H.J. and M.K. contributed to all aspects of this work; B.B. provided the unique peptide search tool for data analysis; J.M.C. supervised the MS analysis of samples, the MS data analysis, and manuscript writing; J.H.R. provided the project idea, was involved in manuscript writing, and supervised this work.

Conflicts of Interest:
The authors declare no conflict of interest.