Memory-Efficient Searching of Gas-Chromatography Mass Spectra Accelerated by Prescreening
Abstract
1. Introduction
2. Results
2.1. Original ADAP-KDB Spectral Search Algorithm
Listing 1. A pseudo SQL query for calculating similarity scores between a query spectrum and all library spectra. The spectral information for all library spectra is stored in the table Peak with columns Mz, Intensity, TotalIntensity, and SpectrumId. The query spectrum is represented by its m/z values, the corresponding scaled intensities, and the total scaled intensity. The query returns a table of similarity scores and IDs for the library spectra whose scores exceed the spectral similarity score threshold.
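The query described in the caption can be sketched as follows. This is a minimal illustration, not the ADAP-KDB implementation: it uses SQLite through Python's `sqlite3` module rather than MySQL, keeps the table Peak and its four columns from the caption, and assumes everything else. In particular, the temporary table `QueryPeak`, the helper names `connect_demo` and `search`, exact matching on integer m/z values, and the score formula (the squared sum of square roots of matched intensity products, divided by the product of the two total intensities) are assumptions made for this sketch.

```python
import math
import sqlite3


def connect_demo():
    """Build a tiny in-memory library (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    # Register SQRT explicitly: SQLite's built-in math functions are optional.
    conn.create_function("SQRT", 1, math.sqrt)
    conn.execute(
        "CREATE TABLE Peak (Mz INTEGER, Intensity REAL, "
        "TotalIntensity REAL, SpectrumId INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Peak VALUES (?, ?, ?, ?)",
        [
            (50, 4.0, 5.0, 1), (60, 1.0, 5.0, 1),  # library spectrum 1
            (50, 1.0, 5.0, 2), (70, 4.0, 5.0, 2),  # library spectrum 2
        ],
    )
    return conn


def search(conn, query_peaks, threshold):
    """Return (SpectrumId, Score) rows with Score > threshold, best match first.

    query_peaks is a list of (mz, scaled_intensity) pairs for the query spectrum.
    """
    total_q = sum(intensity for _, intensity in query_peaks)
    # Load the query spectrum into a temporary table (assumed helper table).
    conn.execute("DROP TABLE IF EXISTS temp.QueryPeak")
    conn.execute("CREATE TEMP TABLE QueryPeak (Mz INTEGER, Intensity REAL)")
    conn.executemany("INSERT INTO QueryPeak VALUES (?, ?)", query_peaks)
    sql = """
        SELECT SpectrumId, Score
        FROM (
            SELECT p.SpectrumId AS SpectrumId,
                   -- assumed score: (sum of sqrt(I_lib * I_query))^2
                   --                / (Total_lib * Total_query)
                   SUM(SQRT(p.Intensity * q.Intensity))
                     * SUM(SQRT(p.Intensity * q.Intensity))
                     / (MAX(p.TotalIntensity) * :total_q) AS Score
            FROM Peak AS p
            JOIN temp.QueryPeak AS q ON q.Mz = p.Mz  -- exact m/z match (assumption)
            GROUP BY p.SpectrumId
        )
        WHERE Score > :threshold
        ORDER BY Score DESC
    """
    params = {"total_q": total_q, "threshold": threshold}
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = connect_demo()
    for spectrum_id, score in search(conn, [(50, 4.0), (60, 1.0)], 0.5):
        print(spectrum_id, round(score, 3))
```

Because the join touches every library peak that shares an m/z value with the query, a query of this shape scores every candidate spectrum before the threshold is applied. High-resolution data would need a tolerance window in the join condition instead of exact equality.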
2.2. New ADAP-KDB Spectral Search Algorithm with the Prescreening Search
2.3. Comparison of ADAP-KDB Spectral Search Algorithms
3. Discussion
4. Materials and Methods
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Query spectra | ADAP-KDB (Orig) | ADAP-KDB (Pre) | MSSearch (Id) | MSSearch (High-Res) |
---|---|---|---|---|
1111 low-res spectra | 12.5 | 3.5 | 1.1 | — |
1446 high-res spectra | 3.4 | 4.4 | 1.2 | 1.0 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Smirnov, A.; Liao, Y.; Du, X. Memory-Efficient Searching of Gas-Chromatography Mass Spectra Accelerated by Prescreening. Metabolites 2022, 12, 491. https://doi.org/10.3390/metabo12060491