Small Molecule Identification beyond the Crystal Ball - Selected Papers from CASMI

A special issue of Metabolites (ISSN 2218-1989). This special issue belongs to the section "Advances in Metabolomics".

Deadline for manuscript submissions: closed (31 March 2013) | Viewed by 61872

Special Issue Editors


Dr. Steffen Neumann
Guest Editor
Department of Stress- and Developmental Biology, Institute of Plant Biochemistry, Weinberg 3, 06120 Halle, Germany
Interests: mass spectrometry; metabolite profiling; metabolite identification; bioinformatics; databases

Dr. Emma Schymanski
Guest Editor
Eawag: Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dübendorf, Switzerland
Interests: non-target environmental screening; mass spectrometry; structure generation; QSARs/QSPRs for candidate selection

Special Issue Information

Dear Colleagues,

The full set of challenges in the CASMI contest (Critical Assessment of Small Molecule Identification) went online a few weeks ago. Now is the time to plan the special issue of the MDPI journal Metabolites, which will make up the proceedings of CASMI. Please also note that we will shortly send an e-mail to the casmi-announce mailing list with updates on CASMI participation and evaluation details. Remember to sign up to the list(s) if you haven’t already.

Call for Papers: CASMI proceedings

The special issue will contain a summary in which the organisers give details about the challenge compounds and the evaluation measures and comment on the submissions. The main content, however, will be your articles about your participation: we want descriptions of how you actually tackled the challenges, which steps were automagic and, if necessary, where manual intervention was required. If your approach was published previously, please focus on what is relevant for the CASMI challenges and on improvements that have been made since the last publication. All articles will be peer-reviewed by at least two reviewers from the community. There is no fixed page limit, but the reviewers will check that the length is adequate.

Dr. Emma Schymanski
Dr. Steffen Neumann
Guest Editors

Important Dates:
31.03.2013 Submission deadline for full papers in the journal Metabolites (MDPI.com)
31.12.2012 Last day of early-bird article pre-registration

Submission

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. Papers will be published continuously (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and communications are invited.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are refereed through a peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Metabolites is an international peer-reviewed Open Access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) is 300 CHF per accepted paper for papers submitted after 1 January 2013. English correction and/or formatting fees of 250 CHF (Swiss Francs) will be charged in certain cases for those articles accepted for publication that require extensive additional formatting and/or English corrections.

Published Papers (8 papers)

Research

Article
Tackling CASMI 2012: Solutions from MetFrag and MetFusion
by Christoph Ruttkies, Michael Gerlich and Steffen Neumann
Metabolites 2013, 3(3), 623-636; https://doi.org/10.3390/metabo3030623 - 05 Aug 2013
Cited by 7 | Viewed by 7003
Abstract
The task in the critical assessment of small molecule identification (CASMI) contest category 2 was to determine the identification of (initially) unknown compounds for which high-resolution tandem mass spectra were published. We focused on computer-assisted methods that tried to correctly identify the compound automatically and entered the contest with MetFrag and MetFusion to score candidate structures retrieved from the PubChem structure database. MetFrag was combined with the metabolite-likeness score, which helped to improve the performance for the natural product challenges. We present the results, discuss the performance, and give details of how to interpret the MetFrag and MetFusion output.

Article
The Critical Assessment of Small Molecule Identification (CASMI): Challenges and Solutions
by Emma L. Schymanski and Steffen Neumann
Metabolites 2013, 3(3), 517-538; https://doi.org/10.3390/metabo3030517 - 25 Jun 2013
Cited by 28 | Viewed by 9223
Abstract
The Critical Assessment of Small Molecule Identification, or CASMI, contest was founded in 2012 to provide scientists with a common open dataset to evaluate their identification methods. In this article, the challenges and solutions for the inaugural CASMI 2012 are presented. The contest was split into four categories corresponding with tasks to determine molecular formula and molecular structure, each from two measurement types, liquid chromatography-high resolution mass spectrometry (LC-HRMS), where preference was given to high mass accuracy data, and gas chromatography-electron impact-mass spectrometry (GC-MS), i.e., unit accuracy data. These challenges were obtained from plant material, environmental samples and reference standards. It was surprisingly difficult to obtain data suitable for a contest, especially for GC-MS data where existing databases are very large. The level of difficulty of the challenges is thus quite varied. In this article, the challenges and the answers are discussed, and recommendations for challenge selection in subsequent CASMI contests are given.

Article
Molecular Formula Identification with SIRIUS
by Kai Dührkop, Kerstin Scheubert and Sebastian Böcker
Metabolites 2013, 3(2), 506-516; https://doi.org/10.3390/metabo3020506 - 13 Jun 2013
Cited by 32 | Viewed by 7302
Abstract
We present results of the SIRIUS2 submission to the 2012 CASMI contest. Only results for Category 1 (molecular formula identification) were submitted. The SIRIUS method and the parameters used are briefly described, followed by detailed analysis of the results and a discussion of cases where SIRIUS2 was unable to come up with the correct molecular formula. SIRIUS2 returns consistently high quality results, with the exception of fragmentation pattern analysis of time-of-flight data. We then discuss possibilities for further improving SIRIUS2 in the future.

Article
Metabolite Identification through Machine Learning— Tackling CASMI Challenge Using FingerID
by Huibin Shen, Nicola Zamboni, Markus Heinonen and Juho Rousu
Metabolites 2013, 3(2), 484-505; https://doi.org/10.3390/metabo3020484 - 06 Jun 2013
Cited by 21 | Viewed by 8372
Abstract
Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules. To alleviate this bottleneck, computational methods and tools that reliably filter the set of candidates are needed for further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBank have opened the door for developing a new genre of metabolite identification methods that rely on machine learning as the primary vehicle for identification. In this paper we describe the machine learning approach used in FingerID, its application to the CASMI challenges and some results that were not part of our challenge submission. In short, FingerID learns to predict molecular fingerprints from a large collection of MS/MS spectra, and uses the predicted fingerprints to retrieve and rank candidate molecules from a given large molecular database. Furthermore, we introduce a web server for FingerID, which was applied for the first time to the CASMI challenges. The challenge results show that the new machine learning framework produces competitive results on those challenge molecules that were found within the relatively restricted KEGG compound database. Additional experiments on the PubChem database confirm the feasibility of the approach even on a much larger database, although room for improvement still remains.
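
As an illustration of the retrieve-and-rank step described in this abstract, the following minimal Python sketch orders candidate molecules by how well their fingerprints match a predicted fingerprint. It is not FingerID's implementation: the Tanimoto similarity used as the score, the function names, and the toy fingerprints are all assumptions made for the example, and the prediction of fingerprints from MS/MS spectra is not shown.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def rank_candidates(
    predicted: Set[int], candidates: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Score every candidate against the predicted fingerprint, best first."""
    scored = [(name, tanimoto(predicted, bits)) for name, bits in candidates.items()]
    # Sort by descending score; ties are broken alphabetically by name.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical data: bit positions that are set in each fingerprint.
predicted_bits = {1, 4, 7, 9}
candidate_bits = {
    "candidate_A": {1, 4, 7, 9},
    "candidate_B": {1, 4, 8},
    "candidate_C": {2, 3},
}

ranking = rank_candidates(predicted_bits, candidate_bits)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a setting like the one the abstract describes, the candidate dictionary would be filled from a compound database such as KEGG or PubChem; the sketch leaves that lookup out.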

Article
Small Molecule Identification with MOLGEN and Mass Spectrometry
by Markus Meringer and Emma L. Schymanski
Metabolites 2013, 3(2), 440-462; https://doi.org/10.3390/metabo3020440 - 28 May 2013
Cited by 30 | Viewed by 7345
Abstract
This paper details the MOLGEN entries for the 2012 CASMI contest for small molecule identification to demonstrate structure elucidation using structure generation approaches. Different MOLGEN programs were used for different categories, including MOLGEN–MS/MS for Category 1, MOLGEN 3.5 and 5.0 for Category 2 and MOLGEN–MS for Categories 3 and 4. A greater focus is given to Categories 1 and 2, as most CASMI participants entered these categories. The settings used and the reasons behind them are described in detail, while various evaluations are used to put these results into perspective. As one author was also an organiser of CASMI, these submissions were not part of the official CASMI competition, but this paper provides an insight into how unknown identification could be performed using structure generation approaches. The approaches are semi-automated (category dependent) and benefit greatly from user experience. Thus, the results presented and discussed here may be better than those an inexperienced user could obtain with MOLGEN programs.

Article
CASMI—The Small Molecule Identification Process from a Birmingham Perspective
by J. William Allwood, Ralf J.M. Weber, Jiarui Zhou, Shan He, Mark R. Viant and Warwick B. Dunn
Metabolites 2013, 3(2), 397-411; https://doi.org/10.3390/metabo3020397 - 21 May 2013
Cited by 10 | Viewed by 6001
Abstract
The Critical Assessment of Small Molecule Identification (CASMI) contest was developed to provide a systematic comparative evaluation of strategies applied for the annotation and identification of small molecules. The authors participated in eleven challenges in both category 1 (to deduce a molecular formula) and category 2 (to deduce a molecular structure) related to high resolution LC-MS data. For category 1 challenges, the PUTMEDID_LCMS workflows provided the correct molecular formula in nine challenges; the two incorrect submissions were related to a larger mass error in experimental data than expected or the absence of the correct molecular formula in a reference file applied in the PUTMEDID_LCMS workflows. For category 2 challenges, MetFrag was applied to construct in silico fragmentation data and compare with experimentally-derived MS/MS data. The submissions for three challenges were correct, and for eight challenges, the submissions were not correct; some submissions showed similarity to the correct structures, while others showed no similarity. The low number of correct submissions for category 2 was a result of applying the assumption that all chemicals were derived from biological samples and highlights the importance of knowing the origin of biological or chemical samples studied and the metabolites expected to be present to define the correct chemical space to search in annotation processes.
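
To make concrete why a larger-than-expected mass error can cause a correct formula to be missed, the following Python sketch computes the relative mass error in parts per million (ppm) and checks it against a tolerance window. It is not taken from the PUTMEDID_LCMS workflows: the function names, the 5 ppm default, and the m/z values are hypothetical and chosen only for illustration.

```python
def mass_error_ppm(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error of a measurement, in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(
    measured_mz: float, theoretical_mz: float, tolerance_ppm: float = 5.0
) -> bool:
    """True if the candidate's theoretical m/z lies inside the ppm window."""
    return abs(mass_error_ppm(measured_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical values: one candidate formula with theoretical m/z 200.0000.
theoretical = 200.0000
good_measurement = 200.0004   # about 2 ppm off, candidate is kept
poor_measurement = 200.0030   # about 15 ppm off, candidate is rejected

print(within_tolerance(good_measurement, theoretical))
print(within_tolerance(poor_measurement, theoretical))
```

With a window narrower than the actual instrument error, the second measurement would exclude the correct formula from the candidate list before any scoring takes place.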

Article
Applying Tandem Mass Spectral Libraries for Solving the Critical Assessment of Small Molecule Identification (CASMI) LC/MS Challenge 2012
by Herbert Oberacher
Metabolites 2013, 3(2), 312-324; https://doi.org/10.3390/metabo3020312 - 26 Apr 2013
Cited by 15 | Viewed by 7507
Abstract
The “Critical Assessment of Small Molecule Identification” (CASMI) contest was aimed in testing strategies for small molecule identification that are currently available in the experimental and computational mass spectrometry community. We have applied tandem mass spectral library search to solve Category 2 of the CASMI Challenge 2012 (best identification for high resolution LC/MS data). More than 230,000 tandem mass spectra part of four well established libraries (MassBank, the collection of tandem mass spectra of the “NIST/NIH/EPA Mass Spectral Library 2012”, METLIN, and the ‘Wiley Registry of Tandem Mass Spectral Data, MSforID’) were searched. The sample spectra acquired in positive ion mode were processed. Seven out of 12 challenges did not produce putative positive matches, simply because reference spectra were not available for the compounds searched. This suggests that to some extent the limited coverage of chemical space with high-quality reference spectra is still a problem encountered in tandem mass spectral library search. Solutions were submitted for five challenges. Three compounds were correctly identified (kanamycin A, benzyldiphenylphosphine oxide, and 1-isopropyl-5-methyl-1H-indole-2,3-dione). In the absence of any reference spectrum, a false positive identification was obtained for 1-aminoanthraquinone by matching the corresponding sample spectrum to the structurally related compounds N-phenylphthalimide and 2-aminoanthraquinone. Another false positive result was submitted for 1H-benz[g]indole; for the 1H-benz[g]indole-specific sample spectra provided, carbazole was listed as the best matching compound. In this case, the quality of the available 1H-benz[g]indole-specific reference spectra was found to hamper unequivocal identification.

Review

Review
CASMI: And the Winner is . . .
by Emma L. Schymanski and Steffen Neumann
Metabolites 2013, 3(2), 412-439; https://doi.org/10.3390/metabo3020412 - 24 May 2013
Cited by 26 | Viewed by 8226
Abstract
The Critical Assessment of Small Molecule Identification (CASMI) Contest was founded in 2012 to provide scientists with a common open dataset to evaluate their identification methods. In this review, we summarize the submissions, evaluate procedures and discuss the results. We received five submissions (three external, two internal) for LC–MS Category 1 (best molecular formula) and six submissions (three external, three internal) for LC–MS Category 2 (best molecular structure). No external submissions were received for the GC–MS Categories 3 and 4. The team of Dunn et al. from Birmingham had the most answers in the 1st place for Category 1, while Category 2 was won by H. Oberacher. Despite the low number of participants, the external and internal submissions cover a broad range of identification strategies, including expert knowledge, database searching, automated methods and structure generation. The results of Category 1 show that complementing automated strategies with (manual) expert knowledge was the most successful approach, while no automated method could compete with the power of spectral searching for Category 2—if the challenge was present in a spectral library. Every participant topped at least one challenge, showing that different approaches are still necessary for interpretation diversity.