Communication
Peer-Review Record

SimilarityLab: Molecular Similarity for SAR Exploration and Target Prediction on the Web

Processes 2021, 9(9), 1520; https://doi.org/10.3390/pr9091520
by Steven Shave * and Manfred Auer *
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 31 July 2021 / Revised: 19 August 2021 / Accepted: 25 August 2021 / Published: 27 August 2021
(This article belongs to the Special Issue Molecular Modeling: Computer-Aided Drug Design)

Round 1

Reviewer 1 Report

Hi Authors,

Thank you for your work.

I personally like your work and I believe we need this kind of work.

We have a lot of effective small molecules that don’t have FDA approval.

I’m just wondering if you could add a demonstration section with one or two molecules to your manuscript.

In this way, you would show a short path to understanding a drug’s mechanism of action and help researchers to work economically.

Also, it would be great if you could enrich your study with a well-known database (e.g., DrugBank).

Hopefully, this is not too much to ask.

Many thanks

Please stay safe

 

Author Response

Reviewer comment:

Hi Authors, Thank you for your work. I personally like your work and I believe we need this kind of work. We have a lot of effective small molecules that don’t have FDA approval. I’m just wondering if you could add a demonstration section with one or two molecules to your manuscript. In this way, you would show a short path to understanding a drug’s mechanism of action and help researchers to work economically.

Author response:

We were happy to read of the reviewer’s enthusiasm for our work. We have expanded the manuscript, first giving a step-by-step guide to retrieving molecules similar to the approved drug diclofenac and then, as the reviewer requested, moving on to target and mode of action prediction, the conclusions of which are supported by the literature and by additional references.

 

Reviewer comment:

Also, it would be great if you could enrich your study with a well-known database (e.g., DrugBank).

Author response:

We thank the reviewer for their comment. We have started conformer generation on the latest release of DrugBank (v5.1.8), which is on target to complete by our planned September SimilarityLab update and will allow searches for similar molecules within DrugBank.

Reviewer 2 Report

Overall comments

In the presented work Shave and Auer present SimilarityLab, a web-based tool that uses the USRCAT measure to determine structurally similar compounds to a user-defined query. Overall, I find the simplicity of the tool useful, and I think the application has the potential to fill a niche. 

However, I would like to see the authors put their work in the context of other previously-published similar tools, for instance USR-VS (http://usr.marseille.inserm.fr/, doi:10.1093/nar/gkw320). The latter seems to use the same similarity measures as the authors here, albeit on the ZINC database, last extracted in 2013. If other similar tools exist, advantages and disadvantages of the provided application should be briefly discussed.


Comments about the application specifically

* L82. Why is the activity threshold of 10uM hard-coded? Could a user specify a different one for the search?

* L78. Why is the similarity search result limited to 200 hits? It would be useful as a user to extract a larger number for other use-cases such as virtual screening. Same argument goes for the target prediction part of the software.

* How was the 0.67 threshold to remove non-druglike small molecules decided on?

* Right now it seems that the only accepted input format for a user is a SMILES string. Would it make sense to also provide the option for user-specific 3D conformations (e.g. as an .sdf or .xyz file)?

* In the target prediction part of the software, individual links to the found ChEMBL ids, as well as the targets found would be useful for the end user.

* Regarding the GitHub repository. It appears that most of the provided backend is provided via Python, but no requirements file is provided for future reproducibility/installation. At least a virtual or conda environment with version specifications for the packages used should be provided. At best, some form of CI/CD practice could also be applied.

Other comments

L27. Additional “.” inserted.

Author Response

Reviewer comment:

In the presented work Shave and Auer present SimilarityLab, a web-based tool that uses the USRCAT measure to determine structurally similar compounds to a user-defined query. Overall, I find the simplicity of the tool useful, and I think the application has the potential to fill a niche. 

However, I would like to see the authors put their work in the context of other previously-published similar tools, for instance USR-VS (http://usr.marseille.inserm.fr/, doi:10.1093/nar/gkw320). The latter seems to use the same similarity measures as the authors here, albeit on the ZINC database, last extracted in 2013. If other similar tools exist, advantages and disadvantages of the provided application should be briefly discussed.

Author response

We thank the reviewer for this advice, which we have acted on to strengthen our paper. We were initially apprehensive about pointing out weaknesses in similar approaches and tools. However, the out-of-date nature of their underlying databases does highlight a key strength of SimilarityLab, in that tools are in place for periodic updates. For instance, the planned September 2021 update will refresh our commercial chemical space and see integration of the DrugBank database. We have added text to the manuscript pointing out the availability of USR-VS, while highlighting that it queries a database last updated in 2013. We had previously used a similar service named EDULISS, but it seems that this is now offline, so we do not bring it to the reader’s attention. We have also referred to the UFSRAT website (a similar 3D molecular similarity technique), which likewise makes use of outdated databases.

 

Reviewer comment:

(Line 82). Why is the activity threshold of 10uM hard-coded? Could a user specify a different one for the search?

Author response:

This is a pertinent question, to which we gave careful thought when first incorporating ChEMBL data into our data science and cheminformatics activities. We needed to decide upon an activity cut-off for ChEMBL, as incorporating even millimolar activities would not be beneficial. Any such cut-off is somewhat arbitrary; we chose 10 µM because it is the standard screening concentration in industry primary screens and is often used as an activity cut-off for beginning medicinal chemistry. Allowing a user-definable cut-off would require further cloud computing resources and limit the funded lifetime of the SimilarityLab website, which is currently funded for 5 years, although we will be pursuing follow-on funding and industrial/commercial partnerships to extend this. With future grant funding, this extension can be envisaged.
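As an illustration of how a fixed 10 µM cut-off might be applied to activity records, here is a minimal sketch. It is not the SimilarityLab implementation: the `Activity` record, the unit table, the example records, and the choice to treat exactly 10 µM as active are all assumptions made for this example.

```python
from dataclasses import dataclass

CUTOFF_NM = 10_000.0  # 10 µM expressed in nM

# Conversion factors to nM for a few common concentration units.
TO_NM = {"nM": 1.0, "uM": 1_000.0, "mM": 1_000_000.0}


@dataclass(frozen=True)
class Activity:
    """Hypothetical activity record; field names are illustrative only."""
    molecule_id: str
    target_id: str
    value: float
    units: str


def is_active(record: Activity, cutoff_nm: float = CUTOFF_NM) -> bool:
    """Return True when the measured concentration is at or below the cut-off."""
    return record.value * TO_NM[record.units] <= cutoff_nm


# Invented example records, not real ChEMBL data.
records = [
    Activity("MOL-A", "TARGET-1", 250.0, "nM"),
    Activity("MOL-B", "TARGET-1", 10.0, "uM"),
    Activity("MOL-C", "TARGET-2", 2.0, "mM"),
]
actives = [r.molecule_id for r in records if is_active(r)]
```

Because the filter is applied once to the whole database, a single cut-off keeps the precomputed dataset small; a per-query cut-off would mean storing and searching the weaker activities as well.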

 

Reviewer comment:

(Line 78). Why is the similarity search result limited to 200 hits? It would be useful as a user to extract a larger number for other use-cases such as virtual screening. Same argument goes for the target prediction part of the software.

Author response

We thank the reviewer for pointing out the usefulness of larger molecular similarity datasets in virtual screening campaigns, and we agree that SimilarityLab would be more useful (in both research and teaching) with this proposed feature expansion. We now have a user option for increasing the number of similar molecules returned, up to 2,000, and have implemented the same user-accessible option for the mechanism of action determination capabilities. This change has been tested and deployed to the SimilarityLab website.
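A user-selectable hit count with an upper bound could be handled along the following lines. This is a sketch only: the function name, the score dictionary, and the clamping behaviour are assumptions, and only the limits of 200 and 2,000 come from the text above.

```python
import heapq

DEFAULT_HITS = 200  # original fixed limit
MAX_HITS = 2_000    # upper limit described in the response


def top_hits(scores: dict, requested: int = DEFAULT_HITS) -> list:
    """Return the highest-scoring (molecule, score) pairs, best first.

    The requested count is clamped to the range 1..MAX_HITS.
    """
    n = max(1, min(requested, MAX_HITS))
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])
```

`heapq.nlargest` avoids sorting the full score table when only a small fraction of a large database is returned.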

 

Reviewer comment:

How was the 0.67 threshold to remove non-druglike small molecules decided on?

Author response

In establishing the QED measure of druglikeness, Bickerton and Hopkins asked chemists to assign druglike or non-druglike labels to 200 small molecules. The QED metric is based on fitting distributions of observed druglike properties (such as MW, number of hydrogen bond donors, etc.). The mean QED score of the molecules assigned as druglike by the chemists was calculated to be 0.67. Whilst subjectively druglike molecules exist with scores lower than this cut-off, it has been used by others, as evidenced in the literature (https://doi.org/10.1021/acs.jcim.1c00155). Future expansions to SimilarityLab could see inclusion of non-druglike molecules.
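The logic described above, taking the mean score of a chemist-labelled druglike set as the cut-off and then filtering a library with it, can be sketched as follows. All names and scores are invented toy values chosen so that the mean comes out at 0.67; real QED scores would have to be computed with a cheminformatics toolkit, and the inclusive comparison is an assumption.

```python
from statistics import mean

# Toy labelled panel: (name, QED score, chemist-assigned druglike label).
# These values are invented for illustration only.
panel = [
    ("panel-1", 0.80, True),
    ("panel-2", 0.60, True),
    ("panel-3", 0.61, True),
    ("panel-4", 0.30, False),
]

# Cut-off = mean score of the molecules labelled druglike.
cutoff = mean(score for _, score, druglike in panel if druglike)

# Hypothetical library entries with precomputed QED scores.
library = {"lib-1": 0.91, "lib-2": 0.67, "lib-3": 0.42}


def druglike_subset(scores, threshold):
    """Keep molecules scoring at or above the threshold (inclusive by assumption)."""
    return sorted(name for name, score in scores.items() if score >= threshold)


kept = druglike_subset(library, 0.67)
```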

 

Reviewer comment:

* Right now it seems that the only accepted input format for a user is a SMILES string. Would it make sense to also provide the option for user-specific 3D conformations (e.g. as an .sdf or .xyz file)?

Author response

Users may query molecules in the SDF, MOL and CML formats through the “Load data” button of the 2D drawing applet. However, this only serves to convert the 3D format to 2D in the applet, for final submission as SMILES. This is a design choice: all of our processed databases are represented with conformers generated using the method outlined by Ebejer et al. (https://doi.org/10.1021/ci2004658), which targets bioactive conformations. To have the maximum likelihood of correctly pairing molecules, we would like to keep all conformers used by the system as the low-energy conformers defined by that method. We feel that offering custom query conformation functionality to all users could impact the quality of results when misused, and we suggest that people seek cheminformatics advice on other approaches to finding molecules similar to their specific small molecule conformations if required. We have expanded upon this point in the ‘About’ section of the website.

 

Reviewer comment:

* In the target prediction part of the software, individual links to the found ChEMBL ids, as well as the targets found would be useful for the end user.

Author response

We thank the reviewer for their improvement suggestion. This has now been implemented, tested, and deployed on the SimilarityLab website, with all targets and actives clickable as links into ChEMBL.
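Links of this kind can be generated directly from ChEMBL identifiers. The sketch below assumes the report card URL patterns used by the public ChEMBL web interface at the time of writing; the link format actually used by SimilarityLab is not stated in this record, and `chembl_link` is a hypothetical helper.

```python
import re

# ChEMBL identifiers have the form "CHEMBL" followed by digits.
CHEMBL_ID = re.compile(r"CHEMBL\d+")
BASE = "https://www.ebi.ac.uk/chembl"  # assumed base URL of the ChEMBL interface


def chembl_link(chembl_id: str, kind: str) -> str:
    """Build a report card URL for a compound or target ID (assumed URL pattern)."""
    if kind not in ("compound", "target"):
        raise ValueError(f"unknown record kind: {kind!r}")
    if not CHEMBL_ID.fullmatch(chembl_id):
        raise ValueError(f"not a ChEMBL identifier: {chembl_id!r}")
    return f"{BASE}/{kind}_report_card/{chembl_id}/"
```

For example, `chembl_link("CHEMBL25", "compound")` yields a link to the compound report card for that (arbitrarily chosen) identifier.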

 

Reviewer comment:

* Regarding the GitHub repository. It appears that most of the provided backend is provided via Python, but no requirements file is provided for future reproducibility/installation. At least a virtual or conda environment with version specifications for the packages used should be provided. At best, some form of CI/CD practice could also be applied.

Author response

We thank the reviewer for bringing this omission to our attention, and we have now addressed it. We have added instructions to the GitHub repository for recreating the required conda environment, and all packages are listed with version numbers in a standard requirements.txt, which can be read by conda (using the supplied instructions) to create a compatible environment.
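One simple reproducibility check on such a file is to confirm that every entry carries an exact version. The sketch below is not part of the SimilarityLab repository; it assumes entries of the form `name==version` or conda-style `name=version[=build]`, and the package lines in the usage example are placeholders.

```python
import re

# Matches "name==version" or conda-style "name=version[=build]".
PINNED = re.compile(r"^[A-Za-z0-9_.\-]+={1,2}[A-Za-z0-9_.*+!\-]+(=[A-Za-z0-9_.\-]+)?$")


def unpinned_entries(lines):
    """Return requirement lines that lack an exact version pin.

    Blank lines and comment lines (starting with '#') are ignored.
    """
    entries = (line.strip() for line in lines)
    return [e for e in entries if e and not e.startswith("#") and not PINNED.match(e)]


# Placeholder file contents for illustration.
example = ["# core packages", "numpy==1.21.2", "flask", "pandas>=1.0", ""]
problems = unpinned_entries(example)
```

A check like this could run in continuous integration so that an unpinned dependency is caught before it affects a future installation.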

 

Reviewer comment:

L27. Additional “.” inserted.

Author response:

Many thanks for spotting this error, which we have now addressed. Our newly uploaded version of the manuscript contains the modifications requested by both reviewers, as well as a new Figure 3 with a selection box visible for requesting the number of similar molecules to return.

Round 2

Reviewer 2 Report

The authors have successfully addressed all of my comments in this revision. 
