Open Access Feature Paper Article

Peer-Review Record

Predicting the Potency of Anti-Alzheimer’s Drug Combinations Using Machine Learning

Processes 2021, 9(2), 264; https://doi.org/10.3390/pr9020264

by Thomas J. Anastasio

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 19 November 2020 / Revised: 13 January 2021 / Accepted: 21 January 2021 / Published: 29 January 2021

(This article belongs to the Special Issue Machine Learning Methods for Modelling Neurological Diseases)

Round 1

Reviewer 1 Report

The article appears to present a novel and really important finding regarding Alzheimer's disease.

The Results are a bit confusing to me; less is more. Please try to make the graphs a bit more comprehensible.

The methods could be described in a bit more detail and better structured, as I got lost several times trying to understand them.

Author Response

The article appears to present a novel and really important finding regarding Alzheimer's disease.

The author appreciates the reviewer's positive and constructive comments.

The Results are a bit confusing to me; less is more. Please try to make the graphs a bit more comprehensible.

The Results section now has less text than any other section of the revised manuscript. Streamlining and rewriting of the Methods section should make the Results more easily understood. The captions to Figures 2 through 6 have been augmented for greater comprehensibility.

The methods could be described in a bit more detail and better structured, as I got lost several times trying to understand them.

The Methods section has been extensively rewritten for greater clarity. ANNs are more thoroughly explained in lines 91-104, 108-115, and 122-123. The power and versatility of an ANN having an internal layer of recurrently connected LSTMs is explained more thoroughly in lines 125-142. The description of the age-advancing sequence has been expanded for greater clarity on lines 178-190. The treatment of missing input data is now described more succinctly on lines 197-198. The outcome of the Genetic Algorithm optimization of the ANN configuration is made clearer on lines 223-231. Perhaps most relevant to this reviewer's concerns, the description of the computation of the predicted potency of drug combinations, which underlies all of the main results of this study, has been completely rewritten for greater clarity (lines 274-288 and 301-304).

Reviewer 2 Report

In this manuscript the author presents an original approach to simulating the potential influence of various drug combinations on the cognitive status of patients with Alzheimer's disease. The proposed approach is based entirely on data analytics (artificial neural network models trained on two publicly available datasets) and on simulation of different drug combinations on age-sequenced, interpolated data. This is a really interesting and original approach, although the informative value of its results is probably limited. It is, however, quite well explained and discussed in the manuscript.

The major problem is the lack of a state-of-the-art and/or related-work section. It is therefore not clear what has already been done by other researchers (including one similar work by the author already published elsewhere) and what is newly discovered in this manuscript.

Minor issues:

The abbreviation NSAID is first used in the abstract but explained only later in the text.
Line 37: “… to train an AI using ML …” – the formulation is not suited to a scientific journal. One can train a model using some ML algorithm (ML being a part of AI), but not the AI.
Lines 61-62: “The ML technique used to train ANNs is supervised learning” is not precise. Supervised learning is a general term covering a broad spectrum of ML algorithms, not only those for training ANNs. Moreover, there are also unsupervised ANN architectures with their respective learning algorithms, e.g. SOMs (self-organizing maps).
Page 3: The description of the ANN architecture used is misleading. It starts by presenting the architecture as a stand-alone feedforward neural network, but then shifts to the LSTM network, which is a different type of architecture.
Lines 126-127: “… input/desired-output pairs were arranged into age-advancing sequences for each participant”: what does “age-advancing sequence” mean precisely?
Lines 135-137: “Because the majority of the input data is binary the mean, rather than the median, is a better measure of expected value.” Really? I think that it is exactly the opposite. What would, e.g., 0.57 mean as the mean of a binary variable?
The last paragraph in subsection 2.1 shows that you also normalized the scoring variables. I do not think this is a proper way of preprocessing. For categorical variables like the given example, it would be more appropriate to use a series of binary attributes, i.e. to binarize them.

Author Response

The author appreciates the reviewer's thoughtful and constructive comments. The manuscript has been rewritten thoroughly on the basis of this review.

The entire Introduction (lines 27-89) has been rewritten to include a survey of the state-of-the-art in machine-learning approaches applied to Alzheimer disease (AD). Six new references were added. The survey reveals that machine learning (ML) has not yet been used to predict repurposed drug combinations from AD database data. The initial study described in this manuscript is the first of its kind.

Minor issues:

The abbreviation NSAID is first used in the abstract but explained only later in the text.

The term "NSAID" has been spelled out in the Abstract (line 18).

Line 37: “… to train an AI using ML …” – the formulation is not suited to a scientific journal. One can train a model using some ML algorithm (ML being a part of AI), but not the AI.

That statement has been rewritten as per the reviewer's comment (lines 64-65).

Lines 61-62: “The ML technique used to train ANNs is supervised learning” is not precise. Supervised learning is a general term covering a broad spectrum of ML algorithms, not only those for training ANNs. Moreover, there are also unsupervised ANN architectures with their respective learning algorithms, e.g. SOMs (self-organizing maps).

That statement has been made precise by making it specific to the ML application described in this manuscript (lines 101-103).

Page 3: The description of the ANN architecture used is misleading. It starts by presenting the architecture as a stand-alone feedforward neural network, but then shifts to the LSTM network, which is a different type of architecture.

The Methods section has been expanded to provide more background on ANN architectures and to clarify the architecture of the ANN used in this study (lines 91-101, 107-115, and 122-123). Also, a more thorough description of LSTMs and networks thereof is now provided (lines 125-142).

Lines 126-127: “… input/desired-output pairs were arranged into age-advancing sequences for each participant”: what does “age-advancing sequence” mean precisely?

The description of the age-advancing sequence has been expanded for greater clarity on lines 178-190.

Lines 135-137: “Because the majority of the input data is binary the mean, rather than the median, is a better measure of expected value.” Really? I think that it is exactly the opposite. What would, e.g., 0.57 mean as the mean of a binary variable?

Technically, a binary random variable has neither a mean nor a median. It has a mode, which is the most frequently observed value, and the average value is the proportion of 1's and also the probability of observing a 1. Whether binary or not, replacing missing input values with the average of the available values is consistent with the maximum-entropy approach, which is standard in ML. This is now stated on lines 197-198 and a new reference has been added.
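The point about the average of a binary input can be illustrated with a minimal sketch. It is not the manuscript's code: the function name `impute_with_mean`, the use of `None` to mark a missing entry, and the example values are all assumptions made for illustration. It shows only that the average of the available 0/1 values equals the observed proportion of 1's, and that this value is what fills the gaps.

```python
def impute_with_mean(column):
    """Replace missing entries (None) with the mean of the observed entries.

    Hypothetical helper for illustration; not taken from the manuscript.
    """
    observed = [x for x in column if x is not None]
    fill_value = sum(observed) / len(observed)
    filled = [fill_value if x is None else x for x in column]
    return filled, fill_value


# Invented binary input (e.g. 1 = drug taken, 0 = not taken) with two missing entries.
raw = [1, 0, 1, 1, None, 0, 1, None]
filled, p = impute_with_mean(raw)

# For a 0/1 variable the mean of the observed values is the proportion of 1's.
proportion_of_ones = raw.count(1) / (raw.count(0) + raw.count(1))
print(p, proportion_of_ones)
print(filled)
```

Under these assumptions a filled-in value such as 0.67 is not a possible observation; it reads as the estimated probability of observing a 1 for that input.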

The last paragraph in subsection 2.1 shows that you also normalized the scoring variables. I do not think this is a proper way of preprocessing. For categorical variables like the given example, it would be more appropriate to use a series of binary attributes, i.e. to binarize them.

For this application I needed to combine the cognitive scores as predicted by the ANN into a composite cognitive score so that I could rank order the drug combinations according to their predicted potencies (the higher the composite cognitive score associated with a drug combination the higher its predicted potency). The most straightforward way to do that was to have each output unit represent one of the normalized cognitive test scores, and then average over the output unit activations to obtain the combined cognitive score. Binarizing the desired-outputs, and then somehow reconverting and combining the reconverted actual outputs to obtain a composite cognitive score, would have entailed many additional assumptions that could have affected the correlation between the results derived separately from the two datasets in unknowable ways. For that reason, the more straightforward, simpler method was preferable for this application. The benefit of the more straightforward method should now be more apparent from the improved description of the computation of the combined cognitive score (lines 274-288 and 301-304).
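The averaging and rank-ordering step described in this response can be sketched as follows. This is an illustration under stated assumptions rather than the manuscript's implementation: the names `composite_score` and `rank_combinations`, the drug labels, and the activation values are invented, and each output unit is assumed to hold one normalized cognitive test score on a common scale where higher means better cognition.

```python
def composite_score(output_activations):
    """Average the output-unit activations into one composite cognitive score."""
    return sum(output_activations) / len(output_activations)


def rank_combinations(predictions):
    """Order drug combinations from highest to lowest composite score."""
    scored = {combo: composite_score(outs) for combo, outs in predictions.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Invented activations: one list of normalized test scores per drug combination.
predictions = {
    ("drug_A",): [0.50, 0.60, 0.40],
    ("drug_A", "drug_B"): [0.70, 0.80, 0.60],
    ("drug_B", "drug_C"): [0.30, 0.50, 0.40],
}

ranking = rank_combinations(predictions)
for combo, score in ranking:
    print(" + ".join(combo), round(score, 3))
```

Because every output is already on the same normalized scale, the composite is a plain average and needs no reconversion step, which is the simplicity the response argues for.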
