
Prediction of the Neurotoxic Potential of Chemicals Based on Modelling of Molecular Initiating Events Upstream of the Adverse Outcome Pathways of (Developmental) Neurotoxicity

Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156 Milan, Italy
School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2022, 23(6), 3053
Received: 8 February 2022 / Revised: 7 March 2022 / Accepted: 8 March 2022 / Published: 11 March 2022
(This article belongs to the Special Issue Molecular Mechanisms of Specific Target Organ Toxicity)


Developmental and adult/ageing neurotoxicity is an area needing alternative methods for chemical risk assessment. The formulation of a strategy to screen large numbers of chemicals is highly relevant due to potential exposure to compounds that may have long-term adverse health consequences on the nervous system, leading to neurodegeneration. Adverse Outcome Pathways (AOPs) provide information on relevant molecular initiating events (MIEs) and key events (KEs) that could inform the development of computational alternatives for these complex effects. We propose a screening method integrating multiple Quantitative Structure–Activity Relationship (QSAR) models. The MIEs of existing AOP networks of developmental and adult/ageing neurotoxicity were modelled to predict neurotoxicity. Random Forests were used to model each MIE. Predictions returned by single models were integrated and evaluated for their capability to predict neurotoxicity. Specifically, MIE predictions were used within various types of classifiers and compared with other reference standards (chemical descriptors and structural fingerprints) to benchmark their predictive capability. Overall, classifiers based on MIE predictions returned predictive performances comparable to those based on chemical descriptors and structural fingerprints. The integrated computational approach described here will be beneficial for large-scale screening and prioritisation of chemicals as a function of their potential to cause long-term neurotoxic effects.

1. Introduction

The human brain is exceptionally sensitive to injury, and several neurodevelopmental processes have been shown to be highly vulnerable to external factors [1,2]. Such processes include neural progenitor cell proliferation, apoptosis, cell migration, neuronal and glial differentiation, neurite outgrowth and branching, myelination, synaptogenesis and neuronal network formation, the ontogeny of neurotransmitters and receptors, the development of the blood–brain barrier, and the developmental changes in the adolescent brain [3,4,5]. Disruption of any of these processes may lead to potentially adverse alterations in neuroanatomy, neurophysiology, and neurochemistry. It has been estimated that developmental neurotoxicity (DNT) disorders affect 10–15% of all births [6], and the prevalence of autism and attention-deficit hyperactivity disorders is increasing worldwide [1]. In addition, DNT disorders documented in children and adolescents could be a precursor of the development of neurodegenerative diseases (NDs) later in life [7]. NDs (e.g., Alzheimer’s and Parkinson’s) are widely investigated pathologies due to the low efficacy of current therapies [8,9], the severe functional impairments they impose on daily life activities, and the resulting high familial, social, and financial costs of patient care [10].
Overall, genetic factors seem to account for about 30–40% of all cases of DNT disorders [11]. Evidence has been reported that exposure to chemical stressors, e.g., industrial chemicals in the environment, is a key determinant in the occurrence of neurological disorders [11,12]. Thousands of chemicals have been reported to have adverse effects on neurodevelopment or to be toxic to the nervous system in adults [11,12]. However, the total number of known neurotoxic substances is likely to be an underestimation of the true number released into the environment [11]. There is, therefore, a need to develop strategies able to screen the large number of chemicals to which the population is exposed daily and that may have long-term adverse health consequences for the brain. Guideline-based DNT studies involve the use of a large number of animals for an extended period of time, making this kind of study resource-intensive and unsuitable for large-scale screening [13,14].
Computational toxicology has been shown to be a cost- and time-efficient alternative to traditional toxicity testing methods [15]. Quantitative Structure–Activity Relationship (QSAR) models are computational methods that have been primarily applied for their capability to identify the toxicity of chemicals as a function of their structural attributes [16]. The increased availability of data obtained from in vitro bioactivity testing has made the development of QSARs easier. Moreover, QSARs are rapid and require relatively few resources, which has been a key factor in their increased use for filling toxicological data gaps for chemicals with high production volumes.
Toxicology has recently undergone a paradigm shift towards the use of alternative testing methods based on knowledge of the biological modes of action and pathways that are responsible for adverse effects, defined as adverse outcome pathways (AOPs). An AOP is a logical construct that connects an upstream molecular initiating event (MIE) (e.g., the interaction of a chemical with a molecular target) to a downstream adverse outcome (AO), progressing through a series of key events (KEs) [17,18]. According to this concept, compounds of unknown hazards can be assigned to various levels of concern based on the number of activated MIEs and the extent of their activation. Moreover, chemicals activating similar MIEs/KEs with respect to known toxic chemicals will be more likely to be toxic themselves [19]. Several authors have recently highlighted a possible synergism between the AOP concept and QSAR modelling in toxicology [19,20,21,22,23,24,25,26]. Thus, it is possible to utilise QSAR models to predict the potential of chemicals to modulate MIEs and to prioritise chemicals as a function of their toxicological profile.
In this manuscript, we present an integrated computational system to predict neurotoxic potential that relies on the identification of the MIEs activated by chemicals. MIEs upstream of neurotoxicity were identified from recently published AOP networks, then QSAR models were developed for the prediction of each MIE. Predictions from QSARs for individual MIEs were evaluated for their capability to discriminate neurotoxic and non-neurotoxic compounds as part of an integrated computational prediction system. MIEs were used as independent variables in various machine learning approaches and compared for their predictive power with other widely used methods for chemical description, i.e., fingerprints and molecular descriptors. The predictions returned by the QSARs presented here may represent an effective first tier of an Integrated Approach to Testing and Assessment (IATA) to rapidly screen many chemicals, providing information regarding potential MIEs and associated mechanisms of toxicity, thus helping to prioritise chemicals for additional and better-targeted screening/testing, e.g., in vitro testing [27,28].

2. Results

Performance statistics from external validation for the QSARs for MIEs are reported in Table 1. A complete description of the MIEs listed in Table 1 and of their abbreviations is reported in Section 4.1.
The statistics of the balanced random forest (BRF) models for MIEs were consistently high (Table 1) and confirmed the high quality of the information included in the ChEMBL database and its suitability as a source of data for modelling the interactions of ligands with molecular targets upstream of biological pathways (e.g., nuclear receptors and enzymes) [29,30]. One possible reason for this high performance is the high structural homogeneity of the active samples. This is not surprising, as many records included in ChEMBL are congeneric sets of candidate drugs, while negative samples in the MIE datasets are more structurally heterogeneous. This pattern is expected, as ligands for enzymes and receptors are often required to possess specific pharmacophoric features to interact with binding sites, leading to preferred structural moieties being shared among the different ligands.
External validation returned balanced accuracy (BA) values in the range of 0.84–0.99. The statistics were balanced between sensitivity (SEN) and specificity (SPE), although SEN was higher in most cases, i.e., performance on the inactive class was typically slightly lower. As for Matthews Correlation Coefficient (MCC) values, several of the models showed values below the average, such as the BRFs for CYP2E1, KAR, and TTR. MCC has been proposed for the evaluation of classification between two very unbalanced categories; however, it was observed that this parameter can sometimes be biased by highly unbalanced datasets (i.e., fewer than 20% of chemicals included in the smallest category) [24]. Indeed, the datasets mentioned above are among those with a lower number of actives.
Table 2 shows statistics for the neurotoxicity-predicting models. The performance of each classifier was calculated as the average of 100 iterations of five-fold cross validation. Figure 1 (Figure 1a: k-nearest neighbours (k-NN); Figure 1b: random forest (RF); Figure 1c: neural network (NNET)) shows the distribution of the BAs of models based on MIE predictions compared with DRAGON descriptors and extended fingerprints. In the case of k-NN classifiers, the MIE predictions show higher performance (BA avg = 0.72) with respect to chemical descriptors (BA avg = 0.70), and fingerprints do not perform as well. MIEs and descriptors have a similar peak in the distribution of BAs between 0.70 and 0.75. RFs are the top-performing classifiers overall with respect to k-NN and NNET, always reaching average BAs higher than 0.65 and having maximum values closer to 0.90. In this case, DRAGON descriptors were the top-performing variables (BA avg = 0.83), followed by predictions of QSARs based on fingerprints (BA avg = 0.74) and MIEs (BA avg = 0.73). In the case of NNET, MIE predictions and DRAGON descriptors (BA avg = 0.74) were characterised by an almost equal distribution profile, while fingerprints had lower performance (BA avg = 0.67).
The relative importance of MIEs for neurotoxicity prediction was evaluated using the methods described in Section 4.6. Table S1 in the Supplementary Materials reports the impact of the removal of specific MIEs on the performance (BAs) of models for neurotoxicity, while Table S2 sorts the various MIEs included in the RF models for neurotoxicity by their variable importance. Both BAs and variable importance are an average of the values calculated over the various modelling iterations; see Section 4.5.
Thyroid elements (THRs, TPO and TTR) seem to be relevant to neurotoxicity; in particular, the exclusion of TTR always leads to a reduction in BA average. TTR is the fourth MIE in terms of variable importance. THRs, TPO and TTR are involved in the biosynthesis, metabolism, and transportation of the thyroid hormone, respectively. Among the two isoforms of thyroid receptors, THRβ is consistently the most important (first descriptor for variable importance), while the removal of THRα did not negatively affect performance. Among ionotropic glutamate receptors, AMPAR and KAR seem to be more linked to neurotoxicity than NMDAR. AMPA/kainate receptor-mediated neurotoxicity was reported to possibly play a role in neuronal degeneration in amyotrophic lateral sclerosis [31] and in the injury of basal forebrain cholinergic neurons in diseases such as Alzheimer’s [32]. VGSC and, to a lesser extent, GABAR are consistently relevant within models developed for neurotoxicity (Table S1). The relevance of VGSC and GABAR is confirmed in the RF variable importance analyses (Table S2). The role of SMARTS for protein adduct formation is unclear: the descriptor is the least relevant from the variable importance analyses, and on the whole, its exclusion does not affect the average performance of RFs. On the other hand, its removal has a detrimental effect on the performance of both the k-NN and NNET models.

3. Discussion

In the present work, a new integrated computational system was proposed for predicting the neurotoxic potential of chemicals as a function of their capability to trigger MIEs (i.e., interaction and modulation of relevant receptors and enzymes) upstream of neurotoxicity. QSARs were developed to predict MIE induction, while BRFs were applied to handle the unbalanced training data. In the last part of the manuscript, MIE predictions were used to classify neurotoxic and non-neurotoxic compounds and compared for their predictivity with other approaches to describe the structure of chemicals, namely, chemical descriptors and extended fingerprints.
Overall, in two out of three cases MIEs perform comparably with chemical descriptors and better than fingerprints, which are considered the gold standard for describing chemical structures within QSARs [33]. The only exception is given by neurotoxicity models based on RF, where DRAGON descriptors (BA avg = 0.83) performed better than their counterparts based on MIEs (BA avg = 0.73). This is likely to be due to the fact that RFs perform better if trained on a larger pool of variables, such as the pool of descriptors provided by DRAGON (i.e., several thousand) [34,35].
Despite this, one of the key advantages afforded by the use of MIE responses in place of the classical structural representation of molecules is the interpretability of the predictions. Indeed, a complete profile of the neuronal receptors and enzymes that are activated is given together with the overall neurotoxicity outcome, providing insights into the possible mode of action of a predicted neurotoxic chemical. This aspect was further verified by predicting a series of known neurotoxicants with the models predicting MIEs. The mode of action for these neurotoxicants was reported in a review by Masjosthusmann and coworkers [36], who gathered information from the literature about the targets upstream of the neurotoxicological pathway of these chemicals [11,12,37]. Interestingly, QSARs for the prediction of MIEs were able to identify the correct mode of action of several neurotoxicants. For example, dichlofenthion (97-17-6), parathion (56-38-2), paraoxon (311-45-5), diazinon (333-41-5), physostigmine (57-47-6), ibogaine (83-74-9), and dichlorvos (62-73-7) were correctly predicted to stimulate cholinergic neurotransmission through AChE inhibition, while 3-nitropropionic acid (504-88-1), glyphosate (1071-83-6), and argiopine (105029-41-2) were predicted to interact with NMDA receptors (NMDAR). Indeed, the two former chemicals are reported to stimulate glutamatergic neurotransmission and cause excitotoxicity after activation of NMDAR, leading to oxidative stress and cell death, while the latter was reported to inhibit glutamatergic neurotransmission after blockage of the post-synaptic receptors. Rotenone (83-79-4) and dieldrin (60-57-1) were correctly predicted to inhibit complex I (NADH dehydrogenase) and to cause reactive oxygen species (ROS)-induced degeneration of dopaminergic neurons and locomotor deficit.
In addition to increased biological relevance, MIE predictions simplify models to a reduced number of variables (i.e., fewer than 20), while in the case of descriptors and fingerprints several hundred variables may be included in the models.
The RFs developed here for the prediction of single MIEs returned satisfactory predictive performance and were confirmed to be a valuable method in the field of computational toxicology. The statistical performance of the models presented here confirmed our previous findings that using RFs with internal balancing of categories (BRF) represents one of the best methods for handling the unbalanced distribution typical of biological data [24,38].
In the present manuscript, the use of biological information (i.e., MIEs) instead of the classical structural description of molecules was proposed for the development of QSARs. The use of biological data (e.g., biological assays) as input variables to develop QSARs has been increasingly explored in the recent literature [39]. This strategy is justified by the fact that QSARs historically had difficulty predicting complex systemic endpoints encompassing several mechanisms, which are difficult to model together. In the case of neurotoxicity, the brain is an extremely complex organ comprising a variety of highly specialised neuronal cell types that differ in function, expression of brain regions, and stages of development [40]. These different cells are all potential targets that can be disrupted by neurotoxicants with different possible mechanisms of toxicity [4]. Another limitation of QSARs is that they rely on the principle that analogies in chemical structure always result in analogies in toxicity. However, the existence of activity cliffs, i.e., compounds with high structural similarity together with unexpectedly high activity differences, was reported for high-tier endpoints characterised by multiple mechanisms of toxicity [41]. On the other hand, the use of information from AOPs and biological assays allows for the fragmentation of complex endpoints into simpler ones based on mechanistic knowledge. These “sub-endpoints” are easier to address with a single computational model, as they describe the interaction of a chemical with a single molecular target that triggers a specific response. Overall, this strategy reduces the difficulty of capturing the complex relationships existing between the structure of a chemical and its high-level systemic toxicity [4].
The development of new machine learning and artificial intelligence-based approaches is highly desirable, as it allows for the detection of chemicals with potential neurotoxicity and DNT effects in a more time- and resource-efficient way compared to traditional in vivo testing. In addition, data from in silico screenings based on AOPs can provide a scientifically sound rationale to make decisions relating to assessment of the safety of chemicals. The mechanistic nature of AOPs provides knowledge to guide the design of new IATAs to meet regulatory needs [28,42]. In particular, these in silico predictions can be used to provide information regarding the potential MIEs of chemicals, to help prioritise or deprioritise certain chemicals for further testing, and to provide indications for better-targeted follow-up in vitro evaluations [43]. In the specific case of neurotoxicity, a wide range of in vitro tests has been proposed, each evaluating a different MIE/KE of the complex network upstream of the adverse outcome [27]. Predictions of MIE provided by QSARs may give indications of which assays to prioritise among the wide battery of tests available. In this regard, in silico models represent an ideal first tier of a multi-step IATA for the prediction of the neurotoxicity of chemicals which involves multiple alternative testing methods.

4. Materials and Methods

4.1. Data Selection for Molecular Initiating Events (MIEs)

MIEs linked to neurotoxicity were identified from the AOP networks [44] published by Spînu et al. [45] and Li et al. [27]. The MIEs selected along with their associated molecular targets (i.e., receptors, enzymes) are reported in Table 3. In certain cases, multiple molecular targets are associated with a single MIE (e.g., three different glutamate receptors were considered for MIE A), while a single target may be repeated in multiple MIEs (e.g., NADHOX was common to MIEs C and N). The molecular targets involved in the MIEs and their relevance in neurotoxicity are briefly described below.
  • Glutamate ionotropic receptors, i.e., N-methyl-D-aspartate (NMDAR), alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionate (AMPAR) and kainate (KAR) receptors, are responsible for excitatory synaptic transmission and synaptic plasticity, which are fundamental for learning and memory [46]. Sustained over-activation of these receptors (MIE A) can induce excitotoxicity due to increased Ca2+ influx, with consequent cell death, memory problems, and convulsions [4]. Analogously, the chronic blockage of NMDAR by chemicals during synaptogenesis (MIE B) disrupts neuronal network formation, resulting in the impairment of learning and memory processes [47] and increasing the risk of developing Alzheimer’s-type NDs in later life [2].
  • Protein adduct formation is the covalent interaction between an electrophilic chemical and the nucleophilic part of a protein, and may lead to damage of the protein and the potential loss of its function. This may affect thiol- and seleno-containing proteins, which offer antioxidant protection [48]. The binding of xenobiotics (e.g., heavy metals and mercury) to these or other proteins during brain development (MIE D and H) may lead to several functional impairments, such as in learning and memory. Cytochrome P450 2E1 (CYP2E1) is relevant to this mechanism as well, as it is one of the enzymes responsible for the metabolism of small compounds. The induction of CYP2E1 (MIE E) leads to an increase in reactive metabolites, which can form protein adducts. For example, a high concentration of ethanol leads to an increased expression of CYP2E1 and consequent increased production of the acetaldehyde metabolite, which can form protein adducts [49]. The consequences include oxidative stress, lipid peroxidation, unfolded protein responses and, ultimately, the apoptosis of neuronal cells [50].
  • The function of the Na+/I− symporter (NIS) is critical for the physiological production and maintenance of thyroid hormone levels in the serum, as it mediates the transport of iodide into thyroid cells. Its inhibition (MIE F) results in decreased thyroid hormone synthesis, with effects on neurocognitive function in children [51,52].
  • Acetylcholinesterase (AChE) is an enzyme present in both central and peripheral nervous systems and in muscular motor plaques. It is responsible for the enzymatic cleavage of the neurotransmitter acetylcholine [53]. Inhibition of AChE (MIE I), e.g., by organophosphates and carbamates, leads to an increase in levels of acetylcholine and overstimulation of both muscarinic and nicotinic receptors, resulting in multiple adverse outcomes affecting a wide variety of functions [54].
  • Ryanodine-sensitive Ca2+ channels (RyR) contribute to neurotransmission and synaptic plasticity. Polychlorinated biphenyl (PCB) exposure has been reported to alter intracellular Ca2+ levels and to interfere with normal neuronal dendritic growth and plasticity in a RyR-dependent manner (MIE L) [55].
  • Thyroid hormone receptors α and β (THRα and THRβ) mediate the effects of thyroid hormones, while thyroperoxidase (TPO) and deiodinase are involved in the biosynthesis/catabolism of thyroid hormones. Transthyretin serum binding protein (TTR), monocarboxylate transporters 8 and 10, and the solute carrier organic anion transporter family member 1C1 (OATP1C1) are involved in the transportation of thyroid hormones at various levels [56]. Interference at any of these levels (MIEs G and Q-T) may lead to decreased thyroxine (T4) and thyroid hormones in the brain, and ultimately alter neurodevelopmental processes such as neuronal proliferation, apoptosis, migration, neurite outgrowth, and neuronal network connectivity [57,58], culminating in irreversible mental retardation and motor deficits [59]. It has been reported that PCBs induce activation of xenobiotic nuclear receptors, e.g., the constitutive androstane receptor (CAR) and the pregnane X receptor (PXR), which represent MIE P, leading to thyroid hormone disruption during cochlear development and potentially resulting in permanent auditory loss [60].
  • The complexes of the respiratory chain play a pivotal role in neuronal and glial cell survival and cell death, as they regulate both energy metabolism and apoptotic/necrotic pathways. The interaction of xenobiotics with these enzymes can interfere in various ways with their normal functionality, e.g., inhibiting the production of ATP (MIE M) or interfering with the redox cycle (MIE N and O), with consequent increased production of ROS and oxidative stress. Oxidative stress contributes to a loss of function of hippocampal neural progenitor cells and a decline in learning and memory performance [4]. Moreover, the inhibition of NADH-quinone oxidoreductase (NADHOX) (MIE C) by pesticides or toxins (e.g., neurotoxin 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine, MPTP) has been reported to cause mitochondrial dysfunction and degeneration of dopaminergic neurons of the nigro-striatal area, with consequent motor deficits typical of Parkinson’s disease [61].
  • Voltage-gated sodium channels (VGSC) are the primary molecules responsible for the control of the electrophysiological potentials of electrically excitable cells. Various isoforms exist, with isoforms 1, 2, 3, and 6 reported to be mainly expressed in the central nervous system [62]. Neurotoxic effects in mammals have been associated with the ability of some neurotoxicants (e.g., p,p’-DDT and pyrethroids) to bind to and disrupt VGSC (MIE U), with consequent behavioural effects [4,63].
  • Ionotropic GABA receptors (GABAR) are ligand-gated ion channels which play important roles in inhibitory neurotransmission [64]. Interference with GABA signalling (MIE V) during development and after brain maturation is likely to cause such varied adverse outcomes as autism, mental retardation, epilepsy, and schizophrenia [59]. Chemically-induced epileptic seizures can be caused by the binding of neurotoxicants (e.g., barbiturates, benzodiazepines, and picrotoxin) to the active sites of the GABA receptor [65].
Data relative to each of the molecular targets identified in Table 3 were extracted from the ChEMBL database [66] using a protocol adapted from Bosc et al. [30]. ChEMBL Target IDs for each molecular target linked to an MIE are listed in Table 4. When available, only data relative to Homo sapiens were considered. For each target, only bioactivities with pChEMBL values were chosen. This term refers to all the comparable measures of half-maximal responses (molar IC50, XC50, EC50, AC50, Ki, Kd, potency and ED50) on a negative logarithmic scale [67]. Different pChEMBL thresholds to classify bioactivity values were evaluated. Ultimately, selected pChEMBL data were flagged as active or inactive based on a pChEMBL threshold of 5.0 (10 µM), providing datasets with a reasonable number of active samples for modelling. pChEMBL-like activities with standard relation “>” or “≥” (i.e., not associated with a precise activity value) were included as inactive. Only activities that were not flagged as potential duplicates, with no data_validity_comment and with an activity_comment that was not ‘inconclusive’, ‘undetermined’, or ‘not determined’ were considered. Endpoints characterised by few or no active compounds (i.e., thyroid hormone transporters, monocarboxylate transporters 8 and 10 and OATP1C1, NADH-cytochrome b5 reductase, deiodinase, and thyroperoxidase) were excluded from the modelling. For the modelling of NADH-quinone oxidoreductase (NADHOX) activity, Bos taurus data were used, as human data were not available. pChEMBL data distributions were heavily skewed in the majority of cases towards positive values.
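The labelling rule described above can be sketched as follows. This is a minimal illustration rather than the protocol actually used: the function names are hypothetical, and treating a value exactly equal to the threshold as active is an assumption, since the text does not state whether the cut-off is inclusive.

```python
import math

ACTIVE_THRESHOLD = 5.0  # pChEMBL of 5.0 corresponds to 10 µM

def pchembl_from_molar(value_molar):
    """Convert a half-maximal response in mol/L to the negative log10 scale."""
    return -math.log10(value_molar)

def label_activity(pchembl, relation="="):
    """Flag a record as 'active' or 'inactive'.

    Records with relation '>' or '>=' carry no precise activity value and are
    treated as inactive. The inclusive '>=' on the threshold is an assumption.
    """
    if relation in (">", ">="):
        return "inactive"
    return "active" if pchembl >= ACTIVE_THRESHOLD else "inactive"

print(label_activity(pchembl_from_molar(1e-7)))  # 100 nM, pChEMBL 7
print(label_activity(4.2))                       # weaker than 10 µM
print(label_activity(None, relation=">"))        # censored record
```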
In order to prevent skew towards positive values, which is different from the natural distribution of biological data (i.e., few active, many inactive compounds), each dataset was further enriched with the chemicals included in the datasets of the remaining endpoints. These chemicals were treated as ‘decoys’ and assumed to be inactive. Due to the very large number of data available, AChE data were not used to enrich inactive samples of other endpoints to avoid the creation of datasets excessively unbalanced towards inactives. A semi-automated curation procedure [70] was applied to SMILES strings retrieved from ChEMBL in order to neutralise ionised chemical structures, remove counterions, and discard inorganics, organometallics, and mixtures. Removal of duplicate structures was carried out automatically at the InChI level. The entry with the maximum pChEMBL activity was selected in the case of duplicate structures in order to maximise the number of active samples.
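The duplicate handling and decoy enrichment described in this paragraph might look like the sketch below. The function names and the placeholder identifiers (which are not valid InChI strings) are hypothetical; the choice to let a measured label take precedence over a decoy assignment is an assumption.

```python
def deduplicate_max(records):
    """Keep the highest pChEMBL value for each structure.

    records: iterable of (inchi, pchembl) pairs, possibly with repeated InChI.
    """
    best = {}
    for inchi, pchembl in records:
        if inchi not in best or pchembl > best[inchi]:
            best[inchi] = pchembl
    return best

def add_decoys(target_labels, other_datasets):
    """Add chemicals from other endpoints as presumed inactives ('decoys').

    target_labels: dict mapping InChI to 'active'/'inactive' for one target.
    other_datasets: list of iterables of InChI from the remaining endpoints.
    A chemical already measured on the target keeps its measured label.
    """
    enriched = dict(target_labels)
    for dataset in other_datasets:
        for inchi in dataset:
            enriched.setdefault(inchi, "inactive")
    return enriched

# Placeholder identifiers for illustration only.
records = [("InChI=A", 4.1), ("InChI=A", 6.3), ("InChI=B", 4.8)]
unique = deduplicate_max(records)
labels = {k: ("active" if v >= 5.0 else "inactive") for k, v in unique.items()}
enriched = add_decoys(labels, [["InChI=B", "InChI=C"], ["InChI=A", "InChI=D"]])
print(enriched)
```

Selecting the maximum value means that a structure with one measurement above the threshold is retained as active even if other measurements fall below it, which is consistent with the stated aim of maximising the number of active samples.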
Table 4 reports the final distribution of active and inactive chemicals for each dataset. The training sets for each of the modelled MIEs are available in the Supplementary Materials (Table S3).

4.2. QSARs for Molecular Initiating Events

ChEMBL datasets from Table 4 were used to develop 15 QSARs for molecular targets involved in the MIEs. Extended fingerprints (Daylight Chemical Information Systems, Inc., 2019) were calculated for each compound with a KNIME implementation [71] of the CDK toolkit (accessed on 7 March 2022) and used as input for QSAR modelling. The BRF [72] implemented in KNIME was used for QSAR development. This technique artificially alters the class distribution in each tree. A sampling without repetition was made to select compounds, allowing all of the active compounds to always be selected together with an equal number of randomly selected inactive compounds from the training set in order to assure balancing between categories [73]. The number of trees in each BRF was set to 100.
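The per-tree balancing step can be sketched in a few lines. This is not the KNIME implementation; it is a minimal illustration of the sampling scheme described above, with a hypothetical function name and an invented toy dataset, and it assumes that actives are the minority class.

```python
import random

def balanced_sample(labels, rng):
    """Return training indices for one tree: every active compound plus an
    equal number of inactives drawn without replacement.

    labels: list of 0/1 values (1 = active). Assumes actives are the minority.
    """
    actives = [i for i, y in enumerate(labels) if y == 1]
    inactives = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(actives), len(inactives))
    return actives + rng.sample(inactives, n)

rng = random.Random(0)
labels = [1] * 5 + [0] * 95          # toy training set: 5 actives, 95 inactives
tree_samples = [balanced_sample(labels, rng) for _ in range(100)]  # 100 trees
print(len(tree_samples), len(tree_samples[0]))
```

Each tree therefore sees all actives but a different random subset of inactives, so across the 100 trees most inactive compounds still contribute to the forest.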
Models were validated by splitting each data set into a training (80%) and a test set (20%) by applying a stratification sampling to the activity classes. The splitting procedure was repeated 100 times using different random splits, ensuring that each chemical in the datasets was included in the test set the same number of times in order to avoid bias due to the molecules present in the different sets. The performance using the test set was calculated for each iteration, then the final performance of each model was calculated by averaging the statistical parameters obtained using the test sets relative to each of the 100 iterations.
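One possible way to obtain 100 stratified 80/20 splits in which every chemical appears in the test set the same number of times is to repeat a stratified five-fold partition 20 times. The sketch below illustrates this idea; it is an assumption about how the constraint could be met, not a description of the procedure actually implemented, and all names and the toy labels are hypothetical.

```python
import random
from collections import Counter

def stratified_folds(labels, n_folds, rng):
    """Deal the shuffled indices of each class into n_folds disjoint folds."""
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds

def repeated_splits(labels, n_repeats=20, n_folds=5, seed=0):
    """Yield (train, test) index lists; each fold serves once as a 20% test set."""
    rng = random.Random(seed)
    all_idx = set(range(len(labels)))
    for _ in range(n_repeats):
        for test in stratified_folds(labels, n_folds, rng):
            yield sorted(all_idx - set(test)), sorted(test)

labels = [1] * 10 + [0] * 40         # toy dataset: 10 actives, 40 inactives
splits = list(repeated_splits(labels))
test_counts = Counter(i for _, test in splits for i in test)
print(len(splits), set(test_counts.values()))
```

The final statistics would then be the mean of the per-split test-set values over the 100 splits.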

4.3. Thyroperoxidase (TPO) Modelling

As no data were found for TPO inhibition from ChEMBL, the QSAR model for predicting TPO inhibition proposed by Gadaleta et al. [68] was used. The model was developed from data related to the Amplex UltraRed-thyroperoxidase (AUR-TPO) assay. For positive hit-calls, only highly selective inhibitors were used for the development of the model. These data were characterised by a demarcated separation of the AUR-TPO assay log IC20 value from confounding activities reported by a luciferase inhibition assay (flagging for non-specific enzyme inhibition) and a cytotoxicity assay. The QSAR was based on a BRF developed with the imbalanced-learn and scikit-learn Python libraries [74] and based on 160 DRAGON descriptors [75] with a training set of 723 chemicals. Additional details on the predictive performance of the model can be found in [68].

4.4. Reactivity SMARTS

MIEs D (Binding to SH/SeH proteins involved in protection against oxidative stress) and H (Protein Adduct Formation) do not refer to an interaction with a specific receptor/enzyme; rather, they describe non-specific covalent binding to biological macromolecules (i.e., proteins). Because this type of binding depends on the intrinsic reactivity of molecules, SMARTS compiled by Enoch et al. [69] describing electrophilic protein binding reactions (71 SMARTS) were used to account for the two MIEs. Chemicals matching at least one of the SMARTS were flagged as positive (1); otherwise, they were flagged as negative (0).

4.5. Neurotoxicity Data

Predictions from the single QSARs for MIEs of neurotoxicity were evaluated for their capability to predict the neurotoxic potential of chemicals. Neurotoxicity data were retrieved from Kosnik et al. [76], who listed data for a total of 73 compounds (41 neuroactive and 32 non-neuroactive). This is a sub-selection of a list of the EPA’s ToxCast chemicals, previously tested by Strickland et al. [77] for their effects on neural network function in vitro, as measured on primary cortical cultures grown on microelectrode arrays, and then subsequently retested to confirm the measured activities. SMILES were retrieved from the chemical name and CAS number using the semi-automated data retrieval and curation procedure from Gadaleta et al. [70]. Four compounds (three neuroactive and one non-neuroactive) were removed because they were mixtures, inorganics, and/or organometallics, leading to a final dataset of 69 chemicals.
The final list of 69 compounds along with their neurotoxic classification is reported in the Supplementary Materials (Table S4).

4.6. Neurotoxicity Modelling

The MIEs for the 69 chemicals with neuroactivity data were predicted with the 15 BRFs developed from the entire datasets in Table 4 and with the BRF model for AUR-TPO from Gadaleta et al. [68]; the chemicals were also profiled with the SMARTS for electrophilic activity compiled by Enoch et al. [69].
The predictions for the 69 chemicals from Kosnik et al. [76] were reported as the probabilities associated with each prediction, and are shown in Table S4 of the Supplementary Materials. Probabilities ranged from 0 to 1, and in the case of BRF correspond to the fraction of trees within the BRF returning a ‘positive’ prediction. As a consequence, probabilities higher than 0.50 indicate positive predictions, while probabilities lower than 0.50 indicate negative predictions. Predictions equal to 0.50 were considered “not classified”.
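The 0.50 decision rule can be written as a short Python sketch. The function names are hypothetical, and the tree votes are assumed to be encoded as 1 (positive) and 0 (negative).

```python
def vote_fraction(tree_votes):
    """Fraction of trees in a forest voting 'positive' (encoded as 1)."""
    return sum(tree_votes) / len(tree_votes)

def classify(probability):
    """Map a predicted probability to a class using the 0.50 rule."""
    if probability > 0.5:
        return "positive"
    if probability < 0.5:
        return "negative"
    return "not classified"
```

For a forest of 100 trees, a 50/50 vote is the only outcome that leaves a chemical unclassified.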
The predictions generated by the MIE models for the 69 chemicals were used as independent variables to develop new integrated QSAR models for predicting the neurotoxicity of chemicals.
Three different classifiers able to naturally handle a high number of independent variables were trained based on the neurotoxicity data. Five different settings were applied for the various classifiers, as implemented in KNIME [71].
  • K-Nearest Neighbours (k-NN) [78]: Euclidean distance was used to calculate the similarity between the target and the neighbours. K was varied from 1 to 9, with a step of 2.
  • Random Forest (RF) [79]: the number of trees was varied from 50 to 250, with a step of 50.
  • Multi-layer Perceptron–Artificial Neural Networks (NNET) [80,81]: one hidden layer was used, with the number of hidden neurons varied from 2 to 12 with a step of 2.
In order to verify the capability of MIE predictions to discriminate between neuroactive and non-neuroactive compounds, QSARs based on MIEs were compared with other models developed with the same algorithms (k-NN, RF and NNET) and different independent variables (i.e., extended structural fingerprints [82] and chemical descriptors). Chemical descriptors were calculated by means of DRAGON software [75,83]. The initial pool of descriptors calculated by DRAGON was pruned of descriptors with constant and semi-constant values (standard deviation < 0.001). Descriptors having at least one missing value were also discarded. In the case of highly correlated descriptors (absolute pair correlation > 0.90), only the one with the highest number of correlated descriptors was retained, while the others were discarded. This procedure led to a final pool of 747 descriptors.
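The three pruning rules can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the procedure actually run: descriptors are held in a hypothetical dictionary mapping a name to a list of values (with `None` marking a missing value), the population standard deviation is used, and the correlation filter is applied greedily, repeatedly keeping the descriptor with the most correlated partners and dropping those partners.

```python
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def prune_descriptors(table, sd_min=0.001, r_max=0.90):
    """Return the names of the descriptors that survive the three filters."""
    # 1) Discard descriptors with at least one missing value.
    kept = {n: v for n, v in table.items() if all(x is not None for x in v)}
    # 2) Discard constant and semi-constant descriptors.
    kept = {n: v for n, v in kept.items() if pstdev(v) >= sd_min}
    # 3) Record which descriptor pairs are highly correlated.
    names = list(kept)
    partners = {n: set() for n in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(kept[a], kept[b])) > r_max:
                partners[a].add(b)
                partners[b].add(a)
    # Greedy filter: keep the descriptor with most partners, drop the partners.
    remaining = set(names)
    while remaining:
        best = max(sorted(remaining), key=lambda n: len(partners[n] & remaining))
        drop = partners[best] & remaining
        if not drop:
            break
        remaining -= drop
    return [n for n in names if n in remaining]
```

Ties between equally connected descriptors are broken alphabetically here, which is an arbitrary choice made only to keep the sketch deterministic.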
Model performance was evaluated by five-fold cross validation. Fold-splitting was performed by applying a stratified sampling of the neurotoxicity categories. For each classifier and each selection of parameters, the seed applied when performing the split was maintained; thus, the various folds were always the same. The splitting procedure was repeated 100 times using the same list of 100 random seeds for each combination of classifiers and parameters; then, statistics were collected for each iteration. Considering the variation of splits and parameters, a total of 500 iterations were performed for each of the three classifiers.
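The bookkeeping behind the 500 iterations per classifier can be sketched in Python. Only the k-NN and RF grids are enumerated; the master seed, the helper names, and the fold-assignment routine are hypothetical, and the class sizes (38 neuroactive, 31 non-neuroactive) follow from the curation described in Section 4.5.

```python
import random
from itertools import product

SETTINGS = {
    "kNN": list(range(1, 10, 2)),     # k = 1, 3, 5, 7, 9
    "RF": list(range(50, 251, 50)),   # trees = 50, 100, 150, 200, 250
}

_master = random.Random(2022)         # hypothetical master seed
SEEDS = [_master.randrange(10**6) for _ in range(100)]  # one list shared by all runs

def runs_for(classifier):
    """All (setting, seed) pairs evaluated for one classifier."""
    return list(product(SETTINGS[classifier], SEEDS))

def five_fold_assignment(labels, seed, n_folds=5):
    """Stratified fold index for every compound; identical for identical seeds."""
    rng = random.Random(seed)
    fold_of = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % n_folds
    return fold_of

labels = [1] * 38 + [0] * 31          # 69 chemicals after curation
```

Because the fold assignment depends only on the seed, every classifier and parameter setting is evaluated on the same 100 partitions, which makes the resulting statistics directly comparable.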
The same procedure was repeated in turn using MIE predictions, fingerprints, and DRAGON descriptors as independent variables, for a total of 1500 models developed. In the case of NNETs, DRAGON descriptors were first normalised to the range 0–1, as NNETs are sensitive to the scale of the independent variables.
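A minimal Python sketch of 0–1 (min–max) normalisation for a single descriptor is given below. The function name is hypothetical, and mapping a constant column to zeros is an assumption made only to avoid division by zero.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so that min -> 0 and max -> 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: no spread to rescale (assumed convention).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Whether the minimum and maximum were taken over the whole dataset or over the training folds only is not specified in the text.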
Figure 2 summarises the entire procedure described above, including data extraction and curation, MIE modelling, and neurotoxicity modelling.

4.7. Evaluation of MIE Relative Importance

The relative importance of the various MIEs on the neurotoxicity predictions was evaluated in two ways.
MIEs were iteratively removed, then QSARs for neurotoxicity were developed with the remaining features, as described in Section 4.6. Balanced accuracies (BAs) were averaged among the various iterations and compared to the reference values of models developed using all of the variables. A reduction in performance after the removal of a specific MIE indicates a strong relationship between the excluded MIE and neurotoxicity. Conversely, an MIE is considered less relevant if its exclusion leaves the baseline performance unchanged or improves it.
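The removal analysis can be sketched in Python as follows, assuming that MIEs are removed one at a time. `evaluate_ba` is a hypothetical callback that trains and cross-validates the neurotoxicity models on the given feature names and returns the mean BA; it stands in for the full procedure of Section 4.6.

```python
def leave_one_out_effect(mie_names, evaluate_ba):
    """Change in mean BA when each MIE is removed in turn.

    A negative value means performance dropped without that MIE,
    i.e., the MIE appears relevant for predicting neurotoxicity.
    """
    baseline = evaluate_ba(list(mie_names))
    effect = {}
    for name in mie_names:
        reduced = [m for m in mie_names if m != name]
        effect[name] = evaluate_ba(reduced) - baseline
    return effect
```

Values at or above zero correspond to the less relevant MIEs described above.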
Variable importance was calculated for each MIE within RF models. A score was calculated from the attribute usage statistics in the RF for each descriptor by counting how many times it was selected for a split (#splits) and at which rank (“level”; the first two levels were considered) relative to how many times it was available as a candidate (#candidates) in the trees of the ensemble:
Variable importance = #splits(level 0)/#candidates(level 0) + #splits(level 1)/#candidates(level 1)
Variable importance calculated in this way was averaged among the various modelling iterations.
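The score and its averaging can be expressed as a short Python sketch; the function names and the example counts are hypothetical.

```python
from statistics import mean

def variable_importance(splits_l0, candidates_l0, splits_l1, candidates_l1):
    """Sum of the split/candidate ratios at tree levels 0 and 1."""
    return splits_l0 / candidates_l0 + splits_l1 / candidates_l1

def averaged_importance(per_iteration_counts):
    """Average the score over modelling iterations.

    per_iteration_counts: iterable of 4-tuples
    (splits_l0, candidates_l0, splits_l1, candidates_l1).
    """
    return mean(variable_importance(*c) for c in per_iteration_counts)
```

For example, a descriptor chosen 10 times out of 40 candidacies at level 0 and 30 times out of 60 at level 1 scores 0.25 + 0.50 = 0.75.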

5. Conclusions

In the present manuscript, a new take on the traditional QSAR methodology was proposed to predict neurotoxicity by employing the biological information associated with chemicals (in the form of ligand-based predictions of MIE activation data) in place of the traditional structural data. The main advantage of this approach is that it can both return a prediction of the adverse outcome and provide insights into the specific mechanisms and molecular events that trigger toxicity. Emphasising the specific mechanisms of action behind neurotoxicity will increase the confidence of scientists and regulators in the predictions returned by these models. Having information about the activated molecular targets that are responsible for an apical effect may, in some cases, indicate to chemists possible modifications to the structure of hazardous chemicals, supporting the design of safe alternatives.
Despite its increasing usage, the application of the AOP framework in computational toxicology remains hampered by numerous and serious challenges. In general, an AOP is always a simplification of more complex and articulated biological pathways. Indeed, for certain biological processes, there are gaps in the knowledge of all the responsible molecular determinants and mechanisms. In the case of neurotoxicity, there is a lack of understanding of all the MIEs involved in the alteration of downstream KEs as well as the occurrence of the AOs [4]. Several of the MIEs initially identified from AOPs were not included in the modelling presented here due to the shortage of data. Gaps in knowledge regarding the chemical concentrations and exposure times needed to trigger MIEs/KEs prevent the development of quantitative approaches and limit the development of AOPs for adult and developmental neurotoxicity mainly to qualitative ones [4,27]. Considering that the AOPs studied in this work are likely to be incomplete, the results described here are even more encouraging. Indeed, the future availability of more high-quality data is likely to improve the predictive capability of single QSARs for MIEs, while the future availability of more detailed AOPs and the inclusion of additional MIEs will complete the overall infrastructure, possibly leading to a more accurate and reliable prediction of apical endpoints. The incorporation of exposure and toxicokinetics data, i.e., absorption (e.g., blood–brain barrier penetration), distribution, metabolism, and excretion, represents a possible additional improvement of the results presented herein [84].

Supplementary Materials

The following supporting information can be downloaded at:

Author Contributions

Conceptualization, D.G. and N.S.; methodology, D.G.; validation, D.G.; data curation, D.G. and N.S.; writing—original draft preparation, D.G.; writing—review and editing, N.S., A.R., M.T.D.C. and E.B.; supervision, A.R., M.T.D.C. and E.B.; project administration, A.R., M.T.D.C. and E.B.; funding acquisition, M.T.D.C. and E.B. All authors have read and agreed to the published version of the manuscript.


Funding

This research was funded by the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 963845 (ONTOX) and the European Union’s Marie Skłodowska-Curie Action ‘in3’: MSCA-ITN-2016, Grant No. 721975.

Data Availability Statement

The data presented in this study are available in the Supplementary Materials.


Acknowledgments

D.G. acknowledges the grant from the LUSH Prize 2020 (Category: Young Researchers) from LUSH and Ethical Consumer.

Conflicts of Interest

The authors declare no conflict of interest.


Abbreviations

AMPAR     alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionate receptor
AO        adverse outcome
AOP       adverse outcome pathway
AUC       area under the ROC curve
AUR-TPO   Amplex UltraRed-thyroperoxidase
BA        balanced accuracy
BRF       balanced random forest
CAR       constitutive androstane receptor
CYP2E1    cytochrome P450 2E1
DNT       developmental neurotoxicity
FN        false negative
FP        false positive
GABAR     GABA receptor
IATA      integrated approaches to testing and assessment
KA        kainate receptor
KE        key event
k-NN      k-nearest neighbors
MCC       Matthews correlation coefficient
MIE       molecular initiating event
NADHOX    NADH-quinone oxidoreductase
NIS       Na+/I− symporter
ND        neurodegenerative disease
NNET      neural networks
NMDAR     N-methyl-D-aspartate receptor
OATP1C1   solute carrier organic anion transporter family member 1C1
PCB       polychlorinated biphenyls
PXR       pregnane X receptor
QSAR      quantitative structure-activity relationship
RF        random forest
ROS       reactive oxygen species
RyR       ryanodine-sensitive Ca2+ channel
THRα      thyroid hormone receptor α
THRβ      thyroid hormone receptor β
TN        true negative
TP        true positive
TTR       transthyretin serum binding protein
VGSC      voltage-gated sodium channel


References

  1. Landrigan, P.J.; Lambertini, L.; Birnbaum, L.S. A Research Strategy to Discover the Environmental Causes of Autism and Neurodevelopmental Disabilities. Environ. Health Perspect. 2012, 120, a258–a260.
  2. Landrigan, P.J.; Sonawane, B.; Butler, R.N.; Trasande, L.; Callan, R.; Droller, D. Early Environmental Origins of Neurodegenerative Disease in Later Life. Environ. Health Perspect. 2005, 113, 1230–1233.
  3. Lein, P.; Silbergeld, E.; Locke, P.; Goldberg, A.M. In Vitro and Other Alternative Approaches to Developmental Neurotoxicity Testing (DNT). Environ. Toxicol. Pharmacol. 2005, 19, 735–744.
  4. Bal-Price, A.; Crofton, K.M.; Sachana, M.; Shafer, T.J.; Behl, M.; Forsby, A.; Hargreaves, A.; Landesmann, B.; Lein, P.J.; Louisse, J. Putative Adverse Outcome Pathways Relevant to Neurotoxicity. Crit. Rev. Toxicol. 2015, 45, 83–91.
  5. Stiles, J.; Jernigan, T.L. The Basics of Brain Development. Neuropsychol. Rev. 2010, 20, 327–348.
  6. Bloom, B.; Cohen, R.A.; Freeman, G. Summary Health Statistics for US Children: National Health Interview Survey, 2009; National Center for Health Statistics: Hyattsville, MD, USA, 2010; Volume 247, pp. 1–82.
  7. Bandeen-Roche, K.; Glass, T.A.; Bolla, K.I. Cumulative Lead Dose and Cognitive Function in Older Adults. Altern. Med. Rev. 2010, 15, 112–113.
  8. Narayan, P.; Ehsani, S.; Lindquist, S. Combating Neurodegenerative Disease with Chemical Probes and Model Systems. Nat. Chem. Biol. 2014, 10, 911–920.
  9. Trippier, P.C.; Jansen Labby, K.; Hawker, D.D.; Mataka, J.J.; Silverman, R.B. Target- and Mechanism-Based Therapeutics for Neurodegenerative Diseases: Strength in Numbers. J. Med. Chem. 2013, 56, 3121–3147.
  10. Banerjee, S. The Macroeconomics of Dementia—Will the World Economy Get Alzheimer’s Disease? Arch. Med. Res. 2012, 43, 705–709.
  11. Grandjean, P.; Landrigan, P.J. Neurobehavioural Effects of Developmental Toxicity. Lancet Neurol. 2014, 13, 330–338.
  12. Grandjean, P.; Landrigan, P.J. Developmental Neurotoxicity of Industrial Chemicals. Lancet 2006, 368, 2167–2178.
  13. Bal-Price, A.; Pistollato, F.; Sachana, M.; Bopp, S.K.; Munn, S.; Worth, A. Strategies to Improve the Regulatory Assessment of Developmental Neurotoxicity (DNT) Using in Vitro Methods. Toxicol. Appl. Pharmacol. 2018, 354, 7–18.
  14. Tsuji, R.; Crofton, K.M. Developmental Neurotoxicity Guideline Study: Issues with Methodology, Evaluation and Regulation. Congenit. Anom. 2012, 52, 122–128.
  15. Collins, F.S.; Gray, G.M.; Bucher, J.R. Transforming Environmental Health Protection. Science 2008, 319, 906.
  16. Dearden, J.C. The History and Development of Quantitative Structure-Activity Relationships (QSARs). In Oncology: Breakthroughs in Research and Practice; IGI Global: Hershey, PA, USA, 2017; pp. 67–117.
  17. Ankley, G.T.; Bennett, R.S.; Erickson, R.J.; Hoff, D.J.; Hornung, M.W.; Johnson, R.D.; Mount, D.R.; Nichols, J.W.; Russom, C.L.; Schmieder, P.K. Adverse Outcome Pathways: A Conceptual Framework to Support Ecotoxicology Research and Risk Assessment. Environ. Toxicol. Chem. Int. J. 2010, 29, 730–741.
  18. Vinken, M. The Adverse Outcome Pathway Concept: A Pragmatic Tool in Toxicology. Toxicology 2013, 312, 158–165.
  19. Leist, M.; Ghallab, A.; Graepel, R.; Marchan, R.; Hassan, R.; Bennekou, S.H.; Limonciel, A.; Vinken, M.; Schildknecht, S.; Waldmann, T.; et al. Adverse Outcome Pathways: Opportunities, Limitations and Open Questions. Arch. Toxicol. 2017, 91, 3477–3505.
  20. Allen, T.E.; Goodman, J.M.; Gutsell, S.; Russell, P.J. A History of the Molecular Initiating Event. Chem. Res. Toxicol. 2016, 29, 2060–2070.
  21. Allen, T.E.; Goodman, J.M.; Gutsell, S.; Russell, P.J. Quantitative Predictions for Molecular Initiating Events Using Three-Dimensional Quantitative Structure–Activity Relationships. Chem. Res. Toxicol. 2019, 33, 324–332.
  22. Benigni, R. Building Predictive Adverse Outcome Pathway Models: Role of Molecular Initiating Events and Structure–Activity Relationships. Appl. Vitr. Toxicol. 2017, 3, 265–270.
  23. Cronin, M.T.; Richarz, A.-N. Relationship between Adverse Outcome Pathways and Chemistry-Based in Silico Models to Predict Toxicity. Appl. Vitr. Toxicol. 2017, 3, 286–297.
  24. Gadaleta, D.; Manganelli, S.; Roncaglioni, A.; Toma, C.; Benfenati, E.; Mombelli, E. QSAR Modeling of ToxCast Assays Relevant to the Molecular Initiating Events of AOPs Leading to Hepatic Steatosis. J. Chem. Inf. Model. 2018, 58, 1501–1517.
  25. Patlewicz, G.; Simon, T.W.; Rowlands, J.C.; Budinsky, R.A.; Becker, R.A. Proposing a Scientific Confidence Framework to Help Support the Application of Adverse Outcome Pathways for Regulatory Purposes. Regul. Toxicol. Pharmacol. 2015, 71, 463–477.
  26. Tollefsen, K.E.; Scholz, S.; Cronin, M.T.; Edwards, S.W.; de Knecht, J.; Crofton, K.; Garcia-Reyero, N.; Hartung, T.; Worth, A.; Patlewicz, G. Applying Adverse Outcome Pathways (AOPs) to Support Integrated Approaches to Testing and Assessment (IATA). Regul. Toxicol. Pharmacol. 2014, 70, 629–640.
  27. Li, J.; Settivari, R.; LeBaron, M.J.; Marty, M.S. An Industry Perspective: A Streamlined Screening Strategy Using Alternative Models for Chemical Assessment of Developmental Neurotoxicity. Neurotoxicology 2019, 73, 17–30.
  28. Marx-Stoelting, P.; de LM Solano, M.; Aoyama, H.; Adams, R.H.; Bal-Price, A.; Buschmann, J.; Chahoud, I.; Clark, R.; Fang, T.; Fujiwara, M.; et al. 25th Anniversary of the Berlin Workshop on Developmental Toxicology: DevTox Database Update, Challenges in Risk Assessment of Developmental Neurotoxicity and Alternative Methodologies in Bone Development and Growth. Reprod. Toxicol. 2021, 100, 155–162.
  29. Lenselink, E.B.; Ten Dijke, N.; Bongers, B.; Papadatos, G.; Van Vlijmen, H.W.; Kowalczyk, W.; Ijzerman, A.P.; Van Westen, G.J. Beyond the Hype: Deep Neural Networks Outperform Established Methods Using a ChEMBL Bioactivity Benchmark Set. J. Cheminform. 2017, 9, 1–14.
  30. Bosc, N.; Atkinson, F.; Felix, E.; Gaulton, A.; Hersey, A.; Leach, A.R. Large Scale Comparison of QSAR and Conformal Prediction Methods and Their Applications in Drug Discovery. J. Cheminform. 2019, 11, 4.
  31. Couratier, P.; Sindou, P.; Hugon, J.; Vallat, J.-M.; Dumas, M. Cell Culture Evidence for Neuronal Degeneration in Amyotrophic Lateral Sclerosis Being Linked to Glutamate AMPA/Kainate Receptors. Lancet 1993, 341, 265–268.
  32. Weiss, J.; Yin, H.-Z.; Choi, D. Basal Forebrain Cholinergic Neurons Are Selectively Vulnerable to AMPA/Kainate Receptor-Mediated Neurotoxicity. Neuroscience 1994, 60, 659–664.
  33. Muratov, E.N.; Bajorath, J.; Sheridan, R.P.; Tetko, I.V.; Filimonov, D.; Poroikov, V.; Oprea, T.I.; Baskin, I.I.; Varnek, A.; Roitberg, A.; et al. QSAR without Borders. Chem. Soc. Rev. 2020, 49, 3525–3564.
  34. Polishchuk, P.G.; Muratov, E.N.; Artemenko, A.G.; Kolumbin, O.G.; Muratov, N.N.; Kuz’min, V.E. Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity. J. Chem. Inf. Model. 2009, 49, 2481–2488.
  35. Svetnik, V.; Liaw, A.; Tong, C.; Culberson, J.C.; Sheridan, R.P.; Feuston, B.P. Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling. J. Chem. Inf. Comput. Sci. 2003, 43, 1947–1958.
  36. Masjosthusmann, S.; Barenys, M.; El-Gamal, M.; Geerts, L.; Gerosa, L.; Gorreja, A.; Kühne, B.; Marchetti, N.; Tigges, J.; Viviani, B.; et al. Literature Review and Appraisal on Alternative Neurotoxicity Testing Methods. EFSA Support. Publ. 2018, 15, 1410E.
  37. Choi, J.; Polcher, A.; Joas, A. Systematic Literature Review on Parkinson’s Disease and Childhood Leukaemia and Mode of Actions for Pesticides. EFSA Support. Publ. 2016, 13, 955E.
  38. Gadaleta, D.; Vuković, K.; Toma, C.; Lavado, G.J.; Karmaus, A.L.; Mansouri, K.; Kleinstreuer, N.C.; Benfenati, E.; Roncaglioni, A. SAR and QSAR Modeling of a Large Collection of LD50 Rat Acute Oral Toxicity Data. J. Cheminform. 2019, 11, 58.
  39. Škuta, C.; Cortés-Ciriano, I.; Dehaen, W.; Kříž, P.; van Westen, G.J.; Tetko, I.V.; Bender, A.; Svozil, D. QSAR-Derived Affinity Fingerprints (Part 1): Fingerprint Construction and Modeling Performance for Similarity Searching, Bioactivity Classification and Scaffold Hopping. J. Cheminform. 2020, 12, 39.
  40. Rice, D.; Barone, S., Jr. Critical Periods of Vulnerability for the Developing Nervous System: Evidence from Humans and Animal Models. Environ. Health Perspect. 2000, 108 (Suppl. 3), 511–533.
  41. Cruz-Monteagudo, M.; Medina-Franco, J.L.; Perez-Castillo, Y.; Nicolotti, O.; Cordeiro, M.N.D.; Borges, F. Activity Cliffs in Drug Discovery: Dr Jekyll or Mr Hyde? Drug Discov. Today 2014, 19, 1069–1080.
  42. Carlson, L.M.; Champagne, F.A.; Cory-Slechta, D.A.; Dishaw, L.; Faustman, E.; Mundy, W.; Segal, D.; Sobin, C.; Starkey, C.; Taylor, M. Potential Frameworks to Support Evaluation of Mechanistic Data for Developmental Neurotoxicity Outcomes: A Symposium Report. Neurotoxicol. Teratol. 2020, 78, 106865.
  43. Fritsche, E.; Grandjean, P.; Crofton, K.M.; Aschner, M.; Goldberg, A.; Heinonen, T.; Hessel, E.V.; Hogberg, H.T.; Bennekou, S.H.; Lein, P.J. Consensus Statement on the Need for Innovation, Transition and Implementation of Developmental Neurotoxicity (DNT) Testing for Regulatory Purposes. Toxicol. Appl. Pharmacol. 2018, 354, 3–6.
  44. Villeneuve, D.L.; Crump, D.; Garcia-Reyero, N.; Hecker, M.; Hutchinson, T.H.; LaLone, C.A.; Landesmann, B.; Lettieri, T.; Munn, S.; Nepelska, M.; et al. Adverse Outcome Pathway (AOP) Development I: Strategies and Principles. Toxicol. Sci. 2014, 142, 312–320.
  45. Spînu, N.; Bal-Price, A.; Cronin, M.T.; Enoch, S.J.; Madden, J.C.; Worth, A.P. Development and Analysis of an Adverse Outcome Pathway Network for Human Neurotoxicity. Arch. Toxicol. 2019, 93, 2759–2772.
  46. Schrattenholz, A.; Soskic, V. NMDA Receptors Are Not Alone: Dynamic Regulation of NMDA Receptor Structure and Function by Neuregulins and Transient Cholesterol-Rich Membrane Domains Leads to Disease-Specific Nuances of Glutamate-Signalling. Curr. Top. Med. Chem. 2006, 6, 663–686.
  47. Toscano, C.D.; Guilarte, T.R. Lead Neurotoxicity: From Exposure to Molecular Effects. Brain Res. Rev. 2005, 49, 529–554.
  48. Farina, M.; Rocha, J.B.; Aschner, M. Mechanisms of Methylmercury-Induced Neurotoxicity: Evidence from Experimental Studies. Life Sci. 2011, 89, 555–563.
  49. Haorah, J.; Ramirez, S.H.; Floreani, N.; Gorantla, S.; Morsey, B.; Persidsky, Y. Mechanism of Alcohol-Induced Oxidative Stress and Neuronal Injury. Free Radic. Biol. Med. 2008, 45, 1542–1550.
  50. Valencia-Olvera, A.C.; Morán, J.; Camacho-Carranza, R.; Prospéro-García, O.; Espinosa-Aguirre, J.J. CYP2E1 Induction Leads to Oxidative Stress and Cytotoxicity in Glutathione-Depleted Cerebellar Granule Neurons. Toxicol. Vitr. 2014, 28, 1206–1214.
  51. De la Vieja, A.; Dohan, O.; Levy, O.; Carrasco, N. Molecular Analysis of the Sodium/Iodide Symporter: Impact on Thyroid and Extrathyroid Pathophysiology. Physiol. Rev. 2000, 80, 1083–1105.
  52. Dohan, O.; De la Vieja, A.; Paroder, V.; Riedel, C.; Artani, M.; Reed, M.; Ginter, C.S.; Carrasco, N. The Sodium/Iodide Symporter (NIS): Characterization, Regulation, and Medical Significance. Endocr. Rev. 2003, 24, 48–77.
  53. Darvesh, S.; Hopkins, D.A.; Geula, C. Neurobiology of Butyrylcholinesterase. Nat. Rev. Neurosci. 2003, 4, 131–138.
  54. US Environmental Protection Agency. The Use of Data on Cholinesterase Inhibition for Risk Assessments of Organophosphorous and Carbamate Pesticides; US Environmental Protection Agency: Washington, DC, USA, 2000.
  55. Holland, E.B.; Feng, W.; Zheng, J.; Dong, Y.; Li, X.; Lehmler, H.-J.; Pessah, I.N. An Extended Structure–Activity Relationship of Nondioxin-like PCBs Evaluates and Supports Modeling Predictions and Identifies Picomolar Potency of PCB 202 towards Ryanodine Receptors. Toxicol. Sci. 2017, 155, 170–181.
  56. Paul Friedman, K.; Watt, E.D.; Hornung, M.W.; Hedge, J.M.; Judson, R.S.; Crofton, K.M.; Houck, K.A.; Simmons, S.O. Tiered High-Throughput Screening Approach to Identify Thyroperoxidase Inhibitors within the ToxCast Phase I and II Chemical Libraries. Toxicol. Sci. 2016, 151, 160–180.
  57. Zoeller, R.; Rovet, J. Timing of Thyroid Hormone Action in the Developing Brain: Clinical Observations and Experimental Findings. J. Neuroendocrinol. 2004, 16, 809–818.
  58. Bernal, J. Thyroid Hormone Receptors in Brain Development and Function. Nat. Clin. Pract. Endocrinol. Metab. 2007, 3, 249–259.
  59. Westerholz, S.; De Lima, A.; Voigt, T. Regulation of Early Spontaneous Network Activity and GABAergic Neurons Development by Thyroid Hormone. Neuroscience 2010, 168, 573–589.
  60. Crofton, K.M.; Zoeller, R.T. Mode of Action: Neurotoxicity Induced by Thyroid Hormone Disruption during Development—Hearing Loss Resulting from Exposure to PHAHs. Crit. Rev. Toxicol. 2005, 35, 757–769.
  61. Van Maele-Fabry, G.; Hoet, P.; Vilain, F.; Lison, D. Occupational Exposure to Pesticides and Parkinson’s Disease: A Systematic Review and Meta-Analysis of Cohort Studies. Environ. Int. 2012, 46, 30–43.
  62. Goldin, A.L. Resurgence of Sodium Channel Research. Annu. Rev. Physiol. 2001, 63, 871–894.
  63. Soderlund, D.M. Molecular Mechanisms of Pyrethroid Insecticide Neurotoxicity: Recent Advances. Arch. Toxicol. 2012, 86, 165–181.
  64. McGonigle, I.; Lummis, S.C. Molecular Characterization of Agonists That Bind to an Insect GABA Receptor. Biochemistry 2010, 49, 2897–2902.
  65. Gong, P.; Hong, H.; Perkins, E.J. Ionotropic GABA Receptor Antagonism-Induced Adverse Outcome Pathways for Potential Neurotoxicity Biomarkers. Biomark. Med. 2015, 9, 1225–1239.
  66. Gaulton, A.; Hersey, A.; Nowotka, M.; Bento, A.P.; Chambers, J.; Mendez, D.; Mutowo, P.; Atkinson, F.; Bellis, L.J.; Cibrián-Uhalte, E.; et al. The ChEMBL Database in 2017. Nucleic Acids Res. 2017, 45, D945–D954.
  67. Bento, A.P.; Gaulton, A.; Hersey, A.; Bellis, L.J.; Chambers, J.; Davies, M.; Krüger, F.A.; Light, Y.; Mak, L.; McGlinchey, S. The ChEMBL Bioactivity Database: An Update. Nucleic Acids Res. 2014, 42, D1083–D1090.
  68. Gadaleta, D.; d’Alessandro, L.; Marzo, M.; Benfenati, E.; Roncaglioni, A. Quantitative Structure-Activity Relationship Modeling of the Amplex Ultrared Assay to Predict Thyroperoxidase Inhibitory Activity. Front. Pharmacol. 2021, 12, 713037.
  69. Enoch, S.J.; Madden, J.C.; Cronin, M.T.D. Identification of Mechanisms of Toxic Action for Skin Sensitisation Using a SMARTS Pattern Based Approach. SAR QSAR Environ. Res. 2008, 19, 555–578.
  70. Gadaleta, D.; Lombardo, A.; Toma, C.; Benfenati, E. A New Semi-Automated Workflow for Chemical Data Retrieval and Quality Checking for Modeling Applications. J. Cheminform. 2018, 10, 60.
  71. Berthold, M.R.; Cebron, N.; Dill, F.; Gabriel, T.R.; Kötter, T.; Meinl, T.; Ohl, P.; Thiel, K.; Wiswedel, B. KNIME-the Konstanz Information Miner: Version 2.0 and Beyond. ACM SIGKDD Explor. Newsl. 2009, 11, 26–31.
  72. Chen, C.; Liaw, A.; Breiman, L. Using Random Forest to Learn Imbalanced Data. Univ. Calif. Berkeley 2004, 110, 24.
  73. Dal Pozzolo, A.; Boracchi, G.; Caelen, O.; Alippi, C.; Bontempi, G. Credit Card Fraud Detection and Concept-Drift Adaptation with Delayed Supervised Information. In Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–16 July 2015; pp. 1–8.
  74. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  75. Kode: DRAGON 7.0.8. 2017. Available online: (accessed on 7 March 2022).
  76. Kosnik, M.B.; Strickland, J.D.; Marvel, S.W.; Wallis, D.J.; Wallace, K.; Richard, A.M.; Reif, D.M.; Shafer, T.J. Concentration–Response Evaluation of ToxCast Compounds for Multivariate Activity Patterns of Neural Network Function. Arch. Toxicol. 2020, 94, 469–484.
  77. Strickland, J.D.; Martin, M.T.; Richard, A.M.; Houck, K.A.; Shafer, T.J. Screening the ToxCast Phase II Libraries for Alterations in Network Function Using Cortical Neurons Grown on Multi-Well Microelectrode Array (MwMEA) Plates. Arch. Toxicol. 2018, 92, 487–500.
  78. Altman, N.S. An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression. Am. Stat. 1992, 46, 175–185.
  79. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  80. Dudek, A.Z.; Arodz, T.; Gálvez, J. Computational Methods in Developing Quantitative Structure-Activity Relationships (QSAR): A Review. Comb. Chem. High Throughput Screen. 2006, 9, 213–228.
  81. Jain, A.K.; Mao, J.; Mohiuddin, K.M. Artificial Neural Networks: A Tutorial. Computer 1996, 29, 31–44.
  82. Daylight Chemical Information Systems, Inc. 6. Fingerprints—Screening and Similarity. 2019. Available online: (accessed on 26 January 2022).
  83. Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; John Wiley & Sons: Hoboken, NJ, USA, 2008; Volume 11.
  84. Blaauboer, B.J. The Integration of Data on Physico-Chemical Properties, in Vitro-Derived Toxicity Data and Physiologically Based Kinetic and Dynamic as Modelling a Tool in Hazard and Risk Assessment. A Commentary. Toxicol. Lett. 2003, 138, 161–171.
Figure 1. Distribution of balanced accuracies calculated among the various QSARs developed to predict neurotoxic potential. Balanced accuracies are grouped based on the algorithm used: (a) k-Nearest Neighbours; (b) Random Forest; (c) Neural Network. Blue bars refer to models developed based on MIE predictions, red bars refer to models based on DRAGON descriptors, and yellow bars refer to models based on Extended Fingerprints. Dashed lines indicate the mean accuracy value achieved by each group of models.
Figure 2. Modelling workflow. The colours of the blocks refer to the paragraphs in Materials and Methods that describe the corresponding steps of the workflow. Data from ChEMBL for 15 targets relevant to the MIEs of neurotoxicity (red) were classified based on the threshold pChEMBL = 5; negative samples were enriched with data bearing “>” and “≥” qualifiers and with chemicals from other MIE datasets, which were treated as decoys. QSARs for MIEs (blue) were developed from these datasets using the BRF method. Datasets were iteratively partitioned into training and test sets, and external performance was calculated as the average over the iterations; the models were then retrained on the whole datasets. Neurotoxicity data (green) were retrieved from [76] and curated at the SMILES level. Predictions from the thyroperoxidase model (violet) by [72] and reactivity SMARTS (cyan) by [75] were combined with the predictions from the 15 MIE models and used as independent variables to develop neurotoxicity QSAR models (orange) with kNN, RF, and NNET. The use of MIE predictions as independent variables was benchmarked against fingerprints and DRAGON descriptors, and the performance of the resulting models was compared using five-fold cross-validation.
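The labelling step in the Figure 2 caption (threshold pChEMBL = 5, with ">" and "≥" records used to enrich the negatives) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the use of nM as input unit, the choice to call a record active when its pChEMBL is at or above the threshold, and the rule of discarding censored records that cannot be placed on one side of the threshold are all assumptions made for illustration.

```python
from math import log10

ACTIVITY_THRESHOLD = 5.0  # pChEMBL cut-off quoted in the Figure 2 caption


def pchembl_from_nm(value_nm):
    """Negative log10 of a molar concentration given in nM (hypothetical helper)."""
    return 9.0 - log10(value_nm)


def label_record(value_nm, relation="="):
    """Return 1 (active), 0 (inactive) or None (record not usable).

    Assumption: ">=" the threshold counts as active for exact measurements.
    """
    p = pchembl_from_nm(value_nm)
    if relation == "=":
        return 1 if p >= ACTIVITY_THRESHOLD else 0
    if relation in (">", ">="):
        # The true concentration is at least value_nm, so the true pChEMBL
        # is at most p. The record is a safe negative only if p itself is
        # already at or below the threshold; otherwise it is uninformative.
        return 0 if p <= ACTIVITY_THRESHOLD else None
    # Other qualifiers (e.g. "<") are not described in the caption; skip them.
    return None
```

For example, `label_record(100.0)` treats a 100 nM measurement (pChEMBL 7) as active, while `label_record(100000.0, ">")` treats "greater than 100 µM" as inactive. A record such as "greater than 100 nM" returns `None`, because the true value could fall on either side of the threshold.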
Table 1. External validation of QSAR models for MIEs based on ChEMBL data. For each MIE-predicting QSAR, the average numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) are reported. The metrics used to evaluate the predictivity of the models were sensitivity (SEN), specificity (SPE), balanced accuracy (BA), Matthews correlation coefficient (MCC), and area under the ROC curve (AUC). Performance is the average of the metrics obtained over 100 different training-test splits.
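The count-based metrics named in the Table 1 caption follow directly from the confusion-matrix entries. The sketch below uses the standard textbook definitions of SEN, SPE, BA and MCC. The function name and the handling of empty classes are illustrative choices, not taken from the article. AUC is omitted because it needs ranked prediction scores rather than counts.

```python
from math import sqrt


def classification_metrics(tp, fp, tn, fn):
    """Compute SEN, SPE, BA and MCC from confusion-matrix counts.

    Counts may be averages over several splits, so floats are accepted.
    """
    sen = tp / (tp + fn) if (tp + fn) else float("nan")  # true positive rate
    spe = tn / (tn + fp) if (tn + fp) else float("nan")  # true negative rate
    ba = (sen + spe) / 2                                  # balanced accuracy
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SEN": sen, "SPE": spe, "BA": ba, "MCC": mcc}
```

With made-up counts TP = 40, FP = 10, TN = 40, FN = 10, the function gives SEN = SPE = BA = 0.8 and MCC = 0.6.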
Table 2. Performance of the three classifiers (kNN, RF, NNET) using MIE predictions, chemical descriptors, and extended fingerprints as independent variables. For each method, the average numbers of true positives (TP), false positives (FP), true negatives (TN), false negatives (FN), and not classified (NC) chemicals are reported. The metrics used to evaluate the predictivity of the models were sensitivity (SEN), specificity (SPE), balanced accuracy (BA), Matthews correlation coefficient (MCC), and area under the ROC curve (AUC). Performance is the average of five-fold cross-validation results obtained over 500 iterations (100 fold-splitting procedures and five parameter combinations).
Method | Independent variables | TP | FP | TN
K-NN | MIE predictions | 30.5 | 11.4 | 19.
MLP-NNET | MIE predictions | 29.4 | 9.4 | 21.
RF | MIE predictions | 31.1 | 11.6 | 19.
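The evaluation scheme in the Table 2 caption (five-fold cross-validation repeated over many fold splits and several parameter values, then averaged) can be sketched in plain Python. This is a simplified illustration, not the authors' pipeline. It uses a toy k-nearest-neighbour vote on binary MIE-prediction vectors with Hamming distance, scores each repetition on the pooled out-of-fold predictions, and runs on made-up data. All function names and the parameter grid are hypothetical.

```python
import random
from statistics import mean


def stratified_folds(labels, k, seed):
    """Assign sample indices to k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def knn_predict(train_X, train_y, x, n_neighbors):
    """Majority vote of the nearest training rows (Hamming distance); ties give 0."""
    dist = sorted(
        (sum(a != b for a, b in zip(row, x)), y) for row, y in zip(train_X, train_y)
    )
    votes = [y for _, y in dist[:n_neighbors]]
    return 1 if sum(votes) * 2 > len(votes) else 0


def balanced_accuracy(y_true, y_pred):
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sen = sum(pos) / len(pos)
    spe = (len(neg) - sum(neg)) / len(neg)
    return (sen + spe) / 2


def repeated_cv(X, y, neighbor_grid, n_repeats, k=5):
    """Average balanced accuracy over every (parameter value, fold split) pair."""
    scores = []
    for n_neighbors in neighbor_grid:
        for seed in range(n_repeats):
            y_pred = [None] * len(y)
            for held_out in stratified_folds(y, k, seed):
                held = set(held_out)
                train_idx = [i for i in range(len(y)) if i not in held]
                tr_X = [X[i] for i in train_idx]
                tr_y = [y[i] for i in train_idx]
                for i in held_out:
                    y_pred[i] = knn_predict(tr_X, tr_y, X[i], n_neighbors)
            scores.append(balanced_accuracy(y, y_pred))
    return mean(scores), len(scores)


# Toy data: three hypothetical binary MIE predictions per chemical.
X = [[1, 1, 0]] * 10 + [[0, 0, 1]] * 10
y = [1] * 10 + [0] * 10
mean_ba, n_runs = repeated_cv(X, y, neighbor_grid=[1, 3, 5], n_repeats=4)
```

With 100 repetitions and a grid of five parameter values, the same loop would produce the 500 iterations mentioned in the caption. The toy run above uses 4 repetitions and 3 values to stay fast.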
Table 3. Molecular Initiating Events associated with Developmental Neurotoxicity, adapted from Spînu et al. [45] and Li et al. [27].
MIE | Description | Molecular target | Ref.
A | Binding of agonist, Ionotropic glutamate receptors | Glutamate [NMDA] receptor | [45]
A | Binding of agonist, Ionotropic glutamate receptors | Glutamate receptor ionotropic kainate | [45]
A | Binding of agonist, Ionotropic glutamate receptors | Glutamate receptor ionotropic AMPA | [45]
B | Binding of antagonist, NMDA receptors | Glutamate [NMDA] receptor | [45]
C | Binding of inhibitor, NADH-ubiquinone oxidoreductase (complex I) | Mitochondrial complex I (NADH dehydrogenase) | [45]
D | Binding, SH/SeH proteins involved in protection against oxidative stress | Aspecific ¹ | [45]
E | CYP2E1 Activation | Cytochrome P450 2E1 | [45]
F | Inhibition, Na+/I− symporter (NIS) | Sodium/iodide cotransporter | [45]
G | Thyroperoxidase, Inhibition | Thyroid peroxidase ¹ | [45]
H | Protein Adduct Formation | Aspecific ² | [45]
I | Binding of inhibitors to acetylcholinesterase (AChE) | Acetylcholinesterase | [27]
L | Binding of non-dioxin-like polychlorinated biphenyls with ryanodine receptor (RyR) | Ryanodine receptors 1, 2 and 3 | [27]
M | Interaction uncouplers with oxidative phosphorylation | Aspecific ³ | [27]
N | Binding of redox cycling chemicals with NADH-quinone oxidoreductase | Mitochondrial complex I (NADH dehydrogenase) | [27]
O | Binding of redox cycling chemicals with NADH cytochrome b5 reductase | NADH-cytochrome b5 reductase | [27]
P | Xenobiotic nuclear receptor activation | Pregnane X receptor | [27]
P | Xenobiotic nuclear receptor activation | Nuclear receptor subfamily 1 group I member 3 (Constitutive Androstane Receptor) | [27]
Q | Interference with thyroid serum binding protein | Transthyretin | [27]
R | Deiodinase inhibition | Deiodinase ⁴ | [27]
S | Thyroid receptor binding | Thyroid hormone receptor beta | [27]
S | Thyroid receptor binding | Thyroid hormone receptor alpha | [27]
T | Thyroid hormone transporter interference | Monocarboxylate transporter 8 ⁴ | [27]
T | Thyroid hormone transporter interference | Monocarboxylate transporter 10 ⁴ | [27]
T | Thyroid hormone transporter interference | Solute carrier organic anion transporter family member 1C1 ⁴ | [27]
U | Binding of pyrethroids to voltage-gated sodium channels (VGSC) | Sodium channel protein type N alpha subunit | [27]
V | Binding of antagonist to γ-aminobutyric acid receptor GABAAR | GABA-A receptor; alpha-1/beta-2/gamma-2 | [27]
¹ No data found in ChEMBL, QSAR from Gadaleta et al. [68] was used. ² Replaced with the use of reactivity SMARTS [69]. ³ No specific targets, not considered for modelling. ⁴ No data found in ChEMBL, not considered for modelling.
Table 4. List of endpoints modelled using ChEMBL data. For each endpoint, the reference MIE, ChEMBL ID relative to the molecular target, species, and composition of the Training and Test sets are reported; ACT is the number of active compounds, while INA is the number of inactive compounds.
Molecular target | Endpoint | ChEMBL ID | Species | MIE | ACT | INA
Glutamate receptor ionotropic AMPA | AMPAR | CHEMBL2096670 | Human | A | 73 | 3355
Nuclear receptor subfamily 1 group I member 3 (Constitutive Androstane Receptor) | CAR | CHEMBL5503 | Human | P | 51 | 3377
Cytochrome P450 2E1 | CYP2E1 | CHEMBL5281 | Human | E | 25 | 3402
GABA-A receptor; alpha-1/beta-2/gamma-2 | GABAR | CHEMBL2095172 | Human | V | 129 | 3298
Glutamate receptor ionotropic kainate | KAR | CHEMBL2109241 | Human | A | 25 | 3402
Mitochondrial complex I (NADH dehydrogenase) | NADHOX | CHEMBL614865 | Bos taurus | C, N | 78 | 3349
Sodium/iodide cotransporter | NIS | CHEMBL2331047 | Human | F | 56 | 3371
Glutamate [NMDA] receptor | NMDAR | CHEMBL2094124 | Human | A, B | 267 | 3161
Pregnane X receptor | PXR | CHEMBL3401 | Human | P | 244 | 3188
Ryanodine receptors ¹ | RYR | CHEMBL2062
Thyroid hormone receptor alpha | THRα | CHEMBL1860 | Human | S | 311 | 3116
Thyroid hormone receptor beta | THRβ | CHEMBL1947 | Human | S | 728 | 2704
Sodium channel protein type N alpha subunit ² | VGSC | CHEMBL1845
¹ All three isoforms of RYR were considered. ² Isoforms 1, 2, 3 and 6 were considered.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Gadaleta, D.; Spînu, N.; Roncaglioni, A.; Cronin, M.T.D.; Benfenati, E. Prediction of the Neurotoxic Potential of Chemicals Based on Modelling of Molecular Initiating Events Upstream of the Adverse Outcome Pathways of (Developmental) Neurotoxicity. Int. J. Mol. Sci. 2022, 23, 3053.
