A Review of Quantitative Structure–Activity Relationship (QSAR) Models to Predict Thyroid Hormone System Disruption by Chemical Substances

Evangelista, Marco; Papa, Ester

doi:10.3390/toxics13090799

by Marco Evangelista ¹,²,* and Ester Papa

¹ QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Via J.H. Dunant 3, 21100 Varese, Italy

² Department of Science and High Technology, University of Insubria, Via Valleggio 11, 22100 Como, Italy

* Author to whom correspondence should be addressed.

Toxics 2025, 13(9), 799; https://doi.org/10.3390/toxics13090799

Submission received: 24 August 2025 / Revised: 15 September 2025 / Accepted: 17 September 2025 / Published: 19 September 2025

(This article belongs to the Section Novel Methods in Toxicology Research)

Highlights

What are the main findings?

This review maps the landscape of QSAR models developed between 2010 and 2024 for predicting the disruption of the thyroid hormone system, induced by chemicals, within an AOP framework.
This review shows progress in modelling key MIEs (e.g., TR, TTR) but reveals that many other MIEs are poorly addressed or entirely overlooked.
This review identifies a preference for classification-based approaches, a frequent reliance on simple algorithms, insufficient AD definitions, and limited mechanistic interpretations and chemical space coverage, challenging model confidence and/or its broader application.

What is the implication of the main finding?

This review provides a state-of-the-art resource to guide future QSAR development for thyroid hormone system disruption by consolidating existing knowledge and identifying critical research gaps.
The findings suggest a clear need for future research to address overlooked MIEs, to expand the range of chemical classes studied, and to develop new QSARs with explicitly defined ADs and improved mechanistic interpretability.

Abstract

Thyroid hormone (TH) system disruption by chemicals poses a significant concern due to the key role the TH system plays in essential body functions, including metabolism, growth, and brain development. Animal-based testing methods are resource-demanding and raise ethical issues. Thus, there is a recognised need for new approach methodologies, such as quantitative structure–activity relationship (QSAR) models, to advance chemical hazard assessments. This review, covering the scientific literature from 2010 to 2024, aimed to map the current landscape of QSAR model development for predicting TH system disruption. The focus was placed on QSARs that address molecular initiating events within the adverse outcome pathway for TH system disruption. A total of thirty papers presenting eighty-six different QSARs were selected based on predefined criteria. The endpoints and chemical classes modelled, data sources, modelling approaches, and the molecular descriptors selected, including their mechanistic interpretations, are discussed. By providing a state-of-the-art overview of the field, this review identifies existing models and highlights gaps. It can be used to inform future research studies aimed at advancing the assessment of TH system disruption by chemicals without relying on animal-based testing, and it points to areas that require additional research.

Keywords:

endocrine disruption; thyroid hormone system disruption; new approach methodologies; QSAR; MIE; AOP; molecular descriptors; mechanistic interpretation; applicability domain

1. Introduction

The endocrine system is a network of glands and organs responsible for the proper production and homeostasis of hormones that control and regulate essential physiological processes, including growth, metabolism, and reproduction [1,2]. While the proper function of this intricate network is vital for maintaining hormonal homeostasis, the endocrine system is vulnerable to exogenous chemical substances known as endocrine disrupting chemicals (EDCs) [1,2]. By mimicking or blocking hormone activity, EDCs cause a wide range of severe adverse health outcomes in living organisms, including, among others, cancers and infertility [1,3,4]. This breadth and severity of effects have made exposure to EDCs a global concern for ecosystems and human health [4,5,6].

In mammals, three major axes characterise the endocrine system: the hypothalamic–pituitary–gonadal (HPG) axis, the hypothalamic–pituitary–adrenal (HPA) axis, and the hypothalamic–pituitary–thyroid (HPT) axis [2,6,7]. The HPT axis regulates the synthesis and release of specific hormones, i.e., thyroid hormones (TH), through a negative feedback loop, ensuring their homeostasis and appropriate physiological concentrations [8,9]. THs, primarily thyroxine (T4) and triiodothyronine (T3), are essential for regulating and coordinating a wide spectrum of physiological processes throughout all life cycle stages, from embryonic development to adult tissue functions. These processes include, among others, the regulation of metabolism and energy balance [10,11], and the influence on the immune, nervous, skeletal, reproductive, and cardiovascular systems [12,13,14,15,16]. Although proper TH activity is essential for normal physiological processes in adulthood [17], its importance is critically pronounced during gestation and early life stages, as THs play a lead role in placenta, brain, and nervous system development [18,19,20,21]. TH system-disrupting chemicals (THSDCs) are a specific subset of EDCs which target the TH system and interfere with the synthesis, secretion, distribution, and metabolism of THs and, ultimately, with their binding to nuclear TH receptors (TRs) for inhibiting or activating gene transcription [22,23]. To date, multiple chemical substances have been recognised as THSDCs, including polychlorinated biphenyls (PCBs), polybrominated diphenyl ethers (PBDEs), perchlorate, bisphenol A, phthalates, dioxins, pesticides, per- and polyfluoroalkyl substances (PFAS), and metals [22,24,25,26]. Exposure to THSDCs can disrupt TH homeostasis, resulting in cognitive and neurobehavioral disorders [27], cancer [28], and immune, cardiovascular, and reproductive system dysfunctions [29,30,31,32]. Therefore, it is of utmost importance that THSDCs are identified without delay [23,33].

In the framework of the European Green Deal [34] and the Chemicals Strategy for Sustainability [35], the development and implementation of new approach methodologies (NAMs), including in vitro assays and in silico approaches, are heavily promoted to support the identification of EDCs and reduce the reliance on vertebrate animal testing [36,37,38]. The European Union (EU) is advancing this field by funding key dedicated research projects, such as the European Cluster to Improve Identification of Endocrine Disruptors (https://eurion-cluster.eu/). At the international level, the Organisation for Economic Co-operation and Development (OECD) included in vitro and in silico methodologies in the “Conceptual Framework for Testing and Assessment of Endocrine Disrupting Chemicals” as a relevant source of information to assess the ED properties of substances [39].

In previous years, the criteria for the determination of ED properties have been adopted under the main EU chemical regulations, such as Regulation (EU) No 528/2012 [40], Regulation (EC) No 1107/2009 [41], and Regulation (EC) No 1272/2008 [42]. Although there are minor differences in the terminology across these regulations, a chemical substance is recognised as an EDC if it meets the following three criteria: (i) it shows an adverse effect, (ii) it can alter the endocrine system through an endocrine mode of action, and (iii) there is a plausible link between (i) and (ii). In this regard, the combined application of NAMs and the adverse outcome pathway (AOP) framework [43] has been suggested as an effective strategy [36,37,44]. Firstly, the development and application of NAMs can identify molecular initiating events (MIEs) in AOPs through which chemical substances can trigger specific endocrine modes of action and consequently lead to endocrine-related adverse effects. Secondly, a biologically plausible link between endocrine modes of action and adverse effects can emerge. This synergy gains even greater significance as it is now established that EDCs can disrupt various pathways involving hormone signalling, rather than the initial belief that their effects were solely mediated by interacting with nuclear receptors [7]. As is the case with other types of NAMs, a synergism between AOPs and quantitative structure–activity relationship (QSAR) models has been established [23,45,46]. The AOP network for TH system disruption developed by Noyes et al. [47] holds significant importance in the field, as it was used as a foundational framework by the European Union Reference Laboratory for alternatives to animal testing (EURL ECVAM) to validate a suite of mechanistic in vitro assays for identifying THSDCs [33,48]. Multiple MIEs, involving each step of the TH cycle, have been well documented [47]. Examples include, among others, the inhibition of thyroperoxidase (TPO), which is a critical enzyme for TH synthesis as it catalyses tyrosine residue iodination; binding to serum TH distributor proteins, such as transthyretin (TTR), thyroid binding globulin (TBG), and albumin, which serve as buffers of TH in the bloodstream to ensure the proper TH concentration in their free form; and binding to TRs, which are proteins that, once bound to TH, regulate gene expression and ultimately biological effects [47].

Despite the growing need and interest to advance TH system disruption assessments using in silico and QSAR approaches, a comprehensive review on this topic is currently lacking. While valuable studies have been recently published [49,50], their scopes were different. Sellami and co-workers presented a review on in silico studies focused on nuclear receptors, covering a range of approaches that included not only QSARs but also other methods such as molecular docking and dynamics, and considered the TR as the sole target related to the TH system [49]. In contrast, Vergauwen and co-workers presented a broader review focused on in vivo, in vitro, and in silico methods currently available for TH system disruption assessment [50]. However, their specific examination of in silico tools was confined to models available in open-source predictive tools (e.g., Danish (Q)SAR Database), leading to the identification of twelve models [50]. The present review addressed the current state of the art of QSAR models published in the literature from 2010 to 2024 to predict potential TH system disruption by chemical substances. This allowed for a detailed characterisation of how this field has evolved over time, which types of TH system-related endpoints were (and were not) modelled, the main data sources used for model development, the modelling approaches, the applicability domain (AD) definitions, which types of chemicals have been assessed, which types of molecular descriptors have been selected as more relevant, and their mechanistic interpretations to suggest potential biological mechanisms. Mapping out the state of the art on this topic is necessary to consolidate existing knowledge, identify research gaps, and offer a resource to guide future investigations. To provide the most up-to-date perspective on the topic, a separate paragraph is dedicated to key articles published between January and July 2025. The decision to treat these publications separately was made because 2025 is an incomplete year and a full comprehensive review of its literature would be premature.

2. Materials and Methods

Criteria of Inclusion and Exclusion and Literature Collection

To meet the scope of this review, the following specific inclusion and exclusion criteria were predefined to collect relevant publications. (1) Original peer-reviewed research articles published from 2010 to 2024, where new QSAR models for predicting the potential TH system disruption by chemical substances were proposed, were included. Original peer-reviewed research articles not proposing a new QSAR model (e.g., experimental studies, the application of unsupervised learning methods) were excluded. (2) Modelling efforts had to focus on predicting MIEs within AOPs for TH system disruption; the AOP network proposed by Noyes et al. was used as a reference [47]. MIEs, such as the induction of the constitutive androstane receptor (CAR), pregnane X receptor (PXR), aryl hydrocarbon receptor (AhR), and peroxisome proliferator-activated receptor (PPAR), were not considered in this review as they were not identified as being thyroid-specific by Dracheva et al. [51] in a study following that by Noyes et al. [47], and were also not addressed by the EURL ECVAM [33,48]. (3) From articles reporting multiple models for the same endpoint, only the QSARs explicitly identified as the best ones by the developers and/or applied for screening purposes within the same study were retained, thereby excluding QSARs arising from, e.g., different data partitioning, data imbalance handling techniques, and feature selection procedures (a description of the effects of such approaches on model performance is provided in Section 3.5). (4) Original, peer-reviewed research articles focusing on QSAR development for a series of “selective ligands” in illness treatments or drug development were excluded. The same inclusion and exclusion criteria were applied to identify relevant articles published between January and July 2025.

The literature search was conducted using the Web of Science database, according to the inclusion and exclusion criteria. To obtain a more comprehensive collection of relevant publications, the literature search was conducted using both the full names and abbreviations of key biological targets (e.g., TTR, TPO) as keywords, rather than searching for each specific MIE (e.g., TTR binding, TPO inhibition) [47,48,51]. Hereafter in this review, each biological target will be referred to as the related MIE. The search strategy involved the following combinations of keywords: “thyroid system” AND “QSAR”, “thyrotropin releasing hormone receptor” AND “QSAR”, “TRHR” AND “QSAR”, “thyroid stimulating hormone receptor” AND “QSAR”, “TSHR” AND “QSAR”, “thyroperoxidase” AND “QSAR”, “TPO” AND “QSAR”, “sodium iodide symporter” AND “QSAR”, “NIS” AND “QSAR”, “type 1 deiodinase” AND “QSAR”, “DIO1” AND “QSAR”, “type 2 deiodinase” AND “QSAR”, “DIO2” AND “QSAR”, “type 3 deiodinase” AND “QSAR”, “DIO3” AND “QSAR”, “deiodinase” AND “QSAR”, “DIO” AND “QSAR”, “iodothyronine deiodinase” AND “QSAR”, “IYD” AND “QSAR”, “iodotyrosine deiodinase” AND “QSAR”, “DUOX” AND “QSAR”, “dual oxidase” AND “QSAR”, “pendrin” AND “QSAR”, “monocarboxylate transporter 8” AND “QSAR”, “MCT8” AND “QSAR”, “monocarboxylate transporter 10” AND “QSAR”, “MCT10” AND “QSAR”, “monocarboxylate transporter” AND “QSAR”, “MCT” AND “QSAR”, “organic anion transporter polypeptide 1C1” AND “QSAR”, “OATP1C1” AND “QSAR”, “organic anion transporter polypeptide 1A4” AND “QSAR”, “OATP1A4” AND “QSAR”, “organic anion transporter polypeptide” AND “QSAR”, “OATP” AND “QSAR”, “multidrug resistance protein 1” AND “QSAR”, “MDR1” AND “QSAR”, “multidrug resistance associated protein 2” AND “QSAR”, “MRP2” AND “QSAR”, “thyroid binding globulin” AND “QSAR”, “TBG” AND “QSAR”, “transthyretin” AND “QSAR”, “TTR” AND “QSAR”, “albumin” AND “QSAR”, “thyroid receptor” AND “QSAR”, “TR” AND “QSAR”.
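The keyword combinations above all follow one pattern: a target name or abbreviation joined to “QSAR” with a Boolean AND. As a minimal sketch of that pattern, the Python snippet below builds the query strings from a list of terms. The function name `build_queries` and the variable names are hypothetical, and only an illustrative subset of the terms listed above is included; the snippet does not reproduce how the search was actually run.

```python
# Illustrative subset of the target names and abbreviations listed in the text;
# the search described above used the complete list.
TARGET_TERMS = [
    "thyroid system",
    "thyroperoxidase", "TPO",
    "sodium iodide symporter", "NIS",
    "transthyretin", "TTR",
    "thyroid receptor", "TR",
]


def build_queries(terms, method_term="QSAR"):
    """Pair each target term with the method term using a Boolean AND."""
    return [f'"{term}" AND "{method_term}"' for term in terms]


queries = build_queries(TARGET_TERMS)
for query in queries:
    print(query)
```

Keeping full names and abbreviations as separate entries mirrors the strategy described above, in which both forms of each biological target were searched.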

The selection of relevant publications from this search followed two main phases. An initial screening of titles and abstracts was conducted to assess relevance based on the inclusion and exclusion criteria. If relevance could not be determined from this step, a full-text analysis was performed.

3. Results and Discussion

The final list comprised thirty publications including eighty-six distinct QSAR models. A summary is reported in Table 1, where studies are presented chronologically. Additional information is reported in Table S1 in the Supplementary Materials.

3.1. Temporal Trend

Figure 1 illustrates the number and distribution of the selected QSAR models and papers over time. Despite minor fluctuations, modelling efforts remained relatively stable from 2010 to 2020, followed by a noticeable surge in 2021–2022 and a slight decrease in 2023–2024. Sixteen out of the thirty papers selected in this review were published in the period 2021–2024, suggesting a recent acceleration of research into TH system disruption using QSAR-based approaches. Before this, publications were notably sparser, with fourteen papers appearing between 2010 and 2020. Notably, no relevant publications were detected in 2016 and 2020, which could signify periods of reduced research focus (e.g., the impact of the COVID-19 pandemic), or a shift in research priorities. The number of developed QSAR models mirrors this trend. Indeed, while a year-to-year fluctuation was observed up to 2020, over 70% of the total QSARs were published within the last four years, with a pronounced surge occurring in 2021 and 2022. The number of QSAR models exceeds the number of publications largely because of the increasing practice of proposing multiple models within a single publication, often addressing, for instance, different endpoints, descriptor types, and/or methodological approaches. These findings could be mainly attributed to the growing volume of publicly available high-throughput screening (HTS) data for multiple thyroid-related endpoints, such as those from large-scale projects like Toxicity Forecaster (ToxCast) (https://www.epa.gov/comptox-tools/toxicity-forecasting-toxcast) and Toxicology in the 21st Century (Tox21) (https://tox21.gov/).

3.2. Modelled MIEs

The selected QSAR models were developed for eleven different MIEs for TH system disruption, which represent only a subset of the over twenty described by Noyes et al. [47]. MIEs regarding DUOX, IYD, and pendrin inhibition, as well as those related to cellular TH transport (i.e., MCT8, MCT10, OATP1C1, OATP1A4, MDR1, and MRP2), have never been addressed by QSAR modelling.

As illustrated in Figure 2, a predominant focus was placed on TR and TTR, which together account for 57% of all the QSARs included in this review. This large number could be attributed to the widespread availability of in vitro data for these MIEs and to their established mechanistic links with TH system disruption [7]. While less frequently modelled than TTR and TR, targets like TSHR and TPO were still relatively well represented. The modelling of TPO, which is a key enzyme for TH synthesis, and TSHR, which is a protein that regulates thyroid gland function, highlighted an expanding scope of investigation beyond the TH distribution and nuclear receptor binding reflected by TTR and TR, respectively. In contrast, other important MIEs remained poorly addressed, highlighting notable gaps in current research in the field. The critical roles of albumin, TBG, NIS, TRHR, and the three deiodinases (DIO 1, 2, and 3) in TH synthesis, distribution, and metabolism are well established [31,47]. However, despite their recognised relevance, the scarcity of QSAR research for these targets pointed to potential challenges, such as poor data availability or limited interest or awareness among QSAR developers. This almost-negligible modelling effort for these MIEs indicates a significant opportunity for future research and QSAR model development.

As illustrated in Figure 3, TR and TTR were consistently modelled throughout the entire study period, reflecting their long-standing recognition as key targets for TH system disruption assessment. A shift in research focus is evident from 2021 onward, with a significant diversification of modelled MIEs. Specifically, the modelling efforts on TSHR, TPO, NIS, TRHR, and deiodinases (DIO1, DIO2, DIO3), though less numerous overall, were distinctly concentrated in 2021 and 2022. This concentrated activity, however, largely stemmed from two studies by Dracheva et al. [51] and de Lomana et al. [119], where multiple endpoints were addressed in the same publication. QSARs addressing other important TH distributor proteins, i.e., TBG and albumin, were only published in the last two years.

As discussed in Section 3.1, this trend of diversification and the surge in the 2021–2022 biennium are likely linked to the growing availability and accessibility of HTS data. Prior to 2021, the scarcity of QSAR studies for MIEs other than TR and TTR likely stemmed from a combination of factors: a scarcity of available experimental data (for instance, Gadaleta et al. [104] pointed out that MIEs such as MCT8, MCT10, and OATP1C1 lacked sufficient active compounds in the ChEMBL database to be used for modelling purposes), the complexity of developing suitable, validated assays to generate such data, or a lower awareness of the mechanistic role of these MIEs in TH system disruption. It is worth highlighting that no in vitro assays for TH system disruption have yet been validated by the OECD [33,48,149,150], which might be slowing data generation. The relatively recent publication of the AOP network for TH system disruption by Noyes and colleagues [47] likely played a crucial role. By providing a more structured understanding of these diverse pathways, it stimulated research into previously underexplored MIEs. The growing number of publications and QSARs covering multiple MIEs underscored the increasing awareness of the multifaceted and interconnected nature of TH system disruption.

3.3. Data Sources

As detailed in Table 1, the QSARs selected for this review were based on data from three main source types: (1) primary sources, where data was generated as part of the same study; (2) secondary sources, where data was collected from the existing literature; (3) publicly available databases (i.e., ToxCast, Tox21, and ChEMBL). In most cases, these sources were used individually, while in others, they were combined (Figure 4).

The data source reference(s) used to develop each QSAR are reported in Table 1. The data included in publicly available databases served as unique data sources for developing thirty-five distinct QSARs, representing approximately 41% of the total. MIEs covered by these QSARs included TTR, TR, TSHR, TPO, TRHR, NIS, and the three deiodinases. With a single exception [121], all of the studies using data from the ToxCast and Tox21 projects were published from 2021 to 2023, proposing all of the available QSARs addressing TPO, NIS, TSHR, TRHR, DIO1, DIO2, and DIO3. As previously discussed, these findings were linked to the growing availability and accessibility of comprehensive HTS datasets. This data availability, combined with an increasing awareness of the critical roles these targets play within the TH system, has broadened the scope of QSAR investigations for TH system disruption assessment.

In contrast, primary and secondary data sources were used alone for the development of forty-six (53%) QSARs. The consistent use of primary and secondary data sources from the literature throughout the entire study period underscored their sustained importance. These models covered a narrower range of MIEs, such as TR and the TH distributor proteins TTR, TBG, and albumin, underscoring limited data availability or utilisation for other MIEs. Whilst the majority of these QSARs were developed using data from in vitro experiments, Kowalska et al. [90] and Yang et al. [52] developed a total of six QSARs to predict binding energies to TTR and TBG, respectively. The binding energies used for model development were generated within the same studies through molecular docking and molecular dynamics simulations and used as dependent variables. The successful application of integrated in silico approaches highlighted their utility as an effective strategy when experimental data from in vivo or in vitro studies are limited or entirely lacking, further enabling the exploration of complex molecular interactions that might otherwise be inaccessible.

A key aspect across the studies was data transparency. The data sources and data used for model development were consistently made available, either directly within the publications or through adequately referenced sources. This commitment to data availability aligned with the FAIR (Findable, Accessible, Interoperable, Reusable) principles for data sharing [151], thereby optimising data reuse for future research.

3.4. Chemical Classes

The datasets used for QSAR training and validation included either structurally heterogeneous chemicals or class-specific chemicals.

Structurally heterogeneous datasets were used for approximately 67% of the QSARs. These datasets primarily consisted of organic chemicals, encompassing a mix of environmental pollutants, natural compounds, and, occasionally, drugs. The sizes of such datasets varied considerably, from 41 to 8682 compounds. About 83% of these QSARs were published within the last four years, reflecting the growing availability of HTS data, as previously described. As illustrated in Figure 5, all the endpoints were addressed using heterogeneous datasets, with the exception of TBG and albumin.

In contrast, only a limited number of chemical classes have been tested and modelled for TH system disruption, addressing a limited number of MIEs. These datasets primarily focused on environmental pollutants of known concern, including PCBs and their hydroxylated metabolites, PBDEs and their hydroxylated metabolites, PCNs, halogenated phenols and thiophenols, phenolic DBPs, PFAS (often referred to as PFCs), and PBBs and their hydroxylated metabolites. The sizes of these datasets were generally smaller compared with the structurally heterogeneous ones, ranging from 17 to 107 compounds. Furthermore, these data were exclusively generated within the same study or retrieved from the existing literature, hence were never extracted from databases. Notably, only TR and TH distributor proteins (i.e., TTR, TBG, and albumin) were modelled using these datasets, underscoring limited data availability or the utilisation of specific class data for other thyroid-related endpoints. It is also important to highlight that almost half of these QSARs were published within the last four years. This trend suggested that, despite the increasing availability of HTS data, the reliance on data published in the literature by independent research groups remained critically important.

Although certain compounds, like bisphenol derivatives, phthalates, various pesticides, and constituents of personal care products have been experimentally identified as THSDCs [22,25,26,152], many others within these same classes remain poorly addressed. This lack of data is concerning because structural similarity among compounds within the same class may suggest a similar toxic potential. This highlighted a strong need for additional in silico or in vitro efforts to generate more data for these and other chemical categories for specific MIEs. Broadening the chemical space coverage for each of these chemical categories would be essential to develop new, specific QSAR models, enabling a more robust hazard assessment for entire groups of compounds. Building on the successful application of integrated in silico approaches by Kowalska et al. [90] and Yang et al. [52], as described in Section 3.3, a similar approach could be an effective strategy to address other MIEs for specific chemical classes.

Generally, the use of heterogeneous datasets can improve a model’s AD coverage and generalizability for large screening applications. In contrast, local QSARs, which are specifically designed for specific classes of compounds, are often preferred for their ability to more accurately capture subtle structural differences and specific structure–activity relationships. This can lead to (potentially) more reliable predictions within that defined chemical space. Therefore, the choice between using global or local QSAR models depends on the specific application purposes. Furthermore, the inherent complexity of heterogeneous data can hinder the mechanistic interpretation of molecular descriptors (see Section 3.8). When a model is trained on a wide array of chemical structures, it is more challenging to pinpoint the exact structural features or physicochemical properties responsible for a particular activity. This is in contrast to datasets of specific classes, where a clearer structure–activity relationship can emerge, making interpretation more straightforward.

3.5. Modelling Approaches

A wide variety of modelling algorithms are available for QSAR model development. These range from traditional methodologies, such as MLR and LDA, to more complex machine learning methodologies, such as NN and SVM [46,153]. The choice of algorithm generally depends on the complexity of the data and the desired interpretability of the model. Thus, the landscape of algorithms for QSAR development lacks a universally accepted solution, as each method presents its own set of strengths and limitations.

As illustrated in Figure 6, different modelling algorithms and approaches were identified across the papers.

Over two-thirds of the QSARs selected in this review (67%) were designed for classification, a preference largely driven by the nature of HTS data. Large-scale projects like ToxCast and Tox21 generate vast datasets, where the effect on a biological target by compounds is often reported with a simple categorical outcome, i.e., “active” or “inactive”. This format has consequently led to a shift in QSAR modelling for TH system disruption, favouring classification-based approaches over regression-based ones.

RF was the most frequently used algorithm, followed by MLR and kNN. Overall, RF, kNN, and MLR were used to develop a total of fifty-eight different QSARs, corresponding to approximately 67% of the total models. It is important to highlight that a single study by Dracheva et al. [51] utilised RF to develop eleven different QSARs for the prediction of nine MIEs, which significantly influenced the overall count of RF applications.

The majority of studies concentrated on a single, well-defined modelling strategy, while a few explored more comprehensive approaches, systematically exploring combinations of algorithms, descriptor types, or class-balancing techniques to achieve the best possible performance. While a comprehensive comparative analysis of the predictive models’ performances would be highly valuable, it fell outside the scope of this review, as it was hindered by the following two key reasons. Firstly, the distribution of available models was highly imbalanced. While MIEs like TTR and TR have been extensively studied with multiple QSARs, others have been addressed by a few, or even no, models. Secondly, cross-study comparisons of models’ performances can be performed only when the same dataset and data processing technique are used [154], meaning that simply looking at the statistical metrics of QSARs from different papers would be inappropriate to determine which modelling approach is truly superior. For example, Schür et al. recently reviewed predictive ecotoxicology studies and concluded that no existing studies were truly comparable due to inconsistent methodologies regarding datasets, data processing, and performance statistical metrics [155]. This finding could also be applicable to the broader toxicological context. Therefore, the focus of this section was placed on studies that directly explored various modelling approaches, in terms of algorithms, descriptor types, or data-balancing methods, on a single, consistent dataset. This approach allowed the authors to conduct a reliable assessment of which specific methodology yielded the best predictive results.
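Class imbalance is one reason why raw accuracy can mislead when classification models are compared, and balanced accuracy is among the metrics reported by the studies discussed in this section. The Python sketch below illustrates the metric as the mean of sensitivity and specificity for a binary classifier. The dataset (10 actives, 90 inactives) and the trivial "always inactive" model are hypothetical and are not taken from any of the reviewed studies.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")
    sensitivity = tp / (tp + fn)  # share of true actives that were found
    specificity = tn / (tn + fp)  # share of true inactives that were found
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced screening set: 10 actives (1) and 90 inactives (0).
y_true = [1] * 10 + [0] * 90
# A trivial model that labels every chemical as inactive.
y_pred = [0] * 100

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

In this toy case the trivial model reaches 90% accuracy without identifying a single active compound, whereas its balanced accuracy is no better than chance.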

All of the models for TPO inhibition were developed using structurally heterogeneous datasets of chemicals (see Table 1 and Figure 5). Rosenberg et al. [121] developed two robust PLR-based QSAR models, named QSAR1 and QSAR2, using an initial selection of predefined molecular descriptors and training set-dependent scaffolds. The authors evaluated seven different modelling strategies, including approaches that used scaffolds and those that did not, in both single and composite models. The most successful strategy was a composite model that uniquely combined a single, unbalanced model with balanced sub-models from a composite one. This strategy was found to be particularly effective in handling the challenges posed by imbalanced datasets, and led to the final QSAR1 and QSAR2 models (with a cross-validation balanced accuracy equal to 80.6% and 82.7%, respectively). Similarly, Seo et al. [107] developed binary, ternary, and quaternary QSAR models. They applied multiple algorithms, such as RF, SVM, artificial NN, Adaptive Boosting (AdaB) and XGB, and hard- and soft-voting classifiers. Each algorithm was combined with multiple categories of fingerprints (FPs) (e.g., Morgan FPs, Atom Pair Count FPs) and dimensionality reduction techniques (i.e., principal component analysis (PCA) and LDA) to address overfitting. Atom Pair Count FPs were the best-performing FP type, whereas the best-performing binary, ternary, and quaternary models were the hard-voting classifier, XGB with LDA, and the soft-voting classifier, respectively (test scores equal to 0.66, 0.51, and 0.52, respectively). Gadaleta et al. [120] applied multiple algorithms, including SVM, balanced RF, RF, and kNN, and explored different partitioning schemes to stratify and select active compounds in different ways. The top-performing models were based on balanced RF and kNN using a dataset that excluded compounds with an ambiguous active categorization. 
The models achieved a balanced accuracy of 76–78% on external data, a performance comparable to the reported experimental variability of the assay used to generate the modelled data.
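
Because balanced accuracy is the metric most often quoted in this section, a minimal sketch of its calculation may be useful. It assumes a binary endpoint coded as 1 (active) and 0 (inactive); the function name and the toy labels are illustrative only and are not taken from any of the reviewed studies.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)  # fraction of actives correctly predicted
    specificity = tn / (tn + fp)  # fraction of inactives correctly predicted
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy set: 2 actives and 8 inactives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # one active is missed

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_accuracy = balanced_accuracy(y_true, y_pred)
```

In this toy case the plain accuracy looks high because the inactive class dominates, whereas the balanced accuracy is pulled down by the missed active, which is why the metric is generally preferred for the imbalanced datasets discussed here.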

Regarding TR binding, Bai et al. [122] developed classification QSARs based on twenty-two PCBs using LDA and SVM. Both models showed the same accuracy on the training set (88.2%), while the SVM model achieved the higher accuracy on the test set (80%). Akinola et al. [91] developed classification models applying LR and LDA on a dataset of sixty-eight OH-PCBs, showing that both methods performed identically (accuracies in the training set and in the test set equal to 84.3% and 76.5%, respectively). Yan et al. [123] developed ternary classification models applying LDA, classification and regression trees (CART), and SVM on a dataset of structurally heterogeneous compounds. SVM proved to be the optimal algorithm, with a total accuracy in the training and test set equal to 81.4% and 76.5%, respectively. Sapounidou et al. [113] proposed a comprehensive set of twenty-three QSAR models for various MIEs related to endocrine disruption, including TRβ binding, utilising the conformal prediction (CP) framework combined with RF as the modelling algorithm. Five different data-balancing techniques were compared (for more details, see below), with CP proving the most effective. A balanced accuracy equal to 0.78 was achieved.

As for TPO, all of the models for TSHR inhibition were developed using datasets of structurally heterogeneous chemicals. Xu and colleagues [106] developed binary classification models comparing three different algorithms: RF, XGB, and LR. Both RF and XGB models showed good predictive performances, with balanced accuracies of 0.85 and 0.84, respectively. The authors further developed a simplified RF model using the seven most influential descriptors, which maintained strong performance (balanced accuracy equal to 0.83). Additionally, they developed regression models: an initial MLR model yielded an R² of 0.35, whereas a subsequent XGB model increased the R² to 0.65. Later, Liu et al. [95] explored various combinations of seven molecular representations (including different types of FPs and Mordred descriptors) and four algorithms (RF, SVM, multilayer perceptron, and graph attention network). The best-performing model was an RF using PubChem FPs, which achieved a balanced accuracy of 0.94 on the validation set.

Regarding TTR binding, Zhang et al. [87] developed QSAR classification models applying kNN, PLS discriminant analysis (PLS-DA), and SVM. The kNN model, with a k-value of 4, showed the best performance, achieving the highest correct classification rate during both internal and external validation (0.88 and 0.82, respectively). Similarly, Rybacka et al. [137] tested seven different machine learning methods (MLR, PLS, associative NN, kNN, RF, SVM, and fast stepwise (stagewise) multivariate linear regression) and five distinct descriptor sets. The best result was obtained by combining the associative NN algorithm with Dragon descriptors, which achieved a balanced accuracy equal to 89%.

Finally, de Lomana et al. [119] used five different algorithms (i.e., LR, RF, XGB, SVM, and NN) in combination with three class-balancing techniques (see below for more details) to predict multiple MIEs. All algorithms performed similarly, with a tendency for the models trained on over-sampled data to achieve better results. Balanced accuracies ranged from 0.68 to 0.82 for different endpoints.

Although the use of diverse algorithms was evident (Figure 6), no clear temporal trend was observed in the type of modelling algorithms employed. This suggested a consistent application of both established and newer algorithms across different publication years, rather than a gradual shift toward more complex techniques. Interestingly, despite the recent advancements in machine learning and deep learning approaches, classical algorithms such as MLR, LDA, and PLS continued to be widely used, given their interpretability and simplicity. Indeed, their added value stems from providing easily understandable models that offer direct insights into the structural features driving the activity, which is a key aspect to enhance confidence in QSARs. In contrast, the “black box” nature of more complex algorithms makes them less transparent and, if not adequately controlled, potentially more susceptible to overfitting [156]. Therefore, an important research direction is to leverage the power of complex algorithms by focusing on developing methods that enhance their interpretability and transparency, thereby increasing user confidence and facilitating their broader adoption.

Establishing a clear link between the algorithm type and a specific endpoint proved challenging, as most endpoints have been assessed by a few, or even no, QSARs. Regarding TTR and TR, the two most modelled endpoints, algorithms capable of handling linear relationships (e.g., MLR) and non-linear relationships (e.g., kNN) between independent and dependent variables were both utilised, with a slight preference for the second group.

An additional methodological aspect observed across the studies was the application of class-balancing strategies. These are crucial to address class imbalance, where one class (e.g., inactive compounds) is much more common than another (e.g., active compounds) in a training dataset. This imbalance is frequently found in data from databases or generated through HTS and can cause a model to become biased toward the majority class, leading to a poor performance with the minority class. This is especially critical in hazard prediction, where mistakenly predicting a dangerous compound as safe is a far more serious error than the opposite. Several effective strategies exist and were observed in the reviewed studies. Sapounidou et al. [113] explored five different data-balancing techniques in combination with RF: CP, equal size sampling (under-sampling), over-sampling by duplication, synthetic minority over-sampling technique, and random over-sampling examples. As described above, CP was the best choice. de Lomana et al. [119] combined five different algorithms (i.e., LR, RF, XGB, SVM, and NN) and three class-balancing techniques: weight balancing, over-sampling, and under-sampling. The models trained on over-sampled data tended to achieve better results. Xu et al. [106] employed the synthetic minority over-sampling technique-edited nearest neighbours (SMOTEENN) technique, which combines over-sampling the minority class samples with under-sampling the majority class samples to achieve a more balanced distribution. Gadaleta et al. [104,120] developed models using balanced RF, which is an adaptation of the more traditional RF that incorporates the internal balancing of categories. Finally, Liu et al. [95] employed a threshold moving method. The increasing volume of HTS data highlights the critical importance of effective class-balancing strategies for enhancing the robustness and reliability of models built on these datasets.
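
Two of the simplest strategies named above, over-sampling by duplication and equal size under-sampling, can be sketched as follows. This is a minimal illustration in plain Python under stated assumptions: the function names, the fixed random seed, and the toy data are hypothetical, and the sketch does not reproduce the implementation of any reviewed study. Such resampling is typically applied to the training set only, so that the test set keeps its original class distribution.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(X, y):
    """Collect the descriptor rows belonging to each class label."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    return groups


def oversample_by_duplication(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    X_bal, y_bal = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_bal.extend(rows + extra)
        y_bal.extend([label] * target)
    return X_bal, y_bal


def undersample_to_equal_size(X, y, seed=0):
    """Randomly keep as many rows per class as the smallest class contains."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    X_bal, y_bal = [], []
    for label, rows in groups.items():
        X_bal.extend(rng.sample(rows, target))
        y_bal.extend([label] * target)
    return X_bal, y_bal


# Hypothetical training set: 2 actives (1) and 6 inactives (0), one descriptor each.
X_train = [[0.1], [0.2], [1.0], [1.1], [1.2], [1.3], [1.4], [1.5]]
y_train = [1, 1, 0, 0, 0, 0, 0, 0]

X_over, y_over = oversample_by_duplication(X_train, y_train)
X_under, y_under = undersample_to_equal_size(X_train, y_train)
counts_over = Counter(y_over)    # both classes now have 6 rows
counts_under = Counter(y_under)  # both classes now have 2 rows
```

Over-sampling keeps all the experimental data at the cost of repeated compounds, whereas under-sampling discards majority-class compounds; the synthetic approaches mentioned above generate new minority-class points instead of exact duplicates.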

3.6. Validation Strategies

Validation stands as a crucial step in QSAR model development, ensuring the appropriateness of goodness-of-fit, overall robustness, and predictive ability, thereby ultimately maximising the model’s reliability [157,158,159]. Validation procedures can be distinguished as internal and external. Internal validation is conducted to evaluate the robustness and the predictive ability of a QSAR on the data from which it was developed (i.e., training set). External validation, on the other hand, is conducted to evaluate the actual predictive ability of a QSAR on data not used for its development (i.e., test set). Thus, external validation is of key importance as it assesses the model’s true predictive power using unseen data [157,158,160]. Although the best strategy to perform external validation involves the use of completely new and independent datasets, obtaining these datasets is often challenging given the scarcity of available experimental data. Therefore, a common practice is to partition the available data into a training set and into a test set (i.e., dataset splitting) [157,158,161].

All of the QSARs reviewed in this study underwent some form of internal and/or external validation. Internal validation was performed to evaluate the robustness of seventy-two QSARs, accounting for approximately 84% of the total. The most frequent internal validation strategy was k-fold cross-validation (CV), which was used in fifty-five instances. This approach involves splitting the training set into k equally sized groups. The method iteratively trains a model on k − 1 groups and validates it on the remaining group. This is repeated k times, such that each group serves as the validation set once [162]. In the reviewed studies, k values were typically set to 2, 5, or 10. In some instances, this strategy was referred to as leave-more-out CV (LMO CV) or as leave-one-out CV (LOO CV). The latter is the limiting case of k-fold CV, where k equals the number of training compounds and each compound of the training set is removed one at a time, and it was employed in twenty instances. Finally, the stratified bagging method was used in one instance [137], where k-fold CV was also tested. Additionally, regression-based QSARs often underwent further internal validation strategies, such as the QUIK rule [163] to detect high predictor collinearity, Y-randomisation to detect chance correlations [164], and the use of the bootstrapping coefficient [165]. Details about each model are reported in Table S1. External validation was performed to evaluate the predictive ability of eighty-three QSARs, representing almost all of them. It is important to highlight that for the three QSARs where external validation was not conducted, this omission was not an oversight; rather, it was intentional and adequately justified by the authors [93,121]. For example, Rosenberg et al. [121] proposed two QSARs for TPO inhibition, named QSAR1 and QSAR2, which were developed using two independent datasets. QSAR1 was developed using one dataset as the training set and the other one as the test set for external validation. 
QSAR2, instead, was developed by merging both datasets to form a larger training set; although it used the same modelling method and CV approaches as QSAR1, it purposely lacked external validation. Both QSARs showed good performances and were applied for broader screening purposes. In the study by Gallagher et al. [93], given the small size of the dataset (twenty-two compounds), the “Small Dataset Modeler” tool proposed by Ambure et al. [166] was utilised to facilitate an exhaustive double CV approach that uses the entire dataset without requiring splitting it into a training set and a test set, making it an effective and suitable solution to validate QSARs based on limited data. Beyond these three specific exceptions, data partitioning into a training and test set was performed by employing various splitting strategies. Random splitting is a frequently adopted strategy [167], and its prevalence was also observed in this review, where it was applied in sixty-three instances. While straightforward, this procedure can lead to an uneven data distribution, particularly when dealing with small-sized datasets or with skewed class distributions [168,169,170]. This imbalance might ultimately result in training and test sets that deviate from the representativeness suggested by Golbraikh et al., who argued that using rationally selected training and test sets can enhance QSAR reliability [171]. Alternative partitioning strategies have been proposed and used for a strategic selection of training and test set compounds [157,168]. In this review, strategies alternative to random splitting included those based on sorted response variables [53,90,133,144] or on the Kennard–Stone algorithm [87,91,172]. The details about each model are reported in Table S1.
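
The k-fold procedure described in this section can be sketched as an index generator. This is a minimal illustration that assumes the compounds have already been shuffled (or otherwise ordered as desired); the function name is hypothetical, and setting k equal to the number of compounds gives the leave-one-out case.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Distribute any remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# 5-fold CV on a hypothetical training set of 10 compounds.
folds = list(k_fold_indices(10, 5))

# Leave-one-out CV on 4 compounds: k equals the number of compounds.
loo_folds = list(k_fold_indices(4, 4))
```

Each compound appears in exactly one validation fold, so every training-set compound is predicted once by a model that did not see it during fitting.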

The diverse array of splitting strategies reflected the fact that no single partitioning scheme is widely regarded as ideal. Instead, the choice depends on the specific dataset type, its size, and the modelling methodology employed in the study [168]. Encouragingly, with only a few noted exceptions, the predominant practice across the reviewed studies involved the combined application of both internal and external validation. This robust approach, which was utilised in 80% of the QSAR models, suggested a strong commitment within the field to ensuring model validity and reliability.
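
As an example of a rational alternative to random splitting, the Kennard–Stone algorithm mentioned in this section can be sketched as follows. The sketch assumes Euclidean distances in descriptor space and the usual formulation of the algorithm (start from the two most distant compounds, then repeatedly add the compound farthest from those already selected); the function name and toy data are hypothetical, and individual studies may differ in scaling or distance metric.

```python
import math


def kennard_stone_split(X, n_train):
    """Return (training_indices, test_indices) chosen by the Kennard-Stone rule."""
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of compounds")
    # Pairwise Euclidean distances between descriptor vectors.
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Step 1: seed the training set with the two most distant compounds.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    train = [first, second]
    candidates = [i for i in range(n) if i not in train]
    # Step 2: repeatedly add the candidate whose nearest selected compound is farthest away.
    while len(train) < n_train:
        next_idx = max(candidates, key=lambda i: min(dist[i][s] for s in train))
        train.append(next_idx)
        candidates.remove(next_idx)
    return train, candidates


# Hypothetical one-descriptor dataset of five compounds.
X = [[0.0], [1.0], [2.0], [9.0], [10.0]]
train_idx, test_idx = kennard_stone_split(X, 3)
```

Because the training set is built from the extremes inward, the test compounds tend to fall within the descriptor range spanned by the training compounds, which is the rationale for preferring such schemes over purely random selection on small datasets.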

3.7. Applicability Domains

A single QSAR model cannot accurately predict the entire chemical universe [173]. Thus, each QSAR needs to be associated with a clearly defined AD. This domain determines whether a QSAR model can provide reliable or unreliable predictions (i.e., extrapolations) based on the structural, physicochemical, and response information present in the training set of the model [157,158,174]. No single, universally accepted method exists for defining the AD of a QSAR model. Instead, a range of methodologies are utilised, each offering a distinct approach [175,176]. These methods can differ in their restrictiveness and can yield either categorical outcomes (e.g., a simple “in” or “out” of the AD) or continuous values (e.g., distance) quantifying the relative position of a compound to the AD boundaries or centre [157].

An alarming finding was that the AD was not explicitly defined for thirty-two QSARs, accounting for approximately 37% of the total models. Among these, it is worth highlighting the studies by Bai et al. [122] and by Akinola et al. [91]. Bai et al. [122] developed two QSARs using a training set of twenty-two PCBs, which were then applied to predict TR binding for the remaining PCB congeners. Similarly, Akinola et al. [91] developed two QSARs based on TR binding data for sixty-eight mono-hydroxylated PCBs. While the ADs of these models were not formally defined, it is reasonable to assume that they were implicitly limited to these specific chemical classes, which comprise a relatively small number of well-defined congeners. In the publication by de Lomana et al. [119], nine different QSARs were developed without defining their ADs a priori. Instead, ADs were assessed post hoc in terms of the Tanimoto coefficient by comparing the chemical space covered by the training sets with the chemical spaces covered by well-known datasets of pesticides, cosmetics, and drugs.
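
A similarity-based check of this kind can be sketched with the Tanimoto coefficient computed on binary fingerprints. In this minimal illustration each fingerprint is represented as the set of its "on" bit positions; the helper names, the toy fingerprints, and the 0.5 threshold are assumptions made for the example and are not values reported by the reviewed studies.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(a | b)


def max_similarity_to_training(query_fp, training_fps):
    """Similarity of a query compound to its nearest training-set neighbour."""
    return max(tanimoto(query_fp, fp) for fp in training_fps)


def within_similarity_domain(query_fp, training_fps, threshold=0.5):
    """Flag a query as covered if its nearest neighbour reaches the threshold."""
    return max_similarity_to_training(query_fp, training_fps) >= threshold


# Hypothetical fingerprints (sets of on-bit indices).
training_fps = [{1, 2, 3, 4}, {10, 11, 12}]
similar_query = {1, 2, 3, 5}   # shares three bits with the first training compound
dissimilar_query = {20, 21}    # shares no bits with any training compound

covered = within_similarity_domain(similar_query, training_fps)
not_covered = within_similarity_domain(dissimilar_query, training_fps)
```

Applied to whole datasets, the distribution of such nearest-neighbour similarities gives a rough picture of how much of an external chemical space (e.g., pesticides or drugs) lies close to the training set.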

Despite its critical importance for ensuring the reliability of predictions for new chemicals, the AD was explicitly defined for only fifty-four QSARs (see Table S1). As illustrated in Figure 7, a variety of methodologies were used for the AD definition of both classification and regression QSARs, with some studies integrating multiple approaches and others relying on a single method. While some methods were used more frequently, others appeared in only a single instance.

The leverage approach was the most frequently used method, employed in approximately 44% of the QSAR models. This method was used either as a standalone technique [52,53,90,112,144,145,147,148] or in combination with other approaches [57,59,114], often complemented by the Williams plot as a graphical support for AD visualisation [52,53,57,59,90,112,114,144,147,148]. For example, in two different studies by Yang et al. [57,59] the leverage approach was combined with the Euclidean distance-based method to define the AD boundaries of two regression QSARs for TTR binding. In another study [114], Yang et al. used the same approach as before and included the Tanimoto similarity index to assess the reliability of external predictions for four regression QSARs for TTR binding; in the same study, they combined the Euclidean distance-based method with the Tanimoto similarity index to define the AD of five classification QSARs. The Euclidean distance method was additionally employed to define the AD of two QSARs developed by Kar et al. [133] and one QSAR by Kovarich et al. [146]. Kar et al. [133] combined it with the standardisation-based technique [173], while Kovarich et al. [146] combined it with the range of descriptor values in the training set. Zhang et al. [87] employed the Hotelling T² test to measure the distance of new compounds from the centre of the training set in descriptor space in order to define the AD. Rybacka and colleagues [137] used a PCA to define the chemical space of the training set based on selected molecular descriptors, and then calculated the distance-to-the-model (DModX) value for each compound. Methodologies less common than distance-based approaches were applied in six distinct publications [51,95,113,120,121,138]. Toropova et al. [138] defined the ADs of three QSARs according to the prevalence of local and global SMILES attributes in the training and validation sets, as proposed in their earlier publication [177]. Both Gadaleta et al. 
[120] and Rosenberg et al. [121] defined the ADs of their models in terms of the posterior probability of the predictions, with Rosenberg et al. [121] integrating the study of posterior probabilities with the Tanimoto similarity index. Liu et al. [95] characterised the AD in terms of weighted similarity density (ρ_s) and weighted inconsistency of activities (I_A) (AD_SAL{ρ_s, I_A}). Finally, both Dracheva et al. [51] and Sapounidou et al. [113] employed the CP framework to define the AD. As described in the studies, CP quantifies the uncertainty of predictions by providing similarity scores, also termed nonconformity scores, which can then be used to determine whether query compounds fall inside or outside the AD of a model.
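
To illustrate how nonconformity scores can be turned into an in/out decision, the following is a minimal sketch of class-conditional (Mondrian-style) inductive conformal prediction. It assumes that a nonconformity score has already been computed for each calibration compound and for each candidate label of the query (for instance, one minus the predicted class probability); the function names, the toy scores, and the significance level are hypothetical and do not reproduce the set-up of the cited studies.

```python
def conformal_p_value(calibration_scores, query_score):
    """Share of calibration scores at least as nonconforming as the query."""
    at_least_as_strange = sum(1 for s in calibration_scores if s >= query_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_by_class, query_scores_by_class, significance=0.2):
    """Labels whose p-value exceeds the chosen significance level."""
    return {
        label
        for label, score in query_scores_by_class.items()
        if conformal_p_value(calibration_by_class[label], score) > significance
    }


# Hypothetical calibration nonconformity scores for each class.
calibration_by_class = {
    "active": [0.1, 0.2, 0.3, 0.4],
    "inactive": [0.1, 0.2, 0.3, 0.4],
}
# Hypothetical scores of one query compound under each candidate label.
query_scores = {"active": 0.25, "inactive": 0.75}

p_active = conformal_p_value(calibration_by_class["active"], query_scores["active"])
labels = prediction_set(calibration_by_class, query_scores, significance=0.2)
```

A single-label set corresponds to a confident prediction, whereas an empty set or a set containing both labels signals that the model cannot discriminate for that compound; one common reading is to treat such cases as lying outside the AD.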

Overall, a critical finding was the pronounced lack of QSAR models associated with a clearly defined AD. A clear definition of the AD is fundamental to increase confidence in the reliability of QSAR predictions and to accurately assess the degree of extrapolations. Without a defined AD, QSAR models risk being applied incorrectly and outside their intended scope, which can lead to the misuse of the tool and, ultimately, unreliable predictions.
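
For reference, the leverage approach, the most frequently used AD method among the reviewed models, is usually formulated as shown below. The notation and the warning threshold follow the convention commonly adopted in the QSAR literature and are given here as an assumption, since the individual studies may use slightly different cut-offs.

```latex
h_i = \mathbf{x}_i^{\top} \left( \mathbf{X}^{\top} \mathbf{X} \right)^{-1} \mathbf{x}_i ,
\qquad
h^{*} = \frac{3\,(p + 1)}{n}
```

Here, x_i is the descriptor vector of compound i, X is the descriptor matrix of the training set, n is the number of training compounds, and p is the number of descriptors in the model. A compound with h_i > h* is structurally distant from the training set, and its prediction is regarded as an extrapolation. In the Williams plot, standardised residuals are plotted against h_i, so that response outliers and structurally influential compounds can be identified at the same time.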

3.8. Molecular Descriptors: Mechanistic Interpretations and Feature Importance

Molecular descriptors are numerical representations of molecular structures and serve as independent variables in QSAR models. Thousands of molecular descriptors have been developed, reflecting the varied complexity of chemical structural representation. Molecular descriptors range from simple molecular properties (e.g., molecular weight (MW)) to highly complex ones (e.g., quantum chemical descriptors) [178]. Multiple types of software, either open or commercial, are available for their calculation [179].

Across the examined studies, an extensive range of molecular descriptors and software for their calculation was observed. The full list of software and molecular descriptors used for each model is provided in Table 2, where models are presented for each MIE to facilitate direct comparison. These descriptors encompassed multiple categories, including physicochemical properties and FPs, as well as constitutional, topological, electronic, and quantum chemical descriptors. It was a common practice to combine multiple software tools or libraries within a single study to compute different types of molecular descriptors.

The mechanistic interpretation of a QSAR model is critically important because it allows for the identification of the chemical properties or structural features that most significantly contribute to the predicted endpoint, enhancing the scientific credibility and acceptance of predictions [157,158]. Furthermore, it can offer new insights into the molecular features driving the modelled activity, hence contributing to safe-by-design approaches. However, mechanistic interpretation is not always straightforward. This is often due to the challenging interpretability of certain molecular descriptors or the complexity of the algorithms used in model development [184]. To overcome this, feature importance techniques are often employed to pinpoint the most influential molecular descriptors among many, since not all descriptors contribute equally.

Mechanistic interpretation or the application of feature importance techniques was conducted for fifty-six QSARs selected in this review, accounting for approximately 65% of the total models. These approaches were applied across six specific MIEs: TTR, TR, TSHR, TPO, TBG, and albumin. The decision to conduct a straightforward mechanistic interpretation of selected molecular descriptors or to apply feature importance techniques was contingent upon various factors, including the type of modelling methodology employed, the chemical nature of the compounds modelled, and the specific types of molecular descriptors used. Relevant descriptors are influenced by the structural characteristics included in the dataset, which in turn depends on whether the dataset is composed of structurally heterogeneous chemicals or of compounds from a single chemical class.

Interpreted QSARs for TTR binding were either based on heterogenous organic chemicals or specific chemical classes, including halogenated phenols and thiophenols, PFAS and/or PFCs, and PBDEs and their hydroxylated metabolites. A strong consensus on the fundamental molecular properties influencing TTR binding was revealed, although different descriptors were used to represent those properties, highlighting the fact that various computational methods can effectively encode the same critical structural information. The most significant and consistently identified structural features were aromatic rings, halogen atoms, and hydroxyl groups. Examples of descriptors encoding for these features were nArOH (number of aromatic hydroxyls) and nX (number of halogen atoms), which consistently showed a positive correlation with TTR binding affinity. These can be referred to as “structural alerts”, as their presence recalls the chemical structure of THs like T3 and T4. In addition, hydrophobicity was consistently recognised as a critical property driving TTR binding. Descriptors encoding for this property, such as logP and log DOW (pH = 7.40), were repeatedly selected in various QSARs. The hydrophobic nature of the TTR binding site for T4 justifies this observation [185]. Furthermore, descriptors like a_don, nHDon, and H-050 were selected to encode for hydrogen bond donor capacity, thereby emphasising the role of noncovalent interactions, such as hydrogen bonding and electrostatic interactions, between ligands and TTR. Furthermore, a consensus on the most significant features determining the TTR binding by PFAS (or PFCs) was highlighted across studies addressing this class of chemicals. These were mainly represented by the carbon chain length, MW and dimension, and terminal functional groups. An intermediate carbon chain length was found to be optimal for TTR binding. This information was encoded by descriptors like HATS6m and F06[C-O]. 
The most active PFAS were found to have an MW between 300 and 500 g/mol, as captured by the AMW descriptor. HATS6m, which encodes for molecular shape and dimension, was used to distinguish the activity of compounds with similar molecular weights. F07[C-O] and nH were used to account for the presence of carboxylic or sulfonic acid terminal groups at a particular topological distance and to differentiate compounds based on their terminal functional group. As seen before, hydrophobicity was still recognised as a critical property driving TTR binding. A broad spectrum of molecular descriptors was used across these studies, encompassing quantum chemical and electronic descriptors, topological, structural, and constitutional ones, as well as functional group counts and logKOW. It was often observed that the same groups of descriptors were employed across different studies, especially when conducted by the same research groups. While all studies converged on similar key features for TTR binding, it is worth noting that the specific choice and subsequent interpretation of descriptors could be influenced by a research group’s preferred modelling tools, their expertise, and their background. This implies that while the underlying findings may be consistent, their description might vary in the level of detail, depending on the specific approach adopted by the group.

As seen for TTR, interpreted QSARs for TR binding predictions were either based on heterogenous organic chemicals or specific classes. These included PCBs and their hydroxylated metabolites, PBDEs, PCNs, and PFAS. Similarly to TTR, TR modelling was performed using a wide array of molecular descriptors, and mechanistic interpretations were either more general or detailed. The presence and quantity of halogen atoms were consistently identified as being critical for TR binding. This information was encoded in descriptors like X% (percentage of halogen atoms) and nBr or nCl (number of bromine or chlorine atoms, respectively), which showed a positive correlation with TR binding affinity. In addition, molecular polarity was identified as a relevant property. Descriptors like EEig03d and EEig06d (edge adjacency indices weighted by dipole moments) and μ and μ2 (dipole moments) were positively correlated with TR binding, indicating that an increase in polarity could enhance affinity to TR. For PFAS, an optimal chain length was identified as a key determinant of TR binding, showing a moderate to high probability of binding for longer chains. Hydrophobicity was another key property consistently identified as a positive contributor to TR binding. Finally, electronic descriptors were selected to encode for the ability of a compound to accept or donate electrons and form hydrogen bonds with TR.

The interpretation of five QSARs for TPO inhibition by Seo et al. [107], Gadaleta et al. [120], and Rosenberg et al. [121] yielded convergent results, despite the use of different software to calculate molecular descriptors and different types of descriptors. Aromatic structures, either hydroxylated (e.g., phenols) or non-hydroxylated (e.g., anilines), and various heteroatoms (including nitrogen, oxygen, sulphur, and halogens) were highlighted as key structural features for TPO inhibition. These structural features often mimic typical endogenous targets of TPO, like tyrosine residues, thereby exerting disrupting effects [120]. Additionally, the lipophilic nature of a compound was identified as a critical property. Furthermore, valuable insights were provided regarding typical structural features found in non-TPO inhibitors [121], including ethers, esters, aryl halides, and tertiary amines. All of these findings offered a comprehensive picture of which structural features and/or properties either contribute to, or detract from, TPO inhibition.

Regarding TSHR, two QSARs by Liu et al. [95] and Xu et al. [106] were interpreted. According to Xu et al. [106], the inhibitory effect of compounds on the TSHR is primarily influenced by two key chemical descriptors: the probability of water solubility (encoded by the descriptor “Sw < 0.1 mg/mL probability”) and lipophilicity (encoded by the descriptor “log D (pH = 7.4)”). The probability of water solubility was identified as the most influential factor because compounds must be able to diffuse through blood or body fluids to reach their biological target. Compounds with very low water solubility are less likely to be transported effectively, thus limiting their TSHR inhibitory potential. High lipophilicity was highlighted as key for TSHR inhibition, since this property describes the ability of compounds to penetrate the cell membrane and reach the transmembrane domain of TSHR. Nevertheless, several other molecular descriptors, reported in Table 2, were considered to account for additional factors, including dissociation properties, molecular flexibility, and electronic interactions. Liu et al. [95] employed the Shapley additive explanation (SHAP) technique to quantitatively assess the influence of each molecular feature, encoded as FPs, on TSHR agonism. While twenty different FPs were identified as having positive SHAP values, this analysis pointed out the contributions of lipophilicity and of aromatic and/or amino groups in promoting TSHR agonism.

TBG binding was modelled only by Yang et al., who developed four QSARs based on data for PBBs, including their mono-hydroxylated and di-hydroxylated metabolites [52]. The mechanistic interpretation indicated that hydroxylated metabolites exhibited a greater ability to bind to TBG, likely due to their capacity to establish hydrogen bonds or van der Waals interactions. Similarly, albumin binding was modelled only by Gallagher et al. [93], who developed three QSARs based on data for PFAS. Although they only provided indications about the positive or negative contributions of the selected molecular descriptors based on the signs of their coefficients, the authors concluded that PFAS with chain lengths shorter than ten carbons demonstrated a higher albumin binding affinity compared with those with longer carbon chains.

Overall, although these findings provided valuable insights into the main structural features and properties that may cause TH system disruption, a drawback is the limited emphasis or, in some instances, the complete absence of mechanistic interpretations or the application of feature importance techniques. This may limit the confidence in QSAR models among both scientists and regulatory bodies. Therefore, considering the increasing demand for mechanistically informed NAMs to advance chemical hazard assessments, future research should prioritise and dedicate resources to improving the mechanistic understanding of QSAR models in order to promote wider acceptance and trust in these methodologies.

Finally, this review showed the wide array of software and molecular descriptors used in QSAR studies for TH system disruption, underscoring the dynamic nature of the field. The diversity in approaches to descriptor calculation and selection indicates a lack of a single, standardised tool or method. Instead, the choice of software and methodology appeared to be driven by factors such as the expertise of researchers, tool accessibility, and prior experience with specific platforms.

3.9. Recent Advances: 2025

To keep this review up to date and to show how the field is evolving, this section provides a concise overview of the key models published between January and July 2025. Table 3 and Table 4 are included for quick reference.

Charest et al. [187] developed a QSAR model for TTR binding prediction using RF on a dataset of 853 compounds. The AD was defined using prediction entropy (PS), a metric derived from the probability outputs of the RF. A core strength of their modelling process was the adoption of a “mechanistic a priori” approach. They first analysed crystal structures of TTR and performed docking studies to examine how chemicals bind to it, which allowed them to select molecular descriptors known to be relevant to the binding mechanism. The chosen descriptors, obtained from the PaDEL descriptor library using the OPERA software v2.9 [192], included measures of hydrogen bonding (nHBacc, nHBDon), planarity (naAromAtom), and hydrophobicity (CrippenLogP), as well as more complex topological descriptors like ETA and ATSC to capture fine structural details. They also included ZMIC descriptors to account for the structural specificity of protein binding. After training the model, they used the permutation importance and mutual information methods to perform an a posteriori analysis confirming the importance of their selected features. The authors found that the descriptors related to aromatic structures and hydrophobicity were highly relevant for TTR binding, which aligned with their initial mechanistic hypothesis and was consistent with the mechanistic interpretations of the previous studies described in Section 3.8. This combination of a priori and a posteriori analysis turned the model from a “black box” into a transparent and interpretable tool that goes beyond the mere generation of predictions.

Janicka et al. [186] developed a QSAR model designed to predict albumin binding based on data for twenty-nine phenoxyacetic acid-derived congeners. The authors first applied biopartitioning micellar chromatography (BMC) to derive an in vitro lipophilicity descriptor (logkBMC), which was used with other descriptors to develop an MLR-QSAR. The leverage approach, with the use of the Williams plot for graphical visualisation, was used to define the AD. Three key molecular descriptors defined the model: logkBMC to encode lipophilicity, α to encode polarizability, and the sum of hydrogen bond donors (HBD) and acceptors (HBA). Based on the descriptors’ signs in the model’s equation, binding to albumin was found to increase with higher lipophilicity and polarizability and to decrease with a greater number of hydrogen bond donors and acceptors.

Two QSAR models were developed by Evangelista et al. [189] to predict the binding of PFAS to TTR, using a dataset of 134 PFAS. One classification model was developed using LDA, while one regression model was developed using MLR. To ensure robustness and avoid overfitting, the models were subjected to a rigorous validation protocol including randomization procedures and leave-one-out bootstrapping. The AD was defined differently for each model. For the classification model, the AD was defined in terms of distance (cosine α) and posterior probabilities of classification, and Shannon entropy was introduced to quantify the uncertainty associated with external predictions. For the regression model, the AD was defined using the leverage approach, with the Williams plot for graphical visualisation, and a prediction interval was introduced to quantify the uncertainty associated with external predictions. The classification model was characterised by GATS3e, ATSC6p, GATS8m, and MIC2, while the regression model was characterised by piPC5, GGI9, and AATSC0e. The selected descriptors were consistent with prior in vitro and in silico (docking) findings regarding the major drivers of PFAS binding to TTR. The findings highlighted the importance of hydrogen bond formation and hydrophobic interactions in establishing binding with TTR, as well as the relevance of lipophilicity, molecular weight, and chain length.

The study performed by Sosnowska et al. [191] also focused on the ability of PFAS to bind to TTR. The methodology involved the development of classification and regression QSAR models based on data for 45 PFAS. The classification model was developed using the decision tree classifier (DTC), and a single regression model was developed using MLR. The same algorithm was also used in a multiple regression model (MRM) approach, in which a total of thirty-one single MLR-QSARs were developed from different data splits. The AD for the classification model was defined using a boundary box method, while the leverage approach with the support of the Williams plot was employed to define the AD of the regression QSARs. The classification model highlighted molecular size and structural complexity as key factors, using descriptors like SM4_D and GATS3m. The MLR model emphasised that compounds with heavier and more polar atoms tended to be more active, with descriptors like AMW and GATS7p. It also identified the importance of specific structural features, such as fluorine atoms at a topological distance of 10 bonds (B10[F-F]). Three molecular descriptors were frequently selected in the models developed through the MRM approach (JGI10, ATSC7c, and MATS6i), which further confirmed the importance of atomic charge and polarity. The QSAR models generated via the MRM approach were not included in this review for simplicity because their performance was comparable to that of the single model developed independently. The collective findings consistently showed that molecular size, complexity, and polarity were the primary drivers of PFAS activity in disrupting TTR.

Finally, the study by Cirino et al. [193] is worth citing. Although their work did not propose any new QSAR models, its focus on optimising predictive performance through consensus modelling and on evaluating the robustness of existing models made it a significant contribution to the field. Specifically, this study served as a retrospective analysis of computational strategies from the Tox24 Challenge [194], which aimed to advance computational toxicology for predicting chemical binding to TTR by using a large dataset of 1512 compounds [188]. The primary goal of Cirino et al. [193] was to analyse the models developed by the nine top-performing teams from the Tox24 Challenge and to explore consensus strategies to enhance the predictive performance of single models. The participating teams adopted diverse strategies, with some relying on single-method models while others combined multiple approaches, such as descriptor-based and representation learning techniques. The study demonstrated that consensus modelling improved the predictive accuracy for TTR binding compared with individual models alone, achieving a lower error rate. Finally, an analysis was performed to identify functional groups overrepresented in compounds active for TTR binding. Unsurprisingly, groups similar to those of T4, such as phenols, aryl halides, and diarylethers, were highly frequent. Six other functional groups of potential concern were identified, including nitro compounds, arenes, and gem-trihalides.
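Two of the AD and uncertainty tools recurring in the 2025 studies above, the leverage approach behind the Williams plot and the Shannon entropy of predicted class probabilities, can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the function names are hypothetical, the intercept column is assumed to be part of the design matrix, the commonly used warning leverage h* = 3(p + 1)/n is assumed as the cut-off (p descriptors, n training compounds), and base-2 logarithms are assumed for the entropy.

```python
import numpy as np


def leverages(X_train, X_query):
    """Leverage h = x^T (X^T X)^-1 x of each query compound.

    X_train and X_query are 2D descriptor matrices (compounds x descriptors).
    Assumption: an intercept column is added to both before computing h.
    """
    Xt = np.column_stack([np.ones(len(X_train)), X_train])
    Xq = np.column_stack([np.ones(len(X_query)), X_query])
    xtx_inv = np.linalg.pinv(Xt.T @ Xt)
    # Row-wise quadratic form x_i^T (X^T X)^-1 x_i
    return np.einsum("ij,jk,ik->i", Xq, xtx_inv, Xq)


def warning_leverage(n_train, n_descriptors):
    """Assumed cut-off h* = 3(p + 1)/n, three times the mean training leverage."""
    return 3.0 * (n_descriptors + 1) / n_train


def shannon_entropy(probs):
    """Shannon entropy (in bits) of class probabilities along the last axis."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -np.sum(p * np.log2(p), axis=-1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(30, 3))          # synthetic descriptors, not real data
    X_new = np.array([[0.1, -0.2, 0.3], [10.0, 10.0, 10.0]])
    h_star = warning_leverage(n_train=30, n_descriptors=3)
    inside_ad = leverages(X_train, X_new) <= h_star
    print("inside AD:", inside_ad)
    print("entropy:", shannon_entropy([[0.5, 0.5], [0.95, 0.05]]))
```

In a Williams plot, these leverages would be plotted against standardised residuals, so that structurally distant compounds (h above h*) and response outliers can be inspected together. An entropy close to zero indicates a confident class assignment, while values near one bit (for two classes) flag uncertain predictions.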

4. Conclusions

This review highlighted the growing yet still-evolving landscape of QSAR models addressing MIEs leading to TH system disruption by chemical substances. While significant progress has been made, particularly due to the increased availability of HTS data, the field remains fragmented and challenges persist. The review showed a preference for classification-based models, which predict categorical outcomes rather than continuous toxicity values, and found that, despite the rise of complex machine learning methods, simpler algorithms continued to be employed for their interpretability and to promote broader adoption.

This review revealed that modelling efforts were predominantly focused on key MIEs like TR and TTR, followed to a much lesser extent by TPO and TSHR. Critically, many other relevant MIEs, including the three deiodinases, NIS, TRHR, TBG, and albumin, were poorly addressed in QSAR research. No QSAR modelling studies were found for MIEs related to DUOX, IYD, and pendrin inhibition, or for those associated with cellular TH transport (specifically MCT8, MCT10, OATP1C1, OATP1A4, MDR1, and MRP2), highlighting critical areas for future investigations. Similarly, a limited number of chemical classes were addressed, leading to a very small number of local QSARs. Notably, while validation strategies were consistently employed, explicitly defined ADs were frequently lacking. Without clear AD definitions, QSARs risk being applied outside their scope, which undermines confidence in decision-making. Furthermore, even though several types of molecular descriptors have been consistently identified as being relevant to model specific MIEs (e.g., TTR and TR), a limited emphasis on mechanistic interpretations was observed for many models, representing a critical drawback. However, the recent emergence of studies simultaneously covering multiple TH system-related endpoints demonstrated a growing awareness of the multifaceted nature of TH system disruption, offering a promising direction for aligning predictive modelling with AOP frameworks. The findings suggested a need for increased efforts in generating in vitro and in silico data for poorly addressed MIEs, broadening the chemical space of tested compounds, and ultimately developing new models.
The use of integrated in silico approaches, such as molecular docking and dynamics simulations, to generate activity data has proven to be an effective strategy for developing QSARs when experimental data are limited or unavailable, presenting a valuable path forward for exploring multiple MIEs and specific chemical classes. This would enable a more robust hazard assessment for entire groups of compounds. Future studies should prioritise the development of QSAR models with clearly defined ADs and enhanced mechanistic interpretability, in order to increase the reliability and transparency of their predictions, strengthen confidence in them, and ultimately promote their wider acceptance as effective NAMs for TH system disruption assessment.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/toxics13090799/s1.

Author Contributions

Conceptualization, M.E. and E.P.; formal analysis, M.E.; investigation, M.E.; data curation, M.E.; writing—original draft preparation, M.E. and E.P.; writing—review and editing, M.E. and E.P.; visualisation, M.E. and E.P.; supervision, E.P.; project administration, E.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the PhD programme in Chemical and Environmental Sciences (DiSCA) at the University of Insubria; PhD scholarship awarded to Marco Evangelista.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analysed in this study. The data supporting the findings of this study are included within the paper and Supplementary Materials. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

AD	Applicability domain
AdaB	Adaptive Boosting
AhR	Aryl hydrocarbon receptor
AOP	Adverse outcome pathway
ASNNs	Associative neural networks
BFR	Brominated flame retardant
BMC	Biopartitioning micellar chromatography
CAR	Constitutive androstane receptor
CART	Classification and regression trees
CP	Conformal prediction
DBP	Phenolic disinfection byproduct
DIO	Iodothyronine deiodinase
DIO1	Type 1 deiodinase
DIO2	Type 2 deiodinase
DIO3	Type 3 deiodinase
DTC	Decision Tree Classifier
DUOX	Dual oxidase
EDC	Endocrine-disrupting chemical
EU	European Union
EURL ECVAM	European Union Reference Laboratory for Alternatives to Animal Testing
FAIR	Findable, Accessible, Interoperable, Reusable
FP	Fingerprint
HPA	Hypothalamic–pituitary–adrenal
HPG	Hypothalamic–pituitary–gonadal
HPT	Hypothalamic–pituitary–thyroid
HTS	High-throughput screening
IYD	Iodotyrosine deiodinase
kNN	k-nearest neighbours
LDA	Linear discriminant analysis
LMO	Leave more out
LOO	Leave one out
LR	Logistic regression
MCT	Monocarboxylate transporter
MCT10	Monocarboxylate transporter 10
MCT8	Monocarboxylate transporter 8
MDR1	Multidrug resistance protein 1
MIE	Molecular initiating event
MLR	Multiple linear regression
MRM	Multiple regression model
MRP2	Multidrug resistance-associated protein 2
MW	Molecular weight
NAMs	New approach methodologies
NIS	Sodium iodide symporter
NN	Neural network
OATP	Organic anion transporter polypeptide
OATP1A4	Organic anion transporter polypeptide 1A4
OATP1C1	Organic anion transporter polypeptide 1C1
OECD	Organisation for Economic Co-operation and Development
PBB	Polybrominated biphenyl
PBDE	Polybrominated diphenyl ether
PCA	Principal component analysis
PCB	Polychlorinated biphenyl
PCN	Polychlorinated naphthalene
PFAS	Per- and polyfluoroalkyl substances
PFC	Poly- and perfluorinated compound
PLR	Partial logistic regression
PLS	Partial least squares
PLS-DA	Partial least squares discriminant analysis
PPAR	Peroxisome proliferator-activated receptor
PS	Prediction entropy
PXR	Pregnane X receptor
QSAR	Quantitative structure–activity relationship
RF	Random forest
SHAP	Shapley additive explanation
SMOTEENN	Synthetic minority over-sampling technique-edited nearest neighbours
SVM	Support vector machine
T3	Triiodothyronine
T4	Thyroxine
TBG	Thyroid-binding globulin
TH	Thyroid hormone
THSDC	Thyroid hormone system-disrupting chemical
TPO	Thyroperoxidase
TR	Thyroid hormone receptor
TRHR	Thyrotropin-releasing hormone receptor
TSHR	Thyroid-stimulating hormone receptor
TTR	Transthyretin
XGB	Extreme gradient boosting

References

Gore, A.C.; La Merrill, M.A.; Patisaul, H.B.; Sargis, R. Endocrine Disrupting Chemicals: Threats to Human Health. The Endocrine Society and IPEN. 2024. Available online: https://ipen.org/sites/default/files/documents/edc_report-2024-final-compressed.pdf (accessed on 22 June 2025).
World Health Organization/International Programme on Chemical Safety (WHO/IPCS). Global Assessment on the State of the Science of Endocrine Disruptors; World Health Organization: Geneva, Switzerland, 2002.
Ahn, C.; Jeung, E.-B. Endocrine-Disrupting Chemicals and Disease Endpoints. Int. J. Mol. Sci. 2023, 24, 5342.
World Health Organization/International Programme on Chemical Safety (WHO/IPCS). State of the Science of Endocrine Disrupting Chemicals 2012; World Health Organization: Geneva, Switzerland, 2012.
de Oliveira Santos, A.D.; do Nascimento, M.T.L.; de Freitas, A.d.S.; de Carvalho, D.G.; Bila, D.M.; Hauser-Davis, R.A.; Monteiro da Fonseca, E.; Baptista Neto, J.A. The Evolution of Endocrine Disruptor Chemical Assessments Worldwide in the Last Three Decades. Mar. Pollut. Bull. 2023, 197, 115727.
FAO. Exposure to Endocrine Disrupting Chemicals—Changes from 2002 to 2024; Food Safety and Quality Series, No. 30; FAO: Rome, Italy, 2024.
Diamanti-Kandarakis, E.; Bourguignon, J.-P.; Giudice, L.C.; Hauser, R.; Prins, G.S.; Soto, A.M.; Zoeller, R.T.; Gore, A.C. Endocrine-Disrupting Chemicals: An Endocrine Society Scientific Statement. Endocr. Rev. 2009, 30, 293–342.
Feldt-Rasmussen, U.; Effraimidis, G.; Klose, M. The Hypothalamus-Pituitary-Thyroid (HPT)-Axis and Its Role in Physiology and Pathophysiology of Other Hypothalamus-Pituitary Functions. Mol. Cell Endocrinol. 2021, 525, 111173.
Fekete, C.; Lechan, R.M. Central Regulation of Hypothalamic-Pituitary-Thyroid Axis Under Physiological and Pathophysiological Conditions. Endocr. Rev. 2014, 35, 159–194.
Sabatino, L.; Vassalle, C. Thyroid Hormones and Metabolism Regulation: Which Role on Brown Adipose Tissue and Browning Process? Biomolecules 2025, 15, 361.
Mullur, R.; Liu, Y.-Y.; Brent, G.A. Thyroid Hormone Regulation of Metabolism. Physiol. Rev. 2014, 94, 355–382.
De Luca, R.; Davis, P.J.; Lin, H.-Y.; Gionfra, F.; Percario, Z.A.; Affabris, E.; Pedersen, J.Z.; Marchese, C.; Trivedi, P.; Anastasiadou, E.; et al. Thyroid Hormones Interaction With Immune Response, Inflammation and Non-Thyroidal Illness Syndrome. Front. Cell Dev. Biol. 2021, 8, 614030.
Sawicka-Gutaj, N.; Zawalna, N.; Gut, P.; Ruchała, M. Relationship between Thyroid Hormones and Central Nervous System Metabolism in Physiological and Pathological Conditions. Pharmacol. Rep. 2022, 74, 847–858.
Bassett, J.H.D.; Williams, G.R. Role of Thyroid Hormones in Skeletal Development and Bone Maintenance. Endocr. Rev. 2016, 37, 135–187.
Silva, J.F.; Ocarino, N.M.; Serakides, R. Thyroid Hormones and Female Reproduction. Biol. Reprod. 2018, 99, 907–921.
Yamakawa, H.; Kato, T.S.; Noh, J.Y.; Yuasa, S.; Kawamura, A.; Fukuda, K.; Aizawa, Y. Thyroid Hormone Plays an Important Role in Cardiac Function: From Bench to Bedside. Front. Physiol. 2021, 12, 606931.
Brent, G.A. Mechanisms of Thyroid Hormone Action. J. Clin. Investig. 2012, 122, 3035–3043.
Alcaide Martin, A.; Mayerl, S. Local Thyroid Hormone Action in Brain Development. Int. J. Mol. Sci. 2023, 24, 12352.
Giannocco, G.; Kizys, M.M.L.; Maciel, R.M.; de Souza, J.S. Thyroid Hormone, Gene Expression, and Central Nervous System: Where We Are. Semin. Cell Dev. Biol. 2021, 114, 47–56.
Adu-Gyamfi, E.A.; Wang, Y.-X.; Ding, Y.-B. The Interplay between Thyroid Hormones and the Placenta: A Comprehensive Review. Biol. Reprod. 2020, 102, 8–17.
Moog, N.K.; Entringer, S.; Heim, C.; Wadhwa, P.D.; Kathmann, N.; Buss, C. Influence of Maternal Thyroid Hormones during Gestation on Fetal Brain Development. Neuroscience 2017, 342, 68–100.
Köhrle, J.; Frädrich, C. Thyroid Hormone System Disrupting Chemicals. Best Pract. Res. Clin. Endocrinol. Metab. 2021, 35, 101562.
Kortenkamp, A.; Axelstad, M.; Baig, A.H.; Bergman, Å.; Bornehag, C.-G.; Cenijn, P.; Christiansen, S.; Demeneix, B.; Derakhshan, A.; Fini, J.-B.; et al. Removing Critical Gaps in Chemical Test Methods by Developing New Assays for the Identification of Thyroid Hormone System-Disrupting Chemicals—The ATHENA Project. Int. J. Mol. Sci. 2020, 21, 3123.
Oliveira, K.J.; Chiamolera, M.I.; Giannocco, G.; Pazos-Moura, C.C.; Ortiga-Carvalho, T.M. Thyroid Function Disruptors: From Nature to Chemicals. J. Mol. Endocrinol. 2019, 62, R1–R19.
Calsolaro, V.; Pasqualetti, G.; Niccolai, F.; Caraccio, N.; Monzani, F. Thyroid Disrupting Chemicals. Int. J. Mol. Sci. 2017, 18, 2583.
Boas, M.; Feldt-Rasmussen, U.; Main, K.M. Thyroid Effects of Endocrine Disrupting Chemicals. Mol. Cell Endocrinol. 2012, 355, 240–248.
Salazar, P.; Villaseca, P.; Cisternas, P.; Inestrosa, N.C. Neurodevelopmental Impact of the Offspring by Thyroid Hormone System-Disrupting Environmental Chemicals during Pregnancy. Environ. Res. 2021, 200, 111345.
Alsen, M.; Sinclair, C.; Cooke, P.; Ziadkhanpour, K.; Genden, E.; van Gerwen, M. Endocrine Disrupting Chemicals and Thyroid Cancer: An Overview. Toxics 2021, 9, 14.
Olanrewaju, O.A.; Asghar, R.; Makwana, S.; Yahya, M.; Kumar, N.; Khawar, M.H.; Ahmed, A.; Islam, T.; Kumari, K.; Shadmani, S.; et al. Thyroid and Its Ripple Effect: Impact on Cardiac Structure, Function, and Outcomes. Cureus 2024, 16, e51574.
Brown, E.D.L.; Obeng-Gyasi, B.; Hall, J.E.; Shekhar, S. The Thyroid Hormone Axis and Female Reproduction. Int. J. Mol. Sci. 2023, 24, 9815.
Murk, A.J.; Rijntjes, E.; Blaauboer, B.J.; Clewell, R.; Crofton, K.M.; Dingemans, M.M.L.; David Furlow, J.; Kavlock, R.; Köhrle, J.; Opitz, R.; et al. Mechanism-Based Testing Strategy Using in Vitro Approaches for Identification of Thyroid Hormone Disrupting Chemicals. Toxicol. Vitr. 2013, 27, 1320–1346.
Crofton, K.M. Thyroid Disrupting Chemicals: Mechanisms and Mixtures. Int. J. Androl. 2008, 31, 209–223.
Bernasconi, C.; Sampani, S.; Beronius, A.; Coecke, S.; Langezaal, I.; Pistollato, F.; Paini, A.; Muñoz, A.; Asturiol, D.; Kienzler, A.; et al. Chemical Selection for the Thyroid Validation Study Coordinated by EURL ECVAM and Involving EU-NETVAL Laboratories. ALTEX—Altern. Anim. Exp. 2025.
European Commission. The European Green Deal 2019; European Commission: Brussels, Belgium, 2019.
European Commission. Chemicals Strategy for Sustainability Towards a Toxic-Free Environment 2020; European Commission: Brussels, Belgium, 2020.
Holmer, M.L.; Holmberg, R.D.; Despicht, C.; Bouftas, N.; Axelstad, M.; Beronius, A.; Zilliacus, J.; Van Duursen, M.; Svingen, T. Assessment of Endocrine Disruptors in the European Union: Current Regulatory Framework, Use of New Approach Methodologies (NAMs) and Recommendations for Improvements. Regul. Toxicol. Pharmacol. 2025, 162, 105883.
Ramhøj, L.; Axelstad, M.; Baert, Y.; Cañas-Portilla, A.I.; Chalmel, F.; Dahmen, L.; De La Vieja, A.; Evrard, B.; Haigis, A.-C.; Hamers, T.; et al. New Approach Methods to Improve Human Health Risk Assessment of Thyroid Hormone System Disruption–a PARC Project. Front. Toxicol. 2023, 5, 1189303.
ECHA. Key Areas of Regulatory Challenge. 2025. Available online: https://echa.europa.eu/documents/10162/17228/key_areas_regulatory_challenge_2025_en.pdf/da33bf25-2b75-1fe9-c308-53043f9b9a28?t=1749466525527 (accessed on 22 June 2025).
OECD. Introduction. In Revised Guidance Document 150 on Standardised Test Guidelines for Evaluating Chemicals for Endocrine Disruption; OECD publishing: Paris, France, 2018; pp. 19–39.
European Commission. Commission Delegated Regulation (EU) 2017/2100 of 4 September 2017 Setting out Scientific Criteria for the Determination of Endocrine-Disrupting Properties Pursuant to Regulation (EU) No 528/2012 of the European Parliament and Council (Text with EEA Relevance); European Commission: Brussels, Belgium, 2017.
European Commission. Commission Regulation (EU) 2018/605 of 19 April 2018 Amending Annex II to Regulation (EC) No 1107/2009 by Setting out Scientific Criteria for the Determination of Endocrine Disrupting Properties (Text with EEA Relevance); European Commission: Brussels, Belgium, 2018.
European Commission. Commission Delegated Regulation (EU) 2023/707 of 19 December 2022 Amending Regulation (EC) No 1272/2008 as Regards Hazard Classes and Criteria for the Classification, Labelling and Packaging of Substances and Mixtures (Text with EEA Relevance); European Commission: Brussels, Belgium, 2022.
Ankley, G.T.; Bennett, R.S.; Erickson, R.J.; Hoff, D.J.; Hornung, M.W.; Johnson, R.D.; Mount, D.R.; Nichols, J.W.; Russom, C.L.; Schmieder, P.K.; et al. Adverse Outcome Pathways: A Conceptual Framework to Support Ecotoxicology Research and Risk Assessment. Environ. Toxicol. Chem. 2010, 29, 730–741.
Svingen, T.; Schwartz, C.L.; Rosenmai, A.K.; Ramhøj, L.; Johansson, H.K.L.; Hass, U.; Draskau, M.K.; Davidsen, N.; Christiansen, S.; Ballegaard, A.-S.R.; et al. Using Alternative Test Methods to Predict Endocrine Disruption and Reproductive Adverse Outcomes: Do We Have Enough Knowledge? Environ. Pollut. 2022, 304, 119242.
Cronin, M.T.D.; Richarz, A.-N. Relationship Between Adverse Outcome Pathways and Chemistry-Based In Silico Models to Predict Toxicity. Appl. Vitr. Toxicol. 2017, 3, 286–297.
Muratov, E.N.; Bajorath, J.; Sheridan, R.P.; Tetko, I.V.; Filimonov, D.; Poroikov, V.; Oprea, T.I.; Baskin, I.I.; Varnek, A.; Roitberg, A.; et al. QSAR without Borders. Chem. Soc. Rev. 2020, 49, 3525–3564.
Noyes, P.D.; Friedman, K.P.; Browne, P.; Haselman, J.T.; Gilbert, M.E.; Hornung, M.W.; Barone, S.; Crofton, K.M.; Laws, S.C.; Stoker, T.E.; et al. Evaluating Chemicals for Thyroid Disruption: Opportunities and Challenges with in Vitro Testing and Adverse Outcome Pathway Approaches. Environ. Health Perspect. 2019, 127, 095001.
Bernasconi, C.; Langezaal, I.; Bartnicka, J.; Asturiol, D.; Bowe, G.; Coecke, S.; Kienzler, A.; Liska, R.; Milcamps, A.; Munoz-Pineiro, M.A.; et al. Validation of a Battery of Mechanistic Methods Relevant for the Detection of Chemicals That Can Disrupt the Thyroid Hormone System; Publications Office of the European Union: Luxembourg, 2023.
Sellami, A.; Reau, M.; Montes, M.; Lagarde, N. Review of in Silico Studies Dedicated to the Nuclear Receptor Family: Therapeutic Prospects and Toxicological Concerns. Front. Endocrinol. 2022, 13, 986016.
Vergauwen, L.; Bajard, L.; Tait, S.; Langezaal, I.; Sosnowska, A.; Roncaglioni, A.; Hessel, E.; van den Brand, A.D.; Haigis, A.-C.; Novák, J.; et al. A 2024 Inventory of Test Methods Relevant to Thyroid Hormone System Disruption for Human Health and Environmental Regulatory Hazard Assessment. Open Res. Eur. 2024, 4, 242.
Dracheva, E.; Norinder, U.; Rydén, P.; Engelhardt, J.; Weiss, J.M.; Andersson, P.L. In Silico Identification of Potential Thyroid Hormone System Disruptors among Chemicals in Human Serum and Chemicals with a High Exposure Index. Environ. Sci. Technol. 2022, 56, 8363–8372.
Yang, L.; Sun, P.; Tao, L.; Zhao, X. An in Silico Study on Human Carcinogenicity Mechanism of Polybrominated Biphenyls Exposure. Chem.-Biol. Interact. 2024, 397, 111075.
Evangelista, M.; Chirico, N.; Papa, E. In Silico Models for the Screening of Human Transthyretin Disruptors. J. Hazard. Mater. 2024, 480, 136188.
Cao, J.; Lin, Y.; Guo, L.-H.; Zhang, A.-Q.; Wei, Y.; Yang, Y. Structure-Based Investigation on the Binding Interaction of Hydroxylated Polybrominated Diphenyl Ethers with Thyroxine Transport Proteins. Toxicology 2010, 277, 20–28.
Montaño, M.; Cocco, E.; Guignard, C.; Marsh, G.; Hoffmann, L.; Bergman, Å.; Gutleb, A.C.; Murk, A.J. New Approaches to Assess the Transthyretin Binding Capacity of Bioactivated Thyroid Hormone Disruptors. Toxicol. Sci. 2012, 130, 94–105.
Grimm, F.A.; Lehmler, H.-J.; He, X.; Robertson, L.W.; Duffel, M.W. Sulfated Metabolites of Polychlorinated Biphenyls Are High-Affinity Ligands for the Thyroid Hormone Transport Protein Transthyretin. Environ. Health Perspect. 2013, 121, 657–662.
Yang, X.; Ou, W.; Xi, Y.; Chen, J.; Liu, H. Emerging Polar Phenolic Disinfection Byproducts Are High-Affinity Human Transthyretin Disruptors: An in Vitro and in Silico Study. Environ. Sci. Technol. 2019, 53, 7019–7028.
Xi, Y.; Yang, X.; Zhang, H.; Liu, H.; Watson, P.; Yang, F. Binding Interactions of Halo-Benzoic Acids, Halo-Benzenesulfonic Acids and Halo-Phenylboronic Acids with Human Transthyretin. Chemosphere 2020, 242, 125135.
Yang, X.; Ou, W.; Zhao, S.; Wang, L.; Chen, J.; Kusko, R.; Hong, H.; Liu, H. Human Transthyretin Binding Affinity of Halogenated Thiophenols and Halogenated Phenols: An in Vitro and in Silico Study. Chemosphere 2021, 280, 130627.
Rosenmai, A.K.; Winge, S.B.; Möller, M.; Lundqvist, J.; Wedebye, E.B.; Nikolov, N.G.; Lilith Johansson, H.K.; Vinggaard, A.M. Organophosphate Ester Flame Retardants Have Antiandrogenic Potential and Affect Other Endocrine Related Endpoints in Vitro and in Silico. Chemosphere 2021, 263, 127703.
Ren, X.M.; Guo, L.-H. Assessment of the Binding of Hydroxylated Polybrominated Diphenyl Ethers to Thyroid Hormone Transport Proteins Using a Site-Specific Fluorescence Probe. Environ. Sci. Technol. 2012, 46, 4633–4640.
Ren, X.-M.; Qin, W.-P.; Cao, L.-Y.; Zhang, J.; Yang, Y.; Wan, B.; Guo, L.-H. Binding Interactions of Perfluoroalkyl Substances with Thyroid Hormone Transport Proteins and Potential Toxicological Implications. Toxicology 2016, 366–367, 32–42.
Ouyang, X.; Froment, J.; Leonards, P.E.G.; Christensen, G.; Tollefsen, K.-E.; de Boer, J.; Thomas, K.V.; Lamoree, M.H. Miniaturization of a Transthyretin Binding Assay Using a Fluorescent Probe for High Throughput Screening of Thyroid Hormone Disruption in Environmental Samples. Chemosphere 2017, 171, 722–728.
Qin, W.-P.; Li, C.-H.; Guo, L.-H.; Ren, X.-M.; Zhang, J.-Q. Binding and Activity of Polybrominated Diphenyl Ether Sulfates to Thyroid Hormone Transport Proteins and Nuclear Receptors. Environ. Sci. Process. Impacts 2019, 21, 950–956.
Ren, X.-M.; Yao, L.; Xue, Q.; Shi, J.; Zhang, Q.; Wang, P.; Fu, J.; Zhang, A.; Qu, G.; Jiang, G. Binding and Activity of Tetrabromobisphenol A Mono-Ether Structural Analogs to Thyroid Hormone Transport Proteins and Receptors. Environ. Health Perspect. 2020, 128, 107008.
Huang, K.; Wang, X.; Zhang, H.; Zeng, L.; Zhang, X.; Wang, B.; Zhou, Y.; Jing, T. Structure-Directed Screening and Analysis of Thyroid-Disrupting Chemicals Targeting Transthyretin Based on Molecular Recognition and Chromatographic Separation. Environ. Sci. Technol. 2020, 54, 5437–5445.
Hamers, T.; Kortenkamp, A.; Scholze, M.; Molenaar, D.; Cenijn, P.H.; Weiss, J.M. Transthyretin-Binding Activity of Complex Mixtures Representing the Composition of Thyroid-Hormone Disrupting Contaminants in House Dust and Human Serum. Environ. Health Perspect. 2020, 128, 017015.
van den Berg, K.J. Interaction of Chlorinated Phenols with Thyroxine Binding Sites of Human Transthyretin, Albumin and Thyroid Binding Globulin. Chem. Biol. Interact. 1990, 76, 63–75.
den Besten, C.; Vet, J.J.R.M.; Besselink, H.T.; Kiel, G.S.; van Berkel, B.J.M.; Beems, R.; van Bladeren, P.J. The Liver, Kidney, and Thyroid Toxicity of Chlorinated Benzenes. Toxicol. Appl. Pharmacol. 1991, 111, 69–81.
Lans, M.C.; Klasson-Wehler, E.; Willemsen, M.; Meussen, E.; Safe, S.; Brouwer, A. Structure-Dependent, Competitive Interaction of Hydroxy-Polychlorobiphenyls, -Dibenzo-p-Dioxins and -Dibenzofurans with Human Transthyretin. Chem. Biol. Interact. 1993, 88, 7–21.
Cheek, A.O.; Kow, K.; Chen, J.; McLachlan, J.A. Potential Mechanisms of Thyroid Disruption in Humans: Interaction of Organochlorine Compounds with Thyroid Receptor, Transthyretin, and Thyroid-Binding Globulin. Environ. Health Perspect. 1999, 107, 273–278.
Meerts, I.A.T.M.; van Zanden, J.J.; Luijks, E.A.C.; van Leeuwen-Bol, I.; Marsh, G.; Jakobsson, E.; Bergman, Å.; Brouwer, A. Potent Competitive Interactions of Some Brominated Flame Retardants and Related Compounds with Human Transthyretin in Vitro. Toxicol. Sci. 2000, 56, 95–104.
Sandau, C.D.; Meerts, I.A.T.M.; Letcher, R.J.; McAlees, A.J.; Chittim, B.; Brouwer, A.; Norstrom, R.J. Identification of 4-Hydroxyheptachlorostyrene in Polar Bear Plasma and Its Binding Affinity to Transthyretin: A Metabolite of Octachlorostyrene? Environ. Sci. Technol. 2000, 34, 3871–3877.
Chauhan, K.R.; Kodavanti, P.R.S.; McKinney, J.D. Assessing the Role of Ortho-Substitution on Polychlorinated Biphenyl Binding to Transthyretin, a Thyroxine Transport Protein. Toxicol. Appl. Pharmacol. 2000, 162, 10–21.
Legler, J.; Cenijn, P.H.; Malmberg, T.; Bergman, Å.; Brouwer, A. Determination of the Endocrine Disrupting Potency of Hydroxylated PCB’s and Flame Retardants with in Vitro Bioassays. Organohalogen Compd. 2002, 53–56. Available online: https://research.vu.nl/en/publications/determination-of-the-endocrine-disrupting-potency-of-hydroxylated (accessed on 16 April 2023).
Meerts, I.A.T.M.; Assink, Y.; Cenijn, P.H.; van den Berg, J.H.J.; Weijers, B.M.; Bergman, Å.; Koeman, J.H.; Brouwer, A. Placental Transfer of a Hydroxylated Polychlorinated Biphenyl and Effects on Fetal and Maternal Thyroid Hormone Homeostasis in the Rat. Toxicol. Sci. 2002, 68, 361–371. [Google Scholar] [CrossRef]
Maia, F.; Almeida, M.d.R.; Gales, L.; Kijjoa, A.; Pinto, M.M.M.; Saraiva, M.J.; Damas, A.M. The Binding of Xanthone Derivatives to Transthyretin. Biochem. Pharmacol. 2005, 70, 1861–1869. [Google Scholar] [CrossRef] [PubMed]
Hamers, T.; Kamstra, J.H.; Sonneveld, E.; Murk, A.J.; Kester, M.H.A.; Andersson, P.L.; Legler, J.; Brouwer, A. In Vitro Profiling of the Endocrine-Disrupting Potency of Brominated Flame Retardants. Toxicol. Sci. 2006, 92, 157–173. [Google Scholar] [CrossRef] [PubMed]
Harju, M.; Hamers, T.; Kamstra, J.H.; Sonneveld, E.; Boon, J.P.; Tysklind, M.; Andersson, P.L. Quantitative Structure–Activity Relationship Modeling on in Vitro Endocrine Effects and Metabolic Stability Involving 26 Selected Brominated Flame Retardants. Environ. Toxicol. Chem. 2007, 26, 816–826. [Google Scholar] [CrossRef] [PubMed]
Hamers, T.; Kamstra, J.H.; Sonneveld, E.; Murk, A.J.; Visser, T.J.; Van Velzen, M.J.M.; Brouwer, A.; Bergman, Å. Biotransformation of Brominated Flame Retardants into Potentially Endocrine-Disrupting Metabolites, with Special Attention to 2,2′,4,4′-Tetrabromodiphenyl Ether (BDE-47). Mol. Nutr. Food Res. 2008, 52, 284–298. [Google Scholar] [CrossRef]
Gales, L.; Almeida, M.R.; Arsequell, G.; Valencia, G.; Saraiva, M.J.; Damas, A.M. Iodination of Salicylic Acid Improves Its Binding to Transthyretin. Biochim. Biophys. Acta 2008, 1784, 512–517. [Google Scholar] [CrossRef]
Weiss, J.M.; Andersson, P.L.; Lamoree, M.H.; Leonards, P.E.G.; van Leeuwen, S.P.J.; Hamers, T. Competitive Binding of Poly- and Perfluorinated Compounds to the Thyroid Hormone Transport Protein Transthyretin. Toxicol. Sci. 2009, 109, 206–216. [Google Scholar] [CrossRef]
Hamers, T.; Kamstra, J.H.; Cenijn, P.H.; Pencikova, K.; Palkova, L.; Simeckova, P.; Vondracek, J.; Andersson, P.L.; Stenberg, M.; Machala, M. In Vitro Toxicity Profiling of Ultrapure Non–Dioxin-like Polychlorinated Biphenyl Congeners and Their Relative Toxic Contribution to PCB Mixtures in Humans. Toxicol. Sci. 2011, 121, 88–100. [Google Scholar] [CrossRef]
Simon, E.; Bytingsvik, J.; Jonker, W.; Leonards, P.E.G.; de Boer, J.; Jenssen, B.M.; Lie, E.; Aars, J.; Hamers, T.; Lamoree, M.H. Blood Plasma Sample Preparation Method for the Assessment of Thyroid Hormone-Disrupting Potency in Effect-Directed Analysis. Environ. Sci. Technol. 2011, 45, 7936–7944. [Google Scholar] [CrossRef] [PubMed]
Simon, E.; van Velzen, M.; Brandsma, S.H.; Lie, E.; Løken, K.; de Boer, J.; Bytingsvik, J.; Jenssen, B.M.; Aars, J.; Hamers, T.; et al. Effect-Directed Analysis To Explore the Polar Bear Exposome: Identification of Thyroid Hormone Disrupting Compounds in Plasma. Environ. Sci. Technol. 2013, 47, 8902–8912. [Google Scholar] [CrossRef] [PubMed]
Viluksela, M.; Heikkinen, P.; van der Ven, L.T.M.; Rendel, F.; Roos, R.; Esteban, J.; Korkalainen, M.; Lensu, S.; Miettinen, H.M.; Savolainen, K.; et al. Toxicological Profile of Ultrapure 2,2′,3,4,4′,5,5′-Heptachlorbiphenyl (PCB 180) in Adult Rats. PLoS ONE 2014, 9, e104639. [Google Scholar] [CrossRef] [PubMed]
Zhang, J.; Kamstra, J.H.; Ghorbanzadeh, M.; Weiss, J.M.; Hamers, T.; Andersson, P.L. In Silico Approach To Identify Potential Thyroid Hormone Disruptors among Currently Known Dust Contaminants and Their Metabolites. Environ. Sci. Technol. 2015, 49, 10099–10107. [Google Scholar] [CrossRef]
Weiss, J.M.; Andersson, P.L.; Zhang, J.; Simon, E.; Leonards, P.E.G.; Hamers, T.; Lamoree, M.H. Tracing Thyroid Hormone-Disrupting Compounds: Database Compilation and Structure–Activity Evaluation for an Effect-Directed Analysis of Sediment. Anal. Bioanal. Chem. 2015, 407, 5625–5634. [Google Scholar] [CrossRef]
Hill, K.L.; Mortensen, Å.-K.; Teclechiel, D.; Willmore, W.G.; Sylte, I.; Jenssen, B.M.; Letcher, R.J. In Vitro and in Silico Competitive Binding of Brominated Polyphenyl Ether Contaminants with Human and Gull Thyroid Hormone Transport Proteins. Environ. Sci. Technol. 2018, 52, 1533–1541. [Google Scholar] [CrossRef]
Kowalska, D.; Sosnowska, A.; Bulawska, N.; Stępnik, M.; Besselink, H.; Behnisch, P.; Puzyn, T. How the Structure of Per- and Polyfluoroalkyl Substances (PFAS) Influences Their Binding Potency to the Peroxisome Proliferator-Activated and Thyroid Hormone Receptors—An In Silico Screening Study. Molecules 2023, 28, 479. [Google Scholar] [CrossRef]
Akinola, L.K.; Uzairu, A.; Shallangwa, G.A.; Abechi, S.E. Development of Binary Classification Models for Grouping Hydroxylated Polychlorinated Biphenyls into Active and Inactive Thyroid Hormone Receptor Agonists. SAR QSAR Environ. Res. 2023, 34, 267–284. [Google Scholar] [CrossRef]
Arulmozhiraja, S.; Shiraishi, F.; Okumura, T.; Iida, M.; Takigami, H.; Edmonds, J.S.; Morita, M. Structural Requirements for the Interaction of 91 Hydroxylated Polychlorinated Biphenyls with Estrogen and Thyroid Hormone Receptors. Toxicol. Sci. 2005, 84, 49–62. [Google Scholar] [CrossRef]
Gallagher, A.; Kar, S.; Sepúlveda, M.S. Computational Modeling of Human Serum Albumin Binding of Per- and Polyfluoroalkyl Substances Employing QSAR, Read-Across, and Docking. Molecules 2023, 28, 5375. [Google Scholar] [CrossRef] [PubMed]
Jackson, T.W.; Scheibly, C.M.; Polera, M.E.; Belcher, S.M. Rapid Characterization of Human Serum Albumin Binding for Per- and Polyfluoroalkyl Substances Using Differential Scanning Fluorimetry. Environ. Sci. Technol. 2021, 55, 12291–12301. [Google Scholar] [CrossRef]
Liu, W.; Wang, Z.; Chen, J.; Tang, W.; Wang, H. Machine Learning Model for Screening Thyroid Stimulating Hormone Receptor Agonists Based on Updated Datasets and Improved Applicability Domain Metrics. Chem. Res. Toxicol. 2023, 36, 947–958. [Google Scholar] [CrossRef] [PubMed]
Neumann, S.; Huang, W.; Titus, S.; Krause, G.; Kleinau, G.; Alberobello, A.T.; Zheng, W.; Southall, N.T.; Inglese, J.; Austin, C.P.; et al. Small-Molecule Agonists for the Thyrotropin Receptor Stimulate Thyroid Function in Human Thyrocytes and Mice. Proc. Natl. Acad. Sci. USA 2009, 106, 12471–12476. [Google Scholar] [CrossRef]
Jäschke, H.; Neumann, S.; Moore, S.; Thomas, C.J.; Colson, A.-O.; Costanzi, S.; Kleinau, G.; Jiang, J.-K.; Paschke, R.; Raaka, B.M.; et al. A Low Molecular Weight Agonist Signals by Binding to the Transmembrane Domain of Thyroid-Stimulating Hormone Receptor (TSHR) and Luteinizing Hormone/Chorionic Gonadotropin Receptor (LHCGR). J. Biol. Chem. 2006, 281, 9841–9844. [Google Scholar] [CrossRef]
Titus, S.; Neumann, S.; Zheng, W.; Southall, N.; Michael, S.; Klumpp, C.; Yasgar, A.; Shinn, P.; Thomas, C.J.; Inglese, J.; et al. Quantitative High-Throughput Screening Using a Live-Cell cAMP Assay Identifies Small-Molecule Agonists of the TSH Receptor. SLAS Discov. 2008, 13, 120–127. [Google Scholar] [CrossRef]
Huang, R.; Xia, M.; Sakamuru, S.; Zhao, J.; Shahane, S.A.; Attene-Ramos, M.; Zhao, T.; Austin, C.P.; Simeonov, A. Modelling the Tox21 10 K Chemical Profiles for in Vivo Toxicity Prediction and Mechanism Characterization. Nat. Commun. 2016, 7, 10425. [Google Scholar] [CrossRef]
Huang, R.; Xia, M.; Sakamuru, S.; Zhao, J.; Lynch, C.; Zhao, T.; Zhu, H.; Austin, C.P.; Simeonov, A. Expanding Biological Space Coverage Enhances the Prediction of Drug Adverse Effects in Human Using in Vitro Activity Profiles. Sci. Rep. 2018, 8, 3783. [Google Scholar] [CrossRef]
Olker, J.H.; Korte, J.J.; Denny, J.S.; Hartig, P.C.; Cardon, M.C.; Knutsen, C.N.; Kent, P.M.; Christensen, J.P.; Degitz, S.J.; Hornung, M.W. Screening the ToxCast Phase 1, Phase 2, and E1k Chemical Libraries for Inhibitors of Iodothyronine Deiodinases. Toxicol. Sci. 2019, 168, 430–442. [Google Scholar] [CrossRef]
Wang, J.; Hallinger, D.R.; Murr, A.S.; Buckalew, A.R.; Lougee, R.R.; Richard, A.M.; Laws, S.C.; Stoker, T.E. High-Throughput Screening and Chemotype-Enrichment Analysis of ToxCast Phase II Chemicals Evaluated for Human Sodium-Iodide Symporter (NIS) Inhibition. Environ. Int. 2019, 126, 377–386. [Google Scholar] [CrossRef]
Paul Friedman, K.; Watt, E.D.; Hornung, M.W.; Hedge, J.M.; Judson, R.S.; Crofton, K.M.; Houck, K.A.; Simmons, S.O. Tiered High-Throughput Screening Approach to Identify Thyroperoxidase Inhibitors Within the ToxCast Phase I and II Chemical Libraries. Toxicol. Sci. 2016, 151, 160–180. [Google Scholar] [CrossRef] [PubMed]
Gadaleta, D.; Spînu, N.; Roncaglioni, A.; Cronin, M.T.D.; Benfenati, E. Prediction of the Neurotoxic Potential of Chemicals Based on Modelling of Molecular Initiating Events Upstream of the Adverse Outcome Pathways of (Developmental) Neurotoxicity. Int. J. Mol. Sci. 2022, 23, 3053. [Google Scholar] [CrossRef] [PubMed]
Gaulton, A.; Hersey, A.; Nowotka, M.; Bento, A.P.; Chambers, J.; Mendez, D.; Mutowo, P.; Atkinson, F.; Bellis, L.J.; Cibrián-Uhalte, E.; et al. The ChEMBL Database in 2017. Nucleic Acids Res. 2017, 45, D945–D954. [Google Scholar] [CrossRef] [PubMed]
Xu, X.; Wang, C.; Gui, B.; Yuan, X.; Li, C.; Zhao, Y.; Martyniuk, C.J.; Su, L. Application of Machine Learning to Predict the Inhibitory Activity of Organic Chemicals on Thyroid Stimulating Hormone Receptor. Environ. Res. 2022, 212, 113175. [Google Scholar] [CrossRef]
Seo, M.; Lim, C.; Kwon, H. In Silico Prediction Models for Thyroid Peroxidase Inhibitors and Their Application to Synthetic Flavors. Food Sci. Biotechnol. 2022, 31, 483–495. [Google Scholar] [CrossRef]
Carvalho, D.P.; Ferreira, A.C.F.; Coelho, S.M.; Moraes, J.M.; Camacho, M.A.S.; Rosenthal, D. Thyroid Peroxidase Activity Is Inhibited by Amino Acids. Braz. J. Med. Biol. Res. 2000, 33, 355–361. [Google Scholar] [CrossRef]
Divi, R.L.; Doerge, D.R. Inhibition of Thyroid Peroxidase by Dietary Flavonoids. Chem. Res. Toxicol. 1996, 9, 16–23. [Google Scholar] [CrossRef]
Habza-Kowalska, E.; Kaczor, A.A.; Żuk, J.; Matosiuk, D.; Gawlik-Dziki, U. Thyroid Peroxidase Activity Is Inhibited by Phenolic Compounds—Impact of Interaction. Molecules 2019, 24, 2766. [Google Scholar] [CrossRef]
Lee, J. Conversion of the Organic Breakdown Products of Glucosinolate to Thiocyanate Anions and Their Effects on Thyroid Hormone Production. Ph.D. Thesis, Seoul National University, Seoul, Republic of Korea, 2015. [Google Scholar]
Li, X.; Gu, W.; Zhang, B.; Xin, X.; Kang, Q.; Yang, M.; Chen, B.; Li, Y. Insights into Toxicity of Polychlorinated Naphthalenes to Multiple Human Endocrine Receptors: Mechanism and Health Risk Analysis. Environ. Int. 2022, 165, 107291. [Google Scholar] [CrossRef]
Sapounidou, M.; Norinder, U.; Andersson, P.L. Predicting Endocrine Disruption Using Conformal Prediction—A Prioritization Strategy to Identify Hazardous Chemicals with Confidence. Chem. Res. Toxicol. 2023, 36, 53–65. [Google Scholar] [CrossRef]
Yang, X.; Ou, W.; Zhao, S.; Xi, Y.; Wang, L.; Liu, H. Rapid Screening of Human Transthyretin Disruptors through a Tiered in Silico Approach. ACS Sustain. Chem. Eng. 2021, 9, 5661–5672. [Google Scholar] [CrossRef]
van den Berg, K.J.; van Raaij, J.A.G.M.; Bragt, P.C.; Notten, W.R.F. Interactions of Halogenated Industrial Chemicals with Transthyretin and Effects on Thyroid Hormone Levels In Vivo. Arch. Toxicol. 1991, 65, 15–19. [Google Scholar] [CrossRef]
Marchesini, G.R.; Meulenberg, E.; Haasnoot, W.; Mizuguchi, M.; Irth, H. Biosensor Recognition of Thyroid-Disrupting Chemicals Using Transport Proteins. Anal. Chem. 2006, 78, 1107–1114. [Google Scholar] [CrossRef]
Marchesini, G.R.; Meimaridou, A.; Haasnoot, W.; Meulenberg, E.; Albertus, F.; Mizuguchi, M.; Takeuchi, M.; Irth, H.; Murk, A.J. Biosensor Discovery of Thyroxine Transport Disrupting Chemicals. Toxicol. Appl. Pharmacol. 2008, 232, 150–160. [Google Scholar] [CrossRef]
Purkey, H.E.; Palaninathan, S.K.; Kent, K.C.; Smith, C.; Safe, S.H.; Sacchettini, J.C.; Kelly, J.W. Hydroxylated Polychlorinated Biphenyls Selectively Bind Transthyretin in Blood and Inhibit Amyloidogenesis: Rationalizing Rodent PCB Toxicity. Chem. Biol. 2004, 11, 1719–1728. [Google Scholar] [CrossRef]
Garcia de Lomana, M.; Weber, A.G.; Birk, B.; Landsiedel, R.; Achenbach, J.; Schleifer, K.-J.; Mathea, M.; Kirchmair, J. In Silico Models to Predict the Perturbation of Molecular Initiating Events Related to Thyroid Hormone Homeostasis. Chem. Res. Toxicol. 2021, 34, 396–411. [Google Scholar] [CrossRef]
Gadaleta, D.; d’Alessandro, L.; Marzo, M.; Benfenati, E.; Roncaglioni, A. Quantitative Structure–Activity Relationship Modeling of the Amplex Ultrared Assay to Predict Thyroperoxidase Inhibitory Activity. Front. Pharmacol. 2021, 12, 713037. [Google Scholar] [CrossRef]
Rosenberg, S.A.; Watt, E.D.; Judson, R.S.; Simmons, S.O.; Paul Friedman, K.; Dybdahl, M.; Nikolov, N.G.; Wedebye, E.B. QSAR Models for Thyroperoxidase Inhibition and Screening of U.S. and EU Chemical Inventories. Comput. Toxicol. 2017, 4, 11–21. [Google Scholar] [CrossRef]
Bai, X.; Yan, L.; Ji, C.; Zhang, Q.; Dong, X.; Chen, A.; Zhao, M. A Combination of Ternary Classification Models and Reporter Gene Assays for the Comprehensive Thyroid Hormone Disruption Profiles of 209 Polychlorinated Biphenyls. Chemosphere 2018, 210, 312–319. [Google Scholar] [CrossRef] [PubMed]
Yan, L.; Zhang, Q.; Huang, F.; Nie, W.-W.; Hu, C.-Q.; Ying, H.-Z.; Dong, X.-W.; Zhao, M.-R. Ternary Classification Models for Predicting Hormonal Activities of Chemicals via Nuclear Receptors. Chem. Phys. Lett. 2018, 706, 360–366. [Google Scholar] [CrossRef]
Nakamura, N.; Matsubara, K.; Sanoh, S.; Ohta, S.; Uramaru, N.; Kitamura, S.; Yamaguchi, M.; Sugihara, K.; Fujimoto, N. Cell Type-Dependent Agonist/Antagonist Activities of Polybrominated Diphenyl Ethers. Toxicol. Lett. 2013, 223, 192–197. [Google Scholar] [CrossRef]
Ren, X.-M.; Guo, L.-H. Molecular Toxicology of Polybrominated Diphenyl Ethers: Nuclear Hormone Receptor Mediated Pathways. Environ. Sci. Process. Impacts 2013, 15, 702–708. [Google Scholar] [CrossRef]
Amano, I.; Miyazaki, W.; Iwasaki, T.; Shimokawa, N.; Koibuchi, N. The Effect of Hydroxylated Polychlorinated Biphenyl (OH-PCB) on Thyroid Hormone Receptor (TR)-Mediated Transcription through Native-Thyroid Hormone Response Element (TRE). Ind. Health 2010, 48, 115–118. [Google Scholar] [CrossRef]
Du, G.; Shen, O.; Sun, H.; Fei, J.; Lu, C.; Song, L.; Xia, Y.; Wang, S.; Wang, X. Assessing Hormone Receptor Activities of Pyrethroid Insecticides and Their Metabolites in Reporter Gene Assays. Toxicol. Sci. 2010, 116, 58–66. [Google Scholar] [CrossRef] [PubMed]
Li, J.; Ma, M.; Wang, Z. A Two-hybrid Yeast Assay to Quantify the Effects of Xenobiotics on Thyroid Hormone-mediated Gene Expression. Environ. Toxicol. Chem. 2008, 27, 159–167. [Google Scholar] [CrossRef] [PubMed]
Sun, H.; Shen, O.-X.; Wang, X.-R.; Zhou, L.; Zhen, S.; Chen, X. Anti-Thyroid Hormone Activity of Bisphenol A, Tetrabromobisphenol A and Tetrachlorobisphenol A in an Improved Reporter Gene Assay. Toxicol. Vitr. 2009, 23, 950–954. [Google Scholar] [CrossRef] [PubMed]
Hu, W.; Liu, H.; Sun, H.; Shen, O.; Wang, X.; Lam, M.H.W.; Giesy, J.P.; Zhang, X.; Yu, H. Endocrine Effects of Methoxylated Brominated Diphenyl Ethers in Three in Vitro Models. Mar. Pollut. Bull. 2011, 62, 2356–2361. [Google Scholar] [CrossRef]
Liu, H.; Hu, W.; Sun, H.; Shen, O.; Wang, X.; Lam, M.H.W.; Giesy, J.P.; Zhang, X.; Yu, H. In Vitro Profiling of Endocrine Disrupting Potency of 2,2′,4,4′-Tetrabromodiphenyl Ether (BDE47) and Related Hydroxylated Analogs (HO-PBDEs). Mar. Pollut. Bull. 2011, 63, 287–296. [Google Scholar] [CrossRef]
Kojima, H.; Takeuchi, S.; Uramaru, N.; Sugihara, K.; Yoshida, T.; Kitamura, S. Nuclear Hormone Receptor Activity of Polybrominated Diphenyl Ethers and Their Hydroxylated and Methoxylated Metabolites in Transactivation Assays Using Chinese Hamster Ovary Cells. Environ. Health Perspect. 2009, 117, 1210–1218. [Google Scholar] [CrossRef]
Kar, S.; Sepúlveda, M.S.; Roy, K.; Leszczynski, J. Endocrine-Disrupting Activity of per- and Polyfluoroalkyl Substances: Exploring Combined Approaches of Ligand and Structure Based Modeling. Chemosphere 2017, 184, 514–523. [Google Scholar] [CrossRef]
Dix, D.J.; Houck, K.A.; Martin, M.T.; Richard, A.M.; Setzer, R.W.; Kavlock, R.J. The ToxCast Program for Prioritizing Toxicity Testing of Environmental Chemicals. Toxicol. Sci. 2007, 95, 5–12. [Google Scholar] [CrossRef]
Richard, A.M.; Judson, R.S.; Houck, K.A.; Grulke, C.M.; Volarath, P.; Thillainadarajah, I.; Yang, C.; Rathman, J.; Martin, M.T.; Wambaugh, J.F.; et al. ToxCast Chemical Landscape: Paving the Road to 21st Century Toxicology. Chem. Res. Toxicol. 2016, 29, 1225–1251. [Google Scholar] [CrossRef]
EDSP21 Work Plan. The Incorporation of In Silico Models and In Vitro High Throughput Assays in the Endocrine Disruptor Screening Program (EDSP) for Prioritization and Screening. 2011. Available online: https://www.epa.gov/sites/default/files/2015-07/documents/edsp21_work_plan_summary_overview_final.pdf (accessed on 13 March 2017).
Rybacka, A.; Rudén, C.; Tetko, I.V.; Andersson, P.L. Identifying Potential Endocrine Disruptors among Industrial Chemicals and Their Metabolites—Development and Evaluation of in Silico Tools. Chemosphere 2015, 139, 372–378. [Google Scholar] [CrossRef]
Toropova, A.P.; Toropov, A.A.; Benfenati, E. CORAL: Prediction of Binding Affinity and Efficacy of Thyroid Hormone Receptor Ligands. Eur. J. Med. Chem. 2015, 101, 452–461. [Google Scholar] [CrossRef]
Politi, R.; Rusyn, I.; Tropsha, A. Prediction of Binding Affinity and Efficacy of Thyroid Hormone Receptor Ligands Using QSAR and Structure-Based Modeling Methods. Toxicol. Appl. Pharmacol. 2014, 280, 177–189. [Google Scholar] [CrossRef]
Gaulton, A.; Bellis, L.J.; Bento, A.P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; et al. ChEMBL: A Large-Scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100–D1107. [Google Scholar] [CrossRef]
Arnold, L.A.; Kosinski, A.; Estébanez-Perpiñá, E.; Guy, R.K. Inhibitors of the Interaction of a Thyroid Hormone Receptor and Coactivators: Preliminary Structure−Activity Relationships. J. Med. Chem. 2007, 50, 5269–5280. [Google Scholar] [CrossRef]
Hwang, J.Y.; Arnold, L.A.; Zhu, F.; Kosinski, A.; Mangano, T.J.; Setola, V.; Roth, B.L.; Guy, R.K. Improvement of Pharmacological Properties of Irreversible Thyroid Receptor Coactivator Binding Inhibitors. J. Med. Chem. 2009, 52, 3892–3901. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Hwang, J.Y.; Attia, R.R.; Zhu, F.; Yang, L.; Lemoff, A.; Jeffries, C.; Connelly, M.C.; Guy, R.K. Synthesis and Evaluation of Sulfonylnitrophenylthiazoles (SNPTs) as Thyroid Hormone Receptor–Coactivator Interaction Inhibitors. J. Med. Chem. 2012, 55, 2301–2310. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Papa, E.; Kovarich, S.; Gramatica, P. QSAR Prediction of the Competitive Interaction of Emerging Halogenated Pollutants with Human Transthyretin. SAR QSAR Environ. Res. 2013, 24, 333–349. [Google Scholar] [CrossRef] [PubMed]
Kovarich, S.; Papa, E.; Li, J.; Gramatica, P. QSAR Classification Models for the Screening of the Endocrine-Disrupting Activity of Perfluorinated Compounds. SAR QSAR Environ. Res. 2012, 23, 207–220. [Google Scholar] [CrossRef]
Kovarich, S.; Papa, E.; Gramatica, P. QSAR Classification Models for the Prediction of Endocrine Disrupting Activity of Brominated Flame Retardants. J. Hazard. Mater. 2011, 190, 106–112. [Google Scholar] [CrossRef] [PubMed]
Papa, E.; Kovarich, S.; Gramatica, P. QSAR Modeling and Prediction of the Endocrine-Disrupting Potencies of Brominated Flame Retardants. Chem. Res. Toxicol. 2010, 23, 946–954. [Google Scholar] [CrossRef] [PubMed]
Li, F.; Xie, Q.; Li, X.; Li, N.; Chi, P.; Chen, J.; Wang, Z.; Hao, C. Hormone Activity of Hydroxylated Polybrominated Diphenyl Ethers on Human Thyroid Receptor-β: In Vitro and In Silico Investigations. Environ. Health Perspect. 2010, 118, 602–606. [Google Scholar] [CrossRef]
OECD. New Scoping Document on In Vitro and Ex Vivo Assays for the Identification of Modulators of Thyroid Hormone Signalling; OECD Publishing: Paris, France, 2014. [Google Scholar]
OECD. Thyroid in Vitro Methods: Assessment Reports by the Thyroid Disruption Methods Expert Group: Reports Assessing the Validation Status of Assays from the EU-NETVAL Activities. In OECD Series on Testing and Assessment, No. 403; OECD Publishing: Paris, France, 2024. [Google Scholar]
Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.-W.; da Silva Santos, L.B.; Bourne, P.E.; et al. The FAIR Guiding Principles for Scientific Data Management and Stewardship. Sci. Data 2016, 3, 160018. [Google Scholar] [CrossRef] [PubMed]
Gilbert, M.E.; O’Shaughnessy, K.L.; Axelstad, M. Regulation of Thyroid-Disrupting Chemicals to Protect the Developing Brain. Endocrinology 2020, 161, bqaa106. [Google Scholar] [CrossRef]
Pirhadi, S.; Shiri, F.; Ghasemi, J.B. Multivariate Statistical Analysis Methods in QSAR. RSC Adv. 2015, 5, 104635–104665. [Google Scholar] [CrossRef]
Schür, C.; Gasser, L.; Perez-Cruz, F.; Schirmer, K.; Baity-Jesi, M. A Benchmark Dataset for Machine Learning in Ecotoxicology. Sci. Data 2023, 10, 718. [Google Scholar] [CrossRef]
Schür, C.; Schirmer, K.; Baity-Jesi, M. On the Comparability between Studies in Predictive Ecotoxicology. Comput. Toxicol. 2025, 35, 100367. [Google Scholar] [CrossRef]
Wassenaar, P.N.H.; Minnema, J.; Vriend, J.; Peijnenburg, W.J.G.M.; Pennings, J.L.A.; Kienhuis, A. The Role of Trust in the Use of Artificial Intelligence for Chemical Risk Assessment. Regul. Toxicol. Pharmacol. 2024, 148, 105589. [Google Scholar] [CrossRef]
OECD. Guidance Document on the Validation of (Quantitative) Structure–Activity Relationship [(Q)SAR] Models; OECD Publishing: Paris, France, 2014. [Google Scholar]
OECD. (Q)SAR Assessment Framework: Guidance for the Regulatory Assessment of (Quantitative) Structure Activity Relationship Models and Predictions; OECD Series on Testing and Assessment, No. 386; OECD Publishing: Paris, France, 2023. [Google Scholar]
Tropsha, A. Best Practices for QSAR Model Development, Validation, and Exploitation. Mol. Inform. 2010, 29, 476–488. [Google Scholar] [CrossRef] [PubMed]
Golbraikh, A.; Tropsha, A. Beware of q²! J. Mol. Graph. Model. 2002, 20, 269–276. [Google Scholar] [CrossRef] [PubMed]
Golbraikh, A.; Tropsha, A. Predictive QSAR Modeling Based on Diversity Sampling of Experimental Datasets for the Training and Test Set Selection. Mol. Divers. 2000, 5, 231–243. [Google Scholar] [CrossRef] [PubMed]
Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar]
Todeschini, R.; Consonni, V.; Maiocchi, A. The K Correlation Index: Theory Development and Its Application in Chemometrics. Chemom. Intell. Lab. Syst. 1999, 46, 13–29. [Google Scholar] [CrossRef]
Tropsha, A.; Gramatica, P.; Gombar, V.K. The Importance of Being Earnest: Validation Is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR Comb. Sci. 2003, 22, 69–77. [Google Scholar] [CrossRef]
Wehrens, R.; Putter, H.; Buydens, L.M.C. The Bootstrap: A Tutorial. Chemom. Intell. Lab. Syst. 2000, 54, 35–52. [Google Scholar] [CrossRef]
Ambure, P.; Gajewicz-Skretna, A.; Cordeiro, M.N.D.S.; Roy, K. New Workflow for QSAR Model Development from Small Data Sets: Small Dataset Curator and Small Dataset Modeler. Integration of Data Curation, Exhaustive Double Cross-Validation, and a Set of Optimal Model Selection Techniques. J. Chem. Inf. Model. 2019, 59, 4070–4076. [Google Scholar] [CrossRef]
Raste, S.; Singh, R.; Vaughan, J.; Nair, V.N. Quantifying Inherent Randomness in Machine Learning Algorithms. arXiv 2022, arXiv:2206.12353. [Google Scholar]
Li, J.; Zhao, T.; Yang, Q.; Du, S.; Xu, L. A Review of Quantitative Structure–Activity Relationship: The Development and Current Status of Data Sets, Molecular Descriptors and Mathematical Models. Chemom. Intell. Lab. Syst. 2025, 256, 105278. [Google Scholar] [CrossRef]
Khan, A.A. Balanced Split: A New Train-Test Data Splitting Strategy for Imbalanced Datasets. arXiv 2022, arXiv:2212.11116. [Google Scholar] [CrossRef]
An, C.; Park, Y.W.; Ahn, S.S.; Han, K.; Kim, H.; Lee, S.-K. Radiomics Machine Learning Study with a Small Sample Size: Single Random Training-Test Set Split May Lead to Unreliable Results. PLoS ONE 2021, 16, e0256152. [Google Scholar] [CrossRef]
Golbraikh, A.; Shen, M.; Xiao, Z.; Xiao, Y.-D.; Lee, K.-H.; Tropsha, A. Rational Selection of Training and Test Sets for the Development of Validated QSAR Models. J. Comput. Aided Mol. Des. 2003, 17, 241–253. [Google Scholar] [CrossRef]
Kennard, R.W.; Stone, L.A. Computer Aided Design of Experiments. Technometrics 1969, 11, 137–148. [Google Scholar] [CrossRef]
Roy, K.; Kar, S.; Ambure, P. On a Simple Approach for Determining Applicability Domain of QSAR Models. Chemom. Intell. Lab. Syst. 2015, 145, 22–29. [Google Scholar] [CrossRef]
Netzeva, T.I.; Worth, A.P.; Aldenberg, T.; Benigni, R.; Cronin, M.T.D.; Gramatica, P.; Jaworska, J.S.; Kahn, S.; Klopman, G.; Marchant, C.A.; et al. Current Status of Methods for Defining the Applicability Domain of (Quantitative) Structure–Activity Relationships: The Report and Recommendations of ECVAM Workshop 52. Altern. Lab. Anim. 2005, 33, 155–173. [Google Scholar] [CrossRef] [PubMed]
Sahigara, F.; Mansouri, K.; Ballabio, D.; Mauri, A.; Consonni, V.; Todeschini, R. Comparison of Different Approaches to Define the Applicability Domain of QSAR Models. Molecules 2012, 17, 4791–4810. [Google Scholar] [CrossRef]
Klingspohn, W.; Mathea, M.; ter Laak, A.; Heinrich, N.; Baumann, K. Efficiency of Different Measures for Defining the Applicability Domain of Classification Models. J. Cheminform. 2017, 9, 44. [Google Scholar] [CrossRef]
Toropova, A.P.; Toropov, A.A.; Rallo, R.; Leszczynska, D.; Leszczynski, J. Optimal Descriptor as a Translator of Eclectic Data into Prediction of Cytotoxicity for Metal Oxide Nanoparticles under Different Conditions. Ecotoxicol. Environ. Saf. 2015, 112, 39–45. [Google Scholar] [CrossRef]
Todeschini, R.; Consonni, V. Molecular Descriptors for Chemoinformatics; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2009; ISBN 978-3-527-62876-6. [Google Scholar]
Vasilev, B.; Atanasova, M. A (Comprehensive) Review of the Application of Quantitative Structure–Activity Relationship (QSAR) in the Prediction of New Compounds with Anti-Breast Cancer Activity. Appl. Sci. 2025, 15, 1206. [Google Scholar] [CrossRef]
Yap, C.W. PaDEL-Descriptor: An Open Source Software to Calculate Molecular Descriptors and Fingerprints. J. Comput. Chem. 2011, 32, 1466–1474. [Google Scholar] [CrossRef]
Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; WILEY-VCH: Weinheim, Germany, 2000. [Google Scholar]
Mauri, A. alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints. In Ecotoxicological QSARs; Roy, K., Ed.; Springer: New York, NY, USA, 2020; pp. 801–820. ISBN 978-1-07-160150-1. [Google Scholar]
Mauri, A.; Bertola, M. Alvascience: A New Software Suite for the QSAR Workflow Applied to the Blood–Brain Barrier Permeability. Int. J. Mol. Sci. 2022, 23, 12882. [Google Scholar] [CrossRef]
Matveieva, M.; Polishchuk, P. Benchmarks for Interpretation of QSAR Models. J. Cheminform. 2021, 13, 41. [Google Scholar] [CrossRef]
Gião, T.; Saavedra, J.; Cotrina, E.; Quintana, J.; Llop, J.; Arsequell, G.; Cardoso, I. Undiscovered Roles for Transthyretin: From a Transporter Protein to a New Therapeutic Target for Alzheimer’s Disease. Int. J. Mol. Sci. 2020, 21, 2075. [Google Scholar] [CrossRef]
Janicka, M.; Sztanke, M.; Sztanke, K. Biomimetic Chromatography/QSAR Investigations in Modeling Properties Influencing the Biological Efficacy of Phenoxyacetic Acid-Derived Congeners. Molecules 2025, 30, 688. [Google Scholar] [CrossRef]
Charest, N.; Sinclair, G.; Eytcheson, S.A.; Chang, D.T.; Martin, T.M.; Lowe, C.N.; Paul Friedman, K.; Williams, A.J. Combined In Vitro and In Silico Workflow to Deliver Robust, Transparent, and Contextually Rigorous Models of Bioactivity. J. Chem. Inf. Model. 2025, 65, 4426–4441. [Google Scholar] [CrossRef]
Eytcheson, S.A.; Zosel, A.D.; Olker, J.H.; Hornung, M.W.; Degitz, S.J. Screening the ToxCast Chemical Libraries for Binding to Transthyretin. Chem. Res. Toxicol. 2024, 37, 1670–1681. [Google Scholar] [CrossRef] [PubMed]
Evangelista, M.; Chirico, N.; Papa, E. New QSAR Models to Predict Human Transthyretin Disruption by Per- and Polyfluoroalkyl Substances (PFAS): Development and Application. Toxics 2025, 13, 590. [Google Scholar] [CrossRef] [PubMed]
Degitz, S.J.; Olker, J.H.; Denny, J.S.; Degoey, P.P.; Hartig, P.C.; Cardon, M.C.; Eytcheson, S.A.; Haselman, J.T.; Mayasich, S.A.; Hornung, M.W. In Vitro Screening of per- and Polyfluorinated Substances (PFAS) for Interference with Seven Thyroid Hormone System Targets across Nine Assays. Toxicol. Vitr. 2024, 95, 105762. [Google Scholar] [CrossRef] [PubMed]
Sosnowska, A.; Mudlaff, M.; Mombelli, E.; Behnisch, P.; Zdybel, S.; Besselink, H.; Kuckelkorn, J.; Bulawska, N.; Kepka, K.; Kowalska, D.; et al. Identification of New PFAS for Severe Interference with Thyroid Hormone Transport: A Combined in Vitro/Silico Approach. J. Hazard. Mater. 2025, 491, 137949. [Google Scholar] [CrossRef]
Mansouri, K.; Grulke, C.M.; Judson, R.S.; Williams, A.J. OPERA Models for Predicting Physicochemical Properties and Environmental Fate Endpoints. J. Cheminform. 2018, 10, 10. [Google Scholar] [CrossRef]
Cirino, T.; Pinto, L.; Iwan, M.; Dougha, A.; Lučić, B.; Kraljević, A.; Navoyan, Z.; Tevosyan, A.; Yeghiazaryan, H.; Khondkaryan, L.; et al. Consensus Modeling Strategies for Predicting Transthyretin Binding Affinity from Tox24 Challenge Data. Chem. Res. Toxicol. 2025, 38, 1061–1071. [Google Scholar] [CrossRef]
Tetko, I.V. Tox24 Challenge. Chem. Res. Toxicol. 2024, 37, 825–826. [Google Scholar] [CrossRef]

Figure 1. Annual distribution of QSAR models (yellow bars) and papers (purple bars).

Figure 2. Count of QSAR models developed for each MIE.

Figure 3. Annual distribution of QSAR models, categorised by MIE.

Figure 4. Count of QSAR models based on data source type, categorised by MIE.

Figure 5. Count of QSAR models based on structurally heterogeneous or class-specific datasets, categorised by MIE.

Figure 6. Count of QSAR models based on the modelling algorithm, categorised by MIE.

Figure 7. Types of AD definitions used in the selected QSAR models (and count).

Table 1. Summary and main characteristics of selected QSARs. C: classification-based; R: regression-based; Primary: data generated as part of the same study; Secondary: data collected from the existing literature; ToxCast database: Toxicity Forecaster (ToxCast) database (https://www.epa.gov/comptox-tools/toxicity-forecasting-toxcast); Tox21 database: Toxicology in the 21st Century (Tox21) (https://tox21.gov/); Ref.: reference; n.s.: not specified.

Model ID	Ref.	Year	MIE	Algorithm	C or R	Chemical Class	Data Source Type	Data Source Literature Reference(s)
ID_1	[52]	2024	TBG	MLR	R	PBBs	Primary	[52]
ID_2	[52]	2024	TBG	MLR	R	PBBs and OH-PBBs	Primary	[52]
ID_3	[52]	2024	TBG	MLR	R	PBBs and 2OH-PBBs	Primary	[52]
ID_4	[52]	2024	TBG	MLR	R	PBBs, OH-PBBs, and 2OH-PBBs	Primary	[52]
ID_5	[53]	2024	TTR	MLR	R	Heterogeneous	Secondary	[54,55,56,57,58,59,60]
ID_6	[53]	2024	TTR	MLR	R	Heterogeneous	Secondary	[61,62,63,64,65,66,67]
ID_7	[53]	2024	TTR	MLR	R	Heterogeneous	Secondary	[68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89]
ID_8	[90]	2023	TR α	MLR	R	PFAS	Primary	[90]
ID_9	[90]	2023	TR β	MLR	R	PFAS	Primary	[90]
ID_10	[91]	2023	TR n.s.	LDA	C	OH-PCBs	Secondary	[92]
ID_11	[91]	2023	TR n.s.	LR	C	OH-PCBs	Secondary	[92]
ID_12	[93]	2023	Albumin	PLS	R	PFAS	Secondary	[94]
ID_13	[93]	2023	Albumin	LDA	C	PFAS	Secondary	[94]
ID_14	[93]	2023	Albumin	MLR	R	PFAS	Secondary	[94]
ID_15	[95]	2023	TSHR	RF	C	Heterogeneous	Tox21 database and secondary	[96,97,98]
ID_16	[51]	2022	TTR	RF	C	Heterogeneous	Secondary	[87]
ID_17	[51]	2022	TR β	RF	C	Heterogeneous	Tox21 database *	[99,100]
ID_18	[51]	2022	TR β	RF	C	Heterogeneous	Tox21 database *	[99,100]
ID_19	[51]	2022	TSHR	RF	C	Heterogeneous	Tox21 database *	[99,100]
ID_20	[51]	2022	TSHR	RF	C	Heterogeneous	Tox21 database *	[99,100]
ID_21	[51]	2022	TRHR	RF	C	Heterogeneous	Tox21 database *	[99,100]
ID_22	[51]	2022	DIO1	RF	C	Heterogeneous	ToxCast database **	[101]
ID_23	[51]	2022	DIO2	RF	C	Heterogeneous	ToxCast database **	[101]
ID_24	[51]	2022	DIO3	RF	C	Heterogeneous	ToxCast database **	[101]
ID_25	[51]	2022	NIS	RF	C	Heterogeneous	ToxCast database **	[102]
ID_26	[51]	2022	TPO	RF	C	Heterogeneous	ToxCast database **	[103]
ID_27	[104]	2022	TTR	RF	C	Heterogeneous	ChEMBL database ***	[105]
ID_28	[104]	2022	TR α	RF	C	Heterogeneous	ChEMBL database ***	[105]
ID_29	[104]	2022	TR β	RF	C	Heterogeneous	ChEMBL database ***	[105]
ID_30	[104]	2022	NIS	RF	C	Heterogeneous	ChEMBL database ***	[105]
ID_31	[106]	2022	TSHR	RF	C	Heterogeneous	Tox21 database	https://tripod.nih.gov/tox21/assays/
ID_32	[106]	2022	TSHR	RF	C	Heterogeneous	Tox21 database	https://tripod.nih.gov/tox21/assays/
ID_33	[106]	2022	TSHR	XGB	C	Heterogeneous	Tox21 database	https://tripod.nih.gov/tox21/assays/
ID_34	[106]	2022	TSHR	LR	C	Heterogeneous	Tox21 database	https://tripod.nih.gov/tox21/assays/
ID_35	[106]	2022	TSHR	XGB	R	Heterogeneous	Tox21 database	https://tripod.nih.gov/tox21/assays/
ID_36	[107]	2022	TPO	XGB	C	Heterogeneous	ToxCast database and secondary **	[103,108,109,110,111]
ID_37	[107]	2022	TPO	Hard Voting	C	Heterogeneous	ToxCast database and secondary **	[103,108,109,110,111]
ID_38	[107]	2022	TPO	Soft Voting	C	Heterogeneous	ToxCast database and secondary **	[103,108,109,110,111]
ID_39	[112]	2022	TR β	MLR	R	PCNs	Primary	[112]
ID_40	[113]	2023	TR β	RF	C	Heterogeneous	Tox21 database	National Center for Biotechnology Information. PubChem Database. Source = 824, AID = 743067, https://pubchem.ncbi.nlm.nih.gov/bioassay/743067 (accessed 13 May 2021)
ID_41	[59]	2021	TTR	MLR	R	Halogenated phenols and thiophenols	Primary and Secondary	[57,59]
ID_42	[114]	2021	TTR	kNN	C	Heterogeneous	Secondary	[54,55,56,57,58,59,61,62,64,68,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88,89,115,116,117,118]
ID_43	[114]	2021	TTR	kNN	C	Heterogeneous	Secondary	[54,55,56,57,58,59,61,62,64,68,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88,89,115,116,117,118]
ID_44	[114]	2021	TTR	kNN	C	Heterogeneous	Secondary	[54,55,56,57,58,59,61,62,64,68,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88,89,115,116,117,118]
ID_45	[114]	2021	TTR	kNN	C	Heterogeneous	Secondary	[54,55,56,57,58,59,61,62,64,68,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88,89,115,116,117,118]
ID_46	[114]	2021	TTR	kNN	C	Heterogeneous	Secondary	[54,55,56,57,58,59,61,62,64,68,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88,89,115,116,117,118]
ID_47	[114]	2021	TTR	MLR	R	Heterogeneous	Secondary	[61,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88]
ID_48	[114]	2021	TTR	MLR	R	Heterogeneous	Secondary	[57,58,59]
ID_49	[114]	2021	TTR	kNN	R	Heterogeneous	Secondary	[61,70,71,72,73,75,78,79,80,81,82,83,84,85,86,87,88]
ID_50	[114]	2021	TTR	kNN	R	Heterogeneous	Secondary	[57,58,59]
ID_51	[119]	2021	TR n.s.	RF	C	Heterogeneous	ToxCast database	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC.
ID_52	[119]	2021	TSHR	RF	C	Heterogeneous	ToxCast database	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC.
ID_53	[119]	2021	TSHR	NN	C	Heterogeneous	ToxCast database	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC.
ID_54	[119]	2021	TPO	XGB	C	Heterogeneous	ToxCast database **	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC. and [103]
ID_55	[119]	2021	TRHR	SVM	C	Heterogeneous	ToxCast database	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC.
ID_56	[119]	2021	DIO1	SVM	C	Heterogeneous	ToxCast database **	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC. and [101]
ID_57	[119]	2021	DIO2	SVM	C	Heterogeneous	ToxCast database **	[101]
ID_58	[119]	2021	DIO3	NN	C	Heterogeneous	ToxCast database **	[101]
ID_59	[119]	2021	NIS	LR	C	Heterogeneous	ToxCast database **	Cited as ToxCast and Tox21 Summary Files for invitroDBv3.2, U.S. EPA, Washington, DC. and [102]
ID_60	[120]	2021	TPO	kNN	C	Heterogeneous	ToxCast database **	[103,121]
ID_61	[120]	2021	TPO	RF	C	Heterogeneous	ToxCast database **	[103,121]
ID_62	[57]	2019	TTR	MLR	R	Phenolic DBPs	Primary	[57]
ID_63	[122]	2018	TR β	SVM	C	PCBs	Primary	[122]
ID_64	[122]	2018	TR β	LDA	C	PCBs	Primary	[122]
ID_65	[123]	2018	TR n.s.	SVM	C	PCBs and PBDEs	Secondary	[124,125,126,127,128,129,130,131,132]
ID_66	[133]	2017	TTR	LDA	C	PFCs	Secondary	[82]
ID_67	[133]	2017	TTR	MLR	R	PFCs	Secondary	[82]
ID_68	[121]	2017	TPO	PLR	C	Heterogeneous	ToxCast database **	[103,134,135,136]
ID_69	[121]	2017	TPO	PLR	C	Heterogeneous	ToxCast database **	[103,134,135,136]
ID_70	[87]	2015	TTR	kNN	C	Heterogeneous	Secondary	[88]
ID_71	[137]	2015	TTR	ASNN	C	Heterogeneous	Secondary	[88]
ID_72	[138]	2015	TR β	Monte Carlo	R	Heterogeneous	Secondary	[139]
ID_73	[138]	2015	TR β	Monte Carlo	R	Heterogeneous	Secondary	[139]
ID_74	[138]	2015	TR β	Monte Carlo	R	Heterogeneous	Secondary	[139]
ID_75	[139]	2014	TR β	RF	R	Heterogeneous	ChEMBL database ***	[140]
ID_76	[139]	2014	TR β	RF	R	Heterogeneous	Secondary	[141,142,143]
ID_77	[139]	2014	TR β	RF	C	Heterogeneous	ChEMBL database ***	[140]
ID_78	[144]	2013	TTR	kNN	C	PFCs and BFRs	Secondary	[78,80,82]
ID_79	[144]	2013	TTR	MLR	R	PFCs and BFRs	Secondary	[78,80,82]
ID_80	[145]	2012	TTR	kNN	C	PFCs	Secondary	[82]
ID_81	[145]	2012	TTR	kNN	C	PFCs	Secondary	[82]
ID_82	[145]	2012	TTR	kNN	C	PFCs	Secondary	[82]
ID_83	[145]	2012	TTR	kNN	C	PFCs	Secondary	[82]
ID_84	[146]	2011	TTR	kNN	C	BFRs	Secondary	[78,80]
ID_85	[147]	2010	TTR	MLR	R	BFRs	Secondary	[78,80]
ID_86	[148]	2010	TR β	PLS	R	OH-PBDEs	Primary	[148]

* Tox21 served as a data source but it was cited as [99,100]. ** Although cited as [101,102,103] or [134,135,136], this review will refer to them as the ToxCast data source as described in the referenced papers. *** ChEMBL served as a data source but it was cited as [105] or [140].
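The counts summarised in the figures (for example, models per MIE, or classification versus regression) can be recomputed from the rows of Table 1. The following is a minimal sketch, assuming the table is available as tab-separated text. Only six rows and the first six columns are copied from Table 1 for illustration, and the names `TABLE1_EXCERPT`, `parse_rows`, and `tally` are hypothetical helpers, not part of the reviewed studies.

```python
from collections import Counter

# Six rows copied from Table 1 (first six columns only); an excerpt, not the full table.
TABLE1_EXCERPT = "\n".join([
    "Model ID\tRef.\tYear\tMIE\tAlgorithm\tC or R",
    "ID_1\t[52]\t2024\tTBG\tMLR\tR",
    "ID_5\t[53]\t2024\tTTR\tMLR\tR",
    "ID_16\t[51]\t2022\tTTR\tRF\tC",
    "ID_26\t[51]\t2022\tTPO\tRF\tC",
    "ID_36\t[107]\t2022\tTPO\tXGB\tC",
    "ID_70\t[87]\t2015\tTTR\tkNN\tC",
])


def parse_rows(text):
    """Turn tab-separated text with a header line into a list of dicts."""
    lines = text.strip().split("\n")
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]


def tally(rows, column):
    """Count how many models share each value of the given column."""
    return Counter(row[column] for row in rows)


rows = parse_rows(TABLE1_EXCERPT)
mie_counts = tally(rows, "MIE")        # models per MIE in the excerpt
type_counts = tally(rows, "C or R")    # classification vs regression in the excerpt

for mie, count in mie_counts.most_common():
    print(mie, count)
```

Applied to the complete table rather than this excerpt, the same two tallies would give the per-MIE and per-type totals; the excerpt values are not representative of the full set of 86 models.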

Table 2. Summary of the molecular descriptors selected by each QSAR, grouped by MIE.

MIE	Ref.	Model ID	Chemical Class	Descriptors	Software
TTR	[53]	ID_5	Heterogeneous	AATSC1c; PubchemFP381; ATSC2s; nX	PaDEL [180]
		ID_6	Heterogeneous	naasC; SpMin4_Bhs; VE3_Dzs	PaDEL [180]
		ID_7	Heterogeneous	PubchemFP590; SpMax1_Bhe; PubchemFP18; GATS5c; AATSC1e; AATS4v	PaDEL [180]
	[51]	ID_16	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[104]	ID_27	Heterogeneous	Calculation of extended fingerprints with a KNIME implementation of the CDK toolkit	CDK toolkit: https://cdk.github.io/
	[59]	ID_41	Halogenated phenols and thiophenols	logDOW(pH = 7.40); ω_adj; dipole_adj	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
	[114]	ID_42	Heterogeneous	V_sadj; Π_adj; μ_adj	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_43	Heterogeneous	V_sadj; O-059; μ_adj	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_44	Heterogeneous	V_sadj; H-050; nCbH	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_45	Heterogeneous	nArOH; V_sadj; ω_adj	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_46	Heterogeneous	V_sadj; C-024; nHDon	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_47	Heterogeneous	C-040; nCq; H-050; O-058; Π_adj; O-056	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_48	Heterogeneous	logDOW(pH = 7.40); nArOH; O-057; nArNO2	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_49	Heterogeneous	E_HOMO-adj; nArOH; H-052; ω_adj	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
		ID_50	Heterogeneous	logDOW(pH = 7.40); nArOH	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; GaussView 6.0; Gaussian 16; GsGrid 1.7, http://gsgrid.codeplex.com
	[57]	ID_62	Phenolic DBPs	log D; dipole_adj	Marvin Sketch 15.6.29.0, 2015: ChemAxon, http://www.chemaxon.com; Gaussian 16
	[133]	ID_66	PFCs	Me; nCsp2; H-050	DRAGON Version 6.0, 2011, http://www.talete.mi.it/
	[133]	ID_67	PFCs	IC3; ∑β’_S	DRAGON Version 6.0, 2011, http://www.talete.mi.it/
	[87]	ID_70	Heterogeneous	Based on the following 14 molecular descriptors: TPSA; a_don; a_nOH; nX; PEOE_VSA_FNEG; PEOE_RPC-; density; PEOE_RPC+; diameter; PEOE_PC+; vsa_hyd; KierFlex; logP(o/w); opr_brigid	Molecular Operating Environment (MOE), 2013.08; Chemical Computing Group Inc.: Montreal, QC, Canada, 2015
	[137]	ID_71	Heterogeneous	nArOH; nHDon; nCb-; nCRX3; nCH2RX; ALogPS_logP; nArOR; nCrq; nCq; nCp; nCs; nCbH	DRAGON version 6 [181].
	[144]	ID_78	PFCs and BFRs	nArOH; F03(Br..Br); HATS6m	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2007
	[144]	ID_79	PFCs and BFRs	R5u; F07[C-O]; nArOH	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2007
	[145]	ID_80	PFCs	AMW; HATS6m	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2007
		ID_81	PFCs	nH; HATS6m	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2007
		ID_82	PFCs	nH; F06[C-O]	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2007
		ID_83	PFCs	T(F..F); HATS6m	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2007
	[146]	ID_84	BFRs	DISPe; nArOH	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy, 2008
	[147]	ID_85	BFRs	qpmax; MATS6v	DRAGON Version 5.5 for Windows, Talete srl, Milan, Italy
TR α	[90]	ID_8	PFAS	X%; ICR	AlvaDesc [182]
TR α	[104]	ID_28	Heterogeneous	Calculation of extended fingerprints with a KNIME implementation of the CDK toolkit	CDK toolkit: https://cdk.github.io/
TR β	[90]	ID_9	PFAS	X%; TPC	AlvaDesc [182]
	[51]	ID_17	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[51]	ID_18	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[104]	ID_29	Heterogeneous	Calculation of extended fingerprints with a KNIME implementation of the CDK toolkit	CDK toolkit: https://cdk.github.io/
	[112]	ID_39	PCNs	E_LUMO; ΔE; μ; Q_xx; Q_yy; Q_yz; q⁺; logK_ow; N_Cl; N_o	Gaussian 09 software.
	[113]	ID_40	Heterogeneous	Use of RDKit descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[122]	ID_63	PCBs	logK_ow; ω; BER; nCl; EEig13d; JGI4	EPI Suite, version 4.1 (US EPA, 2012); DRAGON
	[122]	ID_64	PCBs	logK_ow; ω; BER; nCl; EEig13d; JGI4	EPI Suite, version 4.1 (US EPA, 2012); DRAGON
	[138]	ID_72	Heterogeneous	Molecular optimal descriptor DCW(3, 10)	CORAL software: http://www.insilico.eu/coral
		ID_73	Heterogeneous	Molecular optimal descriptor DCW(1, 3)	CORAL software: http://www.insilico.eu/coral
		ID_74	Heterogeneous	Molecular optimal descriptor DCW(3, 4)	CORAL software: http://www.insilico.eu/coral
	[139]	ID_75	Heterogeneous	Thirty-five most statistically significant descriptors were identified: F04[N-Cl]; EEig03d; F06[C-Cl]; EEig08r; GATS7e; nArOH; EEig07r; EEig05d; EEig06d; TPSA(Tot); GGI1; BEHp4; SPI; C-026; ESpm01d; nCb-; Hy; GATS8v; T(O..O); BLTA96; IVDE; MATS1e; Ms; GATS6e; MATS6m; MATS5m; MATS2e; MATS1p; MATS8v; MATS6e; MATS8p; X4Av; X2Av; X0Av; Jhetp	Dragon software (version 5.4; Talete s.r.l., Milan, Italy)
		ID_76	Heterogeneous	Twenty-seven most statistically significant descriptors were identified: F08[C-Cl]; T(N..Cl); C-006; EEig06d; SEigm; ATS3m; ATS4m; BEHm6; T(O..Cl); ATS5m; ATS7m; BEHm7; Uindex; EEig04d; BELe3; EEig08d; HVcpx; PHI; BELm3; GGI8; BIC5; BEHm1; JGI6; JGI7; BELm1; GATS3p; VEA2	Dragon software (version 5.4; Talete s.r.l., Milan, Italy)
		ID_77	Heterogeneous	Thirty most statistically significant descriptors were identified: B05[O-O]; EEig03d; nArOH; GGI7; EEig05d; PW2; F04[C-N]; C-026; ESpm01d; AAC; GATS8p; Hy; PCR; GATS8v; F05[O-O]; O-057; MATS5v; IVDE; MATS1e; Ms; MATS5p; ARR; MATS5m; PHI; MATS8v; GATS1e; MATS8p; RBF; Jhetp; X1A	Dragon software (version 5.4; Talete s.r.l., Milan, Italy)
	[148]	ID_86	OH-PBDEs	nBr; logKow; I_A; E_LUMO; ω; μ²	EPI Suite, version 4.0 (U.S. Environmental Protection Agency 2009); Gaussian 03 programs; DRAGON [181]
TR n.s.	[91]	ID_10	OH-PCBs	RDF35u; RDF55u; RDF85u; RDF65v	PaDEL [180]
	[91]	ID_11	OH-PCBs	RDF35u; RDF55u; RDF85u; RDF65v	PaDEL [180]
	[119]	ID_51	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[123]	ID_65	PCBs; PBDEs	DELS; MAXDN; Mor31v; Ms; RDF040e; BER	DRAGON 5.5 for Windows, Talete srl, Milan, Italy, 2008
TSHR	[106]	ID_31	Heterogeneous	Thirty-nine descriptors were used, here sorted by their weight in descending order (top seven descriptors were used to build Model ID_32.): Sw < 0.1 mg/mL probability; LogSw; LogD(pH = 7.4); LogL; S; R2; E; LogS(pH = 7.4); logP; Solubility class; AAB/LogP; McGowan Volume; MW; Pi2; LogS(pH = 7.4)-; L; V; Sw < 1 mg/mL probability; No Of H Donors; Acid_pKa; LogSwLo; Sw > 10 mg/mL probability; Abraham’s Alfa; NoOfRotBonds; A; Bo; 0Form; B; Form+; No Of H Acceptors; LogSwHi; Rel_pKa_ac; Base_pKa; Abraham’s BetaH; Ertl TPSA; Form-; Rule of 5; Rel_pKa_bs; Form±	KOWWIN program (EPI Suite version 4.1.1, https://www.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface) to calculate logKow. Software for the calculation of the other molecular descriptors was not specified
		ID_32		Sw < 0.1 mg/mL probability; LogSw; LogD(pH = 7.4); LogL; S; R2; E	KOWWIN program (EPI Suite version 4.1.1, https://www.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface) to calculate logKow. Software for the calculation of the other molecular descriptors was not specified
		ID_33		The use of thirty-nine descriptors was reported in the study	KOWWIN program (EPI Suite version 4.1.1, https://www.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface) to calculate logKow. Software for the calculation of the other molecular descriptors was not specified
		ID_34		LogS, LogP, E	KOWWIN program (EPI Suite version 4.1.1, https://www.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface) to calculate logKow. Software for the calculation of the other molecular descriptors was not specified
		ID_35		Forty-one descriptors were used, here sorted by their weight in descending order: Base_pKa; V; Abraham’s Alfa; 0Form; AAB/LogP; CDocker Energy; NoOfRotBonds; S; LogSwLo; LogSwHi; CDocker Interaction Energy; Rel_pKa_bs; R2; E; LogD(pH = 7.4); LogS(pH = 7.4)-; Sw < 0.1 mg/mL probability; A; Sw > 10 mg/mL probability; Ertl TPSA; MW; logP; LogSw; Pi2; Abraham’s BetaH; Solubility class; B; LogL; Sw < 1 mg/mL probability; L; Acid_pKa; Rel_pKa_ac; No Of H Acceptors; Bo; No Of H Donors; McGowan Volume; LogS(pH = 7.4); Form+; Form-; Form±; Rule of 5	KOWWIN program (EPI Suite version 4.1.1, https://www.epa.gov/tsca-screening-tools/epi-suitetm-estimation-program-interface) to calculate logKow. Software for the calculation of the other molecular descriptors was not specified
	[51]	ID_19	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[51]	ID_20	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[119]	ID_52	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[119]	ID_53	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[95]	ID_15	Heterogeneous	Top twenty FPs with positive SHAP (Shapley additive explanation) values: PubchemFP12, PubchemFP259, PubchemFP257, PubchemFP256, PubchemFP628, PubchemFP185, PubchemFP258, PubchemFP2, PubchemFP143, PubchemFP146, PubchemFP656, PubchemFP633, PubchemFP150, PubchemFP464, PubchemFP442, PubchemFP607, PubchemFP613, PubchemFP549, PubchemFP153, PubchemFP418	PaDEL [180]
TPO	[51]	ID_26	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[107]	ID_36	Heterogeneous	Use of Atom Pair Count (APC) fingerprints	PaDEL [180]
		ID_37	Heterogeneous	Use of Atom Pair Count (APC) fingerprints	PaDEL [180]
		ID_38	Heterogeneous	Use of Atom Pair Count (APC) fingerprints	PaDEL [180]
	[119]	ID_54	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[120]	ID_60	Heterogeneous	The top twenty ranked descriptors identified in the kNN model: GATS1e; nArOH; CATS2D_02_DL; MATS1e; MATS1s; C-026; CATS2D_03_DL; B10[C-C]; MATS1m; SpMax2_Bh(s); MATS1p; nCb-; nX; Uc; P_VSA_i_1; SpMAD_B(v); nCbH; GATS1s; MLOGP; Eta_C_A	DRAGON v7.0.8., 2017: https://chm.kode-solutions.net/products_dragon.php
	[120]	ID_61	Heterogeneous	Based on 160 molecular descriptors	DRAGON v7.0.8., 2017: https://chm.kode-solutions.net/products_dragon.php
	[121]	ID_68	Heterogeneous	Based on scaffolds and structural features	Leadscope Predictive Data Miner (LPDM), Leadscope, Inc., (2016): http://www.leadscope.com/
	[121]	ID_69	Heterogeneous	The top ten most common structural features linked to active compounds: benzene, 1,3-dihydroxy-; Scaffold 288; benzene, 1-alkyl-,4-amino(NH2)-; benzene, 1,2-dihydroxy-; Scaffold 297; alcohol, alkenyl-; Scaffold 576; benzene, 1-alkoxy-,4-hydroxy-; Scaffold 306; Scaffold 574. The top ten most common structural features linked to inactive compounds: Scaffold 110; Scaffold 342; Scaffold 210; Scaffold 253; Scaffold 303; Scaffold 108; benzene, 1-alkyl-,4-halo-; halide, benzyl-; Scaffold 454; Scaffold 194	Leadscope Predictive Data Miner (LPDM), Leadscope, Inc., (2016): http://www.leadscope.com/
TBG	[52]	ID_1	PBBs	Molecular Weight (MW); Critical temperature (CT); Critical pressure (CP); Topological diameter (TD)	PaDEL [180]; Gaussian 09 (Gaussian Inc., Wallingford, CT, USA); ChemDraw 12.0
		ID_2	PBBs and OH-PBBs	Quadrupole moment (Q_yy); Most negative Mulliken charge number (q⁻); Frequency (Freq); TD	PaDEL [180]; Gaussian 09 (Gaussian Inc., Wallingford, CT, USA); ChemDraw 12.0
		ID_3	PBBs and 2OH-PBBs	q⁻; CP; TD; Topological Shape (TS)	PaDEL [180]; Gaussian 09 (Gaussian Inc., Wallingford, CT, USA); ChemDraw 12.0
		ID_4	PBBs, OH-PBBs, and 2OH-PBBs	q⁻; CP; TD; CT	PaDEL [180]; Gaussian 09 (Gaussian Inc., Wallingford, CT, USA); ChemDraw 12.0
NIS	[51]	ID_25	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
	[104]	ID_30	Heterogeneous	Calculation of extended fingerprints with a KNIME implementation of the CDK toolkit	CDK toolkit: https://cdk.github.io/
	[119]	ID_59	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
Albumin	[93]	ID_12	PFAS	PDI; GATS8v; MATS8m; QED	AlvaDesc 2.0.16 [183]
		ID_13	PFAS	Eig12_AEA(bo); DECC; X4A	AlvaDesc 2.0.16 [183]
		ID_14	PFAS	QED; PDI; GATS8v; MATS8m	AlvaDesc 2.0.16 [183]
DIO1	[51]	ID_22	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
DIO1	[119]	ID_56	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
DIO2	[51]	ID_23	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
DIO2	[119]	ID_57	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
DIO3	[51]	ID_24	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
DIO3	[119]	ID_58	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
TRHR	[51]	ID_21	Heterogeneous	Calculation of 119 RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
TRHR	[119]	ID_55	Heterogeneous	Calculation of count-based Morgan fingerprints with a radius of 2 bonds and a length of 2048 bits, and of all 119 one-dimensional and two-dimensional RDKit chemical descriptors	RDKit: Open-source cheminformatics. http://www.rdkit.org
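In Table 2, a blank MIE or Ref. cell means that the row inherits the value from the nearest filled cell above it. The sketch below shows one way to make that explicit before analysing descriptor recurrence. It assumes tab-separated rows; the seven rows are a non-contiguous excerpt copied from the TTR block of Table 2 (descriptor column only, software column omitted), and `forward_fill` is a hypothetical helper name.

```python
from collections import Counter

# Non-contiguous excerpt of the TTR block of Table 2:
# MIE, Ref., Model ID, Chemical Class, Descriptors (software column omitted).
EXCERPT = [
    "TTR\t[53]\tID_5\tHeterogeneous\tAATSC1c; PubchemFP381; ATSC2s; nX",
    "\t\tID_6\tHeterogeneous\tnaasC; SpMin4_Bhs; VE3_Dzs",
    "\t[145]\tID_80\tPFCs\tAMW; HATS6m",
    "\t\tID_81\tPFCs\tnH; HATS6m",
    "\t\tID_82\tPFCs\tnH; F06[C-O]",
    "\t\tID_83\tPFCs\tT(F..F); HATS6m",
    "\t[146]\tID_84\tBFRs\tDISPe; nArOH",
]


def forward_fill(lines, n_fill=2):
    """Copy the last non-blank value down into blank leading cells.

    Only the first n_fill columns (here MIE and Ref.) are filled;
    the remaining columns are left exactly as written.
    """
    filled = []
    last = [""] * n_fill
    for line in lines:
        cells = line.split("\t")
        for i in range(n_fill):
            if cells[i].strip():
                last[i] = cells[i]   # remember the new group label
            else:
                cells[i] = last[i]   # inherit from the row above
        filled.append(cells)
    return filled


records = forward_fill(EXCERPT)

# Map each model to its (now explicit) reference.
refs_by_model = {cells[2]: cells[1] for cells in records}

# Count how often each descriptor recurs across the excerpted models.
descriptor_counts = Counter(
    name.strip() for cells in records for name in cells[4].split(";")
)

for name, count in descriptor_counts.most_common(3):
    print(name, count)
```

Within this excerpt, HATS6m is selected by three of the four PFC models from [145]; recurrence computed on the full table would be needed before drawing any conclusion about descriptor relevance.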

Table 3. Summary and main characteristics of relevant QSARs published from January to July 2025. C: classification-based; R: regression-based; Primary: data generated as part of the same study; Secondary: data collected from the existing literature.

Model ID	Reference	Year	MIE	Algorithm	C or R	Chemical Class	Data Source Type	Data Source Literature Reference(s)
ID_2025_1	[186]	2025	Albumin	MLR	R	Phenoxyacetic acid-derived congeners	Primary	[186]
ID_2025_2	[187]	2025	TTR	RF	C	Heterogeneous	Secondary	[188]
ID_2025_3	[189]	2025	TTR	LDA	C	PFAS	Secondary	[190]
ID_2025_4	[189]	2025	TTR	MLR	R	PFAS	Secondary	[190]
ID_2025_5	[191]	2025	TTR	DTC	C	PFAS	Primary	[191]
ID_2025_6	[191]	2025	TTR	MLR	R	PFAS	Primary	[191]

Table 4. Summary of relevant QSARs published from January to July 2025 and selected molecular descriptors, grouped by MIE.

MIE	Ref.	Model ID	Chemical Class	Descriptors	Software
TTR	[187]	ID_2025_2	Heterogeneous	Thirty-one descriptors sorted by permutation importance: CrippenLogP; ATSC3c; ATSC5c; C1SP3; ETA_BetaP_s; naAromAtom; ZMIC1; ATSC4m; ZMIC5; ATSC4c; hmin; hmax; ATSC2m; ATSC5m; ETA_Beta_ns_d; ATSC0m; VE1_DzZ; C1SP2; ZMIC2; ATSC1m; nHBAcc; ZMIC3; ATSC3m; ATSC2c; ETA_dAlpha_A; ETA_Shape_Y; ATSC0c; maxdssC; ZMIC4; nHBDon; ATSC1c	PaDEL descriptors from OPERA software v2.9 [192]
	[189]	ID_2025_3	PFAS	GATS3e; ATSC6p; GATS8m; MIC2	PaDEL [180]
	[189]	ID_2025_4	PFAS	piPC5; GGI9; AATSC0e	PaDEL [180]
	[191]	ID_2025_5	PFAS	SM4_D; GATS3m	AlvaDesc [182]
	[191]	ID_2025_6	PFAS	AMW; GATS7p; B10[F-F]	AlvaDesc [182]
Albumin	[186]	ID_2025_1	Phenoxyacetic acid-derived congeners	logkBMC; α; sum of HBD and HBA	ACD/Percepta software, version 1994–2012 (ACD/Labs, Advanced Chemistry Development, Inc., Toronto, ON, Canada)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

