Artificial Intelligence in Translational Medicine

Major advances in Internet infrastructure, computing, and algorithm development, together with recent innovations in high-throughput techniques, give the scientific community access to biological datasets, clinical data, and databases containing billions of pieces of scientific information. Consequently, over the last decade the way scientific data are managed, analyzed, processed, and mined for information has changed considerably in several fields, including medicine. As a result, the scientific vocabulary has been enriched by terms such as machine learning (ML), deep learning (DL) and, more broadly, artificial intelligence (AI). Beyond the terminology, these computational techniques are revolutionizing research across the drug discovery field, from preclinical studies to clinical investigation. Notably, translational research, which sits between preclinical and clinical research, is benefitting from computer-based approaches that are transforming its design and execution and yielding breakthroughs for human health. Accordingly, in this review article, we analyze the most advanced applications of AI in translational medicine, providing an up-to-date outlook on this emerging field.


Introduction
Artificial Intelligence (AI) and the related specialties of Machine Learning (ML) and Deep Learning (DL) are rapidly gaining traction in many sectors, including scientific ones (e.g., healthcare), with the potential to transform lives and improve patient outcomes in various fields of medicine. Accordingly, AI companies attracted approximately $40 billion worldwide in disclosed investment in 2019 alone [1], a figure expected to reach $232 billion by 2025 [2]. In the scientific areas, these computer-based approaches have the potential to revolutionize how clinicians assist patients in clinical practice (precision medicine, virtual diagnosis, and patient monitoring) as well as how scientists discover and deliver new drugs and diagnostic tools [3][4][5]. This trend is also reflected in the literature published over the years. A PubMed search for the term "artificial intelligence" returned over 140,000 published papers, with a significant increase starting from 2018, showing that the discipline is of particular interest worldwide (Figure 1, panel A). Furthermore, adding the term "translational medicine" to "artificial intelligence" returned almost 2,000 publications, with a marked increase from 2019 (Figure 1, panel B). This basic search highlights the growing interest in AI-based techniques in scientific fields, particularly in translational medicine. High-throughput procedures such as parallelized sequencing, microscope imaging, and compound screening are now widely used by academic and biotech/pharmaceutical researchers, and the number and quality of laboratory data collected have increased dramatically. These "big data" are analyzed with ML techniques to produce biological insight, granting a better understanding of disease causes, uncovering new therapy options, and improving diagnostic tools for clinical use [6].
The US Food and Drug Administration (FDA) defines AI as "the science and engineering of making intelligent machines", whereas ML is "an AI technique that can be used to design and train software algorithms to learn from and act on data" [7].
Accordingly, the main goal of these advanced technologies is to analyze big data with computer-based algorithms in order to extract valuable information that supports decision-making [8]. The application of AI methods thus enables scientists to manage and conduct a broad assortment of tasks, including diagnosis generation and appropriate therapy selection, risk prediction and illness stratification, medical mistake reduction, and productivity improvement, among others [5,9]. In translational research in particular, data generated by high-throughput assays from many patient samples are assembled into machine-readable datasets, and potentially critical variables are identified using an ML-based algorithm. The algorithm learns relationships between the variables and may perform intelligent tasks, including grouping patients or predicting their outcomes [6]. The role of AI in medicine is summarized in Figure 2. According to one description, ML is "the fundamental technology required to meaningfully process data that exceed the capacity of the human brain to comprehend" [10]. A large number of data points is used to train ML models. Existing information about specific data items and the relationships between data elements is learned via repeated cycles of mapping between inputs and outputs rather than being explicitly coded into the model. Therefore, cooperation between ML and clinical specialists is critical, and a variety of modelling approaches incorporate varying degrees of clinical expertise into model parameters [11]. ML models are currently grouped into four main categories: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Briefly, in supervised ML models the output labels, such as a disorder, are known in advance.
The objective is to generate a computational tool that predicts an output from a set of input data (the output is usually termed a target value, response variable, or label, while the inputs are termed predictors or features). The method "learns" the best model by analyzing the data contained in the training set, which includes many observations, each of which holds values for its features as well as its label. There are two kinds of supervised ML: classification and regression. In classification, the output variable is divided into categories such as "present" or "absent," "disorder" or "no-disorder," or "grading" (Grade 1, Grade 2, etc.), while in regression the output variable is a continuous value such as "weight", "dose" or "concentration" (IC50, EC50, TC50, etc.) to be predicted for novel samples. Such ML may be employed in medical imaging in a variety of areas, including radiology, pathology, and other imaging fields, as well as in epidemiology. Very recently, with the outbreak of the COVID-19 pandemic, some ML approaches have been applied to predict the course of infection from epidemiological datasets [12] as well as from environmental conditions [13]. Furthermore, in recent years supervised ML has also been used in drug discovery and development [14][15][16]. Approaches employing supervised ML are valuable, but they must be applied with prudence: to become accurate they need large and reliable data sets containing high-quality data, and the data must be correctly categorized [17].
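To make the distinction between classification and regression concrete, the following minimal sketch uses only the Python standard library. A nearest-centroid classifier and an ordinary least-squares line are used here purely as simple stand-ins for any supervised method; the function names and the toy data are hypothetical and are not taken from the cited studies.

```python
from statistics import mean

def fit_nearest_centroid(features, labels):
    """Classification: learn one centroid (mean feature vector) per class label."""
    grouped = {}
    for x, y in zip(features, labels):
        grouped.setdefault(y, []).append(x)
    return {y: [mean(col) for col in zip(*rows)] for y, rows in grouped.items()}

def predict_class(centroids, x):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))

def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical training set: two features per sample and a categorical label.
X = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
y = ["no-disorder", "no-disorder", "disorder", "disorder"]
centroids = fit_nearest_centroid(X, y)
predicted_label = predict_class(centroids, [2.9, 3.0])

# Hypothetical regression data: a continuous target (e.g., a dose-like value).
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

In both cases the model is fitted on labelled observations and then applied to a new sample; only the type of output (a category versus a continuous value) differs.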
On the other hand, unsupervised ML models aim at identifying relationships in data that we would not otherwise see. In this case, the data sets contain features but no labels. As a result, the unsupervised ML algorithm must produce groups and classes based on similarities within the data set. Unsupervised ML, in contrast to supervised ML, predicts unknown outcomes, uncovering previously undiscovered patterns.
Unsupervised ML is exemplified by clustering, the process of dividing data into various groups or clusters. Accordingly, when exact information about the clusters is unknown, unsupervised ML can be used to find them [18]. Various scientific fields benefit from the application of unsupervised ML. For instance, in a recent report, an unsupervised ML technique was applied to identify subjects with a high likelihood of dementia in population-based surveys, without the need for a medical diagnosis of dementia in a subsample [19]. Another study investigated healthcare professionals' attitudes toward a digital simulator, technology, and mindset to elucidate their effects on neonatal resuscitation performance in simulation-based assessments [20]. In general pathology, unsupervised ML is becoming a crucial tool for accelerating the transition to autonomous pathological tissue analysis [21]. In another study, an unsupervised ML approach was used to discover patient clusters based on genetic signatures [22]. In drug discovery and development as well, unsupervised ML has been successfully applied in atomistic simulations and to understand the behavior of chemicals (e.g., drugs) and materials [23]. Recently, in a randomized clinical study, unsupervised ML was applied to cluster septic patients in order to determine optimal treatment (NCT03752489). To illustrate the difference between the two approaches: a supervised ML model can be used to identify which patients will develop a given disorder, a known entity, whereas an unsupervised ML model can identify unknown subgroups of patients suffering from a given pathology, since unsupervised models assume that the output labels are unknown. Most computer-based models incorporated into clinical workflows as clinical decision support are supervised ML models.
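As an illustration of clustering without labels, the sketch below implements plain k-means in standard-library Python. It is a generic textbook example, not the method of any cited study; the `patients` data are invented, and taking the first k points as initial centroids is an assumption made only to keep the example deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on a list of equal-length numeric tuples.

    Initial centroids are simply the first k points (an assumption made to
    keep this sketch deterministic; real analyses use better initialisation).
    """
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical two-feature measurements for six unlabelled patients.
patients = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
labels, centres = kmeans(patients, k=2)
```

No outcome labels are supplied: the two groups emerge only from the similarity of the feature values, which is the defining property of the unsupervised setting described above.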
To improve the performance of ML models, unsupervised and supervised ML can be combined in semi-supervised ML (Figure 3). Ma and colleagues successfully reported a combination of the two strategies for phenotyping complex diseases. They applied this technique to obstructive sleep apnea, highlighting that a phenotyping framework built by combining unsupervised and supervised ML techniques can be used to phenotype patients with other heterogeneous, complex diseases and to distinguish significant features of high-risk phenotypes [24]. Omta and coworkers showed that combining unsupervised and supervised ML-based tools greatly increases the capacity to detect new knowledge in functional genomics screening. First, unsupervised exploratory ML models are applied to the dataset to gain better insight into the quality of the data. This step improves the selection and labeling of data for establishing reliable training sets prior to applying supervised ML. To demonstrate the validity of the approach, they used a high-content genome-wide small interfering RNA (siRNA) screen. By applying unsupervised ML models, they readily identified four robust phenotypes, which were then used as a training set for developing a high-quality random forest (RF) ML tool for differentiating the four phenotypes (accuracy = 91.1%; kappa = 0.85). The reported approach significantly improved the ability to obtain novel information from a screen compared with the use of unsupervised ML techniques alone [25].
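The accuracy and kappa values quoted above are both derived from a confusion matrix. The following sketch shows how the two statistics can be computed, assuming that "kappa" refers to Cohen's kappa; the four-class matrix is invented for illustration and does not reproduce the data of [25].

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical confusion matrix for four phenotypes (10 samples per class).
demo_confusion = [
    [9, 1, 0, 0],
    [0, 9, 1, 0],
    [0, 0, 9, 1],
    [1, 0, 0, 9],
]
demo_accuracy, demo_kappa = accuracy_and_kappa(demo_confusion)
```

Kappa corrects the raw accuracy for the agreement expected by chance, which is why it is lower than the accuracy for the same predictions.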
However, it is important to highlight that the accuracy of these analyses depends strongly on the quality of the training sets employed to generate the ML models.
Finally, the reinforcement ML method allows the computational tool to learn from its failures, generating an algorithm based on what it has learned. This learning is thus built upon a trial-and-error process [26]. In the scientific field, for example, tasks can include training an algorithm to learn treatment regimens from medical registry data and to find the optimal strategy for treating patients with chemotherapy. A recent study successfully reported the use of a reinforcement ML model to establish effective clinical trial dosing: the algorithm learned appropriate dosing regimens for reducing tumor diameters in patients treated with chemotherapy and radiation [27]. A schematic illustration of the mentioned ML approaches is reported in Figure 3.
validation showed that, following this protocol, the authors selected a series of structurally different GSK-3β inhibitors. Among the retrieved active compounds, a selective small-molecule inhibitor (ruboxistaurin, CHEMBL91829) with activity against GSK-3β (IC50 = 97.3 nM) and GSK-3α (IC50 = 695.9 nM) deserves particular attention. This interesting approach highlights the valuable help of ML in accelerating the drug discovery process for finding effective AD therapeutic agents [53]. Fang and coworkers combined Bayesian ML and recursive partitioning (RP) algorithms to build classifiers that predict the activity of molecules on 25 crucial cellular targets in AD, applying a multitarget-quantitative structure-activity relationships (multi-QSAR) approach. The authors first described the selected molecules with two types of fingerprint descriptors, namely ECFP6 and MACCS, and then built one hundred classifiers. The performance was assessed by internal and external validation (area under the ROC curve for the test sets 0.741-1.0, average 0.965). These values are indicative of robust models.
The validated computational tools were used to predict the possible targets of six approved anti-AD drugs and 19 known active molecules within the AD framework. Experimental validation confirmed the predictions, with the identification of various multitarget-directed ligands (MTDLs) against AD (seven acetylcholinesterase (AChE) inhibitors (IC50 = 0.442-72.26 μM); four histamine receptor 3 (H3R) antagonists (IC50 = 0.308-58.6 μM)). Among the retrieved active compounds, the best MTDL, namely DL0410, behaved as a dual cholinesterase inhibitor (IC50 AChE = 0.442 μM; IC50 BuChE = 3.57 μM). Moreover, DL0410 behaved as an H3R antagonist with an IC50 of 0.308 μM. Remarkably, this work could have implications for MTDL research against other disorders [72]. Remaining in the AD context, Rodriguez and colleagues reported the development of DRIAD (Drug Repurposing In AD), an ML-based strategy for quantifying possible relationships between the severity of AD pathology (the Braak stage) and molecular mechanisms as recorded in lists of gene names. The authors applied DRIAD to lists of genes arising from perturbations of differentiated human neural cells by 80 FDA-approved and investigational drugs, identifying potential drugs for repurposing. Top-ranked drugs were experimentally evaluated against their targets. Interestingly, the results showed that 33 FDA-approved drugs can be used for repurposing immediately. Notably, these selected drugs, after supplementary validation and identification of significant pharmacodynamic biomarkers, could be immediately investigated in human clinical trials [59]. Considering another neurodegenerative disorder, Parkinson's disease (PD), Shao and collaborators described an integrated computational platform based on two in silico methods. The ML approach was represented by SVM models coupled with Tanimoto similarity-based clustering analysis.
Following this strategy, the authors investigated the possibility of identifying molecules with an indole-piperazine-pyrimidine scaffold able to modulate the human adenosine receptor A2A and human dopamine receptor D2 subtypes. They identified two compounds that behaved as multifunctional ligands against human A2A (Ki = 8.7 and 11.2 μM) and D2 receptors (EC50 = 22.5 and 40.2 μM). Furthermore, the retrieved hit compounds were devoid of mutagenicity (up to 100 μM), cardiotoxicity, or hepatotoxicity (up to 30 μM) issues, and one molecule improved movement and mitigated the loss of dopaminergic neurons in Drosophila models of PD [73]. In the same field, Michielan and coworkers reported a different application of the SVM and Support Vector Regression (SVR) methods for describing the A2A versus A3 receptor subtype selectivity profile as well as the related binding affinities. The authors implemented an integrated SVM-SVR method built on molecular descriptors encoding the Molecular Electrostatic Potential (autoMEP). In this way, the computational tool can simultaneously distinguish A2A from A3 receptor antagonists and predict their binding affinity to the corresponding receptor subtype for a large dataset composed of pyrazolo-triazolo-pyrimidine derivatives. The in silico approach was experimentally validated by synthesizing 51 novel pyrazolo-triazolo-pyrimidine-containing compounds, which confirmed the predicted receptor subtype selectivity and the related binding affinity profiles [74].
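The Tanimoto similarity-based clustering mentioned above for the PD platform rests on a simple set-overlap measure between molecular fingerprints. The sketch below computes the Tanimoto coefficient on sets of "on" bits and groups molecules with a greedy leader algorithm. The clustering algorithm actually used in [73] is not specified here, so the leader scheme, the 0.5 threshold, and the toy fingerprints are all assumptions chosen for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)

def leader_clusters(fingerprints, threshold=0.5):
    """Greedy leader clustering (an illustrative stand-in).

    A molecule joins the first cluster whose leader is at least `threshold`
    similar to it; otherwise it becomes the leader of a new cluster.
    """
    leaders, clusters = [], []
    for idx, fp in enumerate(fingerprints):
        for c, leader in enumerate(leaders):
            if tanimoto(fp, leader) >= threshold:
                clusters[c].append(idx)
                break
        else:
            # No existing leader was similar enough: start a new cluster.
            leaders.append(fp)
            clusters.append([idx])
    return clusters

# Hypothetical fingerprints given as sets of bit indices.
demo_fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {10, 11, 13}, {1, 2, 3, 4, 5}]
demo_clusters = leader_clusters(demo_fps)
```

In practice the fingerprints would come from a cheminformatics toolkit; plain Python sets are used here only to keep the sketch self-contained.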
In anticancer research, Deshmukh and colleagues employed two ML algorithms (SVM and RF) to generate four classification models from a large set of PubChem bioassay data on putative human Flap endonuclease 1 (FEN1) inhibitors and non-inhibitors. FEN1 is a crucial protein in DNA replication and repair processes. Accordingly, the inhibition of Flap cleavage results in increased cellular sensitivity to DNA-damaging agents (e.g., cisplatin, temozolomide), with the possibility of improving cancer prognosis. Since FEN1 is overexpressed in several kinds of tumors, FEN1 inhibitors could represent efficacious anticancer agents. To develop the mentioned ML models, the authors used large, freely accessible high-throughput screening data on small molecules targeting FEN1. The findings showed that the SVM model with inactive molecules was superior to RF, with a Matthews correlation coefficient (MCC) of 0.67 for the test set. The computational tool was subsequently used in a virtual screening of the Maybridge database (53,000 molecules). Five top-ranked compounds were experimentally validated: the selected hit compounds were tested against the enzyme and in a cell-based system. The molecule JFD00950 behaved as a FEN1 inhibitor in the micromolar range, inhibiting Flap cleavage activity. Moreover, JFD00950 showed cytotoxic activity against colon cancer cells (DLD-1, IC50 = 16.7 μM) [75]. Another drug target for developing anticancer drugs was explored by Zhang and colleagues. They investigated a promising target for cancer immunotherapy, indoleamine 2,3-dioxygenase (IDO). The authors generated ML models using naive Bayesian and RP techniques from a library of established IDO inhibitors. To build the models for predicting IDO inhibitors, they used descriptors based on 13 molecular fingerprints.
The best-performing ML computational tool was used in a virtual screening campaign on a proprietary chemical library. This step provided 50 potential IDO inhibitors that were experimentally validated. In vitro tests confirmed the predictions of the ML model, since three new IDO inhibitors belonging to the tanshinone family were identified (IC50s = 1.30, 4.10, and 4.68 μM) [76]. Kang and coworkers attempted to target VEGFR-2, a well-established target for developing anticancer compounds with anti-angiogenic activity. The authors developed an ML model using the naive Bayesian technique coupled to a molecular docking calculation, obtaining a virtual screening protocol that was used to identify VEGFR-2 inhibitors in a chemical library containing FDA-approved drugs. The most promising naive Bayesian model showed Matthews correlation coefficients of 0.966 and 0.951 for the test set and the external validation set, respectively. Accordingly, 1841 FDA-approved drugs were screened with the developed computational model and subsequently submitted to molecular docking calculations with LibDock. The virtual screening provided 9 top-ranked drugs with an EstPGood value ≥ 0.6 and a LibDock Score ≥ 120, which were submitted to biological evaluation. VEGFR-2 kinase test results showed that papaverine, rilpivirine, and flubendazole were able to inhibit VEGFR-2 (IC50s = 0.47-6.29 μM). Notably, the integrated screening platform provided 3 FDA-approved drugs as new VEGFR-2 inhibitors that can be rapidly translated into clinical studies [77]. Montanari and coworkers applied four distinct ML algorithms (LR, naive Bayesian, SVM, and RF) to train models for identifying novel agents acting as breast cancer resistance protein (BCRP) inhibitors. BCRP is involved in multidrug resistance (MDR); thus, the development of BCRP inhibitors that increase the concentration of antitumor agents in resistant cancer cells has been proposed as a valuable tactic for overcoming MDR.
The developed model was validated, showing good predictivity in cross-validation (area under the ROC curve = 0.9) and satisfactory predictivity in prospective validation (area under the ROC curve = 0.7). Subsequently, the computational tool was employed in a virtual screening of a drug library (1702 compounds). Following this strategy, the authors identified 10 drugs as potential BCRP inhibitors to submit to biological evaluation (inhibition of mitoxantrone efflux in BCRP-expressing PLB985 cells). Among the drugs tested, two behaved as BCRP inhibitors (cisapride and roflumilast, IC50 = 0.4 μM and 0.9 μM, respectively) [78]. Allen and collaborators used an ML model, based on Laplacian-modified naive Bayesian classifiers developed from topological fingerprints, in a virtual screening campaign on a large database (eMolecules, > 6 million compounds) for selecting dual kinase/bromodomain (EGFR/BRD4) inhibitors. Two ML models for EGFR were developed using extended connectivity fingerprints (ECFP4) based on a total of 591,744 unique kinase compounds: one with 3,058 active molecules characterized by a pIC50/pKi ≥ 7, and another with 4,785 active compounds with pIC50/pKi ≥ 6. The two models showed exceptional area under the ROC curve values of 0.98 to 0.99 based on a 50/50 training/test split and assessed with leave-one-out cross-validation. The enrichment factors at 1% of the dataset were 78 and 66, respectively. The ML model for the kinase was coupled to structure-based techniques for the bromodomain. This computational protocol allowed the identification of various BRD4 inhibitors. Among them, a first-in-class dual EGFR-BRD4 inhibitor (compound 2870) was found (EGFR IC50 = 44 nM; ERBB2, ERBB4, and BRD4 IC50 = 8.73, 24.2, and 9.02 μM, respectively) [79].
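Several validation statistics recur in the studies summarized in this section: the Matthews correlation coefficient (MCC), the area under the ROC curve, and the enrichment factor. The sketch below shows one common way to compute each of them in standard-library Python. The function names and the toy scores are hypothetical, and individual studies may define the enrichment factor slightly differently (for example, in how the top fraction is rounded).

```python
from math import sqrt

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(scores, labels):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen active (label 1)
    scores higher than a randomly chosen inactive (label 0); ties count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def enrichment_factor(scores, labels, fraction=0.01):
    """Hit rate in the top-scoring fraction divided by the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * fraction)))
    top_hits = sum(y for _, y in ranked[:n_top])
    return (top_hits / n_top) / (sum(labels) / len(labels))

# Hypothetical model scores for eight compounds (1 = active, 0 = inactive).
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
demo_labels = [1, 1, 0, 1, 0, 0, 0, 0]
demo_auc = roc_auc(demo_scores, demo_labels)
demo_ef = enrichment_factor(demo_scores, demo_labels, fraction=0.25)
```

An enrichment factor of 78 at 1%, as quoted above, therefore means that actives are 78 times more frequent among the top-scoring 1% of compounds than in the dataset as a whole.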
In the field of parasitic and neglected tropical diseases, ML-based approaches can be useful for identifying novel effective therapeutic agents, as recently reported [80]. Here we highlight only representative works that illustrate the technology. Keshavarzi Arshadi and colleagues developed an ML model based on a GCNN algorithm. GCNNs have demonstrated strong accuracy in predicting the chemical properties of molecules. These ML-based computational models transform molecules into graphs and learn higher-level abstract representations of the input solely from the data [81]. In the above-mentioned research, a GCNN represents the core of a new AI platform called DeepMalaria, which aims to speed up antimalarial drug discovery. The characteristic capabilities of GCNNs are employed to implement a virtual screening pipeline. A graph-based model was trained on 13,446 potential antimalarials contained in the GlaxoSmithKline database. The developed model was validated by predicting hit molecules from an additional chemical collection and an FDA-approved drug database. The molecules were also tested in vitro to validate the ML-based model. DeepMalaria identified all molecules showing nanomolar activity and 87.5% of the chemicals with a percentage of inhibition greater than 50%. Additional tests to uncover the mechanism of action of the compounds showed that one of the hit molecules, DC-9237, not only inhibits all asexual stages of Plasmodium falciparum but is also fast-acting, making it a robust drug candidate for optimization [82]. Furthermore, a very interesting ML-based approach was reported by Stokes and collaborators, who applied a DL method to the discovery of novel antibiotic agents. Due to the tremendous impact of antibiotic resistance in clinical practice, there is an urgent need for novel chemicals able to inhibit multidrug-resistant bacteria [83].
In this work, the scientists trained a DNN model on a dataset of 2,335 molecules to identify compounds with a broad-spectrum antibacterial profile. The resulting computational tool exhibited an area under the ROC curve of 0.896 on the test data. The authors then employed the model to screen various chemical libraries. From this screening step, they identified an existing drug, halicin (SU-3327, developed for inhibiting c-Jun N-terminal kinase (JNK)). Remarkably, the structure of this compound is totally different from those of classical antibiotic agents. Moreover, halicin was demonstrated to possess interesting bactericidal activity in vitro as well as in vivo. Characterization of its antibiotic mechanism of action revealed that halicin can dissipate the transmembrane ΔpH potential in bacteria, and it was found to be very efficacious against M. tuberculosis. Moreover, the developed ML model was used to screen over 100 million compounds from the ZINC15 database. This additional screening provided a further eight antibacterial agents, chemically unrelated to well-known antibiotics. Among them, two compounds (ZINC000100032716 and ZINC000225434673) showed strong broad-spectrum activity and overcame a range of frequent resistance factors. This approach was the first effective application of a DNN to drug repurposing and to the discovery of new drug lead compounds. The findings indicated that ML approaches can be relevant for identifying novel antibiotic agents that counteract the spread of resistance, while decreasing the resources required to discover these compounds and the associated costs [84].
In another investigation, Li and colleagues generated ML models, employing naïve Bayesian and RP techniques based on physicochemical descriptors and structural fingerprints, aimed at identifying novel DNA gyrase inhibitors for the development of broad-spectrum antibacterial agents, since bacterial DNA gyrase is not expressed in eukaryotic cells. The overall predictive accuracies for the training and test sets were greater than 80%. The authors used eleven promising ML models for the virtual screening of a chemical library. The potential hits selected by virtual screening were experimentally validated against Escherichia coli, methicillin-resistant Staphylococcus aureus and other bacteria, and against DNA gyrase. For compounds able to inhibit DNA gyrase, MIC values ranged between 1 and 32 μg/mL, and the relative inhibition rates ranged from 42% to 75% at 1 μM [85].
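Naive Bayesian classifiers built on structural fingerprints appear repeatedly in the studies above. As a rough illustration of the idea, the sketch below fits a textbook Bernoulli naive Bayes model with add-one (Laplace) smoothing to fingerprints represented as sets of "on" bits. This is a generic stand-in, not the Laplacian-modified variant or the exact implementation used in the cited works; all names and data are hypothetical, and the sketch assumes that both classes are present in the training set.

```python
from math import log

def train_bernoulli_nb(fingerprints, labels, n_bits):
    """Fit a Bernoulli naive Bayes model with add-one (Laplace) smoothing.

    fingerprints: list of sets of 'on' bit indices.
    labels: 1 = active, 0 = inactive (both classes must be present).
    """
    model = {}
    for cls in (0, 1):
        members = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        prior = len(members) / len(labels)
        # Smoothed probability that each bit is 'on' within this class.
        p_on = [
            (sum(bit in fp for fp in members) + 1) / (len(members) + 2)
            for bit in range(n_bits)
        ]
        model[cls] = (prior, p_on)
    return model

def log_odds_active(model, fp, n_bits):
    """Log posterior odds of 'active' versus 'inactive' for one fingerprint."""
    score = {}
    for cls, (prior, p_on) in model.items():
        s = log(prior)
        for bit in range(n_bits):
            s += log(p_on[bit]) if bit in fp else log(1 - p_on[bit])
        score[cls] = s
    return score[1] - score[0]

# Hypothetical 4-bit fingerprints: two actives and two inactives.
train_fps = [{0, 1}, {0, 2}, {3}, {2, 3}]
train_labels = [1, 1, 0, 0]
nb_model = train_bernoulli_nb(train_fps, train_labels, n_bits=4)
score_like_active = log_odds_active(nb_model, {0, 1}, n_bits=4)
score_like_inactive = log_odds_active(nb_model, {3}, n_bits=4)
```

In a virtual screening setting, library compounds would be ranked by such a score and the top-ranked ones selected for experimental testing.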
In the context of antiviral research, Ekins and collaborators developed a Bayesian ML model from viral pseudotype entry assay and Ebola virus replication assay data. The developed model was submitted to internal and external validation. The scientists employed this model in a virtual screening campaign on the MicroSource library of drugs to select possible antiviral compounds. Among the retrieved potential hit compounds, three promising antiviral candidates (quinacrine, pyronaridine, and tilorone) were experimentally validated, with EC50 values of 350, 420, and 230 nM, respectively, against Ebola virus replication. Notably, pyronaridine is a component of a combination therapy for malaria recently approved by the European Medicines Agency (EMA); consequently, it could be immediately used for clinical testing. This study also highlighted how ML models can be used to speed up the preclinical step of the drug discovery trajectory, providing drugs for translational research [86].
Remarkably, ML approaches, especially those based on reinforcement learning, can be useful for developing models for the de novo design of small molecules with desired pharmacological profiles [87][88][89]. Here we briefly report some representative applications of this methodology. Recently, Zhavoronkov and coworkers reported the development of a deep generative model, generative tensorial reinforcement learning (GENTRL), for the de novo design of small molecules acting as inhibitors of discoidin domain receptor 1 (DDR1) kinase, which is involved in fibrosis and other disorders. To develop GENTRL, the authors combined reinforcement learning, variational inference, and tensor decompositions into a generative two-step ML algorithm. In the first step, the scientists learned a mapping from chemical space, a set of discrete molecular graphs, to a continuous space of 50 dimensions, parameterizing the structure of the learned manifold in the tensor train format to make use of partially known features. The computational model was generated using six data sets: (i) a large set of compounds from a ZINC database; (ii) known inhibitors of DDR1 kinase; (iii) common kinase inhibitors (positive set); (iv) compounds active against non-kinase target proteins (negative set); (v) patent data of pharmaceutical companies regarding biologically active compounds; (vi) 3D structures for DDR1 inhibitors. In the second step, they explored the mapped chemical space with reinforcement learning to discover novel molecules against a selected target. The results showed that GENTRL is capable of optimizing synthetic accessibility, novelty, and bioactivity. In the reported paper, GENTRL proposed several compounds for synthesis, and the authors synthesized six lead compounds, which were experimentally evaluated for their inhibitory potential against DDR1.
Notably, two molecules strongly inhibited DDR1 activity (IC50 = 10-21 nM), two other compounds showed moderate potency (IC50 = 0.278-1 μM), while the remaining two molecules were found to be inactive. Moreover, the best-performing compounds demonstrated good selectivity for DDR1 over DDR2, and one of them was highly selective against a panel of 44 diverse kinases.
Interestingly, these two compounds inhibited the induction of fibrotic markers (α-actin and CCN2) in MRC-5 lung fibroblasts. These chemical entities were also able to inhibit the expression of collagen (a hallmark of fibrosis) in LX-2 hepatic stellate cells [90]. McCloskey and coworkers described an effective ML platform aimed at accelerating the drug discovery pipeline using DNA-encoded small molecule library (DEL) selection data. Two types of ML models, RF and GCNN, were trained on the DEL selection data to classify molecules (over 2,000). The ML models were trained on the aggregated selection data (using no prior off-DNA activity measurements). The computational tool was applied to three drug targets (sEH (a hydrolase), ERα (a nuclear receptor), and c-KIT (a kinase)) and used in the virtual screening of large chemical databases (∼88 million compounds). The outcomes revealed that the technique is efficient, with a global hit rate of ∼30% at 30 μM, discovering potent compounds (IC50 < 10 nM) for each drug target [91]. Lastly, a novel ML approach based on DL and reinforcement learning for the de novo design of small molecules with a desired profile was presented by Popova and coworkers. This computational tool, named ReLeaSE (Reinforcement Learning for Structural Evolution), combines two DNNs (generative and predictive) that are trained independently but employed together to generate new focused chemical libraries. The methodology was divided into two phases. In the first, a supervised learning algorithm was employed to train the generative and predictive models separately. The second phase consisted of joint training of both models with reinforcement learning to bias the generation toward new chemicals with the desired physical and biological profile.
In this work, the authors applied ReLeaSE for generating a series of libraries containing chemical entities with a precise profile: (a) satisfactory drug-likeness, for which the authors chose Tm and the n-octanol/water partition coefficient (logP) as physicochemical properties; (b) desired biological activity, for which the authors selected Janus protein kinase 2 (JAK2) as the target protein; (c) novel chemotypes with significant chemical complexity, which should guarantee a higher selectivity for the selected target. In particular, the number of benzene rings and the number of substituents were employed as structural rewards for designing focused libraries of chemically complex molecules [87].

Drug targets prediction and biomarkers identification
Notably, in addition to the previously discussed ML approaches for identifying promising drug candidates, AI techniques are also emerging in drug target prediction, with remarkable success. For instance, in the field of neurodegenerative disorders, we report here one of the most significant advancements in ML approaches applied to drug target identification in the drug discovery/drug repurposing field. A computational model based on DL methodology, namely deepDTnet, was successfully used in a repurposing approach, providing interesting hints for treating multiple sclerosis [92]. DeepDTnet was conceived for identifying novel drug targets and for drug repurposing, and it relies on a heterogeneous drug-gene-disease network embedding fifteen categories of chemical, genomic, phenotypic, and cellular network profiles. DeepDTnet was generated using 732 FDA-approved drugs for training. Subsequent validation analysis showed that deepDTnet was accurate in identifying innovative cellular drug targets for marketed drugs (area under the ROC curve = 0.963). The experimental validation was performed on the prediction obtained for topotecan (a topoisomerase-I inhibitor), a chemotherapeutic agent approved to treat various forms of cancer, such as lung and ovarian cancer [93][94][95]. In fact, topotecan was predicted by deepDTnet to be an inhibitor of the human retinoic-acid-receptor-related orphan receptor-gamma t (ROR-γt), a promising drug target for treating different disorders including psoriasis, multiple sclerosis, and rheumatoid arthritis [96,97]. In agreement with the computational output, topotecan was found to inhibit ROR-γt (IC50 = 0.43 μM) and notably showed potential therapeutic effects in multiple sclerosis, being efficacious in reverting the pathological phenotype in vivo in the EAE mouse model at 10 mg/kg [92]. In the framework of drug target identification, Madhukar and colleagues developed a Bayesian ML algorithm, namely BANDIT (Bayesian ANalysis to determine Drug Interaction Targets). 
This computer-based tool combines various kinds of data for predicting drug targets (e.g., 20 million data points derived from six diverse types of data such as drug efficacy, post-treatment transcriptional response, drug structure, described undesirable effects, bioassay results, and well-established targets). Using over 2,000 compounds, BANDIT showed an accuracy of ~90% in identifying correct targets. Next, the authors applied this computational platform to over 14,000 molecules for which no target was known. The ML-based tool produced ~4,000 previously undisclosed molecule-target predictions. Considering the most promising data, the authors validated fourteen molecules predicted as microtubule binders. Among this subset, three compounds were highlighted for their activity against resistant tumor cells. Experimental validation fully supported the BANDIT predictions. Moreover, BANDIT was applied to ONC201, an anticancer agent in clinical development with an unknown target. The algorithm predicted ONC201 to be an antagonist of dopamine receptor 2 (DRD2). The target was experimentally validated, confirming the prediction, and this finding was the basis for designing an appropriate clinical trial using ONC201. ONC201 will be evaluated for its efficacy in pheochromocytomas, a rare cancer in which an overexpression of DRD2 was observed (NCT03034200). Lastly, BANDIT identified links among distinct classes of drugs, revealing undisclosed clinical observations and highlighting novel possibilities for drug repurposing. According to these findings, BANDIT is a useful screening platform that can efficiently speed up the drug discovery process, accelerating translational research toward clinical application [98]. Dezső and Ceccarelli reported the development of an ML-based approach for scoring proteins in order to generate a druggability score for novel, unidentified drug targets. 
The authors included in the ML model 70 features obtained from drug targets (e.g., features indicating protein functions, features extracted from the sequence, and network features obtained from the protein-protein interaction network). They generated 10,000 ML models based on the RF algorithm using a training set built from drug targets in complex with marketed drugs (102 targets) and a "negative" set comprising 102 non-drug targets. The ML models are able to detect relevant combinations of the included features that discriminate drug targets from non-pharmacological targets. The approach was validated using an independent external test set of clinically relevant drug targets (277 targets), attaining a high accuracy, as indicated by an area under the ROC curve of 0.89. The output reported in this work provided new potential drug targets for developing innovative anticancer drugs [99].
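Since the area under the ROC curve is the performance measure quoted for deepDTnet and for the druggability models above, a minimal sketch of how it can be computed may be useful. The function below uses the rank-based (Mann-Whitney) identity: the area equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative. The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the cited works.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: iterable of 1 (e.g., drug target) or 0 (non-target)
    scores: iterable of model scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three targets and three non-targets.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the values of 0.963 and 0.89 reported above.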

AI/ML in quantitative systems pharmacology (QSP)
Following the identification of prospective therapeutic drug targets, analyses must be performed to validate them. Computational approaches offer low-cost, time-saving strategies to evaluate the likelihood that a potential target could provide an efficient way of treating a given disorder. Accordingly, a pivotal step in target validation is the construction of a confidence interval for a given potential therapeutic hypothesis employing quantitative systems pharmacology (QSP) models [100]. QSP combines biological pathways, pharmacology, and mathematical models for drug development. It has the potential to make a considerable impact on modern medicine through the discovery and deployment of new molecular pathways and drug targets in the quest for innovative therapeutic agents and personalized medicine. The combination of these specialties is attracting substantial attention from pharma companies seeking to expand predictions from a pharmacodynamic (PD) and pharmacokinetic (PK) perspective and, thanks to improvements in computing capacity, QSP is now capable of improving outcomes along the drug discovery trajectory. In fact, QSP models can combine information on PK/PD properties, biological processes of interest, and mechanisms of action, derived from prior knowledge and available preclinical and clinical data, to quantitatively predict efficacy and safety responses over time and to translate molecular data into clinical outcomes [101][102][103][104]. QSP provides a quantitative framework for integrating different big data sources, including omics (i.e., proteomics, transcriptomics, metabolomics, and genomics) and imaging, the dimensionality of which can be reduced using ML methods. By allowing the identification of relevant associations and data representations, ML can further support the development of QSP platforms with higher granularity and enhanced predictive power [105]. 
Moreover, implementing QSP platforms with ML techniques enhances the capacity to handle big data and can offer great opportunities for systems pharmacology modeling. In fact, given the high availability of processed and organized data for building interpretable and actionable computational models that support decision making throughout drug discovery and development, QSP can improve the reliability of predictions, providing more complex analyses and a better understanding of biomedical systems, and ultimately leading to the design of optimized treatments. We report below some examples of this approach.
In a recent work, Ramm and collaborators took advantage of systems biology methods coupled with multidimensional datasets and ML for identifying biomarkers that predict nephrotoxic molecules and for characterizing their mechanism of toxicity in vitro, employing primary human kidney cells. ML using the RF technique was applied for systematically identifying genes and imaging features from 46 different nephrotoxic compounds. From this analysis, the authors acquired information regarding changes in cell morphology as well as mRNA levels, finding and validating HMOX1 and SQSTM1 as nephrotoxicity biomarkers. Furthermore, the RF algorithm was trained and validated using clinical observations of kidney toxicity and employed for nephrotoxicity classification (class labels: nontoxic = 0 (10 instances, including 8 molecules, DMSO, and medium controls) or toxic = 1 (38 molecules)). The developed computational model was capable of discriminating nephrotoxic from non-nephrotoxic molecules, and a hierarchical clustering approach that included chemicals with an established mechanism of action made it possible to detect potential mechanisms of toxicity of drug candidates [106].
Notably, the identification of appropriate and efficacious therapies for treating a given pathology is extremely important. Computational models can help with this issue, also by predicting the responsiveness of patients to a given treatment. In an interesting work, Song and coworkers reported the development and validation of a large-scale bidirectional generative adversarial network for predicting tyrosine kinase inhibitor (TKI) response in patients with stage IV EGFR variant-positive non-small cell lung cancer. In this diagnostic/prognostic study, 465 patients were enrolled, and a DL semantic signature for predicting progression-free survival (PFS) was built in the training group. The computational approach was validated employing two external validation groups and two control groups, and it was compared with the radiomics signature. Briefly, 342 subjects with stage IV EGFR variant-positive non-small cell lung cancer receiving EGFR-TKI therapy met the inclusion criteria. Of these, 145 patients from two hospitals (n = 117 and 28) were included in the training group, and the patients from two additional hospitals established two external validation groups (validation cohort 1: n = 101; validation cohort 2: n = 96). In addition, 56 patients with advanced-stage EGFR variant-positive non-small cell lung cancer and 67 patients with advanced-stage EGFR wild-type non-small cell lung cancer who received first-line chemotherapy were included. A total of 90 subjects (26%) receiving EGFR-TKI therapy with a high risk of rapid disease progression were detected by applying the DL semantic signature. When compared with the other patients in the validation groups, their PFS dropped by 36% (hazard ratio, 2.13; 95% CI, 1.30-3.49; P < .001). When comparing the PFS of high-risk patients receiving EGFR-TKI treatment with that of the chemotherapy groups, no significant difference was detected (median PFS, 6.9 vs 4.4 months; P = .08). 
In terms of predicting the risk of tumor progression after EGFR-TKI therapy, the decision curve analysis showed that clinical decisions based on the DL semantic signature led to better survival outcomes than those based on the radiomics signature across all risk probabilities [107]. Very recently, Lu and collaborators described a significant DL-based approach for predicting the patient response time course from early data via neural-PK/PD modelling. Currently, analyses of patient response following doses of therapeutics are conducted employing standard PK/PD methods that require considerable human scientific expertise. Interestingly, DL can be applied to systems pharmacology to build PK/PD models that learn the governing equations directly from data, predicting the patient response time course and simulating the effects of unseen dosing regimens. Accordingly, in this new methodology the authors combined crucial pharmacological rules with neural ordinary differential equations. This neural-PK/PD model was used for analyzing the drug concentration and platelet response in a clinical dataset comprising over 600 patients. In particular, the computational strategy was applied to predict drug concentration and platelet dynamics after treatment with trastuzumab emtansine (intravenous administration at 3.6 mg/kg once every three weeks) for human epidermal growth factor receptor 2 (HER2)-positive metastatic breast cancer in subjects who had previously failed treatment with trastuzumab and a taxane. The outcomes demonstrated that the computational model is able to predict the patients' responses and also to simulate them under untested dosing regimens. 
These findings show the potential of neural-PK/PD for automated predictive analytics of the patient response time course, suggesting that AI/ML approaches can support clinical pharmacologists, who in the near future may use neural-PK/PD as an advanced analytics tool for understanding and predicting drug concentration and response in order to make dosing recommendations [108].
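To make the notion of "governing equations" more concrete, the following is a minimal sketch of a classical one-compartment PK model with first-order elimination, dC/dt = -k·C, under repeated intravenous bolus dosing. It is not the neural-PK/PD model of the cited work: in that approach, as described above, the right-hand side of the differential equation is learned from data rather than fixed in advance. The dose (3.6 mg/kg) and dosing interval (21 days) follow the regimen mentioned in the text, whereas the volume of distribution and the elimination rate constant are purely hypothetical values chosen for illustration.

```python
import math


def concentration(t_days, dose_mg_per_kg=3.6, interval_days=21.0,
                  volume_l_per_kg=0.06, k_elim_per_day=0.17):
    """Plasma concentration (mg/L) at time t_days for repeated IV bolus doses.

    One-compartment model with first-order elimination, dC/dt = -k * C.
    Each dose adds dose/volume at its administration time and then decays
    exponentially; contributions of all past doses are superposed.
    volume_l_per_kg and k_elim_per_day are hypothetical illustration values.
    """
    total = 0.0
    n = 0
    while n * interval_days <= t_days:
        elapsed = t_days - n * interval_days
        total += (dose_mg_per_kg / volume_l_per_kg) * math.exp(-k_elim_per_day * elapsed)
        n += 1
    return total


# Concentration just after the first dose, mid-cycle, and just after the second dose.
for day in (0.0, 10.0, 21.0):
    print(f"day {day:4.1f}: {concentration(day):6.2f} mg/L")

# Simulating an untested regimen only requires changing the arguments,
# e.g. a hypothetical dose of 2.4 mg/kg given every 14 days.
alt = concentration(14.0, dose_mg_per_kg=2.4, interval_days=14.0)
```

The design point illustrated here is that, once the equation is fixed (or learned), any dosing schedule can be simulated by changing the inputs, which is what allows the evaluation of regimens that were never tested clinically.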
In conclusion, it is increasingly evident that AI has emerged in the field of drug discovery and development, helping to deliver affordable and effective therapeutic treatments for common and emerging disorders, accelerating drug repurposing, and narrowing the translational gap in drug development.

General consideration
With the growing availability of large amounts of high-quality cell imaging data, there are now relevant opportunities to use ML-based methods to aid researchers in cell image processing. In fact, the image features that are supposed to be crucial in producing a prediction or diagnosis can generally be processed by ML algorithms. These offer possibilities of predictive, descriptive, and prescriptive assessment to acquire relevant information that would otherwise be impossible to obtain by human analysis, providing accurate medical diagnoses [109,110]. Accordingly, in recent years various clinical investigations have enabled the use of AI in several fields, providing general pathological classification, risk evaluation, diagnosis, prognosis, and prediction of appropriate therapy and of possible responses to a given pharmacological treatment [111,112]. In particular, DL, a class of ML that employs ANNs (CNNs and recurrent neural networks (RNNs)) resembling human cognitive capabilities, has shown clear superiority over conventional ML approaches owing to algorithm improvement, better processing hardware, and access to massive amounts of imaging data [113]. The successful incorporation of DL technology into routine clinical practice depends on its diagnostic accuracy being comparable to that of healthcare experts. Furthermore, DL model integration provides additional advantages, including speed, efficiency, affordability, increased accessibility, and ethical behavior [110]. For these reasons, the FDA has approved specific DL-driven diagnostic computational tools for clinical use (Table 1) [114][115][116]. The application of AI encompasses several medical and biomedical fields including radiology [117], gastroenterology [118,119], neurology [120,121], ophthalmology [122,123], cardiology [124,125], dermatology [126], general pathology [127], oncology [128], healthcare [129,130] and clinical medicine [131,132]. 
Abbreviations: AD - Alzheimer disease; ADHD - attention deficit hyperactivity disorder; AI - artificial intelligence; ANN - artificial neural network; ASD - autism spectrum disorder; BIS - bioimpedance spectroscopy; DL - deep learning; CNN - convolutional neural network; CT - computed tomography; DBT - digital breast tomosynthesis; EEG - electroencephalogram; ECG - electrocardiogram; LVO - large vessel occlusion; MCI - mild cognitive impairment; ML - machine learning; MRCP - magnetic resonance cholangiopancreatography; MRI - magnetic resonance imaging; OARs - organs-at-risk; OSA - obstructive sleep apnea; PET - positron emission tomography; PNX - pneumothorax.

Basic research
In this section, we illustrate some relevant and representative examples of how AI can add value in translational medicine, from research laboratories to clinical practice, speeding up the understanding of disorders (targets involved, pathophysiological mechanisms, etc.) and the translation of acquired knowledge into clinical medicine. For example, ML-based methods hold great promise in medical/cellular imaging. In cell microscopy and histopathology, observation of the slides is often complicated, so pathologists' interpretations might be inconsistent, making histopathological diagnoses problematic [236]. Conventional approaches (e.g., microscopic/biological inspection of a sample) have limitations, reducing the possibility of discovering particular biomarkers, genomic driver mutations, and patterns within a cell's subcellular apparatus [237]. Accordingly, with the aid of ML, unravelling disease heterogeneity by enhancing the cellular profiling of specific morphological features is becoming progressively possible. ML approaches that improve sample categorization allow the acquisition of undisclosed disease characteristics that cannot be identified by humans alone. To this end, Simm and collaborators described a fascinating approach in which an ML-based method was employed for predicting the activity of a given compound from images. The idea starts from the evidence that large-scale assays (e.g., high-throughput screening) for the drug discovery pipeline are costly, time-consuming, and frequently unfeasible, mainly because of the growing number of relevant physiological systems that need primary cells, organoids, or entire organisms, as well as pricey or rare reagents. The authors assumed that data from a single high-throughput imaging (HTI) assay can be repurposed for predicting the bioactivities of molecules in other assays, including those that target different biological processes or pathways. 
For that purpose, they developed a protocol for predicting the activity of compounds in several orthogonal tests. In the first step, the researchers extracted a large set of image-based fingerprints of morphological descriptors for each molecule (using the three-channel glucocorticoid receptor (GCR) HTI assay employed in the evaluation, the authors obtained an 842-dimensional feature vector per cell). The second step consisted of introducing known activity data for the orthogonal assays of interest on the considered molecules. Finally, by using a supervised ML approach they trained models, selecting the one that showed the highest predictivity. The resulting ML model was successfully used for selecting novel chemical entities for biological evaluation [238]. Another interesting work was performed by Nassar and colleagues. They reported an ML-based method (evaluating six ML algorithms: AdaBoost, Gradient Boosting (GB), k-NN, RF, and SVM) for classifying white blood cells (WBCs). Currently, the WBC count, a method for assessing the immune system status of a person, needs a flow cytometer and fluorescent markers, and therefore requires several sample preparation steps. With the proposed label-free approach, which employs only an imaging flow cytometer combined with ML methods, unstained WBCs were classified. The developed model showed good scores and was also able to discriminate B and T lymphocytes. The approach was validated by performing WBC analyses on unstained samples collected from 85 donors. Notably, the described approach allows an extremely precise classification of WBCs while avoiding cell disruption and leaving marker channels open to address further biological issues. Ultimately, the proposed method enables the use of ML for liquid biopsy, exploiting the information contained in cell morphology for several diagnostic applications on primary blood, such as the detection of tumor products or circulating tumor cells in the blood [239]. 
Coudray and colleagues applied ML algorithms for classifying and predicting mutations from histopathological images of non-small cell lung cancer. Visual inspection is the method of choice for assessing the stage, type, and subtype of lung cancers, and expert pathologists are able to distinguish adenocarcinoma (LUAD) from squamous cell carcinoma (LUSC) by visual inspection. The authors presented an ML approach based on a deep CNN trained on whole-slide images acquired from The Cancer Genome Atlas for accurately and automatically classifying them into LUAD, LUSC, or normal lung tissue. The performance of the methodology is equivalent to that of pathologists, showing an average area under the ROC curve of 0.97. The in silico model was validated on independent datasets of frozen tissues, formalin-fixed paraffin-embedded tissues, and biopsies. Additionally, the network was also trained for predicting the ten most frequently mutated genes in LUAD. Six of them (STK11, EGFR, FAT1, SETBP1, KRAS, and TP53) can be predicted from pathology images, with a significant area under the ROC curve (0.733-0.856) as determined on a held-out population. Remarkably, a similar approach based on ML models could aid pathologists in detecting gene mutations related to cancer subtypes [240]. Moreover, ML-based approaches can help to identify specific biomarkers involved in a given disease. The most fruitful computer-based approaches were recently reviewed [6,241]. To illustrate the task, we report some examples highlighting ML approaches in this field. Kang and collaborators used the Python package sklearn for building an ML-based computational model, employing the SVM technique with 10-fold cross-validation, to implement a diagnostic tool for identifying the lung cancer risk of suspected cases. 
The authors performed a comprehensive assessment of the results from genetic analysis and of critical clinical data from patients affected by lung cancer in order to build a model able to diagnose early lung cancer and to indicate tumor risk. They considered tissue samples from patients with lung cancer and from healthy persons, for a total of 70 pairs. The authors evaluated the methylation rates of six genes (FHIT, p16, MGMT, RASSF1A, APC, DAPK) in lung cancer patients, together with the critical clinical data and tumor marker concentrations of these patients. The SVM model was validated by calculating the area under the ROC curve and other statistical parameters. Based on these validation data (area under the ROC curve of 0.963, sensitivity of 0.900, specificity of 0.971, and accuracy of 0.936), the scientists demonstrated the validity of the developed method, highlighting the crucial role of ML models as diagnostic tools for the early diagnosis of cancers, which can contribute to increasing the survival rate of patients [242].
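The sensitivity, specificity, and accuracy quoted above all derive from the same four confusion-matrix counts, as the short sketch below shows. The counts used here (63 true positives, 7 false negatives, 68 true negatives, 2 false positives) are an assumption: they are not reported in the text and were chosen only because they are consistent with 70 cancer and 70 healthy samples and reproduce the quoted values after rounding. The function name is likewise illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # fraction of diseased cases detected
    specificity = tn / (tn + fp)              # fraction of healthy cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not from the cited study): 70 cancer and 70 healthy samples.
sens, spec, acc = diagnostic_metrics(tp=63, fn=7, tn=68, fp=2)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With balanced classes, as in this paired design, accuracy is simply the mean of sensitivity and specificity; with unbalanced classes it would be dominated by the larger group, which is why the three values are usually reported together.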

AI, imaging and ophthalmology
In the context of imaging for diagnosis and disease-progression assessment with ML-based techniques, ophthalmology is one of the medical fields in which these computational approaches have been successfully employed [123]. In fact, AI, principally based on DL methods, has been used to detect several ocular disorders including retinopathy of prematurity [243], diabetic retinopathy [244,245], macular oedema [246,247], age-related macular degeneration [248,249], and glaucoma [250][251][252] using fundus images, optical coherence tomography (OCT), and visual fields. Screening, diagnosis, and monitoring of major eye disorders for patients in primary care might be achievable using DL in ocular imaging combined with telemedicine. Briefly, we report here some representative examples of how ML can revolutionize diagnostics, improving the quality of diagnosis, reducing potential medical errors and the workload of medical staff, and saving time for the patients examined. Very recently, Dai and coworkers reported the development of an intriguing screening platform for detecting diabetic retinopathy. It is well established that retinal screening has a tremendous impact on the early diagnosis of retinopathy, allowing effective treatments to be started to avoid vision loss and slow down the progression of the disorder. To facilitate the screening procedure, they used an ML approach based on DL algorithms for developing a computational tool, namely DeepDR (DL Diabetic Retinopathy). DeepDR is a transfer-learning-aided multi-task network for evaluating retinal image quality, retinal lesions, and diabetic retinopathy grades. This evaluation allows the detection of early-to-late stages of diabetic retinopathy. 
DeepDR was developed using 666,383 fundus images from 173,346 patients and was trained for real-time image quality assessment, lesion detection, and grading. Specifically, 466,247 fundus images from 121,342 patients with diabetes (70%) were randomly included in the training set, while the evaluation was conducted on a local validation set consisting of 200,136 fundus images from the remaining 52,004 patients (30%) and on three external datasets containing 209,322 images. Results showed areas under the ROC curve of 0.901, 0.941, 0.954, and 0.967 for the detection of microaneurysms, cotton-wool spots, hard exudates, and hemorrhages, respectively, while the grading of diabetic retinopathy as mild, moderate, severe, and proliferative achieved significant areas under the ROC curve (0.943, 0.955, 0.960, and 0.972, respectively). Finally, in the external validation, the areas under the ROC curve ranged from 0.916 to 0.970. In summary, DeepDR showed significant accuracy and high sensitivity in detecting diabetic retinopathy from early to late stages [244]. Asaoka and colleagues reported an ML approach based on deep and transfer learning for the accurate diagnosis of early-onset glaucoma using OCT images [252]. The DL model was built starting from 4316 OCT images from 1565 eyes of patients suffering from glaucoma and 193 normal eyes, used as a pre-training dataset. A smaller set of OCT images was used to train the model (94 eyes from patients with early glaucoma and 84 healthy eyes). The independent dataset employed as a test set for assessing the diagnostic performance of the developed model comprised 114 eyes from 114 patients at early stages of glaucoma and 82 eyes from 82 healthy people. In particular, a DL classifier based on a CNN was employed in the reported study, and the input features were the 8 x 8 grid macular retinal nerve fiber layer thickness and ganglion cell complex layer thickness from OCT images. 
Diagnostic performance was assessed using the test set, also applying RF and SVM algorithms. Results showed that the DL model displayed an area under the ROC curve of 93.7%, which decreased considerably (to 76.6 and 78.8%) with no pre-training procedure, suggesting a relevant sensitivity and specificity of the DL model in diagnosing glaucoma and highlighting the robustness of the proposed approach. Accordingly, this case also underlines that the use of ML approaches can offer a significant increase in diagnostic performance, assisting clinicians in decision making [252]. Finally, another interesting approach was reported by Zhang and collaborators. They used OCT images of the fundus retina for generating and validating an ML-based diagnostic model for diabetic macular edema (DME). Concisely, the authors used 38,057 OCT images (drusen, choroidal neovascularization (CNV), DME, and healthy) in a multiscale transfer-learning algorithm model based on the CNN technique. This computational tool consists of two steps (self-enhancement and disease detection). The self-enhancement model is built using a multiscale feature learning method for detecting and extracting the frame of the diagnostic target. Next, the enhanced data are employed to generate a disease diagnostic model that incorporates transfer-learning knowledge. The data are initially processed by convolutional and pooling layers for extracting characteristics hidden in the original data. Lastly, these features are used in a classification step for automatically determining the type of disorder. The training set comprised 37,457 samples (9,891 cases (26.41%) of CNV, 9,633 cases (25.72%) of DME, 7,975 cases (21.29%) of drusen, and 9,958 cases (26.58%) of healthy), while 600 samples (150 cases (25%) of CNV, 150 cases (25%) of DME, 150 cases (25%) of drusen, and 150 cases (25%) of healthy) composed the validation set. 
Statistical parameters (accuracy, precision, sensitivity, and specificity) of the model were evaluated, together with parameters for assessing the performance of the ML-based model from the perspective of clinical application. The developed computational tool showed 94.5% accuracy, 97.2% precision, 97.7% sensitivity, and 97% specificity on the independent testing dataset. Notably, the developed model based on a multiscale transfer-learning algorithm is able to use OCT images for assessing the health of patients, automatically and accurately diagnosing several eye conditions. Such an approach could help clinicians to improve the effectiveness of therapies, reducing the disability rate of severe disorders [247].
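The DeepDR dataset described in this section was divided by patient (70% for training, 30% for local validation), so that all images of a given patient fall on the same side of the split. A minimal sketch of such a patient-level random split is given below; the function name, the image identifiers, and the toy data are hypothetical, and the sketch is not the procedure actually used by the authors.

```python
import random


def split_by_patient(image_to_patient, train_fraction=0.7, seed=0):
    """Randomly assign whole patients (not single images) to train/validation.

    image_to_patient: dict mapping an image identifier to a patient identifier.
    Returns two lists of image identifiers.
    """
    patients = sorted(set(image_to_patient.values()))
    rng = random.Random(seed)          # fixed seed for a reproducible split
    rng.shuffle(patients)
    n_train = round(train_fraction * len(patients))
    train_patients = set(patients[:n_train])
    train = [img for img, p in image_to_patient.items() if p in train_patients]
    valid = [img for img, p in image_to_patient.items() if p not in train_patients]
    return train, valid


# Hypothetical toy data: 10 patients with 3 fundus images each.
toy = {f"img_{p}_{k}": f"patient_{p}" for p in range(10) for k in range(3)}
train_imgs, valid_imgs = split_by_patient(toy)
print(len(train_imgs), "training images;", len(valid_imgs), "validation images")
```

Splitting at the patient level avoids having images of the same eye in both sets, which would otherwise tend to inflate the validation performance.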

AI/ML in central nervous system (CNS)-related disorders
Another interesting area in which AI/ML and DL have been widely employed, particularly for brain image assessment aimed at developing imaging-based diagnostic and classification systems, is neurology and central nervous system (CNS)-related disorders such as psychiatric disorders, demyelinating diseases, neurodegenerative disorders, epilepsy, and strokes [121,[253][254][255][256][257][258]. Together with their extensive usage in image recognition, language processing, and data mining, ML approaches have attracted growing interest in neurological applications, ranging from automated imaging assessment to disorder prediction. In epilepsy, ML approaches are currently applied for the automatic detection of seizures using electroencephalography (EEG), video, and kinetic data, for automated imaging analysis and pre-surgical planning, for the prediction of medication response, and for the prediction of medical and surgical outcomes using several data sources. This has been accomplished with different ML techniques including ANN, SVM, decision tree, RF, and decision forest [258]. For example, in a recent work Abdelhameed and Bayoumi used EEG data for developing an ML model based on a DL approach for identifying seizures in pediatric patients through the classification of raw multichannel EEG signal recordings after a limited pre-processing step. The developed ML model, based on the CNN technique, takes advantage of the automatic feature learning abilities of a two-dimensional deep convolution autoencoder (2D-DCAE) linked to a neural network-based classifier to generate a unified system that is trained in a supervised way to attain the best classification accuracy between ictal and interictal brain state signals. Generally, two subsequent steps are required for the automatic detection of seizures after the acquisition and pre-processing of raw EEG signals. The first step involves the extraction and selection of specific characteristics of the EEG signals. 
The second step requires building and training a classification system that uses the extracted features for detecting epileptic events. Notably, the feature extraction step directly influences the accuracy/precision of the developed automatic seizure detection model. In the mentioned study, the dataset was recorded at Boston Children's Hospital and consists of long-term scalp EEG recordings of 23 pediatric patients with intractable seizures, and a DL-based system using a supervised 2D-DCAE approach was used for detecting epileptic seizures in these multichannel EEG recordings. In order to test and assess the strategy, two models were developed and evaluated employing three different EEG data segment lengths and a 10-fold cross-validation scheme. Considering five evaluation metrics, the best-performing ML-based tool was a supervised DCAE. In particular, this model showed 98.79% accuracy, 98.72% sensitivity, 98.86% specificity, 98.86% precision, and an F1-score of 98.79% [257]. According to this study and other similar research works in the field, ML-based models can be useful in detecting seizures in epilepsy. Furthermore, with the improvement in processing capabilities, the availability of more efficient and sophisticated ML methods, and the collection of larger datasets, scientists will benefit from these computational approaches, building on the considerable progress already made in their use in epilepsy [257,258].
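The F1-score reported for the supervised DCAE is the harmonic mean of precision and sensitivity (recall), F1 = 2PR/(P + R). The brief sketch below applies this formula to the precision and sensitivity quoted above; the function name is an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the text for the supervised DCAE model.
f1 = f1_score(precision=0.9886, recall=0.9872)
print(f"F1 = {100 * f1:.2f}%")
```

Because the harmonic mean is pulled toward the smaller of the two values, a high F1-score indicates that precision and sensitivity are both high, rather than one compensating for the other.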
Regarding CNS-related disorders, AI/ML approaches have been used to classify and diagnose patients with attention deficit hyperactivity disorder (ADHD). Tenev and colleagues used an SVM-based ML algorithm to classify adult ADHD from EEG data. The model was trained on 117 adults (67 with ADHD, 50 healthy). Four conditions were considered during the measurements: two resting conditions (eyes open and eyes closed) and two neuropsychological tasks (visual continuous performance test and emotional continuous performance test). The authors considered four datasets (one for each condition), each used to train a separate SVM classifier independently. The classifier outputs were combined using a logical expression obtained from a Karnaugh map. The evaluation of the computational protocol indicated that this strategy makes it possible to discriminate ADHD patients from healthy subjects and to differentiate ADHD subtypes [259]. Slobodin and coworkers applied an ML-based model to predict ADHD from continuous performance test (CPT) indices. Data from 458 children (age 6-12 years; 213 ADHD patients and 245 healthy controls) were used to train, cross-validate, and test the ML models. The authors used the CPT total score, comprising four indices (timeliness, attention, impulsiveness, and hyperactivity), together with four variables (gender, age, day of the week, and time of day) to obtain data capable of discriminating patients with ADHD. The developed model showed significant predictive power, with accuracy, sensitivity, and specificity of 87%, 89%, and 84%, respectively, indicating that ML models can accurately classify ADHD using CPT data [260]. In another notable work, Kautzky and collaborators described an ML model for discriminating ADHD patients from healthy subjects using multivariate genetic and positron emission tomography (PET) imaging data. They selected 16 ADHD patients and 22 healthy subjects. 
These groups were scanned via PET to measure the serotonin transporter (SERT) binding potential using the radioligand [11C]DASB (3-amino-4-(2-dimethylaminomethylphenylsulfanyl)-benzonitrile). The subjects were also analyzed for 30 possible single-nucleotide polymorphisms (SNPs) involving the HTR1A, HTR1B, HTR2A, and TPH2 genes. The authors defined cortical and subcortical regions of interest (ROI), and an RF-based ML model was employed to select and classify relevant features in a 5-fold cross-validation scheme (10 repeats). The model achieved an accuracy, sensitivity, and specificity of 0.82, 0.75, and 0.86, respectively, indicating significant predictive power. Furthermore, the outcomes highlighted the relevance of SERT along with the HTR1B and HTR2A genes in ADHD, indicating disease-specific effects and suggesting that a diagnostic tool based on these features could support clinical decisions [261]. In the last example, Dubreuil-Vall and colleagues developed a CNN-based ML model with a four-layer architecture combining filtering and pooling, which used various types of data extracted from EEG analysis to discriminate ADHD patients from healthy subjects. Data obtained from 20 ADHD patients and 20 healthy controls were used to train the model. According to the results presented by the authors, the computational tool can correctly categorize ADHD patients with an accuracy of 88%, outperforming RNN and other previously reported ML models. Although these data are interesting and promising, studies with a larger number of participants are highly desirable [262].
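With cohorts as small as those described above, repeated k-fold cross-validation (such as the 5-fold scheme with 10 repeats mentioned for the RF model) is a common way to stabilize performance estimates. The sketch below shows only the splitting logic in plain Python; the function name, the seed, and the absence of class stratification are simplifying assumptions and do not reproduce the procedure of any cited study.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=10, seed=0):
    """Yield (train_indices, test_indices) pairs for repeated k-fold CV.

    Each repeat reshuffles the samples and partitions them into k folds,
    so every sample is used for testing exactly once per repeat.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Interleaved slicing gives k folds whose sizes differ by at most one.
        folds = [indices[i::k] for i in range(k)]
        for i in range(k):
            test = folds[i]
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


# Example: 38 subjects (16 patients and 22 controls, as in the PET study above).
splits = list(repeated_kfold(n_samples=38, k=5, repeats=10))
print(len(splits))  # 5 folds x 10 repeats = 50 train/test splits
```

In practice the splits would usually be stratified by diagnosis so that each fold preserves the patient/control ratio, and the metric of interest would be averaged over all splits.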
Neurodegenerative diseases are another area in which imaging techniques can support diagnosis. The multifactorial and complex molecular mechanisms involved in neurodegeneration make it challenging to discover tools for early diagnosis and to identify effective treatments. In this scenario, ML-based approaches help to reduce this gap, assisting researchers in formulating early diagnoses, interpreting brain images, and developing potentially effective therapeutic strategies [255]. Regarding AD, a precise diagnosis and the characterization of its early stages, such as mild cognitive impairment (MCI), are essential to start treatment in time and possibly slow AD progression. Accordingly, Lu and coworkers described a DL-based approach for the early diagnosis of AD. They proposed a multimodal and multiscale method in which information from magnetic resonance imaging (MRI) and fluorodeoxyglucose positron emission tomography (FDG-PET) images was combined within a DNN framework to discriminate AD patients. Developing the model requires two steps: (I) pre-processing of the MRI and FDG-PET images, in which the segmented gray matter is subdivided into patches of a range of sizes and features are extracted from patches of each size; (II) training of a DNN algorithm to learn the patterns that discriminate AD individuals. The resulting model can then be employed for individual classification. Data from 1242 subjects with both a T1-weighted MRI scan and an FDG-PET image from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database were used to develop and validate the model. 
Subjects were divided into 5 classes based on clinical diagnosis: 1) stable normal controls (sNC): 360 subjects; 2) stable MCI (sMCI): 409 subjects; 3) progressive NC (pNC): 18 subjects assessed as NC at the baseline visit who progressed to a clinical diagnosis of possible AD; 4) progressive MCI (pMCI): 217 subjects evaluated as MCI at the baseline visit who progressed to a clinical diagnosis of possible AD at some point in the future; 5) stable AD (sAD): 238 subjects with AD. The classifier trained with the combined sample of pNC, pMCI, and sAD yielded the highest overall classification performance: 82.4% accuracy in identifying the individuals with MCI who will convert to AD at 3 years before conversion (86.4% combined accuracy for conversion within 1-3 years), 94.23% sensitivity in classifying persons with a clinical diagnosis of probable AD, and 86.3% specificity in classifying non-demented controls. These results suggest that DNN classifiers may be useful as a potential tool for providing evidence in support of the clinical diagnosis of probable AD [263]. Shi and colleagues highlighted the importance of combining information derived from different tests. To this end, they developed a DL algorithm based on deep polynomial networks (DPN) to build a computational model trained on multimodal neuroimaging data (MRI and PET). In this work, they built a multimodal stacked DPN (MM-SDPN) algorithm. MM-SDPN involves two SDPN stages: one dedicated to fusing the multimodal neuroimaging data, and the other devoted to learning high-level features for AD diagnosis. The authors used data from the ADNI dataset (MRI and PET images from the same subjects: 51 AD patients, 99 MCI patients (43 MCI converters (MCI-C), who progressed to AD, and 56 MCI non-converters (MCI-NC), who did not progress to AD within 18 months), and 52 normal controls (NC)). 
The MM-SDPN algorithm was applied to the ADNI dataset for both binary and multiclass classification tasks. Validation using ROC curves showed an area under the curve of 0.897, indicating that the MM-SDPN approach outperformed other multimodal ML-based approaches in achieving a correct AD diagnosis and was able to classify all stages of AD progression [264]. Gao and collaborators used a CNN-based ML approach to classify computed tomography (CT) brain images, with the aim of translating imaging into clinical applications. The classification considered three main groups: subjects with AD (1,000 images), lesions (e.g., cancer) (947 images), or normal aging (2,129 images). Because CT brain images have a larger slice thickness, the authors employed both 2D and 3D CNNs. The two networks were fused, considering both 2D CT images along the axial direction and 3D segmented blocks, achieving accuracy rates of 88.8%, 76.7%, and 95% for the AD, lesion, and normal groups, respectively, for an average of 86.8%. Accordingly, a CNN-based ML approach makes it possible to classify CT brain images for AD with high accuracy [265]. In another interesting approach, Liu and collaborators conceived a different ML strategy to identify AD. In particular, the authors collected a novel speech dataset, based on spectrogram features extracted from audio data with an ad hoc algorithm, that included AD patients and healthy subjects as controls. Next, ML-based models were employed to compare this new dataset with the speech data provided by the Dem@Care project. Among the assessed models, the Logistic-regressionCV (LRCV) model showed the best performance. 
Notably, the authors demonstrated that ML-based approaches trained on spectrogram features extracted from speech data can be applied to identify AD, helping to understand the development of AD at early stages and to provide therapies for delaying disease progression [266]. Finally, we report an interesting ML approach described by Grassi and colleagues. Their study focused on an algorithm for predicting, over a 3-year horizon, the possible progression of patients with MCI and preMCI to AD. ML models were trained on information from 90 patients with MCI and 94 subjects with preMCI with a diagnostic follow-up evaluation of at least 3 years. They extracted a total of 36 predictors from the data (e.g., diagnostic subtypes, clinical and neuropsychological test scores, sociodemographic characteristics, cardiovascular risk indexes, and levels of medial temporal lobe brain atrophy in the hippocampus (HPC), perirhinal cortex (PRC), and entorhinal cortex (ERC), assessed by a clinician-rated Visual Rating Scale (VRS)). To model these data, the authors used several ML-based techniques, including Elastic Net (EN) with polynomial features, SVM, Gaussian processes (GP), and k-NN. The resulting models were validated using leave-pair-out cross-validation. The best-performing model, based on the SVM technique, showed an area under the ROC curve of 0.962 and an accuracy of 0.913 [267]. This work further demonstrated how ML applications can assist translational research by providing computational tools ready for application in clinical practice and clinical trials. Similar procedures, in which specific features are extracted from available data, have allowed the development of ML-based models for Parkinson's disease (PD). 
In fact, as reported for AD, several studies of ML-based approaches applied to PD [268] have shown that it is possible to predict the progression of the disorder using serum cytokines [269], MRI [270], and walking tests [271]; to estimate the state of PD using longitudinal data [272]; to rate the main symptoms (resting tremor and bradykinesia) [273]; and to produce a correct diagnosis from EEG analysis [274,275] and from voice datasets [276,277], to mention only some relevant works.

AI in cardiology and cardiovascular diseases
The enormous progress in cardiovascular imaging, together with advances in recording technologies, has enabled the acquisition of large and complex multidimensional datasets, making cardiology a particularly suitable field for AI/ML. ML-based techniques allow cardiologists to investigate new possibilities and to make findings not detected with classical strategies. In this field too, ML can offer new opportunities for improving patient support (survival prediction, appropriate diagnoses, and pharmacological treatments) and medical decision-making, bridging the gap between the swift progress of cardiac imaging and clinical care [124,278,279]. In particular, several studies in cardiology and related fields employ supervised ML models as diagnostic predictors [280,281]. These computer-based tools are able to extract and select specific features from clinical outcomes and from any type of imaging data (e.g., electrocardiograms (ECG), echocardiograms, cardiac MRI, cardiac computed tomography (CCT)) to provide specific diagnoses [282]. In this section, some relevant and innovative examples of ML applications in cardiology are examined and discussed. Madani and colleagues developed a DL-based ML protocol using a CNN algorithm to establish an AI tool for interpreting echocardiograms. They trained a CNN using images and videos from 267 transthoracic echocardiograms depicting real-world clinical variation (e.g., different patient variables, echocardiographic indications, technical qualities, and pathologies) to classify 15 distinct standard echocardiographic views. To generate the CNN model, they used over 200,000 images (240 studies) as the training and validation set, while over 20,000 images (27 studies) composed the test set. 
The developed model showed an overall accuracy of 97.8% on videos (F-score 0.964) and of 100% on seven of the 12 video views, supporting the robustness of the approach [283]. Another work by Madani and colleagues reported a CNN-based approach employing DL classifiers for automatically interpreting echocardiographic data. This report showed an accuracy of 94.4% for the classification of 15 echocardiographic views from still images and 91.2% accuracy for binary left ventricular hypertrophy classification. Subsequently, the authors employed a semi-supervised generative adversarial network model for detecting left ventricular hypertrophy. This model achieved an accuracy of 80% in view classification and of 92.3% for left ventricular hypertrophy [284]. Zhang and colleagues reported the development of different CNN-based ML models for the automatic classification of echocardiogram data to detect three distinct cardiovascular diseases: hypertrophic cardiomyopathy, cardiac amyloidosis, and pulmonary arterial hypertension. To train and validate the models for multiple tasks, the authors used 14,035 echocardiograms spanning a 10-year period. Results were assessed against manual segmentation and measurements from 8,666 echocardiograms obtained during routine clinical assessment. The CNN models were able to identify views, including flagging partially obscured cardiac chambers, and enabled the segmentation of individual cardiac chambers. Overall, the authors' findings demonstrated that automated measurements can be similar or even superior to manual measurements across 11 internal consistency metrics (e.g., the correlation of left atrial and ventricular volumes). 
Furthermore, the CNN models detected hypertrophic cardiomyopathy, cardiac amyloidosis, and pulmonary arterial hypertension with C statistics of 0.93, 0.87, and 0.85, respectively [285]. Echocardiography data were also used by Samad and colleagues to develop a supervised ML model, based on the RF algorithm, for predicting future adverse cardiac events; specifically, the RF algorithm was employed to predict survival from echocardiography data. They trained the model on information obtained from the echocardiograms of 171,510 patients, providing three different classes of input: (I) clinical variables, including 90 cardiovascular-relevant International Classification of Diseases (ICD)-10 codes, sex, weight, age, height, blood pressure, heart rate, LDL, HDL, and smoking; (II) clinical variables plus physician-reported ejection fraction (EF); (III) clinical variables, EF, plus 57 additional echocardiographic measurements. The RF-based models predicted survival with good accuracy, with an area under the ROC curve > 0.82, greater than that of conventional clinical risk scores (area under the ROC curve ranging from 0.61 to 0.79). Accordingly, ML can successfully combine several distinct input variables to predict survival from echocardiography data [286]. The CNN technique was also used by Strodthoff and coworkers to develop an ML model for detecting myocardial infarction directly from ECG with no preprocessing. They used a dataset of 549 ECG records from 290 subjects available from the Physikalisch-Technische Bundesanstalt (PTB) database, a large publicly accessible ECG collection. The DL-based model showed sensitivity and specificity of 93.3% and 89.7%, respectively, as assessed by 10-fold cross-validation with sampling based on patients. 
The described model was able to detect myocardial infarction with performance comparable to that of human cardiologists. Furthermore, an additional analysis showed that it can also identify the channel-specific regions that contribute substantially to the neural network's decision. This highlighted that the ML model relied on the same signs indicative of myocardial infarction that are recognized by human cardiologists. This work further demonstrated that ML models applied to ECG evaluation can be advanced toward clinical application [287]. Hannun and coworkers developed a DNN-based ML model for detecting arrhythmias from ECG data. The DNN was trained on 91,232 single-lead ECG records from 53,549 patients who used a single-lead ambulatory ECG monitoring device, to classify 12 rhythm classes (10 arrhythmias as well as sinus rhythm and noise). The resulting model was validated on an independent test set (328 ECGs collected from 328 patients), showing an average area under the ROC curve of 0.97. Moreover, the median F1 score, which represents the harmonic mean of the positive predictive value and sensitivity, of the DNN (0.837) surpassed that of average cardiologists (0.780) for all rhythm classes. The results indicated that the DNN-based approach can correctly classify different types of arrhythmias from ECG records. This approach could hold great potential if used in clinical settings, reducing misdiagnoses and helping to prioritize the most urgent conditions [288]. Very recently, Elul and colleagues also used ECG data to develop an ML model for detecting heterogeneous combinations of known and unknown arrhythmias and for identifying underlying cardiac pathology from segments marked as normal sinus rhythm recorded in patients with intermittent arrhythmia [289]. 
Further, asymptomatic left ventricular dysfunction (ALVD) can be predicted from ECG data using a CNN algorithm, as reported by Attia and colleagues. The authors used paired 12-lead ECG and echocardiogram data, including the left ventricular ejection fraction (a measure of contractile function), from 44,959 patients to train a CNN to identify subjects affected by ventricular dysfunction (defined as ejection fraction ≤ 35%). The model was tested on an independent set of 52,870 subjects, showing an area under the ROC curve, accuracy, specificity, and sensitivity of 0.93, 85.7%, 85.7%, and 86.3%, respectively. Interestingly, the authors found that, among patients without ventricular dysfunction, those with a positive screen by the ML model were at 4 times the risk (hazard ratio, 4.1; 95% confidence interval, 3.3 to 5.0) of developing future ventricular dysfunction compared with those with a negative screen. Remarkably, the application of AI/ML to ECG data is versatile and can predict many possible outputs, helping to find subjects who will develop a given disorder, as in the case of ALVD [290]. The following example reports the use of an unsupervised ML approach for assessing diastolic dysfunction. The objective of the study conducted by Pandey and collaborators was to develop a DNN-based ML model for integrating multidimensional echocardiographic data in order to detect distinct subgroups of patients with heart failure with preserved ejection fraction (HFpEF). This study is particularly relevant since no clinically translatable algorithms currently exist for phenotyping the severity of diastolic dysfunction in HFpEF. The authors established a DNN model for predicting high- and low-risk phenogroups in a derivation group (n = 1,242). 
Next, two external groups were used to validate the model's performance in identifying high left ventricular filling pressure (n = 84) and to assess its prognostic capacity in patients (n = 219) with different degrees of systolic and diastolic dysfunction. The clinical relevance of the ML model was evaluated in three HFpEF clinical trials by assessing the relationships of the phenogroups with adverse clinical outcomes (TOPCAT trial, NCT00094302, n = 518), cardiac biomarkers, and exercise parameters (NEAT-HFpEF trial, NCT02053493 and RELAX trial, n = 346). Notably, the unsupervised DNN-based ML model showed an area under the ROC curve higher than that obtained with the American Society of Echocardiography guidelines for the prediction of high left ventricular filling pressure (0.88 vs 0.67; p = 0.01). Furthermore, the model showed high performance also on the validation sets, including the three HFpEF clinical trials. In fact, the DNN classifier is able to describe the severity of diastolic dysfunction and to identify a specific subgroup of patients with HFpEF showing high left ventricular filling pressure, biomarkers of myocardial injury and stress, and adverse events, as well as those who are more likely to respond to spironolactone [291]. Another interesting application of ML to the cardiovascular system was described by Ma and coworkers. They started from the relationship between carotid plaque echogenicity in ultrasound images and the risk of stroke in atherosclerotic patients. To accurately classify carotid plaques, and thereby estimate their stability and predict cardiovascular events, the authors used a CNN-based ML model. This approach could automatically provide a carotid plaque echogenicity classification. 
To improve the reliability of the method, the authors redesigned the spatial pyramid pooling (SPP) and proposed multilevel strip pooling (MSP) for the automatic and accurate classification of carotid plaque echogenicity in the longitudinal section. With this step, the resulting MSP module was able to accept arbitrarily sized carotid plaques as input and to capture long-range informative context, improving classification accuracy. Accordingly, the authors implemented an MSP-based CNN employing the visual geometry group (VGG) network as the backbone. They trained the model using 1,463 carotid plaque images (335 echo-rich plaques, 405 intermediate plaques, and 723 echolucent plaques). The 5-fold cross-validation results showed that the proposed MSP-based VGG-Net achieved a sensitivity of 92.1%, specificity of 95.6%, accuracy of 92.1%, and F1-score of 92.1%. The findings of this work showed that this strategy extends the applicability of CNNs to input samples of any size, improving classification accuracy and making objective risk assessment more effective [292].
The rising use of ML-based approaches in cardiology is likely to continue in the foreseeable future. Following proper validation, they might improve treatment outcomes by facilitating daily workflow, patient satisfaction, early identification, and correct interpretation of data.
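Many of the cardiology studies above summarize discrimination with the area under the ROC curve (equivalently, the C statistic for a binary outcome). This quantity can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The following sketch computes it directly from that definition; the function name and the scores are hypothetical and serve only as an illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties contribute one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for diseased (positive) and healthy (negative) subjects.
positives = [0.9, 0.8, 0.4]
negatives = [0.7, 0.3, 0.2, 0.4]
print(roc_auc(positives, negatives))  # 10.5 of 12 pairs are ordered correctly -> 0.875
```

The pairwise form is quadratic in the number of cases, so rank-based implementations are preferred for large cohorts, but it makes clear why a value of 0.5 corresponds to random ranking and 1.0 to perfect separation.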

AI in gastroenterology
In gastroenterology, clinicians work with large amounts of clinical data and several imaging technologies, including endoscopy and ultrasound. In this context, AI/ML methodologies can play a pivotal role in managing and analyzing huge quantities of information for image analysis, diagnosis, prognosis, and treatment selection. AI/ML-based techniques can be applied to gastroenterology to improve endoscopic diagnosis, allowing the detection of abnormalities of the gastrointestinal tract such as colorectal polyps, malignancies such as esophageal, gastric, and intestinal tumors, as well as inflammatory bowel disease, irritable bowel syndrome, and peptic ulcer bleeding [293][294][295]. We report here some relevant examples demonstrating the translational potential of AI/ML-based approaches in gastroenterology. Mori and collaborators reported an AI approach for detecting small (< 5 mm) adenomatous or sessile polyps, which are usually extremely difficult for clinicians to identify by colonoscopy. To validate the approach in a prospective, single-group, open-label clinical trial (UMIN000027360), they trained an ML-based model with data from 325 subjects presenting 466 diminutive polyps. In this prospective study, the model showed an accuracy of 94% (with a negative predictive value of 96%), including a pathologic prediction rate of 98.1% (457 of 466) [296]. In another approach, Wang and colleagues developed an ML algorithm for detecting polyps in clinical colonoscopy investigations. Specifically, they generated a DL algorithm trained on data derived from 1,290 patients (5,545 colonoscopy images). The development of the model was performed in two separate steps: 1) a training step in which 4,495 images were used (2,607 images containing polyps and 1,888 images with no polyps). 
The training data were used to optimize the network parameters; 2) a tuning step in which 1,050 images (1,027 with polyps and 23 without polyps) were used to optimize the hyperparameters. The authors validated the approach using (I) a newly collected set of 27,113 colonoscopy images from 1,138 patients with at least one detected polyp, on which the model showed a sensitivity of 94.38% and a specificity of 95.92%, with an area under the ROC curve of 0.984; (II) a public database containing clinical images of 612 polyps (sensitivity of 88.24%); (III) 138 colonoscopy videos including histologically confirmed polyps (sensitivity of 91.64%; per-polyp sensitivity of 100%); (IV) a set of 54 unaltered full-range colonoscopy videos with no polyps (specificity of 95.40%). The developed DL model has great potential for assisting clinicians during colonoscopies, being able to correctly detect polyps and adenomas [297]. Byrne and coworkers developed an ML model based on the deep CNN technique for real-time evaluation of endoscopic video images of colorectal polyps. The model was trained and validated using unaltered video data from routine clinical investigations, not adapted for AI-based classification. To assess the performance of the computational tool, the authors tested the model on an independent set of 125 videos of sequentially encountered diminutive polyps classified as adenomatous or hyperplastic. The ML model showed a sensitivity of 98%, a specificity of 83%, and an accuracy of 94%, being able to discriminate hyperplastic from adenomatous polyps [298]. Urban and colleagues used a similar approach to develop a deep CNN algorithm for detecting polyps in colonoscopy exams. 
They trained an ML model on 8,641 hand-labeled images, with 4,088 unique polyps, from screening colonoscopies of over 2,000 subjects. The authors tested the model on 20 colonoscopy videos (5 h in total duration). When validated on manually labeled images, the model detected polyps with an area under the ROC curve of 0.991 and an accuracy of 96.4%. Interestingly, in the analysis of colonoscopy videos in which 28 polyps had been removed, 4 expert reviewers found 8 additional polyps that had not been removed when working without ML support, and identified 17 additional polyps when assisted by the CNN (45 polyps in total). Notably, all the polyps removed and detected by the experts were also found by the ML-based model, although the computational tool showed 7% false positives. Overall, the CNN algorithm identified more polyps than the expert clinicians did. The additional polyps found by the model were small adenomas in the size ranges of 1-3 and 4-6 mm [299]. Regarding gastrointestinal malignancies, some AI/ML-based methods for detecting cancers of the gastrointestinal tract have been described. For example, Tokai and colleagues evaluated the diagnostic capability of a CNN-based ML tool in detecting esophageal squamous cell carcinoma (ESCC) and assessing its invasiveness. For a comprehensive assessment of performance, they compared the results with the findings of thirteen expert endoscopists. The CNN algorithm was trained on white-light imaging and narrow-band imaging endoscopic images, including 1,751 images of ESCC. In the validation step, the ML-based model identified 95.5% of ESCC in the test images (279/291) in ten seconds and estimated the invasion depth of ESCC with a sensitivity of 84.1% and an accuracy of 80.9% in six seconds. 
The CNN-assisted diagnosis was more accurate than the diagnosis made by expert clinicians alone, indicating a potential role for ML as an ESCC diagnostic tool [300]. Another application of AI/ML to detect cancer and its invasive potential was carried out by Nakagawa and collaborators, who reported the development of a DNN approach for diagnosing the invasion depth of ESCC. The ML-based model was built using endoscopic images from subjects affected by superficial ESCC. In particular, the authors generated a training set of 8,660 non-magnified and 5,678 magnified endoscopic images from 804 patients with superficial ESCC presenting cancer invasion, and compiled a validation set of 405 non-magnified and 509 magnified images from 155 subjects. The DNN algorithm showed the following statistical parameters: specificity 95.8%, sensitivity 90.1%, accuracy 91%, positive predictive value 99.2%, negative predictive value 63.9%. These parameters highlighted the capacity of the model to distinguish pathologic mucosal and submucosal microinvasive (SM1) cancers from submucosal deep invasive (SM2/3) cancers. Compared with the assessment performed by a pool of experts on the same validation set, the model showed slightly better performance, confirming its capability to determine invasion depth in patients with superficial ESCC [301]. Other interesting works in the field concern the possibility of assessing the severity of inflammatory bowel disease (IBD) and improving its classification using AI/ML approaches. Ozawa and coworkers developed an ML-based system for evaluating the severity of ulcerative colitis. They developed a CNN algorithm trained on 26,304 colonoscopy images from 841 subjects affected by ulcerative colitis. The performance of the ML model was assessed on an independent test set of 3,981 images from 114 patients with ulcerative colitis. 
The model was examined for its capacity to distinguish normal mucosa (Mayo 0) and the mucosal healing state (Mayo 0-1). Validation was based on the areas under the ROC curve, which were 0.86 and 0.98 for identifying Mayo 0 and Mayo 0-1, respectively. The CNN algorithm performed better for the rectum than for the right side and left side of the colon when identifying Mayo 0 (areas under the ROC curve = 0.92, 0.83, and 0.83, respectively). This work underlined the robustness of the method in identifying the severity of endoscopic inflammation in subjects with ulcerative colitis, indicating that the CNN algorithm can assist clinicians in determining severity-based therapies as well as follow-up endoscopy intervals for IBD [302]. Mossotto and collaborators developed an ML model for classifying pediatric IBD using endoscopic and histological imaging data from 287 children affected by IBD. These data were used for developing, training, testing, and validating an ML model for classifying disease subtypes. Unsupervised ML models displayed broad clustering of Crohn's disease/ulcerative colitis, but no apparent subtype differentiation, while hierarchical clustering recognized new categories with varying levels of colonic involvement. Furthermore, endoscopic data alone, histological data alone, and a combination of endoscopic and histological data were used to generate three supervised ML models, showing classification accuracies of 71.0%, 76.9%, and 82.7%, respectively. The most promising ML model was then assessed on an independent group of 48 children affected by IBD. The findings demonstrated that the ML-based model classified patients with an accuracy of 83.3%. This work highlighted that proper development of a supervised ML model requires both endoscopic and histological data to achieve a more accurate classification of the disease [303].
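Several of the endoscopy studies above report positive and negative predictive values alongside sensitivity and specificity. Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the evaluated cohort, which is worth keeping in mind when comparing studies. The sketch below applies Bayes' rule to illustrate this; the function name and all numerical inputs are hypothetical and are not derived from the cited works.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a cohort with a given prevalence."""
    # Expected fractions of the cohort falling in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    ppv = tp / (tp + fp)  # probability of disease given a positive result
    npv = tn / (tn + fn)  # probability of no disease given a negative result
    return ppv, npv


# The same hypothetical classifier evaluated in a balanced and in a low-prevalence cohort.
for prevalence in (0.5, 0.1):
    ppv, npv = predictive_values(sensitivity=0.9, specificity=0.9, prevalence=prevalence)
    print(f"prevalence={prevalence:.1f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With identical sensitivity and specificity, the positive predictive value drops markedly in the low-prevalence cohort, so predictive values obtained on enriched validation sets should not be assumed to carry over unchanged to screening populations.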
A fascinating field in which AI/ML-based approaches can be used is food intolerance. In particular, over the last decade, several computational attempts have been made to detect subjects presenting celiac disease and to classify the disorder [304]. In a pioneering approach, Vècsei and collaborators developed a computer-based methodology for automatically classifying celiac severity on 612 endoscopic images from pediatric patients, considering a two-class problem: mucosa affected by celiac disease and unaffected duodenal tissue. Even though the classification method was able to discriminate between the two mentioned groups (disease vs no disease), showing an overall accuracy of 88%, the model displayed a reduced accuracy (63.7%) in classifying the severity of the disorder, possibly due to the small set used for training the model [305]. After that, Wimmer and collaborators theorized that AI methods can be employed for classifying luminal endoscopic images of celiac disease. They developed a CNN transfer-learning approach that categorized luminal endoscopic images from the duodenum gathered by white light and narrow band imaging endoscopy, collecting 1,661 images. The CNN algorithm showed an accuracy of 90.5% in the identification of celiac disease considering endoscopic images alone. The authors indicated that, although the gold standard for the diagnosis of celiac disease remains unchanged, ML could offer a new option in diagnostic settings, especially where acquiring biopsies is complicated [306]. Hujoel and collaborators developed an ML model for detecting undiagnosed celiac disease. To this purpose, they collected serum samples derived from 47,557 subjects with no previous diagnosis of celiac disease. From this set, 408 undiagnosed cases were detected. 
To apply ML in a retrospective study, they developed various ML-based predictive models employing several approaches such as LR, EN, tree-based models with and without boosting and/or bagging, SVM with radial basis functions, ANN, RF, and LDA. The performance of all the developed models was assessed by calculating the area under the ROC curve. Ten models were trained on the dataset, including and excluding variables, with a predictor set comprising sex, age, number of symptoms, history of any autoimmune condition, thyroid disorder, anemia, hypothyroidism, previous indication to test for celiac disease, dyspepsia, and recurring abdominal pain. Unfortunately, with this approach the authors obtained ML-based models with limited discriminatory power, showing an area under the ROC curve ranging from 0.49 to 0.53, that is, close to the value of 0.5 expected from random guessing. Two models (RF and bagged classification trees) performed better than random chance (likelihood > 95%), although their predictive power was only slightly better than that of the other models. The partial failure in developing effective ML-based models can probably be ascribed to the subtle symptoms in atypical cases, suggesting that the mentioned variables are impractical for developing predictive models, since they did not characterize undiagnosed celiac disease [307]. Accordingly, other approaches must be investigated to improve the diagnostic rate of celiac disease. Very recently, Koh and coworkers developed a new ML algorithm for the automated classification of duodenal biopsy images, aiding clinicians in detecting celiac disease and the severity of villous atrophy according to the Marsh score. In the first step, the authors pre-processed the biopsy images, subjecting them to a Steerable Pyramidal Transform (SPT) to obtain sub-band coefficients. 
For each sub-band, various entropy features (Fuzzy entropy, Kapur entropy, Renyi entropy, Shannon entropy, Vajda entropy, Yager entropy) and nonlinear features were calculated and used as input to decision tree (DT), k-NN, SVM, AdaBoost M1 (for two-class classification), AdaBoost M2 (for multiclass classification), Bagged Trees, and Discriminant Subspace classifiers for automatically classifying the extracted features (734 features were extracted from each set of data and so, 26,424 features were extracted from three diverse sets of data) from two-class (normal and celiac) and multiclass (diverse degrees of severity of villous atrophy considering Marsh scores) biopsy images. Interestingly, to avoid the bias caused by data imbalance, the authors employed an adaptive synthetic sampling (AdaSyn) technique. Next, the authors employed a ten-fold cross-validation approach for training and testing the model. In the ten-fold scheme, the set was divided into ten parts, where 9 parts were employed to train the model and 1 part for testing. In each subsequent round, a different part was used to test the model while the other 9 parts were used for training. This procedure was repeated ten times, so that each part served once as the test set. The performance of the developed ML model was evaluated, and results showed an accuracy, sensitivity, and specificity of 88.89%, 89.67%, and 86.67%, respectively, in the two-class classification of 2 Set data (Marsh I + II and Marsh III) of Hematoxylin-Eosin-DAB (HED) biopsy images. Furthermore, 82.92% accuracy, 85.67% sensitivity, and 76.67% specificity were achieved in the two-class classification of 2 Set data (Marsh I + II and Marsh III) of RGB biopsy images. Considering the results of multi-class classification (3 Set data), an accuracy of 72% was obtained for HED biopsy images employing SVM. 
The suggested approach for the automatic classification of biopsy pictures can help with the process of evaluating villous atrophy using the Marsh score, suggesting that automated analysis of biopsy images is a feasible task. Nevertheless, more data of improved quality (e.g., well-oriented biopsy images) are needed to train the model appropriately and enhance its predictive power [308]. Remarkably, the reported results have shown great potential for AI/ML in the automated analysis of biopsy images for detecting celiac disease as well as other disorders. Finally, we discuss a recent article in which ML based on a DL technique was adopted for detecting Helicobacter pylori in gastric biopsies. Klein and colleagues reported, for the first time, the use of a computer-based approach for accelerating the recognition of Helicobacter pylori on histological samples. They developed a DL decision support algorithm to be employed on conventional images of gastric biopsies for detecting H. pylori on H&E- and Giemsa-stained slide images. The latter were classified using a DNN algorithm trained on Giemsa and H&E slides (191 H&E- and 286 Giemsa-stained slides, for a total of 2,629 Giemsa and 790 H&E tiles containing Helicobacter pylori-like bacterial structures, in addition to 4,241 Giemsa and 1,533 H&E tiles without Helicobacter pylori-like bacterial structures). Several validation approaches presented in the work showed a significant area under the ROC curve > 0.8, demonstrating the ability of the model to detect Helicobacter pylori and indicating that AI/ML tools can assist clinicians in formulating a more accurate diagnosis regarding the presence of H. pylori in gastric biopsies [309].
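The ten-fold cross-validation scheme described above for the work of Koh and coworkers can be sketched in a few lines of Python. This is a minimal illustration of the general splitting procedure under stated assumptions, not the authors' implementation: the function name, the random seed, and the sample count of 50 are hypothetical, and the model fitting step is only indicated by a comment.

```python
import random


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Yield (train, test) index lists for k-fold cross-validation.

    The samples are shuffled once and divided into k parts; each part is
    used exactly once for testing while the remaining k - 1 parts are
    used for training.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 50 feature vectors.
splits = list(k_fold_indices(50, k=10))
for train, test in splits:
    # A classifier would be fitted on `train` and scored on `test` here;
    # the ten scores are then averaged to give the reported performance.
    pass
```

Each sample is tested exactly once and never appears in the training and test parts of the same round, which is what allows the averaged score to serve as an estimate of performance on unseen data.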

AI in dermatology
As discussed for different medical fields, the translational power of AI/ML in medicine is great. From diagnosis to targeted therapy, ML techniques have great potential to enhance dermatologists' practice. Current progress in computing, along with the availability of huge datasets (e.g., image and -omics databases, electronic medical records), has spurred the development of ML-based approaches in dermatology [126,310]. Some relevant examples are analyzed in this section. Spyridonos and coworkers described a computational approach for discriminating actinic keratoses from healthy skin based on color texture examination of typical clinical photographs. It is important to recognize these skin lesions early, since they are frequent pre-malignant lesions that indicate a risk of developing invasive cutaneous squamous cell carcinoma. They collected non-standardized clinical photographs of both actinic keratoses and healthy skin from 22 patients, labelled by experienced dermatologists who highlighted the ROI. In this way the authors obtained a dataset composed of 6,010 (actinic keratoses) and 13,915 (healthy) ROI. Information about color texture was obtained employing local binary patterns (LBP) or texton frequency histograms and assessed using a classifier based on the SVM technique. The classification method was evaluated employing a leave-one-patient-out procedure in the RGB, YIQ, and CIE-Lab color spaces. The best performing configuration of the SVM model was tested using 157 actinic keratoses and 216 healthy skin rectangular regions of arbitrary size. Actinic keratoses treatment outcome was assessed in a further group of eight subjects with 32 skin lesions. 
The best configuration for discriminating the samples was obtained using LBP color texture descriptors estimated from the Y and I components of the YIQ color space, and the SVM model achieved a sensitivity of 80.1% and a specificity of 81.1% at the ROI level, and a sensitivity of 89.8% and a specificity of 91.7% at the region level. Using the classifier, the authors measured a quantitative reduction in actinic keratoses of 83.6%. Interestingly, this work showed that combining clinical photographs with an ML algorithm for detailed image analysis represents a useful non-invasive, cost-effective approach to monitoring actinic keratoses for early therapeutic strategies against such skin lesions [311]. Intriguingly, some AI-based models have been established for predicting skin sensitization. In this context, Tsujita-Inoue and collaborators developed an ML approach based on an ANN algorithm for assessing the skin sensitization risk of several chemicals. The authors used several descriptors (e.g., data from antioxidant response element (ARE) tests and LogP, indicating lipid solubility and skin absorption) for improving a previous version of a software tool able to predict the results of the murine local lymph node assay (LLNA) [312]. In fact, LLNA is the most widely used in vivo method to assess the sensitizing potential of chemical entities. Accordingly, they developed iSENS ver.2. The authors used the data obtained for 62 compounds in murine LLNA tests. Among them, 53 composed the training set, while the others were employed for validating the developed computational tool. The predictivity of the ANN-based model was assessed by employing a 10-fold cross-validation method. The accuracy, specificity, and sensitivity of the computational model were 84.9%, 92.3% and 82.5%, respectively [313]. 
According to these results, ML approaches for estimating the skin sensitization risk of compounds can represent a valuable resource for replacing animal testing. Subsequently, Zang and collaborators increased the number of selected chemicals for developing an ML model to predict skin sensitization, considering two datasets: one including LLNA results for 120 chemicals and the other covering human skin sensitization results for 87 chemicals (all these substances were included in the LLNA dataset). Moreover, the authors included six physicochemical features of these chemicals related to skin exposure and penetration (octanol/water partition coefficient, water solubility, vapor pressure, melting point, boiling point, and molecular weight). The molecules were distributed into a training set (75%) and a test set (25%). Different ML approaches were used for developing predictive models, including classification and regression tree, LDA, LR, and SVM. The validation step was performed applying the leave-one-out cross-validation procedure. SVM was found to be the best method in modelling LLNA output, with an accuracy of 89%, a sensitivity of 86%, and a specificity of 92% on the test set. Regarding the prediction of human outcomes, the SVM model demonstrated an accuracy of 81% and a sensitivity and specificity of 86% and 78%, respectively [314]. Another area of dermatology concerns skin lesions and malignancies. Esteva and coworkers generated a deep CNN-based model for classifying skin lesions. They trained a CNN model employing a set of 129,450 clinical images covering 2,032 diverse disorders, matching the performance of 21 experienced dermatologists across three critical diagnostic tasks: keratinocyte carcinoma classification, melanoma classification, and melanoma classification by means of dermoscopic data. Results showed an area under the ROC curve of 0.96 for carcinoma and of 0.94 for melanoma [315]. 
Haenssle and colleagues, in an interesting experiment, evaluated the accuracy of melanoma diagnosis by comparing the performance of 58 experts with the assessment performed by an ML-based model generated using the CNN technique. The ML model was developed, validated, and tested for classifying dermoscopic images of lesions of melanocytic origin (melanoma, benign nevi) for diagnostic purposes. The dataset included a test set composed of 300 images containing 20% melanomas (in situ and invasive) of all body sites and of all common histotypes, and 80% benign melanocytic nevi. The average area under the ROC curve was 0.79 for the 58 dermatologists and 0.86 for the ML model, indicating an improvement in diagnostic performance derived from the application of the computer-based tool. Accordingly, the study highlighted that appropriately trained ML models are capable of performing accurate diagnostic classification of dermoscopic images of lesions of melanocytic origin [316,317]. Han and coworkers developed an ML model using a CNN algorithm for classifying clinical images of 12 skin diseases (basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, melanocytic nevus, pyogenic granuloma, seborrheic keratosis, actinic keratosis, wart, malignant melanoma, hemangioma, lentigo, and dermatofibroma). The ML model was trained, tested, and validated employing the Asan dataset, the MED-NODE dataset, and atlas site images, for a total of 19,398 images, appropriately divided into a training set and a test set. Considering the Asan dataset, the area under the ROC curve for the diagnosis of basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, and melanoma was 0.96, 0.83, 0.82, and 0.96, respectively. Considering the Edinburgh dataset, the area under the ROC curve for the same disorders was 0.90, 0.91, 0.83, and 0.88, respectively. 
The developed ML-based model demonstrated performance comparable to that of 16 dermatologists. Furthermore, as indicated by the authors, supplementary images representing a larger variety of ages and ethnicities should be employed to improve the performance of the CNN algorithm [318]. Following this trend, other studies have employed data from dermoscopic images, sometimes combined with macroscopic images, to train supervised or unsupervised ML models, based principally on CNN algorithms, to detect and/or classify cutaneous malignancies including melanoma and basal cell carcinoma [319][320][321][322][323][324][325][326]. Notably, CNN algorithms also showed interesting performance in classifying and detecting other relevant dermatological disorders including onychomycosis, rosacea, atopic dermatitis, and psoriasis [327][328][329][330][331][332][333].
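Several of the dermatology studies above rely on validation schemes in which data from one subject are held out at a time, such as the leave-one-patient-out procedure used by Spyridonos and coworkers for ROI-level classification. The following Python sketch shows the general idea under the assumption that every ROI is tagged with a patient identifier; the function name, identifiers, and toy data are hypothetical and do not reproduce the original study.

```python
from collections import defaultdict


def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices).

    All samples (e.g., ROIs) belonging to one patient form the test set,
    while the samples of every other patient form the training set.
    """
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    for held_out, test in by_patient.items():
        train = [
            i
            for pid, idxs in by_patient.items()
            if pid != held_out
            for i in idxs
        ]
        yield held_out, train, test


# Hypothetical example: six ROIs drawn from three patients.
roi_patients = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_patient_out(roi_patients))
```

Grouping by patient keeps ROIs from the same subject out of both the training and test sets of a given round; splitting at the ROI level instead could let highly similar patches from one patient appear on both sides and make the estimated performance look better than it would be on new patients.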

Conclusion and Future Perspective
AI/ML has reemerged in recent years as a powerful set of tools for unlocking value from big datasets. The extraordinary increase in the use of AI and ML techniques in nearly all fields of technology, science, and medicine clearly indicates a significantly greater role for these procedures in the discovery of innovative therapies in the near future. The descriptive examples above display how useful these methodologies can be in discovering novel drug candidates, biomarkers, and drug targets as well as in detecting and evaluating the progression of a given disease. It is also clear from the literature that the rate of adoption of these methods is increasing significantly. This is driven by the growing use of high-throughput screens, the increased power and availability of open-source ML methods, and the development of new AI/ML algorithms, generating more accurate descriptors and model relationships.
Remarkably, the quality of the generated ML algorithms is principally defined by the quality of the input data, so proper data acquisition and curation is a crucial step for developing predictive and effective ML-based models. In the context of ML as a new diagnostic technique and as a means of identifying appropriate therapeutic regimens, most of the developed models were found to outperform current clinical standards in terms of sensitivity, specificity, and accuracy, employing the ROC method for comparing ML algorithms and clinician performance. This validation undoubtedly lends credibility to model performance, but for a real-world assessment any new methodology employed in clinical settings should demonstrate superior performance in properly designed, randomized clinical trials. Nonetheless, advances in ML will provide, in the near future, effective methods for addressing the uncertainty observed in translational medicine, facilitating more robust, data-driven decision making in developing the next generation of diagnostic tools and therapeutic agents for patients.