Induced Pluripotent Stem Cell-Based Drug Screening by Use of Artificial Intelligence

Induced pluripotent stem cells (iPSCs) are generated by reprogramming terminally differentiated somatic cells and can differentiate into various cell types. iPSCs are expected to be used for disease modeling and for developing novel treatments because differentiated cells from iPSCs can recapitulate the cellular pathology of patients with genetic mutations. However, a barrier to using iPSCs for comprehensive drug screening is the difficulty of evaluating their pathophysiology. Recently, the accuracy of image analysis has dramatically improved with the development of artificial intelligence (AI) technology. In the field of cell biology, it has become possible to estimate cell types and states by examining cellular morphology obtained from simple microscopic images. AI can evaluate disease-specific phenotypes of iPSC-derived cells from label-free microscopic images; thus, AI can be utilized for disease-specific drug screening using iPSCs. In addition to image analysis, various AI-based methods can be applied to drug development, including phenotype prediction by analyzing genomic data and virtual screening by analyzing structural formulas and protein–protein interactions of compounds. In the future, combining AI methods may rapidly accelerate drug discovery using iPSCs. In this review, we explain the details of AI technology and the application of AI for iPSC-based drug screening.


Introduction
Stem cell technology has recently been developed and many clinical applications are expected. Induced pluripotent stem cells (iPSCs) are generated by transferring defined factors, such as transcription factors, that are upregulated in undifferentiated cells during embryogenesis [1,2]. iPSCs have pluripotency, which means that the cells can differentiate into all cell types except extraembryonic tissue, and can be cultured on a large scale because of their good proliferative capacity; thus, iPSCs can be applied to many technologies. Although regenerative medicine is one of the most promising technologies for using iPSCs [3][4][5], disease modeling using iPSCs is also a promising field [6][7][8][9][10]. Genetic diseases are caused by mutations in DNA. Although there are several methods of genetic disease analysis, including exome and whole-genome sequencing [11], the underlying mechanisms cannot be fully explained using only genetic analyses, and it is difficult to understand patient-specific cellular dynamics without cellular analysis. We can analyze cells that are easy to obtain from patients, such as skin cells, by directly performing primary culture; however, it is difficult to analyze primary cultured cells that are not easily obtained directly, such as cardiomyocytes, vascular endothelial cells, and nerve cells. To overcome these issues, disease-specific iPSCs have been used to understand patient-specific cellular phenotypes [8]. Disease-specific iPSCs can be generated from patients with genetic mutations. iPSCs have the same genetic mutation as patients; therefore, differentiated cells from iPSCs recapitulate the cellular phenotype of patients ( Figure 1). Thus, there is great interest in using disease-specific iPSCs in the development of novel treatments for genetically intractable diseases for which no treatment exists [12,13]. 
Although it has been reported that disease-specific iPSCs can effectively reproduce pathological conditions, there are many obstacles to actual drug screening for diseases. It is difficult to verify whether a drug can improve the disease phenotype, and a simpler and more reliable method for evaluating the disease phenotype is needed. Recently, various problems have become solvable owing to technological developments in artificial intelligence (AI). AI technology has already begun to be introduced in medical biology, and its active use for a variety of problems is anticipated. In particular, the accuracy of image analysis using convolutional neural networks (CNNs), a deep learning technique, exceeds that of humans, and various applications are expected for stem cell biology and drug screening [14]. It has also been reported that the type [15] and state [16] of cells can be evaluated from cellular morphology in simple microscopic images using AI-based image analysis. Therefore, AI-based image analysis can serve as an evaluation index that reflects the pathological condition of cells, and is expected to be applied to drug screening using disease-specific iPSCs. In this paper, we discuss drug screening using iPSCs, with a particular focus on AI technology.
Figure 1. Disease-specific iPSCs are generated from patients with genetic mutations. iPSCs have the same genetic mutations as patients; therefore, differentiated cells from iPSCs can recapitulate the cellular phenotype of patients. Thus, disease-specific iPSCs can be used for disease analysis and drug screening of genetic diseases.

Patient-Specific iPSCs
Using iPSC technology, we can create many diseased cells that maintain the genetic disease phenotype of patients; therefore, iPSCs can be used for disease analysis and drug screening (Figure 1). Several studies have demonstrated that disease-specific iPSCs can recapitulate disease phenotypes. Cardiomyocytes, which act as pumps for the heart, are difficult to culture and analyze because they lose their proliferative ability in adulthood. Thus, iPSCs are useful for modeling heart disease. Cardiomyocytes contract owing to electrical activity, which is produced by the movement of various ions through channels. Abnormal channels can cause arrhythmias and sudden death. Long QT syndrome is one of the most common hereditary arrhythmias. Phenotypes such as action potential prolongation can be reproduced by iPSC-derived cardiomyocytes from patients [17,18]. Furthermore, changes in action potentials due to drug administration can be detected [19,20]. Cardiomyopathy, which is caused by genetic abnormalities in cardiomyocytes, can also be analyzed using iPSC-derived cardiomyocytes. Hypertrophic cardiomyopathy, in which cardiomyocytes are enlarged, is the most common type of cardiomyopathy. iPSC-derived cardiomyocytes exhibit a phenotype similar to hypertrophic cardiomyopathy [21,22]. Furthermore, it is possible to reproduce the pathological conditions and identify candidate drugs for treatment, such as endothelin antagonists [8]. Nerve cells do not proliferate in adults, and it is difficult to analyze them using cultured cells. In particular, cells from the central nervous system are difficult to obtain by biopsy; therefore, iPSCs can be useful for disease modeling. Many reports have shown successful iPSC-based modeling of neurological and psychiatric diseases, including Alzheimer's disease (AD) [23,24], Parkinson's disease [25,26], amyotrophic lateral sclerosis (ALS) [27,28], and schizophrenia [29,30].
In addition to diseases involving heart and nerve cells, diseases in most organs can be modeled using disease-specific iPSC technology. Vascular [9,31], kidney [32], liver [33], and lung [34] diseases, modeled by iPSCs, mimic pathological conditions well and are suitable for therapeutic development. Differentiated cells derived from iPSCs are generally immature, suggesting that disease modeling by iPSCs is suitable for early onset disease, but there is an ongoing debate on whether iPSCs can imitate the phenotype of late-onset diseases. However, there is evidence that by applying a stress load that mimics the pathological condition, iPSCs can correctly reproduce the pathological condition of late-onset diseases such as cardiomyopathy and neurodegenerative diseases [8,35,36]. Thus, iPSCs represent a promising technology for disease modeling and drug discovery.

Development of Machine Learning Technology
In recent years, various problems have become solvable because of the technological development of AI, and it is necessary to consider how AI can be applied in the fields of medicine and biology. AI was originally developed in the 1950s in an attempt to imitate human intelligence. General-purpose AI, which would imitate human intelligence and learn as humans do [37], is still under development and may require a long time for practical realization. However, AI technology specialized for specific abilities, such as image [38] and language recognition [39], has been rapidly developed and applied in various fields. The most important technique underlying specialized AI is machine learning. Whereas an explicit program, which is a general computer program, derives an answer from rules pre-programmed by humans, machine learning automatically learns regularities and classification criteria from data, and the trained model can then predict answers for unknown datasets. Machine learning has played a pivotal role in AI technology since the 1990s [40][41][42][43]. Various machine learning methods, such as random forests [44], support vector machines [45], and neural networks [46], are used in many tasks.

Supervised Learning, Unsupervised Learning, and Reinforcement Learning
Machine learning methods follow various patterns (Figure 2). Typical methods include supervised and unsupervised learning. Supervised learning is a method of learning in which correct answers are given for the training data [47]. The correct answer is given for all the data, and the program is trained so that its output is close to the answer. There are two types of supervised learning methods: regression and classification. Regression predicts continuous numerical values, whereas classification assigns data to discrete categories or classes. In contrast, unsupervised learning is a method of learning without the correct answer [48]. Programs determine the regularity and characteristics of data on their own, and group the data based on shared features and frequency of appearance. Representative methods of unsupervised learning include clustering, which groups similar objects, and principal component analysis, which reduces dimensions. Supervised learning allows efficient learning from the data, whereas unsupervised learning is very effective for tasks in which the correct answer is not known in advance. In addition to the aforementioned methods, a method called reinforcement learning has been developed in recent years [49]. In reinforcement learning, a program is designed to maximize rewards and optimizes its own behavior to do so. For example, AlphaGo Zero, an AI that incorporates reinforcement learning algorithms, surpassed the earlier version of AlphaGo that had defeated a top professional Go player after only three days of self-play training [50]. Notably, the machine taught itself to play at this level without human game records. Reinforcement learning has not yet been fully applied in the fields of medicine and biology, but it has great potential in the future [51].
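The contrast between supervised and unsupervised learning can be illustrated with a minimal sketch. The one-dimensional data, the class names, and the helper functions `fit_centroids`, `predict`, and `two_means` are hypothetical and chosen only for illustration; they do not come from any cited study. The supervised part uses the labels to build a nearest-centroid classifier, whereas the unsupervised part (a two-cluster k-means) recovers the same two groups without seeing any labels.

```python
# Toy illustration only: the data and helper functions are hypothetical.

def fit_centroids(points, labels):
    """Supervised learning: average the labeled points of each class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Classification: return the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def two_means(points, iterations=10):
    """Unsupervised learning: 1-D k-means with k = 2; no labels are used."""
    c0, c1 = min(points), max(points)
    for _ in range(iterations):
        g0 = [x for x in points if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in points if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1

points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels = ["small", "small", "small", "large", "large", "large"]

centroids = fit_centroids(points, labels)   # uses the correct answers
predicted = predict(centroids, 4.0)         # label for an unseen value
clusters = two_means(points)                # finds the groups on its own
```

In this toy case both approaches find group centers near 1 and 5; the difference is that only the supervised model can name the groups, because only it was given the answers.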

Figure 2. Various machine learning techniques. There are various machine learning methods, such as random forests, support vector machines, k-nearest neighbor, decision tree, and clustering. Deep learning is a type of machine learning technique that consists of a multilayer neural network. There are three patterns of learning: supervised learning, unsupervised learning, and reinforcement learning.

Deep Neural Network
Deep learning is a type of machine learning technique that consists of a multilayer neural network [52], and each building unit that makes up a neural network is called a simple perceptron. Although the concept of a simple perceptron was developed in the 1940s [53,54], neural networks built from it are now among the most commonly used machine learning techniques. A simple perceptron is a program originally created to imitate neuronal activity. In neurons, a potential difference is generated in the cell based on the input, and when a certain threshold is exceeded, the neuron is depolarized and information is transmitted to the next neuron. Similarly, a simple perceptron has multiple input values and produces an output when their weighted sum exceeds a threshold. Each input value is multiplied by a weight that represents the importance of that input. A neural network is a program in which simple perceptrons are stacked, and it consists of three types of layers: the input, hidden, and output layers. The values transmitted from the input layer propagate according to the calculation of the simple perceptron, and the answer is output from the output layer. For the neural network to output the correct answer, it is necessary to adjust the weights; this adjustment is called training [14]. It has been found that stacking more layers in the neural network improves accuracy. One disadvantage of deep learning is that calculations take a long time because the network is more complicated. However, efficient learning methods such as backpropagation [55] and massively parallel computing [56][57][58] using graphics processing units (GPUs) have been developed; therefore, deep learning has played a central role in advancing machine learning techniques.
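The simple perceptron described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the formulation of any cited work: the threshold is fixed at zero and absorbed into a bias term, the classical perceptron learning rule is used for training, and the task (a logical AND gate) and the function names are hypothetical. Deep networks are trained with backpropagation rather than with this rule.

```python
def step(value):
    """Fire (output 1) only when the weighted sum exceeds the threshold of 0."""
    return 1 if value > 0 else 0

def perceptron_output(weights, bias, inputs):
    """Multiply each input by its weight, add the bias, and apply the threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(weighted_sum)

def train(samples, epochs=20, learning_rate=1.0):
    """Training: adjust weights and bias whenever the output is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - perceptron_output(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Hypothetical toy task: learn the logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_gate)
outputs = [perceptron_output(weights, bias, x) for x, _ in and_gate]
```

A single perceptron can only separate classes with a straight line, which is one reason why many such units are stacked into hidden layers.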

Convolutional Neural Network
Although many fields can use deep neural networks, the most promising field for implementation is image analysis, which includes image classification, object detection, and semantic segmentation. The most basic deep learning method for image analysis is the convolutional neural network (CNN) [59]. A CNN is composed of two types of layers: convolution layers and pooling layers (Figure 3). One of the greatest features of CNNs is their ability to extract complex image features while preserving the positional information of the image. In the convolution layer, the values of the feature map are computed by performing a convolution operation using a filter, whose entries correspond to the weights. In the pooling layers, the maximum or average value of each local region is output, which improves the robustness of the program (Figure 3). A deep neural network is constructed by connecting these two types of layers. Eventually, the data are flattened into a one-dimensional vector and the answer is output through operations in the fully connected layer. The great power of CNNs has been demonstrated in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [60], in which programs compete on classification performance. After a CNN was introduced to the competition in 2012, the error rate decreased dramatically, and in 2015, CNN-based classification exceeded human recognition accuracy [61]. It is possible to perform classification with extremely high accuracy using the latest networks [60][61][62][63].
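The convolution and pooling operations can be made concrete with a small sketch in plain Python. The 5 × 5 image, the 2 × 2 vertical-edge filter, and the function names are hypothetical; in a trained CNN the filter values are learned weights rather than hand-picked numbers. As in most deep learning libraries, the "convolution" here slides the filter without flipping it (cross-correlation).

```python
def convolve2d(image, kernel):
    """Slide the filter over the image (no padding, stride 1) to build a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# Hypothetical image: dark left part (0) and bright right part (1).
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]  # responds to dark-to-bright transitions

feature_map = convolve2d(image, vertical_edge)        # 4 x 4 map
pooled = max_pool2d(feature_map)                      # 2 x 2 map
flattened = [value for row in pooled for value in row]  # input to a dense layer
```

The feature map is nonzero only in the column where the edge lies, so positional information is preserved; pooling then keeps the strongest response in each window, which makes the output less sensitive to small shifts of the edge.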

AI for Cell Recognition Based on Morphology
Deep learning technology is widely used in the fields of molecular and cellular biology and has solved many complicated tasks. Generally, cells must be labeled with a specific molecular marker before microscopic observation to infer cell type and intracellular state (Figure 4). When the cell type or state differs, the characteristic gene expression and protein composition change, which greatly changes the cell morphology. Label-free cellular analysis can therefore be performed by analyzing the cell morphology obtained from bright-field or phase-contrast microscopy [64] (Figure 4). Christiansen et al. developed a label-free system to recognize cell types and states from bright-field microscopy images without molecular labeling by immunostaining [65]. Edlund et al. developed the LIVECell system, which can classify eight types of cells with high accuracy using phase-contrast microscopy images [66]. It is possible to visualize not only the cell type, but also the intracellular components, as well as their localization and type, without molecular labels [67,68]. Microscopic images of stem cell differentiation have also been analyzed using AI. The differentiation of C2C12 cells [69] and hematopoietic stem cells [70] was evaluated with high accuracy. Additionally, by using a recurrent neural network (RNN), which can analyze time-series data, AI can predict the final lineage of differentiating hematopoietic stem cells from time-lapse microscopic images with high accuracy [70]. Machine learning can also be applied to label-free cell sorting systems. Ota et al. used a barcode system to convert cell images into waveform information and constructed a cell sorting system called ghost cytometry, which is analogous to fluorescence-activated cell sorting [71]. Ugawa et al. developed a ghost cytometry system that can distinguish undifferentiated human iPSCs from iPSC-derived differentiated cells, such as neuroectodermal cells (NECs) and hepatic endodermal cells (HECs), and can classify types of peripheral white blood cells [72]. Machine learning algorithms can also be used to classify cell morphology [73,74] and to analyze cardiac tissue contractility and molecular imaging data [75].
Pharmaceuticals 2022, 15, 562

Figure 4. Label-free cell recognition by artificial intelligence. In the molecular biology-based approach, the cells are labeled with a specific molecular marker to infer the cell type or state before observation (left). On the other hand, in the AI-based approach, AI detects the morphological changes in cells from microscopy images and infers the cell type or state label-free.

AI for Bioinformatics Tool
While phenotypic analysis of cells using AI-based image analysis is very important, AI is also useful for processing large datasets, such as genomic data. Liu et al. developed a CRISPR interference (CRISPRi) platform that targets 16,401 long non-coding RNA (lncRNA) loci in various cells, including human iPSCs, and screened them for functional lncRNA genes. They identified lncRNAs that are involved in cell growth and examined whether hit lncRNAs could be distinguished from non-hit lncRNAs using machine learning techniques. They constructed a logistic regression model and identified hit lncRNAs using 18 genomic datasets, such as RNA-seq data, enhancer maps, and copy number variations. This study shows that machine learning models are also useful for analyses that combine multiple genomic datasets, and their application to disease-specific iPSC models is expected [76]. iPSCs can be created by introducing genes into somatic cells; however, reprogramming is inefficient, time-consuming, and costly. Warner et al. developed a computer model called DeepNEU that identifies genes and molecules in iPSCs. DeepNEU is a machine learning model that uses an unsupervised learning method with a fully connected recurrent neural network architecture. DeepNEU includes a database containing information on many gene networks, with which the efficient reprogramming of iPSCs can be simulated. DeepNEU was also applicable to induced neural stem cell (iNSC) and cardiomyocyte models, and it was possible to simulate diseases such as Rett syndrome using aiNSCs. These data show that machine learning-based approaches for genomic-based iPSC identification and functional characterization are efficient [77].
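As an illustration of how a logistic regression model can separate hits from non-hits based on numerical features, the following minimal sketch trains such a model by gradient descent. It is not the model of the cited study: only two made-up features per locus are used instead of 18 genomic datasets, and all values, feature meanings, and function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hit_probability(weights, bias, x):
    """Predicted probability that a locus with features x is a hit."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def fit_logistic(features, labels, learning_rate=0.5, steps=2000):
    """Plain batch gradient descent on the mean log-loss."""
    n = len(features)
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            error = hit_probability(weights, bias, x) - y
            grad_w = [g + error * v for g, v in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Hypothetical, scaled features per locus (for example, expression level and
# an enhancer-overlap score); label 1 = hit, 0 = non-hit.
features = [(1.0, 0.9), (0.8, 1.0), (0.1, 0.2), (0.2, 0.0)]
labels = [1, 1, 0, 0]

weights, bias = fit_logistic(features, labels)
p_hit_like = hit_probability(weights, bias, (0.9, 0.95))
p_non_hit_like = hit_probability(weights, bias, (0.15, 0.1))
```

The learned weights indicate how strongly each feature contributes to the prediction, which is one reason logistic regression is attractive when several genomic datasets are combined.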

AI for iPSC and iPSC-Derived Differentiation Cell
AI is also useful for cellular analyses of iPSCs. Joutsijoki et al. constructed a system to automatically identify the quality of iPSCs using machine learning techniques. Image features were extracted using the scale-invariant feature transform (SIFT), and various machine learning techniques, such as k-nearest neighbor (k-NN) and support vector machine (SVM), were used to construct a model that classifies undifferentiated iPSC colonies as good, semigood, or bad [78]. Not only iPSC colonies but also iPSC-derived cells can be analyzed by AI. Since iPSCs have the same genetic characteristics as the patient, iPSC-derived cells exhibit patient-specific cell phenotypes and are effective in patient-specific or disease-specific drug screening. Endothelial cells line the lumen of blood vessels and play an important role in maintaining blood vessel homeostasis. Several diseases, such as valvular heart disease, pulmonary hypertension, and moyamoya disease, are caused by genetic abnormalities in endothelial cells. Technology for creating vascular endothelial cells from iPSCs has been developed, and we can analyze these diseases by creating patient-specific vascular endothelial cells. To verify whether AI can be used for the analysis of disease-specific vascular diseases, we first examined whether vascular endothelial cells derived from iPSCs can be identified by AI from microscopic images [15]. We independently induced the differentiation of iPSCs into vascular endothelial cells four times and obtained phase-contrast microscopy images and fluorescence images of PECAM1 (CD31), an endothelial marker, at the same location. To identify vascular endothelial cells in phase-contrast microscopy images, the cells in the images were extracted, and the AI was trained to predict whether the cells were endothelial cells, using CD31 immunostaining as the ground truth.
It is necessary to prepare a large dataset to train the AI optimally, and it was possible to prepare approximately 120,000 cell images from the four sets of phase-contrast microscopy images by cropping each cell in the images. When we examined the number of images required for successful learning, we found that at least tens of thousands of images are required. Next, we examined whether it would be more accurate to use a larger image that includes the surrounding environment for cell-type prediction, and found that a larger image was much better. Adjusting the deep neural network was also effective: deepening the network improved accuracy. Finally, to estimate the performance on an unknown dataset, we performed k-fold cross-validation and showed that recognition with high accuracy was possible. We demonstrated that iPSC-derived cells can be identified using AI, and that this approach could be effective for drug screening using iPSCs.
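The idea behind k-fold cross-validation can be sketched as follows. This is a generic illustration, not the exact protocol of the study described above; the function name `k_fold_indices` and the sample counts are hypothetical. The dataset is split into k parts, and each part is used once for validation while the model is trained on the remaining parts.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, validation) index lists; every sample is validated exactly once."""
    # Distribute the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size

# Hypothetical example: 10 cell images split into 4 folds.
folds = list(k_fold_indices(10, 4))
validated = sorted(i for _, validation in folds for i in validation)
```

In practice the indices are usually shuffled first, and the accuracy is averaged over the k validation folds. For cell images, it may also be worth keeping crops from the same microscopy field or differentiation batch in the same fold, so that nearly identical cells do not appear in both training and validation data.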

Disease Evaluation Using AI
It has been shown that iPSC-derived differentiated cells can be identified with high accuracy using AI. However, AI can evaluate not only the cell type but also the pathological state of the cell. If AI can evaluate the morphological changes caused by the pathological state of cells, drug discovery research using iPSCs may also be revolutionized. Previously, when performing drug discovery screening with iPSCs, it was necessary to search for molecular markers that could assess the pathological conditions of the cells. However, in many cases no effective molecular markers exist, which makes screening difficult. Therefore, we constructed a system that uses AI to infer the pathological state of cells from changes in cell morphology, and applied it to drug discovery screening using iPSCs [16]. We examined whether the pathological conditions could be elucidated by AI using cultured endothelial cells and human umbilical vein endothelial cells (HUVECs). As the pathological condition, we used a cellular senescence model of endothelial cells. During the progression of age-related diseases, several stressors damage DNA, and cells become senescent, which is a protective mechanism that prevents oncogenesis by inducing cell cycle arrest [79]. Senescent cells exhibit an inflammatory phenotype called the senescence-associated secretory phenotype (SASP) and induce organ dysfunction [80,81]. Endothelial senescence plays a key role in the progression of cardiovascular diseases. To evaluate pathological conditions using AI, we first trained a convolutional neural network (CNN) to classify cells as healthy or senescent. We captured phase-contrast microscopy images of healthy and senescent cells, automatically cropped each cell from the large images, and trained the AI on the cropped images. After training, the AI was able to classify healthy and senescent cells with extremely high accuracy after relatively little training.
Next, we verified whether the degree of senescence could be quantitatively measured using the trained CNN. We succeeded in creating a senescence score that quantitatively evaluates the degree of cellular senescence by using the senescence probability output by the CNN. The senescence score was highly correlated with the intensity of various stresses, such as the level of oxidative stress, the camptothecin concentration, and the number of replications. We named the system Deep-SeSMo (Deep Learning-based Senescence Scoring System by Morphology) [16] (Figure 5). Deep-SeSMo assigns a score to each microscopy image in approximately 100 µs; thus, it can be applied to high-throughput drug screening. In addition, it can evaluate newly acquired datasets with high accuracy and can be applied to datasets obtained from other facilities. Cellular senescence can be similarly evaluated in cells other than vascular endothelial cells, such as fibroblasts. Therefore, we consider Deep-SeSMo to have high generalization performance, which makes it well suited to drug screening using cell models. AI-based cellular image analysis can also be applied to other cell lines. Schiff et al. used fibroblasts from 91 patients with Parkinson's disease to build a system for automatic recognition of cell phenotypes using CNN models pre-trained on ImageNet. The results showed that fibroblasts derived from patients with Parkinson's disease could be separated from those of healthy controls. These are important data showing that disease phenotypes can be analyzed using AI [82].
Figure 5: Deep-SeSMo is an AI-based system that creates senescence scores from phase-contrast microscopy images without molecular labels and can quantitatively evaluate the degree of cellular senescence by applying the senescence probability output from the CNN.
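To make the idea of a probability-based senescence score concrete, the following sketch aggregates per-cell senescence probabilities into one score per condition and correlates the scores with stress intensity. The exact aggregation used by Deep-SeSMo is not specified here, so taking the mean probability is an assumption, and the dose values, probabilities, and function names are hypothetical.

```python
import math

def senescence_score(cell_probabilities):
    """Mean of per-cell senescence probabilities (assumed aggregation)."""
    if not cell_probabilities:
        raise ValueError("at least one cell probability is required")
    return sum(cell_probabilities) / len(cell_probabilities)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical CNN outputs: probability that each cropped cell is senescent,
# grouped by the dose of a stressor (arbitrary units).
probabilities_by_dose = {
    0: [0.05, 0.10, 0.08, 0.12],
    50: [0.30, 0.42, 0.35, 0.28],
    100: [0.61, 0.55, 0.70, 0.66],
    200: [0.88, 0.93, 0.90, 0.95],
}

doses = sorted(probabilities_by_dose)
scores = [senescence_score(probabilities_by_dose[d]) for d in doses]
correlation = pearson_r(doses, scores)
```

A score built this way is continuous rather than binary, which is what allows it to track graded stress intensities and, later, graded drug effects.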

Drug Screening Using AI
As AI was able to infer the pathological state of cells from cell images, the AI-based evaluation system could be used to search for novel drugs that ameliorate disease. To validate the performance of Deep-SeSMo, which quantitatively evaluates endothelial cellular senescence, the effects of two anti-aging drugs, metformin and nicotinamide mononucleotide (NMN), were evaluated [16]. Metformin improves insulin resistance and lowers blood glucose levels by regulating AMPK function. In recent years, metformin has been shown to suppress aging [83], and clinical trials have been conducted with increased lifespan as the outcome. NMN is an activator of the longevity gene SIRT1 and is likewise expected to suppress aging [84]. When metformin and NMN were administered to senescent endothelial cells, the expression of the senescence markers p16 and SA-β-gal and of markers in the p21-p53 pathway was reduced, indicating that cellular senescence was suppressed. Deep-SeSMo accurately captured the anti-senescent effects of these drugs in a dose-dependent manner. Next, 80 kinase inhibitors were administered to senescent HUVECs, and anti-senescent drugs were screened using Deep-SeSMo. Three methods were used to induce senescence: oxidative stress, camptothecin, and replication stress. The screening results were ranked, and the top four drugs were selected as hit compounds. To verify whether Deep-SeSMo had correctly identified anti-senescent drugs, the four hit compounds were validated using molecular biological methods. Western blotting revealed that all four compounds had anti-senescent effects, and RNA sequencing also revealed suppression of the NFκB-mediated inflammatory pathway. Inflammation is an important phenotype through which senescent cells damage the surrounding tissues. Importantly, the drugs identified by Deep-SeSMo suppressed not only cellular senescence but also the inflammatory phenotype.
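The ranking step of such a screen can be sketched as follows. The compound names and scores are hypothetical, and averaging the senescence score over the three induction methods before ranking is an assumption made for illustration; the original study may have combined the conditions differently.

```python
def rank_hits(scores_by_compound, top_n=4):
    """Rank compounds by mean senescence score (lower = stronger effect)."""
    mean_scores = {
        name: sum(values) / len(values)
        for name, values in scores_by_compound.items()
    }
    ranked = sorted(mean_scores.items(), key=lambda item: item[1])
    return ranked[:top_n]

# Hypothetical senescence scores after treatment, one value per induction
# method: (oxidative stress, camptothecin, replication stress).
screen = {
    "cmpd_A": [0.82, 0.79, 0.85],
    "cmpd_B": [0.31, 0.40, 0.35],
    "cmpd_C": [0.55, 0.60, 0.58],
    "cmpd_D": [0.22, 0.30, 0.27],
    "cmpd_E": [0.90, 0.88, 0.93],
    "cmpd_F": [0.45, 0.41, 0.50],
}

hits = rank_hits(screen, top_n=4)
hit_names = [name for name, _ in hits]
```

Requiring a low score under all three induction methods favors compounds whose effect does not depend on one particular stressor.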
Thus, Deep-SeSMo may be a particularly useful system for drug screening using cell models [16], including patient-specific iPSCs. For drug discovery using AI, it is important not only to perform phenotypic screening using cellular images but also to construct systems that predict drug effects from the structural formulas of compounds or proteins. Graph convolutional neural networks (GCNs) are often used for the structural analysis of compounds. A GCN is an architecture that extends the convolutional neural network to datasets with a graph structure and can be used to analyze various compounds. Stokes et al. used a GCN model to search for antibiotics based on compound structures. They trained the model using 1760 molecules whose phenotypes were already known, and they identified hit compounds from datasets of over 100 million compounds [85]. Wang et al. constructed SSGraphCPI, a system that analyzes the interactions between compounds and proteins and predicts the target protein of a compound. SSGraphCPI consists of a recurrent neural network (RNN) with an attention mechanism and a GCN [86].
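How a graph convolution operates on a compound can be shown with a small sketch in which atoms are nodes and bonds are edges. It implements one generic layer in the commonly used normalized-adjacency form, ReLU(Â H W), followed by mean pooling; it is not the model used in [85] or [86], and the toy molecule, atom features, and weight values are all assumptions.

```python
import math

def normalized_adjacency(edges, n_nodes):
    """Build D^-1/2 (A + I) D^-1/2 for an undirected graph with self-loops."""
    a = [[1.0 if i == j else 0.0 for j in range(n_nodes)] for i in range(n_nodes)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [
        [a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n_nodes)]
        for i in range(n_nodes)
    ]

def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]

def gcn_layer(a_hat, node_features, weights):
    """One graph convolution: each atom mixes its neighbours' features."""
    z = matmul(matmul(a_hat, node_features), weights)
    return [[max(0.0, value) for value in row] for row in z]

def readout(node_features):
    """Mean-pool atom vectors into a single molecule-level vector."""
    n = len(node_features)
    return [sum(column) / n for column in zip(*node_features)]

# Toy molecule: heavy atoms of ethanol, C-C-O (hydrogens omitted).
edges = [(0, 1), (1, 2)]
atom_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # one-hot: [is_C, is_O]
weights = [[0.5, -0.2], [0.1, 0.3]]  # arbitrary fixed weights, not trained

a_hat = normalized_adjacency(edges, n_nodes=3)
hidden = gcn_layer(a_hat, atom_features, weights)
molecule_vector = readout(hidden)
```

In a trained model the weights would be learned, several such layers would be stacked, and the pooled molecule vector would feed a classifier that predicts, for example, antibacterial activity.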

Disease-Specific iPSCs and AI
AI-based disease evaluation can also be applied to disease modeling using iPSCs (Table 1). iPSC-derived cardiomyocytes can be used to evaluate the patient-specific cardiotoxicity of drugs. Lee et al. successfully identified myocardial contractions in iPSC-derived cardiomyocytes using bright-field images. They used principal component analysis to identify the direction of myocardial contraction and classified normal and abnormal myocardial contractions using a support vector machine. Using this system, they demonstrated that the cardiotoxicity of various compounds can be evaluated [87]. Imamura et al. generated iPSCs from healthy control subjects and patients with amyotrophic lateral sclerosis (ALS). Subsequently, motor neurons were differentiated from the iPSCs for disease modeling. β3-tubulin immunostaining images were obtained, and cells derived from healthy individuals and patients with ALS were classified using a CNN. After training, the AI classified them with high accuracy, with an AUC exceeding 0.97. An important point in this study is that the AUC was as low as 0.6 with random forest, a classical machine learning method, which clearly demonstrates the usefulness of the CNN. The morphology of cells derived from iPSCs differs depending on the cell line; however, in this study, the morphological heterogeneity among cell lines was overcome by generating iPSCs from many individuals, including 15 healthy subjects and 15 patients with ALS. AI techniques other than CNN-based image analysis are also useful for disease evaluation using iPSCs [88]. Hidaka et al. developed a machine learning algorithm based on a heat diffusion equation (HDE) model and performed compound screening to suppress cell death in iPSC-derived motor neurons. The HDE model identified 5875 compounds from a screening set of two million compounds [89].
Cardiomyocytes drive the pumping function of the heart, and their pathological conditions can cause heart failure. A technique for quantitatively evaluating the contraction of iPSC-derived cardiomyocytes by focusing on calcium currents has also been developed. Furthermore, machine learning can classify normal and pathological cardiomyocytes using the calcium current as an index [90,91]. Thus, AI technology may be useful for pathological evaluation and drug screening using iPSCs.
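The AUC values quoted above for classifier comparison can be read as the probability that a randomly chosen diseased sample receives a higher score than a randomly chosen healthy one. The sketch below computes the AUC directly from that definition; the labels and scores are hypothetical and the function name is our own.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier outputs: 1 = patient-derived, 0 = healthy control.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why an AUC near 0.6 indicates a weak classifier and one above 0.97 a strong one.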

Novel Technology for Disease Modeling with iPSCs
In recent years, technologies have been developed that not only generate differentiated cells from iPSCs but also induce cell groups with a tissue structure consisting of multiple cell types. Organoids, in which multiple cell types maintain a three-dimensional tissue structure, are created by inducing the differentiation of iPSCs in a three-dimensional culture system. Organoids are multicellular and may be much more useful for disease analysis than simple cell models [92][93][94]. Many reports have demonstrated that disease phenotypes can be reproduced using organoids [95][96][97][98][99][100]. Tang et al. analyzed the specific phenotype of Down syndrome using iPSC-derived cerebral organoids and identified important pathways involved in the disease [96]. In addition, disease models using organoids have been constructed for a wide range of diseases, including Alzheimer's [97], Parkinson's [98], lung [99], and liver diseases [100]. Organoids are also used in the search for effective drugs and in toxicity tests [101][102][103][104]. Park et al. created neural organoids from iPSCs derived from patients with Alzheimer's disease and presented detailed strategies for drug screening [104]. In addition to organoids, technologies that construct tissue-like structures by combining cells with engineering techniques have been developed. An organ-on-a-chip is a technology that constructs a multicellular tissue structure on a microfluidic device and uses it as a disease model. Many studies have analyzed diseases by constructing tissues of iPSC-derived differentiated cells on an organ-on-a-chip [105,106], and this technique might be a promising application for disease analysis using iPSCs. Evaluating diseases using iPSCs requires considerable manpower and labor for cell culture. In recent years, automatic culture machines that use robots have been developed. Tristan et al. constructed an automated robotic culture system based on the CompacT SelecT (CTST) [107]. CTST supports various types of cells, containers such as flasks and well plates of various sizes, and pipetting. Scientists can remotely direct various protocols without entering the laboratory. iPSCs can not only be automatically maintained but can also be automatically induced to differentiate into cells such as nerve cells, cardiomyocytes, and hepatocytes. Since CTST can handle plates with up to 384 wells, it is considered very useful for high-throughput drug searches using iPSCs. Such automation will be crucial for high-throughput screening in the future.

Conclusions
Because disease-specific iPSCs carry the genetic abnormalities that cause disease, they are a useful tool for analyzing cellular pathology: they can be differentiated into cells that are difficult to obtain from patients, such as cardiomyocytes and neurons. Disease-specific iPSCs are very useful in the search for drugs that are effective against diseases and for therapies appropriate to each patient. AI has made remarkable progress in recent years, especially in image analysis using convolutional neural networks. AI image analysis techniques can now be used to analyze even the characteristic morphological changes of diseases. When using pathological iPSC-derived model cells for drug discovery, it is often difficult to define an index to evaluate cellular pathology, but indices based on AI image analysis have proven to be very effective. By making full use of AI-based image analysis, simple, label-free, high-throughput drug screening becomes possible, which will accelerate iPSC-based drug discovery and development. AI can also be applied to drug development through technologies other than image analysis, such as predicting diseases from genomic data and RNA expression, which makes it possible to infer the phenotype of a disease from genetic information. AI can also analyze the structural formulas of compounds and protein-protein interactions, making it possible to narrow down candidate compounds through in silico virtual screening. In the future, the combination of these methods will accelerate drug discovery using iPSCs. The development of methods to induce iPSC differentiation is also considered very important for drug discovery. In particular, it is very important to create cell populations with a three-dimensional tissue architecture, such as organoids and organ-on-a-chip systems, because they resemble the actual in vivo environment more closely than simple cellular systems. Human labor is also a consideration in high-throughput screening, and it is important to automate cell culture and experimental procedures using robots. As we have seen, various techniques have been developed for disease evaluation and drug screening using iPSCs, and combining these technologies will lead to further innovation in drug discovery using iPSCs, resulting in the development of novel treatments.