Induced Pluripotency and Gene Editing in Disease Modelling: Perspectives and Challenges

Embryonic stem cells (ESCs) are chiefly characterized by their ability to self-renew and to differentiate into any cell type derived from the three main germ layers. Somatic cells can be reprogrammed to form induced pluripotent stem cells (iPSCs) via various strategies. Gene editing makes targeted changes in the genome, and its efficiency has been significantly enhanced by the use of engineered endonucleases, such as homing endonucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and Cas9 of the CRISPR system. The combination of somatic cell reprogramming with gene editing enables us to model human diseases in vitro, in a manner considered superior to animal disease models. In this review, we discuss the various strategies of reprogramming and gene targeting with an emphasis on the current advancements and challenges of using these techniques to model human diseases.


Introduction
In 1963, Ernest Armstrong McCulloch and James Edgar Till demonstrated the presence of self-renewing cells in mouse bone marrow [1] and, in 1981, Martin Evans, Matthew Kaufman, and Gail R. Martin were able to derive embryonic stem cells (ESCs) from mouse embryos [2]. ESCs are characterized by their ability to self-renew indefinitely, as long as their genetic expression profile favors self-renewal [3]. In addition, they are pluripotent, namely they are able to differentiate into nearly any of the cell types derived from the three main germ layers that comprise an organism [4]. Due to these characteristics, ESCs are thought to be promising in the search for a cure for degenerative diseases [5].
In 2006, Kazutoshi Takahashi and Shinya Yamanaka successfully generated induced pluripotent stem cells (iPSCs) from mouse fibroblast cultures by the addition of a few defined transcription factors [6]. Since then, many strategies for generating safer and more stable iPSCs from a variety of somatic cell types have been developed [7].
The discovery that somatic cells can be reprogrammed to form iPSCs, which share similar characteristics with ESCs, has expanded the prospect for development of cellular therapies for degenerative diseases [8]. iPSCs also face less ethical controversy than ESCs. Yamanaka utilized retroviral vectors containing Oct4, Sox2, Klf4 and c-Myc to overexpress these genes and reprogram mouse fibroblasts to iPSCs [6]. Similarly, Yu et al. [18] were able to generate iPSCs from both mouse and human fibroblasts by inducing the overexpression of Oct4, Sox2, Nanog, and Lin28, with the use of lentiviral vectors. iPSCs can also be generated via the use of an inducible lentiviral system, where iPS cell clones are differentiated into fibroblast-like cells, which can then be induced to express reprogramming factors, enabling secondary reprogramming [19].
Another approach for transgene-mediated reprogramming is by the use of integrating non-viral inducible plasmid vectors. For example, Merkl et al. [20] used a doxycycline-inducible plasmid vector containing murine Oct4, Sox2, c-Myc and Klf4 to reprogram rat fibroblasts.
It was also found that the introduction of certain small molecules in combination with reprogramming factors could enhance reprogramming efficiency. The compound E-616452 (RepSox) was found to be able to replace Sox2 in the reprogramming of mouse embryonic fibroblasts (MEFs). RepSox acts by inhibiting transforming growth factor-β (TGF-β) signaling, thus upregulating Nanog [21]. Kenpaullone, a GSK3 inhibitor, is another compound that enhanced the reprogramming of MEFs by complementing, and thus replacing, Klf4 [22]. In addition, Lin et al. [23] demonstrated that when Yamanaka factors were combined with SB431542, an Alk5 inhibitor, PD0325901, a MEK inhibitor, and thiazovivin, a 200-fold increase in reprogramming efficiency could be attained.
A variety of studies have also illustrated that certain small molecules are able to replace some of the Yamanaka factors in reprogramming. For example, compounds such as A-83-01, PD0325901, PS48, 0.25 mM sodium butyrate [24], Vitamin C [25], BIX-01294, BayK8644 [26], and valproic acid (VPA) [27] are able to either replace factors assumed to be crucial for reprogramming, or increase reprogramming efficiency [28]. In addition, Hou et al. [29] demonstrated that seven small-molecule compounds were able to reprogram mouse somatic cells in the absence of the expression of exogenous transcription factors.
Due to the ease of utilising transgene-based reprogramming, these methods remain the most widely used strategies in reprogramming. However, as the site of viral integration is usually random, viral-mediated reprogramming carries the risk of insertional inactivation of a vital gene or perturbation of endogenous gene expression [8]. Another problem associated with this type of cellular reprogramming is low reprogramming efficiency [8].

Transgene-Free Cellular Reprogramming Methods
Due to the risks and limitations associated with viral-mediated cellular reprogramming methods, several other methods for generating iPSCs have been developed. As mentioned above, it is now possible to reprogram mouse somatic cells with small-molecule compounds in the absence of exogenous transcription factors [29]. The ability to generate human iPSCs utilising small-molecule compounds alone is a highly desired goal as small-molecule reprogramming has a smaller risk of perturbing endogenous gene sequences or expression [28].
Alternatively, iPSCs can be generated using non-integrating plasmid vectors. The transient co-transfection of plasmids encoding the Yamanaka factors enabled the generation of iPSCs from mouse embryonic fibroblasts [30].
Reprogramming can also be achieved with episomal vectors and with viruses that do not integrate their genes into the host genome. In one approach, Yu et al. [31] cloned six reprogramming factors (Oct4, Sox2, Nanog, LIN28, c-Myc and Klf4) into an oriP/EBNA1 (Epstein-Barr nuclear antigen-1) based episomal vector and, thus, were able to reprogram human fibroblasts into iPSCs. In addition, multiple labs have made use of Sendai viruses, which are RNA viruses, to reprogram somatic cells such as human fibroblasts [32] and human peripheral blood cells [33]. Similarly, non-integrating DNA adenoviral vectors encoding Yamanaka factors have been successfully used to reprogram MEFs, mouse liver cells [34] and human embryonic fibroblasts [35].
Transgene-free cellular reprogramming can also be achieved by utilising modified lentiviral vectors in which the vectors can be excised from the genomes of the generated iPSCs. For example, Chang et al. [36] successfully generated iPSCs from dermal fibroblasts by using a polycistronic lentiviral vector that encoded the reprogramming factors Oct4, Sox2, and Klf4. This lentiviral vector contained a loxP site in the 3′-LTR region, such that the vector could be deleted upon the expression of Cre recombinase. Similarly, Sommer et al. [37] successfully generated iPSCs from peripheral blood mononuclear cells by using a single excisable polycistronic lentiviral Stem Cell Cassette (STEMCCA) that encoded Yamanaka factors.
Similarly, modified transposons can be utilized for transgene-free cellular reprogramming. Yusa et al. [38] used a piggyBac-derived transposon system carrying 2A peptide-linked reprogramming factors for reprogramming, which was subsequently removed by the re-expression of transposase. The piggyBac system excises without a footprint, such that the impact on the genome is minimized.
Another approach for transgene-free reprogramming is the use of synthetic mRNA encoding Yamanaka factors. These can be introduced into cells via complexing with cationic vehicles [39] or by electroporation [40].
In addition, transgene-free cellular reprogramming can be achieved by the use of recombinant proteins, such as Yamanaka factors fused to poly-arginine domains. Proteins containing such poly-arginine domains are able to easily cross cell membranes, and mouse embryonic fibroblasts (MEFs) [41] and human newborn fibroblasts (HNFs) [42] were successfully reprogrammed using this approach. Reprogramming factors can also be transfected into cells by magnet-based nanofection, where proteins are conjugated to non-viral magnetic nanoparticles, enabling their easy transfection into cells via magnetic force [43].
MicroRNAs (miRNAs) can also be used to generate iPSCs, without any of the commonly used reprogramming factors. Anokye-Danso et al. [44] used a lentiviral vector to induce the expression of mouse miR302/367 in MEFs and human fibroblasts. Here, reprogramming was successful with a higher efficiency when compared with reprogramming using Yamanaka factors.
These transgene-free methods eliminate the risk of random integration, and thus may be preferred, but these methods are often tedious and tend to have lower efficiencies.
The methods discussed above have been utilized for reprogramming various cell types into iPSCs (Figure 1, Table 1). These iPSCs can then be directly differentiated into the cell type of interest to model diseases, or have their genome modified by gene editing before differentiation, as discussed below.
Int. J. Mol. Sci. 2015, 16

Cell Types from Which Induced Pluripotent Stem Cells (iPSCs) Have Been Derived
iPSCs have been generated from a plethora of somatic cell types for both mice and humans. These include but are not limited to B-lymphocytes [45], neural progenitor cells (NPCs) [46], hepatocytes and gastric epithelial cells [21] from mice as well as adipose-derived stem cells [47], keratinocytes [48], peripheral blood T-cells [49,50], hair follicle cells [51], amniotic fluid cells [52] and astrocytes [53] from humans. In addition, Park et al. [54] have shown that iPSCs could be generated from fetal, neonatal and adult human primary cells, including dermal fibroblasts isolated from healthy subjects.

Gene Targeting and Disease Modeling
Gene editing is the process by which one makes targeted changes to genomic DNA sequences, such as insertions, deletions, point mutations or translocations. This is done by the targeted generation of double strand breaks (DSBs), which triggers the action of various DNA repair pathways, such as homology-directed repair (HDR) or non-homologous end joining (NHEJ). The imprecise NHEJ pathway results in insertion or deletion mutations (INDELs), as the blunt ends are joined together, frequently resulting in frameshift mutations or premature stop codons, and even in knockouts [55,56]. In contrast, if cleavage takes place in the presence of DNA that is partially homologous to the cleaved strand, the exogenous DNA may be incorporated into the genome by HDR [9,57]. This makes such cleavage useful in the introduction of foreign DNA into the eukaryotic genome, where HDR can enable the specific addition of exogenous protein-coding sequences [58].
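The link between NHEJ-induced INDELs, frameshifts and premature stop codons can be illustrated with a minimal sketch. The toy coding sequence, the function names and the deletion positions below are hypothetical and chosen only for illustration; they are not taken from any cited study.

```python
# Toy illustration: how an indel created by imprecise NHEJ repair can
# shift the reading frame and expose a premature stop codon.
# The sequence and all function names are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(indel_length: int) -> bool:
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


def first_stop_codon(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def delete_bases(seq: str, start: int, length: int) -> str:
    """Return seq with `length` bases removed, beginning at 0-based `start`."""
    return seq[:start] + seq[start + length:]


# Toy open reading frame: ATG GCA TTG ACC GGA AAG TAA
WILD_TYPE = "ATGGCATTGACCGGAAAGTAA"

one_bp_deletion = delete_bases(WILD_TYPE, 3, 1)    # frameshift
three_bp_deletion = delete_bases(WILD_TYPE, 3, 3)  # in-frame codon loss

if __name__ == "__main__":
    for label, seq in [("wild type", WILD_TYPE),
                       ("1-bp deletion", one_bp_deletion),
                       ("3-bp deletion", three_bp_deletion)]:
        print(label, seq, "first stop at codon", first_stop_codon(seq))
```

In this toy case the 1-bp deletion moves the first stop codon forward, which is the kind of truncation that can produce a knockout, whereas the 3-bp deletion removes one codon but leaves the frame intact.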
Sequence-specific DNA-binding proteins such as homing nucleases, zinc finger proteins (ZFPs), transcription activator-like effectors (TALEs) and Cas9 of the CRISPR/Cas system have been adapted to introduce DSBs in a sequence-specific manner, to trigger HDR or NHEJ. The drawbacks of homing nucleases [59], ZFNs [60,61] and TALENs [62] (Table 2), along with the ease of re-targeting in the CRISPR/Cas9 system, have led to the widespread adoption of the CRISPR/Cas9 system for gene editing.
Table 2 (CRISPR/Cas9 entry): easy to re-target (cloning and oligo synthesis) [74]; depends on predictable Watson-Crick base-pairing; off-target effects [75].

CRISPR/Cas9 System and Its Potential in Disease Modeling
CRISPR/Cas systems were the first type of the prokaryotic adaptive immune system discovered, and are perhaps the most complex. Scientists first noticed CRISPR repeats in 1987 [76], but their function was not elucidated until 2005 [77]. It was later experimentally shown that the Type II CRISPR system acts as an adaptive immune system, to protect the prokaryotic genome against mobile genetic elements [72].
Of the different CRISPR/Cas systems, eukaryotic genome engineering has primarily utilised the Type II system from Streptococcus pyogenes [12,78]. The Type II system was selected as it is the most compact; the Cas9 protein and RNA constructs are sufficient to cleave target DNA [79], as compared to multiple Cas proteins needed in the Type I and Type III CRISPR/Cas systems.
The Type II CRISPR system requires two pieces of RNA to function, namely crRNA, which base pairs with target DNA, and tracrRNA, which triggers pre-crRNA processing and cleavage of the target by Cas9. These two RNA strands can be engineered into a single chimeric RNA, the single guide RNA (sgRNA), which functions as efficiently as the two endogenous RNAs [74]. This increases the ease of manipulation, as only two components (Cas9 and sgRNA) must be introduced into eukaryotic cells for the CRISPR system to be utilised.
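As a rough illustration of how an sgRNA target might be chosen, the sketch below scans one strand of a DNA sequence for candidate 20-nucleotide protospacers. It relies on a detail not stated in the text above: S. pyogenes Cas9 requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of the target. The function name and the toy sequence are hypothetical, and real guide design also considers the reverse strand, off-target sites and sequence composition.

```python
# Minimal sketch: list candidate SpCas9 target sites on one strand.
# Assumption (not from the text above): SpCas9 needs an NGG PAM directly
# 3' of a 20-nt protospacer. Names and the toy sequence are hypothetical.
import re


def find_protospacers(seq: str, guide_len: int = 20):
    """Return (start, protospacer, pam) for each NGG PAM with room upstream."""
    seq = seq.upper()
    hits = []
    # A zero-width lookahead lets overlapping PAMs all be reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - guide_len
        if start >= 0:  # skip PAMs too close to the 5' end of the sequence
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits


if __name__ == "__main__":
    toy = "A" * 20 + "TGG"
    for start, protospacer, pam in find_protospacers(toy):
        print(start, protospacer, pam)
```

The protospacer found this way corresponds to the targeting portion of the sgRNA; the PAM itself is part of the genomic target, not of the guide.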
However, a key problem in utilization of the CRISPR/Cas system is the high frequency of off-target effects [75]. Various strategies have been used to overcome this, such as increasing the length of the sequence recognized by incorporating cooperativity into the functioning of the system. This has been done by fusing deactivated Cas9 proteins (dCas9) to FokI nucleases in a manner similar to that seen in the construction of ZFNs and TALENs. Here, two different sgRNAs binding at adjacent locations in the genome are required for cleavage to take place, reducing off-target effects, as it is unlikely that both sgRNAs will bind in close proximity outside the intended target [69]. Alternatively, Cas9 nickases can be made, where one of the two Cas9 nuclease domains is inactivated. The D10A mutation inactivates the RuvC-like domain, while the H840A mutation inactivates the HNH domain [80]. When two different sgRNAs target slightly offset regions on opposite strands, the paired nickases create two staggered nicks that together form a DSB [81].
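One simple way to picture off-target risk is to count how many positions in a sequence resemble the guide closely. The sketch below is a deliberately simplified assumption: it scans a single strand and counts mismatches only, whereas real off-target prediction also accounts for the PAM, the position of each mismatch, bulges and both strands. All names and sequences are hypothetical.

```python
# Simplified sketch of an off-target scan: report every window of the
# sequence that differs from the guide by at most `max_mismatches` bases.
# Hypothetical helper names; not a substitute for a real prediction tool.


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def near_matches(guide: str, genome: str, max_mismatches: int = 3):
    """Return (position, mismatches) for each window close to the guide."""
    n = len(guide)
    hits = []
    for i in range(len(genome) - n + 1):
        mismatches = hamming(guide, genome[i:i + n])
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


if __name__ == "__main__":
    guide = "GATTACAGCT"
    # Toy "genome": one perfect site and one site with a single mismatch.
    genome = "CCC" + guide + "CCC" + "GATTTCAGCT" + "CCC"
    print(near_matches(guide, genome, max_mismatches=1))
```

Requiring two independent sgRNAs to bind near each other, as in the dCas9-FokI and paired-nickase strategies, makes it much less likely that a near match for one guide alone leads to cleavage.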

Gene Editing in Disease Modeling
Such gene editing technologies enable the editing of a genome in a specific and targeted way. This could enable the generation of mutations in specific genomic locations, to create disease models that enable further study of disease pathology, or to investigate the effects of specific loci or mutations on the disease phenotype [13].
Specific corrections could be made in iPSCs of diseased individuals, to elucidate the effect of the specific mutant allele and to act as a proof-of-principle for gene therapy. In contrast, disease-associated mutations can be introduced into a healthy cell background, to elucidate the effect of those mutations [13].
ZFNs have been utilized for the creation of disease models. For example, Meyer et al. [82] used ZFNs to introduce missense and silent mutations into the Rab38 gene, which encodes a small GTPase that regulates intracellular vesicle trafficking. The introduction of these ZFNs in one-cell mouse embryos was used to generate disease-related mutants containing single nucleotide or codon replacements.
The CRISPR/Cas system has also been utilized in understanding disease in iPSC disease models. Soldner et al. [83] modeled a familial form of Parkinson's disease (PD) by the generation of iPSCs from individuals carrying the A53T mutation in α-synuclein (SNCA) and corrected the mutation by gene editing. In addition, gene editing was used to introduce the A53T or E46K mutation in SNCA in wild-type hiPSCs, thus enabling the creation of iPSCs that were bi-allelic for the mutation. This enabled the study of mutant α-synuclein in the absence of wild-type α-synuclein, which could shed light on PD pathogenesis [83].
In addition, in order to study the role of the N996I KCNH2 mutation in long-QT syndrome (LQTS), Bellin et al. [84] created iPSCs from individuals carrying the mutation and corrected the mutation via gene editing, while separately introducing the point mutation into hESCs. The iPSCs were differentiated into cardiomyocytes, with parameters such as the current conducted and AP duration studied in order to illustrate that the N996I KCNH2 mutation was the main cause of the LQTS phenotype.

Advantages of iPSCs for Disease Modeling
The ability of iPSCs to be renewed indefinitely enables the generation of a large number of cells, aiding further study and large-scale screening, while the ability to generate patient-specific iPSCs enables the study of the genetic causes of diseases, as well as the generation of models of complex diseases for which the genetic causes may not have been fully elucidated. In addition, the simultaneous study of different patient-specific iPSC lines can aid in our understanding of how various disease-associated loci may interact to cause a phenotype. This could enhance disease diagnosis, and make drug screening personalized and more reliable [85].
In fact, iPSCs have been utilized as a diagnostic tool in a patient with long QT syndrome caused by a novel mutation [86]. Here, iPSCs were generated from the patient and differentiated into cardiomyocytes before being characterized by electrophysiological analysis and the introduction of specific drugs known to affect QT length. This illustrates that disease-specific iPSCs are able to recapitulate disease phenotypes, and could aid in both diagnosis and drug testing.
Disease modeling via iPSCs provides many advantages over the use of animal disease models. For example, human disease models will more accurately represent the physiology of human cells and systems as compared to animal models. As such, disease modeling via iPSCs will reflect both drug efficacy and toxicity more accurately. Currently, many drugs are initially tested in animals, such as mice, but this process may yield both false positives and false negatives. False positives include drugs that appear to alleviate the disease phenotype in mice, but that do not benefit humans. This was seen for creatine in the treatment of amyotrophic lateral sclerosis (ALS), where it prolonged lifespan and maintained motor neuron function in mouse models, but failed to yield any apparent benefit in human clinical trials [87]. In addition, drug toxicity differs between animal species, such that animal models may not be appropriate for testing drug toxicity [88]. As most clinical trials are expensive and time-consuming, pre-testing in more accurate iPSC disease models may be able to reduce cost and time [89].

Requirements of iPSCs in Disease Modeling
Before one is able to use generated iPSCs for disease modeling, the iPSCs should be screened for any mutations that could have occurred due to the reprogramming event [90]. The DNA methylation patterns of these iPSCs should also be examined to avoid any incomplete demethylation of crucial genes [91]. Also, there should be no change in the allelic copy number variation [92] or any abnormality of X chromosome inactivation [93]. The presence of any of these abnormalities can lead to changes in the phenotype of the iPSCs and of the cells generated from them.
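The screening criteria above can be read as a checklist applied to each iPSC line. The sketch below is one hypothetical way to record it; the class, field names and pass/fail logic are illustrative assumptions rather than an established protocol, and each flag would in practice come from a separate laboratory assay.

```python
# Hypothetical quality-control record for one iPSC line, mirroring the
# four checks listed above. Field and class names are illustrative only.
from dataclasses import dataclass
from typing import List


@dataclass
class IpscLineQC:
    line_id: str
    reprogramming_mutations_found: bool   # mutations arising during reprogramming
    incomplete_demethylation_found: bool  # crucial genes not fully demethylated
    copy_number_changed: bool             # change in allelic copy number variation
    x_inactivation_abnormal: bool         # abnormal X chromosome inactivation

    def failed_checks(self) -> List[str]:
        """Names of the checks that flagged an abnormality, in listing order."""
        checks = {
            "reprogramming-associated mutations": self.reprogramming_mutations_found,
            "incomplete demethylation": self.incomplete_demethylation_found,
            "copy number variation change": self.copy_number_changed,
            "abnormal X chromosome inactivation": self.x_inactivation_abnormal,
        }
        return [name for name, failed in checks.items() if failed]

    def suitable_for_modeling(self) -> bool:
        """A line passes only if none of the checks flagged an abnormality."""
        return not self.failed_checks()


if __name__ == "__main__":
    line = IpscLineQC("line-B", False, True, False, True)
    print(line.line_id, line.suitable_for_modeling(), line.failed_checks())
```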
It is important to keep in mind that iPSCs carry epigenetic memory from their somatic cells of origin, which plays a role in the tendency of iPSCs to differentiate into specific cell types. Thus, the somatic cell source must be compatible with the cell type of the desired disease model. For example, Moad et al. [94] showed that iPSCs generated from prostate and urinary tract cells had better efficiency of differentiation into prostate and urinary tract cells as compared to iPSCs derived from skin fibroblasts. This illustrates the epigenetic differences between cell types, and illustrates that the origin of cell type plays a key role in efficiency of targeted differentiation.

Examples of Disease Models
iPSC-based disease models have been created for several diseases affecting different systems in the human body, including both Mendelian and complex diseases. This illustrates that iPSC-based disease models can be generated even if the exact genetic cause of the disease is not well understood (Figure 2). In addition, directed differentiation processes must exist, in order to differentiate the iPSCs into the cell type of interest [89].

Neurological Disease Models
Some of the first diseases to be modeled using iPSCs were neurological diseases, as neurons have been sufficiently well-studied such that good differentiation protocols exist, and as primary human neurons are extremely inaccessible, making an iPSC-based method desirable [13]. In addition, in neurodegenerative diseases such as Parkinson's disease (PD), clinical manifestation usually indicates that certain neurons, in this case, dopaminergic neurons of substantia nigra, have already been lost, thus making the disease difficult to study by the isolation of primary cells of interest [124].
Disease models of Parkinson's disease have been created by generating iPSCs from patients carrying the G2019S mutation in the Leucine-Rich Repeat Kinase-2 (LRRK2) gene, the most common mutation in Parkinson's disease. These iPSCs were differentiated into dopaminergic (DA) neurons, which accurately recapitulated the phenotype of Parkinson's disease [95].
In 2013, Lu et al. [102] successfully modeled the neurogenesis impairment in Down Syndrome by generating iPSCs from Trisomy 21 amniotic fluid cells. These generated iPSCs maintained three copies of chromosome 21. The level of amyloid precursor protein was significantly increased in the neuronal progenitor cells derived from the generated iPSCs, shedding light on the mechanism by which Trisomy 21 results in neurological defects.

Metabolic Disease Modeling
In 2010, Rashid et al. [109] demonstrated that metabolic diseases of the liver could be modeled using iPSC-based disease modeling. Dermal fibroblasts from patients with various inherited metabolic diseases of the liver were used to generate iPSCs that were subsequently differentiated into hepatocytes for modeling. The diseases that could be modeled using this method were familial hypercholesterolemia (FH), glycogen storage disease type Ia (GSDIa) and Alpha 1-antitrypsin (AAT) deficiency.
Other metabolic diseases of the liver such as tyrosinemia, glycogen storage disease, progressive familial hereditary cholestasis, and Crigler-Najjar syndrome have been successfully modeled [125].

Modeling Drug Metabolism
The key aspects of drug toxicity are cardiotoxicity and hepatotoxicity, and the differentiation of iPSCs into cardiac cells and hepatocytes is thus useful for testing new drugs. This can be done for iPSCs from healthy individuals, simply to study the effect of genetic variation on drug metabolism [89], or for individuals with inherited metabolic disorders, in order to study their responses to drugs. This was done for individuals with alpha-1 antitrypsin (AAT) deficiency, and it enabled the identification of five drugs that could alleviate the disease phenotype [126]. In addition, the ability to generate a plethora of cell types from iPSCs enables the evaluation of drug toxicity against many cell types, giving hope to a future of personalized medicine [127].

Cardiovascular Disease Models
Moretti et al. [112] have successfully generated iPSCs from patients suffering long QT 1 syndrome, caused by a missense mutation in the KCNQ1 gene. The iPSCs generated were differentiated to cardiomyocytes that recapitulated the phenotype of the disease, including increased depolarization of cardiomyocytes.
In another study, Itzhaki et al. [113] generated iPSCs from patients suffering from long QT 2 syndrome, and these carried the monogenic A614V missense mutation in the KCNH2 gene. The differentiated cardiomyocytes recapitulated the disease phenotype and showed an increase in their depolarization.

Telomere Disease Models
In 2011, Batista et al. [103] generated iPSCs from dyskeratosis congenita patients that demonstrated telomere shortening and loss of telomere self-renewal. The iPSCs were found to harbour the precise biochemical defects which are characteristic of the disease. This illustrates that iPSCs themselves can be used to study mechanisms of human stem cell diseases.

Other Disease Models
In addition, various diseases of different tissues have been successfully modeled. These include hematological diseases like various myeloproliferative disorders [128], chronic granulomatous disease [106], chronic myelogenous leukemia [107] and Fanconi anemia [108]. Mitochondrial diseases like Friedreich's ataxia [104] and diabetes caused by the mtDNA A3243G mutation [105] have also been modeled. Retinitis pigmentosa, the leading cause of blindness in industrial countries [120], systemic lupus erythematosus [121] and skeletal dysplasia [122] have also been modeled. The plethora of examples illustrate that iPSCs enable the modeling of many different diseases that affect various different tissues.

Modeling Epigenetic Diseases
Angelman and Prader-Willi syndromes were also successfully modeled. The generated iPSCs had the DNA methylation patterns characteristic of each disease, and the differentiated neurons recapitulated the phenotypes of both diseases [110,111]. This is particularly interesting as Angelman and Prader-Willi syndromes are caused by inappropriate imprinting, and the successful modeling of these diseases indicates that epigenetic diseases can similarly be modeled by iPSCs as long as genomic imprinting marks are not disturbed [111].

Modeling Infectious Diseases
Disease modeling using iPSCs is not restricted to diseases with genetic causes. Patient-specific iPSC-derived cells can also be utilized as platforms to analyze host-pathogen interactions, where the cells are infected with a virus and studied. iPSC-derived cells are more accurate representatives of human physiology as compared to animal models [129]. iPSC-based disease modeling is particularly useful for studying viruses that are highly species-specific, or that can only grow in a restricted number of cell types, specifically those that may be hard to isolate and culture [129]. For example, herpes simplex virus (HSV) and varicella zoster virus (VZV) have tropism for neural cells and establish latency in sensory neurons, and these cell types can be generated from iPSCs to enable further study of these viruses. In addition, patient-specific iPSCs can be used to further understand the genetic bases of viral infections, as has been done for HSV encephalitis [130] and severe influenza [131]. Genome editing has also been utilized in iPSCs in a bid to confer antiviral resistance, as has been done for HIV, by generating anti-HIV iPSCs that contained CCR5 shRNA in combination with a chimeric human/rhesus TRIM5α molecule and that could generate HIV-resistant macrophages [132].

Modeling Cancer
Cancer cell lines from human tumours have been immortalized and have been used to study cancer, but prolonged culture could result in these cell lines being less representative of primary tumours [133]. In addition, many of these cell lines represent a mature cancer stage, and are not helpful in modeling early cancer stages. As such, iPSCs have been generated from a number of cancer cell lines or cancer cells from biopsies, in the hope that these will enable the development of an in vitro model for carcinogenesis and recapitulation of cancer development. For example, the lack of a human cell model of the early stages of pancreatic ductal adenocarcinoma (PDAC) limited the study of PDAC development, but the generation of iPSCs from PDAC lines and injection of these iPSCs into mice enabled the study of PDAC progression [134].
In addition, iPSC lines have been generated from a family with Li-Fraumeni syndrome, a familial cancer syndrome linked to mutations in the TP53 tumor suppressor gene. These iPSCs were used to study the role of mutant p53 in the development of osteosarcomas in these patients, demonstrating how iPSCs can be utilized to study inherited cancer syndromes [123].
It is widely believed that pluripotency and oncogenic transformation are related processes [135,136], due to the presence of common characteristics, such as self-renewal, altered metabolism and expression of certain markers and stem cell genes [137]. iPSCs might thus be useful in understanding certain properties of cancer cells.

Limitations of iPSCs in Disease Modeling
Despite their application to various diseases, the use of iPSCs continues to be limited by factors such as epigenetic memory, the absence of efficient differentiation protocols and non-genetic variability between cells. In addition, it is still difficult to study epigenetic diseases, diseases that involve more than one cell type, diseases that are affected by the environment and adult-onset diseases.
Although iPSCs share many similar characteristics with ESCs, such as the expression of stem cell markers such as SSEA-1 and alkaline phosphatase, the in vitro and in vivo differentiation into the three germ layers and the ability to form teratomas [6], recent studies have demonstrated that the DNA methylation pattern of iPSCs resembles that of their cells of origin, thus pushing iPSC differentiation towards the lineage of the cells of origin, as previously mentioned. This phenomenon has been termed epigenetic memory [138]. This affects the cell types that can be effectively generated, thus the cell type of origin and desired cell type must be carefully considered when planning such experiments.
The use of iPSCs is currently limited to cell types for which there are dependable and efficient differentiation protocols, and this limits the cell types that can be modeled. Protocols exist for well-studied cell types, such as neural cells, but may not be available for other cell types, such that the efficiency of iPSC differentiation into a specific cell lineage may not be sufficient to model certain diseases [139]. In addition, the lack of efficient and robust differentiation protocols may result in a heterogeneous mixture of cell types, which may not be useful if the aim is to model a disease in a single cell type. This might be overcome by placing reporter genes under cell-specific promoters, in order to select for cells of the type of interest [140].
Individual iPSC lines may exhibit highly variable properties independently of genetic background. This could be due to cellular changes resulting from reprogramming, particularly if transgene-based reprogramming is utilized, as transgenes may integrate into the genome and disrupt endogenous gene expression [141]. This could be overcome by utilising transgene-free reprogramming, and by adopting more stringent quality controls.

Diseases That Continue to Be Difficult to Model
Not all diseases are amenable to modeling using iPSCs. For example, diseases that are caused by changes in the epigenetic status of cells may not be effectively modeled by iPSCs, as the epigenetic status of cells may change during cellular reprogramming [142]. A notable exception, mentioned above, is that Angelman and Prader-Willi syndromes can be modeled using iPSC-based disease modeling [111].
Modeling of diseases is made more complex by the fact that many diseases involve complex interactions between multiple cell types, such that studying a single cell type or a limited number of cell types independently will not be able to explain the pathology of the disease. For example, the pathogenesis of liver disease is affected by the complex interactions between various types of parenchymal and non-parenchymal cells, and studying a single cell type alone would be insufficient in understanding the disease [127]. Similarly, iPSC-based disease modeling cannot model the three-dimensional (3D) niche in which cells usually find themselves.
However, this may be overcome by recent developments in 3D culture, organoid-growing and tissue engineering techniques, as has been illustrated for a variety of organs. For example, iPSC-derived hepatic endoderm cells cultured with endothelial and mesenchymal cells form liver buds, and form functional human liver tissue upon transplantation into mice [143]. Similarly, iPSCs directed to undergo neural differentiation and cultured in a 3D system are able to form cerebral organoids. These organoids develop discrete brain regions, and can subsequently be used to model microcephaly [144]. 3D culture iPSC models of Alzheimer's disease (AD) have also been generated, which appear to resemble actual AD pathology more closely than conventional 2D cultures [145]. In addition, human pluripotent stem cells (hPSCs) have been differentiated to generate 3D optic cups that could be used to model retinal diseases [146], and kidney-like structures that could be used to model renal diseases [147].
It is also difficult to model diseases that are affected by environmental factors. For example, alcohol-mediated liver damage is largely influenced by alcohol consumption, and would be difficult to model, especially if the cells were sourced from non-liver tissues [127].
Adult-onset diseases are similarly difficult to model. iPSCs are characterized by the fact that they resemble ESCs, which are in a fetal-like stage of development. As such, they might not exhibit the phenotype of adult-onset diseases, such as Parkinson's or Alzheimer's disease. This could be overcome by artificially "ageing" cells, which has been done in iPSC-derived neurons by exposing cells to oxidative stress [95], by expressing progerin [148] or by excessively stimulating neurons with high concentrations of glutamate [149], which all induce age-dependent disease phenotypes (Table 3).

Table 3. Challenges in utilising iPSCs for disease modeling and their potential solutions.

| Challenges | Potential Solutions |
|---|---|
| Epigenetic memory [138] | Test various cell types, and use late passage iPSCs |
| Lack of dependable and efficient differentiation protocols, which may result in the generation of a mixture of cell types [139] | Further research into and development of protocols; utilise reporter genes to select for the cell type of interest [140] |
| Variable properties independent of genetic background, especially for transgene-based reprogramming | Utilise transgene-free reprogramming; utilise more stringent quality controls |
| Modeling of diseases involving the complex interactions between multiple cell types, in a three-dimensional niche [127] | Advances in 3D culture techniques, organoid-growing techniques and tissue engineering strategies [143,144,146,147] |
| Modeling of diseases affected by environmental factors | Use cells cultured in bio-engineered niches and co-culture with primary cells in order to mimic in vivo tissue development |
| Modeling of adult-onset diseases | For neurons, exposing cells to oxidative stress [95], expressing progerin [148], or excessively stimulating neurons with high concentrations of glutamate [149] |

Concluding Remarks
iPSC-based disease modeling has been utilized for many diseases, supplementing more traditional animal disease models. The current techniques in cellular reprogramming have provided a suitable platform for modeling diseases and recapitulating their phenotypes. Transgene-free methods are especially useful, as these reduce the risk of non-specific alterations in the gene expression profile of the cells. Another breakthrough in cellular reprogramming has been the generation of iPSCs from blood and other samples. This breakthrough made obtaining patients' cells easy and non-invasive. Samples from infants, children and elderly people can now be easily collected with little risk. The easy isolation of patient-specific iPSCs enables population-wide modeling of diseases, eventually contributing to personalized medicine.
Gene editing via the use of ZFNs, TALENs and particularly CRISPR enables us to study and model diseases with higher efficacy. Scientists can now edit the genomes of iPSCs derived from healthy or diseased individuals, to better understand the genetic causes of diseases (Figure 3).
However, there remain limitations to the use of iPSCs in disease modeling, which must be overcome. We continue to struggle to model complex diseases that are affected by the environment, or that involve the interaction between multiple cell types.
