Inducible MLL-AF9 Expression Drives an AML Program during Human Pluripotent Stem Cell-Derived Hematopoietic Differentiation

A t(9;11)(p22;q23) translocation produces the MLL-AF9 fusion protein, which is found in up to 25% of de novo AML cases in children. Despite major advances, obtaining a comprehensive understanding of context-dependent MLL-AF9-mediated gene programs during early hematopoiesis is challenging. Here, we generated a human induced pluripotent stem cell (hiPSC) model with doxycycline dose-dependent MLL-AF9 expression. We exploited MLL-AF9 expression as an oncogenic hit to uncover epigenetic and transcriptomic effects on iPSC-derived hematopoietic development and the transformation into (pre-)leukemic states. In doing so, we observed a disruption in early myelomonocytic development. Accordingly, we identified gene profiles that were consistent with primary MLL-AF9 AML and uncovered high-confidence MLL-AF9-associated core genes that are faithfully represented in primary MLL-AF9 AML, including known and presently unknown factors. Using single-cell RNA-sequencing, we identified an increase of CD34-expressing early hematopoietic progenitor-like cell states as well as granulocyte-monocyte progenitor-like cells upon MLL-AF9 activation. Our system allows for careful, chemically controlled, and stepwise in vitro hiPSC-derived differentiation under serum-free and feeder-free conditions. For a disease that currently lacks effective precision medicine, our system provides a novel entry point for exploring potential targets for personalized therapeutic strategies.


Introduction
The majority of acute myeloid leukemia (AML) cases are associated with non-random chromosomal translocations that can result in gene rearrangements [1]. Many of these rearrangements encode abnormal transcriptional activators that can alter the expression of genes necessary for myeloid development, cell proliferation, and/or survival [2]. Consequently, a cell state that is susceptible to leukemic transformation can be established [3]. The potential targeting of these fusion transcripts has become a major focus for the development of novel therapeutics.
Age-related cells of origin and/or additional genetic lesions can lead to changes in the pathology and clinical outcome [4][5][6]. For instance, the MLL-AF9 (KMT2A-MLLT3) fusion, associated with up to 5% of adult AML and in 25% of de novo AML in children [4], initiates transformation in rapidly cycling myeloid progenitors [7]. Moreover, it has been shown that neonatal cells are inherently more susceptible to MLL-AF9-mediated immortalization than adult cells, and co-existing mutations may contribute to the aggressiveness of the disease [5,6].

Myelomonocytic Differentiation
For differentiating iPSCs to HPCs expressing CD34, CD45, and CD43, a STEMdiff™ Hematopoietic Kit (STEMCELL™ Technologies) was used. Until day 0, the cells were cultured at 37 °C in mTeSR™ Plus (STEMCELL™ Technologies) supplemented with 1% Penicillin/Streptomycin in 6-well plates coated with Vitronectin XF™ (STEMCELL™ Technologies). On day 6 and routinely every other day afterward (day 8, 10, 12, etc.), doxycycline (16 ng/mL) was supplemented to induce MLL-AF9 expression. We chose day 6 to start doxycycline treatment because, at this stage, HPCs are most likely not yet (partially) differentiated. We estimated that the first HPCs would be formed in the hematopoietic clusters around day 8. Therefore, day 6 provided the largest window of opportunity to study HPC transformation into (pre-)leukemic states. As a consequence, the MLL-AF9 fusion could target downstream genes that are important during early hematopoiesis. After 12 days, the cells residing in the supernatant were transferred to 6-well plates and the medium was replaced by Stemline® II Hematopoietic Stem Cell Expansion Medium (Sigma-Aldrich, Saint Louis, MO, USA) supplemented with 1% Penicillin/Streptomycin, 1:100 insulin-transferrin-selenium-ethanolamine (ITS-X) (Thermo Fisher Scientific, Waltham, MA, USA) and cytokines (50 ng/mL IL-3, 50 ng/mL FLT3-L, 50 ng/mL SCF, 50 ng/mL M-CSF, 10 ng/mL TPO) (Miltenyi Biotec B.V., Leiden, The Netherlands) to induce monocyte differentiation. The medium was refreshed every 2-3 days, and MLL-AF9-expressing cells were kept continuously in doxycycline until further analysis or exhaustion.
Cells 2023, 12, 1195

RNA Extraction and Real-Time PCR
RNA was isolated from 1E5 cells using Quick-RNA™ Microprep (Zymo Research, Breisgau, Germany) and reverse transcribed using an iScript™ cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). Real-time amplification was performed using iQ SYBR Green mix (Bio-Rad) on a CFX96™ Real-Time System (Bio-Rad) and quantified using Bio-Rad CFX Manager. To calculate the relative expression levels, Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was used. We used the following primers:

Western Blot
The iPSCs were cultured in E8 media (Life Technologies, Carlsbad, CA, USA) induced with 50 ng/mL doxycycline for 24 h. As a control, we used untreated iPSCs.

Flow Cytometric Analysis
On day 15, cells were stained with a monocyte (CD14, CD64) antibody cocktail and kept in the dark for 30 min at room temperature. Samples were then washed 3 times with PBS + 1% BSA. After staining, cells were analyzed on a Beckman Coulter Gallios 10-color (Beckman Coulter, Brea, CA, USA) with the Kaluza G software (2014).

Publicly Available Datasets
Publicly available primary AML RNA-seq data were used from the BeatAML adult cohort and TARGET AML pediatric cohort [12]. The TARGET AML data were generated by Additionally, publicly available ChIP-seq occupancy changes of MLL-AF9 (HA) in MLL-AF9-HA-FKBP12-transformed human cells (HCB1) were obtained from the Gene Expression Omnibus (GSE173599) [13] and visualized in a UCSC browser session (https://genome.ucsc.edu/s/bheuts/hg38_Armstrong_MLLAF9_degradation, accessed on 27 January 2021).

RNA Sequencing
Cells were collected at various timepoints between day 20 and day 28. RNA was isolated from 1E6 cells using Quick-RNA™ Microprep (Zymo Research, Breisgau, Germany) with on-column DNaseI treatment. Library generation was performed on 100 ng RNA using the KAPA RNA HyperPrep Kit with RiboErase (HMR) (Kapa Biosystems, Potters Bar, UK), with RNA fragmentation to approximately 300 bp for 6 min at 94 °C. Library size distribution was measured using High Sensitivity DNA analysis (Agilent, Santa Clara, CA, USA) on an Agilent 2100 Bioanalyzer and its corresponding software (version B.02.08.SI648). Libraries with average sizes between approximately 300 and 400 bp were used for sequencing via a NextSeq 500 system (Illumina, San Diego, CA, USA).

Assay for Transposase-Accessible Chromatin Using Sequencing
Cells were harvested on day 23 or day 24 from two independent differentiation experiments. Suspension cells (50,000 per sample) were washed twice in ice cold PBS. The pellet was resuspended in 1:1 ice cold PBS and 2× lysis buffer (1 M Tris/HCl pH 7.5, 5 M NaCl, 0.5 M MgCl2, 10% NP40). The cells were then centrifuged (300× g) for 30 min at 4 °C, and the pellet was resuspended with 24 µL clean-up buffer (5 M NaCl, 0.5 M EDTA, 10 mg/mL Proteinase K, 10% SDS) whilst being kept on ice. Then, 1 µL Tn5 was added to each sample for tagmentation. The nuclei were heated for 6 min at 37 °C with 650 rpm agitation. Immediately after incubation, 9 µL clean-up buffer was added to each sample and incubated for 30 min at 40 °C with 650 rpm agitation. The samples were purified with a normal phase 2× SPRI purification and amplified by PCR using Nextera primers (Illumina, San Diego, CA, USA) and KAPA HiFi HotStart ReadyMix (Roche). Tagmented DNA was amplified using a five-cycle PCR protocol: 5 min at 72 °C; 45 s at 98 °C; 5× cycles: {15 s at 98 °C; 30 s at 63 °C; 30 s at 72 °C}; 60 s at 72 °C; hold at 12 °C. A reverse phase 0.65× SPRI bead purification step was performed, followed by a 1.5× SPRI bead clean-up. Next, a second PCR amplification program was used to further amplify the tagmented DNA. The program was identical to the first PCR program, including the Nextera primers, except for the number of cycles, which was determined by qPCR. Real-time amplification was performed using iQ SYBR Green mix (Bio-Rad, Hercules, CA, USA) on a CFX96™ Real-Time System (Bio-Rad) and quantified using Bio-Rad CFX Manager. The following program was used: 45 s at 98 °C; 40× cycles: {15 s at 98 °C; 30 s at 63 °C; 30 s at 72 °C}; Melt curve: {0.05 at 65 °C; 0.5 at 95 °C}. The Ct value + 2 was used as the number of PCR cycles for the second PCR amplification step. Subsequently, two successive 1.5× SPRI clean-ups were performed to generate the ATAC libraries.
The DNA was stored at −20 °C, and an aliquot was used to measure the DNA concentration with a DeNovix dsDNA HS Assay Kit (Life Technologies, Carlsbad, CA, USA) on a DeNovix Spectrophotometer/Fluorometer (DS-11) (DeNovix, Wilmington, DE, USA). Library size distribution was measured using High Sensitivity DNA analysis (Agilent, Santa Clara, CA, USA) on an Agilent 2100 Bioanalyzer and its corresponding software. DNA fragment distribution followed a mononucleosomal-like pattern at approximately 200 bp and 320 bp. The libraries were sequenced using a NextSeq 500 system (Illumina, San Diego, CA, USA).
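As a minimal sketch of the cycle-number rule described above (Ct value + 2), the following Python helper computes the number of cycles for the second amplification. The function name and the half-up rounding of a fractional Ct to a whole cycle are assumptions made for illustration; the protocol only states that Ct + 2 was used.

```python
import math


def second_pcr_cycles(ct_value: float, extra_cycles: int = 2) -> int:
    """Number of cycles for the second PCR: qPCR Ct plus two cycles.

    Rounding the Ct half-up to a whole cycle is an assumption; the
    protocol text does not say how fractional Ct values were handled.
    """
    if ct_value <= 0:
        raise ValueError("Ct value must be positive")
    whole_ct = math.floor(ct_value + 0.5)  # half-up rounding (assumed)
    return whole_ct + extra_cycles
```

With this rule, a library whose qPCR curve crosses the threshold at cycle 9.4 would receive 11 cycles in the second amplification.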

ANalysis Algorithm for Networks Specified by Enhancers (ANANSE)
MLL-AF9-specific transcription factors were predicted using ANANSE v0.4.0 [18]. TF binding profiles were predicted using narrow peaks and sequence alignment data (BAM) from ATAC-seq, motif scores, and average ReMap ChIP-seq coverage, as described previously by Xu et al. [18]. Subsequently, gene regulatory networks (GRNs) were determined using our normalized (TPM) RNA-seq results. For the TF prediction, we used human genome assembly hg38. To obtain key TFs, we only show TFs that were predicted to be important in two independent analyses.
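For readers unfamiliar with the TPM normalization mentioned above, the standard definition can be sketched in a few lines of Python. This is an illustration of the general formula only; the function name, the input dictionaries, and the toy gene names are hypothetical, and the actual pipeline used to produce the TPM values is not described in this section.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts:     dict of gene -> read count
    lengths_bp: dict of gene -> transcript length in base pairs
    """
    # Step 1: length-normalize each gene (reads per kilobase).
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Step 2: scale so that all values in the sample sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Toy example with invented genes and numbers.
example = counts_to_tpm({"GENE_A": 100, "GENE_B": 200},
                        {"GENE_A": 1000, "GENE_B": 2000})
```

Because both toy genes have the same read density per kilobase, they receive equal TPM values even though their raw counts differ.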

Single-Cell RNA Sequencing
On day 23, cells from each condition were sorted into separate 384-well plates with a Becton Dickinson Aria Flow cytometry sorter (Becton, Dickinson and Company, Franklin Lakes, NJ, USA), using a 100 µm nozzle. After this, plates were frozen at −80 °C and thawed before further processing. Sequencing library generation was performed according to the RAID-seq protocols [23] with adaptations; libraries were generated without immunostaining and Barcode Compensation Primers as we only sequenced mRNA transcripts. DNA fragment sizes were determined using High Sensitivity DNA analysis (Agilent, Santa Clara, CA, USA) on an Agilent 2100 Bioanalyzer. Libraries with average sizes between approximately 300 and 400 bp were used for sequencing via a NextSeq 500 system (Illumina, San Diego, CA, USA).
The sequenced libraries were then processed into spliced and unspliced count matrices using Seq2science (https://github.com/vanheeringen-lab/seq2science, accessed on 27 January 2021) (CELSeq2 workflow; RNA velocity/Kallisto). Then, quality control metrics were determined using a custom scRNA-seq pre-processing workflow (https://github.com/Rebecza/scRNA-seq, accessed on 27 January 2021), and downstream analyses, such as cell cycle scoring and differential expression analysis, were computed using the scRNA-seq integration workflow with the SCTransform normalization method of the Seurat package in R [24]. The number of dimensions was determined using an elbow plot; 13 dimensions were chosen (Supplementary Figure S1A). We used the default clustering resolution (Supplementary Figure S1B).
Identifying cross-dataset matched biological states (anchors) can be challenging. For the pre-subset dataset, we observed generally lower anchor scores for cluster 0 compared to other clusters, implying subpar integration for these cells (Supplementary Figure S1C). Biological interpretation of cluster 0 may therefore be compromised. In contrast, the other anchors successfully recovered matching cell states, even with MLL-AF9-induced dataset differences. We therefore removed cells from cluster 0 and performed the quality assessment and downstream analyses with the post-subset dataset. We set the dimensions to 14 (Supplementary Figure S1D) and kept the clustering resolution at default parameters. We applied a zero-preserving imputation strategy using ALRA [25].

Generation of an Inducible MLL-AF9 Human Pluripotent Stem Cell Model
To investigate the molecular mechanisms mediated by MLL-AF9 expression during human hematopoiesis, we established a doxycycline-inducible human MLL-AF9 iPSC model. An MLL-AF9-inducible iPSC clone derived from human megakaryoblasts was generated similarly to previously described approaches [10,11]. A dose-dependent expression of MLL-AF9 in iPSCs was determined via qPCR to match the physiological MLL-AF9 expression levels in the MLL-AF9-positive human leukemia monocytic cell line THP-1 (Figure 1A). MLL-AF9 expression reached physiological levels of THP-1 between 14 and 50 ng/mL doxycycline. MLL-AF9 protein expression after doxycycline induction was confirmed using Western blotting (Figure 1B). For this, we used an antibody that recognizes the N-terminal region of MLL, resulting in the detection of endogenous wild-type MLL (variants) and the MLL-AF9 fusion protein. These results demonstrated that doxycycline-induced promoter activation causes MLL-AF9 expression on the transcriptome and proteome level.
Next, to evaluate whether induced MLL-AF9 expression deregulates differentiation along the early progenitor to the myeloid axis, we induced iPSC-derived hematopoiesis (Figure 1C). First, iPSCs were differentiated into a mesoderm-like state, a prerequisite for the specification of hemogenic endothelial cells [8]. Subsequently, early hematopoietic progenitor cells (HPCs) were generated and transferred to initiate myelomonocytic differentiation using a monocytic cytokine cocktail. Depending on the doxycycline treatment, we either generated iMonocytes or iMLL-AF9 cells.
We did not perform any subsequent sorting strategy to isolate the iMonocytes from the myeloid population. To determine the cell expansion capacity of iMonocytes and iMLL-AF9 cells, we calculated the total cumulative cell count (per mL) for each condition (Figure 1D). Up to day 22 of the myelomonocytic differentiation, cell expansion was comparable for each condition. Thereafter, iMLL-AF9 cell expansion increased and continued until day 66 before exhaustion occurred, whereas iMonocytes were exhausted after day 30. These results indicated that the induction of MLL-AF9 deregulates normal hiPSC-derived hematopoiesis. To identify immunophenotypic alterations upon MLL-AF9 activation, we measured the cell-surface marker expression for monocytes (CD14 and CD64) using flow cytometry (Figure 1E). At day 15 of the myelomonocytic differentiation, flow cytometry analysis revealed a CD14- and CD64-enriched cell population for iMonocytes, whereas iMLL-AF9 cells were almost entirely devoid of these monocytic markers. In addition, to confirm persistent HPC gene marker expression upon MLL-AF9 induction, we compared gene expression levels of key HPC gene markers (CD34 and SPN) between iMonocytes and iMLL-AF9 cells after day 20 of the differentiation protocol (Figure 1F and Supplementary Figure S1E,F). A significantly higher expression of CD34 and SPN implies that HPC gene programs persisted as a consequence of MLL-AF9 expression.
Together, these results suggest that induced MLL-AF9 expression deregulates normal myelomonocytic differentiation, consequently generating cells with higher cell expansion potential. Moreover, without doxycycline induction, these genetically modified iPSCs were still viable for myelomonocytic differentiation, demonstrating the potential of this doxycycline-dependent MLL-AF9 activation model.
To elucidate the underlying biological processes associated with this MLL-AF9-mediated gene response, we performed Gene Ontology (GO) term enrichment analysis. Our iMonocytes were significantly enriched for "leukocyte mediated immunity", "leukocyte proliferation", and "MHC protein complex assembly", signifying white blood cell development and immunity, whereas iMLL-AF9 cells were significantly enriched for biological processes such as "chromosome segregation", "regulation of cell cycle phase transition", and "RNA localization" (Figure 2C), processes that are generally associated with a proliferative phenotype. Overall, these results are in line with our previous flow cytometry results, implying a disturbance of iPSC-derived hematopoiesis upon MLL-AF9 expression.
In order to demonstrate that our iPSC-derived hematopoietic cells are relevant to early hematopoietic development and leukemia gene profiles, we performed Gene Set Enrichment Analysis (GSEA). For this, we used multiple curated gene sets for neonatal HSC signatures identified by Jaatinen et al. [33], Novershtern et al. [34], and Eppert et al. [35] (Figure 2D, left column). Indeed, our iMLL-AF9 model correlated with HSC signatures, as we observed significant enrichment scores resulting from our upregulated iMLL-AF9 genes. Moreover, MLL-AF9-induced CD34+ cord blood cell signatures by Horton et al. [6], leukemic stem cells (LSCs) by Eppert et al. [35], signatures for cell cycle progression by Kanehisa et al. [36], and an AML signature by Köhler et al. [37] were enriched in our iMLL-AF9 cells (Figure 2D, right column). Our iMonocytes were inversely correlated with HSCs, LSCs, and MLL-AF9-expressing CD34+ cord blood signatures (Figure 2D). Together, we demonstrated significant upregulation of key MLL-AF9 AML marker genes after MLL-AF9 induction as a single oncogenic hit. Furthermore, these results provide a strong indication that our iMLL-AF9 cells share a transcriptional profile with early myeloid progenitors and are relevant to AML-like overproduction of such early myeloid progenitors.

Identifying Key Transcription Factors Driving MLL-AF9-Induced Early Hematopoietic Progenitor Cells
Transcription factors (TFs) are important regulators during the progression of normal hematopoiesis [38]. In MLL-AF9 AML, numerous TFs are considered to be oncogenic and/or important for leukemogenesis and/or maintenance [39]. To identify important TFs mediated by MLL-AF9 expression, we used a network-based TF prediction method that prioritizes TFs based on transcriptomic and chromatin accessibility data [18]. For this, we performed RNA-seq and Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) on iMonocytes and iMLL-AF9 cells (Supplementary Figure S1I,J). First, genome-wide TF binding profiles were predicted using three types of input features: signal intensity at chromatin accessible regions, TF binding motif scores, and the average ReMap ChIP-seq coverage [40]. Subsequently, differential gene regulatory networks were determined using gene expression information, resulting in an inferred 'influence score', a measure of how well a TF can explain the transcriptional differences between two cell states. In total, 12 TFs were predicted to be important in MLL-AF9-expressing HPCs (Figure 2E). To determine MLL-AF9 DNA binding at the loci of our predicted TFs, we analyzed publicly available MLL-AF9 ChIP-seq data from genetically modified human cord blood cells and identified nine as direct MLL-AF9 target genes (Supplementary Figure S1K) [27]. Amongst these was PBX3, an important TF in MLL-AF9-expressing cells, which has previously been shown to be important in leukemogenesis and maintenance, as well as a direct target of MLL-AF9 [13,41,42]. Interestingly, GATA2 and SP8 were predicted as the most influential TFs in iMLL-AF9 cells, yet their roles in this context remain elusive. Together, these results reveal an instructive set of transcription factors that likely govern MLL-AF9-induced gene programs.
The majority of the predicted TFs are indeed direct MLL-AF9 targets, strongly implying that the novel factors could be involved in MLL-AF9-mediated leukemogenesis and/or maintenance.

Comparative Gene Expression Profiling Reveals MLL-AF9-Associated Core Genes in Primary AML
To identify MLL-AF9-specific genes that are faithfully represented in primary AML, we compared our iMLL-AF9 model to MLL-AF9-positive primary AMLs. First, we used our iMLL-AF9 model to uncover MLL-AF9-associated genes (Figure 3A). For each MLL-AF9-associated gene, expression levels were statistically tested by comparing MLL-AF9-positive primary AML to other AML subtypes. For this, we used transcriptomic data from the TARGET AML cohort (pediatric) and BeatAML (adult) cohort [12]. We identified 61 core genes across the two cohorts: 39 high-confidence genes that were significantly upregulated in both AML cohorts and 22 unique genes that were upregulated in only one of them (Figure 3A and Supplementary Table S2). In agreement with the previous results (Figure 2B), we identified known MLL-AF9 targets, such as HOXA10 and ZNF521 [28,43]. Based on this analysis, we were able to uncover MLL-AF9-associated core genes that are faithfully represented in primary AML.
Furthermore, to uncover the specificity of the core genes for MLL-AF9 (KMT2A-MLLT3) leukemia, we examined their expression levels in various AML subsets (Figure 3C). Indeed, a higher expression of these core genes correlated well with MLL-AF9 AMLs. Moreover, to determine the discriminative power of MLL-AF9-associated core genes in leukemia subclassification, we performed uniform manifold approximation and projection (UMAP) (Figure 3D,E), as well as hierarchical clustering (Supplementary Figure S2A,B). In both the pediatric (TARGET AML) and the adult cohort (BeatAML), we were able to define clusters of MLL-AF9 AML patients. Interestingly, the dynamic gene expression within the core genes also allowed for the clustering of other cytogenetic abnormalities, such as RUNX1-RUNX1T1, CBFB-MYH11, or AML patients with a CEBPA mutation. Altogether, these results suggest that we were able to effectively classify primary MLL-AF9 and other AMLs using these core genes.
Next, to determine which of the core genes are directly regulated by the MLL-AF9 fusion protein, we included MLL-AF9 DNA binding from publicly available MLL-AF9-transduced CD34+ cord blood data in our analysis [13]. In total, 18 genes were identified as direct MLL-AF9 targets (Supplementary Table S2), including known (e.g., HOXA10) and novel (e.g., SKIDA1) targets. Indeed, all 18 genes were identified as high-confidence MLL-AF9-associated genes that were significantly upregulated in both AML cohorts. In summary, these results suggest that we were able to uncover high-confidence MLL-AF9-associated core genes that are faithfully represented in primary AML. Our syngeneic model served as an important entry point in uncovering these known and presently unknown markers.
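The bookkeeping behind the core-gene categories in this section can be illustrated with plain set operations. The sketch below is a toy example under stated assumptions: the gene identifiers are placeholders, the set sizes do not reproduce the reported counts, and the variable and function names are hypothetical rather than taken from the analysis code.

```python
def classify_core_genes(pediatric_up, adult_up, chip_bound):
    """Split model-derived genes by cohort support and fusion binding.

    pediatric_up: genes significantly upregulated in the pediatric cohort
    adult_up:     genes significantly upregulated in the adult cohort
    chip_bound:   genes with MLL-AF9 ChIP-seq binding at their locus
    """
    high_confidence = pediatric_up & adult_up   # upregulated in both cohorts
    single_cohort = pediatric_up ^ adult_up     # upregulated in exactly one
    core = pediatric_up | adult_up              # all core genes
    direct_targets = core & chip_bound          # core genes bound by the fusion
    return {
        "core": core,
        "high_confidence": high_confidence,
        "single_cohort": single_cohort,
        "direct_targets": direct_targets,
    }


# Placeholder identifiers, not real results.
result = classify_core_genes(
    pediatric_up={"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    adult_up={"GENE_B", "GENE_C", "GENE_E"},
    chip_bound={"GENE_B", "GENE_E", "GENE_Z"},
)
```

By construction, the high-confidence and single-cohort sets are disjoint and together make up the full core set, which mirrors how the two reported categories add up to the total number of core genes.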

MLL-AF9 Expression Affects Distinct Cellular Populations during Early Hematopoietic Development
To determine the effects of MLL-AF9 at various levels of blood cell differentiation, we performed single-cell RNA sequencing using 265 iMonocytes and 774 iMLL-AF9 cells. First, we aligned cross-dataset pairs of cell identities based on a shared set of variable genes and performed UMAP nonlinear dimensionality reduction as well as Louvain clustering to visualize and explore our data. This revealed multiple integrated cell populations between our two conditions (Figure 4A). We defined hematopoietic populations based on the expression of established markers (e.g., CD34, CD14, GP1BA, ITGA2B, SIGLEC8, HLA-DRA, PRG2), which revealed that the generated blood cells predominantly derived from the myeloid compartment, such as megakaryocyte-, HPC-, granulocyte-monocyte progenitor (GMP)-, basophil-, eosinophil-, (pro-)monocyte-, and dendritic cell (DC)-like cells (Figure 4B and Supplementary Figure S2C). Our data integration revealed a shared set of cell states for which we could determine MLL-AF9-induced gene expression changes.
MLL-AF9 helps to perpetuate ongoing gene expression programs across multiple cellular states [7], thus creating a permissive environment for competitive cell states with preserved cell cycle behavior. To assess whether cell cycle signatures were upregulated after MLL-AF9 induction, we classified each cell in either G1-, G2M-, or S-phase based on canonical cell cycle marker expression (Figure 4C). When compared to iMonocytes, this revealed an increased expression of both S- and G2M-phase genes for most iMLL-AF9 cell clusters. In agreement with our previous cell expansion results (Figure 1D), these data suggest that MLL-AF9 induction affects the DNA replication gene program in most cell states. Thus, this implies that MLL-AF9 expression maintains a cell cycle program in multiple blood cell compartments.
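The phase classification described above can be sketched as a simple decision rule on two per-cell scores. The rule below mirrors the logic commonly used by score-based classifiers such as cell cycle scoring in Seurat (both scores negative gives G1, otherwise the higher score wins); that the exact same rule and thresholds were applied here is an assumption, and the function name and example scores are hypothetical.

```python
def assign_cell_cycle_phase(s_score: float, g2m_score: float) -> str:
    """Assign a phase label from S-phase and G2M-phase marker scores.

    Each score is assumed to summarize expression of canonical phase
    markers relative to background, so negative values mean the marker
    set is not enriched in that cell.
    """
    if s_score < 0 and g2m_score < 0:
        return "G1"          # neither program is active
    if s_score == g2m_score:
        return "Undecided"   # tie between the two programs
    return "S" if s_score > g2m_score else "G2M"


# Invented scores for three cells.
phases = [assign_cell_cycle_phase(s, g) for s, g in
          [(-0.2, -0.1), (0.4, 0.1), (0.05, 0.3)]]
```

Counting the resulting labels per cluster and condition gives the kind of phase distribution that can then be compared between the induced and uninduced cells.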
To uncover whether certain cell populations expanded more in response to MLL-AF9 expression, we normalized for the cell count in iMonocytes and iMLL-AF9 cells (Figure 4D and Supplementary Figure S2D). It was revealed that both cluster 1 (annotated as CD34+ HPCs) and cluster 2 (annotated as GMPs) were more expanded after MLL-AF9 induction, which is in concordance with MLL-AF9 studies in human neonatal cells [6]. In addition, cluster 6 (annotated as DCs) became more abundant when MLL-AF9 was induced, which contained cells that expressed pro-angio-/vasculogenic factors (Supplementary Figure S2E). Previous studies identified immature DCs as bipotent cells with vasculogenic potential that could transdifferentiate into endothelial cells in response to high concentrations of VEGF [44]. Whether MLL-AF9 plays a role in the endothelialisation of these DC-like cells needs to be determined in future studies. Furthermore, we observed a decreased (pro-)monocytic-like compartment (cluster 5), even though these cells were enriched for cell cycle genes, and a non-expanding megakaryocytic-like cell compartment (cluster 0). Taken together, these results suggest that MLL-AF9 affects cell expansion in a cell state-dependent manner.
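Because the two conditions contributed different numbers of cells (265 versus 774), cluster sizes have to be compared as fractions rather than raw counts. The following sketch shows one way to do this; the per-cluster counts are invented for illustration, the function names are hypothetical, and the exact normalization used for the figure may differ.

```python
def cluster_fractions(cluster_counts):
    """Convert per-cluster cell counts into fractions of the condition total."""
    total = sum(cluster_counts.values())
    if total == 0:
        raise ValueError("condition contains no cells")
    return {cluster: n / total for cluster, n in cluster_counts.items()}


def relative_abundance(induced_counts, control_counts):
    """Ratio of induced to control fraction per cluster (above 1 = expanded)."""
    induced = cluster_fractions(induced_counts)
    control = cluster_fractions(control_counts)
    return {c: induced[c] / control[c] for c in control if control[c] > 0}


# Invented counts, chosen only to make the arithmetic easy to follow.
ratios = relative_abundance(
    induced_counts={"HPC": 60, "GMP": 90, "monocyte": 150},
    control_counts={"HPC": 10, "GMP": 20, "monocyte": 70},
)
```

In this toy example the HPC-like cluster doubles its share (ratio 2.0) although its raw count rises sixfold, which shows why the correction for total cell number matters.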
Next, in order to investigate whether the leukemic potential of iMLL-AF9 stems from specific cell type populations, we examined the expression of our earlier defined MLL-AF9-specific core genes in our identified blood cell populations. In keeping with our prior findings, we observed a higher expression of the core genes in iMLL-AF9 cells compared to iMonocytes (Figure 4E). For iMLL-AF9 cells, the highest expression was observed in clusters 2 (annotated as GMPs), 3, and 4, while other clusters exhibited a lower expression, although overall, their expression was greater compared to the iMonocytes. While the expression of MLL-AF9 core genes was absent in cluster 5 (annotated as (pro-)monocytic), MLL-AF9-independent expression of the core genes was observed in other clusters. Together, this indicates that the effects of MLL-AF9 on core gene expression are cell type-dependent.
Overall, these results reveal a cell state-dependent response to MLL-AF9 expression. MLL-AF9 allows for a competitive environment in which progenitor cell types are more susceptible to transformation. Our model enables the expansion of immature cells with a disturbed myelomonocytic differentiation axis. Thus, this provides a valuable platform to study the MLL-AF9-induced and context-dependent molecular mechanisms of diseases.

Discussion
In this study, we have established a new platform to examine the consequences of MLL-AF9 expression during human iPSC-derived hematopoietic differentiation. Our syngeneic model can serve as a new entry point to study the MLL-AF9-induced and context-dependent molecular mechanisms of diseases.
The MLL-AF9 fusion protein induces cell expansion and impairs differentiation [6,7]. In agreement with this, we confirmed the deregulation of myelomonocytic differentiation in conjunction with an increase in cell expansion capacity upon MLL-AF9 expression. Indeed, typical gene expression profiles that have well-described roles in both the induction and maintenance of MLL-AF9 AML were enriched following MLL-AF9 activation (e.g., MEIS1, HOXA9, HOXA10, CDK6, ZNF521) [26][27][28][29]43]. Moreover, we identified MLL-AF9-associated core genes that are faithfully represented in primary MLL-AF9 AML, including known targets such as HOXA10, CDK6, and ZNF521, as well as targets that have a presently unknown role in MLL-AF9 AML, e.g., SKIDA1, which seems to be a promising predictor of MLL-rearranged AML [45]. This indicates that we have uncovered a repertoire of gene profiles that includes factors important for leukemic transformation and maintenance in MLL-AF9 AML.
Previously, it has been shown that MLL-AF9 expression sustains already existing gene expression programs in multiple blood cell types [7]. As we can generate various blood cell states along the myeloid differentiation axis with our iPSC model (e.g., megakaryocyte-, granulocyte-, monocyte-like cell states), we assessed the consequences of MLL-AF9 expression in these myeloid blood cell compartments. We revealed a cell state-dependent response to MLL-AF9 expression. In concordance with others [6,7], CD34+ HPCs and GMP-like cells were observed to have an increased proliferative ability. Our findings shed light on the cell state-dependent response to MLL-AF9 activation, suggesting several areas for future investigation.
While we did observe the expansion of HPC and GMP-like compartments, it is unclear if this heterogeneity is a consequence of cell state-dependent differences in MLL-AF9 expression levels or of the cellular context in which MLL-AF9 expression occurs [46,47]. Additionally, we observed signs of exhaustion after MLL-AF9 induction. This suggests that our cells were not yet immortalized. In this model, the oncogenic hit causes a competitive advantage for cells that have an enriched cell division program, likely priming the cells for malignant transformation. Expressing MLL-AF9 has been shown to be sufficient to induce malignancy in animal models [48,49]; whether this is the case with our model needs to be determined. Importantly, our model allows for excellent control over the activation of MLL-AF9 expression during iPSC-derived hematopoiesis. Not only are we able to control the cell types in which MLL-AF9 becomes expressed, but we can also control its timing and dosage during the various developmental stages. Furthermore, our iPSCs were differentiated under serum-free and feeder-free conditions, allowing for careful, chemically controlled, and stepwise differentiation.
In summary, we demonstrated that, as an oncogenic hit, inducing MLL-AF9 expression disturbs early human iPSC-derived hematopoiesis in a cell state-dependent manner. By exploiting this system, we can provide a source of biomarkers that are faithfully represented in primary MLL-AF9 AML, including factors that have a presently unknown role in MLL-AF9-mediated programs. As this disease currently lacks effective precision medicine, finding potential novel targets lays the foundation for personalized therapeutic strategies.