Distinct miRNA Signatures and Networks Discern Fetal from Adult Erythroid Differentiation and Primary from Immortalized Erythroid Cells

MicroRNAs (miRNAs) are small non-coding RNAs crucial for post-transcriptional and translational regulation of cellular and developmental pathways. The study of miRNAs in erythropoiesis elucidates underlying regulatory mechanisms and facilitates related diagnostic and therapy development. Here, we used DNA Nanoball (DNB) small RNA sequencing to comprehensively characterize miRNAs in human erythroid cell cultures. Based on primary human peripheral-blood-derived CD34+ (hCD34+) cells and two influential erythroid cell lines with adult and fetal hemoglobin expression patterns, HUDEP-2 and HUDEP-1, respectively, our study links differential miRNA expression to erythroid differentiation, cell type, and hemoglobin expression profile. Sequencing results validated by reverse-transcription quantitative PCR (RT-qPCR) of selected miRNAs indicate shared differentiation signatures in primary and immortalized cells, characterized by reduced overall miRNA expression and reciprocal expression increases for individual lineage-specific miRNAs in late-stage erythropoiesis. Despite the high similarity of same-stage hCD34+ and HUDEP-2 cells, differential expression of several miRNAs highlighted informative discrepancies between both cell types. Moreover, a comparison between HUDEP-2 and HUDEP-1 cells displayed changes in miRNAs, transcription factors (TFs), target genes, and pathways associated with globin switching. In resulting TF-miRNA co-regulatory networks, major therapeutically relevant regulators of globin expression were targeted by many co-expressed miRNAs, outlining intricate combinatorial miRNA regulation of globin expression in erythroid cells.


Introduction
Hematopoiesis is orchestrated by a complex and multidimensional regulatory network that controls blood cell formation in the bone marrow (BM) [1]. The identification of a pool of multipotent hematopoietic stem and progenitor cells (HSPCs) in the BM, typically based on the use of the CD34 as cell surface marker [2,3], allowed the subsequent discovery of distinct, phenotypically defined compartments in the hematopoietic differentiation hierarchy, marked by characteristic changes in transcriptional profiles and key transcription factors (TFs) [4]. The application of high-throughput technologies, like genomic, transcriptomic, proteomic, and other "omic" technologies, allowed identification of lineage-specific expression profiles and study of cellular dynamics and regulatory mechanisms controlling human hematopoietic homeostasis and cell specification [5,6]. The proliferation and differentiation of HSCs down the erythroid lineage, called erythropoiesis, is thus one of the best-characterized developmental models, yet many of its aspects remain obscure [7].
MicroRNAs (miRNAs) are small non-coding RNAs (sncRNAs) of 20-23 nucleotides that posttranscriptionally modulate gene expression, mainly via degradation of mRNAs of target genes and/or inhibition of translation [8,9], although they can also upregulate translation or control transcription [10][11][12]. They represent an axis of regulation of erythropoiesis, fine-tuning the expression of key TFs that control cell fate, with mounting evidence for their additional role in other, more complex competing endogenous RNA (ceRNA) networks [13,14]. An ongoing effort to complement mostly microarray-based high-throughput studies and computational predictions with functional studies of individual miRNAs draws an image of complex modular action for miRNAs, characterized by their involvement in a multitude of highly intertwined processes and by interactions with proteins and with coding and non-coding transcripts.
Landmark studies on miRNAs during erythropoiesis catalogued differential miRNA expression in in vitro erythroid differentiation of human progenitors [15] and revealed progressive downregulation and targeting of the key erythroid receptor c-KIT by the miR-221/222 cluster [16], progressive downregulation of miR-221, miR-223, and miR-155, upregulation of GATA1-driven high-level expression of miR-144 and miR-451 [17][18][19], and LMO2-targeting and progressive downregulation of miR-223 [20]. Subsequently, several miRNAs were shown to be involved in normal and abnormal erythropoiesis or in its accompanying processes, such as iron metabolism and hemoglobin synthesis [21]. Importantly, β-hemoglobinopathies are clinically prominent and are typically ameliorated by elevated expression of the β-like γ-globin, which may substitute β-globin chains of adult hemoglobin (HbA; α 2 β 2 ) to form fetal hemoglobin (HbF; α 2 γ 2 ) [22,23]. This observation has fueled decades of exceptionally detailed characterization of the perinatal fetal-toadult switch from βto γ-globin, which nevertheless is still not fully understood [24,25]. Therefore, further delineation of the underlying mechanisms and the role of miRNAs and other non-coding RNAs (ncRNAs) in hemoglobin switching will facilitate the design of alternative or supplementary therapies for β-hemoglobinopathies. Although several studies have already implicated specific miRNAs in hemoglobin switching, manipulation of single miRNAs has yet to achieve therapeutically relevant HbF reactivation [26][27][28][29]. Based on our growing understanding of normal and pathological miRNA action in networks of multiple-to-multiple rather than one-to-one interactions, this may instead be achieved by concerted manipulation of multiple functionally related miRNAs [30][31][32].
In the present study, we employed small RNA sequencing to explore sncRNA and in particular miRNA expression in ex vivo human erythroid cell cultures from normal peripheral blood (PB)-derived CD34+ HSPCs (hCD34+), as well as in human umbilical cord blood-derived erythroid progenitor (HUDEP) cell lines (HUDEP-1 and HUDEP-2) during erythroid differentiation. To this end, hCD34+, HUDEP-1, and HUDEP-2 cells were expanded, differentiated, characterized, and analyzed for small RNA expression at early and late stages of erythroid commitment. Whereas hCD34+ cells are relevant as the hematopoietic stem and progenitor cell population used for therapeutic transplantation, the HUDEP cell lines are now used by many hematological research groups and currently represent an invaluable model for the study of erythropoiesis and therapy development [33][34][35][36][37][38]. Since their establishment by Kurita et al. based on lentiviral transduction of the doxycycline-inducible human papillomavirus 16 (HPV16)-E6/E7 expression system, HUDEP cells have overwhelmingly replaced other cell lines, such as K562, MEL, and UT-7/EPO, in the analysis of erythropoiesis, owing to their superior similarity to human primary cells. Notably, HUDEP-2 and HUDEP-1 show selective upregulation of β-globin and γ-globin expression, respectively, and BCL11A, a major repressor of γ-globin, is abundantly expressed only in HUDEP-2 cells [39]. Based on these characteristics, global small RNA sequencing of both HUDEP cell lines for comparison in erythropoiesis with one another and with adult primary erythroid cells will provide more authentic insights than those obtained from other erythroid cell lines [18,40,41] and may prove highly informative for the role and possible therapeutic exploitation of detected miRNAs.
In this first global sncRNA study on HUDEP cells, the parallel analysis of miRNA expression in primary erythroid cells and HUDEP erythroid cell lines identifies miRNA signatures associated with erythropoiesis, reveals differences between primary cells and their immortalized equivalent, and provides pointers to miRNAs, miRNA target genes, such as TFs, and pathways mechanistically linked to globin switching. Furthermore, predicted interactions between identified miRNAs and known, erythroid-specific long non-coding RNAs (lncRNAs) indicate evaluable crosstalk between small and long ncRNAs in our erythroid cell cultures.

Results
The experimental design for this study, including time points of analyses and culture conditions, is illustrated in Figure 1.

Flow Cytometry-Based Cell-Surface Protein Analysis Confirms Consistent Erythroid Differentiation in Cultures
Erythroid differentiation in cell culture systems was assessed by flow cytometry analysis of cell surface protein expression at three time points: early-stage (_E, day 0 of erythroid differentiation), intermediate stage (_I, days 3 and 4 of erythroid differentiation for hCD34+ and HUDEP cells, respectively) and late-stage (_L, days 6 and 9 of erythroid differentiation for hCD34+ and HUDEP cells, respectively). While excluded from sequencing analysis, the intermediate-stage samples served to monitor and confirm progress and comparability of erythropoiesis. Two sets of proteins, CD44 and CD49d (integrin alpha 4) as early differentiation markers and CD235a (glycophorin A, GpA) and CD71 (trans-ferrin receptor) as late differentiation markers, were evaluated, to track erythropoiesis by their differential expression. The corresponding histograms, comparing the expression of markers with the progression of erythropoiesis, are shown in Figure 2a. The observed dynamic changes of expression revealed an overall normal and comparable progression of erythroid differentiation in hCD34+ and HUDEP cells. As expected, expression levels of early-stage markers (CD44 and CD49d) declined upon induction of differentiation, in line with previous reports of their progressive decrease from proerythroblast to reticulocyte stages [42,43]. In our cultures, CD71 rapidly increased from early to intermediate stages, but then declined towards late stages of erythropoiesis, also in agreement with earlier studies, where CD71 was consistently found to increase 3-to 4-fold from proerythroblast to basophilic and polychromatophilic erythroblast stages and then return to proerythroblast level in late-stage orthochromatophilic erythroblasts [43,44]. Finally, CD235 as the major intrinsic protein of the human erythrocyte membrane [45], steadily increased from early to late stages in all cultures, although with a higher baseline expression in early-stage HUDEP cells. The latter observation concurs with the basophilic erythroblast gene expression profile of undifferentiated HUDEP cells previously documented by others [46].

Morphological Changes Characterize Erythroid Differentiation in Cultures
Morphologically, both primary hCD34+ cells and HUDEP erythroid cell lines progressed through the recognizable erythroblastic stages, with polychromatophilic erythroblasts and fewer orthochromatophilic erythroblasts already appearing by day 3 for hCD34+ and by day 4 for HUDEP cells (_I). At the end of erythroid differentiation (_L), hemoglobinproducing cells comprised the majority of cells in hCD34+, HUDEP-1, and HUDEP-2 cell cultures (96.0%, 98.4%, and 99.8%, respectively). At this stage, orthochromatophilic erythroblasts with pyknotic nuclei and enucleated cells prevailed in primary cell cultures, whereas poly-and orthochromatophilic erythroblasts were the predominant cells in HUDEP cell cultures, in the absence of enucleation (Figure 2b). Differential counting of erythroid subpopulations for _E and _L samples (total numbers of cells counted were 1647 for CD34_E, 1047 for HUDEP1_E, 1995 for HUDEP2_E, 1670 for CD34_L, 908 for HUDEP1_L, and 1179 for HUDEP2_L) indicated a greater differentiation potential of hCD34+ cells as shown by the cell fraction of late-stage erythroid differentiation (orthochromatophilic erythroblasts and reticulocytes); however, differences between cell types were not statistically significant (Figure 2c,d). Moreover, cell death, as evaluated by trypan blue assay, was higher in late-stage HUDEP-1 and HUDEP-2 cultures (≈67% and ≈52%, respectively) compared to late-stage primary cell cultures (15%). The above observations are attributed to the documented inherent inefficiency of differentiation of HUDEP cells and the reduced cell viability accompanying the withdrawal of doxycycline in cultures [36,39].

Reversed-Phase High-Performance Liquid Chromatography (RP-HPLC) Analysis Verifies Adult and Fetal Globin Expression Profiles of Differentiated Cells
Separation of globin chains by RP-HPLC was also performed in late-stage samples for hCD34+, HUDEP-1, and HUDEP-2 cells. As expected, hCD34+ cells and HUDEP-2 cells exhibited an adult globin expression profile with αand β-globin production, whereas HUDEP-1 had a fetal profile with αand predominantly γ-globin production ( Figure 2e). To identify small RNAs expressed during erythroid differentiation, we used DNA Nanoball (DNB) sequencing technology to sequence small RNAs from total RNA of hCD34+, HUDEP-1, and HUDEP-2 cells at two distinct time points of erythroid differentiation: early-stage (on day 0 for all cell sources and henceforth labeled _E) and late-stage (on days 6 and 9 for hCD34+ and HUDEP cells, respectively, and henceforth labeled _L). Three biological replicates in each stage were analyzed, labeled A, B, and C.
The proportions of reads assigned to known miRNAs ("mature" and "precursor") and other kinds of sncRNAs (e.g., piwi-interacting RNAs (piRNAs), small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), small interfering RNAs (siRNAs), transfer RNAs (tRNAs), ribosomal RNAs (rRNAs)), are summarized as pie charts in (Supplementary Figure S1); for the number of known and novel miRNAs and piRNAs detected for each sample, see Table S1. In total, we measured the expression of 1707 known miRNAs assigned to 583 miRNA families and identified 2138 novel miRNAs based on the predicted secondary structure of their potential precursors. Out of these, 538 known miRNAs and 210 novel miRNAs had an average read count number above 20 in tested samples, while more than 85 miRNAs showed more than 1000 reads in each sample. Only mature miRNAs and no miRNA precursors or other classes of sncRNAs were considered for further analyses in the context of this study.
Most of the predicted novel miRNAs had relatively low abundances and were identified only in few samples (Table S3). Thirty-seven novel miRNAs were expressed in all tested samples. Among them, novel_mir1 was the most abundant, its average number of reads contributing 50.45% and 1.37% of all reads for novel and for all (novel and known) miRNAs, respectively. Novel_mir1 was mapped to chr17:30306982_30307057, based on the predicted secondary structure of its potential precursor.
For a global view of miRNA expression in our cell culture systems, we performed principal component (PC) analysis of known and novel expressed miRNAs (Figure 3a). The first PC (PC1), accounting for 28.6% of the variation in the dataset, corresponds to the stage of erythroid differentiation (early vs. late), indicating an overall pattern of temporal changes in expression, whereas the second PC (PC2), accounting for 15.5% of the variation, corresponds to the cell type. Notably, PC2 grouped both adult-type cell sources, HUDEP-2 and hCD34+ samples, together, while separating fetal-type HUDEP-1 samples.  For pairwise comparisons of interest (same-stage expression between cell types, and late-vs. early-stage expression for each cell type) we performed differential expression analysis of all known and novel miRNAs using DEGseq [47]. For this analysis, DE miRNAs were defaulted as miRNAs with a false discovery rate (FDR) < 0.001 and a fold change (FC) ≥ 2 up or down (|log 2 FC| ≥ 1). Of note, all log 2 FCs cited in this article were statistically significant with a p-value <1.00 × 10 −300 , unless stated otherwise. Hierarchical clustering of all known and novel DE miRNAs confirmed the result of PCA, showing the relatedness between hCD34+ and HUDEP-2 cells as captured at both early and late stages of erythroid differentiation (Figure 3b,c). When samples were analyzed for changes in miRNA expression upon induction of erythroid differentiation (_L vs. _E), HUDEP-1 and HUDEP-2 cells clustered together and separately from hCD34+, although the inter-cluster variance was, in that case, low (Figure 3d), indicating an overall common erythroid differentiation signature for all three cell types. The top 60 most variable known miRNAs across cell types and stages, as defined by the standard deviation (SD) of expression, were sorted by descending average expression and plotted in a heatmap (Supplementary Figure S2).

Temporal miRNA Expression Profiling Shows Dynamic Regulation of Erythroid Differentiation
To identify miRNAs that are dynamically expressed during erythropoiesis, we looked for known and novel DE miRNAs between _L and _E samples. Evidently, more miRNAs were downregulated than upregulated in the late stages of erythroid differentiation in all cell cultures (Figure 4a). Consistent with progressive nuclear and chromatin condensation and downregulation of global gene expression after erythroid lineage commitment [48,49], our data revealed a restricted pattern of global miRNA expression and a reciprocal increase of the expression of individual lineage-specific miRNAs in terminal stages of erythropoiesis. This finding mirrors at the miRNA level what has been observed at the mRNA level, i.e., that less differentiated cells retain a poised cell state that is characterized by transcriptional complexity [50], and indicates active regulation of the cell state by a large repertoire of miRNAs.
By considering known miRNAs DE between early and late stages of erythroid differentiation, and their target genes, we identified significantly enriched biological pathways in hCD34+, HUDEP-1, and HUDEP-2 cells using Reactome [54] (Table S7). This confirmed pathways at the miRNA level that had been highlighted in previous conventional transcriptomic studies of HSPC proliferation and erythroid differentiation, including the EGFR, the SCF-KIT, the MAPK, and the PI3K-Akt signaling pathways [55][56][57], and also indicated novel pathways, such as signaling by ErbB2 and by Wnt.

Characterisation of the miRNA Transcriptome Highlights Differences between Primary hCD34+ Cells and HUDEP Cell Lines
The characterization and comparative analysis of the small-RNA transcriptome in hCD34+ and HUDEP cells showed an overall common erythroid differentiation signature, which vindicates the use of HUDEP cells as informative in vitro models for human erythropoiesis, also for the analysis of sncRNAs. Differential expression analysis of known and novel miRNAs in same-stage HUDEP-1 vs. hCD34+ and HUDEP-2 vs. hCD34+ confirmed the greater relatedness between hCD34+ and HUDEP-2 cells but also revealed nuanced differences that must be taken into consideration when those models are used to evaluate novel therapeutic targets and associated molecular mechanisms.
We focused on differential expression analysis of known miRNAs in HUDEP2_L vs. CD34_L cells as the most informative comparison for an assessment of the fidelity of findings from HUDEP-2 cells in functional studies of miRNAs ( Figure 5a and Table S8). Differential expression compared to hCD34+ cells would indicate potentially artefactual observations in HUDEP-2 cells (see also Table S4 and Supplementary Figure S3). In total, 172 and 242 known miRNAs were significantly upregulated in HUDEP2_L and CD34_L cells, respectively. By mapping those miRNAs to the genomic location of their precursor sequences using Bowtie 2 (Figure 5b), we identified five miRNAs of the miR-106a/363 cluster located on the Xq26.2 chromosome as significantly upregulated in HUDEP2_L cells. The cluster consists of six miRNAs (miR-106a, miR-18b-, miR-20b, miR-19b-2, miR-92a-2, and miR-363), encoded within about 850 nucleotides, and was previously linked to oncogenesis [58]. Most importantly, the Xq26.2 miRNAs were recently found to be dysregulated by the HPV16-E6/E7 oncoproteins [59]. It is therefore likely that their observed differential expression reflects the doxycycline-induced expression of the HPV16-E6/E7 immortalization system employed in HUDEP cells [39]. The most highly divergent miRNA of this locus was miR-363-3p (HUDEP2_L vs. CD34_L log 2 FC = 3.51). Overall, chromosomes X, 1 and 19 were the top three most DE miRNA-gene-containing chromosomes, whereas the Y chromosome contained none. The observation was unsurprising, as chromosomes 1, 19, X and 2 are known to carry a disproportionately high number of miRNA genes (about 29% of total human miRNA genes in 15% of human genomic DNA). Moreover, the high number of miRNA genes on the X chromosome and the scarcity or absence of miRNA genes on the Y chromosome are common in all species, constituting an evolutionarily conserved phenomenon [60]. A similar upregulation of the Xq26.2 cluster was observed in HUDEP1_L vs. CD34_L, miR-363-3p being again the most highly divergent of the locus (HUDEP1_L vs. CD34_L log 2 FC = 5.09) (Supplementary Figure S4 and Table  S9). In line with their HPV16-E6/E7-related expression and with high levels of doxycycline in early HUDEP-1 and HUDEP-2 cultures, the upregulation of the Xq26.2 cluster in HUDEP1_E and HUDEP2_E in comparison to CD34_E cells was even more striking than in late cultures, once more with miR-363-3p as the most upregulated miRNA of the locus (HUDEP1_E vs. CD34_E log 2 FC = 8.37, HUDEP2_E vs. CD34_E log 2 FC = 6.69). Two other known oncomiRs, miR-125b-5p and miR-196a-5p [61,62], were also significantly upregulated in HUDEP-2 cells (HUDEP2_L vs. CD34_L log 2 FC = 4.25, p < 1.00 × 10 −300 and 6.06, p = 3.69 × 10 −149 , respectively).  Table S8 for a full list). (b) The number of HUDEP2_L vs. CD34_L DE known miRNAs located in different chromosomes.
By mapping each upregulated known miRNA in HUDEP-1 cells to the genomic position of their precursor sequences (Figure 6b), we identified 17 out of a total of 274 as mapping to the frequently imprinted 14q32 locus, previously identified as the genomic locus of a strongly upregulated miRNA cluster in fetal erythroblasts [66]. The most significant miRNA of this locus in our study was miR-342-3p (HUDEP1_L vs. HUDEP2_L log 2 FC = 1.4). Once more, 5 out of the 6 miRNAs of the Xq26.2 cluster were identified as DE in HUDEP1_L vs. HUDEP2_L, likely reflecting the variability of expression for their shared doxycycline-inducible HPV16-E6/E7 transgene. Chromosomes X and 1 were the top two most miRNA-gene-containing chromosomes for miRNAs upregulated in HUDEP-1 cells, whereas the Y chromosome contained none. By mapping each known miRNA downregulated in HUDEP-1 cells to the genomic position of its precursor sequence, we identified the miR-183 cluster (miR-183, miR-96, and miR-182) located on chromosome 7 [72], and the miR-125a/let-7e cluster on chromosome 19 [73], as significantly downregulated. Chromosome 19 was the topmost miRNA-gene-containing chromosome for miRNAs downregulated in HUDEP-1 cells, whereas chromosomes 13 and Y contained none.  Table  S11 for a full list). (b) The number of HUDEP1_L vs. HUDEP2_L DE known miRNAs located in different chromosomes.
To understand the complex multiple-to-multiple relations between DE miRNAs, target genes, and TFs, we explored TF-miRNA co-regulatory networks using miRNet (Figure 7a,b) [74,75]. The lists of 274 and 131 known miRNAs upregulated in HUDEP1_L and HUDEP2_L, respectively, were used independently as input for network analysis in miRNet and processed along with a separate list of ten TFs known to be involved in γ-globin regulation (BCL11A, ZBTB7A, KLF1, CHD4, KLF10, MYB, SOX6, NFE2, NFYA, TAL1) [76][77][78][79][80]. For five of them (BCL11A, KLF1, MYB, SOX6, TAL1), published gene expression data reveal differential expression between HUDEP-1 and HUDEP-2 cells [39,81]. The analysis was based on experimentally validated miRNA-interaction data collected from the TarBase v8.0 database [82]. The constructed networks, after filtering out of nodes with low degree and betweenness centrality, revealed hub miRNAs and genes as well as important connections between the three (TF-miRNA-gene) interactors. Interestingly, several TFs, including the two major γ-globin repressors, BCL11A and ZBTB7A, were recognized as interactors in both miRNA interaction networks. Among them, ZBTB7A and KLF10 showed the highest degree and betweenness centrality in the network of upregulated miRNAs in HUDEP1_L and HUDEP2_L, respectively. Feedforward loop type-specific interactions, considered to be key components of gene regulatory networks (GRNs) [83,84], were also detected in our networks, shedding more light on particular molecular pathways regulated by these TFs (Supplementary Figure S5). In an independent analysis of experimentally validated miRNA/target-mRNA interaction networks of BCL11A and ZBTB7A retrieved from miRNet, we found that miRNAs in these networks were overrepresented in the list of HUDEP1_L vs. HUDEP2_L DE miRNAs. More specifically, 9 out of a list of 17 experimentally validated miRNAs that target the major HbF repressor BCL11A according to miRNet, were indicated as DE in HUDEP1-L vs. HUDEP2_L in our study (p = 0.0109), whereas the corresponding numbers for ZBTB7A were 35 out of 100 (p = 0.0092). We also looked at LIN28B, an RNA-binding protein recently recognized as an upstream regulator of BCL11A [85], and found 17 DE miRNAs out of a list of 41 validated LIN28B-associated miRNAs (p = 0.0109) (Table S13). The above findings advocate a collective role of miRNAs in erythropoiesis and globin expression regulation, where multiple miRNAs act together to fine-tune the expression of important target genes.
To examine the biological interpretation of results further, we performed gene ontology (GO) enrichment analysis [86,87] on target genes of all known and novel HUDEP2_L vs. HUDEP1_L DE miRNAs, to identify GO terms that are significantly overrepresented in our dataset. With numerous overrepresented GO nodes in all three GO categories, biological process, cellular component, and molecular function, the latter with the exemplary nodes protein binding transcription factor activity (430 target genes), antioxidant activity (42 target genes) and translational regular activity (20 target genes) presents nodes of particular potential interest for the differential regulation of globin expression and associated stress factors (Supplementary Figure S6). In addition, pathway enrichment analysis using Enrichr [88] and WikiPathways 2019 Human [89] identified 106 enriched biological pathways (adjusted p-value < 0.05). Several pathways, such as MAPK, PI3K-Akt, RAS, and insulin signaling pathways, have previously been implicated in γ-globin gene regulation [90][91][92][93][94], but others, such as EGF/EGF receptor, VEGF A/VEGF receptor 2, and ErbB signaling pathways, have not been reported as involved in this process before (Table 1).

Evidence Suggests Cross Talk between miRNAs and Long Non-Coding RNAs (lncRNAs)
Accumulating evidence suggests that miRNAs and other classes of ncRNAs interact with each other directly to further modulate the effect of their regulation [95,96]. This multimodal cross talk is only beginning to be unraveled, as new studies, mainly focusing on lncRNA-miRNA interaction, elucidate its regulatory role in biological processes, including erythropoiesis [97][98][99]. We thus conducted in silico analysis of lncRNA-miRNA interactions focusing on BGLT3, an erythroid lncRNA that positively regulates γ-globin gene expression. Lying in an intergenic region downstream of the A γ-globin (HBG1) gene, the non-coding BGLT3 locus and its transcript act separately to increase globin expression through looping to the globin genes and interacting with the mediator of the RNA polymerase II transcription complex, respectively [100]. To date, no BGLT3-miRNA interactions have been reported; however, it is likely that RNA cross-talk exists and modulates the BGLT3-mediated regulation of γ-globin. MicroRNA members of five out of nine miRNA families predicted to interact with BGLT3 according to miRcode [101] were DE in HUDEP1_L vs. HUDEP2_L samples. Three of them, miR-125a-5p, miR-193a-5p and miR-155-5p, were upregulated in HUDEP2_L (HUDEP1_L vs. HUDEP2_L log 2 FC = −2.7, p < 1.00 × 10 −300 −2.52, p = 1.24 × 10 −178 and −1.09, p < 1.00 × 10 −300 , respectively) and two, miR-142-3p and miR-138-5p, upregulated in HUDEP1_L cells (HUDEP1_L vs. HUDEP2_L log 2 FC = 2.42, p < 1.00 × 10 −300 and 1.21, p = 1.29 × 10 −4 , respectively). Interestingly, miR-125a-5p and miR-142-3p were among the top 50 most significant HUDEP1_L vs. HUDEP2_L DE miRNAs. Of note, HBBP1 (NR_001589), another lncRNA recently identified as related to high γ-globin levels [102], is predicted to interact with five HUDEP1_L vs. HUDEP2_L DE miRNAs, including miR-18b-5p (HUDEP1_L vs. HUDEP2_L log 2 FC = 2.99), which was among the top 20 DE miRNAs based on p-value. a b miRNA gene transcription factor Figure 7. TF-miRNA co-regulatory networks illustrating the "multiple-to-multiple" pattern of miRNA interaction. Networks were extracted from miRNet using key erythroid transcription factors and DE miRNAs as input (accessed 12/2020). Analyses were performed in line with miRNet recommendations to limit network size (<2000 nodes), i.e., the "Data Filter" function was used, based on the network topology measures-degreed, betweenness, and shortest path. In this context, the degree of a node is the total number of connections it has to other nodes, and high-degree nodes are considered important "hubs" in a network (set to the default 1.0 here). The betweenness measures the number of shortest paths going through a node, and nodes with higher betweenness are important interactors in a network (set to 1.0 here). The shortest path filter helps reduce the complexity of a network by keeping only one, the shortest, path between hub nodes, i.e., in the presence of multiple paths connecting two nodes, only the shortest path will be retained (applied here). In the figure, node size indicates betweenness centrality, with large symbols highlighting hub nodes. The components shown are (from inside out) input miRNAs mapped to a network, miRNet-identified target genes, and input TFs mapped to a network, as indicated with corresponding symbols. (a) TF-miRNA co-regulatory network of known miRNAs upregulated in HUDEP1_L. (b) TF-miRNA co-regulatory network of known miRNAs upregulated in HUDEP2_L.
Overall, all five miRNAs identified as DE by sequencing and shortlisted for confirmation by RT-qPCR showed the expected trend and similar magnitude of dynamic expression in RT-qPCR analysis. The high concordance between the two methods as calculated by Pearson correlation coefficient analysis (r = 0.8369, p < 0.0001) validates our general results obtained by small RNA sequencing. , and 146a-5p) was analyzed by RT-qPCR and normalized using miRNA-93-5p as an endogenous control for RNA input. All RT-qPCR experiments were carried out in triplicates. The miRNA expression is presented as mean ± standard deviation (SD), normalized to the highest sample readout.

Discussion
With this study, we provide the first comprehensive analysis of small RNAs expressed during erythroid lineage progression in HUDEP cell lines, along with a parallel analysis in adult hCD34+ cells.
In summary, miRNA expression analysis revealed a pattern of time-related changes during erythroid differentiation in both primary hCD34+ cells and HUDEP erythroid cell lines. In the process, more miRNAs were downregulated than upregulated, yet terminal stages of erythropoiesis were marked by a sharp increase in the expression of several lineage-specific miRNAs. Same-stage hCD34+ and HUDEP-2 cells (adult-like erythroid cells) showed evident clustering and clear separation from HUDEP-1 cells (fetal-like erythroid cells). Differential expression analysis in HUDEP1_L vs. HUDEP2_L cells suggested several miRNAs and implicated their putative target genes and related pathways in γ-globin regulation. Corresponding TF-miRNA co-regulatory networks readily incorporated recognized major factors involved in γ-globin regulation, including individual genes targeted simultaneously by multiple miRNAs or multiple genes targeted by the same miRNAs.
To elaborate on the "multiple-to-multiple" pattern of miRNA interaction, we explored miRNA/target-mRNA interaction networks of key regulators of hemoglobin switching (BCL11A and ZBTB7A) and found the corresponding miRNAs significantly enriched in the list of DE miRNAs between HUDEP-1 and HUDEP-2 cells. As our understanding of miRNA biology advances, it becomes clearer that multiple miRNAs act in a combinatorial and synergistic manner on the same target or pathway to regulate developmental events, such as hemoglobin switching. Groups of miRNAs that are "coordinately" increased and/or decreased at a certain cell state are likely to participate in interwoven GRNs, where recurrent motifs of feedback and feedforward loops exist between TFs, miRNAs, and their shared target genes [83,103,104]. Recent studies analyzing high-throughput data for both miRNAs and mRNAs have noted the prevalence of several miRNA/target-mRNA pairs with positively correlated expression patterns. These observations contradict the traditional notion of negatively correlated miRNA/target-mRNA pairs, which is based on an exclusively repressive mode of action of miRNAs but can be explained if perceived in the context of GRNs. In line with this, several up-and downregulated miRNAs in HUDEP-1_L in comparison to HUDEP-2_L cells were found to interact with some of the known γ-globin repressors in our constructed miRNA/target-mRNA networks. The identification of those co-expressed miRNAs that jointly regulate genes of interest (either cooperatively or antagonistically) in HbF expression pathways partly explains why most studies focusing on manipulating the expression of single miRNAs have failed to reactivate HbF substantially [27,28,69,105]. Mild co-manipulation of more than one miRNA may have a more pronounced phenotypic effect than targeting single miRNAs while minimizing at the same time off-target effects [106]. In this vein and beyond clear correlation with established mRNA expression patterns, integration of our data with comprehensive mRNA profiling will allow deeper insights into the networks concerned, as would multiplex inactivation of identified miRNA networks to validate their putative action on individual mRNA targets.
Adding to the complexity of GRNs, a new layer of regulation of biological processes was unraveled with the discovery of lncRNAs and the understanding of their role as key mediators in ceRNA activity. Recent theoretical and experimental studies have shed light on RNA-RNA "communication," reporting several modes of interaction: miR-triggered RNA decay, competition between miRNAs and lncRNAs for the same mRNA target, miRNA generation from lncRNAs, and lncRNAs acting as decoys for miRNAs [107]. Computational clues in our study provide preliminary evidence that for several miRNAs put forward as implicated in globin regulation here, two lncRNAs, BGLT3 and HBBP1, previously shown to be associated with high γ-globin levels, besides regulating the expression of target genes independently may act as miRNA binding partners to further modulate their effect. The versatility of lncRNA interactions and function, however, calls for dedicated functional studies to determine the nature of such interactions and draw firm conclusions as to their effect.
Our study is the first to determine miRNA expression profiles of HUDEP cells at early and late stages of erythroid differentiation and directly compare HUDEP cells to same-stage primary HSPC-derived erythroid cells. Given the now-widespread use of HUDEP-2 cells as a model of human terminal erythropoiesis [34][35][36][37][38], recognizing existing differences between them and their native counterparts is important for the design and interpretation of experimental studies based on HUDEP cell lines. This is particularly critical for the development of targeted therapies, as manipulated cellular pathways may respond differently in engineered cell lines compared to their tissue of origin [108]. We have highlighted specific aspects of HUDEP cells that need to be taken into account during the interpretation of functional studies. These include the inherent elevation of HPV16-E6/E7-related pathway for both, HUDEP-1 and HUDEP-2 cells, which is related to the immortalization process and not representative of observations in primary cells. Related to this point, differences in miRNA expression from the Xq26.2 cluster were also detected between both HUDEP cells and likely reflect differential transgene expression, short halflife and/or varying metabolization of doxycycline, and thus are probably not of biological significance for natural erythropoiesis and globin switching.
Weighed against greater fidelity of individual primary samples for observations in vivo, the use in the present study of cell lines HUDEP-1 and HUDEP-2 for the iden-tification of γ-globin-related miRNA signatures offers the substantial benefit of directly comparing co-derived fetal-like (HUDEP-1) cells to adult-like (HUDEP-2) cells of common tissue origin, specifically human umbilical cord blood, under the same culture conditions and without the biological variation typical of primary samples. In contrast with previous studies comparing erythroblasts derived from different tissues (fetal liver or cord blood vs. adult BM or PB) [28,66,67], the common tissue-specific background expression in our cord blood-derived erythroblasts renders any identified differences more likely to be relevant to γ-globin regulation. Likewise, a modest sample size and its impact on statistical sensitivity and specificity in our study are potentially offset by vastly reduced intra-group variation for cell lines, a view vindicated by consistent agreement of our significant observations with existing literature, where there are corresponding data. Moreover, the BGISEQ-500 sequencing system utilized here has been reported as superior to the commonly used Illumina HiSeq system for the detection of novel miRNAs, by showing a more even coverage of rare miRNA species, which further strengthens the validity of our novel data [109]. Taken together, these factors, in turn, lend credence to our novel observations, such as for novel miRNAs (e.g., novel_mir1 and novel_mir103) and novel or extended GRNs, including miRNAs and pathways newly implicated in erythropoiesis and γ-globin gene regulation, which are likely of biological relevance also in primary cells. Our work thus puts forward specific candidates and hypotheses for validation in functional studies, such as by combinatorial knock-down or editing of co-regulated and potentially collaborating miRNAs.
In conclusion, this study provides important insights into miRNA regulatory mechanisms in erythropoiesis and globin expression, highlighting co-regulation of miRNAs as central to their natural function. MiRNAs, genes, and pathways identified through this study may be explored as potential targets for the development of novel therapies for β-hemoglobinopathies and other disorders of erythropoiesis. Moreover, this study demonstrates the similarity of the commonly used HUDEP-2 cell line to primary erythroid cells, also at the level of miRNAs, and vindicates the use of HUDEP-2 and HUDEP-1 lines for corresponding functional studies. at concentrations below 0.5 × 10 6 cells/mL. The culture medium was changed twice a week and doxycycline was replenished every other day to retain expression of the HPV16-E6/E7 immortalization system. To induce erythroid differentiation, cells were cultured in a three-phase erythroid differentiation culture system previously described [36]. Briefly, cells were cultured in Iscove's Modified Dulbecco's Medium (Thermo Fisher Scientific, Waltham, MA, USA) containing 5% Human Serum from plasma male AB (Sigma-Aldrich, Munich, Germany), 2× Penicillin-Streptomycin (Thermo Fisher Scientific, Waltham, MA, USA), 1% L-Glutamine (Thermo Fisher Scientific, Waltham, MA, USA), 200 µg/mL Holo-Transferrin (Sigma-Aldrich, Munich, Germany), 2 IU/mL Heparin (Sigma-Aldrich, Munich, Germany), 10 ng/mL Insulin (Sigma-Aldrich, Munich, Germany), and 3 IU/mL EPO (Binocrit 4000 IU/0.4 mL, Sandoz GmbH, Kundl, Austria). In phase I (days 1-4), the culture medium was supplemented with 100 ng/mL hSCF (PeproTech, Rocky Hill, NJ USA). In phase II (days 5-7), hSCF was withdrawn from culture. In phase III (days 8 and 9), doxycycline was also removed from the culture medium. Cells in differentiation were maintained at 1 × 10 6 cells/mL. Cell counting was performed manually using a hemocytometer, and cell viability was measured using Trypan Blue Solution, 0.4% (Thermo Fisher Scientific, Waltham, MA, USA).

Isolation and Culture of PB-Derived hCD34+ HSPCs
hCD34+ cells were isolated from PB of healthy individuals by magnetic-activated cell sorting, after mononuclear cell isolation using a density gradient medium and according to a published protocol (Protocol C) [110]. Briefly, PB was collected in K2EDTA tubes (Greiner Bio-One, Kremsmünster, Austria) and mixed with 1.5 blood-volumes of Dulbecco's Phosphate-Buffered Saline (PBS) without calcium chloride and magnesium chloride (Sigma-Aldrich, Munich, Germany), and then carefully overlaid on top of 1.25 blood-volumes of Lymphoprep medium (Axis-Shield, Dundee, UK). After centrifuging at 900× g for 40 min without brake, the interphase, consisting of mononuclear cells, was harvested, washed 3 times in PBS, twice by centrifugation at 300× g for 5 min and once at 200× g for 5 min. The cell pellet was then resuspended in 600 µL ice-cold 1% bovine serum albumin (BSA) (Sigma-Aldrich, Munich, Germany) in PBS and incubated for 15 min with 100 µL CD34 MicroBead from CD34+ MicroBead Kit, human (Miltenyi Biotec, Bergisch Gladbach, Germany), on a rocking shaker at 4 • C. After two washes with ice-cold beading buffer (BB) (0.5% BSA, 2 mM EDTA in PBS) and centrifugation at 300× g for 5 min at 4 • C, the cell pellet was resuspended in 1 mL BB (without EDTA). Magnetic cell separation was carried out using LS columns placed on the QuadroMACS Separator on a MACS MultiStand (all from Miltenyi Biotec, Bergisch Gladbach, Germany). LC column was equilibrated with 5 mL BB, and then cells were loaded onto the column in two consecutive steps with an intervening BB column washing step. Cells were eluted from the column with 5 mL BB (by flushing with the plunger) and for the second round of selection were loaded onto a new equilibrated column to enrich further for hCD34+ cells. Cells were again eluted with 5 mL BB and collected by centrifugation at 300× g for 5 min. Isolated cells were then expanded in StemSpan SFEM II (Stem Cell Technologies, Vancouver, Canada) supplemented with 1 µM Dexamethasone (Sigma-Aldrich, Munich, Germany), 2 IU/mL EPO (Binocrit 4000 IU/0.4 mL, Sandoz GmbH, Kundl, Austria), 1% StemSpan CC100 (Stem Cell Technologies, Vancouver, Canada) and 1× Penicillin-Streptomycin (Thermo Fisher Scientific, Waltham, MA, USA) at concentrations below 0.5 × 10 6 cells/mL for ten days. The medium was changed twice a week. For the induction of erythroid differentiation, cells were cultured in Minimum Essential Medium Eagle, Alpha modification (Sigma-Aldrich, Munich, Germany) supplemented with 30% Defined Fetal Bovine Serum (Hyclone, Logan, Utah, USA), 10 −5 M 2-Mercaptoethanol (Sigma-Aldrich, Munich, Germany), 10 IU/mL EPO (Binocrit 4000 IU/0.4 mL, Sandoz GmbH, Kundl, Austria) 10 ng/mL hSCF (PeproTech, Rocky Hill, USA) and 1× Penicillin-Streptomycin (Thermo Fisher Scientific, Waltham, MA, USA) at a concentration of 0.5-1 × 10 6 /mL for six days. Again, cell counting was performed manually using a hemocytometer, and cell viability was measured using Trypan Blue Solution, 0.4% (Thermo Fisher Scientific, Waltham, MA, USA).

Cytocentrifugation and Microscopy
For determining cell morphology, 200 µL of cell suspension containing 1 × 10 5 cells were used for cell preparations on slides, using the Tharmac Cellspin II cytocentrifuge with an EASY-rotor (Tharmac, Wiesbaden, Germany). The slides were fixed in ice-cold methanol for 4 min, immersed in 1.5% o-Dianisidine solution (Sigma-Aldrich, Munich, Germany) for 2 min and then in freshly prepared H 2 O 2 /ethanol solution (50% ethanol, 1.5% H 2 O 2 in dH 2 O) for another 2 min. A standard May-Grünwald/Giemsa (Fluka, Munich, Germany) staining procedure was then followed, after which slides were air-dried, mounted, and sealed with a coverslip. Slide images were acquired using an IX73P1F inverted microscope, LED illumination, a 40× lens, and CellSens 1.7 imaging software (all Olympus Corporation, Shinjuku City, Tokyo, Japan). Four or five images per sample were obtained by using an averaged acquisition of seven frames, and cell morphology was evaluated blinded to the cell type and phase.

Flow Cytometry and Analysis
1 × 10 5 cells were used for each staining, including a non-stained sample for each set of measurements. Briefly, cells were washed twice in PBS (Sigma-Aldrich, Munich, Germany) and incubated for 10 min at 4 • C in 40 µL blocking buffer, composed of 39 µL 1% Bovine Serum Albumin (BSA) (Sigma-Aldrich, Munich, Germany) in PBS (Sigma-Aldrich, Munich, Germany) and 1 µL FcR human blocking reagent (Miltenyi, Bergisch Gladbach, Germany). Cells were then incubated with conjugated antibodies (PE anti-human CD235a (Glycophorin A) Antibody (clone HIR2), FITC anti-human CD71 Antibody (clone CY1G4), FITC anti-mouse/human CD44 Antibody (clone IM7), FITC anti-human CD49d Antibody (clone 9F10), all from Biolegend, San Diego, CA, USA, and at concentrations suggested by the manufacturer) for 30 min at 4 • C in the dark and analyzed using a CyFlow Cube 8 6-channel instrument (Sysmex Partec, Münster, Germany). For cell death assessment, cells were incubated for 10 min at 4 • C in the dark with 7-AAD Viability Staining Solution (Biolegend, San Diego, CA, USA) according to the manufacturer's instructions. A minimum of 0.5 × 10 5 cellular events was recorded and data were analyzed and visualized using FCS Express 4 (DeNovo Software).

Globin Chain Analysis by RP-HPLC
The pellet of 1 × 10 6 differentiated (_L) cells was washed with PBS and resuspended in 50 µL of HPLC-grade water before two rounds of freezing/thawing. After centrifugation at 16,000× g for 10 min at 4 • C, the supernatant was transferred to HPLC vials (Altmann Analytik, Munich, Germany). An LC-20AD chromatographic system (Shimadzu, Kyoto, Kyoto, Japan) and an Aeris Widepore C18 column (Phenomenex, Torrance, CA, USA) were used to separate peptides based on their hydrophobicity and on an increasing linear gradient of acetonitrile with 0.1% trifluoroacetic acid against 0.1% trifluoroacetic acid/0.033% sodium hydroxide for elution from the column, as previously published [111]. 25-30 µL of protein extract were injected per analysis. Heme and globin chains were eluted from the column at different retention times and detected as absorbance peaks, the area of which was used to determine the relative quantities of globin chains in samples.

DNA Nanoball (DNB) Small RNA Sequencing and Analysis
After RNA extraction, small RNAs were size-selected (18-30 nucleotides) using polyacrylamide gel electrophoresis. A total of 18 small RNA libraries were constructed comprising three biological replicates for each of the following samples and time points: CD34_E, HUDEP1_E, HUDEP2_E, CD34_L, HUDEP1_L, and HUDEP2_L. Sequencing was performed using DNB sequencing technology on a BGI-SEQ500 sequencing platform gener-ating~40 × 10 6 50-bp single-end reads per sample. Sequencing data were submitted to NCBI GEO with accession ID GSE165011. Before data analysis, raw data were filtered to remove low-quality reads, adaptors, and other contaminants, and then clean reads were mapped to the reference genome and to other ncRNA databases, such as miRBase [112] and Rfam 12.0 [113] by using Bowtie 2 [114] and CMsearch programs [115]. Micro-RNA counts were normalized to transcripts per million (TPM) [116] and DE analysis was performed using DEGseq for replicate samples [47]. DE miRNAs were defined as miRNAs with FDR < 0.001 and a FC of ≥2 up or down. Of note and in disambiguation of varying usage by authors and service providers, DE of sample A vs. sample B was expressed throughout as levels for sample A relative to levels for sample B. Throughout this study, "known" miRNAs and piRNAs are defined as those included in miRBase database [112] and piRN-ABank [117], respectively, whereas, "novel" miRNAs and piRNAs are those predicted by miRDeep2 [118] and Piano [119], respectively. PCA of the normalized quantification (as TPM) of 458 miRNAs with TPM readout > 0.02 for ≥3 samples and with a relative SD < 1.0 per biological replicate trio of samples, was performed in ClustVis [120]. Singular Value Decomposition with imputation was used to calculate principal components and the two major components selected for display. Hierarchical clustering was performed using the pheatmap function [121] and was based on the intersection of miRNAs DE for all three comparisons shown per panel. For target gene prediction, the minimum free energy (MFE) threshold was set to −20 kcal/mol and the intersection of the target genes of RNAhybrid [122], miRanda [123], and Targetscan [124] was chosen as the final dataset used in downstream analyses. For a visual exploration of TF-miRNA co-regulatory networks, we used miRNet and TarBase v8.0 databases [75,82]. To identify hub genes with high connectivity in networks and to illustrate the "multiple-to-multiple" concept of miRNA action, we applied filtering based on degree centrality (cutoff 1), betweenness centrality (cutoff 1), and "shortest path" in miRNet. GO and pathway enrichment analyses were performed for DE target genes using WEGO software [125] and Reactome [54] or Wikipathways in Enrichr [88,89], respectively. To look at lncRNA genes and predicted lncRNA-miRNA interactions, we used miRcode [101].

Reverse-Transcription Quantitative PCR (RT-qPCR)
Quantitative PCR using TaqMan MicroRNA Assay (Applied Biosystems, Foster City, CA, USA) was conducted to validate the sequencing results. 2-4 × 10 6 cells were lysed in 1 mL of TriFast (Peqlab/VWR, Erlangen, Germany), and miRNAs were extracted using PureLink™ miRNA Isolation Kit (Invitrogen, Carlsbad, CA, USA), according to the manufacturer's instructions. cDNA was generated using TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, USA) and a miR-specific stem-looped RT Primer (TaqMan MicroRNA Assay, Applied Biosystems, Foster City, CA, USA) according to the manufacturer's instructions. RT-qPCR reactions were performed in 96-well plates on a QuantStudio 7 Flex Real-Time System (Applied Biosystems, Foster City, CA, USA). The reactions were prepared in a volume of 10 µL containing 0.67 µL cDNA, 5 µL TaqMan Universal PCR Master Mix (Applied Biosystems, Foster City, CA, USA), and 0.5 µL of miRspecific TaqMan MicroRNA Assay (a mix of miR-specific forward PCR Primer, miR-specific reverse PCR Primer, and miR-specific TaqMan MGB probe) (Applied Biosystems, Foster City, CA, USA). Reaction conditions were as follows: an initial enzyme activation step for 10 min at 95 • C, followed by 40 cycles of denaturation for 15 s at 95 • C and elongation for 60 s at 60 • C. All reactions were performed in triplicates and a non-template control sample was always included as a negative control. The expression of five miRNAs was analyzed and normalized against miR-93-5p. Three commonly applied housekeeping small RNAs were initially selected and evaluated as candidate internal controls: RNU48, miR-26a-5p, and miR-93-5p [126][127][128][129]. Among them, miR-93-5p exhibited the best stability (smallest CT variation) and was hence identified as the most suitable internal normalizer for our dataset. In accordance with this, miR-93-5p was the gene with the lowest coefficient of variation identified by small RNA sequencing in our study (Table S14). Exemplary analysis of efficiencies for miR-93-5p and miR-451a confirmed equal amplification efficiencies (similar slope values at −3.73 (R 2 = 0.99) and −3.55 (R 2 = 0.99), respectively, n = 3) across the entire sample dilution range. Determination of CTs was performed using the QuantStudio 7 Software v1.7.1 (Applied Biosystems, Foster City, CA, USA), and raw CT data were transformed into relative quantities by the ∆∆CT method [130].

Statistical Analysis
After data processing in Excel (Office 2013, Microsoft Corp, Redmond, WA, USA), enrichment analyses for miRNA/target-mRNA pairs (based on miRNet-validated represented and DE miRNAs for each target and the input list) were performed as hypergeometric distribution analysis in Excel, and all other statistical analyses were performed in Prism 8.0 (GraphPad Software Inc., La Jolla, CA, USA). The D'Agostino-Pearson normality test was used to determine sample distribution and the appropriate test for non-directional group comparisons. One-way ANOVA with Dunnett's or Tukey's multiple comparison test, as appropriate, was used to compare normally distributed data, for others the non-parametric Kruskal-Wallis test with Dunn's multiple comparison test. Summary statistics are given as geometric mean for fold-changes and as the arithmetic mean ± SD of the population mean for other data. To measure the strength of association between sequencing and RT-qPCR data, we determined the Pearson correlation coefficient r and its corresponding p-value.  Figure S1 Average proportions of sncRNA reads per sample type (n = 3). Table S1. Summary of detected sncRNAs in samples. Table S2. Expression levels of known miRNAs in early-and late-stage hCD34+, HUDEP-1, and HUDEP-2 samples. Table S3. Expression levels of novel miRNAs in early-and late-stage hCD34+, HUDEP-1, and HUDEP-2 samples. Supplementary Figure S2. Highly expressed, highly variable known miRNAs. Table S4. Selection of exemplary miRNAs with relevant previous characterization. Supplementary Figure S3. Plots of differential expression analysis in pairwise comparisons, indicating exemplary miRNAs with corresponding previous characterization. Table S5. Differentially expressed known miRNAs in comparisons of late-vs. early-stage hCD34+, HUDEP-1, and HUDEP-2 samples Table S6. Differentially expressed novel miRNAs in comparisons of late-vs. early-stage hCD34+, HUDEP-1, and HUDEP-2 samples. Table S7. Differentially expressed pathways in comparisons of late-vs. earlystage hCD34+, HUDEP-1, and HUDEP-2 samples. Table S8. Differentially expressed known miRNAs in comparisons of late-stage HUDEP-2 vs. hCD34+ samples. Supplementary Figure S4. Differentially expressed known miRNAs in the comparison of late-stage HUDEP-1 vs. hCD34+ samples (n = 3). Table S9. Differentially expressed known miRNAs in comparisons of late-stage HUDEP-1 vs. hCD34+ samples. Table S10. Differentially expressed pathways in comparisons of late-stage HUDEP-2 vs hCD34+ samples. Table S11. Differentially expressed known miRNAs in comparisons of late-stage HUDEP-1 vs. HUDEP-2 samples. Table S12. Differentially expressed novel miRNAs in comparisons of late-stage HUDEP-1 vs. HUDEP-2 samples. Supplementary Figure S5. Schematic representation of exemplary feedforward loop type-specific interactions. Table S13. Enrichment analysis of miRNAs targeting key erythroid transcription factors in the list of DE miRNAs in HUDEP1_L vs. HUDEP2_L samples. Supplementary Figure S6. Overrepresented gene ontology nodes based on differential miRNA expression HUDEP-1 vs HUDEP-2. Table S14. Coefficient of variation for all (known and novel) miRNAs across all samples.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.