Deciphering the Retinal Epigenome during Development, Disease and Reprogramming: Advancements, Challenges and Perspectives

Retinal neurogenesis is driven by concerted actions of transcription factors, some of which are expressed in a continuum and across several cell subtypes throughout development. While seemingly redundant, many factors diversify their regulatory outcome on gene expression, by coordinating variations in chromatin landscapes to drive divergent retinal specification programs. Recent studies have furthered the understanding of the epigenetic contribution to the progression of age-related macular degeneration, a leading cause of blindness in the elderly. The knowledge of the epigenomic mechanisms that control the acquisition and stabilization of retinal cell fates and are evoked upon damage, holds the potential for the treatment of retinal degeneration. Herein, this review presents the state-of-the-art approaches to investigate the retinal epigenome during development, disease, and reprogramming. A pipeline is then reviewed to functionally interrogate the epigenetic and transcriptional networks underlying cell fate specification, relying on a truly unbiased screening of open chromatin states. The related work proposes an inferential model to identify gene regulatory networks, features the first footprinting analysis and the first tentative, systematic query of candidate pioneer factors in the retina ever conducted in any model organism, leading to the identification of previously uncharacterized master regulators of retinal cell identity, such as the nuclear factor I, NFI. This pipeline is virtually applicable to the study of genetic programs and candidate pioneer factors in any developmental context. Finally, challenges and limitations intrinsic to the current next-generation sequencing techniques are discussed, as well as recent advances in super-resolution imaging, enabling spatio-temporal resolution of the genome.


Introduction
The eye, the primary sense organ for vision, is often seen as an extension of the brain, and a model system to study the central nervous system. Despite sharing the same neuroectodermal origin, secreted signalling factors drive the regionalization of the neural plate along the anteroposterior axis, originating the prosencephalon that comprises the telencephalon, the eye, and the diencephalon, followed by the midbrain, the hindbrain, and the spinal cord. Neuroepithelial cells originating from different subdomains can acquire distinct, progressively restricted cell identities, under the combinatorial, dosagedependent action of several transcription factors (TFs), giving rise to the precursors of the eye field [1][2][3].
Extensive work has led to the profiling of epigenetic marks during retinal development, tumorigenesis, and reprogramming, by comparing the human fetal retina at different stages of development with human and murine retinoblastoma and murine rod-derived iPSCs [83].
Chromatin states throughout retinal development can be defined by clustering of histone acetylation and methylation marks by hidden Markov models (HMM) and paired variations in gene expression [83,84]. Post-translational modifications of histones on corresponding promoter regions are predictive of gene expression and can be profiled by chromatin immunoprecipitation sequencing (ChIP-seq). ChIP-seq can identify genome-wide binding for the insulator proteins CTCF, BRD4, mediator, RNA-Pol-II, the histone marks H3K4-me1,2,3, H3K36me3, and enhancers decorated by H3K27ac and/or H3K4me1 [85,86].
As retinogenesis proceeds, de-repression of cell type-specific genes occurs and it is paralleled by progressive silencing of cell-cycle related and retinal progenitor cells related genes, such as Ascl1 [87], decorated by H3K27me3 and H3K9me3 repressive marks. HMM states and stage-specific super-enhancers marked by H3K27ac histone acetylation within TADs [61], are evolutionarily conserved between mouse and human, whereas DNA methylation regions are less conserved. Whole-genome bisulfite sequencing indicates DNA methylation and gene expression are inversely correlated but account only for a small fraction of the total number of developmentally regulated genes.

A Fine Balance between Reprogramming Capacity and Epigenetic Memory
All vertebrate retinas display a common organization and share basic cell types, although the retina has adapted to suit each species' ecological niche in response to different levels of light exposure, diversifying neural circuitry, cell subtypes, [88,89], cellular organization functional to light scattering and visual acuity [90][91][92], optical transparency [93,94] as well as subnuclear compartmentalization properties [95].
Pioneering studies on nuclear architecture led by immunocytochemistry [95], optics, and biophotonics [96] have shown that rod photoreceptors in nocturnal animals have an inverted nuclear structure, compared to diurnal animals, with specific refractive properties that enable better light propagation throughout the outer nuclear layer. In nocturnal rods, transcriptionally active euchromatin is situated at the periphery of the nucleus, while transcriptionally inaccessible and denser heterochromatin is located at its center, with an interposed layer of facultative heterochromatin. The rods of diurnal retinas, instead, possess the conventional architecture found in nearly all eukaryotic cells, with most heterochromatin situated at the nuclear periphery and active euchromatin residing toward the nuclear interior.
Later studies have relied on these foundational findings, in the attempt to elucidate differences in the nuclear architecture and cell type-specific reprogramming properties of the vertebrate retina [83,97,98]. Five different cell types in the murine retina and derivative iPSCs lines were produced in mosaic co-cultures with post-natal retinal pellets; the iPSCs lines were compared at two developmental stages for their ability to regenerate retinal organoids and to display cellular pliancy [99], that is, predisposition to undergo reprogramming by the acquisition of epigenetically active states. An inverse correlation was noticed between epigenetic memory and regenerative capacity across all derived cell types [97].
The regenerative capacity of retinal-derived iPSCs reminisces the progressive expansion of open chromatin states occurring during cancer, which is accompanied by the induction of proto-oncogenes [100,101], and it is inversely related to the epigenetic memory.
Epigenetic memory is cell-type specific; it is accompanied by persistent retention of repressive histone methylation marks and DNA methylation, and it partially depends on the sequestration of reprogramming genes to nuclear domains of facultative heterochromatin.
Conversely, full reprogramming capacity entails silencing of cell type-specific genes, resetting of poised enhancers and re-organization of heterochromatin to a pluripotent state.
In rods-derived iPSCs, that retain epigenetic memory of their cellular origin, pluripotencyrelated genes are sequestered into compartments of facultative heterochromatin and the overall heterochromatin content is higher compared to that in other cell types.
H3K9me3 marks are more resilient in rods-derived murine iPSCs where genes involved in the neurogenic process are actively repressed. Rods and bipolar-derived iPSCs have a more efficient retinal differentiation and the lowest reprogramming efficiency. In contrast, Müller glia, horizontal and amacrine cells have a lower proportion of nuclear volume sequestered into heterochromatin and are more prone to undergo complete reprogramming. Subsequent studies on purified murine rod photoreceptors have furthered these findings indicating that developmental, cancer, and stress-response genes are localized to facultative heterochromatin domains in rods [102]. Furthermore, integration of HiC analysis with H3K27ac marks and CTCF chromatin immunoprecipitating sequencing (ChIP-seq) have led to the identification of stage-specific super-enhancers, distal regulatory elements brought to the proximity of promoter regions by protruding DNA loops, and long-distance chromatin interactions across multiple TADs [98,102].
Furthermore, the epigenomic studies may be affected by experimental biases that are intrinsic to cell cultures systems and neoplastic conditions, or due to the spurious contribution of heterogeneous cell types composing the whole retinal tissue [83], where multiple cell types contribute to histone methylation and acetylation marks, enhancers elements, and long-range chromatin interactions [98].
While transcriptional and epigenetic descriptors do not necessarily address the mechanistic aspects of cell fate determination in the retina, the advent of single-cell sequencing has allowed, at least, to resolve such molecular complexity, by deconvolving the cell type specificity of regulatory elements that could not otherwise be discerned from conventional bulk sequencing [98,124].
A systematic census of TFs cisand trans-regulatory sites that are necessary and sufficient for the establishment of developmental trajectories is still missing. Nonetheless, the knowledge of the epigenomic mechanisms that drive the acquisition and stabilization of retinal cell fates holds the potential for the treatment of retinal degeneration.
An overwhelming amount of data have begun to accumulate, yet a question remains to be addressed and it is how, exactly, evolutionarily conserved, and divergent TF binding patterns coordinate epigenetic changes across retinal development.

A Truly Unbiased Screening of Open Chromatin States: From Clustering of De Novo Oligonucleotides to the Identification of TFs Binding Patterns in Retinal Cell Fate Specification
Functional interrogation of next-generation sequencing profiles can reveal TFs regulatory networks underlying retinal cell fate specification.
Previous work was led on flow-sorted retinal progenitor cells, purified from transgenic mice by virtue of the pan-retinal reporter VSX2 (Chx10-GFP-Cre) [75]. An agnostic screening of open chromatin states in early and late-stage retinal progenitor cells was paralleled by age-matched mRNA expression, inferential analysis of cisand trans-acting TFs that control temporal patterning, and footprinting profiling of TFs binding patterns, coupled with the identification of gene regulatory networks.
Interrogation of ATAC-seq and RNA-seq, paired with age-matched ChIP-seq for the top candidate TF identified the LIM-homeodomain TF LHX2 as a central regulator of chromatin accessibility across several stages of retinal development.
Hierarchical clustering of probabilistically assigned position weight matrices (PWM) led to the identification of the most functionally enriched TFs, among which homeobox domain-containing TFs, followed by C2H2-type, GC-rich, and KLF zinc finger proteins, then POU-homeodomain, MADS-box, and CCAAT-binding TFs. Individual motif instances were also found associated with NFY factors and the insulator protein CTCF, involved in chromatin looping and conformation. Among the putative assignments, many motif instances reflected positional variations of the LHX2 consensus [75]. LHX2 was the most represented TF motif by binomial scoring of known PWM both in retinal progenitor cells and stage-specific accessible regions. LHX2 occupancy was ascertained by footprinting analysis at both stages. ChIP-seq was performed for LHX2, integrated with age-matched RNA-seq and ATAC-seq profiles from flow-sorted retinal progenitor cells. The phenotypic readout on chromatin accessibility and transcriptome was re-assessed upon Lhx2 loss of function (Chx10-GFP-Cre; Lhx2 fl/fl ) in early-and late-stage retinal progenitor cells.

Lhx2
Is Required for the Regionalization of the Optic Vesicle through Cell-Autonomous Regulation of Gene Expression and for the Neuro-Retinal Suppression of Alternative Diencephalic Cell Fates LHX2 regulates the neural differentiation of human embryonic stem cells by inhibition of the BMP-WNT signaling pathway [130] and is required for the transition of early-stage retinal progenitor cells from an immature state, when they give rise to retinal ganglion cells [131], to a later stage, when they generate late-born Müller glia by enhancing the Notch signalling pathway [132,133]. Ultimately, LHX2 maintains the retinal progenitor cells' proliferative potential and neurogenic competence by preventing the generation of later cell types and an ectopic expression of genes specific to the anterodorsal hypothalamus and thalamic eminence [134].
LHX2 also defines the cortical identity of neuroepithelial progenitors, while repressing the expression of genes enriched within the cortical hem and ectopic hippocampal fields [135,136]. Additional studies indicate LHX2 has a role in radial glial cell populations that give rise to neurons and astrocytes in the hippocampus and neocortex [137,138]. Ablation of Lhx2 in neocortical radial glial progenitor cells approaching cortical neurogenesis results in a cell cycle exit, premature production of neurons and a loss of the stellate astrocytes [139].
This LIM homeodomain TF has been characterized in a plethora of developmental contexts and it is required for the eye field specification and the morphogenesis of the optic cup through cell autonomous regulation of gene expression [134,140,141]. While Lhx2 mutants are anophthalmic, display hypoplasia of the neocortex and severe anaemia, neuroretina specific loss of function of Lhx2 causes severe microphthalmia, loss of expression of a subset of retinal progenitor cell-specific genes, and ectopic expression of hypothalamic genes [134,140,142].
Ocular expression of Lhx2 has been observed as early as embryonic day e8.5; it can be detected in the retinal neuro-epithelium and retinal pigmented epithelium (RPE) at e10 and ciliary margin by e14, becoming progressively restricted to the inner nuclear layer. In the post-natal retina, the LHX2 TF is expressed in mitotic retinal progenitor cells that co-express VSX2 (CHX10) and Ki67. By post-natal day P7, LHX2 expression becomes restricted to differentiating Müller glia and is preserved in adult Müller glia, a subset of bipolar and starburst amacrine cells [132].

Lhx2 Is Expressed in Neuro-Epithelial, Bipotent Progenitors That Give Rise to Segregated, Neuroretina-like and RPE-like Domains in hiPSCs-Derived Optic Cups and Organoids
Eye development in the embryo arises from the eye field, a centrally organized domain of the neural plate, composed of neuroepithelial cells that express several TFs including PAX6, RX, LHX2, SIX3, SIX6, and it is surrounded by anterior neuroepithelial cells expressing PAX6 and SOX1. In the eye field formation, cell-fate specification of the anterior neuroectoderm into either the neuro-retina or the RPE is regulated critically by two TFs, the visual system homeobox 2 (VSX2) and the microphthalmia-associated TF (MITF), which are initially co-expressed in the bipotent progenitor cells but subsequently become restricted to the neuroretina and RPE, respectively [143][144][145][146].
The expression patterns of neuroepithelial cells can be recapitulated fairly well by the optic cups, where retinal progenitor cells begin to emerge after one week in vitro: the central region of differentiating aggregates downregulates the pluripotency markers OCT4 and SOX1 and induces the neural markers PAX6 and SIX3, followed by LHX2 and RX, and eventually SIX6, mimicking the chronology of expression of the eye field TFs seen in vivo [143,147]. Segregation of human-induced pluripotent stem cells (hiPSCs)derived optic cups into the neuroretina-like and presumptive RPE domains also occurs upon reciprocal inhibition by the eye field TFs VSX2 and MITF, with MITF preceding the expression of the neuro-retinal marker VSX2 [148]. The Neuroretina-like domain  expresses PAX6, LHX2, RX, and VSX2, whereas the peripheral, RPE-like domain expresses  MITF and PAX6 and, between the third and the fourth week in vitro, optic cup formation and invagination occurs, with the neuroretina progressively acquiring a horseshoe shape reminiscent of the inner wall of the optic cup, surrounded by the RPE [143]. Optic cups have been shown to recapitulate the process of topological organization of the retina, displaying all major retinal cell types and photosensitivity [143].
Since the advent of 3D optic cups from mouse embryonic stem cells, pioneered over a decade ago [149], the protocol has been adapted to produce human retinal organoids, for the investigation of cell non-autonomous processes and multi-factorial traits.
Human organoids are multi-layered, light-responsive iPSCs-derived structures with a phase-bright, stratified neuroepithelium that originates from an embryoid body and optic cup-like structures. Following an initial stage of the phase-bright retinal neuroepithelium, pre-organoids develop, on a second stage, into opaque structures, followed by a third stage with brush-like protrusions corresponding to photoreceptors outer segments.
The phenotypic analysis of organoids has proved that the temporal patterning of the retina can be recapitulated in vitro, supporting the notion that TFs-mediated, reciprocal inhibition drives the dichotomic cell fate specification of neuroepithelial progenitors towards the neuroretina as opposed to the RPE, through feedback regulatory loops [150]. Mitf -mutant organoids exhibit delayed proliferation during the early stages of development and downregulate the proliferation marker Ki67, but the overall growth is unaffected in the long term except for the fact the RPE develops abnormally and the neuro-retinal marker VSX2 is upregulated [151,152]. Conversely, Vsx2-mutant organoids exhibit reduced proliferation at early stages and reduced expression of the marker Ki67, are intrinsically biased towards an RPE cell fate, and upregulate Mitf.
Overexpression of Fgf9 partially rescues the mutant phenotype, and the expression of Vsx2, although proliferation remains delayed [144,148].
It has been suggested that Mitf expression may regulate the early proliferation of the neuroepithelial progenitors and that accumulation of β-catenin in the dorsal optic vesicle may specify the RPE cell fate through the canonical Wnt/β-catenin pathway, while subsequent repression of Mitf by VSX2 would be essential to induce the neuroretina and the optic cup development [148].
Single-cell RNA-seq from the intact retina has identified transcriptional signatures for several non-neuronal cells such as Müller glia, astrocytes, RPE, choroidal melanocytes, microglia, monocytes, natural killer cells, T and B cells, mast cells, pericytes, fibroblasts, and vascular endothelial cells. In developed organoids, most retinal neuronal cell types and some non-neuronal cell types are present, including rods, S cones, and L/M cones, one type of horizontal cells, all ten types of bipolar cells, and 14 of 17 types of amacrine cells typically found in the adult [72]. However, the only non-neuronal cell types that can be detected are Müller cells, astrocytes, and RPE cells. Conversely, immune, vascularassociated, and choroidal cell types are not observed in developed organoids, including pericytes, endothelial cells, melanocytes, fibroblasts, monocytes, T cells, monocytes, and microglia [72]. Overall, retinal organoids have been shown to recapitulate lamination, synaptic function, staging, cell-type diversity, and specificity of gene expression seen in the intact human retina, by single-cell RNA-seq and ATAC-seq comparative profiling, providing a potential model system to investigate retinogenesis and retinal degenerative diseases [69,72].

Lhx2 and Its Neurogenic Potential
Additional evidence indicates that early embryonic ablation of Lhx2 results in proliferative defects of glial precursors [132] while early post-natal ablation (Pdgfrα-Cre; Lhx2 fl/fl ), as well as overexpression by electroporation (pCAG-Cre; Lhx2 fl/fl ), result in loss of glial markers and dysmorphic apical structures [153]. Later in development, Lhx2 loss of function results in forms of reactive gliosis in absence of stress (RaxCreERT2; Lhx2 fl/fl ), (GlastCreERT2; Lhx2 fl/fl ) and increased susceptibility to light damage by reduced expression of neuroprotective factors [153].
While temporally controlled conditional knockouts indicate that LHX2 induces and stabilizes the retinal glial fate during the first post-natal week of development, by sustaining the expression of the HES5-mediated Notch signalling effectors [132], its function does not seem to be confined to the differentiation and homeostasis of glial precursors: Lhx2 early embryonic ablation by tamoxifen at embryonic day e10.5, analyzed at e12.5, results in the depletion of multipotent retinal progenitor cells and overproduction of retinal ganglion cells, at the expense of horizontal cells and photoreceptors (Hes1-CreERT2 ; Lhx2 fl/fl ). Although, if the ablation of Lhx2 is postponed to e15.5 and analyzed at postnatal day P0, when the peak of production for retinal ganglion cells has passed, then rods are overproduced [131]. This indicates that LHX2 may contribute to the maintenance of embryonic progenitors in a proliferative and multipotent state, by restricting their exit from the cell cycle and allowing for the generation of later cell types within critical windows [131].
Finally, overexpression of Lhx2 with its transcriptional co-activator Ldb1 triggers cell cycle arrest, overproduction of wide-field amacrine cells, and inhibits both Notch signalling and retinal gliogenesis. LHX2 drives expression of several neurogenic bHLHs, among which, Ascl1 and Neurog2 [75,156]. In the wild-type retina, RNF12, a negative regulator of LDB1, is necessary for the initiation of gliogenesis, during which the formation of the LHX2/LDB1 complex is reduced [156].
This cumulative evidence indicates that LHX2 regulates the genesis, terminal differentiation, and homeostasis of Müller glia throughout retinal development, although its function is not confined to retinal gliogenesis. Retinal cell fate determination correlates with the time and context of Lhx2 inactivation and dependence on Notch signalling and LHX2 may control the cell fate determination by deflecting and/or overriding default developmental pathways.

Lhx2 as a Transcriptional Determinant of Cell Fate Identity
Repression release of retinal cell type-specific genes occurs indeed in retinal progenitor cells upon Lhx2 loss of function (Chx10-Cre-eGFP; Lhx2 fl/fl ) revealing underlying, default developmental pathways.
Early embryonic ablation of Lhx2 in early, flow-sorted retinal progenitor cells induced repression release for genes associated with amacrine cells, retinal ganglion cells, horizontal and bipolar cells, whereas post-natal ablation of Lhx2 (pCAG-Cre-GFP; Lhx2 fl/fl ) induces over-expression of genes associated to rods, cones, and late retinal progenitor cells.
Novel PWMs, found enriched in the LHX2 ChIP-seq datasets at early and late retinal progenitor cell stages, were aggregated by average linkage by increasingly stringent similarity where the minimum correlation threshold for motifs partitioning was set to be 0.6. All major clusters and related roots were then re-assigned to age-matched retinal TFs showing expression by RNA-seq. Multiple motifs instances were found within a broad range of similarity to the known LHX2 consensus and exhibited differential occupancy, possibly underlying differences in affinity and/or combinatorial interaction with other TFs [75]. While LHX2 motifs instances identified by ChIP-seq are homogeneously distributed across retinal categories, high similarity motif instances are slightly prevalent in promoters of retinal ganglion cells specific genes at embryonic stages. Instead, motifs that diverge from consensus are prevalently located in promoters of genes that are enriched in Müller glia and cones. At post-natal stages, high similarity LHX2 motifs instances are slightly prevalent in promoters of rods-associated genes whereas low similarity motifs are slightly prevalent in promoters of retinal ganglion cells and early retinal progenitor cells-associated genes. Low similarity instances identified by ChIP-seq seem underrepresented across all retinal categories at post-natal stages compared to embryonic stages, suggesting a broader diversification of LHX2 binding sites repertoires occurring at early stages. The empirical distribution of LHX2 motifs diverging from consensus across retinal promoters seems to reflect transcriptional dependence of these genes on Lhx2 expression, with retinal ganglion cells and bipolar cells prevalently de-repressed at embryonic stages and rods derepressed at post-natal stages, following Lhx2 loss of function.

Chromatin Regulators in Retinal Development
Multiple TFs involved in retinal development have been characterized. Some affect the di-and tri-methylation of histone H3K4, which results in gene activation and is primarily mediated by the family of histone methyltransferases MLL1. Conditional ablation of Mll1 results in disruption of progenitor cells proliferation, cell-type composition, and neuronglia balance [157].
Other relevant TFs compose the histone methyltransferase polycomb repressive complex 2 (PRC2) which is known to regulate the maintenance and differentiation of retinal progenitor cells [158,159] by mediating the deposition of the histone repressive mark H3K27me3. Methylation occurs via recruitment of the H3K27 trimethylase EZH2 [117,160], whereas H3K27me3 demethylation is mediated by JMJD3 [161]. H3K27me3 is progressively removed as retinal development proceeds and is paralleled by enhancers'regulation on cell type-specific genes [83].
In the frog retina, expression of the PRC2 subunits is driven by Wnt/β-catenin signalling via the receptor Frizzled 5 and is independent of SOX2. Functional inactivation of the PRC2 core subunits Ezh2 affects retinal proliferation in frog and mouse, promoting Müller glia cells formation at the expense of neuronal cell types [158,159]. Despite the enrichment of Ezh2 in retinal ganglion cells, Atoh7-driven deletion of Ezh2 (Math5-Cre) does not seem to elicit any defect in retinal ganglion cells development, survival, or homeostasis [162].
Nonetheless, conditional ablation of Ezh2 (Pax6-αCre, Six3-Cre) depletes postnatal retinal progenitors, disrupts retinal lamination, enhances differentiation of photoreceptors and Müller glia, and induces glial reactivity [117]. Embryonic deletion of Ezh2 in retinal progenitor cells results in photoreceptor degeneration, by de-repression of the transcriptional targets Six1 and Eya2 [163]. Furthermore, the TF CASZ1 cooperatively interacts with members of the PRC complex to repress Lamin A which is necessary to preserve an inverted nuclear structure in nocturnal rods. Consistently, CASZ1 expands and centralizes the heterochromatin in fibroblasts [164], suggesting that the PRC complex and its cognate factors may have a role in the rods' genome organization. Finally, inactivation of the PRC2 subunit Eed leads to post-natal depletion of multipotent progenitors, a reduced production of Müller glia, bipolar, and rod photoreceptor cells, and overproduction of amacrine cells.
Notably, the former gliogenic phenotype seen with the inactivation of Ezh2 phenocopies Lhx2 postnatal ablation, coupled with overexpression of the intracellular Notch domain [133]. Glial reactivity is also observed upon Lhx2 ablation in differentiated Müller glia [153]. The latter phenotype with overproduction of amacrine cells upon Ead inactivation, instead, phenocopies Lhx2 loss of function coupled with overexpression of SoxB1-SoxC family members [133].
Besides the PRC2 complex, alteration of the SWI/SNF chromatin remodelling complex is also known to affect chromatin organization on a compartment level [165,166]. The SWI/SNF chromatin remodelling ATPase BRM promotes Notch inhibition, cell cycle exit, and BRN3B-mediated retinal ganglion cell production [167].
The SWI/SNF complex also affects retinal progenitors' proliferation and photoreceptors' differentiation through the core component BRG1, which regulates enhancers involved in cell-lineage specification [168,169]. Stage-specific and lineage-specific enhancer elements associate with non-coding RNAs [170], and some have been individually validated for the transcriptional regulation of the TF Vsx2, Prdm1 [171], and Otx2 [172], whose expression can be driven by ASCL1 or, alternatively, by NEUROG2 [173].
As the advent of CRISPR-based genome editing has enabled ex vivo perturbation of regulatory elements and functional validation of enhancers, more and more cisand trans-regulatory elements are now being surveyed and assessed.

Leveraging Next-Generation Sequencing towards a Deeper Understanding of Gene Regulatory Networks and Hierarchies
Pioneer factors are known to target partial DNA motifs on nucleosomes to initiate reprogramming [174][175][176].
Some of these factors were identified by process of elimination, as necessary and sufficient to mediate somatic cells reprogramming to embryonic stem cells-like pluripotency, when ectopically induced [177,178].
Starting with a minimum of two developmental time points, a progressive time-course acquisition of chromatin accessibility can be monitored by ATAC-seq around the motifs center, for any given TF, and binding sites can be validated by ChIP-seq.
TFs can be ranked by their likelihood of chromatin opening, scored by protein interaction quantitation (PIQs) [179]. Signs of competition for nucleosome occupancy, or displacement, also emerge from the positive overlay between ATAC-seq nucleosomecentered regions and ChIP-seq for the TF of interest, where ChIP-seq binding sites can be compiled across genome-wide heatmaps of k-means clustered open chromatin regions ( Figure 1) (p. 12).
Footprinting enables to detect active TF recruitment across open chromatin regions, where bindings sites can be cross-referenced against deposited, known PWMs, or de novo, empirically detected oligonucleotides by ChIP-seq [180][181][182]. Given the multitude of causative loci identified in complex diseases and their variable penetrance, the identification of TF binding sites can eventually lead to post hoc prioritization of genomic variants in non-coding regulatory regions from genome-wide association studies [183,184], to devise customized, genome editing-based therapeutics for visual restoration.
Comparison of ChIP-seq binding sites with footprints lost upon conditional knock-out for the TF of interest allows unequivocal assignment of binding sites, which is useful in case of consensus similarity across TF family members, due to sequence homology for DNA binding domains ( Figure 1) (p. 12).
Furthermore, footprinting analysis of predicted pioneer factors can reflect their developmentally regulated reliance on the TF of interest, by comparing control with TF knock-out conditions [75].
It was noticed that 40% of LHX2 ChIP-seq targets at embryonic stages and 8% of postnatal targets lose accessibility upon Lhx2 ablation. Furthermore, LHX2 affected chromatin accessibility genome-wide, regardless of its binding patterns, resulting in 48% loss at embryonic stages and 22% at post-natal ones.
Because physical juxtaposition of LHX2 binding sites with nucleosome-centered regions (4% at embryonic stages and 3% at post-natal stages) ( Figure 2) cannot explain the global loss in chromatin accessibility following Lhx2 loss of function, it was hypothesized that secondary regulation of LHX2-dependent TFs with high pioneer potential may occur and amplify the phenomenon. Hence, the widespread decrease in chromatin accessibility would result from transcriptional de-regulation of pioneer factors by LHX2 and/or from the loss of steric interaction of such factors with LHX2 [75] (Figures 3 and 4) (p. 14).  Enhancer regulatory elements, decorated by H3K27ac marks, bound by LHX2, and transcriptionally accessible by ATAC-seq were identified for several retinal progenitor cell-associated genes such as Sox2, Sox9, and Vsx2 (Chx10) ( Figure 5) (p. 16) [75], and later replicated [98].  . Transcriptional dependence tested (simplified schematic). TFs with high pioneer potential are among LHX2 genetic targets, based on age-matched RNA-seq results from conditional Lhx2 knock outs (cKO). LHX2 coordinates paired variations in TFs gene expression (RNA-seq) and in chromatin accessibility (fold change [FC] by ATAC-seq) at bound regulatory loci (ChIP-seq) within extended regulatory promoter domains, resulting in epistatic and transcriptional de-regulation of genetic targets including other TFs with pioneer potential. The nearest promoters, LHX2 ChIP-seq peaks were assigned to, were defined by extending TSS regions 5000 bp upstream, 1000 bp downstream for up to 1,000,000 bp max extension. Enhancers were defined by H3K27ac coverage. Adapted from [75] (CC BY 4.0).

Figure 5.
Inferential module of transcriptional regulation for candidate pioneer factor Sox9. UCSC custom tracks were configured onto the mm10 mouse genome. A probabilistic, regulatory module of LHX2-coordinated variations in chromatin accessibility can be inferred for any given genetic target from a time-course analysis of differentially accessible open chromatin regions. ChIP-seq for any TF of interest can be integrated with enhancer elements decorated by H3K27ac, with RNA-seq, ATAC-seq and footprinting analysis (FTPs) in control and TF knock out conditions (cKO). As for Figure 4, ChIP-seq-bound targets that overlay with coordinately accessible chromatin loci (altered accessibility upon age-matched TF loss of function) are probabilistically assigned to the nearest gene displaying age-matched, statistically significant variations in gene expression upon TF loss of function. Probabilistic assignment of regulatory loci is allowed to any nearest gene located within extended curated regulatory domains that are defined by extending TSS regions 5000 bp upstream, 1000 bp downstream for up to 1,000,000 bp maximum. This pipeline is virtually applicable to identify genetic regulatory networks and transcriptional hierarchies for any TF of interest in any developmental context. Adapted from [75,185] (CC BY 4.0). SOX9 is among LHX2 genetic and steric targets with predicted pioneer potential. For explicative purposes, blue and red vertical shadows are drawn across regional reads pileups that are developmentally regulated over time. Time-course analysis encompasses ATAC-seq and age-matched RNA-seq at embryonic day E10, E12, E14, E16, E18, P0, and P2 for flow-sorted retinal progenitor cells and post-mitotic fractions (Chx10-GFP-Cre). Lhx2 targets are ascertained by comparing LHX2 ChIP-seq readout with ATAC-seq and RNA-seq from age-matched control and conditional knock-out at E14 and P2. Representative LHX2 peaks at P5 and P14 are also displayed. SOX2 plays a role in the embryonic and adult central nervous system, where its depletion impacts both neural progenitor cells maintenance and neurogenesis [186][187][188][189][190]. SOX2 is a dose-dependent regulator of retinal neural progenitor competence [191,192], it preserves retinal progenitor cells post-natal pluripotency [154] and it is involved in amacrine and Müller Glia formation from murine retinal progenitor cells [193]. Postnatally, SOX2 is also essential to preserve Müller glia precursors' radial morphology and cell cycle quiescence [154]. SOX2 and SOX9 are expressed in retinal progenitor cells postnatally and in Müller glia, accounting for 2% of the retinal population [194].
The HMG-box TF SOX9 is essential for the differentiation and survival of postnatal Müller glial cells [195], it regulates genes involved in the visual cycle and the timing of RPE maturation [196,197], it is involved in the proliferation and differentiation in corneal epithelial stem cells [198], it mediates neurogenic-to-gliogenic fate transitions in the cerebellum [199] and astrogliogenesis along the dorsal-ventral axis of the spinal cord [200].
Here, gene regulatory networks are inferred by pairing ATAC-seq peaks of variable accessibility to all nearest genes displaying age-matched statistically significant expression variations, or variations in experimentally matched knock-out conditions ( Figure 5) (pp. 17, 18) [75].
Subsequent work from the same group [124] has heavily relied on this first inferential model [75], when constructing gene regulatory networks. The original work does not set a stringent limit to pair peak-to-transcription start sites (TSS), since it enables probabilistic assignment of a given peak up to 5000 bp upstream the TSS, 1000 bp downstream for up to 1,000,000 bp maximum extension. Moreover, it incorporates H3K27ac marks since enhancers can span across megabases of distance from the TSS. A lack of integration with enhancer marks and the overall stringency in peak-to-TSS pairing adopted in the subsequent work, that is upstream 100 kb or downstream 100 kb of the intergenic peaks [124], may lead to proximity-related biases and false-negative assignments. In the subsequent work, prediction of cell type-specific TFs binding in cis-regulatory elements [124] also relies on the pairing of TF binding sites based on footprinting scores with corresponding stage-matched TF expression, as proposed in the original work [75].
Last, but not least, what is here referred to as "feedback TF pairs" [124] arises from the fundamental observation described in the original work [75] that TFs can exhibit mutually epistatic and/or steric dependencies and that integration of ChIP-seq, with ATAC-seq and RNA-seq in conditional knock out conditions for any given TF, can address that.
Such regulatory dependencies can be inferred by a) assigning ATAC-seq peaks exhibiting a given TF binding signature (PWM, ChIP-seq binding site, ATAC-seq footprinting) and variable accessibility, to all nearest genes displaying RNA-seq expression variations in experimentally matched knock out conditions for a given TF b) pairing all co-occurring TF binding signatures (PWM, ChIP-seq binding site, ATAC-seq footprints) with footprinting variations in experimentally matched knock out conditions for a given TF.
Simply put, in a positive feedback scenario, for any overexpressed TF1 that positively regulates a given TF2, there would be co-occurrent accessibility of TF1 binding sites, induced gene expression for TF2, and positive footprinting metrics for both TF1 and TF2. When TF1 is downregulated, accessibility of TF1 binding sites would be reduced, expression for TF2 would be downregulated and footprinting signatures for both TF1 and TF2 would be reduced or lost.
It is, therefore, not clear how this work [124] advances the state of the art on gene regulatory networks besides stratifying cellular heterogeneity, and how exactly the authors intend to leverage the new single-cell inventories of cisand trans-acting factors to guide cell-based therapies in retinal degeneration.

Footprinting Analysis and Competition for Nucleosome Occupancy by Predicted Pioneer Factors Identify Steric Dependencies and Transcriptional Hierarchies in Cell Fate Specification
Considering that binding sites for TFs with known and putative pioneer potential are among LHX2 ChIP-seq motifs co-occurrences, steric dependence on LHX2 was tested by footprinting analysis and competition for nucleosome occupancy. Analysis following Lhx2 loss of function has shown that LHX2 may stabilize top candidate pioneer factors prior to and during recruitment at chromatin loci across murine retinal development ( Figure 6) (page 19). Figure 6. Footprinting analysis and competition for nucleosome occupancy by predicted pioneer factors reflect their developmentally regulated dependence on LHX2. Comparative analysis of ATAC-seq footprinting and nucleosome occupancy in controls and Lhx2 knock-out conditions (cKO) (Chx10-GFP-Cre; Lhx2 lox/lox ) identifies TFs whose genome-wide binding and competition for nucleosome occupancy is sterically affected by LHX2. Steric dependence on LHX2 is first assessed by comparing TFs footprinting metrics in control and Lhx2 knock-out conditions (absolute counts and reads distribution over 200 bp from motif centers). TFs whose binding associates to progressive, developmentally regulated chromatin opening (positive pioneer PIQ index), hence candidate pioneer factors, are then screened for signs of nucleosome displacement. A schematic is finally derived from genome-wide histograms of ATAC-seq reads, compiled across TF motif centers for candidate pioneer factors. Nucleosomes distribution, or occupancy, is an inferred, specular representation of ATAC-seq reads across motif centers. The lower the distribution of ATAC-seq reads around motif centers is, the higher chromatin compaction is expected to be, since ATAC-seq only profiles, in principle, open chromatin regions. Nucleosome dips of 150 bp circa arise from the depletion of the ATAC-seq signal around motif center because nucleosomes bound chromatin cannot be transposed, resulting in a "train rack" heatmap effect, with visible flanking regions and a central dip in reads coverage, as displayed in Figure 1, bottom panel. A TF with no pioneer potential would result in a relatively homogenous reads distribution around motif centers and a locally limited effect on chromatin accessibility upon loss of function. Conversely, loss of function or destabilization of candidate pioneer factors would result mostly in an altered nucleosome dip (i.e., ASCL1) and higher nucleosome occupancy, hence chromatin compaction, as in [100]. Competition for nucleosome occupancy by candidate pioneer factors is affected by Lhx2 expression and/or interaction with LHX2 and it is developmentally regulated. Adapted from [75] (CC BY 4.0).
Among the computed pioneer factors in retinal progenitor cells, the pro-neural transcription ASCL1 was identified and predicted to mediate chromatin opening by direct competition with nucleosome binding [75].
ASCL1 is a neurogenic bHLH TF expressed in a subpopulation of retinal progenitor cells [87,201], it is known to increase in the fish retina upon injury [202] and its overexpression mediates neuronal differentiation of retinal progenitor cells in dissociated cultured of mouse Müller Glia [194]. While ASCL1-mediated regenerative response seems to be limited to younger mice [194], the addition of an HDAC inhibitor restores the regenerative potential in adult mice in vivo following NMDA-mediated damage [203]. Ultimately, the model postulated here based on a totally agnostic screening reconciles with these findings [203], indicating that LHX2, and negative regulators of the HDAC complex, may stabilize ASCL1 and enable chromatin opening.
Among all candidate regulators of chromatin accessibility in the retina, SOX2 is predicted to stabilize chromatin opening and ablation of Lhx2 results in destabilization of SOX2 and an increase in nucleosome occupancy at early embryonic and post-natal stages [75]. Later work has shown that downregulation of Sox2 in rod photoreceptors is accompanied by deposition of the repressive histone mark H3K27me3 on specific loci [98], consistent with the above findings [75]. Furthermore, KLF4 is predicted to stabilize chromatin compaction early in development while mediating chromatin opening at post-natal stages [75]. KLF4, similar to SOX2, is a pluripotency factor, as it was shown to mediate somatic reprogramming into induced pluripotent stem cells [177,[204][205][206]. Klf4 repression release occurs upon downregulation of miRNA let7, which is necessary to mediate a regenerative response in the retina upon injury [194]. Klf4 is also expressed in Müller glia and is induced in the chick and fish retina upon injury [207][208][209]. Klf4 is expressed in retinal ganglion cells that are primarily affected in glaucoma and optic neuropathies. Klf4 expression in retinal progenitor cells and post-mitotic precursors does not seem to affect retinal ganglion cells neurogenesis, suggesting that underlying compensatory mechanisms may exist, mediated by functionally redundant Kruppel-like family members. Nonetheless, Klf4 overexpression does decrease neurite outgrowth, in hippocampal and cortical neurons. Conversely, conditional ablation of Klf4 (Chx10-Cre; Klf4 fl/fl ) results in increased axon growth both in vitro and in vivo, after optic nerve injury, suggesting KLF4 may repress axon regeneration related pathways in retinal ganglion cells upon injury [210,211].
LHX2 binds within modules of coordinately accessible chromatin in retinal progenitor cells targeting promoters and non-coding elements in nucleosome-free regions associated with H3K27ac active enhancers marks. These modules are functionally enriched for genes involved in axonal dystrophy, among other identified functions at post-natal murine stages. LHX2 is known to affect thalamocortical axon guidance by regulating Robo1 and Robo2 receptors [212]. In Robo1, Robo2 mutants, mimicking retinopathy of prematurity, retinal ganglion cell axon guidance is also disrupted, and they display abnormal retinal vascularization, due to the lack of directional information to migrating astrocytes, [213].
Notwithstanding, targeted overexpression of Klf4 in the retina of neonatal rats is sufficient to reprogram the identity of late retinal progenitor cells towards retinal ganglion cells, by partially reactivating relevant transcriptional networks outside their developmental window [214].
This cumulative evidence suggests that the interplay between LHX2 and KLF4 may control the onset of retinal ganglion cells by instructing retinal progenitor cells competence within restricted temporal windows. LHX2-KLF4 co-occurring binding patterns may potentially reflect cis-and trans-regulatory determinants of retinal ganglion cells' identity and await functional testing.
Some of these findings have been furthered by subsequent studies from other groups, showing that primary enhancers and super-enhancers have a synergistic role in sustaining and grading Atoh7 expression during critical windows for retinal ganglion cell neurogenesis and they account for species-specific differences in developmental plasticity between mice and humans. Redundant deployment of multiple regulatory elements in mice may provide some buffering capacity, by limiting drastic fluctuations in Atoh7 expression which would otherwise result in retinal ganglion cell loss, a threshold effect seen both in mice, with the Atoh7 mutants and in humans, with congenital blindness due to optic nerve aplasia [219]. Similar threshold effects have been observed for Lhx2 in gliogenesis [75,132,156]. Subsequent work has shown that MEIS1, identified in retinal progenitor cells [75], is predictive of efficient retinogenesis in stem cells models [97] and that MEIS1 and MEIS2 function redundantly to promote the expression of retinal progenitor cell-specific genes, through cooperative interaction with LHX2 and other retinal progenitor cells specific TFs [220].
Among the identified, candidate pioneer factors, this work also shows the first evidence that NF-I may play a role in murine retinal progenitor maintenance and identifies NF-I TFs as candidate regulators of temporal patterning in the developing retina, proposing the mechanism by which they regulate changes in retinal progenitor competence, stabilizing chromatin compaction at embryonic stages while mediating chromatin unfolding at post-natal stages [75].
Thereafter, three derivative studies from the same group have focused on the NF-I regulatory role in retinal progenitor cells, by showing that: (a) NF-I TFs are selectively expressed in late retinal progenitor cells where they control cell-cycle exit, they regulate the generation of late-born retinal cell types, such as bipolar interneurons and Müller glia [127]; (b) disruption of NF-I TFs, which maintain and restore quiescence, induces Müller glia to proliferate and generate neurons in adult mice after injury [221]; and (c) NF-I TFs induce increased accessibility at regulatory sites associated with genes expressed in late-stage retinal progenitor cells and Müller glia at post-natal stages, while NFI loss of function may produce the opposite effect [124]. Despite the fact Nfib is expressed by late embryonic stages [127], stage-specific NF-I binding by embryonically staged ChIP-seq is not being assessed. Considering footprinting does not necessarily recall all the ChIP-seq binding sites, as it is led on ATAC-seq-capturing regions of open chromatin, some of the late retinal progenitor cells related NF-I targets are likely to remain undetected, limiting the informativity of the study [124].
NF-I TFs are known to control gliogenesis in the cortex and in the spinal cord [200,[222][223][224]] and a deeper understanding of the underlying regulatory networks holds promise for possible regenerative therapies.
To summarize, epigenomic profiling of retinal progenitors identifies LHX2 as a central regulator of chromatin accessibility in retinal progenitor cells across several developmental time points and indicates LHX2 controls the timing of retinal neurogenesis by co-opted action of previously uncharacterized, developmentally regulated TFs with high pioneer potential [75].
The knowledge of these mechanisms can be leveraged for the investigation of evolutionarily conserved and divergent mechanisms in humans, paving the way for the restoration of the visual function upon injury.

Target Identification and Repurposing of Epigenetically Relevant Pharmacological Compounds via Drug Delivery Nanosystems
Open chromatin profiling can identify pre-symptomatic configurations of the genome leading to disease progression [100]; variations in chromatin accessibility are predictive of retinal developmental stages [75,119,128,225], as well as disease aggravation [226].
Age-related macular degeneration (AMD) is the leading cause of severe vision loss and blindness affecting the elderly in the developed world and accounts for one-third of all cases of untreatable vision loss. Globally, of the 7.33 billion people alive in 2015, an estimated 36 million were blind, 216 million people had moderate to severe visual impairment, and 188 million had mild visual impairment [227]. AMD accounts for 8.7% of all cases of blindness worldwide [228], has a global prevalence of 170 million individuals and the disease is predicted to affect 288 million individuals by 2040, due to an increased life expectancy.
AMD is a multifactorial, progressive disease that impacts the macular, central region of the retina responsible for visual acuity, affecting the photoreceptors and RPE: it causes destructive and irreversible changes in sharp-sightedness. Asymptomatic in its early stages, some patients experience acute vision loss, blurred vision, scotomas, metamorphopsia, and chronic visual distortion. Different therapeutic approaches are currently available for wet, exudative AMD associated with choroidal neovascularization and for dry, non-exudative AMD, associated with pigmentary abnormalities and deposits (drusen), but no treatment can permanently repair and restore the cellular damage.
The RPE is among the hallmarks of dry AMD accounting for 90% of all cases for which no curative treatment is available prompting the need to develop stem cell therapies to replenish the tissue lost and restore the visual function.
The RPE is involved in light absorption, ions buffering, metabolites recycling, photoreceptors outer segments phagocytosis, secretion, and immune modulation. Inflammatory pathways, activation of the complement system, hypoxia, alterations of the RPE tight junctions involved in the retina-blood barrier, cholesterol metabolism, and dysfunctional rates of photoreceptors outer segments clearance by autophagy or phagocytosis, are all contributing factors to drusen accumulation and to the pathogenesis of AMD [229][230][231][232][233].
The first comparative profiling of macular and peripheral retinal tissue from healthy donors and AMD patients has furthered the understanding of the epigenetic contribution to the progression of age-related macular degeneration, indicating that a widespread decrease in chromatin accessibility occurs during disease progression, and the RPE is affected first [226]. Gene ontologies enriched in differentially accessible chromatin were found devoted to inflammatory pathways, metabolism, and homeostasis. It was later shown, by bisulfite pyrosequencing, that differential DNA methylation occurs in AMD affecting a proto-oncogene involved in TGF beta signalling (SKI), and transcription-dependent DNA repair mechanisms (GTF2H4) [234]. Following this first evidence by footprinting analysis showing that TFs binding is directly affected in human photoreceptors and RPE during macular degeneration [226], subsequent studies have investigated the impact of genome-wide associated variants and possible single nucleotide polymorphisms on TF mediated regulatory networks and binding sites in the human retina [70,235]. Hence, the candidate cis-regulatory sites identified so far may drive the prioritization of non-coding cis-regulatory elements in the aetiology of retinal degeneration [184].
Loss of chromatin accessibility correlates with disease severity and exacerbates as disease progresses in an isogenic background [226].
Inflammatory, oxidative stress paradigms in iPSC-derived RPE, as well as overexpression of the histone deacetylase HDAC11 can recapitulate the reduction in chromatin accessibility observed in AMD, consistent with previous observations that: (a) the temporal progression of photoreceptors cell death is accompanied by specific enzymatic activities [236,237]; (b) degenerating photoreceptors trigger an epigenetic program involving de novo DNA methylation and a simultaneously increased HDAC activity, to possibly minimize transcription, protein biosynthesis, hence energy expenditure, in face of a reduced metabolic supply during cell death [238][239][240][241]. Hence, the identification of early events in the pathogenesis of AMD may help devise preventive, therapeutic strategies.
Besides AMD, epigenetic changes have been explored in retinitis pigmentosa, glaucoma, and diabetic retinopathy [242][243][244][245][246]. DNA hypermethylation of genes involved in cell death and survival, cell morphology, and nervous system development has been described in mouse models of retinitis pigmentosa and it is thought to be primarily mediated by the DNA methyltransferase DNMT3a, affecting the expression levels of the genes encoding the TFs Yy1, E2f3, and Nrl [241]. DNA hypermethylation can be paralleled by histone hypoacetylation, hypermethylation and poly-ADP-ribosylation [241,247,248]. Histone-deacetylases inhibitors can mediate neuroprotection by reducing photoreceptors cell death in some models of retinitis pigmentosa, although functional discrepancies have been reported [249][250][251] and pharmacological inhibition of DNA methylation can rescue photoreceptors cell death in organotypic retinal explants. This evidence suggests that inhibition of DNA methylation and, more broadly, manipulation of epigenetic modifiers by the repurposing of currently available pharmacological compounds, could be leveraged for treatment of genetically heterogeneous forms of retinal degeneration.
Nanoscale exosomes can be vehicles for several therapeutic cargos, from small RNAs such as siRNAs and miRNAs, to proteins and neurotrophic molecules, decoy receptors and pharmacological agonists [252].
Lipidic nanocarriers and viral-mediated delivery strategies have also been implemented over time to improve the overall bioavailability, stability, and half-life of the pharmacological compounds in the posterior segment. This has allowed to circumvent the necessity for frequent, topical treatment or intravitreal injections, due to a precocious degradation and clearance of therapeutic molecules [253,254].
Among the viable drug delivery routes utilized to suppress VEGF-mediated angiogenesis, lipidic nanocarriers have been engineered to silence the expression of the human antigen R (HuR/Elavl1), encoding an RNA-binding protein known to be involved in diabetic retinopathy, suggesting a similar approach could be leveraged for the treatment of wet age-related macular degeneration and diabetic macular oedema [255]. Ocular angiogenesis has also been tackled by mimicking pharmacologic VEGF inhibition through lentiviral and adenoviral delivery of siRNA and miRNA, or by inducing the expression of the anti-angiogenic PEDF [256,257].
Current considerations for viral vector design and eligibility for pre-clinical trials include retinal tropism, choice of serotype, DNA packaging capacity and transduction efficiency in the cell type of interest, persistence of gene expression, risk of host genome integration, reduced immunogenicity, and off-target effects [253,258,259].
More recently, the advent of CRISPR-mediated gene therapy and genome editing has enabled to tentatively expand the portfolio of therapeutic approaches from monogenic to multi-factorial ocular diseases, to address aberrant gene expression and pathogenic, noncoding variants involved in choroidal neovascularization, altered vascular permeability, lipid homeostasis, neuronal apoptosis or detoxification of reactive oxygen species, and to leverage neuroprotective strategies for the restoration of the visual function [260,261].

Expanding the Therapeutic Portfolio from Gene Editing to Genome Editing towards Personalized Medicine
Several gene editing strategies have been proposed to restore the visual function, depending on the clinical presentation. While gene therapy by subretinal injection is suitable to restore deficient or haploinsufficient copies of a gene and is the elective treatment for monogenic, genetically inherited, recessive forms of blindness, such as Leber congenital amaurosis [262], alternative, corrective strategies are necessary to treat dominant forms of blindness, such as inherited forms of macular degeneration, dominant forms of retinitis pigmentosa and glaucoma. In such case, a suppression and replacement therapy or direct gene editing have been attempted by lentiviral or adeno-associated viral delivery (AAV) of CRISPR-Cas9 [263][264][265][266][267], where a customizable Split-Cas9 system may enable reversible genome engineering [268] and cell type-specific delivery with tissue-specific promoters [269].
With regard to non-coding variants, enhancer de-regulation has been reported, for instance, in autosomal recessive congenital retinal nonattachment, where ablation of a non-coding enhancer regulatory region 20 kb upstream the proneural bHLH TF ATOH7 gene [270] leads to retinal ganglion cells degeneration and optic nerve atrophy [271].
As mentioned before, ablation of the murine ortholog fails to recapitulate the pathogenic phenotype seen in humans, suggesting that lack of functional equivalence between species may be due to the compensatory deployment of redundant regulatory elements in mice [219].
Mutations of non-coding regulatory elements nearby the PAX6 gene result in haploinsufficiency due to enhancers de-regulation and have been reported in aniridia and congenital cataract [272,273]. Mutations of the enhancer located 150 kb downstream from PAX6 disrupt the autoregulatory loop controlling PAX6 expression in the developing ocular structures.
Another instance of enhancer deregulation emerges from point mutations of CRX binding sites that are proximal to the SAMD7 gene and are necessary to mediate its expression in photoreceptors. Mutations of CRX binding sites result in SAMD7 transcriptional deregulation and retinitis pigmentosa [274]. Additional non-coding regulatory elements have been linked to retinitis pigmentosa, and putatively assigned to EYS, LCA5 and PCDH15, NMAT1, ABCA4, OFD1, CEP290, USH2A, and PROM1, while other mutations have been identified in the seed region of microRNA-204 and within an open chromatin region located upstream of PRDM13 [274].
Cryptic enhancers and consequent rewiring of enhancer-promoter interactions can emerge, instead, from the deregulation of TADs, due to some structural variations detected in autosomal dominant forms of retinitis pigmentosa. Structural variations were modelled in iPSCs and retinal organoids and assayed by HiC, resulting in ectopic contacts between retinal enhancers and GDPD1, a gene involved in lipid metabolism, and loss of regulatory insultation for YPEL2 [275].
Advancements in genome-editing technologies may enable to survey all candidate regulatory elements that segregate with disease. Functional dissection of risk-associated alleles can be carried out in retinal organoids and human iPSCs paired with isogenic controls, which allows the comparison of chromatin dysregulation in a virtually identical genetic background.
The translational potential of epigenetics ultimately lies within the functional dissection of non-coding regulatory variants of the genome, which can then lead to the prioritization of such loci for cell type-specific gene therapy and for personalized, genome editing-based medicine.
First, classification of developmentally regulated and disease-related regulatory elements can be leveraged for construct design, to ensure that cell-type specificity of transgene expression is attained when combined with viral vectors exhibiting broad tropism, as to minimize off-target effects of gene therapy.
Second, by leveraging all known cisand trans-regulatory sites, CRISPR-Cas9 can be repurposed for simultaneous, reversible gene activation or silencing by conjugating the catalytically inactive caspase9 dCas9 to an effector protein [276][277][278]. A short guide RNA (sgRNA) with a protospacer adjacent motif (PAM) complementary to the targeted genomic sites can then direct the dCas9-KRAB-Effector transcriptional suppresser [279] or the dCas9-SAM synergistic activation mediator [280], to specific genetic loci of interest.
Third, for complex, polymorphic traits, the variability in the phenotypic penetrance of non-coding variants can complicate the diagnosis and the aetiological identification of the causative loci and related genes.
Whenever the genetic burden for some single nucleotide polymorphisms (SNPs) and rare variants cannot be clearly assessed, then several epigenomic features could be lever-aged. Overlaying epigenomic features with SNPs may help prioritize specific regulatory loci to ultimately devise CRISPR-Cas9 or CRISPR-Cpf1 genome editing-based strategies [281] for restoration of gene expression. Ultimately, the epigenomic readout and genome-editing strategies could be translated into clinical practice, by orienting patient-tailored therapies in precision medicine.

Agnostic Approaches to Restore the Visual Function, from Prosthetics to Autologous Cell Replacement
Among the agnostic approaches that do not require any prior knowledge of the causative gene or regulatory loci instead, the advent of optogenetic strategies has enabled to restore light sensitivity by delivery of genetically encoded opsins, re-sensitization of photoreceptors, and sensitization of bipolar and retinal ganglion cells, although with limited spectral sensitivity, the only caveat being the reliance on functional, residual cells to relay the signal for visual processing. Potential treatment by optogenetics has been successfully attempted for the treatment of retinitis pigmentosa [282].
Likewise, bionic prosthesis relying on residual cells has been used to augment synaptic transmission, where phosphenes enable the perception of directional movements and shapes with limited resolution [283].
Replacement therapies have been leveraged for treatment of dry age-related macular degeneration, relying on stem cells reprogramming ex vivo, and subsequent autograft that is amenable for RPE transplantation [284] but has limited applicability to retinal neurons, due to the lack of proper integration into the retinal visual circuitry [285].
Furthermore, while iPSCs-derived organoids have proved to recapitulate well the lamination, cell-type diversity, light responses, and single-cell transcriptional profiles of the whole human retina [72], they lack functional foveal-to-peripheral regionalization and representation of vascular-associated, choroidal, and immune cell types, potentially limiting oxygenation and impacting oxidative phosphorylation.
This limits the usage of organoids as potential model systems or, at least, it should warrant some caution when modelling regionalized forms of retinal degeneration, such as AMD that affects primarily the macula, or autoimmune-related disorders associated with ocular manifestations, neovascularization, and an altered retinal permeability.
Finally, reprogramming strategies in vivo have allowed to study the molecular mechanisms underpinning retinal regeneration, or lack thereof, in different model organisms, shedding light on evolutionarily conserved and divergent mechanisms in cell responses upon injury.

Chasing the Secret to Youth: Is the Evolutionary Conservation of the Regulatory Elements in the Genome Truly Key to Unlock the Regenerative Potential of the Retina?
The regenerative potential of the retina varies considerably across species, developmental time frames, and cell types utilized as elective sources for reprogramming. Neurogenic progenitors can be regenerated upon injury from the RPE, the Müller glia, and the ciliary margin zone in nonmammalian vertebrates.
While the ciliary marginal zone has regenerative potential in amphibians and fish [286] and, to a minor extent, in mice [287], the RPE is the primary source for regeneration in amphibians, such as salamanders and larval frogs [89]. Upon retinectomy, the RPE undergoes dedifferentiation, depigmentation, and detachment from the basement membrane. The deriving cells proliferate as neurogenic progenitors [288][289][290] to ultimately replace the neural retina, with proper lamination and proper ratio of cell types, and they restore a functional RPE. However, newly generated neurons transiently retain expression for RPE markers [89,291] (Figure 7). Thymidine labeling and transgenic lineage tracing indicate that the RPE re-enters the cell cycle and gives rise to a newly derived retina in amphibians [289,292], although the molecular mechanisms underlying amphibian RPE dedifferentiation have not been elucidated.  In the amphibians, the RPE undergoes dedifferentiation upon retinectomy, depigmentation, and detachment from the basement membrane. The deriving progenitors proliferate and regenerate the neural retina, yet RPE markers are partially retained. In the teleost fish and chick retina Müller glia are the major source of reprogramming instead, where they can dedifferentiate, re-enter into the cell cycle, and acquire neural progenitor cell-like fate upon NMDA excitotoxic injury. Müller glia in mice can partially respond to injury by migrating to the lesion site and upregulating Pax6, although they do not fully re-enter the cell cycle, unless specifically manipulated to do so, especially in adult mice. Induction can be driven by Lin28/Ascl1 overexpression combined with HDAC inhibitors, NF-I ablation, or Hippo pathway inactivation. Even then, reprogrammed Müller glia are intrinsically biased towards bipolar, amacrine cells, and a minor proportion of cones (Adapted from [89]).
Outside of amphibians, the potential of the RPE to generate retinal neurons is restricted to early embryonic stages in fish [294,295], chicken, and mammals [296,297] and even then, it requires substantial manipulation [89].
Within a limited embryonic window, the RPE and Müller glia can be induced to dedifferentiate, proliferate, and reprogram into retinal progenitor cells by mitotic growth factors (EGF) and fibroblast growth factor (FGF) [298].
Müller glia are the major source of reprogramming in the teleost fish and chick retina, where they can dedifferentiate, re-enter into the cell cycle, and acquire neural progenitor cell-like fate upon excitotoxic injury. Müller glia reprogramming in fish occurs through the induction of retinal progenitor cells related TFs such as atoh7, pax6, islet1, and otx2 [207,221,[299][300][301]. Upregulation of the pluripotency marker lin28, which represses the miRNA let7 [202] enables the zebrafish Müller glia to revert to a truly pluripotent state when they can give rise to the full repertoire of retinal cell types.
The chick Müller glia can robustly proliferate after injury, although it predominantly regenerates amacrine cells and fails to induce OTX2, a TF essential for the photoreceptor cell fate [208,299].
The mammalian RPE and Müller glia instead display an intrinsic reprogramming ability early in development but, as retinogenesis proceeds, they become reluctant to proliferate after injury and, in the adult, they exhibit limited cellular pliancy, and retain terminally differentiated features [89,297,302,303].
Transgenic lineage tracing indicates murine Müller glia can also partially respond to injury by migrating to the lesion site and upregulating Pax6, but do not fully re-enter the cell cycle [89,304,305].
The neurogenic function for some of the identified miRNA has already been described in the nervous system [306][307][308][309][310], indicating miRNAs as a potential tool for regenerative therapies. Overexpression of miRNA miR-25 and miR-124 in young Müller glia cultured from reporter mice (Ascl1-CreER: tdTomato flSTOP/flSTOP ) elicits neurogenic conversion through downregulation of let-7 and induction of Ascl1. The combination of the let-7 antagonist with miR-25 and miR-124 analogs leads to a further increase in Ascl1 expression in the Müller glia, although the regenerative response was limited when tested in adult Müller glia in vitro [194], suggesting additional mechanisms may be needed to fully recapitulate the regenerative process.
Several groups have comparatively profiled Müller glia upon injury, in the attempt to identify the key differences and epigenomic determinants of their regenerative response across species. Assuming functional equivalence of orthologs and related regulatory loci, they ultimately attempted to manipulate the evolutionarily conserved elements that are responsible for progenitor quiescence and for limitations to the neurogenic potential [124,128,221,[311][312][313][314].
Notwithstanding, the experimental paradigms of retinal acute damage that are currently in use for animal models may not adequately recapitulate the degenerative response in humans.
Recent evidence indicates that high-intensity, acute light damage causes Müller glia reactivity in zebrafish, whereas low, continuous light exposure damages the photoreceptors' outer segments and evokes microglia recruitment, but it does not cause apoptotic cell death, Müller glia gliosis, or cell cycle re-entry [315].
Furthermore, ablation of the retinoblastoma gene RB evokes different proliferative responses in human and murine cone precursors [316]. Human maturing cone precursors are propense to re-enter the cell cycle, proliferate, and give rise to quiescent retinomas or retinoblastoma tumors, following RB loss of function. Conversely, immature, and mature cone precursors of the Rb-ablated mouse retinae, re-enter the cell cycle but are reluctant to proliferate and give rise to retinoblastoma-like lesions.
Moreover, the ectopic expression of the oncogenes Mdm2 and Mycn is not sufficient to initiate oncogenesis in murine cone precursors, which could be explained with stagespecific and species-specific sensitivities to mitogenic signalling.
In zebrafish, chick, and mice, Müller glia re-enter a reactive state upon injury, although species-specific differences were noticed. While zebrafish Müller glia reverts to a retinal progenitor cell-like state by overexpression of six2b, tgif1, and lepb, to undergo proliferation and neurogenesis, murine Müller glia only transiently express cell cycle regulators and neurogenic factors before re-entering quiescence [221].
Upon acute injury, murine Müller glia acquire transcriptional and epigenomic signatures that are reminiscent of those seen in retinal progenitor cells, yet the regenerative potential is primarily limited to proliferative Müller glia and neurogenic Müller glia giving rise to bipolar [319], amacrine-like cells, and a minor proportion of cones [221]. Moreover, coupling EdU positive and negative labeling with lineage tracing has revealed that only part of the newly generated neurons derives from mitotic divisions, while others arise from direct trans-differentiation of Müller glia, suggesting additional mechanisms are needed to fully de-differentiate the adult Müller glia.
It has been proposed that the loss of neurogenic gene expression and motif accessibility during glial maturation may prevent efficient reprogramming, accounting for the bipolar fate restriction. However, by comparing open chromatin landscapes in bipolar cells and rod photoreceptors with those in retinal progenitor as opposed to Müller glia, the degree of similarity in accessible chromatin was not sufficient to explain the bias towards bipolar cells seen in Müller glia. The bias towards bipolar neurons appears to be preferentially driven by ASCL1 recruitment on chromatin accessible regions [119] and the neurogenic potential of Müller glia upon Ascl1 overexpression seems to be restricted by the STAT3 signalling [225]. Furthermore, diversifying the combination of neurogenic bHLH TFs enhances Müller glia trans-differentiation and neurogenesis in vivo. Müller glia co-expressing the bHLH TFs Atoh1 and Ascl1 give rise to bipolar and amacrine cells and immature retinal ganglion cell-like cells in the absence of injury, although cell bodies fail to migrate to the ganglion cell layer and the cells have only small axonal processes [320].
Finally, recent evidence has shown that the ectopic expression of the Yamanaka factors reverts the epigenetic clock by restoring transcriptomic patterns and resetting DNA methylation status, by upregulating the DNA demethylases Tet1 and Tet2 in retinal ganglion cells and promoting axon regeneration in vivo, in a mouse model of glaucoma with experimentally induced intraocular pressure. DNA methylation also occurs in senescent mice, targeting signature CpG sites enriched for the binding of polycomb repressive complex 2 (PRC2) and its histone methyltransferase product H3K27me3 [321]. Knock-down of the DNA demethylases Tet1 and Tet2 affects the methylation status of the CpG sites in a similar way to senescent mice, impacting the expression of Stat3 and abrogating the ability of retinal ganglion cells to regenerate upon optic nerve crush injury. Yet, overexpression of the DNA demethylases has no protective or regenerative effect on challenged retinal ganglion cells, suggesting additional mechanisms are needed to circumvent aging and to fully restore pluripotency upon injury [321].
Ultimately, the phylogenetic divergence of cisand trans-regulatory elements across genomes and the epigenetic retention of cellular origins may account for species-specific differences and for intrinsic biases towards a given cell fate. Additionally, the lack of functional equivalence across orthologs despite sequence homology, possibly due to TFs rewiring and evolutionary repurposing of the related regulatory sequences, could explain cell fate restrictions, differences, and constraints to the endogenous, regenerative potential of the retina among species.

Calling for Empirical Diversity and Better Demographic Representation across TFs Position Weight Matrices: The Biases That Underrate the Underdogs
How truly unbiased are the current epigenomic studies? A simple known motifs enrichment analysis performed on ChIP-seq or ATAC-seq samples can reveal the relative representation of TF motifs by binomial or hypergeometric scoring of known deposited PWMs, that is, TF binding signatures [322]. An input or isotype control should always be performed and visually displayed to account for variations in local reads coverage, especially on low complexity sequences. Motif enrichment analysis should always display the relative representation of a given motif both in the dataset and the genome or control dataset, since the absolute percentage of the motif in the experimental dataset may not necessarily exceed its representation in the background dataset, and it should be corrected by false discovery rate.
Nonetheless, the knowledge over TF binding patterns is somehow biased towards the ones that are most frequently characterized in the literature, at a given developmental stage or physio-pathological context, through deposited ChIP-seq. In vitro assays such as protein binding microarrays [323], the systematic evolution of ligands by exponential enrichment (SELEX) [324,325], or more recently developed affinity capture methods such as multiDAP [326,327] have inherent limits: some binding sites may be undetectable due to low affinity, cofactors requirement or in vivo conditions that cannot be recapitulated [328], leading to a possible overrepresentation of a given matrix in data repositories over other matrices that are not so well characterized, prolific, or invested in.
A TF binding site may not show up as experimentally enriched for the simple fact that there is no deposited matrix associated with it; therefore it cannot be computed, not even in the background set, be it a control, untreated sample, or the genome, leading to possible false negatives.
A de novo-based approach focused on a totally agnostic identification of oligonucleotides, instead, is not limited by the accuracy and comprehensiveness of PWMs repositories, as detailed and up to date as they can be [322,[329][330][331][332][333][334]. Whatever DNA sequence is present in any given sample, it can be accounted for, and it may be the most developmentally relevant one.
Multiple empirical variations of the same motif instance can be found by ATAC-seq and ChIP-seq and clustered for similarity by average linkage. Their functional recurrence in the dataset, or across species can be verified and their binding specificity tested upon permutation [75,335,336]. In a good ChIP-seq, the immunoprecipitated TF should always rank first from the de novo analysis, yet additional oligonucleotides can be identified, by similarity to known PWMs, reflecting co-occurring cofactors and steric dependencies.
Moreover, the empirical divergence of oligonucleotides from the known consensus flanking sequences may drive selective, preferential, or combinatorial recruitment of TFs onto specific promoters and co-regulated enhancers. In fact, recently evolved enhancers and cell-type specificity seem to be linked [337,338]. This may entail major implications in cell fate specification pathways and in vivo reprogramming since it would explain how TFs that display apparently redundant expression across several cell types may manage to diversify their function, leading to cellular diversity.
To conclude, a heuristic pipeline has been proposed allowing reduction of functional redundance across TFs motifs instances populating the genome, by clustering of de novo oligonucleotides represented by ATAC-seq, followed by probabilistic assignment of such oligonucleotides to known PWM [75,348,349].
The TFs, for which binding patterns are being inferred, are then compared with stagematched expression levels by RNA-seq, to ascertain a given TF is expressed at a given stage. ChIP-seq is then leveraged to comprehensively identify the genome-wide binding patterns of a given TF and its cognate molecular interactors.
Footprinting analysis of all endogenously recruited TFs is then led to identify actual binding sites across the genome, based on the focal depletion of the ATAC-seq reads around motifs centers, similarly to a nuclease-protection assay.
The analysis of TF signatures can be further refined by comparing footprints in control conditions and after conditional knockouts for any TF of interest, ruling out possible consensus redundancy across TF family members, thus enabling unequivocal assignment of a given binding site to a specific TF.
Functional gene ontologies can be identified from modules of coordinately accessible chromatin and related transcriptional targets, validated from conditional knockouts.
Finally, by integrating all predicted molecular interactors and co-dependencies computationally inferred by ChIP-seq motifs enrichment analysis with ATAC-seq footprinting analysis before and after loss of function for a given TF, it is possible to identify the full cohort of inter-dependent TFs that coordinate genetic programs at a given developmental stage.
Last, but not least, by adopting the PIQ metrics [179], it is possible to monitor chromatin accessibility and nucleosome occupancy around TF motif centers over time, and identify signs of nucleosome displacement by competing TFs, that is, candidate pioneer factors [174] that bind highly compacted chromatin, by making it accessible to the transcriptional machinery for the subsequent induction of gene expression.
Functional disruption of pioneer factors that interact with a given TF of interest can result from steric imbalance or epistatic de-regulation of cisand trans-regulatory loci coordinated by such factors. Briefly, by combining the pipeline above [75] with cell lineage tracing and purification strategies, one can identify all master epigenetic regulators-as well as their steric and transcriptional dependencies-that drive the chronology of chromatin accessibility states across development, hence, the sequence of events that lead a progenitor cell towards a specific cell fate.
Developmental trajectories towards specific cell fates can be further stratified by RNA velocity pseudotime, by computing the ratio between unspliced and spliced RNA from single-cell RNA-sequencing (scRNA-seq) [350,351].
A similar concept has been recently applied on a single-cell scale, by capturing progressive chromatin opening by chromatin velocity pseudotime. Chromatin velocity is applied to scGET-seq data that enable direct and sequential profiling of open and compacted chromatin regions, an overall better resolution for the latter compared to scATAC-seq, and clonal definition by clustering of genomic copy number variants in cancer [352]. Chromatin velocity is computed as the ratio between Tn5 transposable open chromatin and heterochromatin marked by the hybrid transposase TnH, which is designed to sever closed chromatin regions while retaining affinity for the accessible ones. The TnH transposase is engineered by conjugating the Tn5 transposase to an HP1α domain targeting H3K9me3 chromatin regions. Again, modules of chromatin that acquire accessibility are then characterized by gene ontology analysis and TFs binding, by overlay with TFs PWMs and TFs expression data. To identify TFs roles in cell trajectories, the TFs dynamic score is then fitted to cell clusters by projection to latent structures regression analysis (PLS) [353]. The sparsity of the signal is a common problem in single-cell RNA-seq, as it leads to zero-inflation occurrences, whereby dropout events in scRNA-seq cause many transcripts to remain undetected, resulting in high type 2 errors, hence false-negative rates [354]. A technical failure in transcript detection causes an excess of zero values compared to what is expected from a random distribution, making it difficult to distinguish technical artifacts from true biological scarcity [355].
Several technical artifacts, such as inefficient cDNA production, amplification bias, and low sequencing depth due to extreme multiplexing are known to occur [356,357]. Stochastic transcriptional bursting may also lead to zero-inflation and spurious results [358].
The zero-inflation problem is known to occur in single-cell sequencing compared to standard count distributions, such as the linear model with a negative binomial distribution, used in bulk RNA-seq [359].
Model-based analysis of single-cell transcriptomics (MAST) and dimensionality reduction methods have attempted to address these issues [357,360,361]. Expression data can be represented as continuous distribution with additional zero values and proportional relations are identified between the number of zero values and the average expression level per gene, to ultimately impute gene expression per cell, had there not been zero-inflation occurrences [362].
Recent advances in droplet scRNA-seq and mixture models have been developed to mitigate the zero-inflation problem and to reflect actual biological differences in transcripts abundance. This is accomplished by measuring discrete counts of unique molecular identifiers and assuming a negative binomial distribution [362]. Negative controls are spiked-in to mimic the RNA content of a cell, making the RNA content in each droplet identical and accounted for with a Poisson distribution, as opposed to a zero-inflated distribution. However, droplet-based single-cell sequencing data also contain intra-individual hierarchical correlation structure, possibly leading to false positives [354].
Pseudo-replication, or subsampling, can be sacrificial or simple [354]. In sacrificial pseudo-replication, experimental treatment is being performed but replicates are pooled before the analysis is conducted. In simple pseudo-replication, intra-individual samples, that is, samples obtained from the same experimental source are treated as multiple experimental replicates. Pseudo-replication seems to be pervasive in current single-cell sequencing [354].
A generalized linear mixed model and a two-part hurdle model with a random effect for the individual (MAST with RE) have recently been developed to correct for type 1 error rates, hence false positives, when performing pseudo-replication in single-cell RNA-seq showing that: (i) intra-individual correlations across cell types are consistently higher than inter-individual correlations; (ii) gene expression analysis tools that do not take into account intra-individual correlation lead to false positives (iii) as the number of correlated, cells within individual increases, performance over type 1 error rates becomes increasingly worse for all other standardized methods; (iv) power analysis of differential gene expression obtained by increasing the number of independent experimental individuals, increases the power to detect true differences [354].
Batch effect correction can be applied prior to differential expression analysis and cell-type clustering where the batches are individuals but are not exempt from high error type I rates.
A feature matrix can be built upon signals originating from individual cells across genomic coordinates and the deriving, aggregated signal can then be subject to unsupervised clustering by k-means, Louvain, or hierarchical clustering [369]. The signals are then subject to dimensionality reduction and visualization by principal component analysis (PCA), t-SNE or projected with a uniform manifold learning technique for dimensionality reduction (UMAP) [370,371] to identify putative subpopulations and similarities across individual profiles [372][373][374]. The quality of clustering can then be assessed by comparison with bulk sequencing from flow-sorted cells purified by virtue of a reporter or based on the expression of known marker genes.
Simulated pseudo-aggregation of open chromatin signals obtained by synthetically mixing bulk ATAC-seq with 10X Genomics scATAC-seq in the same input set, indicate that cell type-specific peaks from rare populations are consistently underestimated by peak calling. Only 18% of cell type-specific peaks from very rare populations with 1% prevalence, or 40% cell type-specific peaks from rare populations with 5% prevalence can be detected by peak calling when treating the heterogeneous source as a synthetic bulk experiment [372]. Since these peaks would be underrepresented in a consensus, pseudo-aggregated peak set, virtually all computational algorithms are expected to fail the identification of rare populations [372]. In such a scenario, bulk sequencing from purified, flow-sorted populations would serve as an indispensable reference to resolve rare cell populations.
Furthermore, pseudo-bulk methods that aggregate signals across cells from the same individual seem to be underpowered when the number of cells per individual becomes increasingly imbalanced and when intra-individual variance exceeds inter-individual variance, which is likely to occur in a developmental context. As a result, pseudo-bulk aggregation methods across cells from the same individual underestimate within-subject variability, leading to high false-negative rates [354].
Rarefaction of the signal can be leveraged, as a normalization technique, to account for unequal sampling by downscaling samples to the same sequencing depth.
Rarefaction curves can also be leveraged to estimate the saturation point, hence, to standardize the sequencing depth and starting material for any given experimental condition.
A rarefaction analysis curve on scRNA-seq can reveal the minimum number of sequencing reads that are needed to saturate the number of valid somatic mutations identified [375], the number of unique transcripts observed, and gene coverage rates.
Saturation analysis has been traditionally adopted in ChIP-seq to identify the sequencing depth beyond which no new discoveries (binding sites) are made by a given peak caller and additional reads would only result in variations in amplitude rather than novel binding patterns [376]. Under ideal saturation conditions, binding profiles should lead to robust, reproducible results. Consistency between replicates in high-throughput sequencing analysis can be estimated by irreproducible discovery rate (IDRs) [377], to identify concordant ChiP-seq peaks, ATAC-seq open chromatin regions, or gene expression profiles. Although, deterministic processes that are consistent across replicates would rank more significant, with a lower IDR because more unlikely to be irreproducible, compared to those that occur rarely or stochastically [378], generating a sparser signal, hence limiting IDR applicability in single-cell analyses.
The recent implementation of the IDR-based framework has led to the single-cell reproducibility across donors (scRAD), in an attempt to address sacrificial pseudo-replication, overcoming confounding effects due to inter-individual variability to ultimately identify reproducible patterns of gene expression across multiple donors. This method allows to correct for batch-specific differences by measuring the reproducibility over stratified, replicate experiments, that is, multiple donors [379].
While it is not clear if, and how, current ChIP-seq and single-cell ATAC-seq experiments are being standardized [98,124], near-saturation analyses and meta-analyses of signals across multiple donors through a similar IDR framework could be leveraged to implement the method for robust, reproducible results.
Recent efforts to address the sequencing depth that is required for reproducible identification of open chromatin regions and TF footprints have, in fact, shown that the numbers of reproducible footprints gradually decline with decreasing depth and that, unlike peak calling, footprinting efficiency does not saturate but reflects the library complexities at different depths [380].
Furthermore, TF footprinting accuracy is linked to a clear distinction of footprints from the background and is contributed by the sequence biases of TF binding sites. While correcting ATAC-seq for Tn5 sequence bias did not seem to improve footprinting performance [380], keeping the replicates separate to assess reproducibility seemed to lead to more accurate footprint predictions, especially at low-sequencing depths, arguing against possible pseudo-aggregation strategies for scATAC-seq.
To conclude, these considerations warrant some caution over extreme multiplexing of scATAC-seq libraries, to ensure that proper sequencing depth per library is reached. In addition, the minimum amount of starting material, i.e., the number of cells should be empirically determined by saturation and power analysis and balanced across individuals before any pseudo-bulk aggregation attempts. Finally, independent experimental replicates should be collected and tested in power-analysis and mixed-models to address any pseudoreplication-related errors.

The 4D Genome: Complementing Single-Cell Sequencing with Spatiotemporal Information by Super-Resolution Imaging
The spatial organization of the genome underlies several mechanistic aspects in health, development, and disease [55,[381][382][383] and recent advances in integrated spatial genomics are now enabling multimodal, simultaneous visualization of thousands of genomic loci, nascent RNA transcripts, epigenetic marks, and cooperative chromatin interactions in native conditions, at an unprecedented resolution [384][385][386][387].
Advancements in single-molecule-based, super-resolution imaging over the past decade have progressively allowed to overcome the diffraction limit and achieve resolution under the 10 nm scale for simultaneous acquisition of multiple DNA targets, relying on two key approaches. With the first approach, individually labeled probes can be transiently bound and unbound to resolve spatially overlapping images of single molecules over time and the blinking rate can be tuned by adjusting the concentration of the labeled oligos and the thermodynamic stability of the hybridized duplex. The super-resolution image is ultimately rendered from sequentially acquired, single-molecule localizations.
Another approach derives from the implementation of the photo-activated localization microscopy (PALM) [388] and utilizes small-molecule dyes, to stochastically illuminate a subset of sufficiently sparse, individual, photo-switchable fluorophores between on and off states, so that the imaged fluorophores do not substantially overlap when in the bright state, by concurrently blinking within a diffraction-limited distance [386,389]. The super-resolution image is ultimately rendered from temporally resolved emitters, from subsequent detection events.
Multiplexed super-resolution imaging of distinct DNA targets has, ever since, been achieved with the aforementioned approaches. With the former approach, libraries of short, single-stranded labeled oligonucleotides (Oligopaint probes) designed to visualize genomic regions in the range of kilobases to megabases [390] are combined with the point accumulation for imaging in nanoscale topography (DNA-PAINT). Transient binding of short fluorescently labeled, single-stranded oligonucleotides has enabled less than 20 nm resolution of synthetic DNA structures in fixed cells (OligoDNA-PAINT). In addition, Exchange-PAINT has enabled the serial imaging of multiple targets in pseudocolors with only one spectral channel [389,391].
With the latter approach derived from PALM microscopy, the iterative, stochastic activation of photo-switchable probes in 3D stochastic optical reconstruction microscopy (STORM and OligoSTORM), [389,392,393], enables axial and lateral acquisition of fluorophores to reconstruct 3D morphology of cellular structures at a nanometer scale. STORM microscopy has been used to sequentially map and spatially locate interacting genomic loci within the cells and to reconstruct contiguous chromatin segments. Once pairwise and higher-order interactions across chromatin regions are preliminarily inferred by bulk HiC [50], diffraction-limited imaging and multiplexed STORM microscopy can be combined to probe and validate such contacts on a single-cell level, providing high sensitivity and resolution, respectively.
Multiplexed super-resolution imaging can now be leveraged for chromatin tracing, to visualize chromatin conformation across topologically associated domains (TADs), providing insights into chromatin partitioning and chromosome folding [61,394,395], since hundreds to thousands of individual loci per cell can be sequentially mapped [389,[396][397][398]. Multiplexed super-resolution FISH combines structural chromatin features with their genomic coordinates in pseudocolors and compensates for the lack of sensitivity inherent to current single-cell HiC, due to the sparsity of the contacts identified [399][400][401][402][403]. Chromatin ultrastructure has been resolved at the level of single nucleosomes on fixed samples by chromatin electron tomography (ChromEMT) [404].
The recent advent of optical reconstruction of chromatin architecture methods (ORCA) [405] coupled with tiled, barcoded Oligopaint probes has ensured full coverage and uniformity of the probed chromatin region [405] and has allowed to scale up super-resolution imaging of multiple loci across tens of thousands of individual cells [386,406]. Optical reconstruction of chromatin architecture methods has enabled separation across chromatin domains [386,[407][408][409], separation of recurrent chromatin blobs, and globular domains delimited by TADs boundaries that are shared within cell populations [396,397] and cell type-specific TADs boundaries [384]. DNA sequential fluorescence in situ hybridization (DNAseqFISH+) can also map 3D chromosome structures, by partially replicating the genome-wide interactions map identified by Hi-C [56], whereas cell types identified by scRNA-seq can be recapitulated by RNA seqFISH [410]. Sequential immunofluorescence with oligonucleotide-conjugated antibodies can be used to probe histone modifications, nu-clear speckles, and chromatin-binding proteins, to ultimately correlate nuclear organization with transcriptional states [387].
Finally, the recently developed in situ genome sequencing (IGS) enables simultaneous base-pair sequence resolution and imaging of thousands of genomic loci within intact individual nuclei, without any prior knowledge of the sequence target [411]. The method relies on the in situ transposition of fixed samples, where the Tn5 transposase is added to randomly incorporate DNA sequencing adaptors upon fixation, preserving all genomic fragments in their native spatial position. Tagmented fragments are then circularized by ligation mediated by DNA hairpins with unique molecular identifiers (UMI) and the deriving templates are amplified by rolling circle amplification. UMIs are first imaged by immunostaining and deconvolved, then subject to further in situ amplification for library generation. The libraries are ultimately dissociated and sequenced ex situ on a regular Illumina platform. Upon sequencing, deconvolved in situ fluorescent locations are mapped to the corresponding UMI tagged genomic readout, enabling spatial assignment of the sequencing output for every single cell. This method has allowed to assign novel sequencing reads to the nuclear lamina, centromeres, and nucleolar bodies.
Last, but not least, optical sectioning capabilities and axial resolution have been recently increased by 3D interferometric lattice light-sheet microscopy (3D-iLLS) [412] and nanoscale chromatin imaging and analysis (nano-ChIA) [413], enabling unprecedented long-term, spatio-temporally resolved and label-free three-dimensional acquisition of subcellular structures in live biological specimens, ranging between 20 and 300 nm.
While a thorough presentation of the current imaging techniques eludes the scope of this review, super-resolution imaging will undoubtedly advance the mechanistic understanding of gene regulatory networks and functional epigenetic states behind cell fate determination, through direct visualization of chromatin compartmentalization and topology in space and time.
To what extent can we push the boundaries of informativity of next-generation sequencing?
A PubMed search for "single-cell differential expression" returned 9049 results published since 2016. Moreover, 7610 articles were published related to "chromatin accessibility" in the same time frame, of which 812 were led on by single-cell sequencing, while 659 on single-cell chromatin conformation.
Notwithstanding, the spurious contribution of heterogeneous cell types from wholecell populations can be circumvented by several purification strategies prior to bulk sequencing, such as immunolabeling, flow sorting [75,425], or through in vivo biotin labeling of the nuclear envelope, followed by affinity purification of tagged nuclei (INTACT) [426], leading to a reasonably pure cell population by virtue of a reporter. Conversely, rarefaction issues are known to occur in single-cell sequencing due to the sparsity of the signal generated by few cells, the noise, and the underrepresentation of rare amplicons, which is hard to obviate. This is particularly relevant for footprinting studies where the inflection of the reads around TF binding sites relies on a population-averaged signal. scATAC-seq data are inherently sparse [372]. Infrequent TF binding in rare cell types, short resident TF binding, or cell type-specific peaks from very rare cell populations may not necessarily be reflected in features matrices from pseudo-aggregated single-cell ATAC-seq signals, especially if under-sampling occurs, leading to possible false negatives, making the classic bulk ATAC-seq from whole-cell populations still a more informative, and an overall more desirable option.
Theoretically, to benchmark a truly representative dataset, one may want to establish minimum sampling requirements and iterate single-cell sequencing data collection, by progressively scaling up the sample size for power analysis and saturation until no new peaks and footprints are being discovered. Since the proportion of rare cell types may differ, depending on the developmental context, this control would have to be repeated for every single experimental condition and developmental time point, resulting in a time-consuming experimental setup and substantial financial burden.
Single-cell sequencing also entails considerable data workload, requires extensive storage and processing capacity. Computationally intensive programs and mixed modelsbased analysis that are iteratively performed on thousands of samples may limit the applicability of the FAIR principles of findability, accessibility, and interoperability across platforms and users, underscoring the need of pipelines for better data harmonization.
Super-resolution imaging techniques could ultimately complement genome-wide sequencing techniques on a single-cell level.
Finally, single-cell sequencing can graciously supplement the lack of hypothesis testing while sustaining the scientific prolificacy over time and has, in fact, raised exponentially over the past decade. Nonetheless, the above considerations should warrant some caution before resorting to its default usage and to a programmed obsolescence of bulk sequencing, calling for a more rational experimental design to avoid uninformative redundancies [427], and for a more equitable allocation of the funding resources [428,429]. Funding: This review is self-financed. C.Z. is a current recipient of a Marie Sklodowska Curie research fellowship, under the Scientia Fellows Marie Sklodowska Curie COFUND, grant agreement 801133, a European Union Horizon 2020 Research and Innovation program. C.Z. is currently hosted by the Department of Ophthalmology, Institute of Clinical Medicine, at the University of Oslo. As project proposer, C.Z. is currently working on iPSC derived RPE, investigating the epigenomic contribution to the induction and maintenance of the RPE cell fate and has received funding support from the Norges Blindeforbund and Unifor associations. AMD: the most common cause of blindness in the elderly, affects the photoreceptors and RPE in the macula of the eye, responsible for central vision. By leveraging a recently described workflow on stem cells derived, clinical-grade autologous RPE, C.Z. intends to identify sources of variability among transplantable RPE lines, to decipher, assess and tentatively correct, the relative contribution of the genome, in its predetermined and acquired configurations. Induced RPEs may potentially serve as platform for disease modelling, drugs screening and genome-editing studies to instruct patient-tailored strategies of precision medicine for treatment of AMD.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: C.Z. generated the following datasets during her previous postdoctoral research, from 2011 to 2017. Subsequently, the datasets were made available at the NIH NCBI GEO data repository as GSE99287 [226], GSE99818 [75,425] and GSE118880 [185].

Conflicts of Interest:
The author declares no conflict of interest.
Personal Disclosure: Neither the former nor the current funders have had any role in the design of the study, in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the review. Citation biases and editorial biases are known to occur in scientific literature [430][431][432][433][434]. Moreover, there is no mandate from the funding agencies to verify the authors' factual contributions or claimed expertise. Consequently, substantial overlaps and redundancies, repurposing of original contributions, and arbitrary reassignments of authorship are rarely addressed, besides the recommendations from the Committee on Publication Ethics (COPE). Undisclosed biases and conflicts of interest ultimately impact the track record and career progression of the relevant parties, as no corrective, a posteriori remedy is currently undertaken by institutions, editorial boards, or funding agencies [428,429,[435][436][437][438][439]. These findings underscore the importance to allocate equitable funding across the scientific work force and to address such biases, in part by compiling a comprehensive bibliography with the relevant work in literature, to contextually mention it within the text, regardless of the prolificacy, affiliation and position held by the contributors. Artificial Intelligence could be leveraged to implement the selection of all relevant bibliography for any given paper, circumventing such conflicts of interest, with the goal to ensure adequate visibility and proper representation of all the relevant work, to ultimately orient the decision making and prioritization efforts of the stakeholders.