Experimental Validation and Prediction of Super-Enhancers: Advances and Challenges

Super-enhancers (SEs) are cis-regulatory elements of the human genome that have been widely discussed since the term was introduced. Super-enhancers have been shown to be strongly associated with the expression of genes crucial for cell differentiation, cell stability maintenance, and tumorigenesis. Our goal was to systematize research studies dedicated to the investigation of the structure and functions of super-enhancers as well as to define further perspectives of the field in various applications, such as drug development and clinical use. We overviewed the fundamental studies which provided experimental data on various pathologies and their associations with particular super-enhancers. The analysis of mainstream approaches for SE search and prediction allowed us to accumulate existing data and propose directions for further algorithmic improvements in the reliability and efficiency of SE identification. Thus, here we provide the description of the most robust algorithms such as ROSE, imPROSE, and DEEPSEN and suggest their further use for various research and development tasks. The most promising research directions, judging by the topics and number of published studies, are cancer-associated super-enhancers and prospective SE-targeted therapy strategies, most of which are discussed in this review.


Introduction
The role of transcriptional regulation mediated by super-enhancers attracted the attention of many research groups during the past decade. Since the introduction of the term, the discussion around super-enhancers has heated up, and the concept of such a cis-regulatory mechanism has not been clearly and completely defined [1,2].
In 2013, the term "super-enhancer" (SE) was proposed for a novel regulatory class represented by large clusters of enhancer elements with abnormally high transcription factor enrichment and target gene expression. The definition of SE includes a list of characteristic features that were found to correlate with the occurrence of active super-enhancers within various cell lines. The majority of in vitro studies emphasize the local enrichment of master transcription factors (MTFs), elevated epigenetic modifications (H3K4me1, H3K27ac, etc.) along with the grouping of more than two enhancers (the maximum distance between enhancer elements was proposed to be 12.5 kb), and a high binding rate of Mediator complex subunits (especially the Med1 subunit) [3,4]. Together, these characteristics became the classical interpretation of the nature of SEs and formed the basis of their definition. The rapid development of experimental approaches, including next generation sequencing (NGS), genome editing technologies, and the progress of computational methods, provided important insights into the structure of SEs, their impact on cell fate, and their role in the pathogenesis of various diseases. It has been shown that super-enhancers could largely determine cells' identity during normal development and differentiation as well as in pathological states due to strong key gene upregulation.
A number of experimental studies demonstrated contradictions with the classical definition of super-enhancers [5]. Whether super-enhancers should be characterized as independent objects is still under discussion [6]. In this review, we focus on experimental studies of super-enhancer regulation in order to build a consistent picture of their role in tumorigenesis, cell development, differentiation, and motility. In addition, we summarize the existing computational methods used for SE detection and prediction in order to consolidate current machine learning (ML), linear, and ChIP-seq-based approaches. Thus, the experimental data and in silico methods combined represent a promising direction in the development of preventive therapy strategies.
In order to include the latest information on super-enhancers as a novel regulatory element, we defined a set of rules used for publication searching and filtering. Here is, for example, the PubMed search formula: (Super-enhancers OR super-enhancer) AND (chromatin structure OR TAD OR TADs OR chromosome conformation capture OR 3C OR 4C OR 5C OR Hi-C OR NG Capture-C OR distant interactions OR looping OR 3D OR 3D structure) [Publication Date] Similar queries were formulated in order to access papers from the following research databases: Scopus, Web of Science, EMBASE, GHL, VHL, Cochrane, Google Scholar, mRCTs, POPLINE, and SIGLE.
The retrieved papers were classified as relevant and used for the qualitative summary if any of the following rules applied: (i) the article contains specific information associated with super-enhancers; (ii) it is an in silico research article applying SE-search algorithms (the pipeline must be available for robustness evaluation and result reproducibility); (iii) it is an in vivo/in vitro research article with a well-described methods section. In addition, we formulated the exclusion criteria for the set of retrieved papers: (i) duplicated work; (ii) inaccessible article; (iii) paper published earlier than 2013 (this criterion was necessary because the term itself came into use only in that year); (iv) articles that do not meet the inclusion rules.
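The screening rules above can be illustrated with a minimal sketch. The `Paper` record and its boolean fields are hypothetical stand-ins for judgments that were made manually; the sketch only shows how the inclusion and exclusion criteria combine, and it is not the procedure actually used for this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    # Hypothetical record; each flag mirrors one rule from the text.
    title: str
    year: int
    accessible: bool
    has_se_information: bool        # inclusion rule (i)
    in_silico_with_pipeline: bool   # inclusion rule (ii)
    wet_lab_with_methods: bool      # inclusion rule (iii)


def screen(papers, min_year=2013):
    """Return papers that pass the exclusion rules and meet any inclusion rule."""
    seen_titles = set()
    kept = []
    for paper in papers:
        key = paper.title.strip().lower()
        if key in seen_titles:          # exclusion (i): duplicated work
            continue
        seen_titles.add(key)
        if not paper.accessible:        # exclusion (ii): inaccessible article
            continue
        if paper.year < min_year:       # exclusion (iii): published before 2013
            continue
        meets_inclusion = (
            paper.has_se_information
            or paper.in_silico_with_pipeline
            or paper.wet_lab_with_methods
        )
        if not meets_inclusion:         # exclusion (iv): fails inclusion rules
            continue
        kept.append(paper)
    return kept
```

Duplicates are detected here by normalized title only, which is an assumption of the sketch; a real screening would more likely compare DOIs or PubMed identifiers.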

Defining Constitutive Super-Enhancers: Current View
It is remarkable that the DNA molecule serves as efficient storage of genetic information for multiple cells, and that gene expression regulation results in the unique phenotype of each cell. Turning genes "on" and "off" depends on certain proteins called transcription factors (TFs) attaching to some specific short sequences of DNA (500-1500 bp long) known as enhancers. They are found in both prokaryotic and eukaryotic organisms and, though acting in cis, can be located up to 1 million base pairs away from the target sequence. The formation of SE complexes through numerous TFs binding boosts the activity of the genes nearby. A multicenter study initiative from 2011, the Encyclopedia of DNA Elements (ENCODE) [7], reported the identification of approximately one million enhancer switches in the human genome.
The term "super-enhancer" was introduced by Chen and colleagues in 2004 to describe a baculoviral DNA locus, hr3, that could induce the transcription of ie1 promoters up to a thousand-fold, implying the first differentiation criterion for a SE: higher transcriptional effects [8]. The first three trendsetting studies by Young et al., published later in the 2010s, used the term "super-enhancer" to describe extended genomic domains that play an important role in cell identity [3,5] and disease control [9] and act differently from the single enhancer or typical enhancer (TE) (Figure 1). Based on his team's findings, Young speculated that some of these enhancers do not function alone but rather come together in large groups. The main concept of the super-enhancers' mechanism of action is very similar to the one previously described for normal enhancers. Transcription factors bind to specific motifs of super-enhancers and facilitate SE interaction (looping model) with subunits of transcriptional complexes (RNA Pol II, Mediator, TFIIF, etc.) in the promoter regions of regulated genes. In this way, they activate target gene expression through the recruitment of numerous transcription machinery units [10]. The nuances of this seemingly simple mechanism remain unclear, but a proposed phase separation model could shed light on SE functions [11]. The model includes two parameters: the number of molecules and their "valence" in the volume of the super-enhancer. It is suggested that the transcription activity of target genes depends non-linearly on the relative amount of interaction between transcription factors within the SE volume. If the interaction value between TFs is close to the maximum, phase separation occurs, and the probability of gene transcription grows. Thus, the balance between the concentration of TFs and their valence is crucial for effective SE insulation and for maximizing target gene transcription.
Nevertheless, there are many other factors which potentially affect SE function and change the transcription probability and rate of dependent genes. For example, it is known that massive eRNA transcription from super-enhancers plays an important role during the cis-regulation of target genes [12]. At the same time, these eRNAs may modulate the expression of distant genes that are not initially associated with studied super-enhancers. Such proposed mechanisms should be carefully researched in order to gain a deeper understanding of super-enhancers' mechanistic nature.
It has been shown that the epigenetic regulator BRD4 and the multifunctional c-Myc protein play an important role in mice with leukemia. The scientists reported that blocking Brd4 decreases Myc and prevents cancer cell proliferation, and that Brd4 binds to a 40,000-bp stretch near the MYC gene. Further, the researchers showed that not only cancer cells

1. Enhancers were considered to be the sites bound by all three master regulators, Oct4, Sox2, and Nanog, according to ChIP-seq;
2. Enhancers within 12.5 kb of each other were stitched together to form a single entity;
3. The stitched SE entities and the remaining TEs were then ranked by the total background-normalized level of Med1 signal within the locus.
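Steps 2 and 3 can be sketched in a few lines of Python. This is a simplified illustration, not the original implementation: the `Enhancer` record, the half-open coordinates, and the choice to sum the signal of merged constituents are assumptions made for the example, and the signal values are taken to be already background-normalized.

```python
from typing import NamedTuple

STITCH_DISTANCE = 12_500  # bp, the 12.5 kb window from step 2


class Enhancer(NamedTuple):
    chrom: str
    start: int
    end: int
    signal: float  # background-normalized Med1 ChIP-seq signal (assumed precomputed)


def stitch(enhancers, max_gap=STITCH_DISTANCE):
    """Merge enhancers on the same chromosome whose gap is at most max_gap."""
    stitched = []
    for enh in sorted(enhancers, key=lambda e: (e.chrom, e.start)):
        if stitched:
            last = stitched[-1]
            if enh.chrom == last.chrom and enh.start - last.end <= max_gap:
                # Extend the previous region and accumulate its total signal.
                stitched[-1] = Enhancer(
                    last.chrom, last.start, max(last.end, enh.end),
                    last.signal + enh.signal,
                )
                continue
        stitched.append(enh)
    return stitched


def rank_by_signal(regions):
    """Order stitched regions and remaining single enhancers by total signal."""
    return sorted(regions, key=lambda r: r.signal, reverse=True)
```

For example, two enhancers at chr1:100-600 and chr1:10,000-10,800 are separated by 9.4 kb and would be stitched into one region, whereas an enhancer at chr1:50,000 would stay separate.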
However, this definition has been criticized because it was not "functional" in essence and did not reflect the unique properties of so-called super-enhancers. Moreover, the clustering procedure in step 2 is algorithmic, implying no specific selection or filtering. Additionally, multiple publications that have defined SEs (sometimes even within a single paper) deviate from the definition of Whyte's group by using various characterization marks at different algorithm steps. For example, in a study by Hnisz et al. [5], H3K27ac was used to identify enhancer regions, whereas another group used Med1 [9]. Thus, the only defining feature of super-enhancers is an exceptionally high degree of enrichment of transcriptional activators or chromatin marks as determined by ChIP-seq, which is assessed in step 3. The major differences between normal enhancers and super-enhancers are inevitably dominated by features associated with the ChIP-seq enrichment used to define them.
It should be noted, however, that H3K27ac may be functionally dispensable for transcription, even for genes associated with super-enhancers, as was shown in mouse ESCs [18]. Another study, led by the Higgs group [19], examined one of the most consistent SEs, which is associated with the α-globin gene [20]; the authors reported a comprehensive functional dissection of the SE constituents in erythroid cells (individually and in combinations) and found no significant features of synergistic or higher-order effects.
It should be mentioned that the related term "stretch enhancer" has been described in parallel [21,22]. Stretch enhancers are defined by ChIP-seq on the basis of several chromatin marks being present over genomic loci spanning 3 kb or more. In some papers, the term is used as a synonym for SE and shares some of the same characteristics [23]. Other papers, however, distinguish SEs from stretch enhancers. According to these, SEs are transcriptionally more active and more cell type-specific than stretch enhancers. A comprehensive analysis comprising histone modification and chromatin accessibility profiling as well as cell type-specific gene expression evaluation has revealed that at the genome scale, stretch enhancers are more abundant and are further away from TSSs than super-enhancers, whereas super-enhancers are more evolutionarily conserved [24].
The analytical work comparing two previous experimental datasets of Hay and Shin's groups proved that all the knockout data for α-globin and Wap SEs can be explained with a simple model [25], suggesting little evidence to support the existence of a more complex genetic element such as a SE, or at least no urgent need to develop sophisticated mechanistic models.
The controversy over the usage of the term "super-enhancer" has had a negative impact on drug discovery. Some researchers believe that super-enhancers play a particularly important role in stem cell differentiation and oncogenic pathways [26], two dominant areas of medical interest. In addition, there have been proposals to either try to attack these transcription pathways directly (which still seems to be a difficult task) or to use the characteristics of SEs for possible new drug target prioritization. Another limiting factor for drug research is data consolidation and processing. To facilitate SE-targeted drug development, several research groups provide SE databases of various sizes and completeness. Currently available SE databases include dbSUPER [27], SEA version 3.0 [28], SEdb [29], and a human SE repository (https://sunlightwang.github.io/Super-Enhancers/, accessed on 6 March 2023), each with comparative and exploratory analyses of publicly available ChIP-seq data. The largest database, SEdb, contains up to 331,000 SEs derived from 541 human cell lines/tissues [30], and 52 SEs of the newest SEA v. 3.0 have been experimentally confirmed.
Still, there have not been enough studies addressing the evolutionary stability of super-enhancers as separate biological units (which would be interesting to examine on a variety of related model lines, such as healthy vs. tumor cells). In the current concept, the definition of a super-enhancer could be formulated as follows: super-enhancers in a broad range of human cell types are (putative) extensive clusters of enhancers with aberrantly high levels of transcription factor binding, which are historically associated with higher expression rates of cell type specification genes (including cell identity and pro-oncogenic genes). In order to provide the most comprehensive definition of SEs while comparing them to typical enhancers and with reference to published data, we provide a summary table with the main characteristic features of both regulatory elements (Table 1).
The algorithms used to identify typical enhancers and super-enhancers differ in their complexity and the number of genomic features they analyze. While both methods rely on ChIP-seq data for histone modifications, the identification of super-enhancers requires additional data and more complex algorithms along with the tuning of the latter. There are various proposed metrics that could be possibly used for SE identification along with machine learning methods with similar precision and validity. One of these involves a super-enhancer's frequency across various cell lines. There is emerging evidence for a class of SEs that seem to be universally active; they display a strong association with fast recovering chromatin loops after sequential cohesin removal and restoration [31] and reveal a constitutive occurrence pattern (or "constitutively active", or "common") and extra high degree of universality across different cell and tissue types.
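The frequency-based metric mentioned above can be sketched as the fraction of cell types in which a candidate region overlaps a called SE. The interval representation (chromosome, start, end; half-open) and the overlap criterion of at least one shared base pair are assumptions made for this illustration, not a definition taken from the cited work.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def se_frequency(candidate, se_by_cell_type):
    """Fraction of cell types in which `candidate` overlaps any called SE.

    `se_by_cell_type` maps a cell type name to its list of SE intervals.
    """
    if not se_by_cell_type:
        raise ValueError("at least one cell type is required")
    hits = sum(
        any(overlaps(candidate, se) for se in se_list)
        for se_list in se_by_cell_type.values()
    )
    return hits / len(se_by_cell_type)
```

Under this sketch, a region with a frequency close to 1.0 would be a candidate "common" SE, while a region detected in a single cell type would score 1 divided by the number of cell types; where to place the threshold is left open.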

Techniques Used and Proposed for SE Discovery and Research
To study the biology of super-enhancers, the mechanisms of their functioning, their role in a particular cell type, and the effect of various agents, in particular inhibitors, a set of both computational and experimental methods is needed. In super-enhancer research, the two are often used in tandem: bioinformatics methods are used to process experimental data and search for potential objects, and the experimental data itself can be used to refine in silico predictions. Here, we review bioinformatic methods for searching for super-enhancers, focusing on machine learning methods, as well as the experimental techniques that have been actively used in recent years.

Computational Methods for Searching and Predicting Super-Enhancers
To solve the problem of searching for super-enhancers, both classical methods and machine learning (ML) approaches, including neural networks (NNs), have been used. Most of the algorithms use ChIP-seq data, which gives information about key SE marks (H3K4me1, H3K27ac, Med1, Oct4, Sox2, Nanog, etc.) [6] and is extremely helpful for the search, but some of them take steps to move away from using ChIP-seq data.
At the moment, the most-used tool and the gold standard for solving this problem is the linear ROSE algorithm [3,9]. The main idea of the algorithm is to stitch enhancers if the distance between them is less than the threshold value (12.5 kb) and to rank them based on the level of the SE marks' signal in the ChIP-seq data (Figure 2A). Other tools that do not use ML methods have not gained popularity.
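After ranking, a cutoff is needed to separate SEs from typical enhancers. ROSE is commonly described as placing it at the point where the ranked-signal curve, with both axes scaled to the range 0 to 1, has a tangent of slope 1. The following minimal sketch approximates that point on discrete data; it is a simplified illustration and not the ROSE code, and the function names are hypothetical.

```python
def super_enhancer_cutoff(signals):
    """Signal value at which the scaled, ascending rank curve has slope ~1.

    Both axes are scaled to [0, 1]. For an increasing convex curve, the
    tangent of slope 1 touches the curve where (y - x) is minimal.
    """
    ordered = sorted(signals)
    if len(ordered) < 2 or ordered[0] == ordered[-1]:
        raise ValueError("need at least two regions with differing signal")
    n = len(ordered)
    low, high = ordered[0], ordered[-1]

    def offset(i):
        x = i / (n - 1)                       # scaled rank
        y = (ordered[i] - low) / (high - low)  # scaled signal
        return y - x

    tangent_index = min(range(n), key=offset)
    return ordered[tangent_index]


def call_super_enhancers(signals):
    """Flag regions whose signal lies above the tangent-based cutoff."""
    cutoff = super_enhancer_cutoff(signals)
    return [s > cutoff for s in signals]
```

With a typical "hockey stick" distribution, in which most regions have low signal and a few have very high signal, only the few regions on the steep part of the curve are flagged.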
Aziz Khan and Xuegong Zhang performed an ML analysis of the importance of SE features and developed the imPROSE tool (integrated methods for prediction of super-enhancers), which uses ML algorithms to predict which enhancers are SEs [32]. An analysis of ChIP-seq data, RNA-seq data, Gene Ontology (GO), and DNA motifs was performed. The aim was to find a minimal subset of features that differentiates SEs from TEs. They used a random-forest-based approach, Boruta [33], for feature importance ranking and found that Med1, Med12, H3K27ac, Brd4, Cdk8, Cdk9, p300, and Smad3 are the most informative for the task. The top six features according to their research are Brd4, H3K27ac, Cdk8, Cdk9, Med12, and p300. Then, they implemented six classical ML models (Random Forest, linear SVM, k-NN, AdaBoost, Naive Bayes, and Decision Tree) and compared them using 10-fold cross-validation (Figure 2B). The best results were shown by the Random Forest (AUC = 0.98). Using only DNA sequence features, such as conservation scores (phastCons), GC content, and repeat fractions, the authors achieved an AUC of 0.81. This shows that DNA sequence characteristics alone can be used to distinguish TEs from SEs. Compared with ROSE, a H3K27ac-based method, imPROSE trained on Smad3 and H3K27ac data makes more predictions for cell-type-specific SEs in pro-B cells.
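Two of the sequence-only features named above, GC content and repeat fraction, are simple to compute from an enhancer sequence. The sketch below is illustrative and is not taken from imPROSE: it assumes a soft-masked sequence in which repeats are written in lowercase (the convention of UCSC genome FASTA files), and it leaves out conservation scores, which require an external phastCons track.

```python
def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / unambiguous


def repeat_fraction(seq):
    """Fraction of soft-masked (lowercase) positions in the sequence.

    Assumption: lowercase letters mark annotated repeats.
    """
    if not seq:
        return 0.0
    return sum(1 for base in seq if base.islower()) / len(seq)


def sequence_features(seq):
    """Feature dictionary for one enhancer; the keys are hypothetical names."""
    return {"gc_content": gc_content(seq), "repeat_fraction": repeat_fraction(seq)}
```

Such per-region values could form columns of a feature table alongside ChIP-seq-derived features; how the original authors computed them is not specified here.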
The imPROSE source code is available on GitHub and can be used to choose the model and its features, train the model, and make predictions. The input is a CSV file with the computed available features for the enhancers. The output is also a CSV file with the field Class and the class flag (SE or TE).
With the development of NN approaches came the idea to use them for SE prediction. Several works propose convolutional neural networks (CNNs) for this task. DEEPSEN, a CNN published in 2019, was the first NN model used for this purpose, and it was implemented in Python with TensorFlow [34]. At that time, it outperformed all existing methods. The authors tested CNNs with different numbers of convolutional layers (from 2 to 4: DEEPSEN-2L, DEEPSEN-3L, and DEEPSEN-4L) and two fully connected layers, and they selected 36 features (Figure 2C). Each convolution layer included two steps: a convolution step and a pooling step. They used ReLU as the activation function, Adam optimization, and the cross-entropy loss function. The code is also available on GitHub, and the input data structure is similar to that of imPROSE. The next step was DeepSE, implemented in Keras in 2021 [35]. It outperformed the methods that existed at that time using only sequence feature embeddings. It was shown that it can be used for different cell lines, which suggests that cell-specific SEs may share hidden sequence patterns. The researchers collected data from a number of human and mouse cell types. The novelty was the use of k-mer sequence embeddings obtained by training dna2vec on genomes. A CNN classifier with two convolutional and two fully connected layers (Figure 2D) then performs the binary classification (TE or SE). The code is available on GitHub. The first step of the pipeline is encoding the sequence for the particular cell type, and the second is choosing a model, training it, and making predictions.
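The sequence-encoding step that precedes a k-mer embedding model can be sketched as follows. This illustrates only the tokenization idea: the fixed k, the stride of 1, and the integer vocabulary are assumptions made for the example and do not reproduce the dna2vec or DeepSE code, where the token indices would be replaced by learned embedding vectors.

```python
from itertools import product


def kmers(seq, k=3, stride=1):
    """Split a DNA sequence into overlapping k-mers."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k):
    """Map every possible k-mer over A, C, G, T to an integer index."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}


def encode(seq, k=3):
    """Encode a sequence as k-mer indices, skipping k-mers with ambiguous bases."""
    vocab = build_vocab(k)
    return [vocab[token] for token in kmers(seq, k) if token in vocab]
```

The resulting index list is the kind of input on which an embedding lookup followed by convolutional layers would operate.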
Nowadays, neural network methods are undergoing a period of active development, and new approaches are emerging for working with sequences, in particular, biological ones. Their potential applications for in silico searches for super-enhancers look promising.

Experimental Validation and Characterization of Super-Enhancers
To validate the predictions of super-enhancers with computational methods, it is necessary to carry out a number of experiments. Here, we consider experimental approaches to the study.
Much of the experimental research on super-enhancers involves the use of various genome-wide, sequencing-based approaches. They make it possible to identify SEs by their characteristic marks. Like typical enhancers, super-enhancers are characterized by increased sensitivity to DNase I, which is common for active chromatin structures. The results are cell type-specific [36,37]. This makes DNase-seq [38] a powerful and widely used approach in the search for and characterization of TEs and SEs. The identification of other marks is possible using ChIP-seq, a DNA-protein interaction characterization method. It makes it possible to detect common SE nucleosome architecture and TF binding sites [39–41]. SE ChIP-seq studies focus on such marks as H3K4me1, H3K27ac, p300, Med1, BRD4, LSD1, etc. [3,42–44]. DNase-seq and ChIP-seq are the major techniques for SE identification, but there are some other tools that can be used for this purpose. ATAC-seq, a technique used to assess genome-wide chromatin accessibility to transposase, has such advantages as high sensitivity and superior procedure speed [45]. It is also used in SE research [19,46–48]. Another chromatin accessibility estimation method, FAIRE-seq, which is based on formaldehyde crosslinking in vivo [49], is an alternative to DNase-seq and ATAC-seq and is sometimes used for SE validation [50]. There are also methods to measure noncoding RNA, including eRNA, such as GRO-seq [51,52]. This is another potential way to search for SEs and study their biology [53,54].
Many authors study the functioning of super-enhancers by subjecting them to certain manipulations, including DNA editing, the downregulation of certain genetic elements, or the inhibition of proteins closely connected with SE functioning. Deletions of SEs or some of their parts help to estimate their role in epigenetic regulation and are performed using, for example, in vitro or in vivo CRISPR/Cas editing [46,55–59] or Cre/Lox recombination [55]. The results of such manipulations are measured using the aforementioned techniques or through the estimation of expression levels at the RNA level (microarrays, qRT-PCR, RNA-seq, etc.) or at the protein level (reporter assays, ELISA, Western blot, flow cytometry, etc.). If the goal is to influence cancer cells, the effects can be assessed with microscopy techniques, cell proliferation and migration assays, and measurements of cell metabolic activity (for example, the MTT test).
Another interesting and, in recent years, actively developed direction in the study of super-enhancers is their study in the context of the three-dimensional organization of the genome. Chromatin interaction measurement technologies (3C, 4C, 5C, Hi-C, and NG Capture-C) are used for this purpose. Often, methods of measuring expression are connected to the study of this topic, as the combination of these techniques makes it possible to establish correlations between changes in the structural organization of the genome in different cell types or after a certain type of exposure and levels of gene expression. These studies are aimed at finding answers to questions about how SEs are organized in space and what effect they can have on the expression of distant but spatially close genes.

Three-Dimensional Organization of Super-Enhancers
The functionality of eukaryotic genomes is mostly determined by their 3D organization. Understanding the stochastic expression of large gene cluster subsets dependent on chromatin contacts and 3D regulation has made significant progress [60–62]. Methods for measuring physical proximity between genomic sequences in fixed cells, known as chromosome conformation capture (3C), revealed that chromosomal contacts are organized into submegabase domains of preferentially insulated interactions known as topologically associating domains (TADs).
TADs are primarily formed by nested interactions between convergently oriented binding sites of the DNA-binding protein CTCF, which are established when chromatin-bound CTCF arrests the loop-extruding cohesin complex [63]. Allelic insulation via CCCTC-binding factor (CTCF)-mediated directional looping may be epigenetically regulated by CTCF-binding site (CBS) element methylation [64,65]. CTCF is an eleven-zinc finger (ZF) multivalent regulator of transcription that recognizes numerous motifs using different combinations of its ZFs. It can act as a transcriptional activator, repressor, or insulator protein, thereby preventing enhancers and promoters from communicating. The CTCF protein binds to several thousand genomic loci in a given cell type, the vast majority of which are intergenic, and a subset of these sites overlap with transcriptional enhancers. While bound to chromatin domain boundaries, CTCF can recruit other TFs. CTCF-binding site mapping in various species has revealed that CBS elements are found throughout the genome. CTCF has the ability to directly affect transcription at promoters as well as to orchestrate interactions between regulatory elements and help separate eu- and heterochromatic areas in the genome at more distal sites by acting as a chromatin barrier. For example, in mouse thymocytes, the binding of CTCF and cohesin is highly enriched at enhancer and SE sequences, and cohesin itself facilitates spatial enhancer clustering, according to local and global 3C analyses [66]. Another study expanded the scope of SE functional reach beyond its respective target and past several CTCF sites into a juxtaposed neighborhood. Thus, mutational analysis demonstrated that the Wap SE controls Ramp3 (which plays an important role in cancer development), despite three separating CBSs. Deleting all of them resulted in the elevated expression of Ramp3 in mammary tissue, whereas deleting one distal CBS lowered Ramp3 expression in non-mammary tissues.
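The notion of convergently oriented CTCF-binding sites can be made concrete with a short sketch that enumerates candidate loop anchors on one chromosome. It assumes the usual convention that a convergent pair consists of an upstream site with its motif on the "+" strand and a downstream site with its motif on the "-" strand; the input format and the optional `max_span` limit are hypothetical, and the sketch says nothing about which candidate pairs actually form loops in a given cell.

```python
def convergent_pairs(sites, max_span=None):
    """List (upstream, downstream) positions of convergent CBS pairs.

    `sites` is a list of (position, strand) tuples on a single chromosome.
    A pair is convergent when the upstream motif is on '+' and the
    downstream motif is on '-', so that the two motifs face each other.
    """
    ordered = sorted(sites)
    pairs = []
    for i, (left_pos, left_strand) in enumerate(ordered):
        if left_strand != "+":
            continue
        for right_pos, right_strand in ordered[i + 1:]:
            if max_span is not None and right_pos - left_pos > max_span:
                break  # sites are sorted, so all later ones are farther away
            if right_strand == "-":
                pairs.append((left_pos, right_pos))
    return pairs
```

For sites at 100 (+), 500 (-), 900 (+), and 1500 (-), the candidate pairs are (100, 500), (100, 1500), and (900, 1500); restricting the span to 1000 bp drops the longest one.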
Although these CTCF sites, including the loop anchor, do not prevent the super-enhancer from being activated, the effects of their deletion from the mouse genome demonstrate their ability to suppress gene activation [67]. The idea is supported by the data obtained in the study of the Prdm14 SE in mouse embryonic stem cells (mESCs). The enhancer insulation proved to not be solely dependent on loop formation between its flanking boundaries. It was also shown that the SE activated the Slco5a1 gene beyond its prominent domain boundary. Thus, the loop extrusion model is complemented by the fact that cohesin loading and extrusion trajectories originating at an enhancer contribute to gene activation [68]. Additionally, computational simulation in silico and genetic deletion in vivo revealed that tandem-arrayed CBS elements ensure balanced usage of associated promoters in specific and equal spatial chromatin contacts in general. Thus, studies of Pcdh, β-globin, and Igh SE clusters in HEC-1-B, K562, and Neuro-2A cells suggested that directional CTCF chromatin looping between convergent CBS elements underlies insulator function and balanced promoter accessibility at tandem CTCF sites in 3D genome folding and regulation [55,69]. Recent research indicates that the transcription apparatus is compartmentalized and concentrated at SEs [70,71], resulting in phase-separated condensates that drive the expression of cell identity genes. Recently, it was demonstrated that CTCF is required for RNA polymerase II (Pol II)-mediated chromatin interactions at SEs, which appear as hyperconnected spatial clusters. Moreover, CTCF clustering is independent of liquid-liquid phase separation (LLPS) and is resistant to transcriptional perturbation, and it might have an instructive role and act as an architectural prerequisite in the formation of transcriptional condensates [72].
A more detailed study of CBS clustering reveals the existence of so-called "persistent" or constitutive CTCF sites in eight different cell types (GM12878, K562, HeLa, IMR90, HUVEC, NHEK, HMEC, and KBM7), which are enriched at TAD boundaries. The CRISPR-Cas9 deletion of two persistent CTCF sites at the boundary between a long-range epigenetically active (LREA) and a silenced (LRES) region within the Kallikrein (KLK) locus (a deregulated region of interest in prostate cancer) resulted in the concordant activation of all eight KLK genes within the LRES region.
CTCF genome-wide depletion alters TAD structure (including TADs merging), meaning that higher-order chromatin structures are supported by groups of essential constitutive CBSs [73].
Another TF mediating SE spatial activity, Aire, is the driver of self-reactive thymocyte clonal deletion and the generation of perinatal regulatory T cells. Aire controls immunological tolerance by driving the promiscuous expression of a large swath of the genome in medullary thymic epithelial cells (mTECs). Thus, ex vivo Hi-C experiments on mTECs from murine thymi together with biochemical and in vivo loss-of-function analyses confirmed that Aire regulates chromatin looping by evicting CTCF from domain boundaries and favoring the accumulation of cohesin on super-enhancers [74].

SEs and Transcription Regulation in Different Cell Types
With the knowledge of constitutive CBSs, given that SEs have varying sensitivity to chemical inhibitors, as discussed later in this review, and given that super-enhancers of many genes act with functional redundancy [75], it is reasonable to assume that only specific groups of SEs have high sensitivity to perturbations. In this case, there is a certain subset of SEs that function constitutively, providing a powerful level of background transcription for highly expressed genes. Among the first to hypothesize and show the constitutive activity of SEs were Jung and colleagues in 2019 [31], who called such SEs "common". They analyzed 30 human cell and tissue types in total, including 5 comprising H1 ESCs and their derivatives (mesendoderms, mesenchymal stem cells, neuronal progenitor cells, and trophoblasts), 8 immortalized cell lines (with a focus on the colon cancer cell line HCT-116), and 17 postmortem tissue types stemming from the circulatory, digestive, endocrine, and other systems. So-called "super-enhancer domains" (with an average length of 32 kb compared to 26 kb in individual SEs) were defined with the ROSE algorithm based on H3K27ac signals. The genes associated with common SE domains showed universally high expression; 82.4% of them were non-housekeeping genes that were GO-enriched for cell motility, transcription, and cell proliferation regulation, thereby implying the existence of distinct biological functions within common SE domains. Common SE domains showed higher conservation rates and low context-dependent tolerance scores with a surprisingly low correlation with the genes in close proximity, which points to an additional biological role of these domains that, according to the authors, is related to the early establishment of 3D chromatin loops (20 min after cohesin recovery).
The authors speculated on three possible models of the SE domain in chromatin organization: loop recovery acceleration via architectural protein recruitment, a sequential model, and a hierarchical model. However, the hypothesis needs further experimental validation, as the result was obtained only for the HCT-116 cell line.
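The ROSE-style definition of SEs from H3K27ac signal mentioned above can be illustrated with a minimal sketch. It is not the ROSE implementation itself: the function names, the tuple-based input, and the single-chromosome simplification are assumptions made for illustration, and the 12.5 kb stitching distance is the value quoted earlier in this review. Enhancers closer than the stitching distance are merged, the stitched regions are ranked by signal, and the regions lying beyond the point where a line of slope 1 is tangent to the rescaled rank-signal curve are called SEs.

```python
from typing import List, Tuple

# (start, end, H3K27ac signal) on a single chromosome; layout is an assumption
Region = Tuple[int, int, float]


def stitch(regions: List[Region], max_gap: int = 12_500) -> List[Region]:
    """Merge enhancers separated by at most max_gap bp and sum their signals."""
    stitched: List[Region] = []
    for start, end, signal in sorted(regions):
        if stitched and start - stitched[-1][1] <= max_gap:
            prev_start, prev_end, prev_signal = stitched[-1]
            stitched[-1] = (prev_start, max(prev_end, end), prev_signal + signal)
        else:
            stitched.append((start, end, signal))
    return stitched


def call_super_enhancers(stitched: List[Region]) -> List[Region]:
    """Return stitched regions ranked above the slope-1 tangent point."""
    ranked = sorted(stitched, key=lambda r: r[2])  # ascending signal
    n = len(ranked)
    if n < 2 or ranked[-1][2] <= 0:
        return []
    top = ranked[-1][2]
    # With rank and signal both rescaled to [0, 1], the tangent point of a
    # slope-1 line on a convex curve is where (signal - rank) is minimal.
    cutoff = min(range(n), key=lambda i: ranked[i][2] / top - i / (n - 1))
    return ranked[cutoff + 1:]


if __name__ == "__main__":
    enhancers = [(0, 1000, 1.0), (5000, 6000, 2.0), (100000, 101000, 1.0)]
    print(stitch(enhancers))
```

Real pipelines work per chromosome on background-subtracted ChIP-seq read densities and usually exclude promoter-proximal peaks; those steps are omitted here.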
Furthermore, constitutive SEs are described in another CRISPR-based work devoted to the study of SE landscape dynamics during cell differentiation across several lineages. The authors discovered that there are three distinct SE patterns (which are probably SE subtypes): conserved, temporally hierarchical, and de novo. The temporal order of establishment of elements within SEs may be guided by DNA sequences. It has been shown that in differentiated cells both early- and late-emerging enhancers are indispensable for target gene expression, whereas in undifferentiated cells the early enhancers alone are capable of driving it [76].
A comparison of stretch enhancers with SEs revealed differences in active chromatin mark loading: SEs were mostly associated with H3K27ac and RNA Pol II and produced enhancer RNA (eRNA), whereas stretch enhancers were enriched with H3K27me3, depleted of H3K27ac, and did not produce eRNA [24]. In addition, SEs, despite overlapping with a small fraction of stretch enhancers, were more cell type-specific than stretch enhancers. In another study, so-called "differential SEs" were proposed via the computational method DASE, which identifies and summarizes the internal dynamics of SEs. The authors categorized differential SEs into five major groups based on their overall activity and structural alterations: overall-change, shortened, hollowed, shifted, and others, which allowed them to link SEs with different numbers of genes and assess their divergent impacts on gene expression. Besides that, DASE demonstrated increased power in identifying cell line-specific SE regulation when applied to similar cell lines [77].
When talking about spatiotemporal SE dynamics, it is important to understand the possible variety of modes of interaction between SEs and their constituent parts. For instance, with the help of an integrated approach including ChIP-seq, scRNA-seq, and scATAC-seq data, the double modality of the murine retinal Vsx2 SE was shown, which reflected distinct developmental stages and cell type activity in vivo [46]. The CapStarr-seq high-throughput method developed for accurate quantification of enhancer activity has helped to identify associations between tissue-specific TF binding complexity and SEs [78]. The enhancer × promoter self-transcribing active regulatory region sequencing (ExP STARR-seq) approach allows for the systematic quantification of enhancer-promoter compatibility in humans, which was shown in K562 cells. The method identifies two classes of enhancers and promoters that exhibit subtle preferential effects. Housekeeping gene promoters contain activating motifs for factors such as GABPA and YY1, which reduce promoter responsiveness to distal enhancers. Promoters of variable-expression genes lack these motifs and respond more strongly to enhancers [79]. Additionally, dissection/reinsertion experiments with well-characterized SEs, such as the erythroid α-globin SE [80,81], would aid in revising current models by which SEs are thought to contact and activate their cognate genes.

SE Conservation in Humans, Mice, and Other Placental Organisms
One of the pending questions about enhancers and SEs in particular is deciphering their evolution and conservation history. It seems quite complicated to show how a single enhancer/SE can be conserved across the animal kingdom, but enhancers and SEs may be as ancient and conserved as the TFs they interact with.
However, a study by Wong et al. demonstrated that in zebrafish and mouse embryos, putative enhancers identified in sponge Amphimedon sp. microsyntenic areas control patterns of cell type-specific gene expression. These sponge enhancers lie in the regions of microsynteny that are orthologous to those found in other metazoans and contain substantial histone H3K4me1 enhancer signals, even though they do not share considerable sequence identity with vertebrates [82]. Previous research found that some enhancers are conserved between humans and mice [83], whereas other enhancers may have been reprogrammed (RPE) as a result of human-mouse speciation [84] or during cancer progression [85,86].
Historically, putative enhancer sequences in humans with higher rates of evolution are associated with so-called human accelerated regions (HARs). These HAR sequences are thought to be responsible for human-specific phenotypic adaptations, as shown with the brain developmental transcription factor neuronal PAS domain-containing protein 3 (NPAS3) [87]. Along with other putative human-specific enhancers discovered in genome-wide surveys, HARs provide a rich dataset for the discovery of the unique gene regulation characteristics that distinguish humans, as it was demonstrated for a set of heart tissue-associated enhancers [88]. Another example is human accelerated non-coding sequence 1 (HACNS1), a positively selected limb enhancer sequence showing evidence of accelerated evolution and acquiring novel TF-binding sites in the human lineage compared to other vertebrates [89]. Transgenic mice with human HACNS1 show increased transcriptional activity compared with the chimpanzee and macaque orthologous enhancers, which is probably due to mutations that prevent repressor binding to the gene in a distinct 81-bp region [90]. Data on noncoding HARs combined with genome-wide analyses of chromatin marks and TF/cofactor binding as well as further experiments on a set of human and chimp enhancer orthologues reveal that the vast majority of enhancers, as well as SEs, can have functional activity in transgenic mice; however, their regulatory effects may differ from one species to another [91]. Emerging evidence suggests that SEs can act as master regulatory hubs in humans and mice, controlling cell identity [92] and pluripotency [93,94]. However, it is unclear whether pluripotency-associated SEs share a deeply common evolutionary origin in mammals.
Extensive comparative epigenomic and transcription factor binding analyses in pigs, humans, and mice proved that SEs evolve rapidly in mammals. In addition, 30 shared transcription factors and BRD4 were found to be conserved activators of mammalian pluripotency-associated SEs in parallel with three pluripotency-associated SEs (SE-SOX2, SE-PIM1, and SE-FGFR1) that are highly conserved in mammals [95]. In SE conservation studies, it is probably more accurate to speak of functional conservation and/or structural conservation in regard to the pool of regulated genes both in studying the evolution of enhancers in mammals [96,97] and in vertebrates in general when, for example, comparing the results to zebrafish [98]. Nevertheless, the genomic distribution of zebrafish TEs and SEs differs from that of mammalian regions. Generally, SEs are more cell- and tissue-specific than TEs and are associated with a conserved set of genes throughout vertebrate evolution. The authors also speculate that SEs are more effective in defining tissue identity than broad H3K4me3 domains [98].
Studying the evolution of SEs should help to correctly adapt animal models of diseases. Thus, in a characterization of cis-regulatory element functions in 12 diverse tissues from four pig breeds, using strategies similar to those used in ENCODE and the Roadmap Epigenomics projects (RNA-seq, ATAC-seq, and ChIP-seq), more than 220,000 cis-regulatory elements were identified in the pig genome. Surprisingly, regulatory elements between human and pig genomes were found to have higher conservation rates than those between humans and mice [99]. Another study focused on the investigation of the Nppa and Nppb genes in the heart (encoding atrial and B-type natriuretic peptides, respectively) and identified an evolutionarily conserved SE required for the spatiotemporal expression pattern and stress induction of Nppa and Nppb [100]. Apart from that, the enhanced knowledge of SEs aids in the investigation of novel aspects of the classical molecular mechanisms of gene expression regulation, such as alternative splicing [101], and of more complex processes, such as DNA repair coupled with oncogenic SE activation [102] and eRNA production, implying its promoter-like activity [103,104]. Notably, SEs seem to be as ancient as the heat-shock (HSP) response and are conserved not only in placental organisms [105,106]. The stress-remodeled yeast nucleome was suggested to bear functional and structural resemblance to mammalian SEs and to be controlled by transcriptional condensates.
In addition, studies show that SEs have different levels of evolutionary conservation when they are unequivocally defined and calculated via different algorithms [2,107]. All methods demonstrate a higher overlap with conserved elements compared to random chance; however, the enhancers detected via eRNA transcription were the most conserved, whereas the enhancers obtained using PTM marks were the least conserved [107].
Taken together, SE evolution is dependent on the organism and cell type; the inability to identify conserved SEs appears to be due to their ability to evolve faster than both the TFs with which they interact and the genes which they regulate. The terms defining SEs are ambiguous and require a complement of experimentally validated facts. To achieve robust and reproducible results, a better understanding of how the various enhancer identification strategies in use today relate to the dynamic activity of gene regulatory regions is required, along with improvements to experimental methods and data analysis and the possible implementation of machine learning approaches [108]. Since the unity of nature and knowledge can be effectively proved "from the contrary", it is important to acknowledge the necessity of the SE concept regardless of whether we want to disprove or prove the existence of super-enhancers as a distinct regulatory element.

The Role of Super-Enhancers in Non-Pathological Processes
Super-enhancers were described as regulators in the fields of stem cell biology, differentiation, and cell-specific gene expression [5]. In this section, we describe the role of SEs during the ontogenesis and growth of healthy tissues.
It is known that maintenance of cell pluripotency requires a huge and complex regulatory network that includes a large number of transcription factors [109,110]. Super-enhancers were described in studies of pluripotency maintenance in mouse embryonic stem cells (mESCs). The main TFs, namely Oct4, Sox2, Nanog, Klf4, and Esrrb, were shown to occupy unusual enhancer domains, which were later called super-enhancers [3]. These transcription factors recruited the Mediator complex to activate the genes responsible for mESC pluripotency. The majority of the analyzed SE-controlled genes encoded such TFs as Oct4, Sox2, and Nanog and thereby formed an autoregulatory expression network. It was also shown that SEs control some other genes associated with pluripotency, such as coactivators, chromatin regulators, and shRNA [3]. There are additional TFs (Nr5a2, Prdm14, Tcfcp2l1, Smad3, Stat3, and Tcf3) that also occupy SEs and contribute to super-enhancer and expression stability [5]. One of the proposed mechanisms of pluripotency regulation by Oct4/Sox2/Nanog (OSN) involves OSN recruitment via Ash2l followed by Ash2l/OSN complex formation at Nanog, Sox2, Oct4, and Jarid2 super-enhancers [111]. Interestingly, the Sox2 super-enhancer is responsible for more than 90% of Sox2 expression, and Sox2 seems to be its only target. The knockout of the Sox2 SE caused dramatic changes in mESC morphology and proliferation, as Sox2 is among the key TFs associated with pluripotency [56,112]. Nanog expression could be regulated by three super-enhancers, −45, −5, and +60, where the upstream SEs (−45 and −5 kb) have stronger effects on Nanog expression. Dppa3 expression is also regulated by the −45 SE, which provides transcriptional repression and chromatin condensation during the production of functional oocytes [113]. Enhancer RNAs produced during SE transcription also take part in its interactions with the Dppa3 promoter region [114].
The obtained data imply that Nanog SEs do not have the same effect on cell fate. Supposedly, this feature is explained by the existence of redundancy among enhancer elements. For example, one could become fully functional only if both the −5 SE and −45 SE are deactivated [88]. For both mESCs and human ESCs (hESCs), BRD4, a member of the BET bromodomain family and a transcriptional regulator, was shown to play an important role in pluripotency maintenance. Its inhibition led to expression changes of genes associated with the identity of ESCs. To determine the mechanism of BRD4 action, ChIP-seq experiments were performed. These experiments showed that BRD4 occupies SEs and has a strong effect on their activity. The level of SE-controlled gene expression relies on the binding of Med-containing complexes, which is also BRD4-dependent [115]. Spt6 is a histone chaperone that disassembles and reassembles H3-H4 dimers [116]. It was shown that Spt6 depletion leads to the downregulation of the pluripotency maintenance genes, as ESC super-enhancers show high enrichment levels of Spt6. In that study, Spt6 controlled the ratio between H3K27 acetylation and methylation [117], which confirmed the importance of super-enhancers in stem cell biology and epigenetics. Finally, it seems important that there are super-enhancers that are connected with pluripotency and highly conserved in mammals. It was shown that the disruption of super-enhancers which regulate SOX2, PIM1, and FGFR1 expression leads to the loss of stem cell pluripotency [118].
Cell- and tissue-specific human SEs are also a common research focus. For example, five (NFIL3, KLF15, RXRA, SNAI2, and BCL6) and three (MEF2A, FLI1, and ETS1) SE-targeted TFs are associated with adipocyte-selective and osteoblast-selective SEs, respectively, forming core regulatory circuitry (CRC, which is a group of interconnected auto-regulating TF-forming loops able to bind to not only their own SEs but also the SEs of other TFs within the loop). Furthermore, the findings show that osteoblast-selective SEs pre-exist in hMSCs, whereas adipocyte-selective SEs are generated only after adipocyte induction [119]. In a similar fashion, SEs stimulate osteogenesis in human bone marrow mesenchymal stem cells (hBMSCs). KEGG analysis of SE target genes before and after osteogenic differentiation showed that the TGF-β, PI3K-Akt, and ECM receptor signaling pathways are highly enriched among them and are therefore closely related to osteogenic differentiation [120]. Two lines of stem cells, mESCs and mouse epiblast stem cells (mEpiSCs), have different SE profiles. A number of SEs become activated in primed pluripotency and stay active in somatic derivatives in contrast with naïve cell lines [121]. However, SEs are not only involved in embryoblast regulatory networks. It was shown that super-enhancers in trophoblast stem cells (TSCs) have an influence on genes that control trophectoderm lineage development and placentation. Trophectoderm-specific master TFs (Gata3, Tead4, and Tfap2c) bind to SEs that control DNA-binding TFs and factors involved in signaling pathways (PI3K-Akt, Hippo, MAPK, etc.) [122]. In addition, SEs control some aspects of the primary germ layers' differentiation process within triploblastic animals. For the ectoderm, for example, corneal epithelial development is controlled not only by TFs but by 1154 SEs as well. These super-enhancers are loaded with ETS transcription factor family members (AP1, KLF, etc.).
They affect the expression of genes such as PAX6, WNT7A, and MIR205HG, which are important for corneal epithelium formation [123]. Another example is CRC-SE, which can be found upstream of the Vsx2 gene and is important for appropriate retinal development. There are Zfhx3, Prox1, Vsx2, Dbp, Hlf, Otx2, Isl1, and Lhx4 binding sites, and four of them (Isl1, Lhx4, Prox1, and Vsx2) showed reduced expression if the SE was experimentally disrupted. The direct CRISPR/Cas9 deletion of some enhancer elements led to SE malfunction and further developmental issues such as eye or retinal size reduction [46]. Regarding the mesodermal germ layer, there are many examples of SEs that can take part in cell differentiation. We combined the results of experimental research groups into a summary table (Table 2).
Table 2 (fragment): Genes associated with response to hypoxia and extracellular matrix organization [132].
Endodermal germ layer differentiation and functionality also reveal a strong association with a group of SEs. In Xenopus tropicalis, endoderm activation-specific TFs (Otx1, Vegt, and Foxh1) bind to super-enhancers located near the endodermal cell fate genes (pnhd, foxa2, foxa4, sox17, frzb, gata6, hhex, and admp), thereby activating the target genes' expression [133].
In adult organisms, SEs are also involved in tissue-specific cell differentiation. For example, regulatory T cells become a functionally mature subpopulation only under the control of the Treg super-enhancers. These SEs are activated by Satb1 and regulate the Foxp3, Il2ra, and Ctla4 Treg cell signature genes [134]. There are a number of well-studied tissue-specific super-enhancers. Some of these studies aid in understanding the mechanisms of SE functionality. In vivo experiments performed on murine models were conducted to study the α-globin super-enhancer. Four enhancer elements within this SE (R1-4) showed high Med1 signals and were evolutionarily conserved. Deletions of these parts showed that only two of them (R1 and R2) had a significant effect on erythropoiesis in vivo. No synergistic effect was observed; therefore, it seems that enhancers within the SE act independently and mostly in an additive manner. The authors suggested considering such cases as individual enhancers rather than as higher-order structures [19]. The question of the roles of individual enhancer elements within SEs was also studied in Ly6Clow monocytes. The TF Nr4a1 is the master regulator characteristic for the SE, which has three conserved parts: E2, E6, and E9. Only E2 was shown to be essential for Ly6Clow monocyte development and cell growth. It is possible that E6 and E9 can regulate Nr4a1 expression in some other conditions, but their primary effect and importance are not yet clear [135]. There is a complex experimental study dedicated to the role of super-enhancers during mammary gland cells' ontogenesis. Here, researchers paid special attention to the Wap-associated SE. Whey acidic proteins (WAP) are highly expressed in mammary tissue and are activated during pregnancy. The Wap SE consists of three enhancer elements (E1, E2, and E3). At mid-pregnancy only E1 is fully occupied by the regulators, but between the 14th and 16th days of pregnancy, E2 and E3 are also activated.
This leads to a 100-fold increase in Wap expression. Mutagenesis of the E1, E2, and E3 enhancer elements revealed significantly different in vivo effects; that of E3 was the most influential. E2 and E3 had additive interactions. The simultaneous inactivation of all three enhancers caused a 1000-fold decrease of Wap expression. That study, in spite of the additive model, testified to the existence of an inner hierarchy of enhancer elements within the SE [136]. Further research on this super-enhancer showed that E1 is not required for E2 and E3 activation at terminal differentiation stages during lactation. In addition, when transferred to permissive chromatin, the Wap SE retained mammary specificity, which indicates the weak regulatory potential of chromatin in mammary gland cell specificity [137]. Super-enhancers also control neuroplasticity, as the NMDA response involves SE reorganization and is associated with genes of neuronal identity (Ncam1, Mapt, Rbfox3, etc.) and neuronal activity (Fos, Per1, Ephb2, etc.) [50]. With the help of the rank ordering of SEs, epigenetic rewriting, and enhancer deletion analysis, renin cells, crucial for survival and homeostasis, were found to harbor a unique set of SEs that determine their identity, including the classical renin enhancer and Rbpj-enhancer. That study compared renin-phenotype cells at different physiological states. A comparison of normal unstressed JG cells, chronically recruited renin-null cells, acutely recruited cells from mice subjected to sodium depletion plus captopril, and constitutively active As4.1 cells revealed a notably reproducible pattern of open chromatin via ATAC-seq [138].
Interesting patterns may be associated with eRNA. eRNA expression and stability are considered not to be strong enough for regulation in trans. The higher transcriptional activity of TE compared with SE suggests that some of them could produce the non-coding RNA involved in trans-regulation. For example, alncRNA-EC7 (or Bloodlinc), which is transcribed from an SE needed for the expression of BAND3/SLC4A1, an erythroid membrane transporter, is also trans-acting and regulates about 500 target genes. Bloodlinc also interacts with TFs and chromatin-organizing factors [139].
The use of SEs for cell identity profiling and prediction may be limited in terms of sensitivity and specificity, and it needs to be complemented with promoter-centric data when analyzing gene clusters [140].

Super-Enhancers and the Development of Diseases
Because super-enhancers take part in the control of such important processes as the maintenance of stem cell pluripotency, differentiation, and cell-specific gene expression, abnormal changes in their functioning can lead to pathological processes. It is known that there are SEs associated with tumorigenesis, developmental disorders, neurodegenerative processes, and some other diseases [5,26].
SEs are important regulators, and their activity modulation could cause pathological changes. Approximately 3000 SEs were shown to be cell-specific in microglia, neurons, and oligodendrocytes. A number of them harbored GWAS risk variants for Alzheimer's disease. Some GWAS variants were found within super-enhancers and affected Alzheimer-associated gene expression [141]. In addition, mice with Huntington's disease had mainly decreased expression of neuron-specific genes regulated by super-enhancers that were poorly enriched with RNAPII and H3K27ac histone modifications [142].
The research on super-enhancers aids in understanding and explaining aspects of complex inflammation mechanisms. SE-mediated transcription is an important part of inflammatory pathways and is connected with inflammatory disorders. NF-κB participates in the activation of endothelial enhancers and further proinflammatory activation and SE formation. These processes lead to global changes in the BRD4 landscape and the expression of endothelial proinflammatory factors such as TNF-α. Cytokine expression causes the rapid loss of non-inflammatory SEs, pushing cells and tissues into a pro-inflammatory metabolic state. In vivo BET bromodomain inhibition causes the suppression of atherogenesis [143]. A similar mechanism was shown for human Simpson-Golabi-Behmel syndrome adipocytes. TNF stimulation is followed by the loss of cell type-specific SEs and the formation of NF-κB-bound SEs. As a result, cell identity genes are repressed, and proinflammatory processes are activated [144]. It turned out that the demethylation of H3K9 and H3K27 by the demethylases KDM7A and UTX is important for NF-κB signaling and the adhesion of endothelial cells during pro-inflammatory responses [145]. Th9 cell-mediated allergic inflammation is associated with the IL-9 SEs. The selective knockdown of BRD4 and Med1 along with inhibition via JQ1 all lead to SE disruption, anti-inflammatory IL-9 suppression, and, subsequently, the arrest of Th9 differentiation under additional OX40 stimulation [146]. SEs associated with juvenile idiopathic arthritis (JIA) are enriched with ETS- and RUNX1-binding motifs. JQ1 treatment influences JIA-associated gene expression, inhibiting disease-associated processes [147]. In some T-helper subsets (Th1, Th2, Th17), genes associated with cytokines and their receptors were mainly linked to super-enhancers. A number of immune-related diseases are connected with variations in BACH2 loci under SE regulation.
In addition, it was shown that cytokine signaling blockers, which are clinically effective in autoimmune diseases, predominantly affect genes associated with SEs [23]. A differentially methylated region (DMR) is a region in the genome that has different methylation patterns among samples [148]. For atherosclerosis, it was estimated that hypermethylated DMRs often overlap super-enhancers. After the vascular smooth muscle cells transdifferentiate, the disease-associated and differentially methylated regions are common entities found within SEs characteristic of tissue-specific monocyte cells. In addition, elastin, whose expression is relevant to atherosclerosis, is significantly downregulated due to aorta-specific SE hypermethylation [149].
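The statement that hypermethylated DMRs often overlap super-enhancers comes down to an interval-overlap count. The following is a minimal sketch of that calculation, assuming half-open coordinates on a single chromosome; the function names and the toy coordinates are hypothetical and are not taken from the cited study.

```python
from typing import List, Tuple

# Half-open interval [start, end) on one chromosome; layout is an assumption
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(dmrs: List[Interval], ses: List[Interval]) -> float:
    """Fraction of DMRs that overlap at least one super-enhancer."""
    if not dmrs:
        return 0.0
    hits = sum(1 for dmr in dmrs if any(overlaps(dmr, se) for se in ses))
    return hits / len(dmrs)


if __name__ == "__main__":
    toy_dmrs = [(100, 200), (500, 600), (900, 950)]  # toy coordinates
    toy_ses = [(150, 400), (940, 2000)]
    print(fraction_overlapping(toy_dmrs, toy_ses))
```

For genome-scale data, a sorted sweep or an interval tree would replace the quadratic scan, and the observed fraction would be compared with a background of shuffled regions before calling the overlap frequent.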
Genome-wide association studies were conducted to study the role of SNPs in SE regions in type 2 diabetes. The identified set of genes and the SNPs within them were strongly associated with type 2 diabetes through glucose homeostasis, the G-protein coupled signaling pathway, the WNT signaling pathway, the negative regulation of inflammatory response, the positive regulation of lipid metabolism, etc. [150].
It has been demonstrated that particular SNPs and insertions/deletions inside super-enhancers not only correlate with different genetic illnesses but also have a high correlation with oncogenic expression and cancer in general [5]. Super-enhancers are crucial for the development of hematological malignancies. According to a number of studies, chromosomal translocations are the most typical mutation associated with hematological malignancies and the leukemogenesis process. For instance, a specific translocation, t(6;8)(p21;q24), was discovered in a blastic plasmacytoid dendritic cell neoplasm. It led to a potent RUNX2 super-enhancer interaction with the MYC promoter, allowing both oncogenes to be expressed simultaneously [151]. An inversion connected to the overexpression of the EVI1 gene was discovered in another investigation. The distant GATA2 super-enhancer was relocated upstream of EVI1 and boosted its expression as a result of this inversion [152]. Furthermore, multiple myeloma is usually linked to MYC and MYB overexpression caused by the translocation of distant SEs to the promoter regions of these oncogenes [153,154].
In addition, super-enhancers should be taken into account as a component of a complex pluripotent transcriptional regulatory network, according to some research that uses the Core Regulatory Circuitry (CRC) paradigm [155]. One of the most crucial elements of a specific cancer type's CRC, once it has been established, is the interaction between its transcription factors and SEs, which ordinarily do not have such a regulating influence. In other words, it is an integrated auto-regulatory loop that forms this kind of SE. For instance, SEs that upregulate MYC, JUNB, and FOSL1 have shown high cross-binding to the SE areas of other TFs that are crucial for ESCC progression during the cell cycle [156,157,158,159]. The OCA-B oncogene is upregulated by BRD4-loaded super-enhancers that interact with POU2AF1 loci and cause the development of diffuse large B-cell lymphoma (DLBCL). A total of 285 genes with the most BRD4-loaded super-enhancers showed considerable downregulation as well as a potent anti-DLBCL effect after DLBCL was treated with BRD4 inhibitors [160].
According to a study conducted by Betancur P. and colleagues, SEs significantly upregulate the CD47 protein, which is highly expressed in T-ALL and breast cancer cells. Namely, a collection of particular SEs responsible for CD47 expression were found in breast cancer cell lines (HER2 or ER+ PR+) and breast tumors (ER+ PR+). However, when several cancer cell lines were tested, the evaluated SE sets were not consistent between them. It was demonstrated that each type of malignant cell has a unique core of super-enhancers that upregulate a unique set of oncogenes. By comparison, human mammary epithelium and CD3+ T cells both displayed significantly lower levels of CD47 expression and SE profiles that differed from those of tumor cells. Accordingly, SEs of CD47 discovered in cancer cells were accompanied by newly generated super-enhancers, increasing the likelihood of high CD47 expression. This was proven using H3K27ac analysis and the treatment results of BRD4 inhibitors. Additionally, decreased CD47 expression and increased phagocytosis frequency in MCF7 cells occurred after inhibiting the TNF pathway with infliximab monoclonal antibodies [161].
According to Chen H. and Liang H.'s work, the expression of super-enhancer RNA and CpG methylation both play crucial roles in the development of TCGA melanoma and other malignancies. Super-enhancer areas (377 Mb) containing >300,000 eRNA loci were identified, which allowed the nuances of SE activation to be assessed using RNA-seq data. The Cancer eRNA Atlas was constructed in order to organize and facilitate the usage of the collected data. This online resource [162] offers information on SEs' eRNA expression profiles and methylation metrics for a variety of cancer samples as well as for normal cell lines in addition to annotations of the analyzed SEs. Analysis of the function of related SEs during the formation of ovarian tumors became essential after high levels of BRD4 TF expression were found to be distinctive in ovarian cancer patients and to influence the success of therapy. CRISPRi, CRISPR-KO, and Hi-C were utilized by Kelly R. and colleagues to characterize the regulatory role of ovarian cancer super-enhancers. Among the 86 most active SEs identified using CRISPRi, SE60 and SE14, which were the first two candidates to modify key genes involved in the development of ovarian cancer, attracted particular attention. These two targets were chosen for additional CRISPR knockdown, a technique that made it possible to determine the role of SEs in the regulation of genes expressed during quiescence, metastasis, and invasion (e.g., RAE1 and EPHA2). The acquired results were confirmed with Hi-C tests to support the typical local regulatory role of the investigated super-enhancers and to provide confidence in the functional gene annotation [163]. An earlier ovarian cancer study revealed a substantial decrease in ALDH1A1 expression following BET inhibitor administration (e.g., JQ1). The suppression mechanism was found to involve repressing its own seRNA production and SE disruption in response to BRD4 inhibition.
The use of BET inhibitors may be a promising therapeutic approach when ALDH is linked to resistance and ovarian tumor relapse [164].
A specific super-enhancer had an unexpected regulatory role in head and neck squamous cell carcinoma (HNSCC), according to a recent study by Wan Y. and colleagues. MiR-21-5p is a well-known oncomiR that has been demonstrated to encourage the development of malignant cancer [165]. Binding sites for NFI, SRF, p53, STAT3, AP-1, and other repressors or activators are abundant in the promoter region of the MIR21 gene, according to several studies [166,167,168]. In this study, it was discovered that the AP-1 member FOSL1 upregulates the expression of MIR21. Further research showed that FOSL1 was enriched at the newly created MIR21 super-enhancer. Because MIR21 expression significantly decreased after JQ1 therapy or FOSL1 expression knockdown in HNSCC cells [169], the hypothesis of FOSL1-reliant miR-21-5p production was verified.
Thus, super-enhancers are crucial regulatory elements because they modulate cell fate during non-tumor pathological processes as well as those in cancerous cells. Several studies have been conducted in order to explain the concept of super-enhancers; other groups experimentally showed the role of SEs during various states.

Inhibitors of the SE-Mediated Transcription Positive Regulators
The important role of super-enhancers in determining cell fate and in the development of various diseases, including tumors, suggests the idea of targeting super-enhancers' regulatory elements, such as TFs and Co-TFs, during prospective therapy. Their inhibitors can be used both for research purposes, e.g., studying the biology of super-enhancers, and for practical purposes, for example, the treatment of socially significant diseases. In this section, we review experimental works suggesting therapies for super-enhancers and their proposed inhibitors (Figure 3). Because the goal is to block transcription, it is logical to target the proteins that affect the function of RNAPII. That is why the most popular targets are BRD4, a chromatin reader and activator of RNAPII transcription at active chromatin marks [170], and CDK7, a cyclin-dependent kinase involved in the regulation of RNAPII-activating phosphorylation [171,172]. Combination therapy is also a promising strategy that has gained popularity in recent years.
Here, we review SE inhibitors with different target proteins and different mechanisms of action. Some of them are SE-specific, whereas others have broad transcriptional effects but mainly affect SE function. These compounds have in common that they act on key elements involved in the formation and functioning of super-enhancers and therefore significantly change the super-enhancer profile. Next, we consider the experimentally studied groups of super-enhancer inhibitors and their specifics.

Selective Inhibitors of BET Bromodomains
Bromodomains (BRDs) are protein interaction modules that specifically recognize ε-N-lysine acetylation motifs. They play an important role in reading epigenetic marks. BRDs are evolutionarily conserved and present in diverse nuclear proteins, including those in the BET family. BET proteins have two amino-terminal BRDs that bind to hyperacetylated promoter or enhancer regions [173]. Of particular interest in the context of super-enhancer research is BRD4, the best-characterized member of the BET family. It is one of the master regulators of SEs [71] and a potential therapy target. BRD4, like Med1, can be used for SE annotation because SEs are especially enriched with it [3,5,160]. Med1 and BRD4 were shown to play an important role in the formation of condensates at SEs, as described by the phase separation model [11,71]. Inhibitors of BET proteins (BETi) targeting one or both bromodomains can be an important step towards suppressing oncogenic networks in tumors, and super-enhancers are particularly sensitive to therapy with these inhibitors [9]. JQ1 is a well-known BET bromodomain inhibitor [174]. Although BRD4 is involved in more than the regulation of super-enhancers, genes controlled by super-enhancers appear to be particularly sensitive to JQ1 treatment [9]. Potentially, JQ1 can be used for the therapy of different diseases, including atherosclerosis [143], primary effusion lymphoma [175], skin cancers [176], ovarian cancer [164], colorectal cancer [177], osteosarcoma [178], breast cancer [179], brain tumors [180], prostate cancer [181], etc. JQ1 is not the only known BET inhibitor. I-BET, a benzodiazepine derivative, disrupts chromatin complexes responsible for the expression of key inflammatory genes in activated macrophages, and it confers protection against lipopolysaccharide-induced endotoxic shock and bacteria-induced sepsis [182]. Some BETi were shown to inhibit SEs in tumors.
A number of researchers have shown that cancer cells can be resistant to BETi therapy. The causes of this resistance vary between conditions. One reason can be BRD4 recruitment to chromatin in a bromodomain-independent manner [183]. Another mechanism of resistance is connected with the suppression of the PRC2 complex, which involves the activation and recruitment of WNT-signaling components to compensate for the loss of BRD4 and drive resistance in various cancer models [175]. A different mechanism was shown for breast cancer. PELI1 is upregulated in breast carcinomas, and it destabilizes LSD1. This leads to the decommissioning of the BRD4/LSD1/NuRD complex and to a lack of response to JQ1 treatment [42]. There are other serious problems with JQ1 therapy. It was shown to reversibly block the production of IFN-γ, which is itself regulated by SEs. This indicates that BET inhibitors may disrupt the functions of the innate and adaptive immune response [184]. Possible strategies to overcome these difficulties include choosing other therapy targets or using combination therapy.

Selective CDK7 Inhibitors
CDK7 is a cyclin-dependent kinase that plays an important role in the regulation of RNAPII phosphorylation [171,172]. Targeting CDK7 is another way to reduce RNAPII-mediated gene transcription. SEs are enriched with the elements of the transcriptional machinery, and SE-driven transcription is especially sensitive to CDK7 inhibitors, being downregulated even at low inhibitor concentrations [185][186][187][188].
The most popular CDK7 inhibitors are THZ1 [185] and THZ2 [189]. THZ1 is a phenylaminopyrimidine that targets a remote cysteine residue located outside of the canonical kinase domain. It is selective for CDK7 and shows an IC50 of less than 200 nM [185]. It was shown that CDK7 inhibition suppresses SE-linked transcription in MYCN-driven tumors [188]. There are a number of examples of using THZ1 for SE inhibition to prevent cancer progression. In some cases, for example, in mantle cell lymphoma and double-hit lymphoma, THZ1 helps to overcome tumor resistance to other drugs [190]. THZ2 is a modified THZ1 with altered regiochemistry of the acrylamide (from 4-acrylamide-benzamide to 3-acrylamide-benzamide). It also selectively targets CDK7 and has improved pharmacokinetic features, including a fivefold longer half-life in vivo [189]. Like THZ1, THZ2 has attracted research interest as a potential SE inhibitor.

Histone Deacetylase and Demethylase Inhibitors
SEs are typically enriched in post-translational histone marks such as acetylation at H3 lysine 27 (H3K27ac) and mono-methylation at H3 lysine 4 (H3K4me1) [5]. This suggests that altering such histone modifications can affect the biology of super-enhancers, primarily their interaction with epigenetic readers. Thus, histone deacetylases and histone demethylases can become potential targets for inhibitors. Histone deacetylase inhibitors (HDACIs) can be rather effective. Because histone acetylation has an important epigenetic role, it is natural that HDAC inhibition leads to a wide range of effects and global responses. The inhibition leads to increased pausing of RNAPII and loss of H3K27ac at enhancers. Importantly, HDACIs preferentially suppress SE-driven transcripts and reduce RNAPII signals on SEs [191].
For example, it was shown that the treatment of transformed cells with largazole leads to the remodeling of the enhancer structure by modulating H3K27ac in a dose-dependent manner. It also preferentially suppresses SE-driven transcripts that are associated with oncogenic activities [191]. In rhabdomyosarcoma, hyperacetylated histones spread and disrupt the three-dimensional organization of SEs upon HDAC inhibition with entinostat. As a result, a decrease in the expression of SOX8, MYOD1, MYOG, and MYCN was observed [192]. HDAC1 and HDAC7 are also important for the maintenance of cancer stem cells (CSCs) in breast and ovarian tumors. Entinostat represses HDAC1 and HDAC7 in breast cancer (BrCa) cells with a stem-like phenotype, and the treatment reduces H3K27ac at stem cell transcription factor genes including c-MYC, VDR, RB1, EZH2, c-JUN, HOXA2, and HOXA10 while increasing H3K27ac globally [193]. The effect of HDACi depends on the class of the target HDAC protein. HDACs are primarily divided by their dependence on either NAD (Class III) or zinc (Class I, IIa/b, and IV) to catalyze deacetylation. Class I HDAC enzymes (HDAC1, 2, 3, and 8) are related to the regulation of transcription. Using PAX3-FOXO1 fusion oncogene-positive rhabdomyosarcoma as a model system and a panel of HDAC inhibitors with diverse and well-characterized isoform selectivity, it was shown that Class I HDAC inhibitors were the most effective, followed by Class IIa (HDAC4/5/7/9), Class IIb (HDAC6/10), Class III (SIRT proteins), and Class IV (HDAC11) [194].
Another example is the sensitivity of PAX8, a prototype lineage-survival oncogene in epithelial ovarian cancer, to HDAC inhibition, which acts through the perturbation of the super-enhancer topology associated with the PAX8 gene locus. Class I HDAC inhibitors such as panobinostat, romidepsin, and entinostat disrupt PAX8 transcription [195]. Panobinostat, romidepsin, and vorinostat can influence the Warburg effect by disrupting super-enhancers related to such genes as MYC, hexokinase 2 (HK2), GAPDH, and enolase 1 (ENO1) in glioblastoma models. These compounds can be used to reprogram glioblastoma's central carbon metabolism, which is controlled by SEs [196]. Among the histone demethylases, the most popular target is LSD1 (lysine-specific demethylase 1), which regulates gene expression by affecting histone modifications. Unlike the previously discussed inhibitors, LSD1 inhibitors do not suppress the activity of super-enhancers but instead remove LSD1 repression from them. For example, the LSD1 inhibitors NCD25 and NCD38 inhibited the growth of MLL-AF9 leukemia as well as erythroleukemia, megakaryoblastic leukemia, and myelodysplastic syndromes. The activated super-enhancers regulate myeloid differentiation, and the hematopoietic regulators that are abnormally silenced by LSD1 have anti-leukemic effects [197]. The activation of GFI1-SE during treatment with the same inhibitors leads to the differentiation of erythroleukemia cells [198]. Another example of demethylase inhibition is KDM6 histone demethylase inhibition by GSK-J4 in colorectal cancer. The treatment reprogrammed enhancers, and SE-associated genes responded in a more sensitive manner. This weakened the malignant phenotypes of cancer cells [199].

Other Potential Inhibitors and Their Targets
In recent years, some key downstream molecules and pathways of SEs involved in various tumors have attracted more and more attention. New strategies for influencing super-enhancers are developing, and they have great potential.
Predominant downstream signaling pathways, for example, the MAPK/ERK pathway, play an important role in cell growth and proliferation during normal cell development and cancer progression [200]. The MEK kinase (MAP kinase kinase) is one of the key elements of this pathway. It phosphorylates and activates mitogen-activated protein kinase (MAPK). MEK is one of the most popular targets for MAPK/ERK pathway inhibition [201]. Influencing this signaling pathway may lead to the remodeling of super-enhancers. It was shown that in rhabdomyosarcoma, MEK inhibition with trametinib causes the loss of ERK2 at the MYOG promoter as well as the transcriptional delay of MYOG expression. MYOG opens up the chromatin and establishes SEs at genes that are responsible for late myogenic differentiation [202,203]. The potent and selective MEK inhibitor AZD8330 affects the viability of glioblastoma stem cells at low micromolar doses through SE remodeling [204]. The JAK-STAT pathway also plays a major role in cell growth, differentiation, apoptosis, and other integral cellular functions [205]. SEs are selectively targeted by the JAK (Janus kinase) inhibitor tofacitinib, which can help in the treatment of immune-mediated disorders, including rheumatoid arthritis [23]. The Mediator-associated kinases cyclin-dependent kinase 8 (CDK8) and CDK19 prevent the increased activation of key super-enhancer-associated genes in acute myeloid leukemia (AML) cells. Cortistatin A, which inhibits Mediator kinases, has antitumor effects in vitro and in vivo [206]. Some of the drugs that can potentially influence SEs are specific to the tumor type. As an illustration, NR4A nuclear receptors are known to be tumor suppressors in AML. The drug dihydroergotamine (DHE) interacts with NR4A nuclear receptors and represses the transcription of a subset of SE-associated leukemic oncogenes [207].

Combined Treatment Strategies
Combined therapy strategies have a complex effect on super-enhancers, and paired inhibitors can act synergistically. In addition, combining inhibitors reduces the likelihood of drug resistance. One of the most popular combinations is that of BET inhibitors paired with CDK7 inhibitors. Such therapy has shown increased efficacy in some tumor types.
BET bromodomain inhibitors are also combined with the inhibitors of other kinases. BRD4 is an important player in MYC SE transcription regulation, and cyclin-dependent kinases, especially CDK1 and CDK2, stabilize MYC phosphorylation. Suppression via the combined targeting of MYC expression and stabilization using BET bromodomain inhibition (JQ1) and CDK2 inhibition (milciclib), respectively, is a promising strategy to treat medulloblastoma [208]. BET bromodomain inhibitors were also used together with CREB-binding protein (CBP) inhibitors in diffuse intrinsic pontine glioma cells. CBP is an acetyltransferase associated with the acetylation of histones at SEs. A combination of JQ1 (a BET bromodomain inhibitor) and ICG-001 (a CBP inhibitor) led to strong cytotoxic effects [209]. The repression of c-MYC is an important task in the therapy of many cancer types, including colorectal cancer. Its overexpression due to hyperactive WNT/β-catenin/TCF signaling is one of the key phenomena in tumorigenesis. JQ1 suppresses c-MYC transcription, and trametinib, a MEK/ERK pathway inhibitor, affects post-translational mechanisms. The dual targeting of BET proteins and MAPK signaling shows synergistic effects as well [177]. In B-cell lymphoma, the BET inhibitor OTX015 was combined with the PI3Kα-selective inhibitors alpelisib and CYH33 and the PI3Kδ-selective inhibitor idelalisib. The combination of CYH33 with OTX015 was especially successful and resulted in a decrease in acetylated H3 bound to the promoter and the super-enhancer region of c-MYC, the phosphorylation and proteasomal degradation of the histone acetyltransferase p300, and finally, cell cycle arrest and apoptosis [210]. It is known that patients with neuroblastoma that overexpresses the N-Myc oncoprotein have a very poor prognosis. THZ1 suppresses super-enhancers regulated with the participation of CDK7, including the one that controls N-Myc transcription. Drugs such as ponatinib and lapatinib are tyrosine kinase inhibitors (TKIs). Tyrosine kinase performs c-Myc and N-Myc protein dephosphorylation and N-Myc stabilization. TKIs with THZ1 synergistically induce MYCN-amplified neuroblastoma cell apoptosis [211]. Another strategy is to co-target SEs with THZ1 and the anti-apoptotic proteins Bcl-2 or Bcl-xL with BH3-mimetics (ABT263, WEHI-539, ABT199, etc.). This combination showed a strong cell-killing effect in glioblastoma [212]. HDAC inhibition, for example, with panobinostat, can be successfully combined with either JQ1 (a BET bromodomain inhibitor) or THZ1 (a CDK7 inhibitor). In the case of diffuse intrinsic pontine glioma, both strategies were beneficial, and the pairs of inhibitors showed synergy [213]. Moreover, panobinostat and THZ1 were also shown to be effective in neuroblastoma treatment, reducing JMJD6, E2F2, N-Myc, and c-Myc [214]. HDAC inhibitors, as mentioned before, influence the Warburg effect and reprogram the metabolism of glioblastoma cells. This results in the engagement of oxidative phosphorylation driven by elevated fatty acid oxidation (FAO). FAO inhibitors such as etomoxir influence lipid metabolism. A combination treatment with HDAC and FAO inhibitors improved animal survival in xenograft models [196].
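
Synergy between two inhibitors is usually quantified against a reference model of non-interaction. As a minimal sketch, assuming the Bliss independence model (one common reference model; the studies cited above may have used other metrics, such as a combination index), the effect expected from two independently acting drugs can be compared with the observed combined effect. The function names and the numerical readouts below are hypothetical and serve only as an illustration.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected fractional inhibition if the two drugs act independently.

    Effects are fractions of inhibition in [0, 1], e.g. 0.30 = 30% loss of viability.
    """
    for effect in (effect_a, effect_b):
        if not 0.0 <= effect <= 1.0:
            raise ValueError("effects must be fractions between 0 and 1")
    return effect_a + effect_b - effect_a * effect_b


def bliss_excess(effect_a: float, effect_b: float, observed_combo: float) -> float:
    """Observed minus expected effect; > 0 suggests synergy, < 0 antagonism."""
    return observed_combo - bliss_expected(effect_a, effect_b)


# Hypothetical readouts (not taken from any cited study):
# drug A alone inhibits 30%, drug B alone 40%, the pair together 75%.
expected = bliss_expected(0.30, 0.40)      # 0.30 + 0.40 - 0.12, i.e. about 0.58
excess = bliss_excess(0.30, 0.40, 0.75)    # about 0.17, i.e. more than additive
print(f"expected={expected:.2f} excess={excess:.2f}")
```

A positive excess on its own is only suggestive; in practice the comparison is made across a dose matrix with replicates before a pair is called synergistic.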

Conclusions
In this review, we discussed the current knowledge on experimental validation and prediction of super-enhancers (SEs), a topic of ongoing discussion in the scientific community. First, we examined the emergence of the concept of "super-enhancers". Despite little consensus on how to define SE regions, both in terms of their calculation and theoretical conception, super-enhancers are characterized as separate objects of research, comprising broad enhancer clusters with an exceptionally high degree of enrichment of transcription factors and of other gene expression regulators or chromatin marks, as determined with ChIP-seq, RNA-seq, ATAC-seq, etc., and they are historically associated with the increased expression of oncogenes and cell differentiation genes. Such a definition requires further clarification, especially with regard to so-called constitutive SEs, the nature of interactions between the SEs' constituent parts, and alternative clusters such as stretch enhancers.
We also discussed the regulatory potential of super-enhancers, both in normal and pathological conditions such as oncological diseases, neurodegenerative diseases (such as Alzheimer's and Huntington's), inflammatory processes, and others. The location of SE regions and the labelling of TADs as well as the annotation of CTCF binding sites suggest that they share a common nature of organization and interaction, which was confirmed in various studies using the DASE, CapStarr-seq, and ExP STARR-seq approaches. A number of studies, including those that implemented CRISPR-Cas editing, have shown that SEs, due to their interaction with transcription factors, significantly increase the expression of target genes, making a major contribution to cell differentiation, growth, development, and tumorigenesis.
There are some future research directions for SEs in diseases. First, it is still unclear which of the socially significant diseases, in addition to those for which they have already been well-studied and mentioned before, can be associated with changes in the functioning of super-enhancers. Second, the development of therapeutic approaches aimed at targeting SE may be quite promising. Given the potential importance of SEs in cell fate determination and disease development, we examined promising therapeutic avenues, including the use of inhibitors that target positive regulators of SE-mediated transcription. These include selective inhibitors of BET bromodomain-containing proteins and CDK7, histone deacetylase and demethylase inhibitors, and other potential targets and strategies for prospective combination therapies.
We provided an overview of bioinformatic methods used to search for SEs, with a focus on machine learning and experimental techniques that have been actively used in recent years. A separate task was to describe the methods of in silico SE prediction, with special attention to the machine learning approaches used for the discovery and prediction of these regulatory elements. The majority of these are based on ChIP-seq data and the use of convolutional (CNN) and recurrent (RNN) neural networks and their combinations, for example, the KEGRU network [215] and the DNABERT architecture [216]. Such algorithms would enable a better understanding of the super-enhancers' organization and biological role in healthy cells and diseases, facilitating the development of effective treatment strategies.
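
The stitching-and-ranking logic behind ROSE-like SE calling can be illustrated with a short example. This is a simplified sketch, not the ROSE implementation: it assumes that constituent enhancers separated by at most 12.5 kb are merged, that the signal of a stitched region is the sum of its constituents' signals, and that the cutoff is the point where the rank-versus-signal curve, with both axes scaled to [0, 1], has a slope of 1. The Region class, the function names, and the toy signal values are hypothetical.

```python
from dataclasses import dataclass

STITCH_DISTANCE = 12_500  # bp, maximum gap between constituent enhancers


@dataclass
class Region:
    chrom: str
    start: int
    end: int
    signal: float  # e.g. a background-subtracted ChIP-seq signal (assumed input)


def stitch(enhancers, max_gap=STITCH_DISTANCE):
    """Merge same-chromosome enhancers whose gap is at most max_gap."""
    stitched = []
    for e in sorted(enhancers, key=lambda r: (r.chrom, r.start)):
        last = stitched[-1] if stitched else None
        if last and last.chrom == e.chrom and e.start - last.end <= max_gap:
            last.end = max(last.end, e.end)
            last.signal += e.signal
        else:
            stitched.append(Region(e.chrom, e.start, e.end, e.signal))
    return stitched


def call_super_enhancers(regions):
    """Return regions ranked above the slope-1 tangent point of the scaled curve."""
    ranked = sorted(regions, key=lambda r: r.signal)
    n = len(ranked)
    if n < 2 or ranked[-1].signal <= 0:
        return []
    top = ranked[-1].signal
    # For a convex, increasing curve, the slope-1 tangent point minimizes y - x,
    # where x is the scaled rank and y is the scaled signal.
    diffs = [ranked[i].signal / top - i / (n - 1) for i in range(n)]
    cutoff = diffs.index(min(diffs))
    return ranked[cutoff + 1:]


# Toy input: three nearby chr1 enhancers form one cluster, the rest are isolated.
toy = [
    Region("chr1", 1_000, 2_000, 5.0),
    Region("chr1", 5_000, 6_000, 7.0),
    Region("chr1", 10_000, 11_000, 8.0),
    Region("chr1", 100_000, 101_000, 1.0),
    Region("chr1", 200_000, 201_000, 2.0),
    Region("chr1", 300_000, 301_000, 1.5),
    Region("chr2", 1_000, 2_000, 1.0),
    Region("chr2", 50_000, 51_000, 3.0),
]
for se in call_super_enhancers(stitch(toy)):
    print(se.chrom, se.start, se.end, se.signal)
```

With this toy input, only the stitched chr1 cluster lies above the cutoff. Real pipelines additionally handle promoter exclusion, control subtraction, and signal normalization, which this sketch omits.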
Nevertheless, the existing techniques are not sufficient to support comprehensive approaches because modern algorithms ignore the chromatin structure, the probable lateral interactions of SEs and uncharacteristic target genes, the possible breakdown of super-enhancer insulation, hijacking mechanisms, and other non-canonical mechanisms of super-enhancer formation and decomposition. That is why in our future work, we aim to develop an approach that would take into account as many as possible of the nuances that affect the formation of super-enhancers in tumors and try to resolve the existing uncertainty based on regulatory effect blurring.
The question of SE evolution remains open. Super-enhancers, which are historically associated with the human-enhanced region, are thought to be responsible for human-specific phenotypic adaptations when compared between mammals. At the same time, according to some data, there is reason to believe that super-enhancers can be much more ancient, if compared, for example, with stress response proteins in multicellular organisms. It should be noted that advancements in understanding the molecular mechanisms of diseases (particularly oncological diseases [217][218][219][220]) as well as the study of intermolecular interactions through high-performance optical methods [221][222][223], the ultrasensitive determination of biomolecules using magnetic nanomarkers [224][225][226][227] or label-free biosensors [228][229][230], and the processing of genome-wide data through high-performance methods such as cloud computing [231,232] may provide new opportunities for comprehending the role of SEs, potentially leading to the revision of their definition.
In conclusion, although the definition of SEs is itself still incomplete, our understanding of them is evolving, and further experimental and computational studies are needed to fully comprehend their role in gene regulation and to further develop therapeutic strategies.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; in the decision to publish the results.