Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers

Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers.


Introduction
After an international effort, the scientific community has revealed that up to 90% of the human genome is transcribed. Thanks to the FANTOM project (functional annotation of the mammalian genome, available online: http://fantom.gsc.riken.jp/), started in 2000 with the mouse genome [1,2], which was quickly followed by the human genome in 2003 by the ENCODE project (encyclopedia of DNA elements, available online: https://www.encodeproject.org/) [3,4], we know that 98% of the human genome is composed of non-coding (nc) sequences, previously considered "Junk DNA" due to their heterogeneity, low expression levels, and unknown functions [5][6][7][8][9][10][11]. This huge part of the transcriptome could therefore play a role in protein-coding (pc) RNA expression regulation. Databases specialized in genome annotation, such as the GENCODE project (encyclopaedia of genes and gene variants, available online: http://www.gencodegenes.org/) [12,13], specialized in ncRNA, such as the NONCODE (integrated knowledge database dedicated to ncRNAs, especially lncRNAs, available online: http://www.noncode.org/) and RNAcentral projects (the non-coding RNA sequence database, available online: http://rnacentral.org/) [14,15], or specialized in human long non-coding RNAs (lncRNAs), such as the LNCipedia project (a comprehensive compendium of human long [...]
Figure 1. lncRNA classification according to their orientation and position in the genome. lincRNAs are located between two pcGenes, regardless of their orientation. Intronic lncRNAs are entirely encoded in pcGene introns, while sense lncRNAs overlap pcGene exons. Bidirectional lncRNA transcription starts less than 1 kb from a pcGene transcription start site and goes in its opposite direction. Cis-NATs (natural antisense transcripts) are RNA sequences that are transcribed from the two strands of the same genomic locus, in the antisense direction. NAT pairs can be protein-coding sequences (pc, red colored) or non-coding sequences (nc, blue colored), forming nc|pc, nc|nc or pc|pc pairs. NAT pairs that are nc|pc or nc|nc sequences only belong to the lncRNA classification (purple colored sequences are pc or nc).
lncRNAs are defined as endogenous cellular RNAs without a significant ORF (open reading frame) [27][28][29]. However, some ncRNAs containing an ORF smaller than 100 amino-acids may be classified as lncRNAs [27]. The known biological roles of lncRNAs are very heterogeneous and cover various molecular and cellular functions such as pcGene regulation [30], stem cell pluripotency and differentiation [31], allelic expression [32], cell cycle control [33], apoptosis and senescence [34], heat shock response [35], and control of chromatin modifications [36]. It is worth noting that lncRNAs are found in all tissues and show pronounced tissue-specific expression. Their cellular location may vary, probably reflecting their function [20,37,38]. There is a structural similarity between lncRNAs and mRNAs, in the sense that they may be multi-exonic, 5′ capped, 3′ polyadenylated, and spliced [23]. RNA polymerase II (RNA Pol II) is responsible for the transcription of most of the lncRNAs, and their expression is under the control of promoters and enhancers, that can be induced by external stimuli [23].

Generic Definition of NATs
Natural antisense transcripts (NATs) are coding or non-coding RNA sequences that are complementary to and overlap with either protein-coding or non-coding transcripts [39]. As 98% of the transcriptome is non-coding, the vast majority of paired transcripts are composed of nc|nc or nc|pc pairs. Therefore, NATs are defined in regard to the relative genomic position from their paired transcript origins, in cis or in trans. Cis-NAT pairs are transcribed from the opposite strand of the same genomic locus and display perfect RNA|RNA sequence complementarity with the opposite strand transcript (if no RNA modifications, such as RNA editing, occur). Trans-NAT pairs are transcribed from different genomic loci, and the two RNA molecules may hybridize to each other with imperfect RNA|RNA sequence complementarity [40,41].
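The cis/trans distinction above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Transcript` record, the 0-based half-open coordinates, and the function name `nat_relationship` are hypothetical and do not come from any cited database or tool. A pair is called cis when the two transcripts overlap on opposite strands of the same locus; any other opposite-locus pair is only a trans candidate, since a true trans-NAT pair additionally requires (possibly imperfect) sequence complementarity, which coordinates alone cannot establish.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """Hypothetical minimal transcript record (illustrative only)."""
    name: str
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # 0-based, exclusive (assumed convention)
    strand: str  # "+" or "-"


def nat_relationship(a: Transcript, b: Transcript) -> str:
    """Classify a transcript pair from genomic coordinates alone.

    Returns "cis" for overlapping transcripts on opposite strands,
    "sense-overlap" for overlapping transcripts on the same strand
    (not an antisense pair), and "trans-candidate" otherwise.
    """
    overlapping = a.chrom == b.chrom and a.start < b.end and b.start < a.end
    if overlapping and a.strand != b.strand:
        return "cis"
    if overlapping:
        return "sense-overlap"
    # Different loci: hybridization would still have to be checked
    # at the sequence level before calling this a trans-NAT pair.
    return "trans-candidate"


# Toy coordinates, invented for the example.
gene = Transcript("geneA", "chr1", 100, 500, "+")
antisense = Transcript("geneA-AS1", "chr1", 400, 900, "-")
distant = Transcript("otherRNA", "chr7", 100, 500, "-")
print(nat_relationship(gene, antisense))
print(nat_relationship(gene, distant))
```

The three-way return value is a design choice of this sketch: it keeps same-strand overlaps (sense lncRNAs in Figure 1) apart from genuine antisense pairs.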
Because strand-specific high-throughput sequencing of the antisense transcriptome has been available for less than a decade, many NATs were first described as lncRNAs, without information about the co-existence of other transcripts from the same genomic locus. This overlap between the NAT and lncRNA classifications may lead to some confusion in the literature, which will probably fade as knowledge of antisense transcription grows.

NAT: Structure, Localization, and Expression Regulation
Like lncRNAs and mRNAs, NATs may be capped and polyadenylated transcripts that are processed to remove introns. NAT expression is also controlled by promoters and enhancers. In addition, many examples of bidirectional promoters that control the expression of transcripts originating from both strands are described in the literature [42,43]. In this case, several transcription factors, such as GABPA or E2F1, are preferentially implicated [44][45][46]. NATs may also originate from cryptic promoters located within the intronic regions of a gene or close to the transcription start site of neighboring genes [43,47,48].
NATs accumulate preferentially in the nucleus, associating with chromatin, unlike coding mRNAs which accumulate in the cytoplasm. NATs are also found in other cellular compartments, such as mitochondria, and have been reported to accumulate at polysomes [3,18,49]. Moreover, NAT expression is closely linked with the activity of their sense or neighboring genes [43].

NAT: Role, Function and Mechanism of Action
The biological significance of NATs remains under scientific investigation with major key questions yet to be answered. Specific pcGene regulation by their corresponding overlapping ncNATs has been reported. Our team and others have shown that up to 50% of the pcGenes also express ncNATs [2,39,50] and that transcript levels of nc|pc pairs are often tightly correlated [39,46,50]. Altogether, this suggests that NATs could be implicated in a new level of gene expression regulation [5,51].
Both transcriptional and post-transcriptional regulations of expression have also been explained as the result of the creation of natural sense and antisense transcript pairs. The regulatory processes implicated can be more or less complex, ranging from simple transcriptional interference to modulation of chromatin changes or the formation of double-stranded RNA (dsRNA). The latter leads to RNA masking, RNA interference or RNA editing [52].
Several examples of pcGene expression regulation by their NATs are described hereafter to illustrate the different molecular mechanisms of action.

Action in Cis or Trans
While NATs are more likely to regulate other genes in cis, they may also tune gene expression elsewhere in the genome by trans regulation. Based on the definition of overlapping genes from Makalowska et al., cis-NATs are here classified according to the relative position of the DNA coding sequence of the RNA transcripts [53]. Three categories can thus be described and are depicted in Figure 2: (1) "head-to-head", where sense and antisense transcripts overlap on their 5′ ends; (2) "tail-to-tail", where sense and antisense transcripts overlap on their 3′ ends; and (3) "embedded overlap" (also called "full overlap"), where one transcript is entirely contained within the other.
Figure 2. cis-NAT classification. cis-NAT pairs can be protein coding sequences (pc) or non-coding sequences (nc), forming nc|pc, nc|nc or pc|pc pairs. In head-to-head orientation, sense and antisense transcripts overlap on their 5′ ends. Inversely, tail-to-tail describes an overlap of the 3′ ends. In a full overlap (or embedded overlap), one transcript is totally included in the other one.
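As an illustration, the three cis-NAT categories can be derived from transcript coordinates alone. The sketch below is hypothetical: the function name `classify_cis_nat`, the `(start, end, strand)` tuple layout, and the 0-based half-open coordinates are assumptions made for the example, and both transcripts are assumed to lie on the same chromosome. It relies on the fact that the 5′ end of a plus-strand transcript sits at its start coordinate, whereas the 5′ end of a minus-strand transcript sits at its end coordinate.

```python
from typing import Optional, Tuple

# (start, end, strand), 0-based half-open, same chromosome assumed.
Transcript = Tuple[int, int, str]


def classify_cis_nat(a: Transcript, b: Transcript) -> Optional[str]:
    """Return "head-to-head", "tail-to-tail", "embedded", or None if no overlap."""
    (a_start, a_end, a_strand), (b_start, b_end, b_strand) = a, b
    if {a_strand, b_strand} != {"+", "-"}:
        raise ValueError("a cis-NAT pair needs one '+' and one '-' transcript")

    # No shared genomic interval: not a cis-NAT pair.
    if not (a_start < b_end and b_start < a_end):
        return None

    # One transcript entirely contained within the other.
    if (a_start <= b_start and b_end <= a_end) or (
        b_start <= a_start and a_end <= b_end
    ):
        return "embedded"

    plus_start, minus_start = (
        (a_start, b_start) if a_strand == "+" else (b_start, a_start)
    )
    # Partial overlap with the minus-strand transcript lying upstream
    # (to the left): the shared region holds both 5' ends (divergent pair).
    # Otherwise the shared region holds both 3' ends (convergent pair).
    return "head-to-head" if plus_start > minus_start else "tail-to-tail"


# Toy coordinates, invented for the example.
print(classify_cis_nat((400, 900, "+"), (100, 500, "-")))
print(classify_cis_nat((100, 500, "+"), (400, 900, "-")))
print(classify_cis_nat((100, 900, "+"), (300, 500, "-")))
```

Real loci often carry several alternative transcripts, so the same gene pair may fall into more than one category depending on which isoforms are compared.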

Transcriptional Interference
Antisense transcription can modulate in cis the sense transcription of the opposite strand, although this effect may not be caused by the pairing of the RNA molecules themselves. The proximity of the two transcriptional events, sense and antisense, leads to a downregulation of both transcripts [54]. Transcriptional interference can occur during the initiation or elongation phases of transcription. In the initiation phase, promoters of head-to-head NATs compete for RNA Pol II and common regulatory elements (Figure 3A). In the elongation phase, interference can occur after the following events: a collision between RNA Pol II complexes, leading to a machinery blockage (Figure 3B); a promoter occlusion by RNA Pol II during the antisense transcript elongation (Figure 3C); or an RNA Pol II dislodgement by the RNA Pol II standing on the opposite strand, when the first one was too slow to start (Figure 3D) [54]. It is worth noting that the transcriptional interference investigation field is still young and that formal proof of gene expression regulation by this mechanism was only recently reported [55]. Nevertheless, a negative correlation between sense and antisense transcript levels is less frequently observed than a positive correlation or no correlation. This suggests that only a minority of NATs could be involved in transcriptional interference processes [50,56,57,58,59].
Despite difficulties in discriminating transcriptional interference from gene expression regulation by RNA transcripts, Stojic et al. [55] have demonstrated such a mechanism by screening an siRNA library. Whereas nearly all siRNAs dampen GNG12-AS1 (a non-coding natural antisense transcript of the tumor suppressor coding gene DIRAS3) post-transcriptionally, siRNA targeting exon 1 of GNG12-AS1 downregulates its transcription by recruiting Argonaute 2 and inhibiting RNA polymerase II binding. In this case, the active transcription of GNG12-AS1 causes the transcriptional silencing of DIRAS3, leading to increased cell proliferation.
Figure 3. Transcriptional interference: (A) in the initiation phase, promoters of head-to-head NATs are competing for the use of RNA Pol II and common regulatory elements; (B) in the elongation phase, interference can occur after the following events: a collision between RNA Pol II complexes, leading to a machinery blockage; (C) a promoter occlusion by RNA Pol II during the antisense transcript elongation; and (D) an RNA Pol II dislodgement by the RNA Pol II standing on the opposite strand, when the first one was too slow to start. Promoters of protein coding sequences are represented in red, and promoters of non-coding sequences in blue. The RNA Pol II enzyme is represented in dark grey when able to transcribe the sequence, and in light grey when its binding, and thus its activity, is prevented.

Chromatin Modification
ncNATs may regulate the expression levels of the sense pcGenes by regulating chromatin modifications. Such epigenetic modifications encompass DNA methylation of cytosines in CpG islands and histone modifications by methylation or acetylation of lysine residues. NATs and, more widely, ncRNAs, are thought to affect DNA methylation by interacting with various types of proteins involved in histone modification or chromatin remodeling such as, in particular, the polycomb repressive complex 2 (PRC2) [60]. A current hypothesis considers that nascent NATs guide PRC2 to specific target sites on the chromatin. The tethering would occur by pairing the nascent NAT with DNA or mRNA sequences, during or after NAT transcription (Figure 4A) [61].
Figure 4. (A) The NAT binds a protein complex that can trigger chromatin modifications and prevents, by competition, this complex from binding the sense transcript. This complex can also prevent the interaction of the sense gene with RNA Pol II (RNA polymerase II); (B) a tethering mechanism, such as ANRIL (antisense non-coding RNA in the INK4 locus): ANRIL recruits PRC2 (polycomb repressive complex 2) through interaction with SUZ12 (suppressor of zeste 12 homolog) and EZH2 (enhancer of zeste 2 polycomb repressive complex 2 subunit) components and PRC1 by binding CBX7 (chromobox homolog 7). Next, PRC2 silences the INK4 locus expression by inducing H3K27 tri-methylation, and PRC1 maintains a repressive chromatin structure by mono-ubiquitination of H2AK119. Protein coding sequences or promoters are represented in red, and non-coding in blue.
Additionally, a "decoy" mechanism can be described, where the NAT binds a protein complex, such as PRC2, and prevents this complex from binding the sense transcript by competition. This complex can also prevent the interaction of the sense gene with RNA Pol II or the chromatin [61,62].
Here are two examples of lncRNA/NAT that play a role in the tethering of PRC2 with chromatin. The first example is the combined action of ANRIL and PRC1-PRC2 on INK4b-ARF-INK4a gene expression and on the chromatin structure of this locus. ANRIL is a cis-NAT that is dysregulated in breast cancer. It is located in the INK4b-ARF-INK4a gene cluster, which contains three genes encoding the three tumor-suppressor proteins p15, p14 and p16 [63]. Polycomb repressive complexes 1 and 2 (PRC1 and PRC2) are implicated in epigenetic silencing mechanisms. ANRIL can recruit those complexes to the chromatin of the INK4b-ARF-INK4a locus, recruiting PRC2 through interaction with SUZ12 and EZH2 components, and recruiting PRC1 by binding CBX7 [63][64][65]. Next, PRC2 silences INK4b-ARF-INK4a gene expression by inducing H3K27 tri-methylation, and PRC1 maintains a repressive chromatin structure by mono-ubiquitination of H2AK119 (Figure 4B) [66].
A second example is HOTAIR, which is implicated and dysregulated in many types of cancer and displays an active and critical role in chromatin dynamics [67,68]. Like ANRIL, HOTAIR interacts with PRC2 through its 5′ end to induce H3K27 tri-methylation. In addition, HOTAIR binds to LSD1 (lysine-specific demethylase 1) by its 3′ end, leading to H3K4 demethylation. These combined modifications lead, in trans, to a repressive chromatin structure and thus to the silencing of multiple genes [68,69].

Double-Stranded RNA/RNA Masking
NATs can regulate gene expression through the formation of a duplex between two overlapping NAT sequences. This double-stranded RNA (dsRNA) molecule creates a physical protection against post-transcriptional regulation factors that target the pcGene. RNA masking can thus interfere with the splicing or translation machineries. It can also prevent miRNA binding or RNase activities, which often target single-stranded RNA and influence its stability [52]. In this case, and in contrast to the mechanisms described above, the NAT positively regulates pcGene expression.
In osteosarcoma, upregulated FGFR3-AS1 forms a tail-to-tail dsRNA with FGFR3, its sense transcript. FGFR3 mRNA is thus protected against RNase activity, leading to an increase in both its mRNA stability and its protein production [70]. Conversely, binding of the MALAT1 3′ UTR by its ncNAT TALAM1 allows for RNase P cleavage, leading to 3′ end processing and maturation that is essential for MALAT1 stability and function [71].
While forming dsRNAs, NATs can also interfere with splicing and translation mechanisms. For example, the protein encoded by the ZEB2 gene is a transcription factor that downregulates E-cadherin and its antisense transcript, ZEB2-AS1. ZEB2 also contains an IRES (internal ribosome entry site) required for its translation. By binding this sequence, ZEB2-AS1 promotes ZEB2 splicing and downregulates its protein expression [72].

Double-Stranded RNA/RNA A to I Editing
ADARs (adenosine deaminases that act on RNA) are enzymes responsible for RNA editing by site-specific adenosine deamination. They target dsRNA molecules such as those formed by NAT pairs. After adenosine-to-inosine (A-to-I) editing, inosines (I) are interpreted as guanosines (G) during splicing or translation. Such modification may modulate the localization or the stability of the edited transcripts [73,74]. The frequency of RNA editing induced by NATs is not yet characterized [52,75,76]. Indeed, few NATs display edited sequences, but edited transcripts may be quickly degraded or retained in the nucleus, thus disappearing from the bulk of the expressed sequences [77].
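The consequence of A-to-I editing for sequence readout can be sketched in a few lines of Python. This is a toy model under stated assumptions: the function names `edit_a_to_i` and `read_as_guanosine`, the use of the letter "I" for inosine, and the example sequence and edited position are all hypothetical, and no attempt is made to model ADAR site selection or the dsRNA requirement.

```python
from typing import Iterable


def edit_a_to_i(rna: str, sites: Iterable[int]) -> str:
    """Deaminate adenosines to inosines ("I") at the given 0-based positions."""
    edited = list(rna.upper())
    for pos in sites:
        if edited[pos] != "A":
            raise ValueError(f"position {pos} is {edited[pos]!r}, not an adenosine")
        edited[pos] = "I"
    return "".join(edited)


def read_as_guanosine(rna: str) -> str:
    """Sequence as interpreted by splicing/translation machineries: I is read as G."""
    return rna.replace("I", "G")


# Toy sequence and editing site, invented for the example.
original = "AUGAAC"
edited = edit_a_to_i(original, [3])
print(edited)                     # AUGIAC
print(read_as_guanosine(edited))  # AUGGAC
```

In sequencing data, such an event would typically appear as an A-to-G mismatch between the transcript and the genomic sequence, which is why the readout function maps I to G.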
An example of this A-to-I editing mechanism has been found in human prostate cancers with the sense/antisense pair of PRUNE2 and PCA3 transcripts. PRUNE2 is a pcGene that has a tumor suppressor role. PCA3 is an NAT that originates from introns, and is fully overlapped by PRUNE2's 6th intron. The dsRNA created by PCA3 and PRUNE2's pre-mRNA forms a complex with ADAR proteins. A-to-I editing of this dsRNA leads to a downregulation of protein expression and an increase in tumor cell growth [78]. It is important to note that PCA3 was also approved as a specific biomarker for diagnostic tests.

Double-Stranded RNA/RNA Interference
RNA interference is an additional mechanism whereby NATs are implicated in pcGene post-transcriptional regulation [79]. Here, RNA interference relies on the formation of endogenous siRNAs (endo-siRNAs) from NAT-derived dsRNA. The process is DICER-dependent and is followed by the action of the RNA-induced silencing complex (RISC) [80][81][82]. NATs may thus serve as precursors in endo-siRNA and miRNA production [83]. NATs form internal hairpins or duplexes with sense RNA, leading to a dsRNA that can be processed and cleaved by DICER. The resulting short RNA duplexes are then bound by RISC, where one strand of the RNA duplex is used as a guide for mRNA recognition. This mRNA is then cleaved by RISC, which decreases the protein expression. Although evidence of NAT involvement in the RNA interference process remains scarce, recent transcriptome sequencing studies have shown the widespread occurrence of endo-siRNAs and their regulatory potential during stages of development and differentiation [82,83].

NATs in Breast Cancer
Numerous studies have highlighted a link between lncRNAs/NATs and cancers, especially breast cancers. Most of these transcripts were either identified by high-throughput transcriptomic studies that lacked strand-of-origin information, or explored one by one because of their implication in oncogenic pathways. Therefore, many lncRNAs listed in Table 1 are generally not described as NATs in the literature. In addition, the expression correlation between the NAT pair transcripts, as well as the ncNAT regulatory role with regard to the paired pcGene, are often unknown. It is also worth noting that most genomic loci coding for NAT transcript pairs also display numerous alternative transcripts. Therefore, each lncRNA transcript may belong to different classes among NAT nc|pc, lincRNA, lncRNA, or NAT nc|nc.
To the best of our knowledge, only three strand-specific whole-genome transcriptomic studies have been performed on breast cancer samples [39,46,50]. The main concordant conclusions were that: (i) pcGene transcription coincides with an antisense ncNAT transcription in 50% of the cases; (ii) NAT transcripts are 1000 times less abundant than pcGene transcripts; and (iii) positive expression correlations between ncNATs and their paired pcGenes are approximately six times more frequent than negative correlations. This last observation suggests that if ncNATs can affect the expression of their corresponding pcGene, positive regulation of expression should be more frequently observed than repression. However, a comparison of transcript levels between tumors and paired non-malignant adjacent healthy tissues showed that the ncNAT/pcGene transcript balance is disrupted in tumors. Therefore, new positive correlations of NAT/pcGene pairs are created in tumor tissues, while others that were present in the normal tissue decline [50].
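To make the notion of positive and negative expression correlation concrete, the following Python sketch counts how many ncNAT/pcGene pairs fall into each class across a set of samples. It is a minimal illustration and not the method of the cited studies: the choice of Pearson's r, the 0.5 cut-off, the function names, and the toy expression values are all assumptions introduced for the example.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least 2 samples")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def count_correlation_signs(
    pairs: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the cited studies
) -> Dict[str, int]:
    """Count pairs with r >= threshold, r <= -threshold, or in between."""
    counts = {"positive": 0, "negative": 0, "uncorrelated": 0}
    for nat_levels, pc_levels in pairs.values():
        r = pearson(nat_levels, pc_levels)
        if r >= threshold:
            counts["positive"] += 1
        elif r <= -threshold:
            counts["negative"] += 1
        else:
            counts["uncorrelated"] += 1
    return counts


# Toy expression levels (ncNAT, pcGene) across four samples, invented here.
toy_pairs = {
    "pairA": ([1, 2, 3, 4], [10, 20, 30, 40]),
    "pairB": ([1, 2, 3, 4], [8, 6, 4, 2]),
    "pairC": ([1, 2, 3, 4], [5, 1, 1, 5]),
}
print(count_correlation_signs(toy_pairs))
```

Running the same count separately on tumor and adjacent normal samples would be one simple way to look for the gained and lost correlations mentioned above; a real analysis would also need normalization and multiple-testing control.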
The mechanism by which lncRNAs/NATs regulate pcGene expression is known in several instances, and two mechanisms are often described in breast cancer. The first is driven by the polycomb repressive complexes (PRC), and the second by microRNAs. Here are three examples of PRC2 involvement in cancer pathways. The NAT ANRASSF1 directs PRC2 binding to the RASSF1 promoter to regulate RASSF1 expression [84]. The INK4b-ARF-INK4a locus, coding for the cell cycle associated proteins p14, p15 and p16, is regulated by the NAT ANRIL via PRC2; in addition, the lncRNA PANDAR recruits PRC1 to also regulate p16 expression [63,64,65,85]. Similarly, the p53 pathway is regulated at several levels via PRC2 by the HOTAIR and MEG3 lncRNAs [86][87][88][89]. The importance of gene regulation by PRC2 is well known in breast cancers, as the expression of its targeted genes can be used to predict patient outcomes [90].
As displayed in Table 1, microRNAs are also frequently involved in gene regulation by lncRNA. One particular example is the epithelial to mesenchymal transition (EMT) that is regulated by three lncRNAs, namely, H19, linc-RoR and TP73-AS, which capture multiple microRNAs and prevent their binding to other mRNA targets [91][92][93][94].
Table 1. Role and therapeutic utility of lncRNAs in breast cancer. lncRNAs implicated in breast cancer pathology are listed and classified in different categories: lincRNA (long intergenic non-coding RNA), bidirectional lncRNA, sense-overlapping lncRNA, sense-intronic lncRNA, and NAT composed of nc|nc or nc|pc transcripts pairs. In the case of nc|pc pairs, the pcGene name is provided. As ncRNAs often display multiple transcript variants, some lncRNAs may belong to multiple categories.
Mediates breast cancer cell plasticity, invasion, and proliferation by sponging several miR (miR-200b/c, let-7b, miR-152), silences pro-apoptotic gene BIK through epigenetic modifications, precursor of miR-675 (pro-tumoral and pro-metastatic).
In TNBC, biomarker for detection, prognosis and prediction for recurrence and response to taxane chemotherapy.
Upregulation is a marker in multi-drug resistance, chemotherapy tolerance. Potential therapeutic target for aggressive and metastatic breast cancer.

NATs as Cancer Biomarkers
Like mRNAs, the expression levels of NATs and lncRNAs are affected under cancerous conditions. Differences in mRNA expression patterns between subgroups of breast cancer patients have been used to develop genomic tests able to predict patient prognosis or treatment response. Notable examples include the MammaPrint and PAM50 microarray-based gene signatures and the Oncotype DX RT-PCR-based assay, which can help clinicians make treatment decisions based on the calculated recurrence risk and, in the case of the Oncotype DX test, the expected benefit of chemotherapy [179][180][181].
Similarly, multiple NATs/lncRNAs display expression levels that are associated with the disease prognosis, the treatment response or the clinical classification of breast cancers (Table 1). Although no clinically validated test has emerged yet, several studies report prognostic ncRNA gene signatures [119,182,183,184,185].

NATs as Therapeutic Targets
Understanding antisense transcription is important for therapy development. Indeed, NATs represent a potentially highly specific entry point for therapeutic intervention on targeted genes through the use of antisense oligonucleotides (ASOs), a class of drugs already FDA-approved for several diseases [186].
Functionally characterized NATs can be targeted by ASOs, called in this case antagoNATs, to block the interaction of the sense and antisense transcripts. The hybridization of ASOs with the antisense transcript would lead to its degradation, or to transcriptional de-repression at the chromatin level [187]. The first in vivo demonstration of antagoNAT efficacy was shown by Modarresi et al. [188] and has been validated in other clinical contexts, detailed in the review by MacLeod et al. [187].

Conclusions
Up to 90% of the human genome length is transcribed: ~2% of the genomic DNA codes for proteins; ~88% is transcribed but does not encode proteins; and ~10% is not transcribed. In contrast, 80% of the RNA transcripts code for proteins and the remaining ~20% do not. Non-coding sequences are thus less expressed than coding ones. They are also less conserved between species than coding genes, but more conserved than non-coding, non-transcribed sequences. Such transcripts must therefore play a biological role, which has yet to be described.
Among lncRNAs, NATs are coding or non-coding RNA sequences, which are complementary to and overlapping with either protein-coding or non-coding transcripts. Their main biological role is thought to be the regulation of pcGene expression through a variety of molecular mechanisms. High-throughput transcriptomic studies have demonstrated that the expression of NATs and lncRNAs is modified under cancerous conditions, making them good cancer biomarkers. Finally, non-coding/pcGene transcript pairs are interesting, especially for specific target-gene treatments using ASO.

Author Contributions:
Guillaume Latgé and Claire Josse compiled the current literature and wrote the first draft of the article. Christophe Poulet contributed to scientific and language editing. Vincent Bours, Claire Josse and Guy Jerusalem supervised the project. All authors provided critical feedback and helped shape the manuscript writing.

Conflicts of Interest:
The authors declare no conflict of interest.