Effects of Replication and Transcription on DNA Structure-Related Genetic Instability

Many repetitive sequences in the human genome can adopt conformations that differ from the canonical B-DNA double helix (i.e., non-B DNA), and can impact important biological processes such as DNA replication, transcription, recombination, telomere maintenance, viral integration, transposome activation, DNA damage and repair. Thus, non-B DNA-forming sequences have been implicated in genetic instability and disease development. In this article, we discuss the interactions of non-B DNA with the replication and/or transcription machinery, particularly in disease states (e.g., tumors) that can lead to an abnormal cellular environment, and how such interactions may alter DNA replication and transcription, leading to potential conflicts at non-B DNA regions, and eventually result in genetic instability and human disease.


Introduction
Genomic DNA is the macromolecule responsible for the storage of genetic information; it is used for replicating another copy of genomic DNA for a daughter cell, and codes for other functional macromolecules such as messenger RNA for protein synthesis and regulatory non-coding RNAs in cells (i.e., DNA replication and transcription). Maintaining the accuracy and integrity of DNA is crucial for cell survival and, thus, cells have developed highly sophisticated DNA damage-monitoring and repair systems to avoid mutation and maintain genome integrity. Interestingly, the distribution of mutations in human genomes is not random, and many mutational "hotspots" that have been identified in disease etiology are often located in specific repetitive, non-B DNA-forming sequences [1][2][3][4][5][6][7][8][9][10].

B-DNA and non-B DNA Structures
After years of research from many groups, Watson and Crick described a well-accepted model of the DNA double-helical structure, referred to as B-DNA, where the two complementary strands align in an antiparallel, right-handed orientation, held together by hydrogen bonding between each of the base-pairs [11]. The double-stranded DNA likely exists in a B-form conformation in cells the majority of the time and is further organized into higher orders of structures, including wrapping of DNA around histone cores to form nucleosomes and then further wrapped and condensed into chromatin. All DNA metabolic processes in cells, including DNA replication, transcription, recombination, and repair, occur within the background of DNA secondary structure and tertiary chromatin conformations.
The Human Genome Project revealed that >50% of human genomic DNA consists of repetitive sequences, containing repeating units of different lengths ranging from single base-pairs to large segments of DNA at a mega base-pair (Mbp) scale [12]. These repetitive elements were originally considered as "by-products" of genome evolution or locations of viral attack and were often referred to as "junk DNA." However, we have now realized that these repetitive elements play important regulatory roles in genomic structure and function. Notably, many of these repeats are able to form alternative secondary DNA conformations that differ from the classic B-DNA structure, due to inter- and intra-molecular interactions within/between the repetitive elements. Figure 1 depicts several commonly studied non-B DNA conformations. More than 15 types of "non-B" DNA structures that differ from B-DNA have been characterized to date [6,13,14]. For most types of non-B DNA structures, the first step of the conformational transition from B-form DNA to non-B is separation of the DNA duplex into single-stranded DNA (ssDNA), providing the single-stranded repetitive sequence the opportunity to interact with nucleotides that are on the same strand or with those of the underlying duplex regions. For example, a single-stranded inverted repeat sequence can form Watson-Crick base-pairs between the self-complementary regions on the same strand to form hairpin or cruciform structures [15]; an H-DNA structure can form at a purine- or pyrimidine-rich ssDNA region with mirror repeat symmetry that can fold back and Hoogsteen hydrogen bond to the major groove of the other half of the duplex containing mirror symmetry [16,17]. 
If the ssDNA contains four guanine runs, each containing three or more guanines, the ssDNA can fold into a G-tetrad structure via Hoogsteen hydrogen bonding to form a square planar structure [18], and three or more of such stacked guanine tetrads are referred to as G-quadruplex or G4 DNA structures [19,20]. ssDNA containing simple repetitive units can re-anneal with the complementary strand with misalignment, resulting in loop structures [21]. Z-DNA is a left-handed helix that can form in regions of alternating purine-pyrimidine sequences, where the guanines in every two base-pairs are in the syn conformation, in contrast to the canonical anti conformation in B-DNA, which twists the phosphodiester backbone into a zigzag (hence the name) pattern [22,23]. A-DNA is a duplex structure that exists under dehydrating conditions such as crystal formation, with altered major and minor groove structures [6]. Formation of non-B DNA structures has been characterized using many different techniques such as circular dichroism (CD) that provide signature spectropolarimetry for a non-B DNA structure under specific conditions of temperature, ionic strength, and pH [25]; enzyme or chemical probing, or the use of structure-specific antibodies that can potentially recognize and probe for specific DNA conformations [26][27][28][29]; and direct visualization of some non-B DNA structures that induce DNA strand bending under electron microscopy [30][31][32][33][34]. These techniques have provided strong evidence for the presence of non-B DNA both in vitro and in vivo.
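The sequence features described above (inverted repeats, mirror repeats, four runs of three or more guanines, and alternating purine-pyrimidine tracts) can be screened for with simple string operations. The following is a minimal sketch rather than a validated prediction tool: the function names are illustrative, the 1-7 nucleotide loop length between guanine runs is an assumption commonly used in motif searches and is not taken from this article, and a sequence match indicates only the potential to form a structure, not that the structure forms in cells.

```python
import re

# Watson-Crick complement table for uppercase DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """True if the sequence equals its own reverse complement
    (a perfect inverted repeat with no spacer; hairpin/cruciform-prone)."""
    s = seq.upper()
    return s == reverse_complement(s)


def is_mirror_repeat(seq: str) -> bool:
    """True if the sequence reads the same in both directions on one strand
    (mirror symmetry, a prerequisite for H-DNA at purine- or pyrimidine-rich tracts)."""
    s = seq.upper()
    return s == s[::-1]


def is_alternating_purine_pyrimidine(seq: str) -> bool:
    """True if purines (A/G) and pyrimidines (C/T) strictly alternate (Z-DNA-prone)."""
    s = seq.upper()
    purines = set("AG")
    return len(s) >= 2 and all((a in purines) != (b in purines) for a, b in zip(s, s[1:]))


# Four runs of >=3 guanines; ASSUMPTION: loops of 1-7 nt separate the runs.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_g4_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (start, matched_sequence) for each non-overlapping candidate G4 motif."""
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical test sequences, not taken from any cited study.
    print(is_inverted_repeat("GAATTC"))
    print(is_mirror_repeat("AGGGGA"))
    print(is_alternating_purine_pyrimidine("CGCGCG"))
    print(find_g4_motifs("TTGGGAGGGTGGGAGGGTT"))
```

Only the single strand passed in is scanned; a search of genomic DNA would also need to scan the reverse complement for C-rich strands.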

DNA Replication and Transcription Facilitate Non-B DNA Structure Formation
In the classical B-form, DNA duplexes are perfectly base-paired and thus are in the most stable energy state; the chromatin organization also favors maintaining DNA in the B-form. Negative supercoiling can in general facilitate the B-DNA to non-B DNA transition because it provides energy to destabilize the B-DNA structure and stimulates both DNA breathing (a spontaneous duplex separation below the DNA melting temperature) and the exposure of ssDNA. As mentioned above, opening of the B-DNA duplex and exposing ssDNA is typically required for the transition from B-DNA to non-B DNA conformations. Therefore, any DNA metabolic event that involves the generation of negative supercoiling, separation of the duplex, and/or exposure of ssDNA in appropriate regions may facilitate the formation of non-B DNA structures.

DNA Replication and non-B DNA Formation
DNA replication requires opening of chromatin structures and unwinding of the packed DNA from nucleosome cores, employing DNA helicases to separate the DNA duplex into two ssDNA regions, and topoisomerases to relax the positive supercoiling generated upstream. The migration of replication forks through non-B DNA-forming sequences can generate negative supercoiling downstream that can facilitate the formation of non-B DNA structures. For example, triplet repeats can form hairpin or loop structures on ssDNA via intrastrand hydrogen bonding, which can result in repeat unit expansion and human disease [65]. It has been found that the expansion of these repeats was much more pronounced in highly proliferative tissues [66,67] or in rapidly dividing cells [68,69]. Thus, DNA replication appears to play important roles in the generation of non-B DNA conformation and the subsequent genetic instability.
DNA replication generates long ssDNA regions during lagging strand synthesis and creates a favorable environment for the formation of non-B DNA structures during replication [70]. Single-stranded DNA binding protein (SSB) in prokaryotic cells or replication protein A (RPA) in eukaryotic cells binds to the ssDNA and can prevent non-B DNA structure formation [71][72][73]. However, the highly dynamic nature of RPA or SSB binding to ssDNA can permit the formation of higher-order DNA conformations at appropriate regions, particularly with the assistance of negative supercoiling and/or non-B DNA-associated proteins [74][75][76][77]. Therefore, the location and orientation of the triplet repeat relative to the replication origin can have dramatic effects on the stability and the types of mutations induced at these sequences [78][79][80][81][82][83][84][85][86][87]. CTG repeats form a slightly more stable hairpin structure than the CAG repeats on the complementary strand [88]. Correspondingly, large expansions of CTG/CAG triplet repeats occur predominantly when the CTG repeats serve as the template for the leading strand replication; and deletions are more prominent when the CTG repeats are on the template for lagging strand synthesis [78]. In a study using plasmids containing an SV40 replication origin, Cleary et al. found that CAG(79) repeats yielded predominantly expansions when the repeats were positioned at the 3′ end of an Okazaki initiation zone, and mostly contractions when the repeats were located at the 5′ end [89]. In yeast, a long tract of 78-150 trinucleotide GAA repeats in the intron of the URA3 gene caused substantial repeat expansion, and the number of repeat units added in these long GAA tracts was in a relatively narrow range of 44-63 triplets, which is approximately the length of an Okazaki fragment in yeast [90]. The threshold of repeat length approximately matches the length of Okazaki fragments in various species. 
In bacterial cells, the Okazaki fragment is longer, approximately 1000 bp. It was found that the majority of CTG repeat expansions were incremental when the CTG tracts were shorter than an Okazaki fragment (120-200 repeats); and when CTG repeats were longer than 1000 bp (330-500 repeats), the majority of expansions were discrete changes in the repeat size, adding hundreds or thousands of base-pairs but nothing in between [91]. Mutation of polymerase α, which can slow DNA replication and lead to longer stretches of ssDNA in the Okazaki initiation zone, dramatically increased the expansion events and the size of repeat expansions, accompanied by mutations outside of the repetitive sequence that inactivated the URA3 gene [90]. In fact, mutations in other genes involved in DNA replication, e.g., Flap endonuclease 1 (FEN-1) [92] or DNA polymerases [93], have significant effects on triplet repeat stability. These data provide some examples of the effects of DNA replication on non-B DNA formation and non-B-induced genetic instability.
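The comparison between repeat tract length and Okazaki fragment length rests on simple arithmetic: each triplet contributes 3 bp. The sketch below converts the repeat counts quoted above into base-pairs so that they can be set against the approximate bacterial Okazaki fragment length of 1000 bp. The function and constant names are illustrative, and the 1000 bp value is the approximation given in the text, not a measured constant.

```python
TRIPLET_BP = 3                 # base-pairs per trinucleotide repeat unit
BACTERIAL_OKAZAKI_BP = 1000    # approximate length quoted in the text


def tract_length_bp(n_repeats: int) -> int:
    """Length in base-pairs of a tract of n trinucleotide repeats."""
    return n_repeats * TRIPLET_BP


def exceeds_fragment(n_repeats: int, fragment_bp: int = BACTERIAL_OKAZAKI_BP) -> bool:
    """True if the repeat tract is longer than the given Okazaki fragment length."""
    return tract_length_bp(n_repeats) > fragment_bp


if __name__ == "__main__":
    # Repeat counts are those quoted in the text for yeast GAA additions
    # (44-63 triplets) and bacterial CTG tracts (120-200 and 330-500 repeats).
    for n in (44, 63, 120, 200, 330, 500):
        print(n, "repeats =", tract_length_bp(n), "bp;",
              "longer than ~1000 bp:", exceeds_fragment(n))
```

Under this arithmetic, 120-200 repeats correspond to 360-600 bp and 330-500 repeats to 990-1500 bp, so the lower bound of the longer class sits at the approximate threshold rather than strictly above it. The 44-63 triplets added in yeast correspond to 132-189 bp.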

Transcription and non-B DNA formation
The transcription machinery also unwinds DNA from nucleosomes, separates the coding and non-coding strands and generates negative supercoiling downstream [94]. These conditions can favor non-B DNA conformations during transcription. For example, Wittig et al. found that transcription through the human c-MYC gene was required for Z-DNA formation in the promoter regions, as evidenced by the binding of Z-DNA-specific antibodies to this region only when the gene was actively transcribed [54,95,96]. In isolated nuclei of Allium cepa L. root meristems, labels for Z-DNA structures nearly disappeared when RNA polymerase (I and II)-dependent transcription was inhibited, supporting the idea that transcription was required for Z-DNA formation. In fact, the authors suggested using the in situ immunodetection of Z-DNA as a marker of transcription [97].
During transcription, nascent single-stranded RNA can bind back to the template DNA strand to form an R-loop conformation, which contains an RNA-DNA hybrid and a long region of ssDNA. If the ssDNA region is comprised of a repetitive element and can fold into a higher-order structure such as G4-DNA or a hairpin structure, the "R-loop + non-B DNA complex" could be further stabilized [34,98]. There are excellent review articles on this topic in this issue, and thus it will not be discussed further here.

Non-B DNA Conformations Impact DNA Replication and Transcription
Both DNA replication and transcription require DNA to be in a single-stranded state to serve as a template so that each incoming deoxynucleoside triphosphate (dNTP) (for DNA replication) or ribonucleoside triphosphate (for transcription) can form an appropriate base-pair with a base in the template strand before polymerization to the 3′ end of a polynucleotide chain to form a complementary DNA or RNA strand. Therefore, it is reasonable to speculate that any impediments to the access of the ssDNA template could affect the progression and/or the fidelity of the DNA and RNA polymerases.

Non-B DNA Structures in front of Replication and Transcription Machineries
Non-B DNA structures formed in front of replication or transcription machineries have been shown to affect both the processivity and fidelity of these events. In cells, when a DNA or RNA polymerase encounters a non-B DNA structure, to progress or to stall is the result of competition between the stability of the non-B DNA conformation and the ability of components in the polymerase complexes to remove or pass through the impediment caused by the non-B conformation. An intermolecular triplex (similar in structure to an intramolecular H-DNA triplex) structure formed between two oligonucleotides has been shown to inhibit DNA unwinding in vitro by the eukaryotic SV40 large T-antigen DNA helicase [99], and such inhibitory effects were diminished when the pH was increased to 8.7, which did not favor the formation of the triplex structure [99]. Although the same group reported that a third-strand TC(20) oligonucleotide in a triplex structure could be separated by the SV40 large T-antigen DNA helicase from a linearized double-stranded plasmid DNA substrate containing a GA(27) repeat with the energy provided by ATP hydrolysis, it happened only when 3′-flanking ssDNA was available on the third-strand oligonucleotide. Because 3′-flanking ssDNA was required for large T-antigen helicase to load and migrate in a 3′→5′ direction toward the triplex region before it released the third strand [100], it is likely that such helicases are not able to unwind intramolecular H-DNA where no 3′-free end is available.
There are many reports demonstrating that non-B DNA can cause replication fork stalling [80,101][102][103][104][105][106][107], which can result in replication fork collapse, the formation of DNA double-strand breaks (DSBs), and subsequent genetic instability [108][109][110]. We have discovered that a short 34 bp H-DNA-forming sequence from the human c-MYC promoter, near one of the translocation hotspots found in Burkitt lymphoma, was mutagenic on reporter plasmids in human cells and on chromosomes in mouse genomes via both replication-dependent and replication-independent mechanisms [38,61,62]. We found that this repetitive element stalled DNA replication fork progression in mammalian cells, and also stimulated the formation of DSBs and mutations in the presence or absence of DNA replication [24]. Using the same approach, we report here that the orientation of this polypurine-polypyrimidine mirror repeat plays an important role in replication stalling and mutation in mammalian cells in a replication orientation-dependent fashion (Figure 2). The same repeat was inserted at the same location on the mutation-reporter plasmid, but in opposite directions, i.e., the purine-rich strand served as the template for lagging strand synthesis in pMEXY, and the pyrimidine-rich strand was used for lagging strand synthesis in pMEXU. The plasmids (pMEXY, pMEXU and control B-DNA-forming pCEX) were transfected into mammalian COS-7 cells, and replication intermediates were recovered 24 h later and subjected to 2-D electrophoresis of DNA replication intermediates as previously described [4,24,111]. The results revealed a typical Y-shape replication arc because the SV40 origin was not included in the probed area. As shown in Figure 2A, DNA replication was stalled at the H-DNA region in the pMEXY plasmid and resulted in bulges on the right arm of the arc, and a much lighter left arm, suggesting fewer replication intermediates past the H-DNA sequence compared to that in pMEXU. 
Notably, the replication stalling on the pMEXU plasmid was not as obvious as that on pMEXY, suggesting a stronger impact on DNA replication when the purine-rich strand was on the lagging strand. Consistently, H-DNA-forming sequences induced higher mutation frequencies than control B-DNA sequences in the same reporter plasmids 48 h after transfection into mammalian COS-7 cells, and pMEXY stimulated higher mutation frequencies than pMEXU (Figure 2B). This result provided evidence for a role of non-B DNA-induced replication fork stalling in DNA-structure-induced genetic instability in mammalian cells.
Stalled transcription complexes can also cause genetic instability. Transcription-coupled DNA repair (TCR) is a pathway that preferentially repairs DNA lesions in the template strand over the non-template strand and results in the excision of a fragment of the DNA containing the lesion. In TCR, RNA polymerase stalling, rather than DNA damage per se, can serve as a signal for triggering DNA repair [112]. Thus, we and others have proposed that a stalled RNA polymerase at non-B DNA structures, even in the absence of DNA damage per se, may be sufficient to trigger a "gratuitous" DNA cleavage and repair. A "successful repair and re-synthesis" would rebuild the repetitive sequence and the non-B DNA structure could reappear when transcription or replication occurred, leading to another round of "gratuitous" DNA cleavage and repair until an error is generated during this process, resulting in DNA breakage and mutation.

Non-B DNA Structures behind Transcription Machinery
Processivity and fidelity of RNA polymerases may be affected by non-B DNA structures formed behind the transcription machinery [113][114][115]. In a collaborative project with the Hanawalt group, we found that transcription by T7 RNA polymerase was paused at an H-DNA-forming sequence in a fraction of molecules within and even downstream of the H-DNA-forming sequence from the human c-MYC promoter (same as the H-DNA-forming sequence used in Figure 2). We also found similar results with Z-DNA-forming CpG repeats, and the stalling was much more obvious in multiple-round transcription [113,114]. Similarly, G-rich sequences that can form triplexes, G-quadruplexes, or R-loops can significantly block transcription by T7 RNA polymerase or mammalian RNA polymerase II. The transcription stalling at G-rich sequences was orientation-, length- and supercoiling-dependent, implicating non-B DNA structures, rather than linear sequences per se, in transcription stalling [116]. Although stalling sites were observed both in front of and after G-rich sequences, the major stalling events occurred when RNA polymerase had passed G-rich inserts, supporting a model of R-loop structure formation where nascent RNA interacts with repetitive DNA sequences to form an RNA-DNA hybrid. Further, the non-B DNA structures formed at the G-rich DNA regions could further stabilize the interactions and impact the movement of RNA polymerase complexes [116]. Such effects may not only interfere with the functions of RNA polymerases, but also cause genetic instability as discussed above.

Non-B DNA Structures Cause Transcription and Replication Collision and Lead to Genetic Instability
In both prokaryotic and eukaryotic genomes, DNA replication and transcription could occur on the same DNA strand simultaneously, so it is possible that the two complexes could collide with each other either "co-directionally" or "head-on". Mirkin et al. reported that when collision or conflict occurred, replication fork progression was dramatically stalled at the transcribed DNA segments, suggesting a plausible direct contact between the two machineries [117]. DNA topological distortion (e.g., positive supercoiling) generated in front of both complexes might also serve as a barrier for further progression [118]. In addition, the active transcription machinery might recruit and increase the deoxyuridine triphosphate (dUTP) concentration near the DNA replication machinery and result in mis-incorporation of dUTP at the sites of deoxythymidine triphosphate (dTTP). The removal of dUTP would leave apurinic/apyrimidinic sites near the area where replication and transcription meet [119], such that collision could be detrimental to cells, resulting in genomic instability.
In both prokaryotic and eukaryotic cells, transcription and replication are carefully organized, regulated, and timed. In prokaryotic genomes, DNA replication is initiated at a single origin, and the highly expressed genes and essential genes are located on the template for leading strand synthesis, so that replication and transcription progress in the same direction and "head-on" collisions at these important genes are largely avoided. Transcription itself does not appear to affect DNA replication elongation when the two processes occur in a co-directional orientation or do not exist in close proximity [117]. In eukaryotic genomes, both transcription and replication start from multiple sites and replication forks move in both directions in the genome, so DNA and RNA polymerases have the risk of competing for the same DNA template if not properly regulated. Because non-B DNA conformations may affect the initiation and timing of both replication and transcription, and stall the elongation process, it is possible that non-B DNA-forming elements could cause "traffic congestion" on the DNA template and increase the chance of collision, leading to genetic instability. Inappropriate initiation or elongation of either replication or transcription could lead to head-on or co-directional collisions, as illustrated in Figure 3.
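The distinction between co-directional and head-on encounters follows directly from the position of a gene relative to the replication origin and the strand from which it is transcribed. The following is a minimal sketch under simplifying assumptions that are ours and not the article's: a linear coordinate system, a single origin firing bidirectional forks, and a gene that does not span the origin. The `Gene` class, the function names, and the example coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # "+" = transcribed toward increasing coordinates, "-" = decreasing


def fork_direction(origin: int, position: int) -> str:
    """Direction of travel of the fork that replicates `position`,
    assuming a single bidirectional origin on a linear coordinate."""
    return "+" if position >= origin else "-"


def encounter_type(gene: Gene, origin: int) -> str:
    """Classify the replication-transcription encounter for a gene.

    If the fork and the RNA polymerase move the same way the encounter is
    co-directional; if they move toward each other it is head-on.
    """
    midpoint = (gene.start + gene.end) // 2
    same_way = fork_direction(origin, midpoint) == gene.strand
    return "co-directional" if same_way else "head-on"


if __name__ == "__main__":
    origin = 0  # hypothetical origin position
    for gene in (Gene("geneA", 1000, 2000, "+"), Gene("geneB", 3000, 4000, "-")):
        print(gene.name, encounter_type(gene, origin))
```

In a eukaryotic genome with many origins and variable firing times, the replicating fork for a given locus is not fixed, so a classification of this kind would need origin-usage data rather than a single coordinate.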
Sequences that can form non-B DNA structures are enriched in both rare and common fragile sites in human genomes [120]. Rare fragile sites are characterized by an expansion of CGG repeats (which can form Z-DNA, quadruplex structures, hairpin/cruciform structures or loop-outs) or AT-rich minisatellite repeats (which can form hairpin or cruciform structures). Common fragile sites that are present in all individuals also contain many elements that can form non-B DNA, particularly AT-rich inverted repeats that can form hairpin or cruciform structures [120][121][122]. Fragile sites affect DNA replication and lead to chromosome breakage, and are associated with disease development [123][124][125]. One of the key features of fragile sites is that they are replicated slower and in later stages (there are also early replicating fragile sites, (ERFSs), see below), and non-B DNA conformations formed in these repetitive regions are considered as one of the contributors for abnormal DNA replication within these regions [120].
Figure 3. (A) Schematic diagram of non-B DNA-induced "head-on" replication-transcription collision. (1) DNA replication progresses from chromosome location "X" to "Y"; (2) Transcription progresses within the same region, but from the opposite direction (from "Y" to "X") after (or before) DNA replication is complete in that area; (3) A non-B DNA structure formed between chromosome loci "X" and "Y" interrupts the initiation and/or progression of replication and transcription, resulting in "head-on" replication-transcription collision (shown as a lightning bolt). (B) Schematic diagram of non-B DNA-induced "co-directional" replication-transcription collision. (1) DNA replication and transcription occur simultaneously and "co-directionally" on the chromosome without collision; (2) A non-B DNA structure formed in front of the transcription machinery stalls transcription progression; (3) A DNA replication fork runs into the transcription machinery, resulting in a "co-directional" replication-transcription collision (shown as a lightning bolt).
Notably, most of the common fragile sites lie within, or span, known genes and are transcribed in many cells [126]. Moreover, in a study performed by Calin et al. [127], more than 50% of the analyzed microRNAs were mapped near known fragile sites, with a nine-fold greater occurrence than in non-fragile control regions. Because many of these elements within the fragile sites serve as templates for both DNA replication and transcription, and can disrupt the timing and progression of both activities, the two machineries could meet at these repetitive sequences. It is of particular interest that many actively transcribed large human genes (spanning more than 1.0 Mb) contain chromosomal fragile sites [128]. Some large genes such as fragile histidine triad (FHIT, 1.5 Mb), WW domain containing oxidoreductase (WWOX, 1.1 Mb), and IMP2 inner mitochondrial membrane peptidase-like (IMMP2L, 0.9 Mb) contain repetitive sequences in common fragile sites, FRA3B, FRA16D and FRA7K, respectively. Helmrich et al. [129] found that transcription in these large genes was very slow (~30 nucleotides/second) and took more than 11-13 h to finish. Because transcription in these regions takes longer than a cell cycle (~10 h for the cells studied), collisions between replication and transcription machineries were very likely. RNA synthesis stalling within the fragile sites colocalized with genomic breakage hotspots. Under mild replication stress by aphidicolin, FHIT and WWOX induced chromosomal breakage in human B-lymphoblasts where the genes were expressed, but not in myoblasts where the genes were silent. The IMMP2L gene was expressed in both cell types, but at a higher level in B-lymphoblasts, and the chromosomal lesions at the FRA7K fragile site were detected in both cell types and were ~3-fold higher in B-lymphoblasts [129]. 
Interestingly, although RNA-DNA hybrids were detected at sites of replication and transcription overlap, RNase H2 did not affect fragility of these repetitive sequences when transcribed [129]. However, another study reported that in epithelial and erythroid cells, transcription did not play a major role in chromosomal breakage in fragile sites within large genes [130]. Whether or not replication and transcription were stalled and/or resulted in collisions within these fragile sites is not clear.
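The argument that transcription of very large genes can outlast a cell cycle can be checked with a back-of-the-envelope calculation: time equals gene length divided by elongation rate. The sketch below uses the gene sizes, the approximate rate (~30 nucleotides/second), and the approximate cell cycle length (~10 h) quoted above. It assumes a constant elongation rate with no pausing, which is a simplification, so the estimates are not expected to reproduce the reported 11-13 h figure exactly. The function name is illustrative.

```python
SECONDS_PER_HOUR = 3600.0


def transcription_time_hours(gene_length_bp: float, rate_nt_per_s: float = 30.0) -> float:
    """Estimated time to transcribe a gene once, assuming a constant
    elongation rate and no pausing (a simplifying assumption)."""
    return gene_length_bp / rate_nt_per_s / SECONDS_PER_HOUR


# Gene sizes in Mb as quoted in the text.
GENE_SIZES_MB = {"FHIT": 1.5, "WWOX": 1.1, "IMMP2L": 0.9}
CELL_CYCLE_HOURS = 10.0  # approximate value quoted in the text

if __name__ == "__main__":
    for name, size_mb in GENE_SIZES_MB.items():
        hours = transcription_time_hours(size_mb * 1_000_000)
        print(f"{name}: {hours:.1f} h; exceeds cell cycle: {hours > CELL_CYCLE_HOURS}")
```

With these inputs, the estimates are roughly 13.9 h for FHIT, 10.2 h for WWOX, and 8.3 h for IMMP2L, which places the two largest genes at or beyond the approximate cell cycle length under the stated assumptions.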
In B lymphocytes, Barlow et al. identified a recurrent early replicating chromosomal fragility termed early replication fragile sites (ERFSs) that cause replication fork stalling under hydroxyurea (HU)-induced stress [131]. G + C nucleotides and repetitive elements such as long interspersed elements (LINEs) and short interspersed elements (SINEs) are significantly enriched in ERFSs. These fragile sites are located in highly transcribed genes, and transcription can significantly increase the fragility of the ERFSs [131]. Ataxia telangiectasia and Rad3-related (ATR) kinase plays an important role in coordinating transcription and replication in eukaryotic cells by suppressing transcription and yielding the right of way to replication forks [132]. Inhibiting ATR activity in B lymphocytes also significantly increased the fragility seen at ERFSs [131], similar to what had been seen on common fragile sites. Therefore, these ERFSs may also lead to collisions by impacting both DNA replication and transcription.
In addition to transcription disruption and replication fork collapse, slowing of DNA replication by collision with transcription complexes could give rise to the formation of non-B DNA structures in adjacent areas if the collision occurs within a repetitive sequence or within a continuous tract of several repetitive elements. It was recently reported that in an artificial system to allow study of replication-transcription collisions in actively dividing bacteria, duplications/deletions and base substitutions were the two major classes of mutations that occurred at replication-transcription collision regions [133]. The duplications and deletions were significantly affected by transcription, and were likely caused by replication stalling events at collision sites where the replication fork first encountered a transcription complex [133]. These data suggested that when a collision occurred, the repetitive sequences in the templates for DNA replication (particularly for lagging strand synthesis), for transcription, and/or the nascent RNA strand containing the repeats may have had a greater chance to form non-B DNA conformations and RNA secondary structures. Such structure-on-structure complexes might introduce more complexity in resolving the collision and restarting the replication forks, resulting in more genetic instability events including repeat unit duplication or deletion, DNA breakage, and/or recombination.

Non-B DNA and Replication-Transcription Collision in Cancer
Cancer cells often exhibit a unique phenomenon of "replication stress", where DNA replication fork progression in S phase is slower and the accuracy is reduced relative to normal cells. Although the mechanisms are not completely understood, replication stress in cancer cells may be the result of activation of oncogenes such as rat sarcoma (RAS), v-myc avian myelocytomatosis viral oncogene homolog (MYC), cyclin-dependent kinases (CDKs) and CYCLINs [134][135][136][137][138][139][140][141][142][143][144]. These oncogenes can deregulate E2F-dependent G1/S transcription to stimulate S-phase entry before cells are ready for genome duplication [145,146]. The conditions generated by these oncogenes can also maintain E2F activity after S-phase entry, which otherwise should be inactivated via a negative feedback loop [147], and manipulate DNA replication stress tolerance and genomic integrity [148]. Such inappropriately timed and slowed replication may overlap with transcription at regions containing repetitive sequences that form non-B DNA structures and lead to transcription-replication collision in actively transcribed regions as discussed above.
In fact, increased transcription and transcription-induced non-B DNA structure formation caused by activation of oncogenes may directly contribute to replication stress and chromosome instability in cancer cells [149]. The hormone estradiol (E2) can stimulate transcription of many E2-responsive genes and generate R-loops in transcribed regions. For example, Stork et al. found a significant enrichment and colocalization of R-loop formation, DSBs and gene rearrangement events (duplications, large deletions, inversions, and translocations) in breast cancer cells upon E2 treatment [150]. In human BJ fibroblast cells, overexpression of the HRAS V12 oncogene increased transcription-stimulated R-loop formation in many genes, as evidenced by RNA-DNA hybrids, and caused replication stress. Transiently suppressing transcription in HRAS V12-overexpressing cells with small molecule inhibitors, or treating the cells with RNase H1 to remove RNA-DNA hybrids, overcame this replication stress [149]. The authors suggested that the formation of a "transcription-stimulated structural barrel" can significantly impact both transcription and replication progression. Thus, collisions and/or conflicts between the two machineries are possible, and could result in subsequent chromosome instability.
Cancer cells are also known to have deregulated replication origin activities, including licensed origin scarcity during S phase due to lack of appropriate S-phase checkpoints and/or depletion of the replicative DNA helicase minichromosome maintenance complex 2-7 (MCM2-7), and unscheduled replication caused by overexpression of proteins such as chromatin licensing and DNA replication factor 1 (CDT1) and cell division cycle 6 (CDC6). Licensed origin scarcity could result in insufficient DNA replication initiation, and thus in delayed or incomplete replication, whereas unscheduled replication could cause re-replication (i.e., segments of the genome replicated more than once) or premature firing of DNA replication origins [144][145][146][151]. In both cases, deregulated replication could lead to a conflict or collision with the transcription machinery, resulting in genetic instability or cell death.
Telomestatin, a G-quadruplex stabilizer, can interrupt the loading of telomere maintenance proteins such as the telomere-capping protein TRF2 and topoisomerase III alpha, and cause DNA damage, telomere instability and cell death [152][153][154][155]. Hasegawa et al. found that both the telomestatin-induced 53BP1 foci at telomeric regions and the resulting cell death depended on DNA replication and transcription [156]. In fact, triggering replication stress in cancer cells as a potential cancer chemotherapeutic approach has been discussed previously [151][157][158][159][160]. Therefore, stimulating non-B DNA formation at actively transcribed regions with small molecules [156,161] may represent a therapeutic strategy to enhance the efficiency and specificity of traditional chemotherapies.

Conclusions
In summary, studies on the impact of non-B DNA conformations on the accuracy, efficiency and coordination of replication and transcription, and on potential collisions between them, have provided important information on mechanisms of genetic instability, although research in this field is still in its early stages. For example, fine mapping of replication-transcription collision sites under different conditions, particularly in cancer cells under replication stress, will be very informative for exploring the contribution of non-B DNA structures to replication-transcription-related genetic instability and cell viability. Further work in this area is warranted to better understand the mechanisms involved and to develop potential approaches to reduce non-B DNA-induced replication-transcription conflicts in healthy cells or to stimulate such collisions in cancer cells.