2.1. Excision of the Expanded (CTG•CAG)n Repeat
The most straightforward application of CRISPR/Cas9 in DM1, first reported by our group [18], is to remove the genetic cause of disease by precise excision of the expansion mutation. This was accomplished by designing two sgRNAs targeting flanking sequences at either end of the mutation, followed by joining of the two DSBs through NHEJ, while excluding the repeat-containing fragment in between (Figure 2a). The main advantage of this strategy is that the disease defect is corrected at the DNA level, so long repeat-containing transcripts are no longer produced and downstream toxic effects are eliminated. From a therapeutic point of view, this excision approach is feasible in DM1 because the (CUG)n repeat is not part of the DMPK open reading frame, while functional open reading frames in the long noncoding RNA DM1-AS have not been demonstrated yet [19]. Since expanded (CAG)n repeats in DM1-AS transcripts are subject to non-canonical, disease-related RAN translation [20], repeat excision will also abolish the production of toxic homopolymeric proteins.
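The dual-cut principle described above can be illustrated with a minimal sketch that treats the locus as a string: two blunt cuts flank the repeat and NHEJ joins the outer ends precisely. The flanking sequences, cut positions and the function name `excise_between_cuts` are hypothetical and chosen only for illustration; they are not the DMPK sequence or the sgRNA sites used in the cited studies, and the sketch ignores indels and inversions.

```python
def excise_between_cuts(seq: str, cut_upstream: int, cut_downstream: int) -> str:
    """Model precise NHEJ joining after two blunt cuts.

    Positions are 0-based indices between bases; everything from
    cut_upstream up to (not including) cut_downstream is lost.
    """
    if not 0 <= cut_upstream <= cut_downstream <= len(seq):
        raise ValueError("cut positions must satisfy 0 <= up <= down <= len(seq)")
    return seq[:cut_upstream] + seq[cut_downstream:]


# Hypothetical toy locus: short flanks around a (CTG)5 tract.
upstream = "GATTACAGGC"
repeat = "CTG" * 5
downstream = "TTCAGGACCA"
locus = upstream + repeat + downstream

# Cut 3 bp upstream and 4 bp downstream of the repeat (assumed positions).
edited = excise_between_cuts(
    locus,
    cut_upstream=len(upstream) - 3,
    cut_downstream=len(upstream) + len(repeat) + 4,
)
print(edited)
```

In this toy model the repeat is removed together with a few flanking base pairs, which mirrors the observation that the excised fragment always includes some flanking sequence.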
The success of the repeat excision strategy has been confirmed in a number of studies by other laboratories [21]. Together, these reports show the reliability and robustness of the CRISPR/Cas technique, since different choices were made regarding the sgRNA sequences, located closer to or further away from the (CTG•CAG)n sequence (Table 1). To prevent significant perturbation of the 3′ UTR in corrected DMPK transcripts, we reasoned that, besides the expanded repeat, we should remove as few flanking base pairs from the locus as possible. The design of sgRNAs, however, is dictated by the presence of a proper PAM sequence (NGG for SpCas9) and the number of predicted off-target sites elsewhere in the genome. In our hands, different sgRNAs, targeting flanking sequences or the repeat itself, demonstrated variable cutting efficiencies and some did not result in any cleavage at all [18]. This may be explained by the complex DNA hairpin structures that can be formed by expanded (CTG•CAG)n repeats [26]. A similar rationale was described by Provenzano et al. (2017), who speculated that the editing efficiency in regions close to the repeat might be influenced by its abnormal 3D structure [22]. They therefore chose to target the DM1 locus more distal to the repeat, ~200–300 base pairs up- and downstream. Whether the deleted flanking sequences harbor any regulatory 3′ UTR information for DMPK mRNA half-life, translation efficiency or subcellular localization, e.g., through binding of miRNAs or RNA-binding proteins, remains to be investigated.
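The PAM constraint mentioned above can be made concrete with a small sketch that lists candidate 20-nt protospacers followed by an NGG PAM on the forward strand. The function names `find_ngg_sites` and `reverse_complement`, the 20-nt guide length default and the toy sequence are assumptions made for illustration; this is not the design tool used in the cited studies, and it does not score off-target sites or cutting efficiency.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_ngg_sites(seq: str, guide_len: int = 20):
    """List (protospacer_start, protospacer, pam) for forward-strand NGG PAMs.

    Only PAMs with a full-length protospacer immediately 5' of them
    are reported.
    """
    seq = seq.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. within "GGG") are found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        start = pam_start - guide_len
        if start >= 0:
            sites.append((start, seq[start:pam_start], match.group(1)))
    return sites


# Hypothetical toy sequence with a single usable PAM.
toy = "A" * 20 + "TGG" + "C" * 5
sites = find_ngg_sites(toy)
print(sites)
```

Candidates on the opposite strand can be obtained by passing `reverse_complement(seq)` to the same function, keeping in mind that the reported coordinates then refer to the reverse-complemented sequence.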
Regardless of the combination of two sgRNAs used, all reports demonstrate dual cleavage followed by ligation of the two DSBs and exclusion of the repeat segment plus flanking parts. Expanded and unaffected alleles were equally well targeted. It should also be noted that no suitable single nucleotide polymorphisms (SNPs) are available near the repeat that could be used to discriminate between long and short alleles. In most cases, the new junction precisely matched joining of the two CRISPR/Cas9 cleavage sites. However, small and larger indels at the cleavage sites were also seen, as well as repeat inversions [18].
Notably, we and others also discovered that a DSB close to the expanded repeat (<50 bp) induces uncontrolled deletion of large repeat segments, thereby resulting in unpredictable repeat contraction [18]. This phenomenon, not observed when sgRNAs are directed further away from the repeat (>200 bp) [22], probably relates to the occurrence of unstable slipped-strand structures at (CTG•CAG)n tracts in or close to a DSB [27] (for an excellent review on this topic, see [29]). To us this demonstrates that reliable and predictable removal of an expanded repeat requires two highly effective sgRNAs and that single CRISPR/Cas9 cleavage must be avoided. Of note, if one of the two DSBs is repaired by NHEJ such that an indel is created, this site cannot be cut again and repeat excision is blocked.
Current evidence suggests that the DM1 triplet repeat can be removed from any cell type in the human body, which seems a prerequisite for gene therapy in vivo in a multisystemic disease like DM1. The repeat has been excised in unaffected and DM1 primary and immortalized myoblasts, induced pluripotent stem cells (iPSCs), embryonic stem cells (ESCs), iPSC-derived myogenic cells, iPSC-derived neural stem cells, MYOD1-expressing immortalized fibroblasts, HEK293T cells and transgenic mouse myoblasts (Table 1). Whether CRISPR/Cas9-mediated repeat excision is also possible in terminally differentiated cells like myotubes needs to be investigated. Excision efficiencies may vary between the different cell types and reports, likely depending on the choice of the sgRNAs. We propose, however, that the local chromatin organization surrounding the (CTG•CAG)n repeat does not play a dominant role in cleavage efficiency, given the observation that unaffected as well as expanded, hypermethylated alleles were successfully targeted.
The ultimate goal of removing the pathogenic repeat from a DM1 patient cell is improvement or, preferably, reversal of the disease situation. Whether that will indeed be possible in vivo depends on in vivo CRISPR/Cas9 activity, the reversibility of the molecular mechanisms and the resilience of the cells and tissues involved. Well-known biomarkers used to measure changes in DM1 disease status at the cellular level are the occurrence of repeat RNA/MBNL1 nuclear foci, ratios of certain DM1-typical alternative splice modes, miRNA expression and myogenic differentiation capacity. In most studies, one or more of these molecular measures were tested and these indeed improved after removal of the expansion (Table 1). One DNA biomarker for CDM that deserves special attention is hypermethylation of the CpG island surrounding the repeat in the DM1 locus [24]. This abnormal chromatin structure is commonly seen in patient cells with repeat lengths of over a few hundred triplets. In a collaborative project between the Eiges laboratory and our group, we compared CpG methylation before and after repeat removal in ESCs and immortalized myoblasts, both carrying CDM-size repeats with corresponding hypermethylation [24]. To our surprise, excision of the repeat in undifferentiated stem cells reset the methylation status in the locus, whereas methylation levels remained unchanged in affected myoblasts after deletion of the large expansion. These findings suggest a transition from a reversible to an irreversible heterochromatin state induced by the DM1 mutation, which must be taken into account when considering gene correction in differentiated cells in vitro and in vivo [24].
DNA editing strategies designed to excise a disease-causing repeat have also been developed for other microsatellite expansion disorders. These studies may be informative for therapy development in DM1, although approaches strongly depend on disease-specific features related to (i) the corresponding disease mechanism, i.e., loss- versus gain-of-function; (ii) the location of the unstable repeat in the mutated gene, i.e., in coding or non-coding sequences; (iii) the function of the gene or the repeat sequence itself, i.e., crucial, redundant or insignificant; and (iv) the length of the repeat, i.e., many disease-causing microsatellites are relatively short (<100–200 units) compared to the extreme expansions in many DM1 patients.
As in DM1, the unstable microsatellite in fragile X syndrome (FXS) is a non-coding repeat: a (CGG•CCG)n sequence in the 5′ UTR of FMR1 on the X-chromosome. A repeat of >200 triplets induces FMR1 silencing via hypermethylation and chromatin remodeling of the region. Indeed, removal of pathogenic repeats in FXS iPSCs and ESCs was associated with reduced methylation and reactivation of FMR1. CRISPR/Cas9-mediated editing was accomplished by either a single cleavage 20 bp upstream of the repeat or dual cleavage at either side (~55 bp) of the repeat. Following NHEJ, the entire repeat including short flanking sequences was removed, also in wt alleles (n
Another noncoding unstable microsatellite is the (GAA)n repeat in intron 1 of FXN, associated with the recessive disorder Friedreich’s ataxia (FRDA), characterized by heterochromatinization of the gene and low protein production. Repeats containing 82 and 190 triplets were excised by dual CRISPR/Cas9 approaches (using SpCas9 and SaCas9) and different sgRNA combinations at either side of the repeat (~100–600 bp up- and downstream) in transgenic mouse fibroblasts and transfected mouse muscle [32]. (GAA•TTC)n removal raised FXN transcript and protein production.
In many (CAG)n expansion diseases, the pathogenic repeat is located in a coding sequence, giving rise to the production of proteins with extended polyglutamine (polyQ) stretches. In Huntington’s disease (HD), for example, the pathogenic (CAG)n repeat is located in exon 1 of HTT. Dual CRISPR/Cas9 cleavage, ~35 bp up- as well as downstream of the repeat, successfully excised a (CAG)140 repeat plus flanking sequences from an HTT transgene in an HD mouse model in vivo [33]. Consequently, excision inactivated the HTT transgene, mutant HTT protein was no longer produced and the neurological phenotype in the mice was attenuated. Excision of an expanded (CAG)78 repeat from exon 10 in ATXN3 in spinocerebellar ataxia type 3 (SCA3) patient-derived iPSCs [34] was also performed through dual CRISPR/Cas9 cleavage (82 bp up- and 11 bp downstream of the repeat). Repeat excision resulted in a premature stop codon in exon 11, but truncated ATXN3 protein was still able to associate with its normal binding partner ubiquitin.
Finally, in an approach to decrease off-target effects and increase specificity, a second report on HD used paired Cas9 nickases, making nicks in the most 5′ CAG triplet and ~60 bp downstream of the repeat, in a series of HD fibroblasts [35]. Both unaffected (CAG)17/21 and expanded (CAG)44–151 repeats were efficiently removed from HTT, thereby inactivating these alleles. It remains to be determined whether a paired Cas9n strategy [36] will work for typical repeat lengths of hundreds to several thousands of triplets in DM1 [7]. Long distances between pairs of Cas9n may induce slipped-strand structures, leading to an unpredictable outcome [18].