Isogenic GAA-KO Murine Muscle Cell Lines Mimicking Severe Pompe Mutations as Preclinical Models for the Screening of Potential Gene Therapy Strategies

Pompe disease (PD) is a rare disorder caused by mutations in the acid alpha-glucosidase (GAA) gene. Most gene therapies (GT) partially rely on the cross-correction of unmodified cells through the uptake of the GAA enzyme secreted by corrected cells. In the present study, we generated isogenic murine GAA-KO cell lines resembling severe mutations from Pompe patients. All of the generated GAA-KO cells lacked GAA activity and showed increased autophagy markers, increased glycogen content upon myotube differentiation and downregulation of surface mannose 6-phosphate receptors (CI-MPRs), validating them as models for PD. Additionally, different chimeric murine GAA proteins (IFG, IFLG and 2G) were designed with the aim of improving their therapeutic activity. Phenotypic rescue analyses using lentiviral vectors point to the IFG chimera as the best candidate for restoring GAA activity and normalising the autophagic marker p62 and surface levels of CI-MPRs. Interestingly, in vivo administration of liver-directed AAVs expressing the chimeras further confirmed the good behaviour of IFG, achieving cross-correction in heart tissue. In summary, we generated different isogenic murine muscle cell lines mimicking the severe PD phenotype and validated their applicability as preclinical models in order to reduce animal experimentation.

Cellular models for PD have traditionally been based on primary myoblasts isolated from human patients [29] or from GAA-KO mouse models [30,31], or on cells derived from iPSCs from Pompe patients [31][32][33]. Although they are fundamental for studies of disease mechanisms, these cellular model systems are difficult to culture and work with [29,34,35]. As a complementary toolbox to these cellular models, different groups have generated immortalised cell lines derived from GAA-KO mice [30,31]. However, these immortalised cellular models replicate lysosomal but not autophagic pathology [30,36]. In addition, they do not harbour mutations analogous to those found in PD patients, which could also help to understand the basic mechanisms involved in the disease. Moreover, most preclinical studies use human GAA in murine cells or in mice, potentially modifying the outcome due to inter-species differences.
In the present study, we used CRISPR/Cas9 tools to generate isogenic murine GAA-KO cell lines resembling severe mutations found in Pompe patients, and validated these cells as models for PD. Then, we designed different chimeric murine GAA proteins harbouring (1) the IFNβ-1 leader peptide to improve secretion (IFG), (2) the IFNβ-1 leader peptide plus a lysosomal signal peptide (LSP) to improve secretion and lysosomal targeting (IFLG) and (3) the fusion protein of insulin-like growth factor 2 (2G) to allow glycosylation-independent lysosomal targeting [37]. We performed a phenotypic rescue analysis on our generated models by using lentiviral vectors, which expressed different chimeras, with the IFG chimera demonstrating the best restoration of GAA activity and normalisation of the autophagic marker p62 and surface levels of CI-MPRs. The in vivo administration of liver-directed AAVs expressing different chimeras confirmed superior behaviour of IFG and validated our new cellular models as easy-to-use tools to be included in the Pompe toolbox in order to minimise animal experimentation.

Generation of GAA-KO Isogenic Murine Muscle Cell Models by Genome Editing
In order to generate relevant murine cellular models, we looked for the most severe mutations listed at the Pompe Center database (Erasmus MC), which catalogues mutations identified worldwide according to their localisation and severity (www.pompevariantdatabase.nl) (accessed on 26 January 2022), and selected the most severe human mutations conserved in the mouse Gaa gene (Figure 1A, top). Because skeletal muscle is one of the main targets for GT of PD, we used Sol8 cells (a murine skeletal muscle cell line) as the host cell line (Figure 1A, top). The selected mutations are described by the Erasmus Medical Centre database as severe because they result in truncation, frame shifts that generate aberrant proteins or inactivation of catalytic sites. In addition, all these mutations are associated with early IOPD.
We designed different gRNAs (g1, gCys, gE5 and gE7) targeting the murine loci corresponding to the selected human mutations (Figure 1A, second-top and Table S1). Sol8 cells were nucleofected with different ribonucleoparticles (RNPs) that contained Cas9 and the different guide RNAs (Figure 1B). The Gaa editing efficiency ranged from 30% to 35% as measured by the T7EI assay (Figure 1C, bottom graph, grey bars), and from 44% to 87% as estimated by the ICE algorithm (Figure 1C, bottom graph, black bars). These discrepancies (stronger in E2_bulk and E7_bulk) are due to the high frequency of insertions (E2_bulk) and deletions (E7_bulk) of one nucleotide, which are detected by the ICE algorithm but not by the T7EI assay (Figure 1D, right graphs). From each Gaa-edited bulk population, four to five clones were isolated and characterised (Figure S1, left). One representative clone of each mutation was selected (Figure S1, right) and named according to the intended Pompe mutation and the clone number: ∆ATG_14 (mutation in exon 2 targeting the ATG), E2_1 (mutation in exon 2), E5_6 (mutation in exon 5) and E7_1 (mutation in exon 7) (Figure 1E). All selected clones were confirmed to have bi-allelic targeting by means of cDNA sequencing (Figure 1E and Table S1). Next, we characterised the effect of the different mutations on the primary structure (Figure 1F) and the predicted tertiary structure (Figure 1G) of the resulting polypeptides. The mutations in E2_1 and E5_6 clones resulted in premature stop codons, generating truncated proteins that lack the catalytic domain.
Figure 1. (A) Scheme of the strategy designed to generate murine Sol8 cell Pompe models. These cellular models, ∆ATG, E2, E5 and E7, resemble some mutations present in Pompe patients' cells, such as c.3G>A and c.18_25del, c.340_341insT, c.982_988del and c.1199_1210del, respectively. These models were used to generate different protein modifications, from no protein expression to different frame shifts generating aberrant proteins, or even deletions of critical amino acids at the catalytic site. (B) Representative scheme of the nucleofection process with RNP, cellular cloning and analysis of the cell clones for the selection of edited positive clones. Image created with BioRender.com (accessed on 30 April 2022). (C) Editing efficiencies by CRISPR/Cas9 targeted at different exons in the murine GAA locus determined by the T7EI assay and the ICE algorithm (see M&M). Top, T7EI assay of the Sol8 nucleofected cells (RNP g1-ATG, gCys, gE5 and gE7) and NT (non-treated). Bottom, graph of the editing efficiencies in Sol8 cells nucleofected with RNP g1 (∆ATG), gCys (E2_bulk), gE5 (E5_bulk) and gE7 (E7_bulk) using the T7EI assay vs. the ICE algorithm. (D) Graphs showing the profile of indels generated in nucleofected Sol8 cells (∆ATG, E2_bulk, E5_bulk and E7_bulk), using the ICE algorithm. The coordinate zero represents the cut site, negative values represent deletions of different lengths and positive values represent insertions. (E) cDNA sequence of each Sol8 clone with different insertions and deletions compared with the WT Sol8 cells using the ICE tool. The insertions are marked in red and the deletions as "-". (F) Schematic representation of the different domains of mGAA WT, E2_1, E5_6 and E7_1 based on their homology with hGAA. (G) Murine GAA (WT, E2_1, E5_6 and E7_1) 3D structures predicted using SwissProt modelling. For 3D structure prediction of the different clones, we translated their sequenced mRNA using ExPASy and the structures were predicted with SwissProt modelling.
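The link between a one-nucleotide insertion and a truncated protein can be illustrated with a minimal sketch. It assumes the standard genetic code and uses a short, hypothetical toy open reading frame (not the murine Gaa sequence); the function and variable names are illustrative only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence from its first base up to the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3].upper()]
        if residue == "*":  # stop codon: translation ends here
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy ORF used only for illustration (NOT the Gaa sequence).
wild_type = "ATGGCTGAATGGCTGTAA"
# A single inserted T after the start codon shifts the reading frame.
edited = wild_type[:3] + "T" + wild_type[3:]

print("WT protein:    ", translate(wild_type))
print("Edited protein:", translate(edited))
```

In this toy case the inserted base brings a TGA codon into frame two codons after the start, so the edited sequence yields a much shorter product than the wild type, which is the same kind of effect described for the E2_1 and E5_6 clones.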

Murine GAA-KO Sol8 Clones Lack GAA Activity, Present Increased Autophagy Markers and Glycogen Content as Well as Surface Downregulation of Mannose 6-Phosphate Receptors (CI-MPRs)
All isogenic Sol8 cellular models generated for each mutation (∆ATG_14, E2_1, E5_6 and E7_1) showed a partial reduction in GAA mRNA (Figure 2A) and an almost complete lack of GAA activity (Figure 2B).
The loss of GAA activity in Pompe patients results in glycogen accumulation and impaired autophagy, causing muscle deterioration [39]. We therefore analysed whether our muscle cellular models could mimic these phenotypic hallmarks of PD. The accumulation of glycogen in gene-edited bulk populations was variable (Figure S3A), probably due to differences in genome editing efficacy (see Figure 1C, bottom, black bars). All selected clones tended to accumulate glycogen upon myotube differentiation compared with WT, but only the glycogen deposits of ∆ATG_14 and E7_1 were significant (Figure 2C and Figure S3B). Similarly, p62, a hallmark of impaired autophagy [40] and a marker for disease progression in PD [41], was also increased in clones and bulk populations before (Figure 2D, left-bottom panel and Figure 2E, white bars) and after (Figure 2D, right-bottom panel and Figure 2E, black bars) differentiation into myotubes. In line with this, the lysosomal marker LAMP-1 tended to be increased in the generated clones after differentiation (Figure 2D, top panels and Figure 2F, black bars). The elevated p62 and LAMP-1 levels indicate that autophagic build-up is occurring in our cellular models, although not all clones reached statistical significance.
Cardone et al. previously showed reduced levels of CI-MPRs at the surface of fibroblasts from Pompe patients [42]. We therefore analysed whether the different GAA-KO cellular models mimic this phenotypic defect. Although the mRNA expression levels were equivalent in wild-type (WT) Sol8 and in the selected clones, E2_1, E5_6 and E7_1 (Figure 2G), a clear decreasing trend in surface CI-MPR levels was observed in all clones and bulk populations, being especially marked in the E2_1 clone (Figure 2H,I).

Design of Different Chimeric GAA Proteins for Gene Therapy Approaches
Next, we designed three different GAA chimeras based on the codon-optimised sequence of murine cDNA (Figure 3A): (1) the IFG protein, containing the leader peptide of murine IFNβ1 at the N-terminus, which should increase mGAA secretion; (2) the IFLG protein, which contains, in addition to the IFNβ1 leader peptide, four lysosomal signal peptides (LSPs; two YQRLC and two CNPGY) separated by GAG linkers (Figure 3A); and (3) the 2G protein, which incorporates the leader peptide and a 7-68 polypeptide of mature mIGF2 (Figure 3A), designed to retain binding to M6PR/IGF2R [43] while reducing binding to IGFIR and IGFBP [43][44][45].
Figure 3. (B) Values above 0 correspond to hydrophobic regions, and those below 0 correspond to hydrophilic ones. The addition of IFLG and 2G to mGAA creates a more hydrophilic region in the N-terminus than the native one. IFG has a similar tendency to the native one, due to the hydrophilicity of the leader peptide. (C) Three-dimensional structure prediction of optimised chimeric mGAAs by I-TASSER. The optimisations included in mGAA are present in the amino terminus of the global structure (coloured in green), coupled with the trefoil domain of mGAA, and do not interfere with the catalytic domain (catalytic domain in yellow, red arrows pointing to the two catalytic asparagine residues). (D) Scheme showing the rationale of GAA modifications to improve GT strategies. Created with BioRender.com. Contrary to endogenous GAA, GT-optimised GAAs (opGAA) must be primarily secreted in order to cross-correct as many cells as possible. The leader peptide of IFNβ-1 or IGF-2 will enhance secretion of the different chimeric GAAs. Once opGAA is secreted to the medium, it must be taken up by enzyme-deficient cells to correct GAA deficiency and degrade the accumulated glycogen (cross-correction). IGF-2 and LSP sequences are included to improve uptake and/or lysosomal targeting of secreted opGAAs.
The hydropathicity pattern prediction using the Kyte-Doolittle scale in ExPASy [46,47] showed a very similar hydrophobicity pattern of the different mGAA chimeras compared to the original mGAA (Figure 3B). Additionally, a theoretical reconstruction of the different mGAA proteins using I-TASSER prediction, PyMOL and Chimera software showed that all chimeric proteins preserved the domains involved in GAA activity without significant differences in the 3D structure of the GAA protein (Figure 3C). A scheme of the potential therapeutic advantages in the secretion and uptake of the different chimeric mGAA proteins for GT strategies can be seen in Figure 3D.

Murine Cellular Models Point to IFG as the Best Chimera in Terms of Expression, Secretion and Restoration of PD Defects
To study the performance of the different GAA chimeras for potential GT application, we generated different lentiviral vectors (LVs) expressing IFLG, IFG and 2G under the spleen focus-forming virus (SFFV) promoter (SIFLG, SIFG and S2G, respectively; Figure 4A). The E2_1 GAA-KO cellular model was transduced with these different LVs (the vector copy number per cell in each population is indicated in Figure S4A) to investigate the behaviour of the different GAA chimeras in restoring GAA activity (Figure 4B, graph and middle panel) and the secretion level of the optimised GAA (Figure 4B, right panel). The E2_1 clone transduced with LVs expressing the IFG chimera (SIFG) showed the highest GAA restoration and secretion activities (Figure 4B, graph), despite the fact that SIFLG showed higher levels of processed protein (Figure 4B, middle panel). This could be due to the lower activity of the SIFLG protein as a consequence of the insertion of the LSP. As expected, SIFG and SIFLG were mostly processed to the active forms of 70-75 kDa intracellularly (Figure 4B, middle panel) and secreted as unprocessed 110 kDa protein (Figure 4B, right panel). However, in E2_1 cells, most of the 2G chimera was unprocessed and retained intracellularly, with poor secretion (Figures 4B and S6). Since 2G remains largely unprocessed, its ability to restore GAA activity in E2_1 cells was also reduced in comparison to the other chimeras (Figure 4B, left graph). We next analysed the therapeutic potency of the different GAA chimeras in the E2_1 cellular model. Interestingly, all LVs could normalise p62 levels (Figures 4C and S4), although only LVs expressing SIFG showed a partial improvement in restoring normal surface levels of CI-MPRs (Figure 4D).
To identify whether potential differences in processing/secretion were cell-dependent, we investigated the expression, processing and secretion of the different GAA chimeras in myeloid (RAW 264.7) and hepatic (Hepa 1-6) murine cell lines. Interestingly, although the expression and processing of the different chimeras were similar in all cell lines (Figure 4B,E,F, left graphs and Figures S5 and S6), only Hepa 1-6 efficiently secreted all chimeras, including the 2G protein (Figure 4F, right panel).

Human Cellular Models Uncover Crucial Differences in Secretion of the 2G Chimera Compared to Murine Cells
We next investigated the behaviour of the murine GAA chimeras in K562 (myelogenous leukaemia-lymphoblast), Meg-01 (myelogenous leukaemia-megakaryoblast) and SJCRH30 (rhabdomyosarcoma-muscle cells) human cell lines in order to determine potential cross-functions of the different mouse-derived domains (IFLG, IFG and 2G) in human cells. To this end, these cell lines were transduced with the different therapeutic LVs, and GAA protein expression and activity were analysed as previously described (Figure 5A-C).
We did not observe significant differences in intracellular alpha-glucosidase activity or in GAA processing (left graphs and Western blots (middle panels) of Figure 5A-C) compared with murine cell lines (left graphs and Western blots (middle panels) of Figure 4B,E,F): as in murine cells, both myeloid (Figure 5A,B) and muscle (Figure 5C) human cells transduced with the SIFG LVs achieved their highest intracellular activity in relation to their vector copy number (Table S3). Surprisingly, the secretion of the 2G protein (cells transduced with S2G LVs) was easily detected in all human cell lines (Figure 5A-C; Western blots, middle panels and Figure S6). In fact, this chimera presented the highest secretion level of all chimeras in human myeloid cells (K562 and Meg-01) and similar levels to the IFG protein in the human muscle cell line SJCRH30 (Figure S6). Therefore, these data suggest that the murine cellular models do not necessarily mimic the secretion profile of newly designed proteins in human cells.

Poor Cross-Correction of GAA-KO Cellular Model with GAA Chimeras
Since IFLG and 2G chimeras were designed to enhance GAA cellular/lysosomal uptake as described in Figure 3D, we analysed the efficacy of cross-correction of the IFG, IFLG and 2G proteins by measuring their functional uptake in E2_1 cells (Figure 6A). First, we quantified the GAA activity found in the conditioned media obtained from non-transduced (NT) K562 cells and from K562 cells transduced with the different LVs (Figure S7). The different conditioned media were added to the E2_1 clone, and GAA activity was measured 20 h later. A significant restoration of the enzymatic activity was observed in all E2_1 cells incubated with conditioned media from transduced K562 cells (SIFLG, SIFG and S2G), but not when using media from non-transduced K562 cells (Figure 6B). However, no significant differences were observed between the different chimeras. Moreover, the GAA activity achieved in cross-corrected E2_1 cells was 5-10 times lower than in WT Sol8 cells (Figure 2B vs. Figure 6B), emphasising the difficulties in cross-correcting Pompe muscle cells. However, it is interesting to highlight that this low uptake of GAA by E2_1 cells mimics what is observed in muscle from PD patients and is probably due, at least in part, to the low CI-MPR levels present in GAA-KO cells.

In Vivo Comparison of the Different mGAA Chimeras in GAA-KO Mice
We finally analysed the performance of the different GAA chimeras for in vivo gene therapy application, focusing on their capacity for cross-correction. To this end, we generated different adeno-associated viral vectors (AAV) expressing codon-optimised murine GAA, IFLG, IFG and 2G under the control of the human alpha 1-antitrypsin (hAAT) promoter, a human hepato-specific promoter (AG, AIFLG, AIFG and A2G, respectively; Figure 7A), and tested them in GAA-KO mice (Figure 7B). Mice were sacrificed one month after the injection, and the vector genome copy number (Figure 7C), as well as the GAA activity in the liver (Figure 7D) and serum (Figure 7E, left graph), were analysed to determine in vivo GAA expression and secretion activity for each of the constructs. As already reported [48], the unmodified codon-optimised GAA appears to be the best performer in terms of expression in the liver (Figure 7D, right graph). However, modified GAAs were secreted with similar efficacy, as evidenced by normalising the GAA activity in serum to the GAA activity in the liver (Figure 7E, right). We next analysed GAA activity and glycogen content in the heart as an indicator of the ability of the different GAAs to cross-correct muscle tissue. As can be observed in Figure 7F, left and Figure 7G, all chimeras appear to behave similarly, restoring physiological levels of GAA activity and reducing glycogen levels, although the IFG and 2G modified GAAs seem to improve the uptake compared to wild-type GAA (AG) and AIFLG (Figure 7F, right).

Discussion
The development of accurate cellular models helps preclinical studies based on the screening of new drugs or GT tools. Therefore, one of the aims of this manuscript was to generate a murine-murine cellular model to investigate rationally designed murine GAA chimeras and refine the strategy before translation into animal models. There are several in vitro murine cellular models available [31,36,49]; unfortunately, they do not fully mimic severe IOPD signatures or LOPD disease progression. Additionally, engineered GAA-KO mice generally lack mutations analogous to those described in PD patients.
Being aware of the genetic and physiological interspecific differences between the gold-standard GAA-KO model and most PD patients, we first generated murine muscle cell models using genome editing to reproduce relevant PD mutations listed in the PD database [50,51]. Interestingly, in addition to succeeding in abrogating GAA activity, all cellular models generated for this work presented several phenotypic defects characteristic of PD, such as increased autophagic build-up and glycogen accumulation as well as a downregulation of surface CI-MPRs, a phenomenon that has been reported previously only in fibroblasts from Pompe patients [42,52].
A massive accumulation of autophagic debris, also known as autophagic build-up, is a landmark of PD and contributes to the development of muscle weakness and disease progression [41]. This phenotype results in the accumulation of p62/LC3 (autophagic markers) and LAMP-1 (lysosomal marker), both of which are clearly observed in patients and animal models [5]. However, although previously described murine and human cellular models resemble glycogen accumulation [53], only a few human cellular models have been reported to mimic autophagic build-up [54]. Surprisingly enough, our murine isogenic cellular models not only accumulated glycogen, but also presented clear evidence of autophagic build-up after myotube differentiation, as indicated by increased p62 and LAMP-1.
Enzyme replacement therapy (ERT) and gene therapy cross-correction strategies are based on the binding of GAA to CI-MPRs: the complexes enter the cells through clathrin-coated vesicles and, finally, endosomal fusion occurs in order to (1) deliver the GAA enzyme into the lysosomes, where it matures and becomes active, and (2) recycle the M6PR to the cell surface [2,55]. It has been described that autophagic build-up disrupts normal protein trafficking. In particular, Fukuda et al. detected unprocessed GAA as well as CI-MPR proteins trapped in autophagic areas in cells from GAA-KO mice [56]. This processing block could lead not only to lower GAA activity within the cells, but also to lower M6PR recycling and surface availability. In line with this, Cardone et al. found a marked surface CI-MPR reduction that correlated with PD severity [42], pointing to defects in receptor recycling as the driver of this phenomenon. This reduction in surface CI-MPRs in Pompe patients' cells was confirmed by several authors [52,56]. All of our murine cellular models presented lower surface expression of CI-MPRs, although this reduction was highly variable depending on culture conditions (data not shown). Therefore, our isogenic cellular models represent the first ones that allow us to investigate the effects of patients' GAA mutations on autophagic build-up and surface CI-MPR downregulation. The reduction in membrane-bound CI-MPRs in cells lacking GAA can explain the poor responses to ERT in the most severe PD patients [1,57,58] and favours the search for new strategies focused not only on GAA restoration, but also on increasing CI-MPR surface expression, such as the use of salmeterol [59,60] or clenbuterol, two selective β2-agonists [61]. In this sense, there is an ongoing phase II clinical trial with clenbuterol in adult patients with PD stably treated with ERT (NCT04094948).
To validate the potential applicability of our new isogenic cellular models as tools for preclinical studies, we analysed the ability of three different GAA chimeras to restore their GAA activity and their phenotypic defects directly (LV-transduced cells) or by cross-correction (adding supernatant from GAA-expressing cells). Interestingly, only the SIFG LVs could partially overcome CI-MPRs surface reduction and also normalised p62 levels, indicating that IFG could be a suitable candidate for GT applications. This observation was confirmed by our in vivo analysis using liver-directed AAVs on GAA-KO mice, validating our in vitro cellular models as a tool to refine animal experimentation.
Our data on the murine IGF2-GAA (2G) chimera contrast with the results obtained by several groups using human IGF2-GAA proteins, showing clear improvements in their therapeutic activity in comparison with WT GAA [48]. The differences in IGF2-GAA processing in human versus murine myeloid cells could be a consequence of several factors. Interestingly, mouse Igf2 expression declines after birth [62], while human IGF2 remains active throughout life [63]. The fact that murine and human IGF2 share an amino acid identity of 94% (Figure S8) could explain the good secretion profile of the murine 2G chimera by human cells.
In summary, we generated new isogenic cellular models of severe PD that mimic hallmark features described in Pompe patients: a lack of GAA activity, the accumulation of glycogen, autophagic build-up and the downregulation of surface CI-MPRs.
Our isogenic murine GAA-KO models, together with the human cell lines, are useful tools for the analysis of LVs expressing different GAA chimeras. In this sense, although LV-IFG (SIFG) is a potential candidate for GT of PD, we consider that it requires further refinement before moving forward to GT applications.

ICE
Genomic DNA from Sol8 cells was extracted 5 days after nucleofection using a QIAamp DNA mini kit (Qiagen) following the manufacturer's instructions. The genomic regions flanking the CRISPR target site for each treatment were amplified by PCR using different pairs of primers with KAPA Taq PCR (Kapa Biosystems): GAA exon 2 Fw: AAGATGCTCTGGCTGCCT and GAA exon 2 Rev: TGCTCTGCCTAGCCTGTC for ∆ATG and E2 cells; GAA Exon 4 Fw: AGTTCCTGCAGCTGTCCA and GAA Intron 6 Rev: AAGTGTTTGGGCTCAGGAA for E5 cells; GAA Exon 6 common Fw: TCCTGAGCCCAAACACTTCT and GAA Exon 8 Rev: CCACGATCATCATGTAGCG for E7 cells. The fragments were purified with the QIAquick PCR product purification kit (Qiagen), in accordance with the manufacturer's protocols. For analysing allele modification frequencies, we used ICE (Inference of CRISPR Edits), a web-based analysis tool developed by Synthego (https://ice.synthego.com/#/, accessed on 20 October 2019) [65]. Our purified PCR products were Sanger-sequenced using both PCR primers. Then, each sequence chromatogram was analysed with the ICE software (Synthego, Silicon Valley, CA, USA). Analyses were performed using a control sequence. The ICE score showed editing by NHEJ.
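As a quick sanity check on the primer pairs listed above, the following minimal sketch reports the length, GC fraction and reverse complement of each primer. The helper names are illustrative, and the Exon 6 forward primer is written without the hyphen that appears in some renderings of the text, on the assumption that the hyphen is a line-break artefact.

```python
# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "GAA exon 2 Fw": "AAGATGCTCTGGCTGCCT",
    "GAA exon 2 Rev": "TGCTCTGCCTAGCCTGTC",
    "GAA Exon 4 Fw": "AGTTCCTGCAGCTGTCCA",
    "GAA Intron 6 Rev": "AAGTGTTTGGGCTCAGGAA",
    # Assumption: any hyphen inside this sequence is a line-break artefact.
    "GAA Exon 6 common Fw": "TCCTGAGCCCAAACACTTCT",
    "GAA Exon 8 Rev": "CCACGATCATCATGTAGCG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"reverse complement {reverse_complement(seq)}")
```

The reverse complement is useful when locating a reverse primer on the forward strand of the amplicon before submitting Sanger traces for analysis.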

Off-Target Analysis
The top-listed off-targets were selected from the CRISPR design tool (Synthego) for the different Sol8 GAA-KO clones. Off-target analyses were performed by ICE analysis, as previously described. The genomic regions flanking the CRISPR target site for each off-target in the different clones were amplified by PCR using different pairs of primers (Table S2).

Plasmid Design
First, the murine GAA sequence was codon optimised by GenScript (Piscataway, NJ, USA). The 2G sequence was as follows: MGIPVGKSMLVLLISLAFALCCIAALCGGELVDTLQFVCSDRGFYFSRPSSRANRRSRGIVEECCFRSCDLALLETYCATPAKSEGAP, containing the signal peptide, the IGF2 sequence without six amino acids which presented protease cutting sites and two spacer amino acids. Optimised mGAA was added after this sequence. The IFG construct contained the interferon beta 1 leader peptide: MNNRWILHAAFLLCFSTTALS followed by optimised mGAA. IFLG contained the interferon beta 1 leader peptide followed by lysosomal sorting peptides (LSP) described by Dekiwadia et al. [66]. Between LSPs, spacers and flexible amino acids were added: MNNRWILHAAFLLCFSTTALSYQRLCGGACNPGYGAGYQRLCGGACNPGYAI. Optimised mGAA was added after this sequence. All sequences were obtained from the UniProtKB database [67]. Predictions of hydrophobicity patterns were performed through the online database ExPASy Bioinformatics Resources Portal [68] and the Kyte-Doolittle hydropathy scale [46].
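For readers who want to reproduce a hydropathy profile offline, the sketch below computes a sliding-window Kyte-Doolittle average for the N-terminal sequences given above. It is an illustration rather than the procedure used here: the window size of 9 and the unweighted mean are assumptions, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# N-terminal sequences taken from the text.
IFNB1_LEADER = "MNNRWILHAAFLLCFSTTALS"
IFLG_N_TERMINUS = "MNNRWILHAAFLLCFSTTALSYQRLCGGACNPGYGAGYQRLCGGACNPGYAI"


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Return the mean Kyte-Doolittle score of each full window along seq.

    The window size is an assumption; values above 0 indicate hydrophobic
    stretches and values below 0 indicate hydrophilic ones.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


for label, seq in [("IFG leader", IFNB1_LEADER), ("IFLG N-terminus", IFLG_N_TERMINUS)]:
    profile = hydropathy_profile(seq)
    print(label, [round(value, 2) for value in profile])
```

Sequences shorter than the window return an empty profile, so the window should be reduced for very short peptides.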

Lentiviral Constructions
The lentiviral plasmid SIFLG was obtained by the incorporation of the sequence IFLG in the place of eGFP in a 2nd-generation lentiviral backbone, SEWP. pUC57 plasmid (GenScript) containing the IFLG sequence and SEWP were digested with BamHI and SbfI (New England Biolabs) and the resulting fragments were ligated with T4 DNA ligase (New England Biolabs). The lentiviral plasmid SIFG was obtained by cloning the sequence IFG in the place of IFLG in the SIFLG lentiviral plasmid. pUC57 plasmid (GenScript) containing the IFG sequence and SIFLG were digested with BamHI and AsiSI (New England Biolabs), and the resulting fragments were ligated with T4 DNA ligase (New England Biolabs). The lentiviral plasmid S2G was obtained by cloning the sequence 2G in the place of IFLG in the SIFLG lentiviral plasmid. pUC57 plasmid (GenScript) containing the 2G sequence and SIFLG were digested with PacI and AsiSI (New England Biolabs), and the resulting fragments were ligated with T4 DNA ligase (New England Biolabs). The lentiviral plasmid SG was obtained after the digestion of S2G plasmid with AsiSI (New England Biolabs) in order to eliminate the IGF2 cassette, and the resulting plasmid was ligated with T4 DNA ligase (New England Biolabs). After the different ligations and transformation into E. coli Stbl3 competent bacteria (Life Technologies, Thermo Fisher Scientific, Waltham, MA, USA), the plasmids were obtained using Wizard® Plus SV Minipreps DNA (Promega, Madison, WI, USA). The restriction pattern was verified and the whole plasmid was eventually sequenced. Maxi-production was performed using NucleoBond® Xtra Maxi (Macherey-Nagel, Düren, Germany).

Vector Production and Virus Titration
LV particles were produced by polyethylenimine (PEI) (408727, Sigma-Aldrich, St. Louis, MO, USA) transfection, as previously described [69]. Briefly, for second-generation LVs, 293T packaging cells were transfected with packaging plasmid pCMVdR8.91, plasmid pMD2.G.47 encoding the vesicular stomatitis virus (VSV-G) envelope gene (http://www.addgene.org/Didier_Trono/, accessed on 2 June 2020) and the desired vector plasmid (SG, SIFLG, SIFG or S2G). The producer cells were cultured for 24, 48 and 72 h and the viral supernatants were collected and filtered through 0.45 µm filters (Nalgene, Rochester, NY, USA). The viral particles were then concentrated by ultracentrifugation in a Beckman Optima Centrifuge (Beckman Coulter, CA, USA) at 40,000 rpm for 2 h at 4 °C, and the viral pellets were resuspended in StemSpan medium (StemCell Technologies, Vancouver, Canada) for 1 h on ice, aliquoted, and immediately frozen at −80 °C. Viral titres (transduction units [TU]/mL) were calculated using quantitative PCR. Briefly, 10^5 K562 cells were transduced with serially diluted amounts of LV. Genomic DNA was isolated (10^5 cells, equivalent to 0.6 µg of genomic DNA) with the QIAamp DNA Mini Kit (Qiagen), and the number of integrated LV copies was measured using a standard curve (from 10^5 to 10 copies) of plasmid DNA. We used KAPA SYBR FAST Universal qPCR (KAPA Biosystems) in a Mx3005P QPCR System from Stratagene (Agilent Technologies, Santa Clara, CA, USA). The primers used for titration were ∆U3 (fw: GACGGTACAGGCCAGACAA) and PBS (rev: TGGTGCAAATGAGTTTTCCA).
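The qPCR titration described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the Ct values are hypothetical, and the final formula (vector copies per cell multiplied by the number of cells transduced, divided by the volume of vector added) is a commonly used convention that the text does not spell out.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of a sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)


def titre_tu_per_ml(copies_in_reaction, cells_in_reaction,
                    cells_transduced, vector_volume_ml):
    """Assumed convention: TU/mL = copies per cell * cells / volume."""
    copies_per_cell = copies_in_reaction / cells_in_reaction
    return copies_per_cell * cells_transduced / vector_volume_ml


# Plasmid standard from 10^5 to 10 copies (range from the text);
# the Ct values below are hypothetical.
std_copies = [1e5, 1e4, 1e3, 1e2, 1e1]
std_cts = [18.0, 21.32, 24.64, 27.96, 31.28]
slope, intercept = fit_standard_curve(std_copies, std_cts)

sample_copies = copies_from_ct(24.64, slope, intercept)
# 10^5 cells analysed and 10^5 cells transduced (from the text);
# 1 µL of vector is a hypothetical input volume.
titre = titre_tu_per_ml(sample_copies, 1e5, 1e5, 0.001)
print(f"slope={slope:.2f}, copies={sample_copies:.0f}, titre={titre:.2e} TU/mL")
```

In practice the titre would be taken from dilutions that fall within the linear range of the standard curve.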

AAVs Constructions and Vectors (Design and Production)
Transgene sequences were cloned into an AAV vector backbone under the transcriptional control of the apolipoprotein E (hepatocyte control region enhancer) and the human alpha 1-antitrypsin (hAAT) promoter, as described in [70,71]. We designed and cloned an oligo with PacI and SbfI restriction sites in order to transfer our mGAA chimeras from our lentiviral vectors (SIFLG, SIFG, S2G and SG) into an AAV vector backbone. We then digested the lentiviral vectors (SIFLG, SIFG, S2G and SG) and the AAV vector backbone with PacI and SbfI (New England Biolabs). The resulting fragments were ligated with T4 DNA ligase (New England Biolabs). After the different ligations and transformations into E. coli Stbl3 competent bacteria (Life Technologies, Thermo Fisher Scientific, Waltham, MA, USA), the plasmids were obtained using Wizard® Plus SV Minipreps DNA (Promega, Madison, WI, USA). The restriction pattern was checked and the whole plasmid was finally sequenced. Maxi-production was performed using NucleoBond® Xtra Maxi (Macherey-Nagel, Düren, Germany).
The research-grade AAV vectors used in this study were produced using an adenovirus-free transient transfection method. Briefly, suspension HEK293 cells were transfected using PTG1-plus (POLYTHERAGENE) with the three plasmids containing the adenovirus helper proteins, the AAV Rep and Cap genes, and the ITR-flanked transgene expression cassette. Then, 24 h after transfection, cells were treated with Benzonase® (Merck-Millipore, Darmstadt, Germany), and 2 days later, they were lysed with Triton (Sigma, St Louis, MO, USA) and clarified by filtration. Vectors were then purified by a single immunoaffinity chromatography column, using POROS CaptureSelect (Thermo Fisher Scientific, Waltham, MA, USA) resins. Purified particles were formulated in phosphate buffered saline containing 0.001% of Pluronic F68 (Sigma Aldrich, Saint Louis, MO, USA), and stored at −80 °C. Titres of AAV vector stocks were determined using quantitative real-time PCR (qPCR). Specific primers were as follows: forward 5′-GGCGGGCGACTCAGATC-3′, reverse 5′-GGGAGGCTGCTGGTGAATATT-3′.

In Vivo Studies
Mouse studies were performed according to French and European legislation on animal care and experimentation (2010/63/EU) and approved by the local institutional ethical committee (protocol no. 2015-008). Gaa−/− mice were purchased from the Jackson Laboratory (B6;129-Gaatm1Rabn/J, stock no. 004154, 6neo) and were originally generated by Raben and colleagues [72]. AAV vectors were administered intravenously via the tail vein at a dose of 5 × 10^11 vg/kg, with five mice per group. One month after the treatment, all the mice were sacrificed for tissue analysis.
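As a worked example of the weight-based dosing above, the following sketch converts the stated dose of 5 × 10^11 vg/kg into vector genomes and injection volume per animal. The body weight and stock titre used in the example are hypothetical values, not data from the study.

```python
DOSE_VG_PER_KG = 5e11  # dose stated in the text


def vector_genomes_per_mouse(body_weight_g, dose_vg_per_kg=DOSE_VG_PER_KG):
    """Vector genomes to inject for a mouse of the given weight (grams)."""
    return dose_vg_per_kg * body_weight_g / 1000.0


def injection_volume_ul(body_weight_g, stock_titre_vg_per_ml,
                        dose_vg_per_kg=DOSE_VG_PER_KG):
    """Volume of AAV stock (µL) needed to deliver the weight-based dose."""
    vg = vector_genomes_per_mouse(body_weight_g, dose_vg_per_kg)
    return vg / stock_titre_vg_per_ml * 1000.0


# Hypothetical 25 g mouse and hypothetical stock at 1e12 vg/mL.
vg_needed = vector_genomes_per_mouse(25.0)
volume_ul = injection_volume_ul(25.0, 1e12)
print(f"{vg_needed:.3e} vg in {volume_ul:.1f} µL of stock")
```

The stock would normally be diluted in the formulation buffer to a volume suitable for tail-vein injection.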

GAA Secretion
We plated 10^7 control (NT, non-transduced) and transduced (SIFLG, SIFG and S2G) suspension cells (K562 and Meg-01) in a 48-well plate and 3 × 10^6 control (NT) and transduced (SIFLG, SIFG and S2G) adherent cells (Sol8, RAW 264.7, Hepa 1-6 and SJCRH30) in a 6-well plate in 1 mL of Opti-MEM with 1% P/S (Thermo Fisher Scientific) for 16 h at 37 °C and 5% CO2. Supernatants were centrifuged at 13,000 rpm and 4 °C for 15 min. GAA activity (see GAA activity protocol) and expression (Western blot) were analysed in the supernatant. Briefly, for protein precipitation, we added 1/4 volume of chloroform and 1 volume of methanol, and vortexed the mixture. After centrifugation at 12,000 rpm for 5 min at RT, we eliminated the interphase of methanol and added 1 volume of methanol for a second centrifugation. Finally, the pellet was allowed to dry at 50 °C for 5-10 min and analysed by means of Western blot.

GAA Activity Assay (Intracellular and Extracellular)
We measured GAA activity using a fluorometric kit (Lysosomal alpha-Glucosidase Activity Assay kit) (Abcam, Cambridge, UK) based on 4-methylumbelliferyl-α-D-glucopyranoside (4-MUG). GAA hydrolyses 4-methylumbelliferyl-α-D-glucopyranoside and releases 4-methylumbelliferone (4-MU), which can be measured fluorometrically [73,74]. Briefly, 10^6 cells or 10 mg of animal tissue were lysed in 150 µL of the GAA assay buffer for 20 min on ice. After that, lysates were centrifuged at 12,000 rpm for 5 min at 4 °C and 10 µL of the supernatant was placed in a 96-well plate. Protein quantification was performed using the Pierce BCA Protein Assay Kit (Thermo Fisher Scientific). At the same time, a standard curve was prepared using the 4-methylumbelliferone standard at 100 µM (Abcam) following the protocol indications and adding GAA assay buffer to a final volume of 60 µL per well. We added 40 µL of GAA assay buffer to a separate well as a background control. Finally, 20 µL of substrate (Abcam) was added to all wells, except the standard curve wells, and incubated at 37 °C for 90 min, protected from light. The reaction was stopped by adding 100 µL of stop buffer (Abcam) and fluorescence intensity (λex = 368 nm and λem = 460 nm) was measured using a fluorescence microtitre plate reader (Infinite 200 PRO NanoQuant) (Tecan, Männedorf, Switzerland).

Myotubes Differentiation
For PAS staining, 18,000 Sol8 cells were cultured in 4-well slide chambers (Nunc, Roskilde, Denmark) in DMEM high glucose (supplemented with 20% FBS and 1% P/S). For glycogen content measurement, 250,000 Sol8 cells were cultured in 6-well plates in DMEM high glucose (supplemented with 20% FBS and 1% P/S). After 48 h, the medium was replaced with DMEM high glucose (supplemented with 1% FBS and 1% P/S). On day 6, cells were starved for 9 h in DMEM without glucose (Gibco, Amarillo, TX, USA). For glycogen detection, cells were collected by lysis in PBS-Triton with protease inhibitor (1×).

PAS Staining
The staining was performed by Atrys (Granada, Spain). Briefly, slides were washed with PBS, fixed with 4% paraformaldehyde and incubated in a 0.5% periodic acid solution for 25 min. After washing, they were stained with Schiff's reagent for 30-40 min. Counterstaining was performed with haematoxylin for 10 s (all reagents from Merck/Sigma-Aldrich). Slides were dehydrated and mounted using DPX. Images were obtained with an Olympus upright BX43 microscope (10× objective).

Glycogen Quantification
Glycogen content was measured as described by Cagin et al. [75]. Glycogen was measured indirectly as the glucose released after total digestion with amyloglucosidase from Aspergillus niger (Merck Life Science, Darmstadt, Germany). Sol8 myoblasts were differentiated into myotubes, as previously described. Lysates from cells and animal tissues were incubated for 5 min at 95 °C and then cooled at 4 °C; 25 µL of amyloglucosidase diluted 1:5 in 0.1 M potassium acetate (pH 5.5) was added to each sample. A control reaction without amyloglucosidase was prepared for each sample. Both sample and control reactions were incubated at 37 °C for 90 min. The reaction was stopped by incubating samples for 5 min at 95 °C. The released glucose was determined using a glucose assay kit (Merck Life Science) by measuring absorbance at 540 nm with an Infinite 200 PRO NanoQuant (Tecan, Männedorf, Switzerland).
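The role of the control reaction without amyloglucosidase can be illustrated with a short sketch: the glucose measured in the control (free glucose already present in the lysate) is subtracted from the glucose measured after digestion, so that only glycogen-derived glucose remains. The values are hypothetical, and normalisation to protein content is an assumption not stated in the text.

```python
def glycogen_derived_glucose(glucose_with_enzyme, glucose_without_enzyme):
    """Glucose released from glycogen: digested sample minus its control.

    Small negative differences caused by assay noise are clamped to zero.
    """
    return max(glucose_with_enzyme - glucose_without_enzyme, 0.0)


def glycogen_per_mg_protein(glucose_with_enzyme, glucose_without_enzyme,
                            protein_mg):
    """Glycogen-derived glucose per mg of protein (assumed normalisation)."""
    net = glycogen_derived_glucose(glucose_with_enzyme, glucose_without_enzyme)
    return net / protein_mg


# Hypothetical readings in µg of glucose, with 0.5 mg of protein in the lysate.
content = glycogen_per_mg_protein(12.0, 2.0, 0.5)
print(f"glycogen content: {content:.1f} µg glucose/mg protein")
```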

RNA Extraction
Total RNA was obtained using the Trizol reagent (Thermo Fisher Scientific). Briefly, samples were incubated for 5 min at RT, 200 µL of chloroform was added, mixed, and incubated for 3 min at RT. Samples were centrifuged in Phase Lock Gel Heavy tubes (VWR, Radnor, PA, USA) at 12,000× g and 4 °C for 15 min. Afterwards, the aqueous phase was collected and RNA was precipitated by adding 450 µL of 100% isopropanol. After centrifugation at 12,000× g and 4 °C for 10 min, the pellet was washed with 75% ethanol, resuspended in RNase-free water and kept at −80 °C until further use.

RT-PCR
RNA samples were reverse transcribed using the Superscript first-strand system (Thermo Fisher Scientific). Afterwards, we conducted a qPCR with primers for GAA cDNA (fw: TACGCAGGAGGTCGTGT and rev: GTCTGCTCCTGGATGTGC) to amplify a 372 bp amplicon with KAPA Taq PCR (Kapa Biosystems). The fragment was purified with a QIAquick PCR product purification kit (Qiagen), in accordance with the manufacturer's protocols, and sequenced by the Sanger method. The results were analysed using the Basic Local Alignment Search Tool (BLAST) and ICE software (Synthego). CI-MPR RNA analyses were conducted using qPCR with the TaqMan™ Gene Expression Assay (FAM) (4453320, inventoried Assay ID: Mm00439576_m1) (Thermo Fisher Scientific) following the manufacturer's instructions.

GAA Uptake Assay
A total of 2 × 10^7 control and transduced K562 cells (NT, SIFLG, SIFG and S2G) were plated in 24-well plates in 2 wells with 1.5 mL/well of Opti-MEM with 1% P/S (Thermo Fisher Scientific). In parallel, 1.75 × 10^5 E2_1 cells were plated in a 6-well plate. After 20 h, the conditioned medium from K562 cells was centrifuged at 13,000 rpm for 15 min at 4 °C and the supernatant was added to E2_1 cells. After 20 h, GAA intracellular activity was measured.

Surface CI-MPRs Analysis by Flow Cytometry
A total of 2.5 × 10^5 Sol8 cells (WT, bulk populations, E2_1 and transduced E2_1 cells) were plated in 6-well plates with 2 mL of DMEM (supplemented with 20% FBS and 1% penicillin/streptomycin). After 24 h, the cells were detached using cell scrapers and kept on ice (4 °C) to avoid internalisation of CI-MPRs. A total of 10^5 cells were washed twice with cold PBS + 1% BSA and resuspended in blocking buffer (PBS + 1% BSA with goat serum) for 30 min on ice. After two washes, cells were stained with anti-IGF-II/IGF2R mouse antibody (2G11) (#NB300-514, Novus Biologicals, Centennial, CO, USA) for 1.5 h on ice at 4 °C. Mouse IgG2a kappa was used as an isotype control. Cells were washed twice and incubated with a secondary antibody, goat anti-mouse AlexaFluor-488 (#4408S, Cell Signaling Technology, Danvers, MA, USA), for 1 h on ice in the dark. Cells were washed twice and analysed on a FACS Canto II flow cytometer (Becton Dickinson, Franklin Lakes, NJ, USA) using the FACS Diva software (BD Biosciences, Bedford, MA, USA) and the FlowJo program (BD, Ashland, OR, USA).

Karyotypes Analysis
Karyotyping was performed by the Biobank of the Andalusian Public Health System (Granada, Spain). Cells were incubated in growth medium supplemented with 0.1 mg/mL of colcemid (Merck) for 4 h. The cytoplasm was removed using a hypotonic solution of KCl (0.075 mol/L) and the nuclei were fixed with methanol:acetic acid in a 3:1 ratio (vol/vol). Metaphases were fixed on slides. G-banding was performed with trypsin-Wright (GTG) and a minimum of 20 metaphases were analysed for each cell line, assigning a karyotype formula according to the International System for Human Cytogenomic Nomenclature (ISCN) 2020. Karyotypes were analysed using a Leica DM5500 microscope (Leica, Wetzlar, Germany) and the Ikaros Karyotyping System (Metasystems, Heidelberg, Germany).

Institutional Review Board Statement:
The animal study protocol was performed according to French and European legislation on animal care and experimentation (2010/63/EU) and approved by the local institutional ethical committee (protocol no. 2015-008).