1. Introduction
AAV is a replication-defective parvovirus that requires a helper virus to complete its life cycle [1]. The virus is best known for its small genome, which is tightly packaged inside a capsid 20–25 nm in diameter. Its single-stranded DNA genome contains approximately 4700 nucleotides and carries the rep and cap genes, which encode the non-structural and capsid proteins, respectively. AAV replication is mediated by the Rep proteins, which nick the inverted terminal repeats (ITRs) and thereby initiate replication. In the presence of a helper virus, AAV undergoes lytic infection. Without a helper virus, AAV integrates into the host genome and maintains a relatively stable, latent state.
AAV infection is highly regulated in both latent and lytic infections. In latent infection, AAV genes remain silent. In the lytic life cycle, Rep78 and Rep68, under control of the P5 promoter, are expressed early, followed by expression of the packaging-related genes, such as Rep52/Rep40 and the capsid genes VP1, VP2 and VP3. In the late stage of AAV lytic infection, expression from the P5 promoter is downregulated to facilitate virus encapsidation, as was also demonstrated in recombinant AAV replication and packaging [2]. A variety of host, viral, and helper virus factors are involved in the temporal and spatial regulation of viral replication and packaging.
Although the canonical AAV particle contains both ITRs and the entire coding region, AAV populations are often found to be heterogeneous [3,4,5,6,7]. After separation on a CsCl gradient, full AAV particles are found in the fraction with a density of 1.4 g/mL. Lighter AAV particles (1.32 and 1.35 g/mL) are largely regarded as empty particles or defective interfering (DI) particles. DI particles contain aberrant, shorter AAV DNA genomes and are named for their ability to inhibit AAV replication intracellularly. Template switching has been proposed to explain the non-unit-length AAV genomes [6]. However, the composition and molecular conformation of these genomes remain elusive.
In this study, we systematically sequenced the genomes of AAV viruses within a population and revealed the molecular state of AAV genomes, both full-length genomes and the genomes of DI particles, at the single-virus level. Unexpectedly, we discovered that a major class of DI particles is capable of regulating AAV gene expression and is an integral part of the AAV life cycle. In contrast to the general belief that all subgenomic particles are waste byproducts that only compete for the essential resources required for wild-type virus replication and packaging, our findings suggest that AAV has evolved to use these subgenomic particles to potentially encode dsRNA that regulates rep expression and to augment cap expression. This appears to be a strategy evolved by AAV to circumvent its size limitation.
2. Materials and Methods
2.1. Cell Lines and Transfections
HEK 293 cells were maintained in Dulbecco’s modified Eagle’s medium (DMEM) containing 10% bovine calf serum. PolyJet™ DNA In Vitro Transfection Reagent (SignaGen Laboratories, Frederick, MD, USA) was used to deliver DNA into HEK 293 cells. Cells were plated in 10-cm-diameter culture dishes 18 to 24 h prior to transfection so that the monolayer reached the optimal 70–80% confluency at the time of transfection. Complete culture medium with serum was freshly added to each plate 30 min before transfection. The PolyJet™-DNA complex was prepared at a ratio of 3 µL PolyJet™ to 1 µg DNA, using serum-free DMEM to dilute both the DNA and the PolyJet™ reagent. The mixture was incubated for 10–15 min at room temperature and then added to the medium. At 12–18 h post transfection, the medium containing the PolyJet™/DNA complex was removed and replaced with fresh serum-free DMEM.
2.2. Expression Plasmids
The pH29 negative expression plasmid was made by deleting a 786-bp SbfI-BsiWI fragment from the pH29 plasmid. This plasmid expresses a normal level of AAV Rep in the rAAV production system but does not express the AAV-2 capsid gene. The pssAAV-CB-RepCap plasmid was made by cloning the 4436-bp rep and cap fragment from pCI-RepCap into the backbone of pssAAV-CB-GFP at the BglII site. The AAV 5′-SBG dimer was generated by intramolecular ligation of the ITR-CB-Rep fragment, which contains the AAV ITR and a partial CB-Rep fragment and was isolated from pssAAV-CB-RepCap by NaeI-BamHI digestion. The AAV 3′-SBG monomer was made by intramolecular ligation of the BamHI-SmaI fragment isolated from psub201. The AAV 3′-SBG dimer was made by intermolecular ligation of the BamHI-SmaI fragment isolated from psub201.
Head-to-head dimer DNA molecules containing the Gluc gene were generated by ligation of the 1587-bp fragment obtained from the pssAAV-CB-Gluc plasmid by digestion with SmaI and ClaI. The monomer fragment includes a single CB promoter, the Gluc gene and a single poly(A) signal. Tail-to-tail dimer DNA molecules containing the Gluc gene were generated by ligation of the 1557-bp monomer fragment obtained from pdsAAV-CB-Gluc by digestion with SmaI and MluI.
2.3. Virus Production
AAV viruses were produced using the triple-plasmid transfection system in HEK 293 cells according to a previously described method [8].
2.4. Western Blot
To detect Rep and Cap protein expression, total protein was extracted 72 h after transfection. The cell pellet was washed twice with PBS, resuspended in RIPA lysis buffer containing protease inhibitor (PefablocSC PLUS, Roche Applied Science, Indianapolis, IN, USA), incubated for 30 min on ice, and centrifuged for 30 min at 15,000 rpm and 4 °C, after which the supernatant was harvested. We used 10 µL of the supernatant for protein quantification with the Pierce™ BCA Protein Assay Kit (Thermo Scientific, USA). The remaining supernatant was heated at 70 °C for 10 min in NuPAGE™ LDS Sample Buffer (4×, Invitrogen, Waltham, MA, USA) diluted to 1× to denature the sample. We loaded 30 µg of protein from each sample on an SDS-PAGE gel with a 4% stacking gel and an 8% separating gel, alongside a set of protein molecular weight markers (PageRuler™ Prestained Protein Ladder, Thermo Scientific), and ran the gel for 120 min at 100 V. The samples were transferred to a PVDF membrane on a semi-dry transfer apparatus (Bio-Rad Trans-Blot SD Semi-Dry Transfer Cell) at 16 V for 30 min. The membrane was blocked with 5% BSA in TBS buffer. To detect Rep or Cap protein, the membrane was incubated overnight at 4 °C with the mouse anti-AAV-Rep monoclonal antibody 76.3 at 1:2000 dilution (American Research Products, Inc., Waltham, MA, USA) or the mouse anti-AAV-Cap monoclonal antibody 61058 (American Research Products, Inc., Waltham, MA, USA). The membrane was washed three times with TBST buffer, 10 min each time, and then incubated with the HRP-linked anti-mouse secondary antibody (dilution: 1:2000; Cell Signaling Technology, Inc., Danvers, MA, USA). The antibody was detected by chemiluminescence with Amersham ECL (Sigma-Aldrich, Inc., St. Louis, MO, USA).
2.5. Quantitative Real Time PCR (qPCR) Assay
Viral vectors (1 × 10¹⁰ vg, 1 µL) were incubated in DNase solution containing DNase I (1 U/mL) for 30 min at 37 °C, after which 1 µL of 0.5 M EDTA was added (to a final concentration of 5 mM) and the samples were heated for 10 min at 75 °C to stop DNase I activity. Control samples each received a lysis buffer containing proteinase K (40 μg/mL), were incubated for 1 h at 56 °C and were finally heated for 10 min at 95 °C. The samples intended for thermal treatment were heated at the indicated temperatures directly after heat inactivation of DNase I. The copy numbers of viral genomes subsequently released were quantified by real-time PCR and expressed in vg/mL.
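The EDTA step fixes a reaction volume that the text does not state explicitly. As a quick consistency check, and assuming simple dilution (C1·V1 = C2·V2) with no other source of EDTA, the following sketch computes the total volume implied by adding 1 µL of 0.5 M stock to reach 5 mM. The function names are illustrative and not part of the original protocol.

```python
def implied_total_volume_ul(stock_molar: float, added_ul: float,
                            final_millimolar: float) -> float:
    """Solve C1*V1 = C2*V2 for the total volume V2 (in microlitres)."""
    stock_millimolar = stock_molar * 1000.0  # convert M to mM
    return stock_millimolar * added_ul / final_millimolar


def final_concentration_mm(stock_molar: float, added_ul: float,
                           total_ul: float) -> float:
    """Final concentration (mM) after diluting added_ul of stock into total_ul."""
    return stock_molar * 1000.0 * added_ul / total_ul


# Values taken from the protocol: 1 µL of 0.5 M EDTA, 5 mM final.
volume = implied_total_volume_ul(0.5, 1.0, 5.0)
print(f"Implied total reaction volume: {volume:.0f} µL")
```

Under these assumptions the stated final concentration corresponds to a total volume of about 100 µL.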
2.6. Dexamethasone Induction
pFΔ6, pMMTV trans (Rep-cap), pssAAV-CB-GFP and wild-type defective DNA were co-transfected into HEK 293 cells. At 8 h after transfection, the medium was replaced with fresh DMEM containing dexamethasone at a final concentration of 1 × 10⁻⁵ M, and the cells were incubated for 72 h. The AAV-GFP vector yield was determined by qPCR.
2.7. AAV Genome Sequencing
The AAV genome was sequenced by PacBio SMRT sequencing. Samples were prepared according to the general SMRTbell library protocol. DNA was extracted and purified with AMPure PB beads. DNA damage/end repair was performed using the SMRTbell™ Damage Repair Kit. After adaptor ligation, ExoIII and ExoVII were added to remove failed ligation products. Templates were purified three times with AMPure PB beads. Sequencing was performed on the PacBio SMRT platform. PacBio raw subreads were processed with recalladapters v7.1.0 (https://github.com/PacificBiosciences/recalladapters, accessed on 21 September 2019), and high-quality circular consensus sequencing reads (HiFi reads) were generated using SMRT Link ccs v4.0.0 with --minPasses 3 and --minPredictedAccuracy 0.99 (https://github.com/PacificBiosciences/ccs, accessed on 21 September 2019). Filtered HiFi reads were mapped to the reference genome using minimap2 v2.17 with -ax map-pb (https://github.com/lh3/minimap2, accessed on 21 September 2019). Reads were subsampled in varying length ranges with seqkit v0.10.1 (https://anaconda.org/bioconda/seqkit, accessed on 23 September 2019) and processed by iterative rounds of BLAST-based alignment to the reference genome to categorize molecules based on identity. In each round, the fragments of a HiFi read left unmatched on the left or right side of the previous alignment were aligned to the reference genome with -task blastn and -max_hsps 1. The detailed alignment data were visualized as AAV genome configurations.
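The iterative alignment described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the aligner is a hypothetical stand-in that finds the longest exact forward-strand match with Python's difflib rather than calling BLAST, and the names Hit, make_exact_aligner and segment_read are assumptions introduced here. The loop structure is the point: take the single best hit for a fragment, then re-align whatever remains unmatched to its left and right.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List, Optional, Tuple


@dataclass
class Hit:
    read_start: int  # 0-based, inclusive, on the read
    read_end: int    # exclusive
    ref_start: int   # 0-based, inclusive, on the reference
    ref_end: int     # exclusive


Aligner = Callable[[str], Optional[Hit]]


def make_exact_aligner(reference: str, min_len: int = 8) -> Aligner:
    """Toy stand-in for one best local hit (forward strand, exact match only)."""
    def align(fragment: str) -> Optional[Hit]:
        matcher = SequenceMatcher(None, fragment, reference, autojunk=False)
        m = matcher.find_longest_match(0, len(fragment), 0, len(reference))
        if m.size < min_len:
            return None
        return Hit(m.a, m.a + m.size, m.b, m.b + m.size)
    return align


def segment_read(read: str, align: Aligner, min_len: int = 8) -> List[Hit]:
    """Align the read, then repeatedly align the unmatched left/right flanks."""
    hits: List[Hit] = []
    pending: List[Tuple[int, int]] = [(0, len(read))]  # unprocessed read intervals
    while pending:
        start, end = pending.pop()
        if end - start < min_len:
            continue  # flank too short to align meaningfully
        hit = align(read[start:end])
        if hit is None:
            continue  # flank has no match to the reference
        # Convert fragment-relative coordinates back to read coordinates.
        placed = Hit(start + hit.read_start, start + hit.read_end,
                     hit.ref_start, hit.ref_end)
        hits.append(placed)
        pending.append((start, placed.read_start))  # unmatched left flank
        pending.append((placed.read_end, end))      # unmatched right flank
    return sorted(hits, key=lambda h: h.read_start)
```

For a read that joins two distant reference segments, segment_read returns one hit per segment, and the gap between their reference coordinates marks the deletion. Snapback molecules would additionally require reverse-strand hits, which a real aligner reports and this stand-in does not.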
2.8. Statistical Analysis
All data are presented as means ± SD. Statistical analysis was performed by unpaired Student’s t-test in SPSS software version 1.0.0.1406. A p value < 0.05 was considered statistically significant.
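For readers who wish to reproduce the summary statistics outside SPSS, the quantities involved can be sketched with the Python standard library. This is an illustration under the assumption of the classical equal-variance (pooled) form of Student's unpaired t-test; the function names and the example values are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, stdev, variance
from typing import Sequence, Tuple


def mean_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and sample standard deviation (n - 1 denominator)."""
    return mean(values), stdev(values)


def student_t_unpaired(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for the pooled-variance t-test."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical example values, for illustration only.
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
t_stat, df = student_t_unpaired(control, treated)
print(f"t = {t_stat:.3f}, df = {df}")
```

The two-sided p value is then obtained from the t distribution with the returned degrees of freedom, which statistical packages do internally.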
4. Discussion
Previously, the heterogeneity of AAV genomes within a population was not fully characterized because the palindromic structure of the AAV ITR inhibited progression of the typical sequencing reaction, so that AAV genomes had to be fragmented in order to make libraries compatible with the sequencing instrument. However, fragmentation causes the loss of detailed information on individual molecules, since the DNA in AAV subgenomic particles is inherently diverse and heavily rearranged relative to the standard genome (Figure 1). Reported studies using PacBio to sequence scAAV recovered only special categories of AAV molecules and did not provide details on the individual genomes within the entire AAV population [9]. In studies with single-stranded DNA genomes [10], only annealed AAV genomes were captured in the library. Here we have obtained the full profile of AAV genomes within a population at the single-virus level (Figure 1), and unannealed molecules were also fully sequenced. This high-resolution analysis allowed us to identify and characterize the array of genome configurations present. Besides canonical full-length AAV genomes, we found particles with incomplete AAV genomes (ICGs) that start from the 3′-ITR but fail to reach the 5′-ITR, which indicates an aborted packaging process (Figure 1). Since AAV packaging starts from the 3′-ITR, all ICGs have an intact 3′-ITR and none has a 5′-ITR [11]. We did not observe obvious hot spots in the ICG population, which suggests that incomplete packaging of the AAV genome is most likely a random event.
In addition, we revealed the divergence of genome deletion mutants (GDMs) in the AAV population (Supplementary Figure S1). The existence of GDMs and the discovery of the asymmetric SBG configuration are the primary reasons we proposed NHEJ as the mechanism of subgenomic particle formation in the AAV population (citation of Preprint). NHEJ can simultaneously explain the formation of sSBGs, aSBGs, GDMs and the various other forms of subgenomic particles identified in the rAAV population (including those containing foreign genetic elements such as host genomic DNA and helper genes). Although we did not study GDMs, some GDMs would bring the AAP reading frame closer to the P5, P19 or P40 promoter, which could increase AAP expression and enhance AAV packaging. This possibility is outlined in Figure 4.
The AAV SBG molecule is a naturally occurring self-complementary DNA genome, similar to the well-known recombinant scAAV vector that has been used successfully in clinical trials, but it does not require a specially mutated AAV ITR. In this study, our findings revealed a critical role for SBGs in the AAV life cycle. The 5′-SBG containing the P5 promoter can downregulate Rep78 expression; the underlying mechanism may be the expression of dsRNA overlapping the rep gene transcript. dsRNA is the precursor for RNA interference [12] and would function as an inhibitor of P5 promoter transcripts. Such a mechanism can balance P5 expression levels when the viral template increases excessively at the end of viral replication and packaging, with no need to recruit additional host factors to downregulate the promoter. Since the expression of Rep78 and that of its inhibitor are driven by the same transcription mechanism, the inhibition rate is determined by the relative copy numbers of the full-length genome and the 5′-SBG. The 5′-SBG therefore becomes a de facto sensor of the AAV genome population. In a hypothetical replication model starting with one full-length particle, the P5 promoter initiates expression of Rep78, which replicates the AAV genome. SBG molecules are generated once AAV genomes have accumulated to a certain level. Since SBG particles replicate faster than the full AAV genome, at a certain point SBG replication will overtake that of the full AAV genome and Rep78 expression levels will be significantly reduced. Thus, the 5′-SBG is a rather effective inhibitor and can regulate the AAV genome population.
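The qualitative behaviour of this hypothetical replication model can be illustrated with a toy calculation. The sketch below is not a fitted or measured model: the per-cycle growth factors, the copy number at which the first SBG appears, and the function names are all arbitrary assumptions chosen only to show that a faster-replicating species arising late eventually outnumbers the full-length genome.

```python
from typing import List, Optional, Tuple

# (cycle, full-length copies, SBG copies, SBG fraction of all genomes)
Record = Tuple[int, float, float, float]


def simulate_population(cycles: int,
                        full_growth: float = 2.0,    # assumed fold increase per cycle
                        sbg_growth: float = 3.0,     # assumed faster SBG replication
                        onset_copies: float = 100.0  # assumed threshold for first SBG
                        ) -> List[Record]:
    """Track full-length and SBG copy numbers over discrete replication cycles."""
    full, sbg = 1.0, 0.0  # start from one full-length genome
    history: List[Record] = []
    for cycle in range(cycles + 1):
        history.append((cycle, full, sbg, sbg / (full + sbg)))
        if sbg == 0.0 and full >= onset_copies:
            next_sbg = 1.0  # first SBG arises once genomes have accumulated
        else:
            next_sbg = sbg * sbg_growth
        full *= full_growth
        sbg = next_sbg
    return history


def first_overtake_cycle(history: List[Record]) -> Optional[int]:
    """Return the first cycle at which SBG copies exceed full-length copies."""
    for cycle, full, sbg, _fraction in history:
        if sbg > full:
            return cycle
    return None
```

With these illustrative defaults, the SBG fraction stays at zero until the threshold is reached and then rises steadily. In the model proposed in the text, Rep78 output would decline as this fraction grows.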
On the other hand, snapback molecules were also formed at the 3′ end. The 3′-SBG cannot produce dsRNA because the P40 promoters sit in a head-to-head configuration. This leads to enhanced P40 gene expression, since expression can increase greatly as the copy number of the 3′-SBG increases (Figure 3).
A Hypothetical Model for the Essential Role of Snapback Subgenomes in the AAV Life Cycle
Based on the detailed genomic state and molecular function of the AAV subgenomes, we propose that such molecules are not waste byproducts but play an active and critical role in the AAV life cycle. During replication and packaging, these molecules would function as “cis” and “trans” regulators of rep and cap gene expression, balancing viral gene copy number and expression levels. Without SBG molecules, AAV primarily replicates itself, with reduced packaging. However, the successive replication of AAV genomes also leads to the production of SBG molecules, which may have a growth advantage over the wild-type AAV genome and lead to a decrease in Rep78 expression and an increase in cap expression. The outcome of this replication mechanism would be increased packaging of AAV progeny. This hypothetical model is summarized in Figure 4.
The excessive amounts of 5′-SBG in the wild-type AAV population may have another implication. Based on the results of various genomic studies, it is known that only AAV fragments are found during latent infection of host cells. It is likely that the 5′-SBG also functions in host cells as a suppressor of P5 expression, which may stabilize latent infection (Figure S3). This may be one reason why it is beneficial to have subgenomic particles present in large quantities when AAV is establishing latent infection. Protein kinase R (PKR) is activated by dsRNA produced during virus replication. Adenovirus virus-associated RNA-I (VAI) is a short, noncoding transcript that functions as an RNA decoy to sequester PKR in an inactive state. VAI is essential for rescuing AAV from latent infection, which further implies a complicated relationship between the AAV helper virus and AAV replication.
AAV has a small genome, 4.7 kb in size. Yet its genome design is efficient and space-conscious, with all genes overlapping each other in the same reading frame. Nevertheless, the virus developed a mechanism to use subgenomic particles, i.e., defective interfering particles, to express the negative regulator dsRNA and to enhance expression of capsid proteins to facilitate packaging during the late stage of infection. The presence of such 5′-SBGs and 3′-SBGs is not a coincidence but has evolved as the virus spread in the human population. The utilization of snapback molecules is a clear example of how extra-genomic molecules can be an integral part of AAV regulation. This is the first report describing a virus utilizing these mechanisms, which seem to be shared by all members of the parvovirus family (data not shown).