Re-Analysis of the Widely Used Recombinant Murine Cytomegalovirus MCMV-m157luc Derived from the Bacmid pSM3fr Confirms Its Hybrid Nature

Murine cytomegalovirus (MCMV), and in particular recombinant virus derived from the MCMV bacmid pSM3fr, is widely used as the small-animal infection model for human cytomegalovirus (HCMV). While sequencing the complete genomes of MCMV strains and recombinants for quality control, we noticed deviations from the deposited reference sequences of the MCMV bacmid pSM3fr. This prompted us to re-analyze pSM3fr and reannotate its reference sequence, as well as that of the commonly used MCMV-m157luc reporter virus. We constructed a corrected reference sequence for the frequently used pSM3fr derivative that contains a repaired version of m129 (MCK-2) and the luciferase gene instead of ORF m157. The new reference also contains the original bacmid sequence and reflects the hybrid origin of the genome from MCMV strains Smith and K181.


Introduction
Human cytomegalovirus (HCMV) is a widely distributed pathogen, with seroprevalence of the lifelong infection varying from 40% to more than 90%. HCMV is the most frequent congenital pathogen after infection in utero and a significant clinical problem in immunocompromised hosts, causing severe symptoms like pneumonia, retinitis, and hepatitis [1]. Research on HCMV is complicated by the strict tropism of the virus for human cells and tissue, which makes it necessary to use surrogates like related betaherpesviruses from model animals [2][3][4] or, more recently, complicated xenograft models like those in immune-deficient mice [5]. Murine cytomegalovirus (MCMV) has a high sequence similarity to HCMV and, not least due to the availability of transgenic hosts, is widely used as a small-animal infection model for HCMV. The route of infection of MCMV was analyzed by utilizing reporter viruses [6,7], showing that MCMV spreads from the initial nasal entrance site to secondary vascular sites (e.g., spleen) and subsequently to tertiary sites like the salivary glands. Furthermore, it was demonstrated that the primary target after intra-footpad infection is the popliteal lymph node, which is neither host- nor virus-strain-specific. Intra-footpad infection can thus simulate peripheral infection [6]. Experiments analyzing the natural oral and nasal route of infection show that, in this case, the olfactory neurons are the primary target cells. However, deeper infection of the lower respiratory tract is also possible. After nasal infection, the virus spreads in a tertiary step to the salivary glands and establishes persistence there [7]. The replication of CMV usually occurs in differentiated myeloid cells [8]. Typical laboratory strains of MCMV are the strains Smith and K181. The genomes of these strains show a very high sequence similarity to each other (nucleic acid identity > 99%). However, they are highly variable with respect to "private" gene families like m02 and m145, located near the opposite termini of the genome. These play an important role in viral antigen presentation and, therefore, immune evasion ([9], see also Figure 1). The MCMV Smith and K181 strains differ in their infection dynamics in vitro and in vivo. In vitro, K181 leads to lower viral titers and smaller plaques in comparison to Smith, while the titers of K181 in vivo are higher in salivary glands [10,11]. In general, there are distinct genetic differences between the CMV genomes infecting different hosts, and also between MCMV strains; these are often found in genes for surface proteins, suggesting host adaptation and immune evasion [12,13].
Genetic modification of viral genomes with traditional methods is difficult and laborious. The first bacterial artificial chromosome (bacmid) containing a near-complete sequence of a herpesvirus, MCMV strain Smith, by Messerle and colleagues [14] represented a major breakthrough. However, extensive analysis of this bacmid, pSM3, detected a gap comprising the open reading frames (ORFs) m151-m158. The loss of this region might have occurred to compensate for the oversize of the viral genome caused by insertion of the bacmid sequence via recombination. The missing region was later reinserted using homologous recombination with cloned MCMV DNA sequences [15] derived from plasmids initially designated as strain Smith but later identified as being derived from MCMV strain K181 [16,17]. Therefore, the resulting bacmid pSM3fr represents a hybrid consisting of the m150 to m158 genes belonging to the K181 strain within the genome of strain Smith. Thereafter, several groups observed impaired replication of the pSM3fr-derived viruses in salivary glands; this was finally shown to be attributable to a frameshift mutation in the MCK-2 (MCMV chemokine homologue) gene, encoded in the ORF m129. Repair of the m129 frameshift in pSM3fr-MCK-2fl (full length) restored salivary gland replication [18]. Interestingly, a further gene, m155, was initially implicated in the deficiency in salivary gland replication of the pSM3fr-derived viruses [19]. Furthermore, the pSM3fr bacmid was used as a template for the generation of various reporter viruses, e.g., MCMV-m157luc, with luciferase instead of m157. The M157 glycoprotein was identified as a ligand for the natural killer (NK) cell activation receptor Ly49H [20]. Interestingly, co-infection and mutation studies showed that this interaction between a viral protein and a host cell receptor is the opposite of immune evasion [21,22]. Therefore, the deletion of the m157 gene with subsequent replacement by a luciferase gene does not impair the fitness of the resulting viruses during infection but sensitizes C57BL/6 mice due to reduced NK cell defense. In other mouse strains, which are Ly49H-negative, M157 might act as a ligand for different receptors. The original pSM3fr bacmid and derivatives like MCMV-m157luc were used for multiple studies (e.g., [23,24]); MCMV-m157luc was also independently repaired in the m129 gene encoding MCK-2 [25]. In the scope of another project from our lab, we sequenced the full bacmids and viruses derived from them, and we noted numerous divergences from the deposited m129-repaired pSM3fr-MCK-2fl reference sequence in GenBank (Acc. No. KY348373). This prompted us to reconstruct the history of this widely used reagent and to generate new reference sequences for the original pSM3fr and the widely used luciferase-containing bacmids.

Results and Discussion
In the scope of generating modified MCMV bacmids derived from the widely used pSM3fr bacmid, we performed full-genome sequencing of the constructed DNA via next-generation sequencing (NGS) to verify the correct assembly and search for potential mutations. Since we detected a high number of unexpected mismatches in specific regions, we also sequenced the original pSM3fr bacmid [14,17].
The sequences generated from the original pSM3fr bacmid were mapped to the MCMV strain Smith (GenBank Acc. No.: OP429142) and the deposited pSM3fr-MCK-2fl reference sequence (GenBank Acc. No.: KY348373) with average coverages of 160 and 162, respectively. Interestingly, for both mappings, the variant detection showed a heavily mutated region in the open reading frames (ORFs) m150 and m151, which encode type 1 membrane proteins of the m145 family (Figure 2A). In total, for m150, we detected 19 SNPs that lead to 12 amino acid changes, and for m151, 51 SNPs that lead to 32 amino acid changes. The alignment of the consensus sequence of m150 and m151 with the same region of different MCMV strains revealed that these ORFs match the strain K181, while the rest of the sequences match the Smith strain (Figure 2B,C). These regions are, therefore, part of the ORFs that were reinserted to complete the MCMV genome within the bacmid and repair the initial loss of m151-m158 during cloning [17]. This divergence was expected for the comparison with the Smith strain, but not for the deposited pSM3fr-MCK-2fl reference sequence [18]. Moreover, this reference sequence does not contain annotations for the bacmid sequences, which were found in our mapping subsequent to ORF m158, flanked by repetitive elements for recombination (Figure 3A). Lodha et al. recently studied the transcriptional profile of MCMV, including noncoding RNA of the pSM3fr-derived virus; in this context, they also adapted the deposited reference sequence KY348373 [26]. However, this sequence does not contain the corrections in the region m150 to m158. We constructed a new reference sequence for the hybrid pSM3fr bacmid, which contains the regions originating from K181 at the position of the ORFs m150 to m158, the frameshift mutation in m129 (encoding the MCMV chemokine homologue MCK-2 [18]), as well as the bacmid-specific regions between m158 and m159 (Figure 3B). Further variants that were detected after mapping our pSM3fr libraries to the deposited reference sequence (KY348373) are listed in Table 1. Of note, this earlier deposited pSM3fr-MCK-2fl reference sequence was evidently corrected for the m129 frameshift, but not for any of the further variant positions with respect to strain Smith described in [18].
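The SNP counts above exceed the amino acid change counts because some substitutions are synonymous and several SNPs can fall into the same codon. The following minimal sketch illustrates how such counts can be derived from two aligned, gap-free coding sequences of equal length. It is not the CLC Genomics Workbench procedure used here; the function name `count_changes` and the toy sequences are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_changes(ref_cds: str, alt_cds: str) -> tuple[int, int]:
    """Return (number of SNPs, number of changed amino acids).

    Both inputs are assumed to be in-frame coding sequences of identical
    length without gaps or ambiguous bases (an assumption of this sketch).
    """
    ref_cds, alt_cds = ref_cds.upper(), alt_cds.upper()
    if len(ref_cds) != len(alt_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must have equal length divisible by 3")

    # Every mismatching nucleotide position counts as one SNP.
    snps = sum(a != b for a, b in zip(ref_cds, alt_cds))

    # A codon counts once if its translation differs, regardless of how
    # many SNPs it contains; synonymous SNPs do not count at all.
    aa_changes = 0
    for i in range(0, len(ref_cds), 3):
        if CODON_TABLE[ref_cds[i:i + 3]] != CODON_TABLE[alt_cds[i:i + 3]]:
            aa_changes += 1
    return snps, aa_changes


if __name__ == "__main__":
    # Toy example: GCT->GCC is synonymous (Ala), AAA->AGA changes Lys to Arg.
    print(count_changes("ATGGCTAAA", "ATGGCCAGA"))
```

In the toy example, two SNPs produce a single amino acid change, the same kind of discrepancy seen for m150 and m151.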
Another derivative of the widely used bacmid pSM3fr contains a repaired version of the MCK-2 gene (m129) [18] and an HCMV IE promoter-driven luciferase cassette replacing ORF m157 (MCMV-m157luc) [23,24]. For future analysis, we also generated an annotated reference sequence for this bacmid from the respective consensus sequences and confirmed it via resequencing (Figure 3C). Furthermore, we sequenced two different stocks of low-passage viruses reconstituted from the MCMV-m157luc bacmid with an average coverage of 330. The resulting reads were then mapped to our newly generated reference sequence for MCMV-m157luc (Figure 4) and reached average coverages of 256 and 2523, respectively. Interestingly, both mappings show gaps within the area encoding the bacmid components. This indicates that loss of genomic regions unimportant for viral replication is not infrequent, most probably due to the large size of the MCMV genome, which is already at the edge of the nucleocapsid packaging capacity [17,27]. This was probably also the cause of the initially observed deletion of m151 to m158 in the original bacmid [14]. Since the genes encoded in the bacmid areas are dispensable for viral replication and infection, we suggest that viruses that lose these parts, rather than parts of the original viral genome, have replication advantages over viruses with the full bacmid. This is supported by the fact that we observe partial bacmid loss in two different viral stocks at different positions, but no gene loss or extensive gaps in other parts of the genome. Nevertheless, the tendency toward gene loss and further genetic alterations during passaging demonstrates the usefulness of regular full-genome sequencing of both newly generated viral stocks and passaged viruses. It also suggests that one should make more regular use of the two loxP sites flanking the bacmid sequences; by passaging the virus through Cre recombinase-expressing cells, more comparable viruses with a homogeneous deletion of the bacmid and a genome size that is easier to package may be obtained. However, the loxP sites are not at the very ends of the bacmid, leaving residual foreign sequences. Alternatively, the vector cassette could be moved to the end of the genome or placed within an essential gene to increase selection pressure [28]. Since HCMV has a similar genome size to MCMV, we would assume that similar parameters and restrictions apply to bacmid cloning and rescue. This indicates that the characterization of recombinant herpesviruses requires particular attention, including marker rescue, careful restriction analysis, and resequencing of full recombinant genomes.
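Gaps like those observed in the bacmid region can be located programmatically from per-position read depth. The sketch below is only an illustration of the idea, assuming a list of depth values exported from any mapping tool; the function name `find_low_coverage_regions`, the threshold, and the minimum length are hypothetical choices, not parameters used in this study.

```python
def find_low_coverage_regions(depth, threshold, min_length=1):
    """Return half-open (start, end) intervals where depth stays below threshold.

    depth      -- sequence of per-position read depths (0-based positions)
    threshold  -- positions with depth strictly below this value count as low
    min_length -- shorter low-coverage runs are ignored
    """
    regions = []
    start = None  # start of the currently open low-coverage run, if any
    for pos, d in enumerate(depth):
        if d < threshold:
            if start is None:
                start = pos
        elif start is not None:
            # The run ended at the previous position; keep it if long enough.
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


if __name__ == "__main__":
    toy_depth = [100, 100, 2, 0, 0, 90, 95, 1, 80, 0, 0, 0]
    print(find_low_coverage_regions(toy_depth, threshold=10, min_length=2))
```

Comparing the intervals reported for independent viral stocks would show whether the deletions fall at the same or at different positions within the bacmid cassette.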

Table 1. Variants detected in the mapping of the pSM3fr libraries to the deposited reference sequence (KY348373), including variants already described in [18].

Furthermore, since HCMV has a similar genome size to MCMV, we would expect a tendency toward gene loss in HCMV bacmid systems as well and recommend full-genome sequencing of bacmid-derived HCMV.
In conclusion, we provide the CMV research community with new and corrected references for the extensively used pSM3fr bacmid (accession: ERZ20801763) and the MCK-2-repaired version, MCMV-m157luc reporter virus, containing a luciferase instead of ORF m157 (accession: ERZ20801746).

Bacterial Culture and DNA Preparation
Escherichia coli DH10B-derived EL250 [29] or GS1783 [30] carrying bacmid DNA was cultured overnight in LB medium containing 15 µg/mL chloramphenicol at 32 °C and 200 rpm. Bacmid DNA was purified using the PureLink™ HiPure Plasmid Maxiprep Kit (K210006; Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions.

Library Preparation and Sequencing
Purified bacmid DNA was obtained via standard precipitation protocols. Viral DNA was extracted utilizing a Qiagen (Venlo, The Netherlands) EZ-1 instrument. For the library preparation for next-generation sequencing, 500 ng of bacmid DNA or extracted viral DNA was fragmented, and adapters and indices were ligated using the NEBNext® Ultra™ II FS DNA Library Prep Kit for Illumina (E7805; New England Biolabs, Ipswich, MA, USA), according to the manufacturer's instructions, with a fragmentation time of 15 min.
Sequencing was performed via paired-end sequencing utilizing the MiSeq Reagent Kit v3 (150 cycles) on a MiSeq™ instrument (Illumina, San Diego, CA, USA). Sequence analysis was conducted using CLC Genomics Workbench versions 22 and 23 (Qiagen Aarhus A/S, Denmark). Raw reads were trimmed for quality (limit 0.05, Mott trimming algorithm), adapters, and ambiguities. The trimmed reads were mapped with different stringencies, testing for the best distribution. Duplicate reads were removed from the mapping. Variants relative to the reference sequence were called only at positions with a coverage above 50 and a variant frequency in the reads of more than 80%.
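The two variant-calling thresholds (coverage above 50, variant frequency above 80%) amount to a simple filter over candidate variants. The sketch below shows that filter under stated assumptions: the `Variant` record and `filter_variants` function are hypothetical illustrations and do not reproduce the internal data model of CLC Genomics Workbench.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical minimal record for one candidate variant."""
    position: int    # 1-based position in the reference
    ref: str         # reference allele
    alt: str         # observed allele
    coverage: int    # total reads covering the position
    alt_reads: int   # reads supporting the alternative allele

    @property
    def frequency(self) -> float:
        # Fraction of covering reads that carry the variant.
        return self.alt_reads / self.coverage if self.coverage else 0.0


def filter_variants(variants, min_coverage=50, min_frequency=0.80):
    """Keep variants with coverage > min_coverage and frequency > min_frequency."""
    return [
        v for v in variants
        if v.coverage > min_coverage and v.frequency > min_frequency
    ]


if __name__ == "__main__":
    candidates = [
        Variant(1200, "A", "G", coverage=160, alt_reads=158),  # passes both
        Variant(3400, "C", "T", coverage=40, alt_reads=40),    # coverage too low
        Variant(5600, "G", "A", coverage=200, alt_reads=90),   # frequency too low
    ]
    for v in filter_variants(candidates):
        print(v.position, v.ref, v.alt, round(v.frequency, 3))
```

Both comparisons are strict, matching the wording "above 50" and "more than 80%"; a position with exactly 50 reads is therefore excluded.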

Production and Purification of MCMV Stocks
In order to generate high-titer stocks of MCMV, murine embryonic fibroblast (MEF) cells were seeded 1:2 into twelve T175 cell culture flasks. On the next day, purified MCMV virions were thawed and 1 × 10^5 virions were added to 25 mL of complete DMEM. After extensive vortexing, the MEF supernatant of each flask was exchanged for 25 mL of complete DMEM containing MCMV virions and the cells were incubated for 1 week under cell culture conditions. Cell culture supernatants were centrifuged (1500 × g, 20 min, 4 °C) to remove cell debris. The supernatants were ultracentrifuged (33,000 × g, 3 h, 4 °C). The supernatant was discarded; then, the pellets were resuspended in a few microliters of residual media, pooled, and dispersed using a syringe. Following that, 1 mL of concentrated virions was carefully loaded on a 9 mL cushion of 15% sucrose in virus suspension buffer (VSP; 50 mM Tris-HCl (pH 7.8), 12 mM KCl, 5 mM EDTA) and purified via ultracentrifugation (70,000 × g, 1 h, 4 °C). The sucrose cushion was discarded; then, the pellets were washed in 10 mL of PBS and repelleted (70,000 × g, 1 h, 4 °C). PBS was discarded and 500 µL of VSP was added and incubated on ice overnight. The next day, pellets were resuspended, pooled, dispersed using a syringe, and stored at −80 °C in 40 µL aliquots.

Sequences
Reference sequences were deposited in the European Nucleotide Archive (https://www.ebi.ac.uk/ena/browser/home, accessed on 31 July 2023). The accession number for the original pSM3fr bacmid is ERZ20801763, while the accession number for the MCK-2-repaired version with a luciferase instead of ORF m157 is ERZ20801746.

Figure 1. Genome alignment of the MCMV strains Smith and K181. The reference sequences for the MCMV strains Smith (OP429142) and K181 (AM88612) from GenBank were aligned and sequence conservation is displayed. Black bars indicate lower conservation. The genome areas of m02-m06 as well as m145-m159 are enlarged to demonstrate high sequence diversity in these respective areas between the two strains.

Figure 2. Comparison of ORFs m150 and m151 between MCMV strains Smith and K181. (A) Mapping of original pSM3fr reads against MCMV strain Smith and pSM3fr-MCK-2fl reference sequence and variant detection. Colored vertical lines indicate multiple nucleotide polymorphisms. (B) Section of an alignment of pSM3fr-MCK-2fl sequence of ORFs m151 and m152 as well as the respective regions of database reference sequences for MCMV strains Smith (OP429142, GU305914) and K181 (AM88612). Alignment was constructed using the Jukes-Cantor model. Black bars indicate lower conservation. (C) Maximum-likelihood tree generation with 1000 bootstrap repeats from the alignment of (B); numbers at bifurcations give the percentage of trees with the respective branching. Scale bars indicate the phylogenetic distance in percent of number of substitutions/changes per nucleotide.
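For readers unfamiliar with the Jukes-Cantor model named in the caption, the following sketch computes the corresponding pairwise distance, d = -(3/4) ln(1 - (4/3) p), where p is the observed proportion of differing sites. This is a generic textbook illustration, not the CLC Genomics Workbench implementation; the function name `jukes_cantor_distance` and the toy sequences are hypothetical.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Return the Jukes-Cantor distance between two aligned nucleotide sequences.

    Alignment columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where both sequences carry a base.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")

    # Observed proportion of differing sites.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        # The correction is undefined once sites are saturated.
        raise ValueError("Jukes-Cantor distance is undefined for p >= 0.75")

    # Correct for multiple substitutions at the same site.
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


if __name__ == "__main__":
    # One mismatch in ten sites (p = 0.1) gives a slightly larger distance.
    print(round(jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA"), 4))
```

For closely related sequences such as Smith and K181, the corrected distance stays close to the raw proportion of mismatches.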

Figure 3. Schematic representation of corrected reference sequences.(A) Mapping showing the bacmid-specific sequences within the references.(B) The corrected pSM3fr reference sequence, with the frameshift mutation in m129 and strain-K181-derived sequences in m151-m158.(C) The newly generated reference sequence for the repaired and modified pSM3fr-m157luc bacmid, containing the repaired version of m129 and a luciferase gene cassette instead of m157.

Figure 4. Mapping of reads obtained from low-passage MCMV viruses (m157-luc). Displayed are the mapped reads from two different viral stocks. The genome area spanning the bacmid components, which are flanked by the repetitive elements for recombination, is shown. In both cases, gaps (without specific read mapping or with strongly reduced specific read mapping) can be observed.