Intensive Distribution of G2-Quadruplexes in the Pseudorabies Virus Genome and Their Sensitivity to Cations and G-Quadruplex Ligands

Guanine-rich sequences in the genomes of herpesviruses can fold into G-quadruplexes. Compared with the widely-studied G3-quadruplexes, the dynamic G2-quadruplexes are more sensitive to the cell microenvironment, but they attract less attention. Pseudorabies virus (PRV) is the model species for the study of the latency and reactivation of herpesvirus in the nervous system. A total of 1722 G2-PQSs and 205 G3-PQSs without overlap were identified in the PRV genome. Twelve G2-PQSs from the CDS region exhibited high conservation in the genomes of the Varicellovirus genus. Eleven G2-PQSs were 100% conserved in the repeated region of the annotated PRV genomes. There were 212 non-redundant G2-PQSs in the 3′ UTR and 19 non-redundant G2-PQSs in the 5′ UTR, which would mediate gene expression in the post-transcription and translation processes. The majority of examined G2-PQSs formed parallel structures and exhibited different sensitivities to cations and small molecules in vitro. Two G2-PQSs, respectively, from 3′ UTR of UL5 (encoding helicase motif) and UL9 (encoding sequence-specific ori-binding protein) exhibited diverse regulatory activities with/without specific ligands in vivo. The G-quadruplex ligand, NMM, exhibited a potential for reducing the virulence of the PRV Ea strain. The systematic analysis of the distribution of G2-PQSs in the PRV genomes could guide further studies of the G-quadruplexes’ functions in the life cycle of herpesviruses.


Introduction
The life-long latency of herpesviruses poses a potential threat to the host at any time, and the reason for the widespread occurrence of GC-rich sequences in herpesvirus genomes remains unknown [1]. Guanine-rich sequences can form special DNA or RNA secondary structures called G-quadruplexes. They are composed of π-stacked, hydrogen-bonded G-quartets in DNA or RNA [2]. Each G-quartet contains four guanines held together by eight hydrogen bonds and coordinated with a central monovalent cation, such as K+ or Na+. The G3-putative quadruplex sequences (G3-PQSs), in the form (G3+N1-7)3G3+, form three or more G-quartets in their structures (Figure 1).
Pseudorabies virus (PRV) represents a good model for the study of G2-quadruplex functions in herpesvirus genomes. This virus belongs to the Varicellovirus genus in the Alphaherpesvirinae subfamily of the family Herpesviridae. Members of the Alphaherpesvirinae establish latency in the nervous system [9]. PRV causes neuronal and lethal infection in many animal species, yet poses little or no danger to humans [10-14]. PRV has been used as a model species for studying the cycle of infection, latency, and reactivation, which are critical processes for the survival of alphaherpesviruses [10]. Vaccination is the most effective approach to preventing virus infection. However, herpesviruses can establish latency in the host after the first infection, and they can reactivate to cause serious diseases in their host. Vaccination failure poses a serious threat to humans and animals. For example, the Bartha-K61 vaccine has been used worldwide, and it played an important role in the eradication of pseudorabies virus in many countries. Nevertheless, it failed to protect piglets from infection by several virulent PRV strains in China, resulting in a PRV re-outbreak in 2011 [15-23].
To prevent and cure herpesvirus infections successfully, it could be helpful to reveal the latency-reactivation mechanism on the basis of the characteristics of herpesvirus genomes and their features in host cells. Human gammaherpesvirus 4 (Epstein-Barr virus, EBV) was proposed to modulate immune evasion through a G2-quadruplex forming in the coding sequence (CDS) of Epstein-Barr virus-encoded nuclear antigen 1 (EBNA1) [7]. PRV may respond to host cell defenses through the formation and resolution of G-quadruplexes to regulate latency and reactivation. This study aims to provide clues for vaccine and drug development at the nucleic acid level.
In this work, a systematic analysis will be carried out to locate the G2/G3-quadruplex sequences in the PRV genome. The conservation of the putative G-quadruplex-forming sequences will be evaluated in the PRV strains and in the Varicellovirus genus. The evolutionary differences in the G-quadruplexes between non-human infectious herpesviruses and human herpesviruses will be discussed further. The structure types of G2-PQSs and their sensitivities to different cations and ligands will be examined for their roles in regulating gene expression. A classic G-quadruplex ligand, N-methyl mesoporphyrin IX (NMM), exhibited potential for inhibiting the virulence of the PRV Ea strain. This study will further investigate G2-quadruplexes working as sensors in response to small molecules, proteins, and physiological cation conditions in the specific microenvironment, to reveal the latency-reactivation mechanism of herpesviruses.
The putative G3-quadruplex sequences (G3-PQSs) in the PRV genome were predicted with the Quadparser program [3] in the form of the GnNm sequence (where n ≥ 3 and 1 ≤ m ≤ 7). The putative G2-quadruplex sequences (G2-PQSs) were predicted with the same program, but in the modified sequence form (n ≥ 2). The genome size of PRV was 143,461 bp, and it encoded 69 proteins. The analysis of the PRV reference genome (NC_006151.1) indicated that 1722 G2-PQSs and 205 G3-PQSs were distributed in the PRV genome (Figure 2; File S1), with the density of the G2-PQSs being 12 PQS/kb.
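The GnNm search form described above can be illustrated with a regular expression. The following is a minimal sketch, not the Quadparser program itself: the function names (`find_pqs`, `density_per_kb`), the restriction of loops to the letters A, C, G, and T, and the reporting of minus-strand hits in forward-strand coordinates are all assumptions made for illustration. It will not necessarily reproduce the counts reported here, since the handling of overlapping motifs in the original analysis is not specified.

```python
import re

# Translation table used to reverse-complement an upper-case DNA string.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_pqs(seq, min_g=3, max_loop=7):
    """Scan both strands for four G-runs (>= min_g) separated by loops of 1..max_loop nt.

    Returns sorted (start, end, strand) tuples in 0-based, half-open
    forward-strand coordinates. Hits on each strand are non-overlapping
    because re.finditer resumes after the end of the previous match.
    """
    pattern = re.compile(
        rf"(?:G{{{min_g},}}[ACGT]{{1,{max_loop}}}){{3}}G{{{min_g},}}"
    )
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in pattern.finditer(seq)]
    # Minus strand: search the reverse complement, then map back to forward coordinates.
    for m in pattern.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)


def density_per_kb(count, length_bp):
    """PQS density expressed as motifs per kilobase."""
    return count / (length_bp / 1000)
```

With the figures quoted in the text, `density_per_kb(1722, 143461)` evaluates to roughly 12 PQS/kb. Setting `min_g=2` corresponds to the modified G2-PQS form, and `min_g=3` to the G3-PQS form.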

Bioinformatic Analysis
As the formation of a G-quadruplex could affect either transcription or translation, our analysis of PQSs was based on both strands of the well-annotated regions and the predicted regulatory regions in the PRV reference genome (File S2). The 3′ untranslated regions (3′ UTRs) of 63 genes and the 5′ untranslated regions (5′ UTRs) of 61 genes were annotated in the PRV reference genome in the National Center for Biotechnology Information (NCBI) Genome database [24]. The promoters of the annotated PRV genes were predicted to be 1 kb upstream of the annotated transcription start site of each gene. The G2-PQS density was higher in the CDS region and in the large latency transcript (LLT) than in the repeat region, while the G3-PQSs were densely distributed in the repeat region (Table 1). The density of the G2-PQSs in the 3′ UTR was higher than that in the 5′ UTR (Table 1). In the promoter regions, the G2-PQS density was 5.34 PQS/kb, more than eight-fold the G3-PQS density (Table 1).
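The promoter definition used here (1 kb upstream of the transcription start site) depends on the strand of the gene. The sketch below shows one way to derive such a window and a per-region density. The function names, the 0-based half-open coordinates, the clipping at the genome ends, and the rule that a PQS must lie entirely inside the window to be counted are assumptions for illustration and are not taken from the original analysis.

```python
def promoter_window(tss, strand, genome_length, upstream=1000):
    """Return the (start, end) window lying `upstream` bp upstream of a TSS.

    Coordinates are 0-based and half-open; `tss` is the index of the first
    transcribed base. For a minus-strand gene, "upstream" means higher
    forward-strand coordinates. The window is clipped at the genome ends.
    """
    if strand == "+":
        return max(0, tss - upstream), tss
    if strand == "-":
        return tss + 1, min(genome_length, tss + 1 + upstream)
    raise ValueError("strand must be '+' or '-'")


def pqs_density_per_kb(pqs_intervals, window):
    """Density (per kb) of PQS intervals that lie entirely inside the window."""
    start, end = window
    if end <= start:
        raise ValueError("window must have positive length")
    inside = sum(1 for s, e in pqs_intervals if start <= s and e <= end)
    return inside / ((end - start) / 1000)
```

For example, a plus-strand gene with its TSS at position 5000 gets the window (4000, 5000), and PQS intervals can then be counted against that window to give a density in PQS/kb.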
A single G-quadruplex-forming sequence was termed a PQS monomer, and a sequence able to form more than two G-quadruplexes simultaneously was defined as a PQS cluster. A G3-quadruplex cluster, forming a highly stable structure, was identified in the repetitive region of the Herpes simplex virus type 1 (HSV-1) genome [25]. This study found that 86.8% of G3-PQSs and 77% of G2-PQSs were monomers in the genome of PRV (File S1). Twenty-seven non-redundant G3-PQS clusters were located in the regulatory regions, rather than in the coding region, of the PRV genome. Sixty-nine G2-PQS clusters were distributed in the repeat regions, and 122 G2-PQS clusters were located in the coding region of the PRV genome (Table 1).
Table 1 notes. †: The CDS, 3′ UTR, 5′ UTR and repeat regions were analyzed according to the annotation information in the reference genomes. The promoter regions were predicted as 1 kb upstream of the transcription start site of each annotated gene. LLT: large latency transcript; LAT: latency-associated transcript; VLT: VZV latency-associated transcript [26]. *: The annotated 5′ UTRs of the genes from Human alphaherpesvirus 3 were analyzed for the PQSs.
The G2-quadruplex formation in the open reading frame (ORF) of Epstein-Barr virus-encoded nuclear antigen 1 (EBNA1) led to decreased mRNA translation in the Epstein-Barr virus (EBV), which suggested that G-quadruplexes in viral transcripts act as specific regulatory elements to regulate translation level and immune evasion [7]. The PRV replication cycle comprises five main processes: entry, immediate early stage, early stage, late stage, and egress [10]. There were 481 G2-PQSs in the ORFs of the genes in these five stages (Figure 3). A total of 112 G2-PQSs were found in the entry stage, and 30 of them were in important envelope glycoproteins that recognize host cells. After the entry stage, PRV is in the immediate early stage. It transcribes and expresses only one immediate early gene, IE180; this differs from human herpesviruses, which transcribe three to five immediate early genes. IE180 is required for the effective transcription of early viral genes [10]. There were two copies of IE180 in the PRV genome, and 22 G2-PQSs were located in the coding sequence of each copy. The genes in the early stage contained 144 G2-PQSs, nearly twice as many as the total number of G2-PQSs in the late stage (n = 77). The early-stage proteins had functions in transactivation and viral DNA synthesis, while the late-stage proteins were mainly responsible for DNA packaging and capsid maturation. A total of 104 G2-PQSs were identified in the egress stage, and most of them were present in the tegument proteins, which are important for virion formation before cell-to-cell movement (Figure 3).
[...] in the coding region of the helicase-primase complex formed by UL5, UL8, and UL52. Twenty-five G2-PQSs were found in the gene UL29, encoding the single-stranded DNA-binding protein ICP8. The DNA replication origin-binding helicase UL9 contained 13 G2-PQSs in its coding sequence, and the DNA polymerase complex UL30/UL42 contained 13 G2-PQSs in total.
In the late stage, UL38, UL35, UL25, UL19, UL18, and UL6 encoded the constituents of the mature capsid, which are required for capsid assembly in the nucleus of the cell, and two scaffolding proteins (UL26 and UL26.5) have been reported to participate in capsid formation [10]. These late proteins accounted for 61% of the G2-PQSs in the late stage (Figure 3).

G2-PQSs Involved in the Processes of Virus Entry and Egress
Both UL36 and UL37 are tegument protein genes required in the entry and egress stages [10]. The VP1/2 protein (UL36) is an important component of the inner tegument layer, and it was reported to be associated with the capsid during PRV transport across the cytoplasm to the nuclear pore [27]. Truncating translation of the UL36 gene abolishes production of the VP1/2 protein, resulting in failed transport of the virus particle. Fifty-six G2-PQSs were observed in the CDS region of UL36, accounting for 50% of the total number of G2-PQSs in the stages in which it is involved (Figure 3). The viral glycoproteins gC (UL44), gD (US6), gB (UL27), gH (UL22), and gL (UL1) mediate a cascade required for PRV virions to enter specific cells, and there were 29 G2-PQSs in total in the ORFs of these glycoproteins (Figure 3).

Conservation and Potential Function of G2-PQSs in CDS Region in the Varicellovirus Genus
Since coding sequences are more conserved than regulatory regions among different viruses of the same genus, an inter-species analysis of the conservation of the G2-PQSs in the coding region was performed across 11 Varicellovirus species, including PRV. The conservation of the 494 G2-PQSs in the CDS region was analyzed. The analysis indicated that 55.3% of the G2-PQSs had a conservation score higher than 0.2 (Figure S1). Twelve of these 494 G2-PQSs had a conservation score higher than 0.7 (Table 2). Five of them were derived from the immediate early gene IE180, and two were derived from the gene UL30, encoding the DNA polymerase catalytic subunit. One G2-PQS derived from the gene UL13, encoding a protein-serine/threonine kinase, was 100% conserved in the Varicellovirus genus (Table 2). The other G2-PQSs with a conservation score of 1.0 were from the gene UL33, which associates with UL28 and UL15 to contribute to DNA cleavage and PRV packaging. The DNA cleavage and encapsidation-related gene UL17 and the major capsid protein gene UL19 each contained one conserved G2-PQS in their coding sequences. The two tegument protein genes UL16 and UL47 also each contained one conserved G2-PQS (Table 2).

Distribution Analysis of G2-PQSs in Regulatory Regions in PRV Genomes
Promoters, untranslated regions, and intergenic regions are important regulatory regions with multiple functions in gene regulation. G3-PQSs were reported to be mainly located in the regulatory regions of human herpesvirus genomes [5], whereas few reports on G2-PQSs are available. A systematic analysis of G2-PQSs could help in exploring their regulatory function. In this study, PRV was found to have a higher density of G2-PQSs than G3-PQSs in the repeat regions (Table 1).

Dozens of Conserved G2-PQSs in the Repeat Regions Related to Genome Recombination
The terminal repeat (TR) region is important for genome replication in some herpesviruses. The G-quadruplex in the TR region of the gammaherpesvirus Kaposi sarcoma-associated herpesvirus (KSHV) altered latent DNA replication and episomal persistence [28]. Furthermore, the stabilization of HSV-1 G-quadruplexes in the repeat region inhibited DNA polymerase processing and viral DNA replication [25]. The PRV genome is similar to the HSV-1 genome, being characterized by two unique regions (UL and US), with the US region flanked by the internal and terminal repeat sequences (IRS and TRS, respectively). During PRV infection, recombination between the inverted repeats produces two possible isomers of the genome with the US region in opposite orientations. Both isomers are infectious and are present in equimolar amounts after infection, and the PRV genome is circularized upon entry into the host nucleus through blunt-end ligation independent of any viral protein synthesis (reviewed in [10]). Four G2-PQSs were located in the 0-656 bp region, and 208 G2-PQSs were located in the two regions between the genes IE180 and US1 (Figure 4). The conservation percentage of 117 G2-PQSs on the sense strand of the PRV genome ranged from 4% to 100% (Figure 5). There were 38 G2-PQSs in the repeat region between IE180 and US1 with a conservation percentage of more than 90% (Figure 5), including 11 G2-PQSs with a 100% conservation rate (Table S1). These conserved GC-rich sequences might form G-quadruplexes in the inverted repeat region, mediating genome recombination after infection. Fourteen G2-PQSs located between IE180 and ORF1 may act as switches of PRV replication in the circular genome.
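The conservation percentage reported for the repeat-region G2-PQSs across the annotated PRV genomes can be illustrated with a simple presence test. The text does not state how conservation was scored, so the sketch below rests on a loudly stated assumption: a PQS counts as conserved in a genome only if its exact sequence occurs on the given strand of that genome. The function name `conservation_percentage` and the toy sequences are hypothetical, and an alignment-based score would behave differently.

```python
def conservation_percentage(pqs, genomes):
    """Percentage of genomes that contain the exact PQS sequence.

    Assumption for illustration: exact, same-strand substring matching.
    Substitutions, indels, or strand flips all count as "not conserved".
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    motif = pqs.upper()
    present = sum(1 for g in genomes if motif in g.upper())
    return 100.0 * present / len(genomes)


# Toy sequences (hypothetical, not PRV data): the motif is intact in three of four.
toy_genomes = [
    "AAGGAGGAGGAGGTT",
    "CCGGAGGAGGAGGCC",
    "ttggaggaggaggaa",
    "AAGGAGCAGGAGGTT",  # one G replaced by C, so the exact motif is absent
]
toy_score = conservation_percentage("GGAGGAGGAGG", toy_genomes)
```

Under this definition, a 100% value means the motif is present unchanged in every genome examined.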

G2-PQSs in the Untranslated Regions of PRV Genes
The untranslated regions (UTRs) are important regulatory regions for gene expression. G-quadruplex-forming sequences in the 5′ UTR acting as translational repressors have been reported in several human genes [29,30]. The 3′ UTR is related to mRNA stability, alternative splicing, polyadenylation, and localization. G-quadruplexes in the 3′ UTRs of the CaMKIIa (Ca2+/calmodulin-dependent protein kinase II) and PSD-95 (post-synaptic density protein 95) mRNAs are responsible for the transport of those mRNAs in neurites in vivo [31].
Nineteen non-redundant G2-PQSs and five G3-PQSs were found in the 5′ UTRs of the annotated PRV genes (Figure S2). Among all the genes in the PRV genome, the gene UL13 contained four G2-PQSs in its 212-bp 5′ UTR, the highest density of G2-PQSs among the 5′ UTRs, and the gene UL28 contained one G3-PQS in its 72-bp 5′ UTR, the highest density of G3-PQSs among the 5′ UTRs (Figure 6). The transactivator gene US1 contained four G2-PQSs in its 5′ UTR, accounting for 17% of the G2-PQS monomers found in the 5′ UTRs of the whole genome (Figure 7A). UL13 had three G2-PQS monomers in its 5′ UTR, accounting for 13% of the G2-PQS monomers in the 5′ UTRs of the PRV genome (Figure 7A). Among the early protein genes, the deoxyribonuclease gene UL12 and the dUTPase gene UL50 each had two G2-PQS monomers in their 5′ UTRs (Figure 7A). One G2-PQS cluster was found in the gene UL13, which was the only G2-PQS cluster in all of the annotated 5′ UTRs of PRV genes (File S2).
The number of G2-PQSs in the 3′ UTRs of the annotated PRV genes was more than ten-fold that found in the 5′ UTRs. There were 212 non-redundant G2-PQSs and 8 G3-PQSs in the 3′ UTRs of PRV genes (Figure S2). These results suggested that G2-quadruplexes might frequently regulate gene expression by affecting 3′ UTR secondary structure. The genes UL51 and UL29 had the highest density of G3-PQSs in their 3′ UTRs (Figure 6).
More than 70 G2-PQS monomers were found in the 3′ UTRs of the genes related to viral genome replication, envelopment, and packaging processes (Figure S2). UL48 encoded the VP16 protein, which exerts multiple functions in PRV, including transactivation in gene regulation and secondary envelopment in viral egress [10]. Twenty-one G2-PQS monomers were found in the 3′ UTR of the gene UL48, accounting for 11% of all the G2-PQS monomers in the 3′ UTRs of the PRV genome (Figure 7B). The late-stage gene UL15, encoding DNA packaging terminase subunit 1, had 15 G2-PQS monomers in its 3′ UTR. UL9 encoded the sequence-specific ori-binding protein (OBP), which formed the ATP-dependent helicase motif. OBP was essential for the replication of the viral DNA [10]. UL9 had 13 G2-PQS monomers in its 3′ UTR. The tegument protein coding gene UL14 and the type III membrane protein coding gene UL24 had the same number of G2-PQSs in their 3′ UTRs (Figure S2). The G2-PQS monomers from UL9, UL14 and UL24 accounted for 21% of all the G2-PQS monomers in the 3′ UTRs (Figure 7B).
Another important characteristic was that G2-PQS clusters were densely distributed in the 3′ UTRs of PRV genes. The gene UL48 had 3 G2-PQS clusters in its 3′ UTR. The DNA replication-related gene group had far more G2-PQS clusters than other functional gene groups. The UL9/UL5/UL52 group had 6 G2-PQS clusters in the 3′ UTR region (Figure 7C). The proteins encoded by these three genes included the sequence-specific ori-binding protein, the helicase motif, and the primase subunit. The G2-PQSs from the 3′ UTRs of UL5 and UL9 were validated in further experiments. The identification of the functional G2-PQSs in the untranslated regions of genes involved in transactivation and viral DNA synthesis will provide elements for the post-transcriptional regulation of virus genes.

A Wide Distribution of G2-PQSs in the Large Latency Transcript
The large latency transcript was the only gene reported to be transcribed during PRV latency [10]. It overlapped with the oppositely transcribed IE180 and EP0 genes, and it was one of the spliced transcripts in the PRV genome [9]. Sixty-seven and 72 G2-PQSs were predicted in the positive and negative strands of the LLT, respectively (Figure 8). Forty G2-PQSs were found to be 100% conserved among the annotated PRV genomes (Table S2). Thirty G2-PQSs in the negative strand overlapped with the G2-PQSs in the EP0 and IE180 transcripts, suggesting that these overlapping G2-PQSs could be potential dual-functional regulatory elements. Twenty-seven G3-PQSs were found in the LLT transcription region (Figure S3); only one G3-PQS overlapped with the EP0 transcript, while the IE180 transcript did not have any overlapping G3-PQS. The LLT intron contains a microRNA cluster that is involved in PRV replication and affects virulence [32,33]. Fifteen G3-PQSs and 35 G2-PQSs were located in the intron of the LLT. Thus, it could be speculated that these PQSs in the intron might be involved in the transcriptional regulation of microRNAs during PRV latency.
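Identifying PQSs in the LLT that also fall within the oppositely transcribed EP0 and IE180 transcripts amounts to an interval-overlap test. The sketch below shows that test under stated assumptions: intervals are 0-based and half-open, the function names are hypothetical, and the coordinates in the example are invented placeholders rather than positions from the PRV reference genome.

```python
def intervals_overlap(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def pqs_by_transcript(pqs_intervals, transcripts):
    """Map each transcript name to the PQS intervals that overlap its span."""
    return {
        name: [p for p in pqs_intervals if intervals_overlap(p, span)]
        for name, span in transcripts.items()
    }


# Placeholder coordinates for illustration only (not real PRV positions).
example_pqs = [(100, 120), (480, 505), (900, 915)]
example_transcripts = {"EP0": (450, 700), "IE180": (880, 1500)}
example_hits = pqs_by_transcript(example_pqs, example_transcripts)
```

A PQS that appears in the list of a second transcript as well as in the LLT is a candidate for the dual-functional role suggested above; whether it acts that way is an experimental question.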

Density of the G2-PQSs in the Promoter Regions
The G3-quadruplexes in the promoter regions of Human immunodeficiency virus 1 (HIV-1) and human herpesviruses were reported to negatively regulate gene expression [5,34-36]. Since the promoters of PRV genes were not annotated in the reference genome, we predicted the promoters to be 1 kb upstream of the transcription start site of each gene. The gene UL14, coding a tegument protein, and the gene UL28, coding DNA packaging terminase subunit 2, each contained nine G2-PQSs in their promoters, the largest number among all of the PRV genes. The viral DNA replication-related genes UL29/UL30/UL8 had eight G2-PQSs in each promoter (Figure 6).

Comparison of PQS Distribution among Three Herpesvirus Genomes
Inter-species comparison was performed among PRV, HHV-1 (Herpes simplex virus type 1, HSV-1) and HHV-3 (Varicella-zoster virus, VZV). These three viruses are neurotropic alphaherpesviruses. PRV and HHV-3 belong to the same genus, Varicellovirus. These herpesviruses can cause serious diseases after reactivation from latency. The trigeminal ganglia (TG) are the common latency site for these three herpesviruses, and the dorsal root ganglia (DRG) are another latency site for HHV-3 [10,25,37]. It is difficult to control the switch between latency and reactivation of these herpesviruses. An analysis of the common features and differences in the PQS distribution among these herpesviruses could provide insight into latency modulation. In all three herpesviruses, the G3-PQS density was highest in the repeat regions (Table 1), which are related to genome recombination during infection. Although they belong to different genera, PRV and HHV-1 shared similar features in the distribution and high density of G2-PQSs, which differed from those of HHV-3. In PRV and HHV-1, the G2-PQS density was highest in the coding regions. The density of G2-PQSs in the coding regions of PRV and HHV-1 genes was five-fold that of HHV-3 (Table 1). In the PRV and HHV-1 genomes, the number of G2-PQS monomers in the 3′ UTRs was 8-12 fold that in the 5′ UTRs. The latency-associated transcripts in HHV-1 and PRV are much longer than that in HHV-3. In PRV, the densities in the LLT were 10.66 G2-PQS/kb and 2.07 G3-PQS/kb. In HHV-1, the densities in the LAT were 9.91 G2-PQS/kb and 3.39 G3-PQS/kb, while in HHV-3, no G3-PQS was observed in the VLT, and the density of G2-PQSs was half that of PRV (Table 1). Compared with HHV-1, PRV had more G3-PQS monomers in the repeat region, but fewer G3-PQS monomers in the CDS, UTR, promoter, and LLT regions (Table 1).

Summary of Bioinformatics Analysis
In herpesvirus genomes, G3-PQSs were especially abundant in the repeat regions, while G2-PQSs were distributed genome-wide. The non-coding regions had many more G2-PQSs than G3-PQSs, and G2-PQSs showed a higher density in the CDS region and LLT than in the repeat region. The genes involved in transactivation, genomic DNA replication, and virus maturation had abundant and conserved G2-PQSs in their mRNA sequences. In the Varicellovirus genus, some G2-PQSs in the coding sequences of the immediate early protein ICP4, the DNA cleavage and packaging protein (UL33), and the serine/threonine kinase (UL13) were 100% conserved. These highly conserved G2-PQSs provide universal target sites for controlling Varicellovirus by disturbing the translation of the above proteins. The density of G2-PQSs in the 3′ UTR was higher than that in the 5′ UTR, and more G2-PQS clusters were found in the 3′ UTRs of PRV genes than in the 5′ UTRs. The 38 G2-PQSs in the repeat region between IE180 and US1 were more than 90% conserved among all the annotated PRV genomes, indicating that these conserved elements might be involved in PRV genome recombination. The LLT overlapped with the oppositely transcribed IE180 and EP0 genes. The PQSs in the overlapping regions might be dual-functional regulatory elements.

Experimental Validation
The genome-wide analysis of the G2-PQS distribution indicated that one-third of the PRV G2-PQSs were located in the non-coding regions of the PRV genome, such as the UTR, IRS, TRS, and LLT regions (File S1). Further study is needed to determine whether these PQSs could serve as cis-regulatory elements in gene expression.

Parallel G-Quadruplexes Formed by G2-PQSs In Vitro
Circular dichroism (CD) spectroscopy is widely used to distinguish parallel, anti-parallel, and hybrid G-quadruplex structures [38]. A parallel G-quadruplex shows a negative peak near 240 nm and a positive peak near 260 nm. An anti-parallel G-quadruplex exhibits a negative peak near 260 nm and a positive peak around 290 nm. A hybrid G-quadruplex displays one negative peak at 240 nm and two positive peaks at 260 nm and 290 nm. Of the 15 G2-PQS oligonucleotides (details in Table S3) selected from the regulatory regions of the PRV genome, 13 folded into parallel G-quadruplexes in a buffer containing 100 mM potassium. LLT-PQS1 and LLT-PQS2, located between the start of the LLT and Prv-miR-1-5′, formed hybrid G-quadruplexes, while the other LLT-PQSs, located between Prv-miR-11-1 and the end of the LLT, formed parallel G-quadruplexes (Figure 9A). Three G2-PQSs located in the repeat region between IE180 and US1 and five G2-PQSs located in the repeat region complementary to the US1 CDS all formed parallel G-quadruplexes (Figure 9B,C). One G2-PQS from the 3′ UTR of UL5 and another from the 5′ UTR of US1 exhibited typical parallel G-quadruplex peaks (Figure 9D).
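The peak rules described above can be turned into a simple decision procedure. The following Python sketch is only an illustration of those rules and is not the analysis used in this study; the function name `classify_cd`, the ±5 nm search window, and the 0.5 mdeg noise threshold are all hypothetical choices.

```python
def classify_cd(spectrum, window=5.0, noise=0.5):
    """Classify a CD spectrum by the sign of its signal near 240, 260 and 290 nm.

    spectrum: dict mapping wavelength (nm) to ellipticity (mdeg).
    window and noise are illustrative assumptions, not values from the study.
    """
    def extreme(target):
        # Largest-magnitude signal within +/- window nm of the target wavelength.
        values = [v for wl, v in spectrum.items() if abs(wl - target) <= window]
        if not values:
            raise ValueError(f"no data points near {target} nm")
        return max(values, key=abs)

    e240, e260, e290 = extreme(240), extreme(260), extreme(290)
    neg240 = e240 < -noise
    pos260 = e260 > noise
    neg260 = e260 < -noise
    pos290 = e290 > noise

    if neg240 and pos260 and pos290:
        return "hybrid"          # negative 240 nm, positive 260 nm and 290 nm
    if neg240 and pos260:
        return "parallel"        # negative 240 nm, positive 260 nm
    if neg260 and pos290:
        return "antiparallel"    # negative 260 nm, positive 290 nm
    return "unclassified"


if __name__ == "__main__":
    print(classify_cd({240: -3.0, 260: 8.0, 290: 0.1}))
```

Real spectra are usually inspected by eye as well, because mixed populations of conformations can blur these signatures.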

G2-Quadruplexes in the 3′ UTR Affect Gene Expression In Vivo with Varying Sensitivities
The formation of a G-quadruplex in the 3′ UTR was reported to result in decreased gene expression, alternative polyadenylation [39], and retrotransposition [40]. This study found more G2-PQSs in the 3′ UTRs than in the 5′ UTRs of PRV (Table 1). G2-PQSs from the 3′ UTR were therefore inserted into the dual luciferase vector to examine their functions.
In the circular dichroism (CD) experiment, the PQS UL9-3′UTR-2G exhibited no G-quadruplex signal in 100 mM sodium solution, while it folded into a parallel G-quadruplex in the buffer containing 100 mM potassium (Figure 10A). The stability of the G-quadruplex increased with the addition of the potassium cation (Figure 10B). UL9-3′UTR-2G showed a higher sensitivity to the increase in potassium than the 3G mutant oligonucleotide obtained by replacing "GG" with "GGG" (Figure 10C). The insertion of PQS UL9-3′UTR-2G into the 3′ UTR of the Renilla luciferase gene in the psiCHECK-2 vector resulted in a 38% decrease in relative luciferase activity (Figure 11A). The PQS UL5-3′UTR-2G folded into a parallel G-quadruplex in a buffer with 100 mM K+ (Figure 10D), and the addition of the G-quadruplex ligand NMM stabilized the structure. The insertion of PQS UL5-3′UTR into the 3′ UTR of the Renilla luciferase gene in the psiCHECK-2 vector made no difference in protein expression (Figure 11A). The addition of NMM at 20 µM increased the relative luciferase activity (p < 0.01) (Figure 11B, Figure S5). The addition of other typical G-quadruplex ligands, such as PDS and BRACO-19, made no significant difference in structural stability (Figures S5, S6A, and S6B) or relative luciferase activity (Figures S5, S7A, and S7B). These results suggested that the G-quadruplexes formed from G2-PQSs in the 3′ UTR affected protein expression, and that G2-PQSs were more sensitive to different cations and ligands than G3-PQSs. These features make G2-quadruplexes suitable as sensitive switches for gene expression regulation in response to different environmental factors, such as various proteins, small molecules, and signal cations.

The G-Quadruplex Ligand Decreases the Virulence of PRV
G-quadruplex ligands were reported to interfere with biological processes related to tumor growth, by binding, stabilizing, converting, or unwinding G-quadruplex structures [41][42][43]. NMM at a series of concentrations (150 nM, 100 nM, 50 nM) was employed to treat the PRV-infected PK15 cells for 24 h. The plaque assay indicated that the NMM had the potential for reducing the virus titer at 24 h post-treatment ( Figure S8).

Discussion
A high density of G2-PQSs was distributed widely in the genomes of PRV, HHV-1, and HHV-3, and these sequences could provide sensitive regulatory switches that respond to environmental factors in gene expression regulation. Compared to the G3-quadruplexes, the G2-quadruplexes exhibited less thermal stability and higher sensitivity to loop size and composition, and they presented multiple conformations in different solutions [44]. As they are sensitive to the microenvironment in the cells, the G2-quadruplexes could act as receptors of specific proteins or metabolites, helping to identify the cells that are suitable for PRV latency. In this study, PQS UL9-3′UTR-2G exhibited a parallel structure in potassium solution, while it could not fold into a G-quadruplex in a sodium solution ( Figure 9A). The insertion of PQS UL9-3′UTR-2G into the psiCHECK-2 vector led to a decrease in reporter gene expression, while the insertion of PQS UL5-3′UTR-2G resulted in an increase in gene expression only with the addition of the ligand NMM (Figure 11, Figure S4). This structural sensitivity of the G-quadruplex allows the virus to be readily regulated by physiological cations, different small molecules, or proteins. These findings are conducive to revealing the latency-reactivation mechanism of PRV in specific tissues under certain conditions.
Highly conserved G2-PQSs were discovered in the coding regions of genes related to virus replication and maturation in the Varicellovirus genus. The formation of an RNA G2-quadruplex in the open reading frame of EBNA1 of EBV resulted in a downregulation of the expression level of the maintenance protein, facilitating virus escape from immune recognition [7]. The conserved G2-PQSs in the Varicellovirus genus were located in the unique immediate early protein (IE180), the viral DNA replication protein (UL30), the DNA cleavage and packaging proteins (UL33/UL17), and the tegument proteins (UL16/UL47). IE180 had five G2-PQSs that were conserved across all the Varicellovirus species (Table 2). IE180 is the unique immediate early gene of PRV and the first viral gene transcribed during PRV infection. IE180 is reported to mediate latency and reactivation by regulating early genes, and it activates US4, UL12 (alkaline exonuclease), UL22 (type I membrane protein), UL23 (thymidine kinase), and UL41 (RNase) [10]. Based on previous reports, we speculated that the formation of a G-quadruplex in the CDS region of IE180 would reduce immediate early protein levels, disturbing downstream gene expression in the life cycle of PRV. It could therefore be further inferred that the G2-quadruplex might have a significant regulatory effect in the Varicellovirus genus, especially on the immediate early protein ICP4, the homolog of the immediate early protein of PRV.
A high density of G2-PQSs in the untranslated regions of the PRV genes could provide cis-elements to regulate post-transcriptional and translational processes. Although the G3-quadruplexes in the 3′ UTR and 5′ UTR were widely reported to regulate human gene translation [5,[29][30][31]45], few reports on the G-quadruplexes in the untranslated regions of human herpesvirus genes are available. In this study, many G2-PQSs were found in the untranslated regions of herpesvirus genes (Figure 7). The G-quadruplexes formed by those G2-PQSs regulated gene expression in diverse ways, either under physiological conditions or with a stabilizing ligand (Figure 11). The data suggested that the formation of these G2-quadruplexes might result in a truncated 3′ UTR and, in turn, disturb 3′-end polyadenylation, finally leading to unstable virus mRNA. The G-quadruplex in the 3′ UTR was also reported to play a role in regulating microRNA binding in humans [46,47], suggesting that the G-quadruplex might be involved in the interaction between virus/host microRNAs and virus genes.
A large number of G2-PQSs and G3-PQSs in the repeat regions of the herpesvirus genomes might play an important role in genomic integration or recombination during herpesvirus latency. Some conserved G3-PQSs in herpesviruses were reported to play important roles in virus integration, latent DNA replication, and episomal persistence. Several herpesviruses, such as Marek's disease virus (MDV) [48], gallid herpesvirus 2 (GaHV-2) [49], and human herpesvirus 6 (HHV-6) [50], were found to establish latent infections with viral genomes integrated into telomere repeat tracts in host chromosomes through homologous recombination. In this study, the pseudorabies virus had 205 G3-PQSs in its genome, with 43.9% located in the repeat region (File S1). A high density of conserved G2-PQSs existed in the repeat regions between the two diverging transcripts IE180 and US1, which were close to the origin of replication (OriS) (Figure 5). Telomeres were reported to form G-quadruplex clusters with a repeated sequence of (GGGTTA)n [51]. This study found that the PRV genome had imperfect telomeric repeats, with the unit sequence 5′-GGGGTGGAGACGGTGGAGGGAGAGGGGAGTGGG-3′ repeated 12 times. Thus, these G2-PQSs were speculated to be related to genomic recombination and integration during latency.

Virus Sequences
The complete genome sequence of Suid herpesvirus 1 (pseudorabies virus, PRV) (NC_006151.1) was retrieved from the NCBI Genome database [24]. It was annotated with 12 feature types, including 69 CDS regions, 70 genes, four introns, 119 misc. feature regions, one misc. RNA, 20 polyA sites, 14 protein binding sites, 117 regulatory regions, 28 repeat regions, seven stem loop regions, one sequence tagged site (STS), and one variation. The UTRs were determined through alignment between the CDS sequence and the gene sequence. The sequences included in the gene sequences and located upstream of the CDS were designated as 5′ UTRs, and the sequences downstream of the CDS were defined as 3′ UTRs. The putative G-quadruplex sequences (PQSs) were searched in the standard genome sequence of PRV. The numbers of PQSs in the CDS, 3′ UTR, 5′ UTR, and repeat regions were counted separately.

Identification of Putative G-Quadruplex Sequences in the PRV Genome
The putative G3-quadruplex sequences were searched with the Quadparser software [3]. The putative G2-quadruplex sequence was defined as $G_{2+}N_{1-7}G_{2+}N_{1-7}G_{2+}N_{1-7}G_{2+}$, where G is guanine and N is any nucleotide, including G. With the same software, the putative I-motif sequences were searched with the formula $C_{2+}N_{1-7}C_{2+}N_{1-7}C_{2+}N_{1-7}C_{2+}$, where C is cytosine and N is any nucleotide, including C. Quadparser coded each sequence in the format x:y:z, where x stands for the number of guanine tracts or C-runs, y stands for the number of locations of putative G4 or I-motif formations, and z stands for the number of possible simultaneous G4 or I-motif structures.
The number of I-motif sequences in the sense strand was the same as the number of G2-PQSs in its complementary strand. The sum of the G2-PQSs and I-motif sequences in the sense strand was therefore the total number of G2-PQSs in the double-stranded genome of PRV. The density of G2-PQSs in the PRV standard genome was calculated by dividing the total number of G2-PQSs in the double-stranded genome of PRV by the genome size. If z = 1, the G2-PQS was counted as a PQS monomer. If z ≥ 2, the G2-PQS was counted as a PQS cluster.
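The search formula and the double-stranded density calculation described above can be illustrated with a regular expression. This is a minimal Python sketch under stated assumptions and is not the Quadparser implementation: the pattern, the function names, and the use of non-overlapping matches are hypothetical simplifications, and the x:y:z coding of monomers and clusters is not reproduced.

```python
import re

# Four runs of >= 2 guanines separated by loops of 1-7 nucleotides (any base).
# Illustrative pattern only; Quadparser's own matching rules may differ.
G2_PQS = re.compile(r"G{2,}(?:[ACGTN]{1,7}?G{2,}){3}")

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_g2_pqs(seq):
    """Return (start, end, sequence) for non-overlapping G2-PQS matches."""
    return [(m.start(), m.end(), m.group()) for m in G2_PQS.finditer(seq.upper())]


def g2_pqs_density_per_kb(seq):
    """Count G2-PQSs on both strands and divide by the sequence length in kb.

    Searching the reverse complement for G-runs is equivalent to searching
    the sense strand for C-runs (I-motif sequences), as described in the text.
    """
    total = len(find_g2_pqs(seq)) + len(find_g2_pqs(reverse_complement(seq)))
    return total / (len(seq) / 1000)


if __name__ == "__main__":
    demo = "TTGGAGGTGGCGGTT"  # toy sequence, not taken from the PRV genome
    print(find_g2_pqs(demo))
    print(g2_pqs_density_per_kb(demo))
```

Because overlapping candidates are merged into the first match found, counts from this sketch should not be expected to reproduce the numbers reported in Table 1.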

Comparison of the Distribution of G2-PQS and G3-PQS between PRV and Two Herpesviruses
G3-PQSs tend to be located in the repeat regions of human herpesvirus genomes, and some repeated G3-PQS clusters among the analyzed genomes were reported to be conserved [5]. In order to determine the distribution features of PQSs in the PRV genome, we compared PRV, human herpesvirus 1 (HHV-1), also known as herpes simplex virus type 1 (HSV-1), and human herpesvirus 3 (HHV-3), also known as varicella-zoster virus (VZV). The reference genome sequences of PRV (NC_006151.1), HHV-1 (NC_001806.2), and HHV-3 (NC_001348.1) were downloaded from NCBI [24] and saved as FASTA files. The genome features of these viruses were analyzed with the software BEDTools (https://github.com/arq5x/bedtools2) [52], and the lengths of the CDS regions, 3′ UTRs, 5′ UTRs, repeat regions, and latency-associated transcripts were recorded. The promoter region of each gene was predicted as the 1 kb upstream of its transcription start site. The PQSs in the CDS, latency-associated transcript, and repeat regions were counted on both the positive and negative strands, and the PQSs in the untranslated regions and promoters were counted according to the transcription direction of each gene. The numbers of G3-PQS monomers, G3-PQS clusters, G2-PQS monomers, and G2-PQS clusters were counted with the Quadparser software, which was modified as described in the previous section. The density of PQSs was calculated by dividing the total number of PQSs located in each region or gene by the length of the corresponding region or gene.
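The promoter definition and the per-region density described above can be sketched as follows. This Python example is an assumption-laden illustration rather than the BEDTools workflow used in the study: it uses 0-based, half-open coordinates, counts only PQSs that lie entirely inside a region, and the function names `promoter_window` and `density_per_kb` are hypothetical.

```python
def promoter_window(tss, strand, genome_length, length=1000):
    """Return the (start, end) interval covering `length` bp upstream of a TSS.

    Coordinates are 0-based and half-open; `tss` is the index of the first
    transcribed base. Upstream means lower coordinates on the '+' strand and
    higher coordinates on the '-' strand. The window is clipped to the genome.
    """
    if strand == "+":
        return max(0, tss - length), tss
    if strand == "-":
        return tss + 1, min(genome_length, tss + 1 + length)
    raise ValueError("strand must be '+' or '-'")


def density_per_kb(pqs_intervals, region_start, region_end):
    """Number of PQSs fully contained in a region, divided by its length in kb."""
    if region_end <= region_start:
        raise ValueError("region must have positive length")
    count = sum(
        1 for start, end in pqs_intervals
        if start >= region_start and end <= region_end
    )
    return count / ((region_end - region_start) / 1000)


if __name__ == "__main__":
    # Toy coordinates chosen for illustration only.
    start, end = promoter_window(1500, "+", genome_length=10000)
    print(density_per_kb([(100, 120), (600, 615), (1490, 1510)], start, end))
```

Whether a PQS that straddles a region boundary should be counted is a design choice; the sketch excludes such cases.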

Conservation of Putative G-Quadruplex Sequences and I-Motif Sequences in the PRV CDS Region in the Varicellovirus Genus
The reference genome sequences of 11 virus species from the Varicellovirus genus were downloaded from NCBI. All the PQSs and I-motif sequences in these genomes were searched with the Quadparser software and output into the file g4_cds.txt (File S3). The protein sequences in each genome were listed in the file common_protein.txt (File S4). Following data preparation, the CDS and amino acid sequences of all proteins in the eleven virus species were used for multiple sequence alignment with the MAFFT software (https://www.ebi.ac.uk/Tools/msa/mafft/). The identity of the aligned nucleotide and amino acid sequences of each protein was calculated with the infoalign program of the EMBOSS package (http://emboss.sourceforge.net/apps/release/6.6/emboss/apps/infoalign.html). The PQSs (GG**GG**GG**GG) and I-motif sequences (CC**CC**CC**CC) in the multiple alignment of nucleotide sequences were identified and counted, and the ratio of the common PQSs and I-motif sequences to the nucleotide sequences of all the proteins was then calculated and output as the conservation score of each PQS and I-motif sequence. The conservation of the PQSs in the PRV CDS regions within the Varicellovirus genus was then analyzed.

Percentage Conservation of Putative G-Quadruplex Sequences in the Repeat Regions and LLTs of the PRV Genomes
Twenty-five complete PRV genome sequences from different strains were downloaded from the NCBI Genome database [24]. These sequences included the reference Suid herpesvirus 1 genome (NC_006151.1). All of the putative G-quadruplex sequences located in the repeat regions or LLT of the reference genome sequence (Accession Number: NC_006151.1) were searched in the other 24 genome sequences, and the percentage conservation was calculated by dividing the number of genome sequences containing the same putative G-quadruplex sequence by the total number of genome sequences.
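The percentage conservation defined above amounts to an exact-match search across genomes. The Python sketch below illustrates the calculation under assumptions that the text does not specify: the function name `percent_conservation` is hypothetical, a match on either strand is accepted, and the denominator is simply the number of genome sequences passed in.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def percent_conservation(pqs, genomes):
    """Percentage of genome sequences containing the exact PQS on either strand.

    pqs: the putative G-quadruplex sequence from the reference genome.
    genomes: list of genome sequences (strings) to search.
    """
    if not genomes:
        raise ValueError("at least one genome sequence is required")
    query = pqs.upper()
    query_rc = query.translate(_COMPLEMENT)[::-1]  # reverse complement
    hits = sum(
        1 for genome in genomes
        if query in genome.upper() or query_rc in genome.upper()
    )
    return 100.0 * hits / len(genomes)


if __name__ == "__main__":
    # Toy sequences for illustration; not PRV data.
    toy_genomes = ["TTGGAGGTGGCGGTT", "AACCGCCACCTCCAA", "ACGTACGT", "GGAGGTGGCGG"]
    print(percent_conservation("GGAGGTGGCGG", toy_genomes))
```

Exact matching is strict: a single substitution in a loop counts as non-conserved here, even if the variant could still fold into a G-quadruplex.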

Oligonucleotide Folding Conditions
All oligonucleotides were purchased from Sangon Biotech (Shanghai, China) as salt-free, purified products and were dissolved in ddH2O to a concentration of 100 µM. The oligonucleotide sequences of PQSs in the regulatory regions were selected from the PQSs predicted by Quadparser (Table S3). G2-PQSs from the UL5 3′ UTR and UL9 3′ UTR, and the mutant sequences (Table S4), were folded under the same conditions, as follows. Oligonucleotides were diluted to 10 µM in 10 mM phosphate buffer at pH 7.0 supplemented with 100 mM KCl, heated to 95 °C for 5 min in a 1.5 mL Eppendorf tube in a water bath, slowly cooled to room temperature over ~8 h, and then used for spectra or stored at 4 °C.
Under the induced folding conditions for the G2-PQS from the UL9 3′ UTR, the G2-PQS was diluted to 10 µM in 10 mM Tris-HCl buffer at pH 7.0, supplemented with an increasing concentration of KCl (50 mM, 100 mM, and 150 mM). The G2-PQS in 10 mM sodium phosphate buffer at pH 7.0 supplemented with 100 mM NaCl was also tested. These samples were kept at room temperature for 30 min before the CD spectra were recorded.
Under the induced folding conditions for the G2-PQS from the UL5 3′ UTR, the G2-PQS was diluted to 10 µM in 10 mM phosphate buffer at pH 7.0, supplemented with 100 mM KCl, and then incubated with different ligands at room temperature for 30 min before CD spectroscopy.

Circular Dichroism Spectroscopy
CD spectra of the 10 µM folded oligonucleotide samples were collected at 25 °C on a JASCO 1500 CD spectrometer using a quartz cuvette with a 1 mm optical path. Data within a 200-320 nm range were collected using two scans at 100 nm/min with a 1 s settling time and 1 nm bandwidth. The buffer baseline was recorded with the same parameters and subtracted from the sample spectra before plotting.
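The processing described above (two scans per sample, followed by buffer baseline subtraction) can be sketched in a few lines of Python. This is an illustration only: averaging the two scans point by point is an assumption about how they were combined, and the function name `corrected_spectrum` is hypothetical.

```python
def corrected_spectrum(scans, baseline):
    """Average replicate scans point by point and subtract the buffer baseline.

    scans: list of scans, each a list of ellipticity values (mdeg) recorded
           at the same wavelengths.
    baseline: buffer-only spectrum recorded at the same wavelengths.
    """
    if not scans:
        raise ValueError("at least one scan is required")
    if any(len(scan) != len(baseline) for scan in scans):
        raise ValueError("scans and baseline must share the same wavelength grid")
    mean_scan = [sum(values) / len(scans) for values in zip(*scans)]
    return [signal - blank for signal, blank in zip(mean_scan, baseline)]


if __name__ == "__main__":
    # Toy values for two wavelengths; not measured data.
    print(corrected_spectrum([[1.0, 3.0], [3.0, 5.0]], [0.5, 1.0]))
```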

Cell Culture
The cell lines, human embryonic kidney (HEK) 293T cells, porcine kidney cells (PK-15), and bovine kidney cells (MDBK), were provided by the State Key Laboratory of Agricultural Microbiology, College of Veterinary Medicine, Huazhong Agricultural University in China. These cell lines were cultured in Dulbecco's modified Eagle medium (DMEM) containing 10% fetal bovine serum (FBS), 0.044 M NaHCO3, and 0.025 M HEPES. Cells were grown at 37 °C in a humidified atmosphere with 5% CO2.

Cytotoxicity Assay
The cytotoxicity of the G-quadruplex ligands N-methyl mesoporphyrin IX (NMM) [53] (J&K Scientific, Beijing, China), BRACO-19 [54] (Sigma-Aldrich, Saint Louis, MO, USA), and pyridostatin (PDS) [55] (J&K Scientific, Beijing, China) to HEK293T cells was determined by the MTT assay, which measures the mitochondrial dehydrogenase activity of viable cells. MTT, 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide, is a yellow tetrazole that is reduced to a purple formazan in living cells. The HEK293T cells were seeded into a 96-well microplate at a density of 1 × 10⁴ cells per well. When the cell confluence reached 90%, the growth medium was replaced with 100 µL of DMEM containing 2% FBS and G-quadruplex ligands at a series of final concentrations (Figure S5), and the cells were incubated at 37 °C for 24 h. Then, 50 µL of MTT solution was added into each well, and the cells were incubated at 37 °C for another 4 h. After incubation, the supernatant was removed and 150 µL of dimethyl sulfoxide (DMSO) was added into each well to dissolve the formazan. The relative cell viability was analyzed by measuring the absorbance of formazan at 570 nm on the Synergy HTX microplate reader (BioTek, Winooski, VT, USA). The ligand concentration applied in the subsequent dual luciferase assays was required to maintain a cell viability of more than 90%.
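The selection of a working ligand concentration from the MTT readings can be sketched as follows. The formula below, with optional blank subtraction and normalization to untreated control wells, is a common convention and an assumption here, since the text does not give the exact calculation; the function names and the toy absorbance values are hypothetical.

```python
def relative_viability(a_treated, a_control, a_blank=0.0):
    """Relative cell viability (%) from absorbance readings at 570 nm.

    Assumes viability = (treated - blank) / (control - blank) * 100.
    """
    if a_control <= a_blank:
        raise ValueError("control absorbance must exceed the blank")
    return 100.0 * (a_treated - a_blank) / (a_control - a_blank)


def highest_tolerated_concentration(absorbance_by_conc, a_control,
                                    a_blank=0.0, threshold=90.0):
    """Highest tested concentration whose viability exceeds the threshold.

    absorbance_by_conc: dict mapping ligand concentration to absorbance.
    Returns None if no tested concentration passes.
    """
    passing = [
        conc for conc, absorbance in absorbance_by_conc.items()
        if relative_viability(absorbance, a_control, a_blank) > threshold
    ]
    return max(passing) if passing else None


if __name__ == "__main__":
    # Toy readings for illustration; not data from Figure S5.
    readings = {5: 1.00, 10: 0.96, 20: 0.80}
    print(highest_tolerated_concentration(readings, a_control=1.0))
```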

Transfection and Dual Luciferase Assays
The HEK293T cells were seeded in 24-well plates at a density of 1 × 10⁵ cells/well. The cells were transfected with 0.8 µg of psiCHECK-2 reporter plasmid using Lipofectamine 2000 (Thermo Fisher Scientific, Carlsbad, CA, USA) according to the manufacturer's instructions. At 6 h after transfection, the medium was replaced and NMM at 20 µM, BRACO-19 at 10 µM, or PDS at 10 µM was added to the cells. The activities of firefly and Renilla luciferase were measured 24 h after the addition of the G-quadruplex ligands, using the Dual-Luciferase Reporter Assay Kit (Promega, Madison, WI, USA) on a GloMax 20/20 luminometer (Promega, Madison, WI, USA).
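The relative luciferase activity reported in the Results can be illustrated with a short calculation. The Python sketch below assumes the usual psiCHECK-2 convention in which the Renilla signal (the reporter carrying the inserted PQS) is divided by the firefly signal (the internal control) and then normalized to a control construct; this normalization and the function names are assumptions, not details stated in the text.

```python
def mean_ratio(readings):
    """Mean Renilla/firefly ratio over replicate wells.

    readings: list of (renilla, firefly) luminescence pairs.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    return sum(renilla / firefly for renilla, firefly in readings) / len(readings)


def relative_luciferase_activity(sample, control):
    """Sample Renilla/firefly ratio normalized to the control construct."""
    return mean_ratio(sample) / mean_ratio(control)


if __name__ == "__main__":
    # Toy luminescence values for illustration; not measured data.
    sample_wells = [(620.0, 1000.0), (620.0, 1000.0)]
    control_wells = [(1000.0, 1000.0)]
    activity = relative_luciferase_activity(sample_wells, control_wells)
    print(f"relative activity: {activity:.2f}")
```

With these toy values the relative activity is 0.62, which corresponds to the kind of 38% decrease described for the UL9 construct.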

Plaque Assay
The antiviral activity of NMM against the PRV Ea strain was examined through the plaque assay. PK15 cells were seeded in a 6-well plate at a density of 1.2 × 10⁶ cells per well. When cell confluence reached 80-90%, the cells were placed at 4 °C for 1 h, and then the PRV Ea strain virus solution (MOI = 5) was added into each well. The cells were then returned to 4 °C for 2 h, and the plate was shaken once every 15 min during the incubation. After the incubation at 4 °C, the virus solution was removed and replaced with 2 mL of DMEM supplemented with NMM at different final concentrations (150 nM, 100 nM, 50 nM). Afterwards, the cells were incubated at 37 °C for 24 h. After incubation, the infectious viral particles were isolated from each well and prepared for the plaque assay. MDBK cells were seeded in a 12-well plate at a density of 2 × 10⁵ cells per well. When the cell confluence reached 90%, the cells were used for the plaque assay. Exactly 200 µL of virus solution was added into each well, and the cells were incubated at 37 °C for 2 h. Then, the infected cells were covered with 4% sodium carboxymethylcellulose (CMC-Na) supplemented with 3% FBS and 1% penicillin-streptomycin solution. After incubation at 37 °C for another 48 h, the infected cells were fixed and stained with crystal violet solution (0.35%, w/v in ethanol) at room temperature. After 15 min, the crystal violet solution was removed, and the plate was washed with tap water. The plate was then dried in an oven for a few minutes. Finally, the viral titer was determined from the plaque counts.
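The conversion of a plaque count into a titer follows a standard formula, sketched below in Python. Only the 200 µL inoculum volume comes from the protocol above; the dilution factor in the example and the function name `titer_pfu_per_ml` are hypothetical, since the dilutions used are not stated.

```python
def titer_pfu_per_ml(plaque_count, dilution, inoculum_ml=0.2):
    """Viral titer in PFU/mL from a plaque count.

    plaque_count: number of plaques counted in the well.
    dilution: dilution of the virus stock in that well (e.g. 1e-4).
    inoculum_ml: inoculum volume in mL (200 uL per well in the protocol).
    """
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    return plaque_count / (dilution * inoculum_ml)


if __name__ == "__main__":
    # 50 plaques at an assumed 10^-4 dilution with a 0.2 mL inoculum.
    print(f"{titer_pfu_per_ml(50, 1e-4):.2e} PFU/mL")
```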

Statistical Analysis
A Student's t-test was used to determine the significance of differences in luciferase activity between the ligand treatment and the control for each construct. One-way analysis of variance (ANOVA) with Tukey's multiple comparison test was applied to determine the significance of differences among the various treatments in the dual luciferase assay.

Conclusions
In summary, the systematic analysis of the distribution of G2-PQSs in the PRV genomes provides a guide for detailed studies of their functions in the establishment of latency and reactivation in herpesviruses. We systematically analyzed the putative G2-quadruplex and G3-quadruplex sequences in the PRV genome, compared them with those of typical human herpesviruses in the same subfamily, and then evaluated the structures and functions of G2-quadruplexes. G2-quadruplex sequences, in the form of both monomers and clusters, were found to be distributed across the entire PRV genome, especially in the CDS, LLT, and repeat regions. Extremely conserved G2-quadruplex sequences existed in the CDS of the genes related to viral genome replication and maturation. G2-quadruplex sequences tended to be located in the repeat regions close to the origin of replication, which may contribute to genome replication and recombination. Most G2-quadruplexes from the regulatory regions formed parallel-type G-quadruplexes. There were more G2-quadruplex sequences in the 3′ UTRs than in the 5′ UTRs. These G2-quadruplexes showed different sensitivities to physiological cations and small molecules. Thus, it could be inferred that the G2-quadruplex could act as a switch to control the expression of genes involved in the establishment of virus latency, the viral genome replication cascade, and virus cell-to-cell movement. The G-quadruplex ligand NMM exhibited the potential to inhibit the proliferation of PRV in its host cells. These abundant and sensitive G2-quadruplexes could serve as a class of receptors that respond to intracellular environments, guiding herpesviruses to choose specific cells or conditions for latency or reactivation.