Transcriptional Control and mRNA Capping by the GDP Polyribonucleotidyltransferase Domain of the Rabies Virus Large Protein

Rabies virus (RABV) is a causative agent of a fatal neurological disease in humans and animals. The large (L) protein of RABV is a multifunctional RNA-dependent RNA polymerase, which is one of the most attractive targets for developing antiviral agents. A remarkable homology of the RABV L protein to its counterpart in vesicular stomatitis virus, a well-characterized rhabdovirus, suggests that it catalyzes mRNA processing reactions, such as 5′-capping, cap methylation, and 3′-polyadenylation, in addition to RNA synthesis. Recent breakthroughs in developing in vitro RNA synthesis and capping systems with a recombinant form of the RABV L protein have led to significant progress in our understanding of the molecular mechanisms of RABV RNA biogenesis. This review summarizes the functions of RABV replication proteins in transcription and replication, and highlights new insights into the roles of the unconventional mRNA capping enzyme domain of the RABV L protein, namely the GDP polyribonucleotidyltransferase domain, in mRNA capping and transcription initiation.


Introduction
Rabies virus (RABV) is a nonsegmented negative-strand (NNS) RNA virus belonging to the Lyssavirus genus of the Rhabdoviridae family in the order Mononegavirales (reviewed in References [1][2][3]). RABV is transmitted to humans from infected animals, mainly domestic dogs, through their saliva by biting or scratching, and causes an acute, fatal neurological disease, called rabies [2][3][4]. Although rabies is preventable by vaccination prior to or immediately after exposure, there are currently no established therapeutic countermeasures once patients are symptomatic [4]. RABV kills more than 50,000 people each year worldwide [5,6], posing continuing threats to human health, especially in developing countries with lower vaccination coverage in domestic dogs. In addition, rabies-like diseases are known to occur in humans by infection with other lyssaviruses, such as Duvenhage virus, European bat lyssaviruses 1 and 2, Australian bat lyssavirus, Mokola virus, and Irkut virus, although in rare cases [3,7]. Therefore, it is important to understand the basic biology of lyssaviruses, which can aid in development of therapeutic targets against them.
Studies on transcription and replication of rhabdoviruses have been carried out mainly using vesicular stomatitis virus (VSV, an arthropod-borne animal vesiculovirus) as a model (reviewed in Reference [8]), because VSV can be safely handled and shows the strongest RNA synthesis activity among rhabdoviruses as well as other NNS RNA viruses belonging to different families. Since purified RABV particles exhibited significantly lower RNA synthesis activities than those of VSV [9][10][11], it has remained difficult to biochemically characterize virion-associated enzymes involved in RNA biosynthesis. Recent advances in establishing in vitro RNA synthesis and capping assays for RABV have enabled us to elucidate the molecular mechanisms of viral RNA biosynthesis. In this review, we discuss the roles of RABV replication proteins in transcription and replication, and focus on recent studies regarding the unique RABV machineries required for mRNA capping and transcription initiation.

Transcription and Replication of the Rabies Virus (RABV) Genome
The RABV genome of approximately 11.9 kilonucleotides (nt, nucleotides) is composed of five genes encoding nucleocapsid (N), phospho-(P), matrix (M), glyco-(G), and large (L) proteins [12][13][14][15] that are preceded and followed by the noncoding 3′-leader (Le, 58 nt) and 5′-trailer (Tr, 70 nt) regions, respectively (Figure 1). Other lyssaviruses possess the same genomic organization as RABV [16,17]. The negative-strand Le (Le(−)) region on the genome and positive-strand Tr (Tr(+)) region on the anti-genome contain promoter elements required for synthesis of the anti-genome and genome, respectively [18][19][20][21]. The difference in promoter strength between Le(−) and Tr(+) determines the molar ratio of the genome to anti-genome of 49:1 in infected cells, eventually leading to packaging of the genome into virus particles more efficiently than the anti-genome [20]. Each lyssaviral gene begins and ends with conserved gene-start and gene-end sequences, respectively (Figure 2A), which may serve as signals for transcription initiation and polyadenylation/termination, respectively, as reported for VSV [22,23]. These lyssaviral transcriptional signals show strong sequence similarities to those of vesiculoviruses (Figure 2B). The genome is encapsidated with the N proteins to form a helical N-RNA complex [24], which acts as a template for transcription and replication. As demonstrated for VSV [25][26][27][28][29], the RABV RNA-dependent RNA polymerase (RdRp) complex composed of the catalytic L and cofactor P proteins is associated with the N-RNA complex to form a transcriptionally active ribonucleoprotein (RNP) complex [30][31][32][33][34]. In host cells, RABV forms cytoplasmic inclusion bodies, similar to Negri bodies, as liquid-like replication organelles, where viral RNA synthesis as well as RNP assembly takes place [35][36][37]. Similar liquid-like inclusion bodies formed in the cytoplasm of VSV-infected cells were suggested to serve as VSV replication sites [38].
Figure 1. Transcription and replication of the rabies virus (RABV) genome. The negative-strand RABV genome wrapped with the nucleocapsid (N) proteins serves as a template for transcription (lower) and replication (upper). The RNA-dependent RNA polymerase (RdRp, the complex between large (L) and phospho-(P) proteins) sequentially transcribes the leader region (Le) and five internal genes (N, P, M, G, and L) in the genome into the leader RNA (LeRNA) and five monocistronic mRNAs with the 5′-cap 1 and 3′-poly(A) structures by the stop-start transcription mechanism. The RdRp carries out encapsidation-coupled genome replication together with the complex of the RNA-free N protein with the P protein (N0-P). Partial RNA sequences of the genome and transcripts of RABV (PV strain, GenBank id: M13215) are shown. The conserved gene-start and gene-end sequences serve as transcription initiation and polyadenylation/termination signals, respectively. The conserved mRNA-start sequence acts as an mRNA capping signal. HO- and ppp indicate hydroxyl and triphosphate groups, respectively. m7GpppAm indicates N7-methylguanosine(5′)triphospho(5′)2′-O-methyladenosine (cap 1).
Figure 2. [...] Consensus sequences are shown under the logos. R is A or G; Y is U or C; W is A or U; K is G or U; M is A or C; B is C, G, or U; D is A, G, or U; H is A, C, or U; V is A, C, or G; N is any nucleotide. The RABV and VSV residues essential and important for transcription initiation or capping are marked by bold and regular asterisks, respectively [21][22][23][40][41][42][43][44][45][46].
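The degenerate nucleotide codes listed in the Figure 2 caption can be expanded mechanically, which is convenient when scanning sequences for consensus signals. The following is a minimal Python sketch under stated assumptions: the function names `iupac_to_regex` and `matches_consensus` are hypothetical helpers introduced here for illustration, and the example consensus AACAB is the lyssaviral mRNA-start sequence quoted elsewhere in this review.

```python
import re

# Degenerate RNA codes exactly as listed in the Figure 2 caption.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "UC", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}

def iupac_to_regex(consensus: str) -> str:
    """Translate a degenerate RNA consensus into a regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC_RNA[code]  # raises KeyError for symbols outside the table
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def matches_consensus(seq: str, consensus: str) -> bool:
    """Return True if seq (written 5'->3') begins with a match to the consensus."""
    return re.match(iupac_to_regex(consensus), seq.upper()) is not None

# Illustrative use with sequences quoted in the text:
# AACAC is an mRNA-start sequence, ACGCU is the LeRNA-start sequence.
print(iupac_to_regex("AACAB"))
print(matches_consensus("AACAC", "AACAB"), matches_consensus("ACGCU", "AACAB"))
```

The check is purely a string comparison; it says nothing about how strongly a given position contributes to transcription initiation or capping.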
For genome replication, the RABV RdRp ignores the signals for mRNA synthesis on the genome to copy it into the positive-strand anti-genome, which is in turn used as a template for synthesis of the genome (Figure 1, upper). The genome and anti-genome are each co-replicationally encapsidated with the N proteins to form the N-RNA templates. As proposed for VSV [61][62][63][64], a complex of an RNA-free N (called N0) protein with the P protein, accumulated in RABV-infected cells, may play an essential role in co-replicational nucleocapsid assembly. An N-terminal portion (residues 4-40) of the RABV P protein interacts with the N protein, keeping it in an RNA-free soluble form [65]. Thus, it can be suggested that the N-terminal N0-binding domain of the P protein is required for its chaperoning activity that delivers the N0 protein to the replication products. For VSV, selective encapsidation of LeRNA with the N proteins was suggested to trigger a mode switch from transcription to replication, leading to encapsidation-coupled elongation of LeRNA to the full-length anti-genome [66,67]. The RABV N protein interacts with an A-rich sequence (residues 20-30) in LeRNA more selectively than with unrelated RNAs in the presence of the P protein [68], suggesting that the selective LeRNA encapsidation is carried out with the N0-P complex and thereby leads to genome replication. In addition, the RABV M protein inhibits transcription but stimulates replication, suggesting that the M protein regulates mode switching between transcription and replication [69,70].

Rabies Virus (RABV) Replication Proteins
The RABV N protein (450 amino acids, Figure 3A) encapsidates the genome and anti-genome to generate competent templates for RNA synthesis and to protect them from cellular ribonucleases. Recombinant rhabdoviral N proteins are known to be assembled with cellular RNAs into closed ring-like structures containing 10 ± 1 N subunits as well as helical nucleocapsid-like structures, when expressed in insect cells [24,31] or E. coli [71,72]. X-ray crystallographic analyses of the ring-like N-RNA complexes of RABV and VSV revealed that the N protein is composed of N- and C-terminal domains, which are oriented in an angled conformation to form an RNA-binding groove (Figure 3B) [72,73]. In the N-RNA complexes, each N subunit is associated with neighboring N subunits using unique N-terminal arm-like and C-terminal loop structures extended from the N- and C-terminal domains, respectively (Figure 3C) [72,73]. Both the N-arm and C-loop structures of the VSV N protein are required for its oligomerization and RNA encapsidation activities [74]. Nine nucleotides of RNA are covered within an RNA-binding groove in each N subunit, in which basic amino acid residues are associated with the RNA phosphate backbone [72,73]. Transcription and replication of a VSV mini-replicon were shown to be abolished or diminished by mutations of the phosphate-binding basic amino acid residues [75]. The C-loops of two adjacent N subunits in the VSV N-RNA complex provide a binding site for the C-terminal domain of the P protein [76]. A similar mode of binding of the C-terminal domain of the RABV P protein to the N-RNA template was proposed from molecular modeling studies [33]. Phosphorylation of the RABV N protein at the serine residue at position 389 (S389) is required for efficient transcription and replication [77].
The RABV P protein (297 amino acids, Figure 3D) is an elongated dimeric protein with structured and unstructured regions and is phosphorylated by cellular kinases [79][80][81]. The P protein plays [...] The N-terminal (residues 4-40) and C-terminal (residues 186-297) domains of the RABV P protein interact with the N0 protein [65,82] and N-RNA complex [32,33], respectively (Figure 3E). For VSV, the N-terminal N0-binding site of the P protein binds adjacent to the RNA-binding groove of the N0 protein and a binding site for the N-arm of an adjacent N subunit [78], thereby preventing nucleocapsid formation. Two independent regions (residues 1-19 and 40-100) of the RABV P protein were reported to bind to a C-terminal part of the L protein [30,34], while the region that stimulates the RdRp activity of the L protein resides within residues 11-50 [83]. The dimerization domain of the RABV P protein is located in a central region (residues 90-133) [84], but residues 65-175, including the dimerization domain, are dispensable for transcription of an RABV mini-genome in cultured cells [85]. The RABV P protein contains a binding site (residues 139-151) for cellular dynein light chain 1 (DLC1, also called dynein light chain LC8) [86][87][88], which is required for efficient genome transcription rather than for the originally proposed retrograde axonal transport of RABV in neuronal cells [89]. Cellular focal adhesion kinase also interacts with the central dimerization domain (residues 106-131) of the RABV P protein, and positively regulates the RABV RNA synthesis activity [90]. In addition, the RABV P protein and its N-terminal truncation isoforms (called P2-P5) are known to counteract functions of various cellular factors involved in antiviral responses, such as IRF-3, IRF-7, STAT1, and PML (reviewed in [91]). Residues 176-186 of the RABV P protein are required for inhibition of IRF3/7 activation [92]. The C-terminal domain of the RABV P protein includes binding sites for STAT1 (residues 187-297) [92][93][94] and PML (residues 223-297) [95].
The RABV L protein (2127 amino acids, Figure 4A) is a multi-domain protein that may catalyze all enzymatic reactions required for RNA synthesis and processing (mRNA 5′-capping, cap methylation, and 3′-polyadenylation) as reported for other NNS RNA viral L proteins [43,45,[96][97][98][99][100][101][102][103]. At present, for RABV, only RNA synthesis and capping activities have been experimentally demonstrated using recombinant forms of the RABV L protein [21,46,83]. The RABV P protein interacts with the C-terminal part of the L protein [30,34,104], though its precise binding site is still not known. The RABV P protein stimulates transcription initiation [21] as well as elongation [83] mediated by the L protein. The RABV L protein exhibits significant similarities to the well-characterized VSV L protein throughout the entire protein at the amino acid sequence level [105][106][107], suggesting that it has the same domain organization and enzymatic functions as the VSV L protein. Since information on a three-dimensional structure of the RABV L protein is not available, its structural model was generated using the structure of the VSV L protein solved by cryo-electron microscopy (PDB id: 5A22) [108] as a template (Figure 4B). As suggested for the VSV L protein [8,108,109], the RABV L protein was predicted to have an N-terminal (NTD, composed of subdomains I and II), RdRp, bridge, mRNA capping enzyme (GDP polyribonucleotidyltransferase, PRNTase), connector (CD), methyltransferase (MTase), and C-terminal (CTD) domains. The RdRp domain of the RABV L protein as well as the VSV L protein [108] is composed of putative fingers, palm, and thumb subdomains, and contains structural motifs A-F for RdRps [8,105,[110][111][112].
The palm subdomain of the RABV L protein possesses two universally conserved aspartate residues in motifs A (D618) and C (D729), which may serve as catalytic residues for two-metal dependent nucleotide polymerization as proposed for other polymerases [113][114][115][116][117][118][119]. As predicted, D729 of the RABV L protein was shown to be required for its RNA synthesis activity in vitro [21] as well as in cellula [120]. Interestingly, residues 1079 to 1083 of the RABV L protein interact with cellular DLC1, which stimulates RABV transcription in infected cells and recruits the L protein to reorganized microtubules when overexpressed in transfected cells [121]. However, the degree of microtubule-reorganization with the RABV L protein in infected cells and its precise role(s) in viral replication and pathogenesis have not been studied.

mRNA Capping in Eukaryotes and Rhabdoviruses
The eukaryotic mRNA cap is a unique 5′-terminal block structure, in which m7G is linked to the first nucleoside (N1) of mRNA through an inverted 5′-5′ triphosphate bridge (reviewed in [124][125][126][127][128]). In eukaryotic cells, the cap 0 structure (m7GpppN1-) is formed on pre-mRNA with sequential enzymatic reactions coupled with mRNA chain elongation by DNA-dependent RNA polymerase II (pol II) in the nucleus (Figure 5A) [125][126][127][128]. The cap 0 structure is essential for mRNA biosynthesis and metabolism at various steps, such as mRNA splicing, export, translation, and degradation [124,125,128]. Eukaryotic mRNA capping enzymes are associated with the largest subunit of pol II in an early stage of transcription and carry out co-transcriptional mRNA capping with their RNA 5′-triphosphatase (RTPase) and mRNA guanylyltransferase (GTase) activities [125][126][127][128]. The RTPase activity removes the γ-phosphate group from 5′-triphosphate-ended RNA (pppRNA), producing 5′-diphosphate-ended RNA (ppRNA) and inorganic phosphate (Pi). The GTase activity subsequently transfers the GMP moiety from GTP to ppN1-RNA via a covalent enzyme-[...] [124]. Importantly, N1-2′-O-methylation of the cap structure is required for avoiding anti-viral innate immune reactions (reviewed in [129,130]). In higher eukaryotic cells, cap 0-mRNAs are sensed as non-self RNAs by viral RNA sensors, such as RIG-I [131,132] and MDA5 [133], resulting in production of interferon followed by anti-viral factors. The cap 0 structure on viral mRNAs is further recognized by the interferon-inducible IFIT1 protein together with its related proteins to block their translation [134][135][136][137][138][139]. In addition, when N1 is adenosine, the cap structure is often methylated at the adenine-N6 position [140,141].
The unconventional mechanism of rhabdoviral mRNA capping, discovered in vesiculoviruses, such as VSV [43,47,142] and Chandipura virus [143], is strikingly different from the conventional mechanism of eukaryotic mRNA capping (reviewed in [107,144]). In the first step of the VSV capping reaction, GTP is hydrolyzed into GDP by a guanosine 5′-triphosphatase (GTPase) activity associated with the VSV L protein [43,44]. In the second step, the PRNTase domain in the VSV L protein transfers 5′-monophosphate-ended RNA (pRNA) from pppRNA (pRNA donor) to GDP (pRNA acceptor) through a covalent enzyme-(histidyl-Nε)-pRNA (called L-pRNA) intermediate to generate GpppRNA [43,145]. The VSV L protein specifically recognizes pppRNA, but not ppRNA, with the vesiculoviral mRNA start-sequence, 5′-A1R2C3N4G5 (R: A or G; N: any nucleotide; the subscript numbers indicate the positions of the nucleotide residues from the 5′-end) (Figure 2B), in which A1R2C3 and G5 are essential and important, respectively, for the pRNA transfer reaction at the step of the L-pRNA intermediate formation [43,44,146]. In contrast, the VSV L protein is not able to form an L-pRNA intermediate with the VSV LeRNA start-sequence, 5′-ACGAA [146], explaining why the VSV L protein caps VSV mRNAs but not LeRNA [43,47,142,147]. The PRNTase domain of the VSV L protein employs GDP, but not the other three NDPs, as the pRNA acceptor [145]. The pRNA acceptor activity of GDP requires the C2-amino group of guanine and 2′ or 3′-hydroxyl group of ribose, but not C6-oxo group, N1-hydrogen, or N7-nitrogen [148]. The PRNTase domain also accepts GTP to generate a tetraphosphate containing cap, G(5′)pppp(5′)A, although to a lesser extent than GDP, when GTP hydrolysis is a rate-limiting step in capping under in vitro conditions [44]. However, the GDP production step appears to be omitted for the GpppA formation in infected cells, because intracellular concentrations of GDP are usually 3-4 orders of magnitude higher than the Km for the pRNA acceptor, GDP (0.03 µM) [148].
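The saturation argument behind the last sentence can be made explicit with a short calculation. The sketch below assumes simple Michaelis-Menten behavior for the pRNA acceptor, which is an assumption made here for illustration; the source only reports the Km for GDP (0.03 µM) and the 3-4 orders of magnitude excess of intracellular GDP. The function name `fractional_saturation` is hypothetical.

```python
def fractional_saturation(substrate_um: float, km_um: float) -> float:
    """Fraction of maximal velocity, v/Vmax = [S] / (Km + [S]).

    Assumes simple Michaelis-Menten kinetics (an illustrative assumption).
    Concentrations are in micromolar.
    """
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("concentrations must be non-negative and Km positive")
    return substrate_um / (km_um + substrate_um)

KM_GDP_UM = 0.03  # Km for the pRNA acceptor GDP, as quoted in the text

# GDP at 3 and 4 orders of magnitude above Km, the range quoted in the text.
for fold in (1e3, 1e4):
    gdp_um = KM_GDP_UM * fold
    print(f"[GDP] = {gdp_um:g} uM -> v/Vmax = {fractional_saturation(gdp_um, KM_GDP_UM):.5f}")
```

Under this assumption the acceptor site would be more than 99.9% saturated across the quoted range, which is consistent with the view that GDP supply is not limiting for GpppA formation in infected cells.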
Although there is no direct evidence, the 5′-ends of the RABV mRNAs are thought to be capped and methylated into the cap 1 structure (m7GpppAm-) and/or more extensively methylated forms (e.g., m7G(5′)ppp(5′)AmpAm-, m7G(5′)ppp(5′)m6Ampm6Am-) in the cytoplasm of infected cells as reported for the VSV mRNAs [47,55,56]. However, it remains challenging to analyze cap structures on RABV mRNAs synthesized either in vitro or in cellula due to their very low quantities [9][10][11]. The recent development of methods for expression and purification of an enzymatically active recombinant form of the RABV L protein allowed us to demonstrate that it shows GTPase and PRNTase activities to generate the cap structure in vitro [46] (Figure 5B). The latter activity caps pppRNA, but not ppRNA, with GDP in a sequence-dependent manner [46]. As shown in Figure 2A, all the known lyssaviral mRNAs begin with the conserved 5′-A1A2C3A4B5 (B: C, U, or G). Consistently, the RABV L protein employs pppRNAs with the lyssaviral mRNA start-sequences (e.g., AACAC, AACAU), but not with the LeRNA-start sequence (ACGCU), as pRNA donor substrates [46]. The RABV L protein strictly recognizes the first three nucleotides, A1A2C3, of the pRNA donors, in which A2 cannot be replaced with G [46], manifesting a specificity for lyssaviral mRNAs that is slightly different from that of the VSV L protein [43,44]. Since the PRNTase domain of the RABV L protein exhibits a very low specific capping activity, approximately 600-fold lower than that of the VSV L protein [46], it remains particularly difficult to demonstrate the formation of the putative RABV L-pRNA intermediate.
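The difference in donor specificity between the two L proteins can be summarized as positional rules on the 5′-terminal sequence. The following Python sketch encodes only the essential positions stated in the text (A1R2C3 for VSV, A1A2C3 for RABV); it is a qualitative filter, not a kinetic model, and it ignores the "important" but non-essential G5 position of VSV. The function names are hypothetical, and the sequence AGCAG is an invented example used only to show the A2/G2 difference.

```python
def vsv_donor_ok(five_prime: str) -> bool:
    """Essential positions for the VSV L protein per the text: A1, R2 (A or G), C3."""
    s = five_prime.upper()
    return len(s) >= 3 and s[0] == "A" and s[1] in ("A", "G") and s[2] == "C"

def rabv_donor_ok(five_prime: str) -> bool:
    """Strict requirement for the RABV L protein per the text: A1, A2, C3."""
    return five_prime.upper().startswith("AAC")

# Sequences quoted in the text, plus one hypothetical sequence (AGCAG).
for seq in ("AACAC", "AACAU", "ACGCU", "ACGAA", "AGCAG"):
    print(seq, "VSV:", vsv_donor_ok(seq), "RABV:", rabv_donor_ok(seq))
```

A sequence with G at position 2 passes the VSV rule but fails the RABV rule, which is the "slightly different" specificity described above.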
On the other hand, it is not known whether the putative MTase domain in the RABV L protein catalyzes cap methylation. The single cap MTase domain of the VSV L protein was suggested to catalyze sequential N1-2′-O- and G-N7-methylation reactions (GpppA- → GpppAm- → m7GpppAm-), in the opposite order to that of eukaryotic cap MTases [97,98,101,[149][150][151]. The putative RABV MTase domain was predicted to have a SAM-dependent MTase core fold (Figure 4) similar to that of the VSV L protein [108] and contains a glycine-rich SAM binding motif G[−]GxG ([−], negatively charged amino acids, 1704-GDGSG-1708) and an N-2′-O-MTase motif, namely the K-D-K-E catalytic tetrad (K1685-D1797-K1829-E1867) [106]. Recombinant RABV expressing the L protein with a mutation in the 2′-O-MTase motif (e.g., K1685A, K1829A) is severely attenuated and more sensitive to IFIT2, an IFIT1-related anti-viral protein, than wild-type virus [152]. Given these observations, the putative RABV MTase domain can be suggested to methylate the cap structure of RABV mRNAs at the two positions (Figure 5B), rendering them more translatable and resistant to anti-viral factors in infected cells.
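The G[−]GxG notation for the SAM binding motif maps directly onto a pattern search with residue numbering. The sketch below is a minimal illustration, assuming that [−] means Asp or Glu and that x is any residue; the helper name `find_sam_motifs` and the toy peptide are hypothetical, and only the pentapeptide GDGSG with its position 1704-1708 comes from the text.

```python
import re

# G[-]GxG: glycine, acidic residue (D or E), glycine, any residue, glycine.
SAM_MOTIF = re.compile(r"G[DE]G.G")

def find_sam_motifs(protein: str, first_residue_number: int = 1):
    """Return (start, end, match) tuples using 1-based residue numbering.

    first_residue_number is the residue number of protein[0], so a fragment
    can be scanned while reporting positions in full-length numbering.
    Only non-overlapping matches are reported (behavior of re.finditer).
    """
    hits = []
    for m in SAM_MOTIF.finditer(protein.upper()):
        start = first_residue_number + m.start()
        hits.append((start, start + len(m.group()) - 1, m.group()))
    return hits

# The motif quoted in the text, numbered from residue 1704.
print(find_sam_motifs("GDGSG", first_residue_number=1704))
# A made-up peptide, for illustration only.
print(find_sam_motifs("MKLGEGAGTT"))
```

A pattern hit of this kind only flags a candidate site; it does not by itself show SAM binding or methyltransferase activity.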

Roles of the Rabies Virus (RABV) GDP Polyribonucleotidyltransferase (PRNTase) Domain in RNA Biosynthesis
The putative RABV PRNTase domain (residues 1093-1349) shares five conserved motifs, Rx(3)Wx(3-8)ΦxGxζx(P/A) (motif A), (Y/W)ΦGSxT (motif B), W (motif C), HR (motif D), and ζxxΦx(F/Y)QxxΦ (motif E) (Φ, hydrophobic; ζ, hydrophilic amino acids) with those in L proteins of NNS RNA viruses belonging to the order Mononegavirales [8,107,144,153] (Figure 6A). In the flat PRNTase domain of the VSV L protein (PDB id: 5A22) [108], motifs B-E compose a unique active site with a putative substrate binding cavity, whereas motif A seems to provide a platform for the active site organization [8,153]. The putative RABV PRNTase domain was predicted to fold into a VSV PRNTase-like structure with a putative active site surrounded by motifs B-E (Figure 6B). As reported for the VSV L protein [153], G1112 in motif A, T1170 in motif B, W1201 in motif C, H1241 and R1242 in motif D, and F1285 and Q1286 in motif E were identified as essential for the PRNTase activity of the RABV L protein [46]. Similar mutations in the VSV PRNTase motifs are lethal to VSV [153,154]. Mass spectrometric and biochemical analyses of the VSV L-pRNA intermediate revealed that the Nε2 position of H1227 in motif D is covalently linked to the 5′-monophosphate end of the RNA with the VSV mRNA-start sequence via a phosphoamide bond [145]. Therefore, the RABV counterpart (H1241) of the VSV H1227 residue can be predicted to serve as a covalent pRNA attachment site for the putative L-pRNA intermediate formation. The nucleophilic histidine residue in motif D may attack the α-phosphorus in the 5′-triphosphate group of the pRNA donor, resulting in the formation of the L-pRNA intermediate with concomitant release of inorganic pyrophosphate (PPi). Roles of other key residues in motifs B-E in interactions with the substrates and products in the two-step ping-pong capping reaction were recently predicted based on an in silico docking study [8], but await experimental verification.
The structural model of the putative RABV PRNTase domain suggests that it possesses a large loop structure flanking the PRNTase motif B (Figure 6B), which corresponds to a "priming loop" proposed for the VSV L protein [108]. De novo initiating RdRps of viruses (e.g., bacteriophage Φ6, reovirus, dengue virus, hepatitis C virus, influenza virus) often have a priming loop, which facilitates primer-independent initiation of RNA synthesis from 3′-termini of viral genomes by stabilizing their initiation complexes with initiator and incoming nucleotides [155][156][157][158][159]. As reported in the structure of the VSV L protein [108], the putative RABV priming loop was predicted to extend from the PRNTase domain toward the active site center of the RABV RdRp domain. To analyze the roles of the loop structure of RABV as well as VSV, effects of mutations in the loop on RNA synthesis and capping were examined [21]. The results showed that a TxΨ (Ψ, aliphatic amino acids) motif (T1174-x-L1176 for RABV, T1161-x-I1163 for VSV) on the loop is required for RNA capping, whereas a conserved tryptophan (W) residue (W1180 for RABV, W1167 for VSV) is essential for terminal de novo initiation from position 1 of the Le(−) promoter (3′-HO-U1G2-) to carry out the first phosphodiester bond formation (synthesis of pppAC) in a template-dependent manner [21]. In contrast, the TxΨ motif and W residue are dispensable for transcription initiation and capping, respectively [21]. These findings indicate that the putative loop structure extended from the PRNTase domain, named the "priming-capping loop", plays dual roles in transcription initiation and mRNA capping. Both the W residue and TxΨ motif are conserved in L proteins of rhabdoviruses infecting animals and/or arthropods, but not in those of other NNS RNA viruses, indicating their specific functions in these rhabdoviruses [8,21].
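The separation of functions described above (capping versus terminal de novo initiation) can be summarized as a small lookup table. The following Python sketch is a qualitative bookkeeping aid only: it lists the RABV residues that the text reports as required for each activity and treats each listed residue as individually required, which is a simplifying assumption (for the TxΨ motif the text states that the motif as a whole is required). The names `REQUIRED_FOR` and `predicted_activities` are hypothetical.

```python
# RABV L protein residues reported in the text as required for each activity.
# Treating every residue as individually required is a simplifying assumption.
REQUIRED_FOR = {
    "capping": {
        "G1112", "T1170", "W1201", "H1241", "R1242", "F1285", "Q1286",  # PRNTase motifs A-E
        "T1174", "L1176",                                               # TxΨ motif on the loop
    },
    "terminal_de_novo_initiation": {"W1180"},                          # priming-capping loop
}

def predicted_activities(mutated_residues):
    """Map each activity to True if none of its required residues is mutated."""
    mutated = set(mutated_residues)
    return {
        activity: required.isdisjoint(mutated)
        for activity, required in REQUIRED_FOR.items()
    }

print(predicted_activities(["W1180"]))  # loop tryptophan mutant
print(predicted_activities(["T1174"]))  # TxΨ motif mutant
```

The table deliberately omits anything the text does not state, so an activity marked True for a given mutant means only that no requirement listed here is violated.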
Based on the structures of the bacteriophage Φ6 initiation complex (PDB id: 1HI0) [155] and the apo form of the VSV L protein (PDB id: 5A22) [108], we previously modeled the VSV initiation complex [21]. Using the VSV complex (Model Archive id: ma-5k432) as a template, we modeled the RABV initiation complex with the RABV RNA template (3′-U1G2C3G4), initiator ATP, incoming CTP, and divalent cations (2 Mg2+ and 1 Mn2+) (Figure 7A) with the secondary structure matching (SSM) tool in COOT [160] and energy minimization in PHENIX [161]. Similar to the VSV model, ATP and CTP were predicted to be base-paired with the U1 and G2 residues of the RABV model template, respectively, and to be located adjacent to the active site aspartate residues (D618 and D729 in motifs A and C, respectively) in the palm subdomain of the RABV RdRp domain. The D729 residue was suggested to be associated with the α-phosphate of CTP sitting adjacent to a coordinated Mg2+ ion. This model also suggests that the E546 and R552 residues in the putative fingertips structure (motif F) interact with the C4-amino group and α-phosphate, respectively, of CTP, and the phenyl group of the F554 residue sits stacked in-line with the U1 and G2 bases of the model template. Furthermore, the adenine base of ATP was predicted to be sandwiched between the cytosine base of CTP and the indole group of the W1180 residue on the priming-capping loop via π-stacking interactions. All these putative interactions appear to be critical for the formation of the terminal de novo initiation complex of RABV.
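The base-pairing assignment in the model (ATP opposite U1, CTP opposite G2) follows from Watson-Crick complementarity and can be checked with a few lines of code. This is a minimal sketch; the function name `nascent_rna` is hypothetical, and the template is the model template 3′-UGCG-5′ given in the text.

```python
# Watson-Crick pairing between template base and incoming nucleotide.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def nascent_rna(template_3_to_5: str) -> str:
    """Return the product written 5'->3' for a template written 3'->5'.

    Because the two strands are antiparallel, reading the template 3'->5'
    and the product 5'->3' keeps paired positions aligned, so no reversal
    is needed.
    """
    return "".join(PAIR[base] for base in template_3_to_5.upper())

template = "UGCG"                  # model template, 3'-U1 G2 C3 G4-5'
product = nascent_rna(template)    # nucleotides added opposite positions 1 to 4
initiator, incoming = product[0], product[1]
print(product, "initiator:", initiator + "TP", "incoming:", incoming + "TP")
```

The first two product nucleotides correspond to the initiator and incoming NTPs, and joining them gives the first phosphodiester bond of the pppAC dinucleotide.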
On the other hand, the W1180 residue on the priming-capping loop extending from the PRNTase domain of the RABV L protein is dispensable for internal initiation from the N gene-start sequence as well as from a gene-start-like sequence present in the Le(−) promoter (see Figure 1) [21]. It is intriguing to note that a proline residue on the priming loop extending from the thumb subdomain of the influenza A virus RdRp is required for terminal initiation from the genomic promoter, but not for internal initiation from the anti-genomic promoter [159]. Therefore, it is apparent that the mechanism of internal initiation is different from that of priming loop-dependent terminal initiation by these negative-strand RNA viral RdRps. Using an in vitro transcription system with oligo-RNAs as templates, the RABV RdRp was shown to use a 3′-A₋₁/U₊₁UGUNG-5′ sequence as an internal initiation sequence [21]. The internal initiation signals in the RABV gene-start sequences are employed to produce 5′-pppAAC-started pre-mRNA, which is subsequently capped by the PRNTase domain of the RABV L protein [46]. However, it is currently not known whether the gene-start-like sequence in the Le(−) promoter serves as an internal initiation signal to synthesize 5′-pppAAC-started RNA(s) in infected cells.
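As a purely illustrative aid, the internal initiation signal described above can be expressed as a simple pattern search. The following Python sketch assumes that the notation 3′-A₋₁/U₊₁UGUNG-5′ denotes an A at position −1 followed by UUGUNG starting at position +1 (N, any nucleotide). The function names and the toy template are hypothetical and are not taken from [21]; the sketch only shows how template-dependent pairing yields the expected 5′ ends.

```python
import re

# Watson-Crick complement for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed reading of 3'-A(-1)/U(+1)UGUNG-5':
# an A at position -1 (lookbehind), then UUGUNG beginning at position +1.
MOTIF = re.compile(r"(?<=A)UUGU[ACGU]G")

# Hypothetical toy template written 3'->5' (not a real RABV sequence).
TOY_TEMPLATE = "CGAUUGUCGAA"


def find_internal_starts(template_3to5):
    """Return 0-based indices of candidate +1 positions in a 3'->5' template."""
    return [m.start() for m in MOTIF.finditer(template_3to5)]


def nascent_rna(template_3to5, start, length):
    """Return the 5'->3' product templated by `length` bases from `start`."""
    window = template_3to5[start:start + length]
    return "".join(COMPLEMENT[base] for base in window)


if __name__ == "__main__":
    for pos in find_internal_starts(TOY_TEMPLATE):
        # First three templated nucleotides of an internally initiated RNA.
        print(pos, nascent_rna(TOY_TEMPLATE, pos, 3))
    # First two templated nucleotides at the 3'-terminal UG of the model template.
    print(nascent_rna("UGCG", 0, 2))
```

With this toy template, the search reports a single candidate +1 position, and the first three templated nucleotides are AAC, consistent with a 5′-pppAAC start. Applying the same complement function to the first two positions of the 3′-UGCG-5′ model template gives AC, consistent with pppAC synthesis during terminal initiation.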
Figure 7. (A) The structure of an RABV terminal de novo initiation complex with a model RNA template (3′-UGCG-5′, white carbon backbone), initiator and incoming nucleotides (ATP and CTP, respectively; yellow carbon backbone), and divalent metal ions (Mg²⁺, purple; Mn²⁺, obscured) was modeled from the vesicular stomatitis virus initiation complex (Model Archive id: ma-5k432) based on the structure in [21]. The RNA-dependent RNA polymerase (RdRp) subdomains and PRNTase/priming loop of the RABV L protein are colored as in Figure 4. Key amino acid residues are shown as stick models. The coordinates of the modeled RABV terminal de novo initiation complex have been uploaded to the Model Archive (id: ma-uqibu). (B) A model for de novo transcription initiation and pre-mRNA capping by the RABV L protein is schematically presented. Only the PRNTase (pale green) and RdRp (pale red) domains of the L protein are depicted with the priming-capping loop (orange). Circled capital letters indicate the following amino acid residues: "T", "H", and "R" in the PRNTase domain are T1170, H1241, and R1242, respectively; "T", "L", and "W" on the priming-capping loop are T1174, L1176, and W1180, respectively; left and right "D"s in the RdRp domain are D618 and D729, respectively. The 3′-terminal UG sequence of the Le(−) promoter and the internal UUG sequence of the N gene-start sequence are shown on the genome (black thick line). pppA and pppC indicate ATP and CTP, respectively. The nucleocapsid (N) and phospho-(P) proteins are shown in pale blue and pale yellow, respectively.
Our biochemical data, combined with the structural models generated here and in the recent study [21], suggest that the priming-capping loop of the PRNTase domain in rhabdoviral L proteins performs dual functions in sequential stop-start transcription (Figure 7B). In the step of terminal de novo initiation for LeRNA synthesis, the conserved W residue on the priming-capping loop stabilizes the RdRp complex with ATP and CTP at the 3′-terminal UG sequence of the genome to mediate the first phosphodiester bond formation (pppApC synthesis). To elongate and release LeRNA, the priming-capping loop may be retracted from the active site cavity of the RdRp domain. In the step of internal de novo initiation for mRNA synthesis at the gene-start sequence, the W residue on the priming-capping loop is no longer required for the first phosphodiester bond formation (pppApA synthesis). The TxΨ motif on the priming-capping loop of RABV as well as VSV is critical for capping of 5′-pppApApC-started RNAs. For VSV, the TxΨ motif was shown to be essential for L-pRNA intermediate formation with pppAACAG, the VSV mRNA-start sequence [21]. These observations suggest that a conformational change in the priming-capping loop may bring the TxΨ motif close to the PRNTase active site, leading to a distinct configuration that recognizes the rhabdovirus-specific mRNA-start sequence for pre-mRNA capping. If the VSV L protein carries a mutation in the TxΨ motif, or in the PRNTase motifs, that abolishes L-pRNA intermediate formation, it frequently terminates and reinitiates transcription using suboptimal termination and initiation signals within the N gene, producing unusual 5′-triphosphorylated N mRNA fragments [21,153,154].
Therefore, the L-pRNA intermediate formation mediated by the priming-capping loop during mRNA chain elongation is a key step leading to progression of the downstream events, such as the pRNA transfer to form the 5′-cap structure, continuous elongation, accurate 3′-polyadenylation and termination at the gene-end sequence, and eventually release of mature mRNA.

Concluding Remarks
Studies on RNA synthesis and processing by the RABV RdRp have been limited over the past four decades by the lack of efficient in vitro transcription systems. As described in this review, the recent development of in vitro RNA synthesis and capping systems with the recombinant RABV L protein has allowed us to reveal the enzymatic and regulatory roles of the L protein in transcription initiation and capping. The enzymatically active recombinant RABV L protein may open up new opportunities to elucidate the roles of the putative MTase domain in cap methylation and to locate the binding site for the co-factor P protein. Further elucidation of the molecular mechanisms of mRNA and genome/antigenome biosynthesis by the RABV L protein, together with structural analyses of the protein, would help in the development of antiviral agents targeting its essential domains.
Author Contributions: T.O. conceived and wrote the manuscript, and prepared the figures. T.J.G. performed the structure modeling, prepared the structural images, and edited the manuscript.
Funding: This work was supported by funding from Case Western Reserve University and grants from the National Institutes of Health to TO (AI093569) and to TJG (AI116738).