Complete Nucleotide Analysis of the Structural Genome of the Infectious Bronchitis Virus Strain Md27 Reveals its Mosaic Nature

Infectious bronchitis virus (IBV) causes highly contagious respiratory or urogenital tract diseases in chickens. The Maryland 27 (Md27) strain was first isolated in 1976 from diseased chicken flocks in the Delmarva Peninsula region. To understand the genetic diversity and phylogenetic relationship of existing strains with Md27, the complete nucleotide sequence of the 3′-end coding region (∼7.2 kb) of Md27 was determined and compared with other IBV strains and coronaviruses. Md27 has the same 5′-S-3-M-5-N-3′ gene order as other IBV strains. The spike gene of Md27 exhibits 97% identity with the SE17 strain. There are deletions in the spike gene, in the non-coding region between the M and 5 genes, and in the 3′ untranslated region (UTR), a pattern that differs from that of Ark-like strains. Phylogenetic analysis and sequence alignments demonstrate that Md27 is a chimera containing different gene segments that are most closely related to the SE17, Conn and JMK strains. This study provides evidence for genomic mutations and intergenic recombination that have taken place in the evolution of IBV strain Md27.


Introduction
Infectious bronchitis virus (IBV), a member of the family Coronaviridae, order Nidovirales, is a pathogen of domestic chickens that causes acute, highly contagious respiratory disease [1,2]. The IBV genome is a single, positive-strand RNA molecule of about 27.6 kb, with a cap at the 5' end and a poly(A) tail at the 3' end [3]. It comprises ten open reading frames (ORFs), and the first 20 kb of the genome is made up of ORF1, the replicase gene. The replicase gene has two ORFs, 1a and 1b [4], of which 1b is expressed as part of the 1ab polyprotein by a -1 ribosomal frameshifting mechanism [4]. ORF1 encodes non-structural proteins associated with RNA replication and transcription. The IBV genome codes for four major structural proteins: the spike (S) glycoprotein, the small envelope (E) protein, the membrane (M) glycoprotein, and the nucleocapsid (N) protein [5]. In addition, IBV has genes that encode non-structural proteins interspersed among the structural genes, namely 3a, 3b, 5a and 5b [6].
Numerous IBV serotypes, such as Arkansas, Massachusetts, Connecticut, Florida, Georgia and others, referred to as variants, exist in the United States of America (USA). These variants have practical significance for controlling the disease because immunity following infection or vaccination with one serotype often is not protective against subsequent infections with heterologous serotypes [7,8]. Many serotypes have been described for IBV, probably due to the frequent point mutations that occur in RNA viruses and also due to recombination events [9][10][11]. It is essential to characterize field isolates for the selection of appropriate vaccine strains.
The late Dr. Warren Marquardt (Department of Veterinary Medicine, University of Maryland) received samples from poultry diagnostic laboratories on the Delmarva Peninsula from 1971 through 1974 [12]. Of the 106 isolates made, three were not neutralized by any of the antisera tested. Two of these appeared to be identical on the basis of a serum neutralization test and were designated Maryland 27 (Md27). The Md27 antiserum has an unusually broad spectrum of minor cross-reactions with the other viruses [12]. The Md27 strain has never been characterized by sequencing, and it is essential to sequence such strains to understand the geographic evolution of IBV and to implement an effective vaccination program to control new variant IBV strains. Here, we describe the complete sequence analysis of the 7.27 kb 3' end of the genome of the Md27 IBV strain, and its comparison and phylogenetic relationship with many heterologous IBV strains and other coronaviruses from different parts of the world.

Results and Discussion
The structural organization of Md27 and its nucleotide identity with other IBV strains are shown in Tables 1 and 2, respectively. The spike protein gene of Md27 is 3507 nucleotides long (1168 aa) and codes for a protein of approximately 128.6 kDa. The transcription regulating sequence (TRS) CTTAACAAA is located 49 nucleotides from the start codon of the spike gene. The spike protein is most likely glycosylated and has 22 potential N-glycosylation sites (Asn-Xaa-Ser/Thr), as predicted by the NetNGlyc 1.0 server (http://www.cbs.dtu.dk/services/NetNGlyc/). Hydrophobicity analysis (http://www.cbs.dtu.dk/services/TMHMM) predicted one transmembrane (TM) domain (1100-1122 aa) in the C-terminus of the spike protein and one endodomain (1122-1168 aa). The spike protein is cleaved into S1 and S2, of which S1 induces neutralizing and serotype-specific antibodies [13,14]. The spike protein cleavage site of Md27 (His-Arg-Ser-Arg-Arg/Ser) is located between residues 544 and 545. The S1 protein is closely related to that of the SE17 strain (96% identity) and exhibits <86% identity with other IBV strains. The S2 gene is more conserved than S1 and has 98% identity with the Ark DPI, Gray, Jilin, JMK and SE17 strains. The S2 subunit may also induce serotype-specific neutralizing antibodies, and S2 subunits are conserved within a serotype but not between serotypes [15]. There is a notable deletion of 3 nucleotides in the S1 gene between nucleotides 69 and 70, which is different from other Ark-like strains.
Table 1. Structural genome organization of Md27*. * The first nucleotide of the start codon of the spike gene is counted as nucleotide 1 of the structural genome.
Gene 3 codes for two non-structural proteins (3a and 3b) and one structural protein (E), which is a small membrane protein of coronaviruses. Genes 3a and 3b start at the last nucleotide of the S and 3a genes, respectively. Gene 3b overlaps 3c (the E gene) by 20 nucleotides and 3a has a signal sequence at the very 5' end (nucleotides . 
The small membrane protein E, encoded by a 107 nucleotides long ORF, is approximately 12 kDa in size. A TRS (CTGAACAAT) is located 20 nucleotides upstream of the ATG codon. Gene 3a of Md27 shows 97% identity with the Conn, Cal99 and Ark99 strains. The 3b gene has 98% identity with the Ark DPI, Ark99, Cal99, Conn, HK and Jilin strains, whereas gene E shows 98% identity with Ark DPI, Conn, HK and Jilin. The comparative sequence alignment indicates that the E protein gene is more conserved than the genes coding for 3a and 3b. Gene 4 codes for the matrix (M) protein of 223 aa, which is the most abundant envelope glycoprotein of coronaviruses, and it overlaps the E gene by 23 nucleotides. The molecular mass of the M protein depends on its glycosylation and varies from 25 to 33 kDa [16]. The M protein of Md27 contains a single putative N-linked glycosylation site at amino acid position 4. The M proteins of coronaviruses differ in their type of glycosylation: in IBV it is N-linked, whereas in MHV it is O-linked [13,17]. Gene M of Md27 shows 99% identity with Ark DPI, Conn, HK and Jilin. RNA 5 of Md27 is dicistronic and encodes two non-structural proteins, 5a and 5b.
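The Asn-Xaa-Ser/Thr motif used above to count potential N-glycosylation sites can be illustrated with a simple pattern scan. The following is a minimal sketch only: it is not the NetNGlyc predictor, which scores sequons with a trained model, and the function name and toy peptide are hypothetical. It assumes the common convention that Xaa may be any residue except proline.

```python
import re

# N-linked sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons (e.g. "NNTS") to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based position of the Asn in each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, not a sequence from Md27.
toy_peptide = "MPNETNCTNPSA"
print(find_sequons(toy_peptide))  # N3 and N6 qualify; N9 is followed by Pro
```

A plain motif scan of this kind lists candidate sites only; whether a given site is actually glycosylated has to be established by prediction or experiment.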
There is a 350-nucleotide-long non-coding region between genes M and 5 with a deletion of 9 nucleotides, which is similar to Mass-type IBV strains. Gene 5 is more conserved than any of the other genes of Md27, and it is nearly identical (98-99%) to that of other strains. RNA 6, the smallest but most abundant RNA in IBV-infected cells, encodes the N protein. The N protein is the only phosphorylated structural protein in coronaviruses [18]. It has many basic residues, and serine residues account for 8 to 10% of its amino acids. The abundance of serine residues accounts for their specific phosphorylation [19,20]. The N protein of Md27 contains 33 serine residues. The N gene has 96% identity with the Ark DPI, Ark99, CK/CH/LSD/05I, Conn, HK and Jilin strains.
Md27 has a 483-nucleotide-long 3'UTR. Interestingly, there is a deletion of 22 nucleotides in the 5' terminus of the 3'UTR (Figure 1), which shows some similarity to the Chinese strains A2, BJ and SAIBb2. In IBV, the C- and N-terminal regions of the N protein (but not the central region) interact with the 3' end of the non-coding region of IBV genomic RNA [21]. This deletion suggests possible involvement of the 3'UTR in replication, which may influence pathogenesis of the virus. The 3'UTR of Md27 is closely related (97-98% identity) to those of Ark DPI, DE072, Beaudette, Gray, M41, Jilin and CU-T2. The portion of the 3'UTR (~161 nucleotides) immediately downstream of the N gene is U-rich and is a highly variable region of the coronavirus genome. It has been observed that the 3'UTR is involved in genome replication of coronaviruses, despite its variable sequence and length [22].
The phylogenetic relationship of specific genes of Md27 with other IBV strains is illustrated in Figures 2 and 3. Comparisons are made for the structural genes (S1, S2, E, M and N) because nucleotide differences between strains mainly occur in these regions [23][24][25][26]. The S1 and S2 genes of Md27 cluster with the SE17 strain, and in the E and M gene phylogenies Md27 clusters with Ark DPI, Jilin, Conn and HK. In the N gene phylogeny Md27 clusters with HK, Conn, Ind/TN/92/03 and H52 (Figure 3). Comparison of the structural genes of Md27 with other coronaviruses demonstrates high sequence identity with turkey coronavirus (TCoV) and IBVs from peafowl and partridge (Table 3). Among the four structural genes, M is highly conserved between coronaviruses, whereas the E protein is less conserved. The complete structural genome analysis of Md27 suggests that it is a chimera of virus genomes represented by the SE17, Conn, JMK and Ark99 IBV strains (see Figure 4), which were circulating during that time period [12,27].
The S1 and S2 genes of Md27 exhibit 96% and 98% identity, respectively, with SE17. Most of the nucleotide differences between Md27 and SE17 are in the S1 gene, especially in the hypervariable regions (HVR). The remainder of the structural sequence is not available for SE17, so it is difficult to draw conclusions about the genetic relationship of the other genes of Md27 with SE17. Based on the limited data available (as many other strains are not yet sequenced), it is conceivable that the spike gene is derived from SE17 by recombination. Md27 shares close homology (96-99%) with the Conn strain from gene 3a to the end of the N gene, which suggests that most of the Md27 structural genome was derived from a Conn strain. Table 2 shows the comparison of the entire 7 kb structural genome of Md27, which reveals that Ark DPI and Jilin share high levels of sequence identity with Md27. Earlier we reported that Jilin is actually an Ark DPI strain [28]. Ark DPI was isolated during the 1980s [27], and it was demonstrated that Ark DPI is a direct derivative of the Conn strain [28]. This study suggests that Md27 and Ark DPI are derived from the same ancestor but diverged independently by recombination with different strains. Several factors drive IBV evolution, mainly a high mutation rate due to the absence of proofreading mechanisms, and recombination between strains because of the widespread use of live vaccines, immunological pressure and frequent mixed infections [29][30][31]. Our data provide evidence that both genomic mutations and intergenic recombination have taken place in the evolution of the Md27 IBV strain and its descendants.
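The gene-by-gene comparison described above, in which each region of Md27 is most similar to a different reference strain, can be sketched as a per-segment identity calculation over an alignment. This is an illustrative sketch under stated assumptions, not the procedure used in this study (identities here were obtained with BLAST and Vector NTI). The function names, the toy sequences and the segment coordinates are hypothetical, and identity is defined as matches divided by aligned columns, skipping columns where both sequences carry a gap.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical columns between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Skip columns where both sequences have a gap (assumed convention).
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(pairs)


def best_match_per_segment(query, references, segments):
    """Return, for each segment, the closest reference and its identity.

    segments maps a name to (start, end) alignment columns,
    0-based with the end exclusive.
    """
    result = {}
    for name, (start, end) in segments.items():
        scores = {ref: percent_identity(query[start:end], seq[start:end])
                  for ref, seq in references.items()}
        best = max(scores, key=scores.get)
        result[name] = (best, round(scores[best], 1))
    return result


# Hypothetical toy alignment: the query matches refA in the first half
# and refB in the second half, the pattern expected for a mosaic genome.
query = "AAAAAAAAAACCCCCCCCCC"
references = {"refA": "AAAAAAAAAAGGGGGGGGGG",
              "refB": "TTTTTTTTTTCCCCCCCCCC"}
segments = {"seg1": (0, 10), "seg2": (10, 20)}
print(best_match_per_segment(query, references, segments))
```

A change in the best-matching reference between adjacent segments is only a hint of recombination; it does not locate the breakpoint or exclude other explanations such as uneven sampling of reference strains.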

Virus
Virus isolation was done in Dr. Marquardt's laboratory by passaging tissue (trachea or kidney) homogenates into 9-10-day-old chicken embryos by the allantoic cavity route of inoculation, and at least three passages were done before declaring samples negative. Two of the isolates were named Md27; these were propagated by inoculation into 9-day-old embryonated specific-pathogen-free (SPF) chicken eggs, and allantoic fluid was collected 72 h post-inoculation. The fluid was clarified by low-speed centrifugation and the clear supernatant was stored at -80 °C until further use.
[Figure: schematic of the Md27 structural genome (S1, S2, 3a, 3b, E, M, 5, N, 3'UTR) with segments labeled SE17-like, SE17/JMK-like, Conn-like, Ark99/Gray/? and Conn-like; X marks a putative recombination site.]

RNA extraction and amplification
Genomic RNA was extracted from virus-infected allantoic fluid with the Qiagen (Valencia, CA, USA) RNeasy kit, following the manufacturer's instructions, and stored at -80 °C until further use. Oligonucleotides were designed based on the Ark DPI11 sequence (GenBank accession No. EU418976). Overlapping primers were designed such that each primer pair covered approximately 2 kb of the genome. All gene fragments were amplified using an RT-PCR kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions, and the RT-PCR products were cloned into the pCR2.1 TOPO TA ® vector (Invitrogen, CA). To determine the 3'-terminus of Md27 genomic RNA, reverse transcription was carried out with a reverse poly(T) primer (5'-GCGGCCGCTTTTTTTTTTTTTTTTTT-3') and an internal gene-specific forward primer.
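The overlapping-amplicon design described above can be sketched as a simple tiling of the target region. This is only an illustration: the function name is hypothetical, the 200-nucleotide overlap is an assumed value that is not reported here, and real primer placement depends on sequence features that a coordinate tiling ignores.

```python
def tile_amplicons(region_length: int, amplicon_size: int = 2000,
                   overlap: int = 200) -> list[tuple[int, int]]:
    """Return (start, end) windows, 0-based with end exclusive, that cover
    the region with amplicons of about amplicon_size sharing `overlap` nt."""
    if not 0 <= overlap < amplicon_size:
        raise ValueError("overlap must be >= 0 and smaller than amplicon_size")
    step = amplicon_size - overlap
    tiles = []
    start = 0
    while True:
        end = min(start + amplicon_size, region_length)
        tiles.append((start, end))
        if end == region_length:
            break
        start += step
    return tiles


# A region of about 7.27 kb with ~2 kb amplicons; the overlap is assumed.
for start, end in tile_amplicons(7270):
    print(start, end, end - start)
```

With these assumed settings the region is covered by four amplicons, the last one shorter than the others because it is clipped at the end of the region.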

DNA sequence analysis
DNA from various clones was sequenced by the dideoxy chain termination method using an automated DNA sequencer (Applied Biosystems Inc., Foster City, CA). At least three independent clones were sequenced for each amplicon to exclude errors that can arise during the RT and PCR reactions. The assembly of contiguous sequences was performed with the GeneDoc software [32]. Comparative sequence analyses of Md27 with other IBVs and coronaviruses were conducted using BLAST searches (NCBI) and Vector NTI Advance 10. Phylogenetic analyses were carried out using the MEGA4 program [33]. The phylogenetic trees were constructed from aligned nucleotide and amino acid sequences using the neighbor-joining method with 1000 bootstrap replicates.
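The reason for sequencing at least three independent clones per amplicon can be illustrated with a majority-vote consensus: an error introduced by RT or PCR in one clone is outvoted by the other two. The sketch below is an assumption about how such a consensus could be formed, since the rule used to resolve disagreements between clones is not stated here. The function name and the toy reads are hypothetical, and the clone sequences are assumed to be already aligned to equal length.

```python
from collections import Counter


def majority_consensus(clones: list[str], ambiguous: str = "N") -> str:
    """Column-wise majority consensus of aligned, equal-length clone reads.

    A base is called only if more than half of the clones agree on it;
    otherwise the `ambiguous` symbol is emitted.
    """
    if not clones:
        raise ValueError("at least one clone sequence is required")
    length = len(clones[0])
    if any(len(c) != length for c in clones):
        raise ValueError("clone sequences must have the same aligned length")
    consensus = []
    for column in zip(*clones):
        base, count = Counter(b.upper() for b in column).most_common(1)[0]
        consensus.append(base if count > len(clones) / 2 else ambiguous)
    return "".join(consensus)


# Hypothetical toy reads: the third clone carries a single-base error.
print(majority_consensus(["ATGCTA", "ATGCTA", "ATGATA"]))
```

Positions at which all clones disagree are reported as ambiguous rather than guessed, which flags them for re-sequencing.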

GenBank accession numbers
The GenBank accession number for the Md27 sequence is FJ008695. The accession numbers for other IBV gene sequences used in this study are as follows: (a) complete structural genomes:

Conclusions
In this study we analyzed ~7.2 kb of the 3' end of the genome of IBV strain Md27 and compared it with other IBV strains and coronaviruses from different parts of the world. This analysis suggests that Md27 evolved by recombination among different strains that were circulating during the same time period. This study demonstrates that recombination is the major evolutionary mechanism by which infectious bronchitis virus creates new strains and variants.