A Renewed Appreciation of Helicoverpa armigera Nucleopolyhedrovirus BJ (Formerly Helicoverpa assulta Nucleopolyhedrovirus) with Whole Genome Sequencing

Helicoverpa assulta is a pest that causes severe damage to tobacco, pepper and other cash crops. A local strain, HearNPV-BJ (formerly Helicoverpa assulta nucleopolyhedrovirus, HeasNPV-DJ0031), was isolated from infected H. assulta larvae in Beijing and had been regarded as a new kind of baculovirus in previous studies. Describing the biological characteristics of the strain, including its external morphology, internal structure and the pathological characteristics of its infection of various cell lines, can provide references for the identification and function of the virus. The HearNPV-BJ virion was defined as a single-nucleocapsid nucleopolyhedrovirus by electron microscopy. The QB-Ha-E-5 (H. armigera) and BCIRL-Hz-AM1 (H. zea) cell lines were sensitive to HearNPV-BJ. Modern sequencing technology further improves the understanding of various strains. The whole genome of HearNPV-BJ was sequenced and analyzed. The genome of the HearNPV-BJ isolate was 129,800 bp in length with a G + C content of 38.87% and contained 128 open reading frames (ORFs) encoding predicted proteins of 50 or more amino acids, with 67 ORFs in the forward orientation and 61 ORFs in the reverse orientation. The genome shared 99% sequence identity with Helicoverpa armigera nucleopolyhedrovirus C1 strain (HearNPV-C1), and 103 ORFs had very high homology with published HearNPV sequences. Two bro genes and three hrs were found to be dispersed along the HearNPV-BJ genome. Three ORFs whose closest homologs are in HearNPV were shorter because of an earlier stop codon; their functions are unknown. P6.9 of HearNPV-BJ, a structural protein, is distinctly different from that of Autographa californica nucleopolyhedrovirus (AcMNPV); its homology with the corresponding gene in HearNPV-C1 was 93.58%.
HearNPV-BJ contains the 38 core genes identified in other baculoviruses, and phylogenetic analysis indicates that HearNPV-BJ belongs to Alphabaculovirus Group II, the same as HearNPV-C1. The resulting data provide a better understanding of virion structure, gene function and the characteristics of infection. The whole-genome sequencing data and the Kimura 2-parameter distances provide more evidence that HearNPV-BJ may be a variant of Helicoverpa armigera nucleopolyhedrovirus, which also deepens our understanding of the virus species demarcation criteria.


Introduction
The Baculoviridae are a family of arthropod-specific viruses that belong to the new order Lefavirales and the new class Naldaviricetes, with a redefined and further clarified taxonomic status in Virus Taxonomy: 2020 Release [1]. Traditionally, the family was classified into two genera: Nucleopolyhedrovirus (NPV) and Granulovirus (GV). Based on phylogenetic analysis of baculovirus core genes, the Baculoviridae are now divided into four genera, Alphabaculovirus (lepidopteran-specific NPV), Betabaculovirus (lepidopteran-specific GV), Gammabaculovirus (hymenopteran-specific NPV) and Deltabaculovirus (dipteran-specific NPV), as revised in the Ninth Report of the International Committee on Taxonomy of Viruses [2]. Baculoviruses predominantly infect insects, and more than 600 insect species have been reported to be infected by baculoviruses. Of note, Helicoverpa armigera nucleopolyhedrovirus (HearNPV) has been widely applied to control pests of cotton and vegetable crops in China since 1994, which was the first time that the study of an insect virus gained commercial success in China [3,4]. Because ecological niches are complex, virus species demarcation criteria are of particular importance. In many cases, the host species of origin has been used as the naming criterion, with the result that the same virus isolated from different insect hosts has been given different names. Helicoverpa armigera nucleopolyhedrovirus G4 (HearNPV-G4), Helicoverpa armigera nucleopolyhedrovirus C1 strain (HearNPV-C1) and Helicoverpa zea single nucleocapsid nucleopolyhedrovirus (HzSNPV), isolated from different hosts, were actually found to be variants of the same virus species by the Kimura 2-parameter substitution model [5][6][7].
Helicoverpa assulta, belonging to Helicoverpa (Lepidoptera: Noctuidae), is a pest characterized by its worldwide distribution and high prolificacy. H. assulta and H. armigera are phylogenetically close. Unlike H. armigera, H. assulta is an oligophagous insect that severely damages a range of plants in the Solanaceae family, such as tobacco and pepper. H. assulta and H. armigera are reproductively isolated in nature. Nevertheless, changes in sex pheromone components and their ratio could contribute to interspecific hybridization between them [8]. It is therefore possible that the same virus could be isolated from both H. assulta and H. armigera. Under traditional naming conventions, such a virus would be treated as different viruses, which can become a stumbling block to its further understanding. With the development of sequencing technology, whole-genome sequencing data now provide robust evidence for virus species demarcation. Baculoviruses are numerous in nature, but only a small fraction of them have been studied.
Knowledge of the molecular biology of Baculoviridae is important due to their worldwide distribution and can be used to provide models for genetic regulatory networks and genome evolution. Collectively, our sequencing data and analysis allow a reassessment of HearNPV-BJ (formerly Helicoverpa assulta nucleopolyhedrovirus, HeasNPV-DJ0031), isolated from Helicoverpa assulta larvae. Herein, we focus on the morphology, virion infectivity and complete nucleotide sequence of HearNPV-BJ. This study also deepens our understanding of virus species demarcation criteria, and a comprehensive comparison at the molecular level will help explain the different infectious activities of similar baculoviruses, providing more data for baculovirus phylogenetic analysis and genetic classification.

Insect Cell Lines and Virus Isolates
The HearNPV-BJ strain, held under collection number HeasNPV-DJ0031, was originally isolated in Beijing, China, in 1993 and has been preserved by the Institute of Zoology, Chinese Academy of Sciences. Occlusion-derived virions (ODVs) were released from occlusion bodies by alkaline treatment and were purified by step sucrose density gradient centrifugation, with modifications, as described previously [9,10]. The morphology of the purified virus, together with a preliminary evaluation of its purity and quantity, was examined by scanning electron microscopy (SEM) using Hitachi-SU8010 and transmission electron microscopy (TEM) using JEM-1400 (JEOL, Tokyo, Japan) [11].
Each cell line was seeded in culture flasks and left overnight before viral infection. HearNPV-BJ was diluted in serum-free medium to achieve a multiplicity of infection (M.O.I.) of 0.5 PFU/cell.
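The dilution step can be illustrated with a short calculation. The following is a minimal sketch, assuming the stock titer and cell count are known; the function name, the cell number and the titer below are hypothetical values chosen for illustration and are not taken from this study.

```python
def inoculum_volume_ml(cells: float, moi: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (mL) that delivers `moi` PFU per cell.

    cells            -- number of cells in the flask at the time of infection
    moi              -- desired multiplicity of infection (PFU per cell)
    titer_pfu_per_ml -- titer of the virus stock (PFU per mL)
    """
    if cells <= 0 or moi < 0 or titer_pfu_per_ml <= 0:
        raise ValueError("cells and titer must be positive, moi non-negative")
    return cells * moi / titer_pfu_per_ml


if __name__ == "__main__":
    # Hypothetical example: 2e6 cells, stock at 1e8 PFU/mL, M.O.I. of 0.5 PFU/cell.
    volume_ml = inoculum_volume_ml(cells=2e6, moi=0.5, titer_pfu_per_ml=1e8)
    print(f"{volume_ml * 1000:.1f} uL of stock, made up to volume with serum-free medium")
```

With these assumed inputs, 1e6 PFU are required, which corresponds to 10 µL of the hypothetical stock.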

Sequencing and ORF Finding
An Illumina paired-end (PE) library of HearNPV-BJ was constructed and sequenced on the Illumina MiSeq platform. The combined scaffold was generated from the sequencing reads by computational analysis. Ambiguous regions and gaps in the assembled sequence were further verified by sequencing PCR products.
ORFs encoding 50 or more amino acids were considered to be protein-coding and were assigned as putative genes. The predicted ORFs were annotated by homology using NCBI BLAST. According to a recently adopted convention, the adenine residue of the translational initiation codon of the polyhedrin gene was designated as the zero point of the physical map of HearNPV-BJ DNA.
ORFs were predicted with GeneMark version 2.9, with the parameters set so that predicted genes were longer than 150 bases. During analysis, the software also provides gene information based on the sequences upstream and downstream of each ORF. Predictions supported by such sequence features were considered more credible.
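The length criterion can be made concrete with a simple six-frame scan. The sketch below is only an illustration of the "50 or more codons" rule and is not the gene-prediction algorithm of the software used here: it reports the longest ATG-to-stop stretch in each frame, treats the sequence as linear (ORFs spanning the origin of a circular genome are ignored), and uses the hypothetical names `find_orfs` and `min_codons`.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 50):
    """Return sorted (start, end, strand) tuples for ATG..stop ORFs.

    An ORF is kept when it has at least `min_codons` codons, counting the
    start codon and excluding the stop codon. Coordinates are 0-based,
    end-exclusive, and always given on the forward strand.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        end = i + 3  # include the stop codon in the span
                        if strand == "+":
                            orfs.append((start, end, "+"))
                        else:
                            orfs.append((n - end, n - start, "-"))
                    start = None
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence: ATG + 50 alanine codons + stop = 51 codons before the stop.
    toy = "ATG" + "GCT" * 50 + "TAA"
    print(find_orfs(toy))
```

Counting forward-strand and reverse-strand hits separately gives the kind of orientation tally reported for the genome, although a dedicated gene finder also weighs coding potential and sequence context.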

Structural Characteristics of the HearNPV-BJ Virion
With the exception of Gammabaculovirus (hymenopteran-specific NPV), baculoviruses show two distinct virion phenotypes in the viral life cycle [20,21]: budded virus (BV) and occlusion-derived virion (ODV). BVs spread the infection from cell to cell; virions that are later occluded within polyhedrin protein form the other phenotype, ODVs, which spread the infection from animal to animal. Mature ODVs are embedded in a protein matrix to form occlusion bodies (OBs), which protect them from environmental damage. In terms of virion structure, Alphabaculovirus differs considerably from Betabaculovirus. Alphabaculovirus OBs are polyhedral in shape and have diameters from 0.1 to 15 µm [22]. Betabaculovirus OBs are ovoid, range from about 300 nm to 500 nm [23] and are much smaller than Alphabaculovirus OBs. Furthermore, nucleopolyhedroviruses such as Helicoverpa armigera nucleopolyhedrovirus are frequently subdivided by the extent of aggregation of their nucleocapsids within the envelope; some are single-nucleocapsid nucleopolyhedroviruses (SNPVs), whereas others are multiple-nucleocapsid nucleopolyhedroviruses (MNPVs). The OB size varies among different strains of Helicoverpa armigera nucleopolyhedrovirus, and the diameter ranges from 0.3 µm to 3 µm [24,25]. Our results showed that the OBs of HearNPV-BJ are approximately 1 µm in diameter and appeared completely enveloped and occluded under SEM. This is very similar to the C1 strain [11,26]. Under TEM, the HearNPV-BJ virion, measuring about 220 nm in length and 40 nm in width, was defined as a single-nucleocapsid nucleopolyhedrovirus (SNPV) because each virion packaged a single nucleocapsid, and multiple HearNPV-BJ virions were embedded in each OB (Figure 1). Morphologically, the results indicated that HearNPV-BJ belongs to Alphabaculovirus, similar to the Helicoverpa armigera single nucleocapsid nucleopolyhedrovirus [27].

Infectivity of HearNPV-BJ in Various Insect Cell Lines
Describing the biological characteristics of a specific strain, including its external morphology, internal structure and the pathological characteristics of its infection of various cell lines, can provide a reference for the identification and function of the virus strain. The host range of HearNPV-BJ and the cell line most sensitive to the virus can be determined by infecting different insect cell lines with HearNPV-BJ. Susceptibility tests on six insect cell lines showed that QB-Ha-E-5 (H. armigera) and BCIRL-Hz-AM1 (H. zea) were sensitive to HearNPV-BJ, a monoclonal strain of the nucleopolyhedrovirus, whereas IOZCAS-Ha-I (H. armigera), SES-MaBr-2 (M. brassicae), IOZCAS-Spex-II (S. exigua) and Sf9 (S. frugiperda) were not (Figure 2). Thus, the QB-Ha-E-5 and BCIRL-Hz-AM1 cell lines can be used for passage and amplification, which confirmed that HearNPV-BJ is closely related to H. armigera. Differences in the susceptibility of cells from the above species to a "new" strain are an important means of identifying new strains or variant strains and of avoiding strain contamination caused by cross-infection. The sensitivity of QB-Ha-E-5 and BCIRL-Hz-AM1 to HearNPV-BJ was similar to that reported for HearNPV-C1 [11,28]. More importantly, in contrast to HearNPV-C1, HearNPV-BJ did not infect the IOZCAS-Ha-I cell line [14]. These differences in infection between HearNPV-BJ and Helicoverpa armigera nucleopolyhedrovirus may provide valuable information, through comparative genomics, for research on expanding the host range of novel insecticides for biocontrol.

Restriction Enzyme Digest Mapping of HearNPV-BJ
H. assulta and H. armigera are very similar species, and the baculovirus HearNPV-BJ can infect both insects. In contrast to HearNPV and HzNPV, which have a narrow host range, AcMNPV and RoMNPV can infect a broad range of hosts. Restriction endonuclease digestion of the viral genomes of the above strains showed that the DNA fragment pattern of HearNPV-BJ differed significantly from those of AcMNPV and RoMNPV, whereas it was similar to those of HzNPV and HearNPV-C1. Accordingly, the results indicated that the HearNPV-BJ strain (formerly Helicoverpa assulta nucleopolyhedrovirus) isolated from H. assulta might be a HearNPV strain. A classic restriction endonuclease digestion map presents only partial characteristics of the viral nucleic acid [29]. Our digestion results show that the two strains of HearNPV still differ (Supplementary Figure S1), which demonstrates the limitation of restriction endonuclease digestion for recognizing variants when new strains are classified. It is therefore necessary to reassess previously named strains under the guidance of new technologies, especially the now widespread sequencing technology.

Characterization of HearNPV-BJ Genome
The complete circular HearNPV-BJ genome is 129,801 bp in length, with a G + C content of 38.87%. A total of 128 ORFs that encode proteins of 50 or more amino acids were predicted, with 67 ORFs in the forward orientation and 61 ORFs in the reverse orientation (GenBank accession no. MG569706) (Table 1). Overlaps between adjacent ORFs are characteristic of the genome, and no repeated ORFs were found except the bro genes; this compact organization is conducive to increasing the coding capacity of the genome. It is interesting to note that the greatest nucleotide differences in BRO-A (10%) and BRO-B (4%) were between HearNPV-C1 (GenBank ID: AF303045.2) and Helicoverpa armigera NPV strain Australia (HearNPV-Au, GenBank ID: JN584482.1), two strains of the same baculovirus species [30]. The bro genes may function in nucleic acid binding, nucleosome binding and nucleoplasmic shuttle activities, affecting the diversity of the baculovirus genome and participating in recombination between baculovirus genomes [31][32][33]. Two bro genes were found in the genome, namely, ORF54 and ORF97. Regions with homologous repeats (hrs) were first found in AcMNPV [34] and appear to be present in all baculoviruses. Hrs may function as origins of DNA replication and as transcriptional enhancers in a number of baculoviruses [31,[35][36][37]. Three hrs were found to be dispersed along the HearNPV-BJ genome: hr1 from 22,141 to 24,100, hr2 from 48,921 to 49,826 and hr3 from 107,215 to 108,401. The hrs are rich in A and T, especially hr1 (Figure 3). In addition to homologous regions and baculovirus repeat ORFs, NPVs shared a high nucleotide sequence identity, which may influence gene exchange and evolution in different geographic locations [38]. The specific structural genes might be considered an important factor affecting the evolutionary status of the baculovirus.
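Base-composition figures such as the G + C content of the genome and the A + T richness of the hrs can be computed directly from the sequence. The following is a minimal sketch, assuming the sequence is available as a plain string; the function names, window size, step and threshold are illustrative assumptions and not the settings used in this study.

```python
def gc_content(seq: str) -> float:
    """G + C content in percent, counting only unambiguous A/C/G/T bases."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def at_rich_windows(seq: str, window: int = 100, step: int = 50,
                    threshold: float = 70.0):
    """Return (start, end, at_percent) for windows whose A + T content
    is at least `threshold` percent. Coordinates are 0-based, end-exclusive."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        at_percent = 100.0 - gc_content(seq[start:start + window])
        if at_percent >= threshold:
            hits.append((start, start + window, at_percent))
    return hits


if __name__ == "__main__":
    # Toy sequence: a GC-only block followed by an AT-only block.
    toy = "GC" * 50 + "AT" * 50
    print(f"G + C content: {gc_content(toy):.2f}%")
    print(at_rich_windows(toy))
```

Applied to a full genome sequence, the first function gives the overall G + C percentage and the second flags candidate A + T-rich stretches, which would still need repeat analysis before being called hrs.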
Analysis of each predicted ORF with the Basic Local Alignment Search Tool (BLAST) identified 45 ORFs with homologs in AcMNPV, although the functions of some of their translation products are unknown. The closest homologs of the HearNPV-BJ ORFs were distributed as follows: 15 ORFs were most similar to those of Helicoverpa armigera NPV strain Australia (GenBank ID: JN584482.1) [38] (ORF4, ORF37, ORF40, ORF51, ORF53, ORF61, ORF69, ORF70, ORF80, ORF86, ORF90, ORF93, ORF103, ORF118 and ORF126), 5 to those of Helicoverpa zea single nucleopolyhedrovirus (ORF14, ORF20, ORF89, ORF107 and ORF113), 3 to those of Helicoverpa armigera nucleopolyhedrovirus G4 (ORF30, ORF65 and ORF73), 2 to those of Helicoverpa armigera nucleopolyhedrovirus NNg1 (ORF3 and ORF76) and 103 to those of HearNPV-C1. However, three ORFs whose closest homologs are in HearNPV-C1 (ORF44, ORF45 and ORF46) were shorter than their homologs because an earlier stop codon terminates translation. The function of these three ORFs is not clear, and we predict that this difference might be a key factor that determines the pathogenicity and host range of the two similar variants. Although the mechanism of baculovirus genome replication is not fully understood, several viral genes have been identified as important for DNA replication [39]. The P6.9 protein, encoded by p6.9 (ORF82), is rich in arginine and participates in DNA condensation. It is also called a DNA-binding protein and, like the protamines of fish, poultry and mammals, is rich in basic amino acids; it binds the minor groove of DNA to transport signals to receptor cells. The extent of phosphorylation or dephosphorylation of P6.9 directly affects the process of DNA packaging in AcMNPV [40,41].
The P6.9 protein of HearNPV-BJ is 116 amino acids long, almost twice the length of that of AcMNPV. Compared with AcMNPV P6.9, it has a high content not only of arginine (35.34%) but also of glycine (32.74%). The conserved sites are predominantly glycine (Figure 4). This feature might influence nucleoprotein assembly. The P6.9 amino acid sequence of HearNPV-BJ was also compared with that of the HearNPV-C1 strain, and the homology reached 93.58%. Ecdysteroid UDP-glycosyltransferase (EGT), a secreted protein, plays an important role in larval molting and pupation. ORF119 encodes EGT in HearNPV-BJ. Helicase (p143 in AcMNPV) has features of a DNA-binding domain, such as a helix-turn-helix motif. Baculoviruses with mutations in helicase can have an expanded host range, so helicase can be regarded as a host-range gene. ORF78 encodes helicase in HearNPV-BJ. Site-directed mutagenesis might therefore yield strains with a broader host range. Apart from DNA polymerase and helicase, late expression factors are among the genes essential for DNA replication, such as Lef1 (ORF117), Lef2 (ORF109) and Lef3 (ORF59) in HearNPV-BJ. Meanwhile, ie-1 is also necessary for DNA replication; it is not only expressed in the early stage but also modulates the origin of replication through its DNA-binding ability. Other late expression factors in HearNPV-BJ include Lef4 (ORF73), Lef5 (ORF81), Lef6 (ORF20), Lef8 (ORF34), Lef9 (ORF50), Lef10 (ORF42), Lef11 (ORF28) and Lef12 (ORF32).
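Residue percentages such as the arginine and glycine contents quoted for P6.9 are simple composition counts. The sketch below shows one way to compute them, assuming the protein is given as a one-letter amino acid string; the function name and the toy sequence are hypothetical and are not the P6.9 sequence.

```python
from collections import Counter


def aa_composition(protein: str) -> dict:
    """Percentage of each residue in a one-letter protein sequence.

    A trailing '*' (stop symbol used by some translation tools) is ignored.
    """
    protein = protein.upper().rstrip("*")
    if not protein:
        raise ValueError("empty protein sequence")
    counts = Counter(protein)
    return {residue: 100.0 * n / len(protein) for residue, n in counts.items()}


if __name__ == "__main__":
    # Toy 10-residue sequence for illustration only.
    composition = aa_composition("MRRRRGGGSY")
    for residue in ("R", "G"):
        print(f"{residue}: {composition[residue]:.2f}%")
```

Running the same function on two homologous proteins gives a quick side-by-side view of how their basic-residue content differs.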
Gene-parity plots of HearNPV-BJ against AcMNPV (the representative baculovirus), SfMNPV (Spodoptera frugiperda multiple nucleopolyhedrovirus), SpltNPV-II (Spodoptera litura nucleopolyhedrovirus II) and HearNPV-C1 (the strain with the most closest homologs) were used to assess collinearity over the whole genome and to show gene locations [16]. The HearNPV-BJ gene order is substantially collinear with that of HearNPV and significantly different from that of AcMNPV (Supplementary Figure S2). By convention, the polyhedrin gene was defined as the first ORF in the HearNPV-BJ and HearNPV-C1 genomes, whereas it does not occupy this position in the AcMNPV genome. SfMNPV, which was developed as a biopesticide against S. frugiperda by the Embrapa company in cooperation with Simbiose, differs from HearNPV-BJ in gene arrangement. The same is true of SpltNPV-II, which is close to SfMNPV and far from HearNPV-BJ in terms of evolutionary distance. The results also show that HearNPV-BJ is highly similar to HearNPV-C1 in terms of gene location, which highlights the close relationship between the two strains.

Virus Species Demarcation Criteria
According to traditional naming conventions, the host of origin is generally the first attribute used for virus species demarcation. Sequencing technology has exposed the limitation of this practice, namely that the same virus isolated from different insect hosts has been given different names. To address this challenge in baculovirus molecular identification and classification, the Kimura 2-parameter distance based on lef8, lef9 and polh/gran is often suggested as a criterion for assigning baculovirus species. Generally, baculoviruses are considered to belong to the same species when their distance values are lower than 0.015 in the Kimura 2-parameter model [5]. Here, the distances between HearNPV-BJ, HearNPV-Au, HearNPV-C1, HearNPV-G4 and HzSNPV were all calculated (Table 2). The results indicated that HearNPV-BJ was highly likely to be a variant of Helicoverpa armigera nucleopolyhedrovirus, although HearNPV-BJ was previously named Helicoverpa assulta nucleopolyhedrovirus. More importantly, the results enrich our knowledge of a known lepidopteran-specific baculovirus (HearNPV-BJ) and also deepen our understanding of virus species demarcation criteria using sequencing technology.
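For reference, the Kimura 2-parameter distance between two aligned sequences is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P is the proportion of sites that differ by a transition and Q the proportion that differ by a transversion. The sketch below implements this standard formula for a pairwise alignment; the function name and toy sequences are hypothetical, gap or ambiguous columns are simply skipped (an assumption made here), and the alignment of lef8, lef9 and polh/gran itself is not shown.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Toy alignment: one transition (A->G) in 100 sites.
    ref = "A" * 100
    alt = "G" + "A" * 99
    d = k2p_distance(ref, alt)
    print(f"d = {d:.4f}; below 0.015 threshold: {d < 0.015}")
```

In practice the three genes are aligned and concatenated first, and the resulting pairwise distances are compared with the 0.015 threshold described above.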

Phylogenetic Analysis of HearNPV-BJ
HearNPV-BJ contains all 38 core genes identified in other baculoviruses. A phylogenetic analysis was performed with the maximum-likelihood (ML) method, using RAxML (Randomized Axelerated Maximum Likelihood) software, on the concatenated amino acid sequences of the 38 core genes from HearNPV-BJ and the other 90 baculoviruses listed in ICTV reports [42,43].
The reliability of the tree was tested with 1000 bootstrap replicates. Baculoviruses belonging to the same species according to 38 core-gene data and adjusted thresholds were grouped into individual taxa. A phylogenetic tree was constructed and showed a shorter genetic distance between HearNPV-BJ, HearNPV-C1 and HearNPV-G4 (Figure 5). Furthermore, based on ORF126, which encodes the fusion protein, and on phylogeny, HearNPV-BJ belongs to Alphabaculovirus Group II.
Figure 5. Phylogenetic tree of 91 baculoviruses with complete sequences. The phylogenetic tree was generated using MEGA X software with the maximum likelihood method (bootstrap test, 1000 replicates) and a JTT matrix-based model. The result was visualized using iToL [44,45].

Conclusions
Our study characterized HearNPV-BJ, which was the first strain isolated from infected Helicoverpa assulta larvae in Beijing and was regarded as a new type of baculovirus. This work focused on the morphology, virion infectivity and complete nucleotide sequence of HearNPV-BJ. Although its restriction fragment length polymorphisms differed from those of HearNPV-C1, more evidence, especially the supplementary sequencing data, suggested that HearNPV-BJ could be a variant of Helicoverpa armigera nucleopolyhedrovirus. Additionally, the sequencing data lay the foundation for deeper research on the mechanisms of host selection and on virulence factors, with the aims of developing optimized strains as biopesticides, making better use of viral resources, improving the environment and enhancing economic and production benefits; they also deepen our understanding of virus species demarcation criteria.
Author Contributions: Y.L. and H.Z. conceived the idea; L.Z. and X.L. designed research; L.Z., X.L. and K.T. performed research; L.Z. and X.L. analyzed data and wrote the main manuscript text; Z.Z. contributed to critically revising the manuscript. All authors reviewed the manuscript. All authors have read and agreed to the published version of the manuscript.