3.1. VINHV Genome and Terminal Sequences
The complete S and L genomic segments and the near-complete M genomic segment of VINHV were sequenced using high-throughput sequencing (GenBank accession numbers MF176883, MF176881, and MF176882, respectively). The lengths of the S, M, and L segments are 1729 nucleotides (nt), 4473 nt (lacking the 3′-terminal region, genome sense), and 12,133 nt, respectively. The organisation of the VINHV genome is consistent with that of other orthonairoviruses, with each segment containing a single ORF; these encode the N protein, GPC, and L protein, respectively (Table 1). BlastX analysis of the GenBank databases indicates that VINHV is most similar to Dera Ghazi Khan virus (DGKV), sharing 78%, 69%, and 71% amino acid identity with the translated protein products of the L, M, and S segments, respectively.
A common feature of bunyaviruses is the conservation of genus-specific genome termini. In nairoviruses, the consensus terminal nt sequences are 3′ AGAGUUUCU- and 5′ UCUCAAAGA-. The genome termini of the VINHV S and L segments are consistent with this consensus, with the exception of a single nucleotide change (U→A) at position 9 of the 3′ terminus of the S segment. The terminal sequences of Dera Ghazi Khan genogroup viruses have been observed to differ from the consensus at position 9 of both termini of each segment, with the exception of DGKV (Figure S1). DGKV deviates from the consensus at position 9 of the 3′ termini of the M and S segments only, similar to the observation in VINHV, which further supports a close relationship between VINHV and DGKV. Repeated attempts to obtain the 3′-terminal non-coding sequence of the VINHV M segment were unsuccessful. As the coding sequence for the GPC was complete, allowing comparative analyses with other GPCs to be performed, further attempts to obtain the non-coding portion at the 3′ terminus were abandoned. Given the overall similarity of VINHV to DGKV, a similar sequence is anticipated at the 3′ end, although this remains to be confirmed.
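The comparison of segment termini against the genus consensus can be illustrated with a short position-by-position check. The following is a minimal sketch only: the consensus strings are those quoted in the text, whereas the VINHV S-segment terminus is reconstructed here from the description (consensus with U→A at position 9) rather than copied from the GenBank record, and the function name is hypothetical.

```python
# Consensus nairovirus terminal sequences as quoted in the text.
CONSENSUS_3P = "AGAGUUUCU"
CONSENSUS_5P = "UCUCAAAGA"


def terminal_mismatches(observed, consensus):
    """Return (position, consensus_nt, observed_nt) for each mismatch.

    Positions are 1-based, counted from the terminal nucleotide.
    """
    if len(observed) != len(consensus):
        raise ValueError("observed and consensus termini must be the same length")
    return [
        (pos, c, o)
        for pos, (c, o) in enumerate(zip(consensus, observed), start=1)
        if c != o
    ]


# Assumption: 3' terminus of the VINHV S segment, rebuilt from the text's
# description (U->A at position 9), not taken from the sequence record.
vinhv_s_3p = "AGAGUUUCA"

print(terminal_mismatches(vinhv_s_3p, CONSENSUS_3P))   # [(9, 'U', 'A')]
print(terminal_mismatches(CONSENSUS_5P, CONSENSUS_5P))  # []
```

An empty list indicates a terminus that matches the consensus exactly; each tuple otherwise reports one deviating position.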
3.2. L Protein
The single ORF on the L segment of VINHV encodes a 3948-aa viral L polymerase protein. The L polymerases of -ssRNA viruses contain four conserved regions reflective of the universal functions of this protein [16
]. Region I has a presumed cap-snatching endonuclease activity [19
], whilst the function of region II is unknown. Region III, also called the polymerase module, contains six conserved motifs (pre-motif A and motifs A–E) and is predicted to be involved in catalytic functions of the polymerase, and in template and/or primer positioning [17
]. Region IV is suggested to have a role in capped primer-cleavage and 5′ viral RNA binding [16
]. All of these regions and motifs are highly conserved in the putative VINHV L protein (Figure 1
a–d). Although zinc finger and leucine zipper sequence motifs have previously been identified in the CCHFV L protein [20
], these are not apparent in all nairoviruses [3
]. Likewise, they are not apparent in the VINHV L protein.
In addition to these regions, an ovarian tumour (OTU)-like domain (pfam02338) has been identified in proximity to the N termini of the L protein of all nairoviruses with the possible exception of the “nairo-like” viruses including South Bay virus, which has a divergent sequence that shows some homology to the OTU-like domain [3
]. Similarly, an OTU-like domain is predicted in the VINHV L protein (Figure 2
). The observed functional differences between the OTU domains of the virulent CCHFV and the less virulent DUGV have led some to speculate that this domain may be a virulence factor [3
The VINHV M segment contains a single ORF that putatively encodes a 1414-aa polyprotein, which, like other bunyavirus M segment polyproteins, is predicted to be co- and post-translationally processed into mature viral glycoproteins [23
]. The VINHV polyprotein has a sequence organisation similar to that of other nairovirus polyproteins and contains various conserved post-translational modification sites and structural features (Figure 3
). The study of CCHFV provides much of our understanding of nairovirus GPC structure and processing [24
]. The analysis of CCHFV shows that the GPC has an N-terminal mucin-like domain containing a large number of predicted O-glycosylation sites, followed by a protein of unknown function (GP38), an envelope glycoprotein (Gn), a non-structural protein (NSm), and a second envelope glycoprotein (Gc).
Similar to other nairoviruses, the VINHV GPC is predicted to contain an N-terminal signal peptide (at VLA30-NT) followed by a highly O-glycosylated (16 sites) mucin-like domain, although this domain is considerably shorter, and has fewer predicted O-glycosylation sites, than that of CCHFV. Although the function of the mucin-like domain of the CCHFV GPC remains undetermined, a similar mucin-like domain in the Ebola virus glycoprotein GP1 is known to play a major role in pathogenesis [29
]. The M segment is the most variable of the three segments, and this is particularly notable in the hypervariable N-terminal region that precedes the Gn protein. The characteristics of this region are generally genogroup-specific with respect to the number of O-glycosylation sites and the length of the predicted mucin-like domain [5
]. Viruses of the DGK genogroup generally have some of the smallest mucin-like domains amongst the nairoviruses, ranging from 56 to 124 aa in length and containing between seven and 22 O-glycosylation sites. The CCHFV mucin-like domain is cleaved by a furin or furin-like protease (site RSKR), generating a 247-aa protein [30
]. There does not appear to be an equivalent furin-like protease cleavage site in any of the analysed DGK group GPCs, including that of VINHV; however, there are a number of possible alternative protease cleavage sites in the vicinity of the domain (Figure 3
). The VINHV M segment does not appear to encode an NSm protein, which is consistent with all other nairoviruses except for the NSD genogroup viruses, which do encode this protein [5
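The search for a furin-like cleavage site described above can be illustrated with a simple motif scan. The sketch below assumes the commonly cited minimal furin consensus R-X-(K/R)-R, which the CCHFV site RSKR satisfies; the pattern, the function name, and the toy peptide are illustrative assumptions and do not represent the prediction method used in this study.

```python
import re

# Assumed minimal furin consensus: R-X-(K/R)-R.
# A lookahead is used so that overlapping candidate sites are all reported.
FURIN_MOTIF = re.compile(r"(?=(R.[KR]R))")


def find_furin_sites(protein):
    """Return (1-based start position, matched motif) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in FURIN_MOTIF.finditer(protein)]


# Hypothetical peptide containing the RSKR site quoted for CCHFV.
toy_peptide = "MAAGRSKRSLT"

print(find_furin_sites(toy_peptide))  # [(5, 'RSKR')]
print(find_furin_sites("MAAGLLT"))    # []
```

A motif hit is only a candidate; whether cleavage occurs depends on sequence context and accessibility, which a pattern match does not capture.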
The Gn and Gc glycoproteins of nairoviruses are relatively well conserved in size and structural characteristics [5
]. As predicted for other nairoviruses, the VINHV Gn and Gc proteins are predicted to be cleaved by the subtilisin/kexin-isozyme-1 (SKI-1) protease at sites RHLL383↓ and RRLL775↓, respectively. The VINHV Gn and Gc proteins are of similar size to those of other DGK group viruses, and they contain numerous conserved cysteine residues that have a functional role in protein folding, transmembrane domains, and zinc finger domains (Figure 3; Figures S2 and S3).
VINHV Gn contains three predicted glycosylation sites (Figure S2). The location of the first glycosylation site (NGTK432) is conserved amongst all nairoviruses. The second glycosylation site (NGSG498) appears to be shared with DGKV and Sapphire II virus (SAPV), whilst the third site (NHTS509) appears to be unique to VINHV. Similarly, there are three predicted glycosylation sites in the VINHV Gc protein (Figure S3). The first (NNSV795) is conserved amongst all the analysed DGK group viruses with the exception of SAPV, and the second (NGSI1151) is conserved in all the analysed DGK group viruses. The third site (NCTG1309) is conserved in most of the viruses in the orthonairovirus genus [5
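The predicted glycosylation sites listed above all conform to the classical N-linked glycosylation sequon, N-X-S/T with X being any residue except proline. The sketch below scans a protein string for that sequon; it is an illustration under that assumption only, since the prediction tool used for these sites is not stated here, and the function name is hypothetical.

```python
import re

# Classical N-linked glycosylation sequon: N, any residue except P, then S or T.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of the Asn, matched tripeptide) for each sequon."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


# Four-residue motifs quoted in the text for VINHV Gn and Gc.
reported_motifs = ["NGTK", "NGSG", "NHTS", "NNSV", "NGSI", "NCTG"]

for motif in reported_motifs:
    # Each quoted motif should begin with a valid sequon.
    print(motif, find_sequons(motif))

print(find_sequons("NPTK"))  # [] because proline at X disqualifies the site
```

A sequon match indicates only that a site is possible; occupancy in the mature glycoprotein would need experimental confirmation.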
3.4. N Protein
The N protein of -ssRNA viruses binds to genomic RNA to form ribonucleoprotein complexes that associate with the polymerase for viral RNA synthesis (transcription and replication) and form the structural core of the virion [31
]. The length of the VINHV N protein is 499 aa, which is similar to those of other nairoviruses. Crystal structure studies of the N protein of CCHFV demonstrated two major domains, a globular head and an extended stalk, with RNA/DNA binding-associated sites predominantly found on the head domain [31
]. In comparative sequence analysis, the VINHV N protein exhibits conservation of these binding sites, either fully (K132, R134, K222, Q300, K343, R384, H453, and Q457), or with a conservative change (H197N, Y374H, E387D, and K411R) [31
] (Figure S4
). The caspase-3 cleavage site motif previously identified in some nairoviruses (CCHFV, Hazara virus (HAZV) and Thiafora genogroup viruses) is not apparent in the VINHV N protein [5
Pairwise alignments show that the VINHV N protein shares 54.4 to 72% identity with the N proteins of other viruses within the DGK genogroup (Table 2
) and 31.7 to 42.1% identity with the N proteins of representative viruses from the other genogroups (Table S2
). Walker et al. [5
] suggested a sequence identity cut-off of 52% to place viruses into genogroups. Using this criterion, the placement of VINHV into the Dera Ghazi Khan genogroup is well supported.
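The genogroup assignment above rests on a simple threshold applied to pairwise identity. The sketch below shows one way to compute percent identity from two pre-aligned sequences and to apply the 52% cut-off; the function names, the convention of ignoring gapped columns, and the toy alignment are assumptions for illustration, and alignment programs differ in how they define the identity denominator.

```python
GENOGROUP_CUTOFF = 52.0  # percent identity cut-off suggested by Walker et al.


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def same_genogroup(identity_pct, cutoff=GENOGROUP_CUTOFF):
    """Apply the identity cut-off to a pairwise N protein identity value."""
    return identity_pct >= cutoff


# Toy alignment (hypothetical): 4 identical residues over 5 ungapped columns.
print(percent_identity("MKV-LT", "MKVALS"))  # 80.0

# Range limits reported in the text for VINHV N protein comparisons.
print(same_genogroup(54.4))  # True  (lowest identity within the DGK genogroup)
print(same_genogroup(42.1))  # False (highest identity with other genogroups)
```

With the reported values, every comparison within the DGK genogroup exceeds the cut-off and every comparison outside it falls below, which is the basis for the placement stated above.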
3.5. Phylogenetic Analysis
Until recently, only nairoviruses associated with hard ticks had been fully sequenced. This posed challenges for the phylogenetic analysis of nairoviruses associated with soft ticks, which were clearly different. Whilst partial L protein data for some soft tick nairoviruses have existed for some time [34
], the recent work of Walker et al. [4
] and Kuhn et al. [3
] has produced sequence data enabling the comprehensive genomic analysis of numerous soft-tick viruses from this genus. It is evident from these studies that the phylogenetic relationship of nairoviruses broadly reflects vector preferences, genome organisation, and serological relationships.
Bayesian phylogenetic analyses were performed using the N protein, GPC, and L protein of VINHV and other representative nairoviruses (Figure 4a–c, respectively). Maximum likelihood analyses were also performed for comparison, and these produced trees with similar topologies (data not shown). The phylogenetic analyses demonstrate strong support for the formation of nine distinct clades representing the nine proposed genogroups [5
]. Lower support present at some of the deeper nodes is most likely reflective of the divergence of some viruses, and will only be strengthened by the sequencing of additional viruses in this genus.
The inclusion of VINHV within the DGK genogroup is strongly supported, and the relationships inferred within the group are consistent across all three segments. Viruses of the DGK genogroup are widespread throughout the world (Pakistan, Taiwan, Thailand, South Africa, USA, and Australia) and, in most instances, have been isolated from ticks feeding on birds. Thus, it is feasible that this group of nairoviruses has been distributed, and that VINHV and other nairoviruses have been introduced to the Australian mainland, via avian migration. The cattle egret, with which VINHV is associated, has populated Australia only since the late 1940s.
Phylogenetic analyses also demonstrate that VINHV and DGKV share a common ancestor. DGKV was isolated from ticks feeding on camels in Pakistan in 1966. This may indicate another entry route for VINHV, or its ancestor, into Australia via the 10,000–20,000 camels that were brought into the country from India and Pakistan between 1860 and 1907.
3.6. Ticks and Emerging Viruses in Australia
Though tick-borne diseases do not contribute greatly to the overall communicable disease burden in Australia, an increase in incidence may be seen in the future with climatic, population, and lifestyle changes [35
]. Also, it is possible that a proportion of unknown or undiagnosed illnesses could be attributed to tick vectors. It is essential that we gain an understanding of the biome of the native ticks, particularly those that are known to bite humans. The A. robertsi
tick, from which VINHV was isolated, is one of five soft tick species in Australia that possibly feed on humans and domestic animals [36
]. However, the most important tick in Australia, from both a medical and veterinary perspective, is Ixodes holocyclus
, a hard tick species. Hence, much of the tick research in Australia is focused on hard ticks and associated diseases, particularly of bacterial origin. I. holocyclus
is the vector for Rickettsia australis
and R. honei
, the aetiological agents of the only two recognised tick-borne diseases in Australia—Queensland tick typhus and Flinders Island spotted fever, respectively [35
]. It is speculated that this species of tick may also have a role in Hendra virus transmission [37
]. Furthermore, amid debate regarding the presence of tick-borne Lyme disease in Australia, a Borrelia
sp. related to the Lyme disease agent has been isolated from this species of tick [38
]. However, despite patients presenting with Lyme-like disease, no aetiological agent has been linked to disease locally, and therefore the presence of Lyme disease in Australia is not confirmed.
Advances in sequencing technology have allowed us to investigate the biome of arthropods that are known vectors of disease. Although an ongoing study into tick-borne diseases in Australia has developed strategies to successfully identify low-abundance bacteria in hard ticks [38
], it is not evident whether this study will expand to include the identification of viral agents. Analysis of the viromes of three American ticks revealed a diverse array of viruses, including several novel viruses with genetic similarities to pathogens of humans and livestock [21
]. Similarly, a metagenomic analysis of Australian mosquitoes detected viruses from the families Flaviviridae and Bunyaviridae
]. It is evident that there is great potential for novel and emerging viruses to be circulating in Australian arthropods.
Although some viruses have previously been isolated from Australian ticks [40
], none as yet have been associated with human disease. However, it is important to note that antibodies to VINHV have been found in human sera and, as such, the potential threat to human health demands further investigation. With the ongoing sequencing of historic Australian virus isolates, our understanding of viruses circulating within Australia will also increase [4
]. In other parts of the world, tick-borne infectious diseases are on the rise and are becoming a serious global health problem affecting both humans and animals. For example, there has been a marked increase in the range and incidence of CCHF since 2000, and tick-borne encephalitis is a growing concern in Europe and Asia [45
]. Similarly, the incursion of African swine fever into the Caucasus, and potentially from there into Europe, is of deep concern and requires preventative strategies to avoid the spread of this disease [46
Bird migration plays an important part in the spread of tick-borne disease. For example, the massive expansion of the cattle egret's range beyond Africa and Asia began after cattle became established on newly created and expanding pastures on other continents. The egrets reached the Americas in 1933, Australia in 1948, and Europe in 1958, and have subsequently spread widely from there. Cattle were introduced to each of these new territories following European exploration and the removal of forests. It is presumed that the associated ticks and tick-borne viruses have spread more slowly, as tick- and virus-free colonies exist within flight range of an infected colony. Thus, pasture habitats are created for cattle, and the migrant egrets, ticks, and viruses follow, in that order. Investigations into the virome of Australian ticks will provide valuable information on the potential for the emergence of new viruses associated with this vector within the Australian landscape. Adequate biosurveillance in this area should be prioritised to mitigate the impact of any future emerging diseases.