Phylogenetic and Phylogeographic Analysis of the Highly Pathogenic H5N6 Avian Influenza Virus in China

The clade 2.3.4.4b H5N8 avian influenza viruses (AIVs) have caused the loss of more than 33 million domestic poultry worldwide since January 2020. Novel H5N6 reassortants with hemagglutinin (HA) from clade 2.3.4.4b H5N8 AIVs are responsible for multiple human infections in China. Therefore, we conducted an epidemiological survey on waterfowl farms in Sichuan and Guangxi provinces and performed a comprehensive spatiotemporal analysis of H5N6 AIVs in China. At the nucleotide level, the H5N6 AIVs isolated in the present study exhibited high homology with the H5N6 AIVs that caused human infections. Demographic history indicates that clade 2.3.4.4b seemingly replaced clade 2.3.4.4h to become China’s predominant H5N6 AIV clade. Based on genomic diversity, we classified clade 2.3.4.4b H5N6 AIV into ten genotypes (2.3.4.4bG1–G10), of which the 2.3.4.4bG5 and G10 AIVs can cause human infections. Phylogeographic results suggest that Hong Kong and Jiangxi acted as important epicentres for clades 2.3.4.4b and 2.3.4.4h, respectively. Taken together, our study provides critical insight into the evolution and spread of H5N6 AIVs in China, which indicates that the novel 2.3.4.4b reassortants pose challenges for public health and poultry.


Introduction
The genetics and antigens of H5 AIVs, which circulate in wild birds and poultry globally, are diverse [1]. To date, H5 AIVs have evolved into different phylogenetic clades (clade 0-9) [2], and clade 2.3.4.4 has further evolved into eight subclades (2.3.4.4a-2.3.4.4h) according to the World Health Organization's naming system [3]. H5 AIVs with the HA of clade 2.3.4.4b have been detected in wild birds and poultry worldwide [4][5][6] and have caused the loss of more than 33 million domestic poultry across the globe (https://empres-i.apps.fao.org/; accessed on 28 April 2022). Novel H5N6 reassortants were reported as one of the dominant AIV subtypes in China, especially in ducks, during 2014-2016 [7]. The clade 2.3.4.4h viruses became the dominant H5N6 lineage in China during 2018-2020 through continually evolving [8,9]. H5N8 AIVs with the HA of clade 2.3.4.4b are responsible for a new wave of outbreaks in poultry and wild birds in January 2020 [10,11]. Currently, the novel H5N6 reassortants with the HA from clade 2.3.4.4b H5N8 AIVs emerged in waterfowl and have caused human infections in China [12,13]. In this study, we conducted an epidemiological survey on local waterfowl farms in Sichuan and Guangxi provinces and performed a comprehensive spatiotemporal analysis of H5N6 AIVs in China from 2017 to May 2022. Our findings provide novel insight into the latest spatiotemporal characteristics of H5N6 AIVs and highlight their potential threat to public health.

Virus and Sequence Preparation
Two H5N6 AIVs designated as A/duck/Sichuan/SS1/2021 and A/goose/Guangxi/GG1/ 2022 were isolated from lung samples of waterfowl farms in Sichuan and Guangxi. The whole genome was amplified using universal primers [14]. All viral experiments were conducted in Biosafety Level 3 (BSL-3) facilities. All available H5N6 sequences isolated in China from 2017 to May 2022 were retrieved from the Global Initiative on Sharing All Influenza Data databases (GISAID; https://www.gisaid.org; accessed on 15 May 2022). Sequences with (a) evidence of lab errors and (b) 100% similarity were discarded. Detailed information on all H5N6 AIVs analysed in this study can be found in Table S1.

Maximum Likelihood Phylogenetic of the H5N6 AIVs
Sequence alignments were constructed for eight segments separately using the MAFFT (version 7.149) program [15]. Using ModelFinder, the best-fit model was GTR+F+G4 [16]. Phylogenetic trees were inferred using the maximum likelihood method in the IQ-TREE 1.68 software with 1000 bootstraps [17,18]. We used ITOL (version 5) to complete the annotation of the evolutionary tree and to adjust it aesthetically [19]. The genotypes of the H5N6 AIVs were classified according to the combinations of lineages in segment trees (Figures S3-S10) [7,12,18].

Bayesian Maximum Clade Credibility Phylogenetic and Demographic Analysis of the H5N6 AIVs
We used the TempEst (version 1.5.3) software to assess the temporal signals by performing root-to-tip regression analysis on viruses from clades 2.3.4.4b and 2.3.4.4h [20], which showed strong temporal signals suitable for demographic history analysis (Figure 1a,b). We computed marginal likelihoods using path sampling and stepping-stone sampling to compare the constant-size, exponential-growth and Bayesian skyline coalescent tree priors [21], and to compare the strict molecular clock and uncorrelated lognormal relaxed clock [22]. The best-fit model was chosen to construct Bayesian maximum cluster credibility (MCC) trees, utilising 300,000,000 total steps for each set, sampling every 1000 steps (Tables S2 and S3). The convergence (effective sample sizes > 200) of relevant parameters was assessed using Tracer (version1.7) [23]. TreeAnnotator (version 1.10.4) software was used to summarize the information from a sample of trees produced by BEAST on a single "target" tree to obtain the MCC tree. The phylogenetic tree was visualised in FigTree version 1.4.3 (http: //tree.bio.ed.ac.uk/software/figtree/; accessed on 15 June 2022).

Phylogeographic Interference
We used the Bayesian stochastic search variable selection (BSSVS) model with asymmetric substitution to infer the H5N6 AIV spread dynamics from 2017 to May 2022 in China [24]. In order to reduce the sampling biases, we used CD-HIT v4.6 to remove viruses without known regional information and identical sequences [25], and further subsampled the dataset in a stratified manner to create the balanced number of sequences per region, which resulted in the final dataset including 28 viruses of clade 2.3.4.4b and 111 viruses of clade 2.3.4.4h ( Figures S1 and S2). We considered transitions credible when the Bayes factor (BF) was >3 and used spreaD3 v0.9.6 to compute the BF tests to assess the support for significant individual transitions between distinct geographic regions [24,26]. We provide detailed information on BF values, migration rates and distances of clades 2.3.4.4b and 2.3.4.4h H5N6 AIVs in Table S4.

Results and Discussion
Molecular analysis revealed that all H5N6 AIVs in this study possess the same polybasic amino acid motif of -RRKR/GLF-in their HA cleavage site, which is a highly pathogenic avian influenza virus (HPAIV) characteristic (Table S2). It is noteworthy that all H5N6 AIVs contained more than one of the seven HA amino acid mutations (94N, 133A, 154D, 155N, 156A, 188I and 189R), which could increase virus binding to α2, 6-linked sialic acid receptors [27][28][29][30][31]. Furthermore, most H5N6 AIVs contain the M1 15I, NS1 103F and NS1 106M mutations associated with virulence, transmission, replication efficiency and adaptation in mammals [32][33][34]. At the nucleotide level, two viruses isolated in our study exhibited high homology with those that cause human infections in the same provinces. The nucleotide homology for the eight genes was 98.88-99.60% between A/duck/Sichuan/SS1/2021 and A/Sichuan/06681/2021 and 95.18-99.61% between A/goose/Guangxi/GG1/2022 and A/GX/guilin/11151/2021 (Table 1).    Figure 2). Therefore, the increasing genetic reassortment between H5N6 and H5N8 AIVs may pose an even greater threat to the poultry industry and public health.
To further reveal H5N6's evolutionary characteristics, we conducted molecular clock phylogenic analysis and genotype characterisation (Figure 2). Based on genomic diversity, clade 2.3.4.4b H5N6 AIVs were classified into ten different genotypes (2.3.4.4bG1-G10) (Figure 2), of which the 2.3.4.4bG5 and G10 AIVs caused human infections [12,13]. More specifically, 2.3.4.4bG5 viruses bore the NA gene from the Eurasian lineage, the PB2, PB1, PA and the NS genes from the H3N2 virus, the NP gene from the H5N1 virus and the M gene from the H5N8 virus. The 2.3.4.4bG10 viruses contain the NA gene from the Eurasian lineage, the PB2 and PB1 genes from the H3N2 virus, the NP and NS genes from the H5N1 virus and the M gene from the H5N8 virus ( Figure 2; Table S2). The genotypes of clade 2.3.4.4b H5N6 AIVs were most diverse in waterfowl, which are considered an essential reservoir for AIVs [36,37]. Moreover, 2.3.4.4b H5N6 AIVs exhibited distinct antigenicity to the Re-11 vaccine strain, which increases the risk to poultry [13].   (Figure 3b,e), and the transition rates between regions were inversely related to the distance (Figure 3c,f). These findings indicate that AIV migrations occur more frequently at shorter distances, consistent with a previous study [38]. The inferred spatial dynamics of H5N6 AIVs suggest that Hong Kong and Jiangxi acted as important epicentres of clades 2.3.4.4b and 2.3.4.4h H5N6 AIVs, respectively (Figure 3a,b). A previous study supports that the live poultry trade between different regions may facilitate the geographic migration of AIVs [7]. Currently, clade 2.3.4.4b H5N6 AIV is spreading in southern China (Figure 3a,b) and may further spread to other regions, which poses a challenge to public health. There were several limitations in this study. First, sampling bias may affect inference of the spread networks and association between the migration rate and distance. Second, the spatiotemporal HA datasets may not have accurately represented the times and locations of H5N6 AIVs originating in wild birds. Therefore, wild birds as long-distance vectors of HPAIV may result in an increased viral spread, a possibility that was not considered in our study [39,40].        In summary, this spatiotemporal analysis revealed that clade 2.3.4.4b has seemingly become the predominant H5N6 AIV in China. These viruses are genetically diverse among waterfowl and frequently spread in southern China. Moreover, the novel 2.3.4.4bG5 and 2.3.4.4hG10 reassortants can cause human infection, thus, posing a serious threat to public health. Hence, systematic surveillance of the H5N6 AIVs is critical for early warning and preparation for the next potential pandemic.   Funding: This work was supported by the special fund for scientific innovation strategy-construction of high-level Academy of Agriculture Science-Distinguished Scholar (R2020PY-JC001).

Data Availability Statement:
Consensus sequences generated in this study were submitted to the GISAID database, and their corresponding accession numbers are listed in Supplementary Table S2.