Phylodynamics of Highly Pathogenic Avian Influenza A(H5N1) Virus Circulating in Indonesian Poultry

After its first detection in 1996, the highly pathogenic avian influenza A(H5Nx) virus has spread extensively worldwide. HPAIv A(H5N1) was first detected in Indonesia in 2003 and has been endemic in poultry in this country ever since. However, Indonesia has limited information related to the phylodynamics of HPAIv A(H5N1) in poultry. The present study aimed to increase the understanding of the evolution and temporal dynamics of HPAIv H5N1 in Indonesian poultry between 2003 and 2016. To this end, HPAIv A(H5N1) hemagglutinin sequences of viruses collected from 2003 to 2016 were analyzed using Bayesian evolutionary analysis sampling trees. Results indicated that the common ancestor of Indonesian poultry HPAIv H5N1 arose approximately five years after the common ancestor worldwide of HPAI A(H5Nx). In addition, this study indicated that only two introductions of HPAIv A(H5N1) occurred, after which these viruses continued to evolve due to extensive spread among poultry. Furthermore, this study revealed the divergence of H5N1 clade 2.3.2.1c from H5N1 clade 2.3.2.1b. Both clades 2.3.2.1c and 2.3.2.1b share a common ancestor, clade 1, suggesting that clade 2.3.2.1 originated and diverged from China and other Asian countries. Since there was limited sequence and surveillance data for the HPAIv A(H5N1) from wild birds in Indonesia, the exact role of wild birds in the spread of HPAIv in Indonesia is currently unknown. The evolutionary dynamics of the Indonesian HPAIv A(H5N1) highlight the importance of continuing and improved genomic surveillance and adequate control measures in the different regions of both the poultry and wild birds. Spatial genomic surveillance is useful to take adequate control measures. Therefore, it will help to prevent the future evolution of HPAI A(H5N1) and pandemic threats.


Introduction
In 1996, the first outbreak of highly pathogenic avian influenza virus (HPAIv) A(H5N1) occurred in China. Subsequently, this virus from the goose/Guangdong (Gs/Gd) lineage spread to multiple other countries. Nowadays, outbreaks of HPAIv A(H5N1) and related HPAIv have caused economic losses due to the deaths and culling of millions of chickens and other poultry worldwide. In addition, 865 human cases of HPAIv A(H5N1) infections were reported with a case-fatality rate of 53% from 2003 to 2022 [1].
The HPAIv A(H5N1) virus was first reported in Indonesia in 2003 and became endemic in multiple regions afterward. The introduction to and spread of HPAIv A(H5N1) within Indonesia was facilitated by several factors [2]. First, Indonesia is located at the crossroads of international trade between two continents (Asia and Australia) and two oceans (Peace, the Indian Oceans). Second, two wild bird migratory flyways, the East Asian-Australasian (EAAF) and the West Pacific (WPF) flyways include Indonesia. Third, the high contact rate between poultry from different locations [3] and between domestic ducks and wild birds due to poor biosecurity, particularly for backyard and moving or scavenging ducks [4]. Virus transmission between farms was facilitated by poultry trade and live bird markets and by human-animal interaction from inbound and outbound visits to poultry farms and live bird markets. Humans, via contact with poultry, could act as a vector of HPAIv A(H5N1) and facilitate transmission between poultry flocks [3,5].
Molecular surveillance is an important tool to support the control of HPAIv A(H5N1). HPAI genome sequence data obtained from avian and human cases can be used to understand transmission pathways [6], identify molecular markers for disease [7,8], expand host coverage [9], and detect variants associated with vaccine escape [10]. Molecular surveillance can also help to identify possible genetic drift and reassortments of HPAIv A(H5N1) with other influenza A viruses that may result in newly emerging viruses with possible increased transmission in poultry and wild birds, different pathogenicity which may also result in a wider host range [11,12].
Based on the global analysis of genomic data of HPAIv A(H5N1) detected in Indonesia, HPAIv A (H5N1) were classified into various clades, starting with clade 2.1, which subsequently branched into clades 2.1.1, 2.1.2, 2.1.3.2, and 2.1.3.2a; most clades have been reported to affect poultry [13][14][15]. In 2012, a new clade, 2.3.2.1c, was isolated from a duck farm and live bird markets in Java with high mortality among duck and amino changes such as a Ser deletion at position 325 in the multibasic amino acid cleavage site, and a K328R substitution [16]. The detection of HPAIv from this new clade was thought to be the result of a new incursion from other parts of South East Asia to Indonesia [11,17] [11,16,[20][21][22]. A molecular study of HPAIv A(H5N1) carried out in 2015 and 2016 suggested that this new clade had diverged into two putative subgroups, clades 2.3.2.1c (A) and 2.3.2.1c (B) [11].
Although the major clades of HPAIv A(H5N1) in Indonesia are known, there is limited understanding of the evolution of HPAIv A(H5N1) in Indonesia. This knowledge can be useful to help focus surveillance and strengthen control measures aiming to reduce future reassortments and transmission of HPAIv among poultry and humans. The present study aimed to increase the knowledge of HPAIv A(H5N1) evolution in Indonesia from 2003-2016, with a particular focus on the HA gene segment and the jump of the clades of H5N1v.
To this end, we analyzed the available sequences of hemagglutinin (HA) in the genome database to improve the understanding of the phylodynamics of HPAIv A(H5N1) in Indonesia.

Dataset Preparation
Complete sequences of HA genes obtained from HPAIV A(H5Nx) detected in poultry in Indonesia from 2003 to 2016 were downloaded from the genome database, GISAID, and GENBANK and compiled as Indonesian H5N1 (HA). Another data compilation was downloaded from all available global sequences including Indonesia from 1966 to 2022 and separated as Global H5 (HA). Additional separated data for clades 2.3.2.1c, 2.3.2.1a, and 2.3.2.1 were also downloaded from the database. The HA gene was chosen because the HA protein is located on the outer surface of the virus particle, has a role in the virus-host cell interaction and is the main target for the protective antibody response [23]. Additionally, HA genes are published most frequently in the genome database, indicating that a worldwide phylodynamic analysis of H5N1v using HA genes will provide the most information.
The sequences were then aligned using MUSCLE [24] and the HA clades of the virus were phylogenetically analyzed using MEGA 7 [25] as described in a previous study [11]. The clade of HA was confirmed using the Highly Pathogenic H5N1 Clade Classification Tool of the Influenza Research Database (https://www.fludb.org/brc/home.spg?decorator= influenza, last accessed on 13 September 2022).

Clustering HA Gene Segments
The dataset of HA genome sequences was processed with cd-hit-est software of the CD-HIT Suite (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi?cmd=cd-hit-est, last accessed on 13 September 2022) to cluster sequences that shared 100% nucleotide identity [26][27][28]. The CD-HIT_EST test was performed on the globally available 12,018 HA genome sequences (1966-2022) irrespective of the accompanying NA. To condense the global taxa of the full genomes of HA genes, 80 to 99% identity thresholds were examined to obtain the cluster representative sequences. Maximum-likelihood analysis with bootstrapping was performed at different thresholds, and clusters of representative taxa were selected from taxa that share a larger identity than 98%. The representative sequences were used as a dataset for time-scale phylogeny analysis and demography reconstruction.

Time-Scale Phylogeny of Indonesian HPAIv A(H5N1) Sequences
Divergence times and evolutionary analysis were estimated simultaneously with Bayesian phylogenetic inference (BI) implemented in BEAST v.2.6.7 [29] (http://www. beast2.org/). The optimal substitution model was selected by the BEAST-ModelTest (bMod-elTest) v.1.2.1 package implemented in BEAST using transdimensional Markov chain Monte Carlo (MCMC) methods [30]. The best substitution model from bModelTest was also compared to the best substitution model selected by the Modeltest in the phangorn package implemented in the R (version R-4.0.3) environment for statistical analysis. bModelTest was also used to infer the gamma-distributed rate of heterogeneity, invariable site proportions, and unequal base frequencies [30].
Tree and clock priors were set on a coalescent Bayesian skyline tree and a relaxed molecular clock (assuming an uncorrelated lognormal distribution clock model) which was calibrated by using the sample collection dates. The Bayesian MCMC analysis was performed for 150-300 million generations sampled every 1000-3000 generations.
The parameter convergences were viewed and evaluated using Tracer v.1.7.1 [31] (http: //tree.bio.ed.ac.uk/software/tracer/). The maximum clade credibility (MCC) phylogenetic trees were constructed by TreeAnnotator v.2.6.7 (BEAST package) by removing the initial 10-25% (burn-in) trees (burn-in settings depend on convergence). Then, phylogenetic trees were visualized by using In the final stage, we performed BEAST to analyze the worldwide HA of all available avian influenza viral sequences (2005-2021). The full length 12,018 HA sequences were downloaded from GISAID and clustered using CD-HIT-EST as described above. the Reference sequences closely related to Indonesian HPAIv H5N1 according to the maximum likelihood tree were selected and aligned using MEGA 7 [25] before proceeding to the BEAST analysis.

Bayesian Evolutionary Analysis of HA of Indonesian H5N1
The number of taxa used for the BEAST analysis performed on HA sequences from Indonesia and worldwide with different times of collection and the number of sites is presented in Table 1. The mean of rates is posteriorly estimated based on Bayesian MCMC analysis using evolutionary models. The number of sequences is labeled as a taxon (taxa). The character of the number of differed sites is normalized from the length of a sequence to get the proportion of differences between two sequences [24]. Time-measured phylogenetic analysis of 1707 sites from 94 taxa using the substitution model TIM1 + Γ + I showed the evolution of various clades HPAIv A(H5N1) in Indonesia (Table 1).

Indonesian Viruses in the Phylodynamic of Indonesian Worldwide Avian Influenza H5N1 Virus (AI H5N1v)
The phylogeny of the worldwide HA including the Indonesian viruses is depicted in Figure 1. Based on analysis of representative sequences, H5N1v clade 2.1 and its subclades were only detected in Indonesia, while clade 2.3 viruses were detected in multiple countries in Asia, Europe, and Africa, including Indonesia, since 2009. Other subclades of clade 2, such as 2.2, 2.4, and 2.5, were circulating in multiple countries, such as China, Egypt, Germany, India, and Japan, but were not detected in Indonesia. These different geographic distributions of viruses also indicate geographic imbalances in virus spread and geographic leaps of multiple viruses from various clades.
The BEAST analysis estimated that the mean nucleotide substitution rate of global HA was 0.0065 substitution/site/year (s/s/y) over the course of 16 years (95% Interval, 0.0061-0.0070) ( Table 1).  Table 2.

Temporal Dynamic of Indonesian HPAIv A(H5N1): Time-Measured Phylogenetic Analysis
In the present study, a time-measured phylogenetic analysis was performed to increase the understanding of the HPAIv A(H5N1) detected in Indonesia from 2003-2016. While phylogenetic analysis of HPAIv A(H5N1) was the focus of a number of studies already [11,22,32], a study including all available Indonesian virus sequences has, to our knowledge, not been performed previously. Posterior analysis of Indonesian HPAIv A(H5N1) 2003-2016 estimated that the HPAIv A(H5N1) clade 2.3.2.1 evolved from the HA clade 2.1.1. In addition, the posterior analyses using BEAST with bModeltest, instead of the maximum likelihood approach, which is used as a criterion in a unified nomenclature system for HPAIv, confirmed the finding of our previous study [11] that HA clade 2.3.2.1c consists of two different clusters [13][14][15]. The time-measured analysis also showed that after 2012, mainly HPAIv A(H5N1) viruses classified as clade 2.3.2.1c were detected. The observed evolution of HPAIv A(H5N1) viruses, the emergence of new clades, and the emergence of reassortments may have been caused by biosecurity gaps leading to reassortment and limited vaccine efficacy and poor vaccination coverage, although we cannot exclude circulation of these viruses in wild birds due to the very limited surveillance of avian influenza in wild birds in Indonesia [11,[33][34][35].
The substitution rate of avian influenza viruses worldwide has been studied extensively [18,36,37]. A previous study [38] estimated viral RNA substitution rates in the range of 0.01 to 0.001 s/s/y. Additionally, the rapid evolutionary dynamics of avian influenza viruses were estimated by a previous study with a substitution rate range of 0.0018-0.0084 s/s/y [39]. The estimated substitution rate in this study showed the fast substitution rate of Indonesian poultry HPAIv A(H5N1) and HA of worldwide H5, which was in line with previous reports by Duffy et al.  [40]. The variation in the substitution rates between the HPAIv A(H5N1) genes can be caused by many factors, such as the differences in viral biologies such as viral genome architecture, replication speeds within-host and viral polymerase enzyme fidelities [41], and environmental selectivity related to the host factors such as species [38], vaccination status [37], contact rate, and age of infection, epidemic, and endemic status in a region during infection [41]. Positive selection pressures related to environmental selectivity have been identified at several antigenic sites of the HA gene in the previous study [22]. Meanwhile, the mean substitution rate of global HA was higher than in Indonesian poultry HPAIv A(H5N1); this observation might, however, be biased by sampling differences.
The phylogenetic analysis estimated that HA clades 2.3.2.1a and 2.3.2.1c shared a common ancestor and were rooted in the clade 2.3.2.1b. The H5N1v clade 2.3.2.1c and 2.3.2.1a diverged from clade 2.3.2.1b in agreement with a previous study [13,15]. A gap in the H5N1v clades in Indonesia is indicated by the lack of report of clade 2.3.2.1b, the clade that has been reported in Vietnam and Hong Kong [15,42]. This clade gap was assumed based on the finding in Indonesia that the HPAIv A(H5N1) clade 2.3.2.1c was rooted in HA clade 2.1.1. Bird migration and/or poultry trade could have driven the transmission and evolution of the H5N1v clade 2.3.2.1a to clade 2.3.2.1c. Additionally, unrecognized clinical signs in poultry and the reluctance of farmers to report the H5N1 outbreaks, particularly in sector 1 farms, might have contributed to the absence of some clades of H5N1v in the data set. This gap shows the need for regular and intensive surveillance to control the evolution of H5N1v, not only in poultry but also in wild birds.
The most recent ancestor of the H5N1 influenza virus in Indonesia has been previously studied [22,32]. The first study [22]

Limitations and Benefits of the Study
We acknowledge several limitations in this study. First, the limited data, particularly the number of taxa or samples, may have affected the inferences of evolutionary analysis. Surveillance data and avian influenza virus sequences in wild birds in Indonesia are very limited or absent. All avian influenza sequence data in public genome databases were obtained from domesticated birds. Differences in sampling over time and space may affect the outcomes of the analysis. Therefore, improved surveillance with good competency for clinical and laboratory diagnosis and collection of metadata, as well as the willingness to share the information, is crucial to raise the number of viral genomes in the public database. Surveillance in wild birds is also crucial to reveal the clade gap and study the evolution of the avian influenza virus. Furthermore, additional studies are needed to identify key amino acid changes and evaluate their impact on the viral phenotype, and also on the relationship with the possible role of vaccination programs on the observed evolution of HPAIV A(H5N1).
This study is of importance not only for virus identification but also for studying virus evolution in Indonesia. This study shows that probably only two introductions occurred, after which HPAIv A(H5N1) continued to circulate among poultry in Indonesia. Continuous surveillance of poultry farms in all sectors and live bird markets in Indonesia with global support and collaboration are essential to take adequate measures and prevent further evolution of the virus. In addition, compartmentalization, inspection, and certification [43,44] of poultry farms are also important to control the evolution of HPAIv A(H5N1) in Indonesia. Estimation of temporal characteristics of HPAIv A(H5N1) across Indonesia in association with the viral dynamics is essential in conducting prevention controls such as quarantine, movement restriction, diagnostic tools, surveillance systems, and vaccine development [45,46] for future outbreaks. The discovery of different clades in only a few regions and the fact that some Indonesian HPAIv A(H5N1) clades were not detected in other countries indicates the importance of area-and country-specific preventive measures for HPAI outbreaks [45]. The Indonesian archipelago, with the ocean as a geographical barrier between islands and between continents, can be an advantage for the country and region-specific preventive measures, as well as reconstructions of intensive poultry farming locations and mapping of wild bird captive areas. In parallel, capacity building is of great importance for each country, and an agreed consensus between countries is a necessity in studying the viral phylodynamics, combined with regular genomic surveillance, to prevent future HPAIv pandemics.

Conclusions
This study demonstrated that introductions of HPAIv A(H5N1) into Indonesia are infrequent and most of the observed changes in the virus originate from within Indonesia. The lack of detection of H5N1v clade 2.3.2.1b and the limited Indonesian HPAIv A(H5N1) genomic sequences in the database indicate that there is room for improvement in molecular surveillance of HPAIv in Indonesia. Furthermore, the evolutionary dynamics of the Indonesian HPAIv A(H5N1) highlight the need for continuing genomic surveillance and adequate control measures to prevent viral introduction and evolution, within and between farm transmission in different regions.

Data Availability Statement:
This study did not report any data.

Acknowledgments:
The authors acknowledge all owners of Indonesian and worldwide HA sequences in the GISAID database used in this study. The authors also acknowledge the Animal Health Authority under West Java Province, especially in the Subang, Indramayu, Bandung, Sukabumi, Tasikmalaya, Purwakarta, and Ciamis districts. We thank other collaborators: the Cikole Animal Health Laboratory of the Animal Husbandry Service of West Java Province, District Investigation Center Subang, and Centre of Tropical Animal Health Studies under Bogor Agricultural University (CENTRAS-IPB), Eijkman Institute for Molecular Biology for collecting, screening, and sequencing the samples. I Made Artika is thanked for discussion and sequencing from 2015-2016.

Conflicts of Interest:
The authors declare that the research was conducted in the absence of potential conflicts of interest.