Epidemiology and Molecular Transmission Characteristics of HIV in the Capital City of Anhui Province in China

Hefei, the capital of Anhui province, is one of the cities in the Yangtze River Delta, from which many people migrate to Jiangsu, Zhejiang and Shanghai. Such high mobility contributes to the HIV epidemic. This study explored the HIV prevalence in Hefei to provide a reference for other provinces and assist in the prevention and control of HIV in China. A total of 816 newly reported people with HIV in Hefei from 2017 to 2020 were recruited as subjects. HIV subtypes were identified using a phylogenetic tree. The most prevalent subtypes were CRF07_BC (41.4%), CRF01_AE (38.1%) and CRF55_01B (6.3%). Molecular networks were inferred using HIV-TRACE. The largest and most active transmission cluster in Hefei's network was CRF55_01B. A Chinese national database (50,798 sequences) was also subjected to molecular network analysis to study the relationship between patients in Hefei and other provinces. CRF55_01B and CRF07_BC-N had higher clustering and interprovincial transmission rates in the national molecular network. Transmission among people with HIV in Hefei occurred mainly within the province. Finally, we described the epidemic trend of HIV in Hefei in recent years using the dynamic change in the effective reproductive number (Re). The weighted overall Re increased rapidly from 2012 to 2015, with a peak value of 3.20 (95% BCI, 2.18–3.85). After 2015, Re began to decline and remained stable at around 1.80. In addition, the Re of CRF55_01B was calculated to be between 2.0 and 4.0 in 2018 and 2019. More attention needs to be paid to the rapid spread of CRF55_01B and CRF07_BC-N strains among people with HIV and the high Re in Hefei. These data provide necessary support to guide the targeted prevention and control of HIV.


Introduction
As of 2020, there were 37.7 million people living with HIV worldwide [1]. The prevalence and transmission of HIV remain a huge challenge to public health in China. There have been many reports on HIV in key endemic areas, but research on HIV in typical large cities in China is also important and needed. HIV infections in Anhui province originated with earlier commercial blood donations [2]. Gradually, sexual transmission has become the main route of HIV transmission in Anhui province [3]. Hefei, the capital of Anhui province, is a dual-node city under the Belt and Road Initiative and the Yangtze River Economic Belt. It is an important central city and comprehensive transportation hub in eastern China. Currently, HIV in Hefei is transmitted mainly through homosexual contact. Hefei has high population mobility, and floating populations may promote the transmission and prevalence of HIV. Studying the prevalence and transmission of HIV in Hefei can support effective intervention in active transmission clusters and reduce the transmission of HIV.
Recent studies on the application of new techniques and methods in molecular epidemiology have shown the power of various HIV gene sequencing and analysis tools in improving transmission detection and intervention guidance [4][5][6]. The HIV molecular network is a type of transmission analysis based on the group model, in which infected people who are linked together are considered to have a potential transmission relationship. Through molecular networks, we can identify active transmission clusters that require critical attention and intervention [7][8][9]. The effective reproductive number (Re) is often used to describe the dynamics of transmission during an epidemic. Phylodynamic modeling can combine genetic modeling with epidemiological modeling to quantify Re in HIV infections [10][11][12].
In this study, we explored the prevalence and transmission of newly reported HIV infections in Hefei using the molecular network and estimated the Re of different subtypes. The results provide a basis for accurate prevention and control of HIV transmission and a reference for research in other provinces and cities.

Molecular Network Analysis and Active Transmission Clusters
Under the threshold of 0.5% genetic distance, 27.1% (221/816) of the sequences entered the molecular network, forming 69 clusters. The largest cluster contained 22 sequences, or 10.0% (22/221) of the sequences in the network. A molecular network diagram of Hefei is shown in Figure 2.
A total of 816 plasma samples were eligible for HIV limiting-antigen avidity enzyme immunoassay (LAg-Avidity EIA), and 22.1% (180/816) of them were recently infected. Based on newly reported HIV infections in 2017-2019, a transmission cluster with at least three recent HIV infections in 2020 was defined as an active transmission cluster. There were three active transmission clusters: the largest cluster was CRF55_01B, followed by CRF01_AE-cluster 4 and CRF01_AE-cluster 5. There were seven clusters containing two recent infections in 2020, including two clusters of CRF07_BC-N and five clusters of CRF01_AE-cluster 4.
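The active-cluster definition can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the record layout (cluster identifier, report year, recent-infection flag), the function name, and the toy records are hypothetical and are not the study's data or code.

```python
from collections import Counter


def active_clusters(records, year=2020, min_recent=3):
    """Return {cluster_id: n_recent} for clusters meeting the definition.

    records: iterable of (cluster_id, report_year, is_recent) tuples,
    a hypothetical layout chosen for this sketch.
    """
    recent_counts = Counter(
        cluster_id
        for cluster_id, report_year, is_recent in records
        if report_year == year and is_recent
    )
    # Keep only clusters with at least `min_recent` recent infections.
    return {cid: n for cid, n in recent_counts.items() if n >= min_recent}


# Toy records, not study data.
records = [
    ("A", 2020, True), ("A", 2020, True), ("A", 2020, True), ("A", 2019, True),
    ("B", 2020, True), ("B", 2020, True), ("B", 2020, False),
    ("C", 2018, True),
]
flagged = active_clusters(records)
print(flagged)
```

In this toy input only cluster A reaches three recent infections in 2020; cluster B has two and would fall into the "two recent infections" group described in the text.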

Analysis of Interprovincial Transmission Characteristics
After removing repeated sequences, 50,798 sequences were obtained from the LANL and China CDC databases as of 30 June 2020. Dataset B covered 31 provinces in China from 2000 to 2020 (Supplementary Table S1). Under the threshold of 0.5% genetic distance, 35.8% (292/816) of the sequences from Hefei entered the national molecular network, 8.7 percentage points more than in Hefei's own network. The interprovincial transmission rate in Hefei was 9.8% (92/943). We evaluated factors associated with clustering and interprovincial transmission. After adjusting for other factors, the CRF55_01B and CRF07_BC-N strains were more likely to enter the network, and their interprovincial transmission rates were higher (Tables 2 and 3).

Sequences from Hefei were found in 140 clusters and formed 1268 links with sequences from 23 provinces, of which 67.5% (856/1268) were links within Hefei itself, 10.8% (137/1268) were links to other cities in Anhui province, and 21.7% (275/1268) were links to other provinces. The links between Hefei and each province are shown in Figure 3.
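As a quick arithmetic check, the reported link shares can be recomputed from the link counts given in the text. The Python sketch below uses only those three counts; the dictionary keys and variable names are illustrative.

```python
# Link counts reported in the text for sequences from Hefei.
link_counts = {
    "within Hefei": 856,
    "other cities in Anhui": 137,
    "other provinces": 275,
}

total_links = sum(link_counts.values())

# Percentage share of each destination, rounded to one decimal place.
link_shares = {
    destination: round(100 * n / total_links, 1)
    for destination, n in link_counts.items()
}

for destination, share in link_shares.items():
    print(f"{destination}: {share}% ({link_counts[destination]}/{total_links})")
```

The three counts sum to the reported total of 1268 links, and the rounded shares match the percentages in the text.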

Discussion
In this study, 816 newly reported people with HIV from 2017 to 2020 were selected to study the prevalence of HIV in Hefei, Anhui Province. We found that MSM were the predominant group among people with HIV in the Hefei study population. MSM in China are characterized by high mobility, which greatly promotes HIV transmission in different areas [15]. Moreover, MSM have always played a bridge role in the transmission of HIV among different populations [16][17][18]. MSM may have homosexual and heterosexual behaviors at the same time [7]. Therefore, especially in large cities, it is necessary to strengthen interventions among MSM. Through publicity and education, the government and relevant departments can raise awareness within MSM groups to prevent high-risk behaviors and encourage the use of pre-exposure prophylaxis (PrEP) and post-exposure prophylaxis (PEP). Various subtypes were identified in Hefei, and the main prevalent subtypes were CRF07_BC, CRF01_AE and CRF55_01B. In addition to the common epidemic subtypes, CRF67_01B and CRF68_01B were reported in Hefei for the first time. There were also many URFs in Hefei, indicating that many individuals were repeatedly infected with different HIV strains. Frequent recombination of HIV genomes can accelerate the evolution of HIV strains and promote the emergence of HIV strains with high viral fitness [19]. This also suggests that the prevalence of multiple recombinant strains in China is no longer a local epidemic problem. Therefore, it is necessary to strengthen the monitoring of HIV subtypes in some cities, such as Hefei, to avoid the emergence of more URFs, which will likely cause difficulties in prevention and control.
The largest and most active transmission cluster in Hefei was CRF55_01B, as shown in the molecular network. Molecular network analysis of national data showed that CRF55_01B had higher clustering and interprovincial transmission rates. CRF55_01B, which was first reported in 2013, is a late-emerging recombinant strain in China [20]. However, it has grown rapidly in recent years and has become the fifth most predominant HIV-1 strain in China [21]. As a strain originating among MSM, CRF55_01B is more likely to spread between large cities and across provinces, similar to other strains associated with MSM [22,23]. Our previous research showed that CRF55_01B spreads rapidly from MSM to heterosexual people, and this strain has been found in all provinces, forming transmission clusters in more than half of the provinces. Its rapid transmission may be related to the development of transportation and technology [24]. CRF07_BC-N also showed higher network access and interprovincial transmission rates. CRF07_BC strains originated among IDUs in southwest China and later spread to other provinces of China through IDUs and heterosexuals [25]. In recent years, more and more CRF07_BC strains have been detected in MSM in China [26][27][28]. A study showed that CRF07_BC-N has a greater risk of transmission than CRF07_BC-O [13]. In our unpublished studies, we elucidated the transmission trend and scale of the CRF07_BC epidemic recombinant strain from IDUs to heterosexuals and then to MSM in China. The CRF07_BC-N cluster is the main interprovincial HIV epidemic recombinant strain in China, especially in developed cities. Thus, CRF55_01B and CRF07_BC-N aggregate more readily and spread more widely than the other subtypes. The prevention and control of CRF55_01B and CRF07_BC-N in provincial capitals, such as Hefei, need to be considered.
In the national molecular network, most of the people with HIV in Hefei were linked internally, and the rest were mainly linked with first-tier regions, such as Beijing, Shanghai, and Guangdong. Owing to increasingly convenient transportation and population mobility, many studies have shown that floating populations have become an important factor in the transmission and prevalence of HIV, forming a bridge group for rapid transmission between regions [29]. Floating populations are usually people separated from their spouses, mainly young adults who are at the peak of sexual activity and generally have high-risk sexual behaviors [30]. These factors make floating populations an important group in HIV transmission. As Hefei is one of the cities in the Yangtze River Delta region, many people from this city migrate to the Jiangsu, Zhejiang, and Shanghai areas. Understanding the prevalence and molecular network transmission of HIV in these big cities can improve intervention accuracy among active transmission groups and reduce the transmission of HIV. The results of this study show that the overall rate of interprovincial transmission in Hefei is not very high. However, interprovincial transmission cannot be ignored because of the rapid movement of the population. In addition, many people who are not in the network may have contact with other provinces.
The dynamic Re for 2012-2020 was estimated. CRF07_BC-N, CRF01_AE-cluster 4, and CRF01_AE-cluster 5 were the subtypes that began circulating earlier, and their Re peaked around 2013-2016. The weighted Re in Hefei increased rapidly from 2012 to 2015, declined slowly in 2015-2016 and rapidly thereafter, which may be related to the expansion of antiviral therapy in 2016. Some studies have also evaluated the effectiveness of control measures by calculating the dynamic changes in Re [31,32]. CRF55_01B has emerged as an epidemic strain in recent years and began circulating in Hefei relatively late. However, the Re of CRF55_01B is very high, indicating that it has grown rapidly in Hefei in recent years. The Re results are consistent with the earlier finding that CRF55_01B is the largest and most active cluster in Hefei. The Re results combined with the LAg-Avidity EIA results showed that the new infection rate in Hefei is still high.
In conclusion, using people with HIV in Hefei as an example, this study showed that the prevalence of complex and diverse recombinant HIV strains, the rapid spread of the CRF55_01B and CRF07_BC-N strains, and the high Re point to the need for greater attention in similar cities with large populations. This is the first time that an analysis of the association of infected patients in one city with other provinces and cities has been conducted. In addition, the interesting question of whether the active transmission clusters in 2020 are related to COVID-19 city lockdowns can be studied further. This study has some limitations. All the people with HIV were analyzed according to the place of reporting, which may differ from their actual place of residence. Nevertheless, these data provide support for understanding the prevalence of HIV in large provincial capital cities. The data and analyses also suggest that the situation may be similar in other provinces and cities where HIV is transmitted primarily among MSM.

Study Population and Design
Newly reported people with HIV from 2017 to 2020 in Hefei were recruited by sampling. The inclusion criteria for the study subjects were as follows: (1) age ≥ 18 years; (2) patients with HIV infection who had not received any antiviral treatment from January 2017 to December 2020; and (3) patients who had completed questionnaires and signed informed consent forms (ethics approval number: X140617334).

Laboratory Tests
Plasma samples from the study subjects were collected by the laboratory personnel of the local Center for Disease Control and Prevention (CDC) and transported to the National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention (China CDC) for testing. Viral RNA was extracted using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany). A nested polymerase chain reaction (PCR) was performed to amplify the HIV pol gene fragments at positions 2253-3553 of the international standard strain HXB2 [33]. Recent infections were detected using the HIV LAg-Avidity EIA [34], which includes a preliminary screening test and a confirmation test. Samples with an ODn less than or equal to 2.0 in the preliminary screening test underwent the confirmation test. If the confirmed ODn value of the test sample was between 0.4 and 1.5, the sample was classified as recently infected.
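The two-step decision rule described above can be written as a small Python function. This is a minimal sketch of the rule as stated in the text, not the assay's official algorithm: the function name and return labels are hypothetical, treating the 0.4 and 1.5 bounds as inclusive is an assumption, and the text does not say how confirmed values outside that range are handled, so they are simply labeled "not recent" here.

```python
def classify_lag(screen_odn, confirm_odn=None):
    """Classify a sample from its LAg-Avidity ODn values.

    screen_odn:  ODn from the preliminary screening test.
    confirm_odn: ODn from the confirmation test, or None if not yet run.
    """
    # Samples above the screening cut-off are not sent for confirmation.
    if screen_odn > 2.0:
        return "not recent"
    # Screening ODn <= 2.0: a confirmation test is required.
    if confirm_odn is None:
        return "needs confirmation"
    # Assumption: both bounds are inclusive.
    if 0.4 <= confirm_odn <= 1.5:
        return "recent"
    return "not recent"


# Toy values, not study data.
print(classify_lag(3.1))
print(classify_lag(1.8))
print(classify_lag(1.8, confirm_odn=1.2))
print(classify_lag(1.8, confirm_odn=1.9))
```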

HIV Sequence Acquisition and Subtyping
Sequences obtained by Sanger sequencing were spliced using Sequencher 4.10.1 (Gene Codes Corporation, Ann Arbor, MI, USA) and aligned using MAFFT 7.037. Sequence quality control was performed with the WHO HIVDR QC TOOL (https://sequenceqc.bccfe.ca/who_qc, accessed on 6 July 2021). FastTree 2.1 was used to construct a phylogenetic tree for subtype identification [35]. The nucleotide substitution model was GTR + G + I, and support values of the nodes were calculated with a Shimodaira-Hasegawa-like test. Clusters with a node support value higher than 0.90 (90%) were defined as the same subtype and subclusters. The reference sequences included the major international epidemic strains A-D, F-H, J, and K, and the major epidemic recombinant strains from HIV Databases (https://www.hiv.lanl.gov/content/index, accessed on 7 July 2021) and our laboratory. Maximum likelihood trees were imported into FigTree v1.4.3. Recombination breakpoints of unique recombinant forms (URFs) were determined using jpHMM at GOBICS (http://jphmm.gobics.de/, accessed on 7 July 2021) and RIP (https://www.hiv.lanl.gov/content/sequence/RIP/RIP.html, accessed on 7 July 2021) in the HIV sequence database.

Phylogenetic Analysis and HIV Molecular Network Construction
A molecular network was constructed using HIV-TRACE [36]. Aligned pol sequences were used to calculate pairwise genetic distances using the Tamura-Nei 93 model [37]. All sequences were longer than 1000 bp, and ambiguous nucleotides accounted for less than 5%. Each patient in the molecular network was represented by a node, and nodes were linked to each other if their pairwise genetic distance was within 0.5% substitutions per site. A threshold of 0.5% was selected to identify transmission relationships over two to three years. HIV pol gene region sequences from across China were collected from the HIV sequence database of the Los Alamos National Laboratory (LANL) and China CDC databases to establish dataset B. Dataset B was used to analyze interprovincial transmission from Hefei. In this study, an interprovincial cluster was defined as a cluster containing infections from at least two provinces. Patients in interprovincial clusters were considered to be involved in interprovincial transmission. The interprovincial transmission rate was calculated as the number of patients in interprovincial clusters divided by the total sample size. The map was drawn using ArcMap 10.2. A Sankey diagram is a specific type of flowchart in which the width of the extended branches corresponds to the size of the data flow. In this study, the branch width of the Sankey diagram represents the number of links between Hefei and other provinces in the molecular network. The Sankey diagram was drawn using the networkD3 package in R.
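The network and rate definitions above can be sketched in Python. This is not HIV-TRACE and not the authors' code: it assumes the pairwise TN93 distances have already been computed and are supplied as (id_a, id_b, distance) tuples, and the function names, sample identifiers, and province labels are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 0.005  # 0.5% substitutions per site, as stated in the text


def build_clusters(distances, threshold=THRESHOLD):
    """Link pairs within the threshold and return connected components."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, d in distances:
        if d <= threshold:  # only linked sequences enter the network
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())


def interprovincial_rate(clusters, province, focal, total_samples):
    """Focal-province patients in clusters spanning at least two provinces,
    divided by the total sample size."""
    count = 0
    for cluster in clusters:
        if len({province[s] for s in cluster}) >= 2:
            count += sum(1 for s in cluster if province[s] == focal)
    return count / total_samples


# Toy data, not study data.
distances = [
    ("HF1", "HF2", 0.003),
    ("HF2", "SH1", 0.004),
    ("HF3", "HF4", 0.002),
    ("HF5", "JS1", 0.012),  # above the threshold: no link
]
province = {
    "HF1": "Anhui", "HF2": "Anhui", "HF3": "Anhui", "HF4": "Anhui",
    "HF5": "Anhui", "SH1": "Shanghai", "JS1": "Jiangsu",
}
clusters = build_clusters(distances)
rate = interprovincial_rate(clusters, province, "Anhui", total_samples=5)
print(len(clusters), rate)
```

In the toy input, two clusters form; only the cluster containing the Shanghai sequence spans two provinces, so two of the five focal-province samples count toward the interprovincial rate.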

Estimating the Effective Reproductive Number (Re)
The Re of each subtype was estimated using the birth-death skyline (BDSKY) serial model in BEAST v2.6.0. Re is calculated as the median ratio of birth and death rates. The BDSKY model employs a piecewise constant birth-death-sampling process to compute the probability density of a phylogeny. In this model, a branching event in the sample tree corresponds to a "birth", each tip in the tree corresponds to a sampling event, and a death is an unobserved event, i.e., an unsampled recovery or death. Each of these three event types occurs with its own characteristic rate in each interval of the piecewise function. This enables the estimation of epidemiological parameters such as Re [38,39]. We used the bdskytools package in R to plot the results of BDSKY analyses. Re is defined as the average number of secondary infections caused by an infected person at a specific point in time during an epidemic, when susceptibility in the population has decreased. This value is often used to describe temporal changes of an epidemic in a population, and an Re greater than 1 indicates the growth of an epidemic.
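The summary step described above, taking the median ratio of birth and death rates and checking whether Re exceeds 1, can be sketched in Python for a single time interval. This does not reproduce BEAST or bdskytools: the paired posterior samples are toy values, the function name is hypothetical, and the equal-tailed 95% interval used here is an assumption, since the text does not state how the reported BCI was computed.

```python
import statistics


def summarize_re(birth_rates, death_rates):
    """Summarize Re for one interval from paired posterior samples.

    Returns the median of the per-sample birth/death ratios, an
    equal-tailed 95% interval (an assumption of this sketch), and a
    flag for epidemic growth (median Re > 1).
    """
    ratios = [b / d for b, d in zip(birth_rates, death_rates)]
    median = statistics.median(ratios)
    # n=40 yields cut points every 2.5%; the first and last bracket 95%.
    cuts = statistics.quantiles(ratios, n=40, method="inclusive")
    return {
        "median": median,
        "lower": cuts[0],
        "upper": cuts[-1],
        "growing": median > 1,
    }


# Toy posterior samples, not study output.
birth = [2.0, 3.0, 4.0]
death = [1.0, 1.0, 1.0]
summary = summarize_re(birth, death)
print(summary)
```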
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/pathogens10121554/s1, Table S1: The sequence distribution of each province in dataset B.
Author Contributions: S.Z. was responsible for the study design, experiments, analysis, and writing of the article. J.W. was responsible for sample and information collection, and revision of the article. L.L. (Lei Liu) and C.S. were responsible for the experiments and analysis. M.G. was responsible for the analysis and revision of the article. Z.H., Y.L., and H.W. were responsible for sample collection and epidemiological investigation. Y.F., L.L. (Lingjie Liao), and Y.S. were responsible for revising the article. Y.R. and H.X. were responsible for guiding the study and revising the article. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest: The authors declare no conflict of interest.