High Genetic Diversity of HIV-1 and Active Transmission Clusters among Male-to-Male Sexual Contacts (MMSCs) in Zhuhai, China

Monitoring genetic diversity and recent HIV infections (RHIs) is critical for understanding HIV epidemiology. Here, we report HIV-1 genetic diversity and RHIs in blood samples from 190 HIV-positive MMSCs in Zhuhai, China. MMSCs with newly reported HIV were enrolled from January 2020 to June 2022. A nested PCR was performed to amplify the HIV polymerase gene fragments at HXB2 positions 2604–3606. We constructed a genetic transmission network at both 0.5% and 1.5% distance thresholds using the Tamura-Nei93 model. RHIs were identified using a recent infection testing algorithm (RITA) combining the limiting antigen avidity enzyme immunoassay (LAg-EIA) with clinical data. The results revealed that 19.5% (37/190) were RHIs and 48.4% (92/190) were CRF07_BC. Two clusters were identified at a 0.5% distance threshold: one comprised long-term infections with CRF07_BC, and the other comprised recent infections with CRF55_01B. We identified a total of 15 clusters at a 1.5% distance threshold. Of these, nine were infected with the CRF07_BC subtype, and RHIs accounted for 38.8% (19/49) of clustered participants, distributed across eight genetic clusters. We identified a large active transmission cluster (n = 10) infected with the variant CRF79_0107. The multivariable logistic regression model showed that participants in clusters were more likely to be RHIs (adjusted OR: 3.64, 95% CI: 1.51~9.01). The RHI algorithm can help to identify recent or ongoing transmission clusters where prevention tools are most needed. Prompt public health measures are needed to contain the further spread of active transmission clusters.


Introduction
HIV/AIDS remains an important global issue, with an estimated 1.3 million new infections in 2022 [1]. Key populations such as male-to-male sexual contacts (MMSCs) are disproportionately affected by the HIV epidemic and have a higher risk of infection because of a higher level of risk behavior and/or their vulnerable position in society [2]. In China, the estimated overall national prevalence of HIV among MMSCs from 2001 to 2018 was 5.7% (95% CI: 5.4-6.1%). HIV prevalence among MMSCs has shown an increasing trend over time [3].
Detecting and addressing HIV-infected MMSCs with epidemiological connections is a crucial step toward reducing new HIV infections among MMSCs. HIV transmission clusters, particularly those with recently infected individuals, represent recent or ongoing HIV transmissions in a population. Prompt implementation of prevention strategies targeting active clusters can curb further transmission [4]. HIV-1 undergoes significant diversification and continuous molecular evolution during its global spread [5][6][7]. In recent years, novel circulating recombinant forms (CRFs) and unique recombinant forms (URFs) have been identified in China. At least 21 CRFs have been reported as prevalent in China since 2013 [8]. These forms consist of gene segments derived from different HIV-1 genotypes and are predominantly found in regions with a high prevalence of infection and genetic diversity [9,10]. Their emergence poses additional challenges to the prevention and control of HIV epidemics. Recombination is rampant within chronic infections and in viral rebound upon antiretroviral therapy (ART) interruption and is an effective way to enable rapid escape from immune selection pressure [11]. Mutations in the specific target of drug action can significantly diminish the efficiency of the drug and weaken the effectiveness of ART in blocking HIV transmission. Zhuhai, located in southern China, is a metropolitan region with high HIV-1 diversity. A previous survey showed that the main HIV-1 genotypes among MMSCs in Zhuhai, China, were CRF07_BC (43.3%), CRF01_AE (36.1%) and CRF55_01B (13.4%) [12].
The identification of early HIV infection cases enables us to promptly target active transmission clusters, prioritizing the implementation of prevention tools where they are most needed. In the past years, several laboratory-based assays have been tested to identify early HIV infection according to the natural serological responses after infection [13]. Recently, the World Health Organization (WHO) and the Joint United Nations Programme on HIV/AIDS (UNAIDS) have recommended using recent infection testing algorithms (RITAs), which integrate HIV recency tests with multiple routinely used clinical assays, to improve the accuracy of identifying recent HIV infection [14]. In other words, a RITA is a combination of laboratory tests used to classify an HIV infection as recent or long-term. The limiting antigen avidity enzyme immunoassay (LAg-EIA) is one of the widely used serological assays to identify early HIV infection (EHI) [15][16][17]. CD4+ T cell count and the viral load (VL) test are the other two clinical assays used in the majority of RITAs to reclassify recent infection [14]. Compared with serological assays alone, RITAs have been shown to classify recent infection cases accurately and to reduce the false recent rate (FRR) effectively [14].
Therefore, the major objective of this study was to conduct comprehensive molecular surveillance of HIV-1 genotypes and recombinants and to identify the current status of RHIs among MMSCs in Zhuhai. This study aims to provide scientific insights for the development of effective HIV-1 transmission control and prevention strategies in southern China.

Study Design and Population
This implementation study was conducted in Zhuhai, southern China, a city with an estimated 17,000 MMSCs and an HIV prevalence of 7% among MMSCs [18]. There were approximately 150 newly reported HIV-positive cases in the MMSC population in Zhuhai in 2018 [19]. MMSCs with newly reported HIV were enrolled in Zhuhai from January 2020 to June 2022. The inclusion criteria of the study were (1) age ≥ 18 years and (2) new HIV diagnosis without any antiviral treatment. Demographic characteristics (age, ethnicity, marital status, education, occupation, etc.), CD4+ T cell count and the viral loads of participants were extracted from the electronic follow-up records at Zhuhai Center for Disease Control and Prevention (Zhuhai CDC). The study was approved by the institutional review board (IRB) at Zhuhai CDC (ethics approval number No. 2022.11).

PCR Amplification and Recency Testing
HIV diagnostic plasma samples were collected by physicians at diagnostic clinics and transported to Zhuhai CDC for further testing. Viral RNA was extracted from 140 µL plasma using the Roche High Pure Viral RNA Kit (Roche), supplied by Shanghai Huirui Biotechnology Co., Ltd., Shanghai, China. A nested polymerase chain reaction (PCR) was performed to amplify the HIV polymerase (pol) gene fragments at HXB2 positions 2604-3606, covering the 528 amino acids of protease and the first 528 amino acids of reverse transcriptase codons. The primers are shown in Table 1. The pol fragments were sequenced by Sanger sequencing. To identify recent HIV infections, we used a RITA combining a limiting antigen avidity enzyme immunoassay (LAg-EIA, supplied by Beijing Kinghawk Pharmaceutical Co., Ltd., Beijing, China) recency assay with clinical data (CD4+ T cell count ≥200 cells/µL and viral load ≥1000 copies/mL). The LAg-EIA recency assay includes a preliminary screening test and a confirmation test. Samples with a normalized optical density (ODn) less than or equal to 2.0 in the preliminary screening test underwent the confirmation test, and those with an ODn value less than 1.5 in the confirmation test were considered recent infections by LAg-EIA. The mean duration of recent infection for LAg-EIA is 130 days, and the FRR is 2.3% among MMSCs in China [20]. The FRR of LAg-EIA combined with viral load ≥1000 copies/mL is 4% in Kenya [21].
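The decision rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text (screening ODn ≤ 2.0, confirmation ODn < 1.5, CD4+ T cell count ≥200 cells/µL, viral load ≥1000 copies/mL); the `Sample` record, its field names and the function `classify_rita` are hypothetical and are not part of the assay kit or of the study's software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """Hypothetical record holding the inputs of the RITA for one participant."""
    screening_odn: float            # LAg-EIA preliminary screening ODn
    confirm_odn: Optional[float]    # LAg-EIA confirmation ODn (None if not run)
    cd4: float                      # CD4+ T cell count, cells/uL
    viral_load: float               # HIV RNA, copies/mL


def classify_rita(s: Sample) -> str:
    """Return 'recent' or 'long-term' using the thresholds given in the text."""
    # Screening ODn above 2.0: not sent to confirmation, classified long-term.
    if s.screening_odn > 2.0:
        return "long-term"
    if s.confirm_odn is None:
        raise ValueError("confirmation test required when screening ODn <= 2.0")
    # Confirmation ODn must be below 1.5 to be LAg-recent.
    if s.confirm_odn >= 1.5:
        return "long-term"
    # Clinical criteria reclassify LAg-recent samples that look long-standing.
    if s.cd4 < 200 or s.viral_load < 1000:
        return "long-term"
    return "recent"
```

For example, a sample with screening ODn 1.2, confirmation ODn 0.9, a CD4+ count of 450 cells/µL and a viral load of 25,000 copies/mL would be classified as recent, whereas the same serology with a CD4+ count of 150 cells/µL would be reclassified as long-term.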

Phylogenetic and Clustering Analysis
We constructed a genetic transmission network at both 0.5% and 1.5% distance thresholds using the Tamura-Nei93 (TN93) model in HyPhy 2.2.4 with the pol region sequences [28,29]. Each patient in the molecular network was represented by a node, and two nodes were linked if their pairwise genetic distance was within the threshold (0.5% or 1.5% substitutions per site). Network visualization was undertaken using Cytoscape 3.7.0 [30]. We performed the analysis at different distance thresholds (ranging from 0.1% to 3.25%) to determine the optimal threshold (i.e., the distance at which the maximum ratio of the number of clusters to distance is detected) using MicrobeTrace (https://github.com/CDCgov/MicrobeTrace/wiki (accessed on 26 August 2023)) [31]. We ranked the edgecounts of all nodes within the network and defined nodes with edgecounts greater than or equal to the upper quartile (i.e., edgecount ≥ 5) as having high linkage. Nodes with high linkage were defined as individuals with a high risk of transmission. CRF79_0107 sequences and their information were downloaded from LANL (https://www.hiv.lanl.gov/components/sequence/HIV/geo/geo.html (accessed on 26 June 2023)) [32].
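To make the distance measure concrete, the following Python sketch computes a TN93 distance for one pair of aligned sequences using the standard closed-form Tamura-Nei estimator. It is not the HyPhy implementation used in the study: as simplifying assumptions, base frequencies are estimated from the two sequences themselves, and sites with gaps or ambiguous bases are skipped. The function name `tn93_distance` is illustrative.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def tn93_distance(seq1: str, seq2: str) -> float:
    """TN93 distance (substitutions per site) between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where both sequences carry an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    counts = Counter(base for pair in pairs for base in pair)
    f = {b: counts[b] / (2 * n) for b in "ACGT"}
    if min(f.values()) == 0:
        raise ValueError("all four bases must be present to estimate frequencies")
    g_r = f["A"] + f["G"]   # purine frequency
    g_y = f["C"] + f["T"]   # pyrimidine frequency
    # Proportions of A<->G transitions, C<->T transitions, and transversions.
    p1 = sum(1 for a, b in pairs if a != b and {a, b} == PURINES) / n
    p2 = sum(1 for a, b in pairs if a != b and {a, b} == PYRIMIDINES) / n
    q = sum(1 for a, b in pairs if (a in PURINES) != (b in PURINES)) / n
    k_r = 2 * f["A"] * f["G"] / g_r
    k_y = 2 * f["C"] * f["T"] / g_y
    k_v = 2 * (g_r * g_y - f["A"] * f["G"] * g_y / g_r
               - f["C"] * f["T"] * g_r / g_y)
    arg1 = 1 - p1 / k_r - q / (2 * g_r)
    arg2 = 1 - p2 / k_y - q / (2 * g_y)
    arg3 = 1 - q / (2 * g_r * g_y)
    if min(arg1, arg2, arg3) <= 0:
        raise ValueError("sequences too divergent for the TN93 correction")
    return -k_r * math.log(arg1) - k_y * math.log(arg2) - k_v * math.log(arg3)
```

Two sequences would be linked at the 1.5% threshold when `tn93_distance(a, b) <= 0.015`. Whether the study treated the threshold as inclusive is not stated, so the `<=` is an assumption.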
To investigate interprovincial and intraprovincial clusters, we downloaded the sequences with the highest homology to those in this study using HIV BLAST (https://www.hiv.lanl.gov/content/sequence/BASIC_BLAST/basic_blast.html (accessed on 26 August 2023)) [33]: the top 10 sequences for those identified as CRF79_0107 in this study and the top 5 sequences for other variants. Location data were gathered, and sequences with 99% or higher HIV BLAST identity were included. When only the province of a sequence was provided, the coordinates of the provincial capital city were used. These sequences and the sequences of this study were aligned using MAFFT and edited in AliView [34]. Then, we constructed a genetic transmission network at a 1.5% distance threshold using the TN93 model (R package "ape") and MicrobeTrace.
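As an illustration of how a distance threshold turns pairwise distances into clusters, the sketch below links any two sequences whose distance is at or below the threshold and reports the connected components. It also tabulates the number of clusters and the cluster-to-distance ratio over a range of thresholds, following the selection criterion described above. The function names and the toy distances are hypothetical; MicrobeTrace and Cytoscape were the tools actually used.

```python
from collections import defaultdict


def clusters_at_threshold(distances, threshold):
    """Connected components (size >= 2) of the graph with edges d <= threshold.

    `distances` maps a pair of sequence IDs (a, b) to their genetic distance.
    """
    adjacency = defaultdict(set)
    for (a, b), d in distances.items():
        if d <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every node reachable from `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def sweep_thresholds(distances, thresholds):
    """For each threshold, return (number of clusters, clusters / threshold)."""
    table = {}
    for t in thresholds:
        n_clusters = len(clusters_at_threshold(distances, t))
        table[t] = (n_clusters, n_clusters / t)
    return table


# Toy distances (invented for illustration, not study data).
toy = {("a", "b"): 0.004, ("b", "c"): 0.012,
       ("d", "e"): 0.014, ("e", "f"): 0.030}
print(clusters_at_threshold(toy, 0.005))
print(sweep_thresholds(toy, [0.005, 0.015, 0.035]))
```

In this toy example, a single two-member cluster is found at 0.5%, while at 1.5% that cluster grows to three members and a second cluster appears, which mirrors how tighter thresholds isolate only the most closely related sequences.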

Statistical Analysis
Descriptive analysis was performed to describe the socio-demographics and gene subtypes of participants using R 4.1.2. R packages, including "mapdata", "maps", "maptools", "rgdal", "ggplot2", "plyr", "cowplot" and "sf", were used to plot and display maps depicting the distribution of samples. R package "ape" was used to calculate genetic distance using the TN93 model. R packages including "tidyverse" and "dplyr" were used for data manipulation and visualization. R package "broom" was used for multivariable analysis. Categorical variables were described by frequency (n) and percentage (%). Statistical comparisons of recent and long-term infections were performed using the Chi-square test or Fisher's exact test. All tests were two-tailed, and values of p < 0.05 were considered statistically significant. Preliminary analyses were conducted to fit unadjusted and adjusted logistic regression models of MMSC characteristic covariates on the RITA outcome. For these models, Y was the outcome of RITA, with Y = 1 for recent infection and Y = 0 for long-term infection.

Molecular Network Analysis and Active Transmission Clusters
According to previous studies and the analysis of distance thresholds ranging from 0.1% to 3.25%, a threshold of 1.5% was optimal for detecting transmission clusters (Figure 2A) [35,36]. Using the threshold of 0.5% genetic distance, 2 clusters were identified, and 3.7% (7/190) of the participants could be linked to a genetic cluster. Among the 2 clusters, 1 was infected with CRF07_BC, and the other with CRF55_01B. Using the threshold of 1.5% genetic distance, we identified a total of 15 clusters (median size 2, range: 2-10), and 25.8% (49/190) of the participants could be linked to a genetic cluster. Among the 15 clusters, 9 were infected with the CRF07_BC subtype, 4 with CRF55_01B and 2 with CRF79_0107. RHIs were found in 38.8% (19/49) of the clustered participants, distributed in 8 genetic clusters, i.e., active transmission clusters. The largest cluster we identified (cluster #1) had 10 members, and 60% of them were recently infected. Of note, this cluster was infected with CRF79_0107. The molecular network diagram of Zhuhai is shown in Figure 2.
To further understand the epidemic of CRF79_0107, we downloaded all CRF79_0107 sequences (41 sequences) and their basic information, including accession, blood collection year, country, province and city, from LANL. All CRF79_0107 sequences were detected in China, and the first CRF79_0107 sequence was identified in 2017 from Shanxi, China. The earliest blood collection year of the CRF79_0107 sequences identified in Shanxi was 2015. There were three male cases identified as CRF79_0107 in Shanxi in 2015, whose risk factors were MMSC (2/3) or heterosexual contact (1/3). In 2021, a female case whose blood was collected in 2011 from Shenzhen, China, was identified as CRF79_0107. The CRF79_0107 sequences were mainly identified in the north of China (Figure 3). In southern China, only Shenzhen and Zhuhai reported CRF79_0107. The distribution of sex and risk factors of CRF79_0107 is shown in Figure 4. Men accounted for 65% (39/60) of CRF79_0107 sequences in the HIV database and this study, and 71.8% (28/39) of these men were MMSCs.
A total of 110 sequences were included in the analysis to investigate interprovincial and intraprovincial clusters. At a 1.5% genetic distance threshold, 15 clusters were identified. Intraprovincial clusters revealed connections between sequences from Zhuhai identified as CRF07_BC, CRF55_01B or CRF79_0107 and sequences from other cities in Guangdong province (Figure 5A-D). On the other hand, interprovincial clusters demonstrated that sequences from Zhuhai were also connected, either indirectly or directly, with other provinces in China (Figure 5E,F). Within the HIV transmission network in Zhuhai, some sequences from Zhuhai were observed as singletons, indicating unique transmission events within the city (Figure 2C). However, these sequences clustered and shared links with sequences from other cities in Guangdong province or other provinces (Figure 5G).
We further explored the factors associated with recent HIV infection at diagnosis using a multivariable logistic regression model (Table 4). Participants in clusters (compared with those not in clusters, adjusted OR [aOR]: 3.64, 95% CI: 1.51~9.01) and students (aOR: 6.47, 95% CI: 1.71~25.66) were more likely to be recently infected. Notably, we observed a trend toward an association between the variant CRF79_0107 and recent infection (p = 0.057). These findings suggest that there are ongoing active HIV transmissions among young students, particularly of the variant CRF79_0107.
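For orientation, the crude (unadjusted) odds ratio for clustering versus recent infection can be approximated from the counts given in the text. The 2×2 table below is back-calculated under the assumption that the 37 RHIs among 190 participants include the 19 RHIs among the 49 clustered participants; it may differ from the data underlying Table 4. The result is not the adjusted estimate from the multivariable model, and the function name `odds_ratio_wald` is illustrative.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald (Woolf) 95% confidence interval.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts back-calculated from the text (an assumption, see above):
clustered_recent = 19                    # RHIs within clusters
clustered_long_term = 49 - 19            # clustered, long-term infection
unclustered_recent = 37 - 19             # RHIs outside clusters
unclustered_long_term = (190 - 49) - unclustered_recent

crude_or, lo, hi = odds_ratio_wald(clustered_recent, clustered_long_term,
                                   unclustered_recent, unclustered_long_term)
print(f"crude OR = {crude_or:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Under these assumed counts the crude odds ratio is about 4.3, in the same direction as the adjusted estimate reported above; the difference between the two reflects adjustment for the other covariates in the model.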
We defined nodes with an edgecount ≥5 as those with high linkage. Fisher's exact test showed that age, ethnicity, marriage and subtype were significant factors associated with high linkage (Table 5). MMSCs with high linkage were more likely to be younger, single and non-Han.
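The high-linkage definition can be illustrated with a short Python sketch that counts the edges of each node and flags nodes at or above the upper quartile of edgecounts. The edge list is invented for illustration. Quartile conventions differ between software packages, so the `inclusive` method of `statistics.quantiles` used here is an assumption and need not match the value of 5 obtained in the study.

```python
import statistics
from collections import Counter


def high_linkage_nodes(edges):
    """Return (set of high-linkage nodes, upper-quartile edgecount).

    `edges` is an iterable of (node_a, node_b) links from the network.
    Only nodes that appear in at least one link are ranked.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Third cut point of the quartiles = upper quartile (Q3).
    q3 = statistics.quantiles(degree.values(), n=4, method="inclusive")[2]
    high = {node for node, k in degree.items() if k >= q3}
    return high, q3


# Invented toy network: one hub linked to five nodes, plus one extra link.
toy_edges = [("h", "a"), ("h", "b"), ("h", "c"),
             ("h", "d"), ("h", "e"), ("a", "b")]
high, q3 = high_linkage_nodes(toy_edges)
print(sorted(high), q3)
```

In the toy network the hub and the two nodes that share an extra link are flagged, because their edgecounts reach the upper quartile of the observed distribution.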

Discussion
This study describes the molecular epidemiology of HIV-1 among recently diagnosed MMSCs in Zhuhai, a metropolitan area in southern China. The extensive genetic diversity observed among HIV-1 strains suggests multiple, complex introductions of the virus among MMSCs in the Zhuhai region. Guangzhou, Shenzhen and Zhuhai are three metropolitan areas with a high percentage of migrant population in the Guangdong-Hong Kong-Macao Greater Bay Area, and they are the most popular working and living destinations for MMSCs. The most prevalent HIV-1 variant circulating among MMSCs in Guangzhou was CRF07_BC (41.6%), followed by CRF01_AE (30.0%) and CRF55_01B (12.8%) [37], while in the neighboring city of Shenzhen, the most prevalent variant was CRF07_BC (41.12%), followed by CRF01_AE (35.14%) and CRF55_01B (11.23%) [38]. Our study suggested that CRF07_BC was the most common HIV-1 variant circulating among MMSCs in Zhuhai, which is consistent with the findings in Guangzhou and Shenzhen, whereas the proportions of other variants such as CRF55_01B and CRF01_AE differ among these cities.
In a previous study, we observed that CRF79_0107 accounted for only 0.09% (9/10378) of newly diagnosed HIV-1 infections in Shenzhen [38]. However, the outbreak of CRF79_0107 identified in this study has raised concerns regarding the rapid transmission of this specific variant. CRF79_0107 was first identified in Shanxi, China, in 2017 [39], and in 2021, a sequence from a blood specimen collected from a female participant in Shenzhen was also identified as CRF79_0107 [38]. Our study identified a total of 19 CRF79_0107 sequences. The emergence of CRF79_0107 increases the complexity of the HIV epidemic. From the analyses of CRF79_0107 data from LANL, we suspect that CRF79_0107 initially spread through heterosexual contact and was later introduced to bisexual males, eventually becoming prevalent among MMSCs. Furthermore, CRF79_0107 was predominantly found in regions characterized by robust economic development and notable seasonal human migration. Consequently, CRF79_0107 has the potential to spread to the MMSC population in surrounding areas, akin to the pattern observed with CRF55_01B [40]. In our study, participants identified as CRF79_0107 within the network accounted for 46.2% (6/13) of those with high linkage and 16.7% (6/36) of those with low linkage. Participants with high linkage were defined as individuals with a high risk of transmitting HIV. Therefore, these results suggest that CRF79_0107 might be transmitted rapidly in Zhuhai, although the number of nodes within the network was limited.
In our study, we identified a total of 15 transmission clusters comprising 49 individuals, including 5 clusters with more than 2 members, using the 1.5% threshold. Using a more specific threshold of 0.5%, we only identified 2 clusters. This finding suggests that the majority of the participants diagnosed were not identified within a known transmission cluster. The low cluster rate indicated that early treatment might have played an important role in the epidemic of HIV in the study population because of "U=U" (undetectable equals untransmissible). The initiation of ART in early HIV infection regardless of CD4+ count provided net benefits for individual patients in reducing the risk of viral transmission [41]. From 2014 to 2021, the percentage of HIV-infected persons receiving ART increased from 59.0% to 96% in Zhuhai [42][43][44]. Viral load was higher in people in larger clusters and with increased network connectivity [45], but the proportion of cluster members with unsuppressed viral load showed only a weak association (HR, 1.35 (95% CI, 0.98-1.86)) with incident cluster growth [46]. The data indicated a correlation between elevated viral loads and increased transmission rates among patients. However, once these patients were identified, the transmission rate tended to decline. This was attributed to the prompt initiation of ART in these individuals, which typically suppresses viral load to undetectable levels. Thus, the majority of participants in this study appeared to be singletons in the transmission network. However, the presence of recent infection or active transmission networks remains significant, emphasizing the importance of enhancing early detection and treatment of HIV.
Our study found that recent infections accounted for 38.8% (19/49) of the clustered participants, distributed in 8 genetic clusters, i.e., active transmission clusters (at the 1.5% cutoff). Additionally, 13.1% of cases were recent infections that did not fall into any cluster. This observation suggests the presence of unidentified transmission clusters in Zhuhai or the possibility that these clusters span multiple cities. RHIs are often linked to active transmission networks. The quick identification of, and intervention for, recently infected individuals is crucial in HIV monitoring and prevention [47,48]. Within an active transmission cluster, an RHI is an individual who contracted HIV either from a diagnosed patient with uncontrolled viral load or from an individual living with HIV whose status has not been officially confirmed. Immediate treatment for individuals with a new HIV diagnosis is available in Zhuhai; thus, it is likely that a diagnosed patient in Zhuhai is on ART. There are two potential reasons why the viral load of a patient under ART might remain unsuppressed: (1) the patient has recently initiated ART, or (2) the patient has treatment failure because of poor adherence or drug resistance mutations [49]. In the first situation, it is important to provide patients with information regarding the relationship between viral load and HIV transmission [50]. It is advisable for them to avoid unprotected sexual activities until their viral load is properly suppressed. It should be noted that these observations are based on findings from another study and not our own results. In the second scenario, close monitoring of viral load and drug resistance testing by healthcare professionals are essential to ensure the efficacy of ART and the prevention of potential transmission [51]. Public health measures are needed to identify individuals with undiagnosed HIV, particularly when there is suspicion that they are linked to RHIs. Therefore, active transmission clusters and individuals with recent infections need to be prioritized for prompt public health intervention to contain further HIV transmission.
However, it remains challenging to use cluster detection to prospectively identify and interrupt incident transmission events in epidemics [46]. Cluster formation requires a sufficient number of newly infected individuals in the population, which may take time to reach detectable levels during the early stages of an epidemic with low infection rates. This time delay impedes the prompt identification of, and response to, incident transmission events. Additionally, cluster detection relying solely on genetic sequence data may not provide a comprehensive understanding of transmission dynamics. Augmenting genetic data with additional epidemiological information, such as behavioral characteristics, contact networks and demographic data, is often necessary for accurate identification and tracing of incident transmission events. The absence of such data can hinder the effectiveness of cluster detection approaches.
On the other hand, the number of HIV-infected people in the interprovincial transmission clusters exceeded that in the intraprovincial clusters, suggesting that the migration of people among provinces and cities plays an important role in shaping the AIDS epidemic in China [52]. Therefore, it is crucial to address the mobility of the MMSC population and adopt strategies such as digital-based HIV self-testing to engage with molecular clusters or risk networks to prevent new infections [18].
The key indicator of the success of targeted prevention approaches is HIV incidence, which can be estimated by HIV recency tests. WHO and UNAIDS recommend using RITAs, which integrate HIV recency tests with multiple routinely used clinical assays, to improve the accuracy of identifying recent HIV infection [14]. We applied the RITA approach to identify RHIs in our study, but it has several limitations. The manufacturer of the LAg-Avidity assay recommends excluding individuals who are receiving ART, are elite suppressors or have AIDS (CD4 cell count < 200 cells/µL) from incidence surveys. A previous study indicated that those exclusions did not remove all sources of assay misclassification among individuals with long-standing HIV infection [53]. RITAs can identify RHIs in population-based epidemiological studies. RHIs are of particular interest in prevention efforts because recently infected individuals are more likely to be actively transmitting the virus [54,55]. In addition to RITAs, genetic sequencing plays a crucial role in hotspot identification. Sequence data help identify clusters of infections, indicating potential transmission networks and hotspots. By combining RITAs and genetic sequencing, researchers can pinpoint geographic areas or populations with a higher concentration of RHIs or genetically related viral strains [56]. These areas or populations may represent key hotspots for targeted prevention efforts. By focusing resources, interventions and education on these hotspots, public health officials can allocate their efforts effectively to reduce the spread of the infection and implement prevention strategies tailored to the specific needs of the affected communities.
Our study has a few limitations. First, the samples were collected by convenience and did not represent a random sample. In addition, we were unable to obtain sequences from some participants. Thus, an unknown degree of sampling bias might exist in our study. Second, we did not identify significant differences between the recent infection group and the long-term infection group at baseline, indicating that the sample size of this study might be small. Lastly, we used only partial genetic sequences instead of complete genomes for the phylogenetic analysis, which might be suboptimal in areas with high HIV genetic variation.

Figure 1. Maximum-likelihood phylogenetic tree of 190 newly reported MMSCs with HIV in Zhuhai from 2020 to 2022.


Viruses 2023, 16

Figure 2. (A) Selection of the optimal distance threshold. Curves represent the number of clusters, the number of nodes, the number of links, or the ratio of the number of clusters to distance at different distance thresholds. (B,C) The molecular transmission cluster diagram of newly reported MMSCs with HIV in Zhuhai from 2020 to 2022. ○: 2020. □: 2021. △: 2022. Red nodes represent recent infections and pink nodes represent long-term infections. Green borders represent CRF07_BC, blue borders represent CRF55_01B, and purple borders represent CRF79_0107. (B) Using the threshold of 0.5% genetic distance. (C) Using the threshold of 1.5% genetic distance. Cluster #1 at a 0.5% distance threshold was from cluster #2 at a 1.5% distance threshold. Cluster #2 at a 0.5% distance threshold was from cluster #14 at a 1.5% distance threshold.


Figure 3. Geographic distribution of CRF79_0107 in the HIV database and this study over time. Red dots represent cases of HIV infection.


Figure 4. Distribution of sex and risk factors of cases identified as CRF79_0107 in the HIV database and this study over time.


Figure 5. Interprovincial and intraprovincial links between Zhuhai and other provinces in China. (A–D) Links between sequences from other cities in Guangdong and different CRFs from Zhuhai. (E) Links between sequences from other provinces and those from Zhuhai. (F) Links between sequences from China retrieved from NCBI and those from Zhuhai. (G) All sequences included in the HIV-1 transmission network. Colored polygons mark clusters containing more than two nodes. Zhuhai, Guangzhou, Shenzhen, and Jiangmen are cities in Guangdong province.


Table 1. List of the PCR primers used in the study.

Table 3. The association between viral load and clustering of 190 HIV-positive MMSCs.

Table 4. Univariable and multivariable logistic regression analysis of recent infection among HIV-positive MMSCs.
* Clusters were identified at a 1.5% distance threshold.

Table 5. Factors associated with high linkage among clustered individuals within networks.