Article

Full-Genome Hepatitis B Virus Genotyping: A Juxtaposition of Next-Generation and Clone-Based Sequencing Approaches—Comparing Genotyping Methods of Hepatitis B Virus

1
Guangxi Zhuang Autonomous Region Center for Disease Prevention and Control, Guangxi Key Laboratory for the Prevention and Control of Viral Hepatitis, Nanning 530028, China
2
The Animal Husbandry Research Institute of Guangxi Zhuang Autonomous Region, Nanning 530010, China
3
Drug and Food Vocational College, Guangxi Vocational University of Agriculture, Nanning 530007, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Viruses 2026, 18(1), 112; https://doi.org/10.3390/v18010112 (registering DOI)
Submission received: 8 October 2025 / Revised: 7 November 2025 / Accepted: 8 November 2025 / Published: 15 January 2026
(This article belongs to the Section Human Virology and Viral Diseases)

Abstract

Background: The enhanced sensitivity of next-generation sequencing (NGS) for assessing hepatitis B virus (HBV) quasispecies heterogeneity over clone-based sequencing (CBS) is well documented. However, its comparative reliability for genotype determination remains an open question. Objective: This study aimed to directly compare the performance of NGS and CBS for genotyping HBV using the entire viral genome. Methods: We selected five challenging clinical samples that previously could not be subgenotyped or showed conflicting results when using direct sequencing of the S open reading frame (ORF). The full HBV genome from these subjects was amplified and then analyzed in parallel by both NGS and CBS. Phylogenetic analysis was subsequently used to assign genotypes. Results: Both methods identified a range of genotypes, including B, C, and I, as well as aberrant and recombinant forms. For three of the five subjects, genotyping results were identical between the two platforms. In the remaining two cases, however, CBS revealed greater complexity, identifying additional subgenotypes and recombinant/aberrant strains not detected by NGS. Notably, for three individuals, the genotypes determined by both modern methods contradicted earlier results from 2011 based on direct S ORF sequencing. Furthermore, the specific mutations detected were incongruent between the platforms, with CBS identifying a higher number of variants than NGS. Conclusions: Our findings indicate that genotyping results from NGS and CBS can be discordant. Contrary to expectations, CBS may uncover more genetic diversity, including a greater number of subgenotypes and mutations, than NGS in certain contexts. The study also confirms that genotyping based solely on direct sequencing of the S ORF can be unreliable and lead to misclassification.

1. Introduction

Hepatitis B virus (HBV) has a partially double-stranded, circular DNA genome of approximately 3200 base pairs, including four partly or completely overlapping open reading frames (ORFs): preC/C, P, PreS1/S2/S, and X [1]. The encoded reverse transcriptase lacks proofreading activity, resulting in a high genetic heterogeneity, with various genotypes and subgenotypes and the development of viral quasispecies in individual infections [2].
The term quasispecies denotes a swarm of closely genetically related variants, where the nucleotide variation is less than between genotypes and subgenotypes [3]. Nine distinct genotypes of the hepatitis B virus are currently recognized, a classification stemming from comprehensive phylogenetic and evolutionary investigations of entire genome sequences. Each of these genotypes exhibits a nucleotide variation of over 7.5% when compared to the others [4,5]. A more refined categorization has been established for genotypes A–D, F, and I, which are now divided into no fewer than 55 subgenotypes. This subdivision is based on an internal nucleotide variation of roughly 4–8% at the complete genome level and is validated by significant bootstrap statistical support [1,6,7,8].
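The divergence cut-offs quoted above can be illustrated with a minimal sketch. It assumes two already aligned genome sequences and uses a plain proportion of differing sites (p-distance); the function names and the hard boundaries at 7.5% and 4% are simplifications introduced here for illustration, not the classification procedure of the cited studies, which also relies on phylogenetic support.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') or an ambiguous base ('N') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def divergence_level(distance):
    """Map a whole-genome distance to a rough level (illustrative cut-offs)."""
    if distance > 0.075:
        return "different genotypes"
    if distance >= 0.04:
        return "different subgenotypes"
    return "same subgenotype"


# Toy aligned sequences (hypothetical, far shorter than a real 3.2 kb genome)
d = p_distance("ACGTACGTAC", "ACGTACGTAT")
print(d, divergence_level(d))
```

In practice the distance would be computed over the complete genome alignment, and a borderline value would be interpreted together with the bootstrap support of the corresponding cluster.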
Hepatitis B virus genotypes are not uniformly distributed globally, showing clear geographical segregation. A prime example is genotype A, which is predominantly found in populations across Africa, Europe, India, and the Americas. Meanwhile, the Asia-Pacific area is characterized by the common occurrence of both genotypes B and C [9]. It has been reported that infections with different genotypes have various clinical implications. While the effectiveness of direct antiviral medications for HBV shows little consistent variation among genotypes, a clear disparity exists for interferon-based therapy. Specifically, patients with genotypes A and B tend to respond more successfully to interferon treatments than those carrying the C and D genotypes [10]. Infection with hepatitis B virus genotypes C, D, and F confers a more substantial lifetime risk for progression to cirrhosis and hepatocellular carcinoma, in contrast to infections involving genotypes A and B [11].
For the determination of HBV genotypes, a multitude of techniques have been put into practice over time. These range from restriction fragment mass polymorphism (RFMP) and PCR-invader assays to more common procedures like real-time PCR and the direct sequencing of PCR products. Hybridization-based tools, including INNO-LiPA strips and reverse dot blot assays, also serve this function. Nevertheless, the benchmark method for accurately assigning an HBV genotype is the sequencing of the full viral genome, followed by a detailed phylogenetic comparison [4,12]. A more cost-effective approach is to sequence an individual ORF (e.g., the surface ORF) rather than the complete genome [13].
A combination of molecular cloning of PCR products and subsequent Sanger dideoxy sequencing represents a viable technique for genotyping in complex cases involving mixed HBV infections [14]. Clone-based sequencing (CBS), often considered the gold standard, involves isolating a single DNA molecule by inserting it into a bacterial plasmid and growing it into a clonal population. This process allows for the determination of a consensus sequence from thousands of identical copies, yielding long, high-fidelity reads (800–1000 bases) and the unique ability to resolve haplotype phases. However, CBS is low-throughput, time-consuming, and costly for large-scale projects. In contrast, next-generation sequencing (NGS) employs a massively parallel strategy. DNA is fragmented into short pieces, and all fragments are sequenced simultaneously in a single run. This approach generates an enormous volume of data at a low cost per base, making it ideal for profiling entire genomes or transcriptomes. The primary trade-offs are shorter read lengths (50–300 bases), which can complicate assembly in complex genomic regions, and the loss of direct haplotype information [15,16]. NGS has been reported to be more sensitive than CBS for detecting the complex diversity within HBV viral populations [17]. However, it is unclear which approach is better for identifying genotypes based on complete genome sequences. In our previous study in Guangxi, China, we found that some isolates could not be subgenotyped or yielded different genotypes when S gene sequences were obtained by direct sequencing of different PCR products [18]. Genotypes in this region are complex; genotypes C, B, and I (a recombinant) are the most common [18,19,20].
To determine which methodology performs better, this study ascertained the genotype and subgenotype of each sample through the parallel application of next-generation sequencing and conventional clone-based sequencing.

2. Materials and Methods

2.1. Study Subjects and Sample Collection

The study subjects were selected from a study cohort, which has been described previously [21]. We found that isolates from five study subjects could not be subgenotyped or yielded different genotypes using S gene sequences with direct sequencing of different PCR products [18]. Serum samples collected in 2011 from the five study subjects were included in the analysis. For comparison, we also collected serum samples from these study subjects in 2020. This study was conducted in strict adherence to the ethical tenets of the 1975 Declaration of Helsinki and received formal approval from the Guangxi Institutional Review Board. Prior to enrollment, every participant provided their written informed consent. A key exclusion criterion for all subjects was the presence of co-infection with either hepatitis C virus (HCV) or human immunodeficiency virus type 1 (HIV-1).

2.2. Serological Testing

Serum samples were evaluated for both viral serological markers and a key liver enzyme. Specifically, enzyme immunoassays (EIA; Beijing Wantai Biopham Company Limited, Beijing, China) were utilized to detect HBsAg and HBeAg/anti-HBe, while alanine aminotransferase (ALT) activity was quantified with a Reitman kit (DiaSys Diagnostic Systems, Shanghai, China).

2.3. Measurement of Serum Viral Loads

Real-time polymerase chain reaction (PCR) was used to quantify the concentration of HBV DNA in serum samples. This quantification was carried out on an ABI Prism 7500 instrument (Applied Biosystems, Foster City, CA, USA) using a commercial kit from Sansure Biotech Inc. (Changsha, China), which provided a linear detection range of 1 × 10² to 5 × 10⁹ IU/mL.

2.4. Nested Polymerase Chain Reaction (PCR) for HBV DNA

HBV genomic DNA was extracted from 200 μL of serum samples using QIAamp DNA Mini kits (QIAGEN GmbH, Hilden, Germany) and eluted in 50 μL of distilled water. The entire HBV genome was amplified using nested PCR. The amplification protocol and primers P1 (nt 1821–1841, 5′-CCGGAAAGCTTGAGCTCTTCTTTTTCACCTCTGCCTAATCA-3′) and P2 (nt 1823–1806, 5′-CCGGAAAGCTTGAGCTCTTCAAAAAGTTGCATGGTGCTGG-3′) have been described previously [22]. The initial amplification was conducted using a VeritiPro Thermal Cycler (Thermo Fisher Scientific, Waltham, MA, USA) for a total of 40 cycles. The thermal profile included denaturation at 94 °C for 40 s, annealing at 60 °C for 1.5 min, and an elongation phase at 68 °C for 3 min. Notably, the elongation time was progressively lengthened by two minutes following each block of ten cycles. Subsequently, a second-round PCR was initiated using 5 μL of the primary amplicon in a 50 μL reaction volume. This nested reaction featured primers MDN5R (nt 1774–1794, 5′-ATTTATGCCTACAGCCTCCT-3′) and BCPF (nt 1854–1875, 5′-ATGTCCTACTGTTCAAGCCTCC-3′). Its protocol began with a 5 min hot start and then proceeded for 30 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 4 min. For samples characterized by low viral titers, a preliminary rolling circle amplification step, which exhibits less amplification bias and greater yield, product length, and fidelity than PCR [24,25], was implemented before the nested PCR procedure [23]. The second-round PCR products were confirmed by electrophoresis on a 1% agarose gel.
To obtain the sequences (60 nt in total) lying between the positions of primers P1 and BCPF and between those of primers P2 and MDN5R, two additional second-round PCRs were carried out on 5 μL of the first-round products described above in a 50 μL reaction, using primers P1 and MDC1 (nt 2304–2324, 5′-TTGATAAGATAGGGGCATTTG-3′) and primers P2 and XSEQ3 (nt 1653–1672, 5′-CATAAGAGGACTCTTGGACT-3′), respectively. The PCR program was a 5 min hot start followed by 30 cycles of 94 °C for 30 s, 50 °C for 30 s, and 72 °C for 30 s.
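The stepped elongation of the first-round program can be written out as a short worked calculation. This is a sketch under one assumption: "lengthened by two minutes following each block of ten cycles" is read as 3 min for cycles 1 to 10, 5 min for cycles 11 to 20, and so on. The function name and parameters are introduced here for illustration only.

```python
def elongation_minutes(cycle, base=3.0, step=2.0, block=10):
    """Elongation time (min) for a 1-based cycle number.

    Assumes the time starts at `base` and grows by `step` after every
    completed block of `block` cycles.
    """
    if cycle < 1:
        raise ValueError("cycle numbers start at 1")
    return base + step * ((cycle - 1) // block)


# Elongation time for each of the 40 first-round cycles
schedule = [elongation_minutes(c) for c in range(1, 41)]
print(schedule[0], schedule[10], schedule[20], schedule[30])
print(sum(schedule))  # total elongation time in minutes under this reading
```

Under this reading the elongation step alone accounts for 240 min of the first-round run, which is worth knowing when scheduling instrument time.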

2.5. Clone-Based Sequencing

Amplicons from the second round were confirmed by agarose gel electrophoresis and cloned into the vector P clone 007T (The Beijing Qingke Biotech Co., Ltd., Beijing, China) according to the manufacturer’s instructions and subsequently transformed into competent Escherichia coli (The Beijing Qingke Biotech Co., Ltd., China) (Figure 1). Following plasmid DNA extraction with a SK1191 UNIQ-10 kit (The Beijing Qingke Biotech Co., Ltd., China), the purified samples were sequenced. The analysis was completed by The Beijing Qingke Biotech Co., Ltd., where the reaction was performed using the BigDye Terminator V3.1 Cycle Sequencing kit from Applied Biosystems (Foster City, CA, USA).
For the direct sequencing of small PCR fragments, a 2 µL aliquot of purified amplicon DNA served as the template. The procedure, carried out at The Beijing Qingke Biotech Co., Ltd. (China), employed primers MDC1 and XSEQ3 along with the BigDye Terminator V3.1 Cycle Sequencing kit (Applied Biosystems, Foster City, CA, USA), strictly following the kit manufacturer’s instructions.
Sequences were determined for both strands to derive robust data for comparison with the full-length sequences of the various genotypes.
Forward primers:
W803-C01 (nt 43–61, 5′-GGGGCCTGTATTTTCCTGCT-3′),
W798-A02 (nt 254–273, 5′-TGTCAACAATTTGTGGGCCC-3′),
W807-F05 (nt 505–524, 5′-ATTCCTATGGGAGTGGGCCT-3′),
W803-A01 (nt 810–829, 5′-ACCAATCGGCAGTCAGGAAG-3′),
W811-C09 (nt 1252–1271, 5′-GCTCCTCTGCCGATCCATAC-3′),
W798-B02 (nt 1633–1652, 5′-TGTGAACAATTTGTGGGCCC-3′),
W798-E01 (nt 2472–2491, 5′-GTGGGAAACTTTACCGGGCT-3′),
W803-B01 (nt 3061–3080, 5′-GGAGGTCTTTTGGGGTGGAG-3′).
Reverse primers:
W807-A01 (nt 60–41, 5′-GCAGGAAAATACAGGCCCCT-3′),
W807-B01 (nt 353–334, 5′-GGACAGGAGGTTGGTGAGTG-3′),
W807-C01 (nt 893–874, 5′-CCCCAATCCTCGCGAAGATT-3′),
W798-C02 (nt 1237–1218, 5′-CCACAAAGGTTCCACGCATG-3′),
W803-D01 (nt 1543–1524, 5′-GAGGCCCACTCCCATAGGTA-3′),
W803-B03 (nt 2325–2306, 5′-AGGCCCACTCCCATAGGAAT-3′),
W798-F01 (nt 2640–2621, 5′-GTATGGATCGGCAGAGGAGC-3′).
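When working with a primer list such as the one above, it can help to view each reverse primer in the forward-strand orientation and to check base composition. The following is a minimal sketch; the helper names are introduced here for illustration and are not part of the study's workflow.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Forward primer W803-C01 as listed above
w803_c01 = "GGGGCCTGTATTTTCCTGCT"
print(reverse_complement(w803_c01))
print(gc_fraction(w803_c01))
```

Applying `reverse_complement` to a reverse primer gives the sequence as it appears on the forward strand, which makes it straightforward to locate the primer in an alignment.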

2.6. Workflow for Next-Generation Sequencing Analysis

For high-throughput sequencing, amplicons from the secondary PCR were processed at Delivectory Biosciences Inc. (Beijing, China). Following an initial purification step with Agencourt AMPure XP beads (Beckman Coulter, Shanghai, China) and quantification using Qubit dsDNA HS assay kits (Invitrogen, Carlsbad, CA, USA), DNA libraries were constructed. The Celero EZ DNA-Seq Library Preparation Kit (Tecan Genomics, Shanghai, China) was utilized for this purpose before the samples were sequenced on an Illumina NovaSeq platform, per the manufacturer’s instructions. The system’s control software analyzed the raw optical data, which were ultimately converted into 2 × 150 base pair paired-end reads.

2.7. NGS Data Preprocessing and Sample Genotyping

Quality control and preprocessing of each sample’s raw NGS short reads were performed with fastp v0.20.1 [26]. The raw sequence reads underwent several data processing steps, beginning with the trimming of adapters and the excision of 15 bases from the 5′ end. A filtering process was then applied to remove reads with a length under 50 nucleotides or an average quality score below 30. Using the bowtie2 alignment tool (v2.3.4.1), the remaining high-quality reads from each sample were mapped to the reference sequence X02763 [27]. Samtools v1.7 [28] was then applied to sort the alignments and filter out any duplicate reads. The final consensus sequence was generated from this processed dataset with the CliqueSNV v1.5.3 tool [29].
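The read-level criteria described above (removal of 15 bases from the 5′ end, minimum length of 50 nucleotides, minimum mean quality of 30) can be expressed as a small sketch. This is not fastp itself and does not reproduce its internals or adapter trimming; the function names, the Phred+33 encoding, and the order of trimming before filtering are assumptions made for illustration.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def preprocess_read(seq, qual, trim_front=15, min_len=50, min_mean_q=30):
    """Trim the 5' end, then apply length and mean-quality filters.

    Returns the trimmed (sequence, quality) pair, or None if the read
    is discarded.
    """
    seq, qual = seq[trim_front:], qual[trim_front:]
    if len(seq) < min_len:
        return None
    if mean_phred(qual) < min_mean_q:
        return None
    return seq, qual


# Hypothetical 150-nt read with uniform quality Q40 ('I' in Phred+33)
kept = preprocess_read("A" * 150, "I" * 150)
print(len(kept[0]))
```

A read that is shorter than 65 nt before trimming, or whose trimmed portion averages below Q30, is dropped by this sketch.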

2.8. Haplotype Inference and Quasispecies Diversity Assessment

To assess diversity and construct haplotypes, the filtered NGS reads for each sample were first re-mapped against appropriate genotype-specific references. This alignment was performed with bowtie2’s very-sensitive-local mode [27], followed by the removal of duplicates using Sambamba [30]. Haplotype structures were then inferred from the processed SAM alignment files employing the CliqueSNV software (v1.5.3) with all settings at their default values [29]. A threshold of 1% minimum abundance was set for a haplotype to be included in subsequent investigations, and each was treated as a unique HBV variant’s genome. Ultimately, the heterogeneity of the viral quasispecies was quantified through its genetic complexity, a metric based on the number of distinct sequences identified.
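The abundance threshold and the complexity metric described above can be sketched as follows. The input format, a list of (sequence, relative abundance) pairs, and the function names are assumptions made for illustration; they do not reflect CliqueSNV's actual output format.

```python
def filter_haplotypes(haplotypes, min_abundance=0.01):
    """Keep (sequence, frequency) pairs with frequency >= min_abundance."""
    return [(seq, freq) for seq, freq in haplotypes if freq >= min_abundance]


def genetic_complexity(haplotypes):
    """Number of distinct sequences among the given haplotypes."""
    return len({seq for seq, _ in haplotypes})


# Hypothetical haplotypes for one sample (toy sequences, not study data)
example = [
    ("ACGTAC", 0.62),
    ("ACGTAT", 0.30),
    ("ACGAAT", 0.075),
    ("TCGAAT", 0.005),  # below the 1% threshold, so excluded
]
kept = filter_haplotypes(example)
print(len(kept), genetic_complexity(kept))
```

Each retained haplotype is then treated as one variant genome in the downstream phylogenetic analysis.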

2.8.1. HBV Genotype Determination

Phylogenetic reconstruction, based on both the complete genome and the preS/S gene regions, was the method used for HBV genotype classification. Our sequences were first aligned to 46 GenBank reference sequences of known genotypes with the Clustal W tool, and the result was visually inspected using BioEdit [31] (reference details are shown in Table 1). Subsequently, maximum likelihood trees were built with MEGA software (version 12.0.11) [32] employing the GTR+I+G substitution model. The reliability of the clustering was established by performing an interior branch test of 1000 replicates, where a support threshold of 75% for internal nodes was considered significant.

2.8.2. Detection of Recombination

To investigate potential genetic exchange, a recombination analysis was carried out with the Simplot program (V.3.5.1) using its bootscanning function, consistent with our earlier study. The complete HBV sequence was scanned against consensus sequences representing genotypes A–I. During the bootscan, the query sequence’s phylogenetic position was evaluated relative to reference parental strains (FR714490, AB074047) and two outgroups (AY226578, AB486012). A shift in the supported phylogenetic clustering along the genome was interpreted as evidence of a recombination event. This process was executed with a window length of 400 base pairs, a step size of 20 base pairs, and 1000 bootstrap replicates.
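The sliding-window idea behind the scan can be illustrated with a deliberately simplified sketch. It uses per-window sequence identity rather than bootstrapped trees, so it approximates a similarity plot and is not Simplot's bootscan algorithm; it also treats the genome as linear. All names and the toy sequences are assumptions introduced for illustration.

```python
def windows(length, width=400, step=20):
    """Yield (start, end) pairs of fixed-width windows along a sequence."""
    for start in range(0, length - width + 1, step):
        yield start, start + width


def identity(a, b):
    """Fraction of identical positions between two equal-length strings."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def closest_reference_per_window(query, references, width=400, step=20):
    """For each window, report the reference most similar to the query.

    `references` maps a label to a sequence aligned to `query`.
    """
    result = []
    for start, end in windows(len(query), width, step):
        scores = {
            name: identity(query[start:end], ref[start:end])
            for name, ref in references.items()
        }
        result.append((start, max(scores, key=scores.get)))
    return result


# Toy alignment: the query matches reference "C" first and reference "I" later
refs = {"C": "A" * 1000, "I": "G" * 1000}
query = "A" * 500 + "G" * 500
scan = closest_reference_per_window(query, refs)
print(scan[0], scan[-1])
```

A change in the best-matching reference along the genome, as between the first and last windows here, is the kind of signal that the bootscan evaluates with phylogenetic support.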

3. Results

3.1. General Information

The five study subjects comprised two males and three females, with a mean age of 57.2 ± 10.5 years (range, 31 to 66). All were negative for HBeAg and had ALT levels below 40 U/L. The median viral load was 4.06 × 10² IU/mL (IQR: 88 to 4.59 × 10² IU/mL) (Table 2). Although complete genome sequences were obtained from all study subjects, no subject yielded sequences at both time points (2011 and 2020). The main reasons were that some samples were of low volume and others had viral loads below the detection limit of PCR.
The average number of clones selected per sample was 29, with a total of 145 clones. In next-generation sequencing, 0.58G raw reads were generated. The average number of quasispecies per sample obtained after filtration and error correction was 9.6, and the total number was 48.

3.1.1. Comparison of CBS and NGS for Genotype/Subgenotype Analysis Based on Complete Genome Sequences

Two phylogenetic trees were constructed on the basis of the complete genome sequences obtained from NGS and CBS (Figure 2 and Figure 3). Genotypes B, C, and I were found with both NGS and CBS. In addition, CBS identified strains that may be recombinants or an aberrant form of genotype I, because the genetic distances between these strains and all subgenotypes of I exceeded 4%. In three subjects, the genotype/subgenotype identified with NGS was the same as that identified with CBS. In the other two subjects, more subgenotypes were identified with CBS than with NGS (Table 3).
All quasispecies of subject SS078 obtained from NGS were typed as subgenotype C5, whereas among those obtained from CBS only two were subgenotype C5 and almost all of the rest were subgenotype C1. In subject SS584, all quasispecies obtained from NGS and almost all quasispecies obtained from CBS were subgenotype C1. One more recombinant and aberrant strain was identified with CBS in that subject. Thus, genotyping results based on NGS sequences were not fully consistent with those based on CBS sequences, and CBS may identify more subgenotypes than NGS.

3.1.2. Comparison of Genotype/Subgenotypes Based on the S ORF Sequences Derived Using NGS, CBS, and Direct Sequencing

To assess the accuracy of genotyping based on the S ORF with direct sequencing, two phylogenetic trees were constructed on the basis of PreS/S sequences extracted from the full-length NGS and CBS sequences (Figure 4). Genotypes/subgenotypes identified from the complete genome and from the S ORF sequences obtained by NGS were the same in all subjects. The same was true for CBS in all subjects except subject SS584, in whom complete genome sequencing revealed one additional aberrant strain (Table 3).
The genotype/subgenotype identified by direct sequencing in one subject was the same as that identified from sequences obtained by both NGS and CBS. However, the genotypes/subgenotypes identified by direct sequencing in four subjects could not be found using NGS or CBS. Subject SS078 had genotype B by direct sequencing, but his genotype was C by both NGS and CBS. Subject BY640 had genotype G or I by direct sequencing, while the genotype was C according to both NGS and CBS. The isolate from subject CW512 could not be genotyped by direct sequencing but was genotype B by both NGS and CBS. Subject SS584 had genotype C or B by direct sequencing, but his genotype was genotype C, an aberrant by both NGS and CBS.
These findings suggest that genotyping based on the S ORF may not be reliable in regions with complex genotypes (Table 3).

3.1.3. Comparison of NGS and CBS for Detection of Mutations in the Complete HBV Genome

All of the mutations found in this study were point mutations. In NGS, mutations were detected by the Samtools mpileup algorithm and an in-house script. Detailed information on variants with read depths greater than 1000 and mutation rates higher than 1.0% is shown in Table 4. In CBS, mutations were determined by comparison with the database (Geno2pheno hbv, https://hbv.geno2pheno.org/, accessed on 24 December 2024). Nineteen mutations were detected by both NGS and CBS, whereas 13 were detected only by CBS and 1 only by NGS. Mutations found by NGS were therefore not fully consistent with those found by CBS, and more mutations were found by CBS than by NGS.
Except for subject SS584, all subjects had the PreC nt 1896 (G → A) point mutation according to both NGS and CBS. Except for subject CW512, all subjects had the BCP nt 1762 (A → T) and 1764 (G → A) double mutations by NGS or CBS. Subjects SS078 and SS584 had escape mutations by CBS, while none of the sequences obtained from NGS had the same mutations.
The statistical results showed that CBS had significantly better detection ability than NGS (p = 0.002) in the C region (e.g., L100I and P130T), indicating that CBS was more sensitive to low-frequency mutations. There was no significant difference between NGS and CBS in detecting point mutations in the S, RT, X, PreC, and BCP regions (p > 0.05).
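The reporting thresholds for NGS variants and the comparison between platforms can be sketched as follows. The function names are introduced here for illustration, and the two example call sets are hypothetical groupings of mutation labels, not the contents of Table 4 or the study's in-house script.

```python
def passes_filter(depth, alt_reads, min_depth=1000, min_rate=0.01):
    """True if read depth exceeds min_depth and the variant rate exceeds min_rate."""
    return depth > min_depth and alt_reads / depth > min_rate


def compare_calls(ngs_calls, cbs_calls):
    """Split mutation labels into shared and platform-specific sets."""
    ngs, cbs = set(ngs_calls), set(cbs_calls)
    return {"both": ngs & cbs, "cbs_only": cbs - ngs, "ngs_only": ngs - cbs}


# Hypothetical call sets for one sample (illustrative assignment only)
ngs_calls = {"nt1896 G>A", "nt1762 A>T", "nt1764 G>A"}
cbs_calls = {"nt1896 G>A", "nt1762 A>T", "nt1764 G>A", "L100I", "P130T"}

summary = compare_calls(ngs_calls, cbs_calls)
print({key: len(value) for key, value in summary.items()})
print(passes_filter(depth=1500, alt_reads=30))
```

Counting the three sets across all samples gives the kind of tally reported above: mutations shared by both methods and mutations seen by only one of them.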

3.1.4. Analysis of Genetic Recombination and Putative Breakpoint Locations

Two groups of strains, group 1 (SS584-CBS-13 and SS584-CBS-24) and group 2 (SS584-CBS-21 and SS584-CBS-31), were examined for recombination using Simplot 3.5.1 software (Figure 5a,b). The results showed that SS584-CBS-13 and SS584-CBS-24 had no obvious recombination, whereas SS584-CBS-21 and SS584-CBS-31 had obvious recombination.
The bootscanning result of strain SS584-CBS-21 showed that parts of the genome (nt 1 to 765 and nt 1556 to 3057) were more similar to genotype C, while the remaining (nt 765 to 1556) was more similar to genotype I (Figure 5c). Parts of the genome of strain SS584-CBS-31 (nt 1 to 703 and nt 1852 to 3158) were more similar to genotype C, while the remaining (nt 703 to 1852) was more similar to subgenotype I (Figure 5d). Phylogenetic analyses of both the whole-genome and the S gene sequences further confirmed this. The isolates clustered within genotype C in the S gene tree but belonged to genotype I in the whole-genome tree, with bootstrap support values exceeding 70% in both cases. Clearly, both strains are recombinants between genotype C and genotype I.

4. Discussion

To the best of our knowledge, this study is the first designed to compare the accuracy of genotyping based on complete HBV genome sequences determined by NGS and CBS. The principal findings are that genotyping results based on NGS sequences may be inconsistent with those based on CBS sequences, and that CBS can find more subgenotypes and mutations than NGS. Genotyping based on S ORF sequences obtained by direct sequencing may not be reliable in this region with complex genotypes. The strength of the study is that complete genome sequences were used for genotyping, which may provide more information for comparing the advantages of NGS and CBS. The major limitation of this study is that no subject yielded sequences from serum samples collected in both 2011 and 2020, which would have allowed us to compare the evolution of HBV genomes in infected individuals.
The ability to conduct high-resolution analyses of viral quasispecies is a major advantage of next-generation sequencing, stemming from its power to read thousands of distinct molecular sequences. Consequently, this technology has seen broad adoption in diverse research fields, for instance, in virological and cancer-related investigations [33]. However, compared to Sanger sequencing, the ability of NGS to identify genotypes remains unclear. It has been reported that the results of NGS genotyping showed a concordance of 95.2% with INNO-LiPA and 100% with Sanger sequencing. However, the sequences used for genotyping covered only 418 nt of the polymerase gene [34]. Another study showed that NGS based on the complete genome is useful to discriminate mixed genotypes detected with INNO-LiPA. Unfortunately, these results were not confirmed by Sanger sequencing [35]. In this study, we compared NGS genotyping results with those of CBS based on the complete genome. Therefore, our study may provide more information on the ability of NGS genotyping.
Numerous studies have demonstrated that the specific HBV genotype correlates with clinical manifestations and can serve as a genetic indicator to predict the progression of the disease [9]. These genotypic variations are particularly relevant to therapeutic outcomes, as they have been shown to impact responses to interferon-α treatment and the potential for achieving a functional cure via HBsAg loss [36]. Characterizing the viral genotype, along with other prognostic factors, therefore allows for the formulation of customized treatment approaches and informs surveillance strategies, including hepatocellular carcinoma screening protocols [37]. The findings of our study are thus likely to be relevant to clinicians.
For detecting mixed-genotype HBV infections, direct sequencing of PCR products is considered inadequate [13]. Clone-based sequencing (CBS) offers a partial solution, but it is hampered by potential selection artifacts and the small sample of clones, which may not be fully reflective of the viral population’s genetic complexity [38]. In contrast, next-generation sequencing (NGS) provides enormous throughput, enabling deeper sequencing capacity [33]. It has been widely reported that for identifying minor variants and simulating quasispecies, NGS demonstrates superior sensitivity and efficiency when compared to the CBS method [17].
Surprisingly, the data from our study did not align with these expectations. The genotyping outcomes derived from NGS sequences were not congruent with those from CBS; in fact, CBS identified more subgenotypes and mutations. This counterintuitive finding might stem from the larger number of complete viral sequences generated through CBS in our specific workflow compared to NGS. It is important to note that our NGS pipeline did account for diversity by treating any haplotype with a frequency of at least 1% as a unique HBV variant. This discrepancy highlights the need for further studies to clarify the optimal application of these methods.
Our sequencing data reveal a complex landscape of HBV quasispecies, characterized by a dynamic equilibrium between dominant and rare variants. The dominant quasispecies, which constitute the majority of the viral population, are believed to represent the fittest viruses in the current host environment, driving ongoing viral replication and disease progression. However, the clinical significance of the rare quasispecies cannot be overlooked. This reservoir of genetic diversity, while quantitatively minor, serves as a critical archive for pre-existing drug-resistant or immune-escape mutations [39,40].
The presence of low-frequency variants can act as a predictive biomarker for treatment failure. For instance, the pre-existence of NA-resistant mutations within the rare quasispecies pool, even before the initiation of therapy, is a well-documented mechanism leading to subsequent virological breakthrough. These rare variants represent a formidable challenge to the host’s immune control and vaccine efficacy. Under selective pressure from neutralizing antibodies or cytotoxic T lymphocytes, a previously rare immune-escape variant can be rapidly selected, outgrow the former dominant population, and lead to immune evasion and chronicity [41,42,43].
It has been reported that a more cost-effective approach to genotyping is to sequence a single ORF (e.g., the surface ORF) instead of the complete genome [13]. The findings from this kind of analysis may be adequate for classifying the primary HBV genotype but are often unsuitable for subgenotype assignment if genetic exchange has taken place [4]. This is because recombination events can interfere with the proper reconstruction of a phylogenetic tree and lead to an incorrect increase in nucleotide divergence values [44]. The misclassification of strains such as subgenotypes B3, B5, and C11 serves as a clear illustration of this analytical pitfall [45]. In this study, we found that four of the five subjects’ genotypes identified in direct sequencing could not be found by NGS/CBS. One subject has the same genotype found by direct sequencing as by NGS and CBS. Clearly, genotyping based on the S gene ORF may not be reliable in regions with complex genotypes.
The high replication rate of HBV and the lack of proofreading activity of its polymerase result in high genetic heterogeneity [2]. Resistance mutations may occur naturally during long-term infection [46]. A wide variation has been observed in the frequency of naturally occurring resistance mutations among patients who have no prior treatment history, with reported prevalence rates spanning from 0% up to 57% [47]. These mutants sometimes may appear predominant and sometimes as minor quasispecies [35,48].
Inter-variant genetic recombination serves as a pivotal mechanism in viral evolution. The occurrence of recombination events disrupts the coherence and consistency of evolutionary histories among genomic regions, including distinct gene segments [49]. HBV recombinant strains between different genotypes/subgenotypes play critical roles in shaping viral genetic diversity and facilitating transmission among human populations [50]. Multiple recombinant strains, such as C/D, A/E, B/C, B/D, and A/G recombinants, have been identified [51,52]. Our study identified two classes of suspected recombinant strains. Subsequent phylogenetic tree, Simplot, and Bootscan analyses confirmed one class as I/C recombinant strains. Intriguingly, another cluster exhibited no detectable recombination signals but displayed an evolutionary divergence exceeding 4% from both I1 and I2. We provisionally classify it as an aberrant subgenotype of I, pending further validation to determine whether it represents a novel subgenotype of I. The identification of these recombinant and aberrant strains aids clinicians in evaluating clinical outcomes and antiviral treatment responses for patients with specific viral genotype infections, enabling precision therapy. However, further epidemiological characterization of such strains is warranted.
In this study, we found that all the study subjects had pre-existing resistance mutations that occurred as minor quasispecies. It remains crucial to determine how the presence of minor viral populations harboring primary drug resistance mutations might influence the therapeutic response to subsequent antiviral treatment.

Author Contributions

Conceptualization, L.-P.H. and H.-H.J.; data curation, L.-P.H., Q.-Y.C., X.-Q.H. and H.-H.J.; formal analysis, L.-P.H., M.-L.H. and H.-H.J.; investigation, L.-P.H., Q.-Y.C., W.-J.Z. and H.-H.J.; methodology, L.-P.H., X.-F.Y. and H.-H.J.; resources, L.-P.H. and H.-H.J.; software, L.-P.H. and H.-H.J.; supervision, Q.-Y.C. and X.-F.Y.; validation, L.-P.H. and H.-H.J.; writing—original draft, L.-P.H. and H.-H.J.; writing—review and editing, L.-P.H. and H.-H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Guangxi Natural Science Foundation (Grant Nos. 2025GXNSFBA069111 and 2025GXNSFBA069574), the Guangxi Municipal and County Scientific Research Project (Grant No. 2024KY1238, XKJ2401), and the First Batch of the Guangxi Medical Young Reserve Talents Training Program.

Institutional Review Board Statement

Informed consent in writing was obtained from each individual. The study protocol conforms to the ethical guidelines of the 1975 Declaration of Helsinki and has been approved by the Guangxi Institutional Review Board (GXIRB 2020-0021) (approval date: 2 March 2020).

Informed Consent Statement

Informed consent in writing was obtained from each individual.

Data Availability Statement

All data generated or analyzed during this study are included in this article. Further enquiries can be directed to the corresponding author.

Acknowledgments

This study was made possible primarily by the substantial support of the Guangxi Natural Science Foundation (Grant No. 2025GXNSFBA069111). We are indebted to the staff of the Centers for Disease Prevention and Control of BinYang county, CangWu county, and QinNan district, Guangxi, who assisted in recruiting the study subjects and collecting samples.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
NGS: Next-Generation Sequencing
CBS: Clone-Based Sequencing
HBV: Hepatitis B Virus
ORF: Open Reading Frame
HIV-1: Human Immunodeficiency Virus Type 1
HCV: Hepatitis C Virus
EIA: Enzyme Immunoassays
ALT: Alanine Aminotransferase
PCR: Polymerase Chain Reaction
RFLP: Restriction Fragment Length Polymorphism
RFMP: Restriction Fragment Mass Polymorphism
MS: Mass Spectrometry
INNO-LiPA: INNO-LiPA (a commercial line probe assay)
HBsAg: Hepatitis B Surface Antigen
HBeAg: Hepatitis B e Antigen
anti-HBe: Antibody to Hepatitis B e Antigen
HCC: Hepatocellular Carcinoma
DNA: Deoxyribonucleic Acid
nt: Nucleotide(s)
BCP: Basal Core Promoter
RT: Reverse Transcriptase
PreC: Pre-Core
PreS: Pre-Surface
S: Surface
P: Polymerase
C: Core

References

  1. Kramvis, A. Genotypes and genetic variability of hepatitis B virus. Intervirology 2014, 57, 141–150.
  2. Tian, Q.; Jia, J. Hepatitis B virus genotypes: Epidemiological and clinical relevance in Asia. Hepatol. Int. 2016, 10, 854–860.
  3. Mei, F.; Ren, J.; Long, L.; Li, J.; Li, K.; Liu, H.; Tang, Y.; Fang, X.; Wu, H.; Xiao, C. Analysis of HBV X gene quasispecies characteristics by next-generation sequencing and cloning-based sequencing and its association with hepatocellular carcinoma progression. J. Med. Virol. 2019, 91, 1087–1096.
  4. Pourkarim, M.; Amini-Bavil-Olyaee, S.; Lemey, P.; Maes, P.; Van Ranst, M. HBV subgenotype misclassification expands quasi-subgenotype A3. Clin. Microbiol. Infect. 2011, 17, 947–949.
  5. Huy, T.T.T.; Sall, A.A.; Reynes, J.M.; Abe, K. Complete genomic sequence and phylogenetic relatedness of hepatitis B virus isolates in Cambodia. Virus Genes 2008, 36, 299–305.
  6. Feng, Y.; Ran, J.Y.; Feng, Y.M.; Miao, J.; Zhao, Y.; Jia, Y.Y.; Li, Z.; Yue, W.; Xia, X.S. Genetic diversity of hepatitis B virus in Yunnan, China: Identification of novel subgenotype C17, an intergenotypic B/I recombinant, and B/C recombinants. J. Gen. Virol. 2020, 101, 972–981.
  7. Liu, Y.; Feng, Y.; Li, Y.L.; Ma, J.; Jia, Y.Y.; Yue, W.; Feng, Y.M. Characterization of a novel hepatitis B virus subgenotype B10 among chronic hepatitis B patients in Yunnan, China. Infect. Genet. Evol. 2020, 83, 104322.
  8. Thijssen, M.; Trovão, N.S.; Mina, T.; Maes, P.; Pourkarim, M.R. Novel hepatitis B virus subgenotype A8 and quasi-subgenotype D12 in African–Belgian chronic carriers. Int. J. Infect. Dis. 2020, 93, 98–101.
  9. Lin, C.-L.; Kao, J.-H. Hepatitis B virus genotypes and variants. Cold Spring Harb. Perspect. Med. 2015, 5, a021436.
  10. Fernandes da Silva, C.; Keeshan, A.; Cooper, C. Hepatitis B virus genotypes influence clinical outcomes: A review. Can. Liver J. 2023, 6, 347–352.
  11. Lin, C.L.; Kao, J.H. Natural history of acute and chronic hepatitis B: The role of HBV genotypes and mutants. Best Pract. Res. Clin. Gastroenterol. 2017, 31, 249–255.
  12. Guirgis, B.S.; Abbas, R.O.; Azzazy, H.M. Hepatitis B virus genotyping: Current methods and clinical implications. Int. J. Infect. Dis. 2010, 14, e941–e953.
  13. Bartholomeusz, A.; Schaefer, S. Hepatitis B virus genotypes: Comparison of genotyping methods. Rev. Med. Virol. 2004, 14, 3–16.
  14. Lim, C.; Tan, J.; Ravichandran, A.; Chan, Y.; Ton, S. Comparison of PCR-based genotyping methods for hepatitis B virus. Malays. J. Pathol. 2007, 29, 79–90.
  15. Goodwin, S.; McPherson, J.D.; McCombie, W.R. Coming of age: Ten years of next-generation sequencing technologies. Nat. Rev. Genet. 2016, 17, 333–351.
  16. Karakoyun, H.K.; Sayar, C.; Yararbaş, K. Challenges in clinical interpretation of next-generation sequencing data: Advantages and Pitfalls. Results Eng. 2023, 20, 101421.
  17. Gong, L.; Han, Y.; Chen, L.; Liu, F.; Hao, P.; Sheng, J.; Li, X.-H.; Yu, D.-M.; Gong, Q.-M.; Tian, F. Comparison of next-generation sequencing and clone-based sequencing in analysis of hepatitis B virus reverse transcriptase quasispecies heterogeneity. J. Clin. Microbiol. 2013, 51, 4087–4094.
  18. Wang, X.; Harrison, T.; He, X.; Chen, Q.; Li, G.; Liu, M.; Li, H.; Yang, J.; Fang, Z. The prevalence of mutations in the major hydrophilic region of the surface antigen of hepatitis B virus varies with subgenotype. Epidemiol. Infect. 2015, 143, 3572–3582.
  19. Fang, Z.-L.; Hue, S.; Sabin, C.A.; Li, G.-J.; Yang, J.-Y.; Chen, Q.-Y.; Fang, K.-X.; Huang, J.; Wang, X.-Y.; Harrison, T.J. A complex hepatitis B virus (X/C) recombinant is common in Long An county, Guangxi and may have originated in southern China. J. Gen. Virol. 2011, 92, 402–411.
  20. Li, G.J.; Hue, S.; Harrison, T.J.; Yang, J.Y.; Chen, Q.Y.; Wang, X.Y.; Fang, Z.L. Hepatitis B virus candidate subgenotype I1 varies in distribution throughout Guangxi, China and may have originated in Long An county, Guangxi. J. Med. Virol. 2013, 85, 799–807.
  21. Fang, Z.L.; Harrison, T.J.; Yang, J.Y.; Chen, Q.Y.; Wang, X.Y.; Mo, J.J. Prevalence of hepatitis B virus infection in a highly endemic area of southern China after catch-up immunization. J. Med. Virol. 2012, 84, 878–884.
  22. Günther, S.; Li, B.-C.; Miska, S.; Krüger, D.; Meisel, H.; Will, H. A novel method for efficient amplification of whole hepatitis B virus genomes permits rapid functional analysis and reveals deletion mutants in immunosuppressed patients. J. Virol. 1995, 69, 5437–5444.
  23. Margeridon, S.; Carrouée-Durantel, S.; Chemin, I.; Barraud, L.; Zoulim, F.; Trépo, C.; Kay, A. Rolling circle amplification, a powerful tool for genetic and functional studies of complete hepatitis B virus genomes from low-level infections and for directly probing covalently closed circular DNA. Antimicrob. Agents Chemother. 2008, 52, 3068–3073.
  24. Esteban, J.A.; Salas, M.; Blanco, L. Fidelity of phi 29 DNA polymerase. Comparison between protein-primed initiation and DNA polymerization. J. Biol. Chem. 1993, 268, 2719–2726.
  25. Lasken, R.S.; Egholm, M. Whole genome amplification: Abundant supplies of DNA from precious samples or clinical specimens. Trends Biotechnol. 2003, 21, 531–535.
  26. Chen, S.; Zhou, Y.; Chen, Y.; Gu, J. fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 2018, 34, i884–i890.
  27. Langmead, B.; Salzberg, S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 2012, 9, 357–359.
  28. Li, H.; Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 2009, 25, 1754–1760.
  29. Knyazev, S.; Tsyvina, V.; Shankar, A.; Melnyk, A.; Artyomenko, A.; Malygina, T.; Porozov, Y.B.; Campbell, E.M.; Switzer, W.M.; Skums, P. Accurate assembly of minority viral haplotypes from next-generation sequencing through efficient noise reduction. Nucleic Acids Res. 2021, 49, e102.
  30. Tarasov, A.; Vilella, A.J.; Cuppen, E.; Nijman, I.J.; Prins, P. Sambamba: Fast processing of NGS alignment formats. Bioinformatics 2015, 31, 2032–2034.
  31. Elkins, K. Analysis of deoxyribonucleic acid (DNA) sequence data using BioEdit. In Forensic DNA Biology: A Laboratory Manual; Elsevier: Amsterdam, The Netherlands, 2013; Volume 15, pp. 129–132.
  32. Kumar, S.; Stecher, G.; Tamura, K. MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 2016, 33, 1870–1874.
  33. Garcia-Garcia, S.; Cortese, M.F.; Rodriguez-Algarra, F.; Tabernero, D.; Rando-Segura, A.; Quer, J.; Buti, M.; Rodriguez-Frias, F. Next-generation sequencing for the diagnosis of hepatitis B: Current status and future prospects. Expert Rev. Mol. Diagn. 2021, 21, 381–396.
  34. Lowe, C.F.; Merrick, L.; Harrigan, P.R.; Mazzulli, T.; Sherlock, C.H.; Ritchie, G. Implementation of next-generation sequencing for hepatitis B virus resistance testing and genotyping in a clinical microbiology laboratory. J. Clin. Microbiol. 2016, 54, 127–133.
  35. Hebeler-Barbosa, F.; Wolf, I.R.; Valente, G.T.; Mello, F.C.d.A.; Lampe, E.; Pardini, M.I.d.M.C.; Grotto, R.M.T. A new method for next-generation sequencing of the full hepatitis B virus genome from a clinical specimen: Impact for virus genotyping. Microorganisms 2020, 8, 1391.
  36. Revill, P.A.; Tu, T.; Netter, H.J.; Yuen, L.K.; Locarnini, S.A.; Littlejohn, M. The evolution and clinical impact of hepatitis B virus genome diversity. Nat. Rev. Gastroenterol. Hepatol. 2020, 17, 618–634.
  37. Rajoriya, N.; Combet, C.; Zoulim, F.; Janssen, H.L. How viral genetic variants and genotypes influence disease and treatment outcome of chronic hepatitis B. Time for an individualised approach? J. Hepatol. 2017, 67, 1281–1297.
  38. Knyazev, S.; Hughes, L.; Skums, P.; Zelikovsky, A. Epidemiological data analysis of viral quasispecies in the next-generation sequencing era. Brief. Bioinform. 2021, 22, 96–108.
  39. Zhang, C.; An, S.; Lv, R.; Li, K.; Liu, H.; Li, J.; Tang, Y.; Cai, Z.; Huang, T.; Long, L.; et al. The dynamic variation position and predominant quasispecies of hepatitis B virus: Novel predictors of early hepatocarcinoma. Virus Res. 2024, 341, 199317.
  40. Xie, C.; Lu, D. Evolution and diversity of the hepatitis B virus genome: Clinical implications. Virology 2024, 598, 110197.
  41. Tang, X.; Huang, W.; Kang, J.; Ding, K. Early dynamic changes of quasispecies in the reverse transcriptase region of hepatitis B virus in telbivudine treatment. Antivir. Res. 2021, 195, 105178.
  42. Liao, H.; Zhang, H.; Shao, J.; Li, X.; Zheng, W.V.; Li, L.; Yu, G.; Si, L.; Zhou, T.; Yao, Z.; et al. Nucleos(t)ide analogues altered quasispecies composition of hepatitis B virus (HBV)-resistant mutations in serum HBV DNA and serum HBV RNA. J. Med. Virol. 2023, 95, e28615.
  43. Yang, J.; Yang, G.; He, H.; Liu, H.; Fu, Q.; Wu, X.; Meng, R.; Li, Z.; Zhao, Q.; Luo, K.; et al. Pretreatment viral quasispecies characteristics and evolutionary phases correlate with HBsAg seroconversion in peginterferon-alfa-2a-treated children with HBeAg-positive chronic hepatitis B. Antivir. Res. 2025, 244, 106291.
  44. Pourkarim, M.R.; Amini-Bavil-Olyaee, S.; Kurbanov, F.; Van Ranst, M.; Tacke, F. Molecular identification of hepatitis B virus genotypes/subgenotypes: Revised classification hurdles and updated resolutions. World J. Gastroenterol. 2014, 20, 7152–7168.
  45. Shi, W.F.; Zhang, Z.; Ling, C.; Zheng, W.M.; Zhu, C.D.; Carr, M.J.; Higgins, D.G. Hepatitis B virus subgenotyping: History, effects of recombination, misclassifications, and corrections. Infect. Genet. Evol. 2013, 16, 355–361.
  46. Fu, Y.; Zeng, Y.; Chen, T.; Chen, H.; Lin, N.; Lin, J.; Liu, X.; Huang, E.; Wu, S.; Wu, S. Characterization and clinical significance of natural variability in hepatitis B virus reverse transcriptase in treatment-naive Chinese patients by Sanger sequencing and next-generation sequencing. J. Clin. Microbiol. 2019, 57, e00119-19.
  47. Choi, Y.-M.; Lee, S.-Y.; Kim, B.-J. Naturally occurring hepatitis B virus reverse transcriptase mutations related to potential antiviral drug resistance and liver disease progression. World J. Gastroenterol. 2018, 24, 1708–1724.
  48. Choe, W.H.; Kim, K.; Lee, S.-Y.; Choi, Y.-M.; Kwon, S.Y.; Kim, J.H.; Kim, B.-J. Tenofovir is a more suitable treatment than entecavir for chronic hepatitis B patients carrying naturally occurring rtM204I mutations. World J. Gastroenterol. 2019, 25, 4985–4998.
  49. Tshiabuila, D.; San, J.E.; Wilkinson, E.; Dor, G.; Tegally, H.; Maponga, T.G.; Delphin, M.; Matthews, P.C.; Martin, D.P. Conserved recombination patterns across hepatitis B genotypes: A retrospective study. Virol. J. 2025, 22, 220.
  50. Matlou, M.K.; Gaelejwe, L.R.; Musyoki, A.M.; Rakgole, J.N.; Selabe, S.G.; Amponsah-Dacosta, E. A novel hepatitis B virus recombinant genotype D4/E identified in a South African population. Heliyon 2019, 5, e01477.
  51. Locarnini, S.A.; Littlejohn, M.; Yuen, L.K.W. Origins and evolution of the primate hepatitis B virus. Front. Microbiol. 2021, 12, 653684.
  52. Cremer, J.; van Heiningen, F.; Veldhuijzen, I.; Benschop, K. Characterization of hepatitis B virus based complete genome analysis improves molecular surveillance and enables identification of a recombinant C/D strain in the Netherlands. Heliyon 2023, 9, e22358.
Figure 1. The cloning vector P clone 007T and the HBV clone insertion sites.
Figure 2. Phylogenetic analysis of CBS (a) and NGS (b) complete sequences. Maximum likelihood tree was constructed using the complete sequences of the viruses under the GTR+G+I substitution model with the program Mega V7.0. The branch lengths represent the number of substitutions per site. The reliability of clusters was evaluated using the interior branch test with 1000 replicates, and internal nodes with over 75% support were considered reliable. To facilitate interpretation, samples are color-coded by their designated genotype (SS078, red; SS255, green; SS584, blue; BY640, purple; CW512, deep azure).
Figure 3. Phylogenetic analysis of discordant samples (a) and concordant samples (b) based on CBS and NGS complete sequences. Maximum likelihood tree was constructed using the complete sequences of the viruses under the GTR+G+I substitution model with the program Mega V7.0. The branch lengths represent the number of substitutions per site. The reliability of clusters was evaluated using the interior branch test with 1000 replicates, and internal nodes with over 75% support were considered reliable. To facilitate interpretation, samples are color-coded by their designated genotype (SS078, red; SS255, green; SS584, blue; BY640, purple; CW512, deep azure).
Figure 4. Phylogenetic analysis of CBS (a) and NGS (b) preS/S sequences. Maximum likelihood tree was constructed using the preS/S sequences of the viruses under the GTR+G+I substitution model with the program Mega V7.0. The branch lengths represent the number of substitutions per site. The reliability of clusters was evaluated using the interior branch test with 1000 replicates, and internal nodes with over 75% support were considered reliable. To facilitate interpretation, samples are color-coded by their designated genotype (SS078, red; SS255, green; SS584, blue; BY640, purple; CW512, deep azure).
Figure 5. Simplot analysis of recombination in SS584 strains (group 1 and group 2). (a,b) Similarity at each position. (c,d) Percentage of permuted trees (BootScan). P, C, S, and X indicate the polymerase, core, surface, and X genes, respectively.
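Table 1. HBV reference sequences of genotypes retrieved from GenBank.
| HBV Genotype | Subgenotype | Reference GenBank ID | Country of Origin |
|---|---|---|---|
| A | A1 | AB116082 | Japan |
| A | A2 | EU859908 | Belgium |
| A | A3 | AB194952 | Japan |
| B | B1 | AB602818 | Japan |
| B | B2 | EU939638 | China |
| B | B2 | AB981582 | Japan |
| B | B3 | AB976562 | Indonesia |
| B | B4 | AB115551 | Cambodia |
| B | B5 | GQ924645 | Malaysia |
| B | B6 | KP659253 | Canada |
| B | B7 | GQ358143 | Indonesia |
| B | B8 | GQ358146 | Indonesia |
| B | B9 | GQ358152 | Indonesia |
| B | B10 | MN689123 | China |
| C | C1 | GQ35154 | Indonesia |
| C | C1 | AB074047 | Japan |
| C | C2 | AB042285 | Japan |
| C | C2 | FJ386625 | China |
| C | C3 | X75665 | Sweden |
| C | C4 | AB04870 | Australia |
| C | C5 | JN827414 | Thailand |
| C | C5 | AB241111 | Philippines |
| C | C6 | AB493847 | Indonesia |
| C | C7 | EU670263 | Philippines |
| C | C8 | AP011106 | Indonesia |
| C | C9 | AP011108 | Indonesia |
| C | C10 | AB540583 | Indonesia |
| C | C11 | AB554020 | Indonesia |
| C | C12 | AB554025 | Indonesia |
| C | C13 | AB644281 | Indonesia |
| C | C14 | AB644284 | Indonesia |
| C | C15 | AB644286 | Indonesia |
| C | C16 | AB644287 | Indonesia |
| C | C17 | MG826140 | China |
| D | D1 | AB188244 | Japan |
| E | E | AB032431 | Liberia |
| F | F | DQ823094 | Argentina |
| G | G | AF405706 | Germany |
| H | H | AY090460 | USA |
| I | I1 | AB231908 | Vietnam |
| I | I1 | FJ023659 | Laos |
| I | I1 | FR714490 | China |
| I | I2 | FJ023664 | Laos |
| I | I2 | FJ023670 | Laos |
| Woolly monkey | Woolly monkey | AY226578 | |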
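Table 2. Serological test results of study samples.
| Code | Sex | Age | Year | HBsAg / Anti-HBs / HBeAg / Anti-HBe / Anti-HBc | ALT | HBV DNA |
|---|---|---|---|---|---|---|
| SS078 | Female | 50 | 2020 | +++ | ND * | 4.06 × 10^2 |
| SS078 | Female | 41 | 2011 | +++ | 18.8 | ND |
| SS255 | Female | 63 | 2020 | +++ | ND | 4.59 × 10^2 |
| SS255 | Female | 54 | 2011 | +++ | 23.8 | ND |
| SS584 | Male | 66 | 2020 | +++ | ND | 88.00 |
| SS584 | Male | 57 | 2011 | +++ | 21.60 | ND |
| BY640 | Female | 40 | 2020 | +++ | ND | 4.63 × 10^2 |
| BY640 | Female | 31 | 2011 | +++ | 27.1 | ND |
| CW512 | Male | 52 | 2020 | ++ | ND | 88.00 |
| CW512 | Male | 43 | 2011 | ++ | 36.5 | ND |
* ND = No detection.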
Table 3. Comparison of S gene-based and full-length genome genotyping methods.
| Code | Year | NGS-Genotyping (Whole Genome) | NGS-Genotyping (S Gene) | CBS-Genotyping (Whole Genome) | CBS-Genotyping (S Gene) | Sanger-Genotyping (a) (S Gene) | Sanger-Genotyping (b) (S Gene) |
|---|---|---|---|---|---|---|---|
| SS078 | 2011 | ND | ND | ND | ND | B2 | O |
| SS078 | 2020 | C5 | C5 | C5 and C1 | C5 and C1 | ND | ND |
| SS255 | 2011 | ND | ND | ND | ND | B2 | O |
| SS255 | 2020 | B2 | B2 | B2 | B2 | ND | ND |
| BY640 | 2011 | ND | ND | ND | ND | G | I1 |
| BY640 | 2020 | C1 | C1 | C1 | C1 | ND | ND |
| CW512 | 2011 | B2 | B2 | B2 | B2 | O | O |
| CW512 | 2020 | ND | ND | ND | ND | ND | ND |
| SS584 | 2011 | C1 | C1 | C1, recombinant, aberrant | C1, aberrant | C2 | B |
| SS584 | 2020 | ND | ND | ND | ND | ND | ND |
ND = No detection, O = cannot be genotyped, a: Geno2pheno hbv, b: phylogenetic tree.
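Table 4. Summary of mutation detection results of NGS and CBS.
| Region/Mode | Mutation | SS078 NGS | SS078 CBS | SS255 NGS | SS255 CBS | SS584 NGS | SS584 CBS | BY640 NGS | BY640 CBS | CW512 NGS | CW512 CBS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| S | T131N | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 11.11% | 0.00% | 0.00% | 0.00% | 0.00% |
| S | G145K | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 11.11% | 0.00% | 0.00% | 0.00% | 0.00% |
| S | G145R | 0.00% | 32.35% | 0.00% | 0.00% | 0.00% | 11.11% | 0.00% | 0.00% | 0.00% | 0.00% |
| RT | M250I | 0.00% | 2.94% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| RT | V214A | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 100.00% | 97.30% | 0.00% | 0.00% |
| C | L100I | 0.00% | 8.82% | 0.00% | 0.00% | 0.00% | 11.11% | 0.00% | 0.00% | 0.00% | 0.00% |
| C | L100P | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 2.70% | 0.00% | 0.00% |
| C | P130T | 0.00% | 82.35% | 0.00% | 0.00% | 100.00% | 88.89% | 0.00% | 0.00% | 0.00% | 3.70% |
| C | P135Q | 0.00% | 0.00% | 100.00% | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| C | P135A | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 7.41% | 11.11% |
| X | C1653T | 85.71% | 8.82% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| X | T1674C | 0.00% | 2.94% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| PreC | G1896A | 100.00% | 6.10% | 100.00% | 100.00% | 0.00% | 0.00% | 100.00% | 97.30% | 100.00% | 100.00% |
| PreC | G1899A | 100.00% | 6.10% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| PreC | G1862T | 14.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| BCP | C1673T | 0.00% | 0.00% | 100.00% | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 100.00% | 100.00% |
| BCP | T1753C | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 100.00% | 97.30% | 0.00% | 0.00% |
| BCP | A1762T | 0.00% | 96.70% | 100.00% | 100.00% | 8.00% | 33.33% | 100.00% | 100.00% | 0.00% | 0.00% |
| BCP | G1764A | 0.00% | 96.70% | 100.00% | 96.60% | 8.00% | 27.78% | 100.00% | 100.00% | 0.00% | 0.00% |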
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hu, L.-P.; Chen, Q.-Y.; Huang, M.-L.; Zhang, W.-J.; Huang, X.-Q.; Yi, X.-F.; Jia, H.-H. Full-Genome Hepatitis B Virus Genotyping: A Juxtaposition of Next-Generation and Clone-Based Sequencing Approaches—Comparing Genotyping Methods of Hepatitis B Virus. Viruses 2026, 18, 112. https://doi.org/10.3390/v18010112
