Molecular Epidemiology of HCV Infection among Multi-Transfused β-Thalassemia Patients in Eastern India: A Six-Year Observation

Background: HCV infection is very common in multi-transfused β-thalassemia patients who need regular blood transfusions. Aim: The study was conducted to determine the epidemiology of HCV in multi-transfused β-thalassemia patients in West Bengal, India. Methods: Over a span of six years, blood samples were collected from HCV sero-reactive β-thalassemia patients and processed for viral RNA isolation followed by nested RT-PCR for qualitative viremia detection. The HCV genotype was determined by amplifying the partial HCV core gene by nested RT-PCR followed by DNA sequencing and analysis with the NCBI genotyping tool. Phylogenetic and phylogeographic studies were performed with MEGA-X and BEAST software, respectively. Results: Out of 917 multi-transfused HCV sero-reactive β-thalassemia patients, 598 (65.21%) were HCV RNA positive, while 250 (41.80%) had spontaneously cleared the virus. A significant percentage of male patients from rural areas (p = 0.042) and from the economically backward class (p = 0.002) were at higher risk of HCV infection. Female thalassemia patients and individuals aged 11–15 years had higher chances of spontaneous clearance. The most prevalent circulating HCV genotype was 3a (78.26%), followed by 1b (12.04%). Phylogeographic analyses revealed that the 3a strains share genomic similarities with strains from Pakistan, Sri Lanka, and Thailand, whereas the 1b strains share similarities with strains from Thailand, Vietnam, Russia, and China. Uncommon HCV subtypes 3g and 3i were also detected. Conclusion: The high prevalence of HCV infection among β-thalassemia patients of West Bengal, India indicates that NAT-based assays should be implemented for HCV screening of donor blood to eliminate HCV by 2030.


Introduction
HCV is an enveloped, positive-sense, ~9.6 kb ssRNA virus belonging to the genus Hepacivirus of the family Flaviviridae that can cause chronic liver diseases, including cirrhosis of the liver and hepatocellular carcinoma (HCC). HCV is mainly transmitted through blood, blood products, and body fluids, infecting an estimated 200 million people globally [1]. After entering the bloodstream, the virus makes its way into the liver and replicates in hepatocytes, resulting in acute or chronic liver infection. Based on viral genetic diversity, HCV has been classified into seven genotypes (1-7) and over sixty-seven subtypes [2]. Before 2011, ribavirin (RIB) plus pegylated interferon (Peg-IFN) was the gold standard for HCV treatment. However, a low sustained virological response (SVR) and adverse side effects in patients led to the introduction of direct-acting antivirals (DAAs) [3]. According to the treatment guidelines of the American Association for the Study of Liver Diseases (AASLD) and the Indian national guidelines for the management of viral hepatitis, determination of the HCV genotype is important for managing DAA failure cases [4][5][6]. HCV genotypes differ from each other not only in nucleotide sequence but also in geographical distribution [7].
β-thalassemia is an inherited autosomal recessive disorder that arises from reduced or absent synthesis of the beta-globin chain of hemoglobin [8]. β-thalassemia is prevalent all over the world; however, the frequency is higher in the Mediterranean region, the Middle East, the Indian subcontinent, and Southeast Asian countries [9]. The burden of hemoglobinopathies, which are inherited genetic disorders, is high in India because a vast majority of ethnic groups still follow endogamy. Due to the diversified population, the prevalence of β-thalassemia and other hemoglobinopathies varies among castes and ethnic groups [10]. Approximately one-tenth of the world's thalassemic patients are born in India every year, and the carrier frequency is about 1-3% in southern India and 3-15% in northern India [11]. A lack of carrier screening and prenatal diagnosis is one of the main reasons why India has such a high number of thalassemia-positive individuals, especially children who go undiagnosed for a long time. Many parents do not even know that they are thalassemia carriers until they unintentionally pass on the gene to their children [12]. Regular blood transfusion is the major treatment option for β-thalassemia patients; these multi-transfused patients are at a higher risk of acquiring post-transfusion HCV infection if the transfused blood was collected during the donor's seronegative "window" period [13]. HCV infection in β-thalassemia patients may lead to HCC, which can further worsen their condition [14]. Moreover, infection management in β-thalassemia patients is difficult because they are immunocompromised [15].
The genomic diversity of HCV depends on geographical localization. Different countries may have different genotype distributions [16]. In India, genotype 3 is prevalent in the north, east, and west, and genotype 1 in the south [17]. Although a hospital-based study on the prevalence and genotypic distribution of HCV in thalassemia patients has been conducted [18], a more comprehensive study on the genomic diversity of HCV in β-thalassemic individuals from India was still lacking. This study was undertaken to characterize the viremia, genomic diversity, and evolutionary dynamics of HCV among the multi-transfused β-thalassemia patient population in the Indian state of West Bengal from 2014 to 2019.

Ethical Statement
Written informed consent was obtained from the patients before including them in this study. This study protocol complied with the Helsinki Declaration of 2013 and was approved by the Institutional Ethical Committee, the Indian Council of Medical Research-National Institute of Cholera and Enteric Diseases (ICMR-NICED), Kolkata, India.

Study Design
A total of 917 HCV sero-reactive β-thalassemic patients were enrolled in this prospective study (from January 2014 to December 2019). Patients' blood samples were collected by venipuncture in clot vials, followed by the collection of demographic and clinical data from 10 collaborating transfusion centers (TCs). Patients with other viral co-infections were not included in this study. These 10 TCs were named TC 1-10 and are located in ten southern districts, namely Purulia, Bankura, Burdwan, East Midnapur, West Midnapur, Howrah, Hooghly, North and South 24 Parganas, and Kolkata in West Bengal, an eastern state of India (Figure 1a). All laboratory work was performed at ICMR-NICED, Kolkata. HCV seroreactivity was rechecked by HCV ELISA (HCV Ag-Ab Monolisa kit; BioRad, Marnes-la-Coquette, France). HCV RNA-positive patients were classified into seven age groups (AG), 1 to 7, depending on their age, namely, AG-1 ( RNA (viral load < 50 IU/mL) by quantitative real-time PCR in two consecutive tests 6 months apart. Patients who achieved spontaneous clearance in this study were followed up for 2 years after the initial detection of infection.

Detection of HCV RNA and Genotyping
Viral RNA was isolated from all HCV sero-reactive serum samples using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. In brief, viral RNA was extracted from 140 µL serum, eluted in 50 µL elution buffer, and stored at −80 °C for further use.
Detection of HCV RNA was performed by nested RT-PCR targeting the 5′UTR of the HCV genome as described previously [19]. In brief, the first-round one-tube RT-PCR was carried out in a 20 µL total reaction volume containing 2 µL isolated RNA, and the second-round nested PCR was performed in a 25 µL total volume using 2 µL of the first-round RT-PCR product. A positive band at 256 bp in a 1.5% agarose gel stained with ethidium bromide was observed using a gel documentation system (BioRad, Hercules, CA, USA). HCV RNA was quantified using a Qiagen quantitative real-time RT-PCR (qRT-PCR) kit (QuantiFast Pathogen RT-PCR plus IC Kit, Germany). The HCV primer and probe sequences were directed against the 5′UTR of the HCV genome [20], and the 4th WHO International Standard for HCV (NIBSC code 06/102) was used as the standard.
For the determination of the HCV genotype, nested RT-PCR amplicons of the partial HCV core gene (405 bp) [19] were gel purified and directly used for DNA sequencing in an automated DNA sequencer, model 3130XL (ABI, Van Allen Way, Carlsbad, CA, USA), using a BigDye Terminator 3.1 kit (Applied Biosystems, Austin, TX, USA). The HCV genotype was determined with the NCBI genotyping tool, available at "https://www.ncbi.nlm.nih.gov/projects/genotyping/formpage.cgi" (accessed on 20 February 2020).

Phylogenetic Analysis
Phylogenetic analysis was performed using 122 representative partial core sequences from HCV RNA-positive samples and 29 reference sequences from NCBI. To investigate the evolutionary linkage among lab strains and reference strains, partial core sequences of eighty-six laboratory-isolated 3a strains (accession numbers MN635186-MN635242, MW330397-MW330425), twenty-six 1b strains (accession numbers MN617343-MN617360, MW368893-MW368895), ten 3b strains (accession numbers MN605946-MN605950, MW368887-MW368891), two 3g strains (accession numbers MW368883, MW368885), one 3i strain (accession number MW368886), and one 1a strain (accession number MW368892) were aligned with HCV reference strains using the Molecular Evolutionary Genetics Analysis tool (MEGA-X) [21]. The evolutionary history was inferred using the Maximum Likelihood method and the Tamura-Nei model [22]. This analysis involved 151 nucleotide sequences. All positions with less than 95% site coverage were eliminated, i.e., fewer than 5% alignment gaps, missing data, and ambiguous bases were allowed at any position (partial deletion option). There was a total of 300 positions in the final dataset.
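The partial deletion rule described above can be illustrated with a minimal Python sketch. This is not MEGA-X code; the function name partial_deletion, the min_coverage_pct parameter, and the toy alignment are assumptions made for illustration. A column is kept only when at least 95% of the sequences carry an unambiguous base at that position.

```python
# Bases counted as "present"; gaps, missing data and IUPAC ambiguity
# codes are all treated as missing (an assumption of this sketch).
UNAMBIGUOUS = set("ACGTU")


def partial_deletion(alignment, min_coverage_pct=95):
    """Return the alignment restricted to columns with enough site coverage."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    n = len(alignment)
    kept_columns = []
    for col in range(length):
        present = sum(1 for seq in alignment if seq[col].upper() in UNAMBIGUOUS)
        # Integer comparison avoids floating-point issues at the threshold.
        if present * 100 >= min_coverage_pct * n:
            kept_columns.append(col)
    return ["".join(seq[c] for c in kept_columns) for seq in alignment]


# Hypothetical toy alignment of 20 sequences, not study data.
toy = ["ACGT"] * 18 + ["A-GT", "A-G-"]
filtered = partial_deletion(toy)
print(len(filtered[0]), "columns retained")
```

In the toy alignment, the second column has 90% coverage and is dropped, while the fourth column has exactly 95% coverage and is retained.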

Phylogeographic Analysis
Bayesian Evolutionary Analysis Sampling Trees (BEAST), package 1.10.4, available at "http://beast.community/programs" (accessed on 10 February 2020), was used to study the phylogeographic relationship and distribution of HCV between neighboring countries. HCV core sequences from the following countries: India, China, Pakistan, Bangladesh, Myanmar, Thailand, Japan, Vietnam, Iran, and Australia were downloaded from the ViPR (Virus Pathogen Resource) database, available at "www.viprbrc.org/brc/home.spg?decorator=vipr" (accessed on 20 February 2020); the HCV sequence database, available at "https://hcv.lanl.gov/content/index" (accessed on 21 February 2020); and NCBI. The first 300 bp (corresponding to nt 371 to nt 671 of strain H77, accession number NC_004102) of the conserved HCV core region were used for analysis. The General Time Reversible (GTR) model of nucleotide substitution was used with gamma-distributed rate variation and invariant sites. For the clock setting, an uncorrelated relaxed clock was used [23] along with the Bayesian Skyline coalescent model. The results of the BEAST analyses were evaluated in TRACER, available at "http://beast.community/tracer" (accessed on 10 February 2020), which was also used to confirm that effective sample sizes (ESS) were > 200 [24]. TreeAnnotator, available at "https://beast.community/treeannotator" (accessed on 10 February 2020), and FigTree, available at "https://beast.community/figtree" (accessed on 10 February 2020), were used to construct the final tree.

Statistical Analysis
Paired t-tests were performed to determine whether there was any significant association between male and female viremia among different age groups, demographic distributions, and socio-economic parameters. For genotype distribution, multivariate one-way ANOVA was used to see if there was any relation between male and female patients with HCV genotypes across the age groups or other factors. The Cochran-Mantel-Haenszel (CMH) test was used to evaluate spontaneous clearance between males and females with respect to age groups. Statistical significance was defined as a p-value of less than 0.05.
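As an illustration of how a CMH test compares clearance between sexes while stratifying by age group, the following minimal Python sketch computes the CMH chi-square statistic (1 degree of freedom, without continuity correction) from a list of 2 × 2 tables. The function name cmh_test and the counts are hypothetical and are not the study's data or the software actually used by the authors.

```python
import math


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test over a list of 2x2 tables.

    Each stratum (e.g., one age group) is a tuple (a, b, c, d):
        a = females who cleared,  b = females who did not clear
        c = males who cleared,    d = males who did not clear
    Returns (chi-square statistic with 1 df, p-value).
    """
    diff = 0.0  # sum over strata of observed minus expected count in cell a
    var = 0.0   # sum over strata of the hypergeometric variance of cell a
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff += a - row1 * col1 / n
        var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if var == 0:
        raise ValueError("no informative strata")
    stat = diff * diff / var
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for two age strata, NOT the study's data.
example = [(8, 2, 2, 8), (8, 2, 2, 8)]
stat, p_value = cmh_test(example)
print(f"CMH chi-square = {stat:.2f}, p = {p_value:.4g}")
```

A p-value below 0.05 from such a test would indicate an association between sex and clearance that persists after accounting for age group, which is the threshold stated above.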

Demographic Data and Distribution of HCV Viremia
A total of 917 HCV sero-reactive serum samples (561 male and 356 female) were collected over six years from multi-transfused β-thalassemia patients (defective in β-globin synthesis) at 10 TCs located in ten districts of West Bengal, India. Out of 917 HCV sero-reactive samples, 598 (65.21%) were found to be HCV RNA positive. Of these 598 HCV RNA-positive individuals, 63.71% (n = 381) were male and 36.29% (n = 217) were female. Overall, HCV RNA positivity among males was significantly higher than among females (Table 1). It was also observed that HCV prevalence was significantly higher in rural areas (p = 0.042) and among poor populations (p = 0.002) (Table 1).
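The percentages above follow directly from the reported counts, as the short Python sketch below shows. The variable names are illustrative; the within-sex positivity rate at the end is not reported in the text and is derived here only as an example of how the counts can be compared on a common denominator.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts reported in the text.
sero_reactive = {"male": 561, "female": 356}
rna_positive = {"male": 381, "female": 217}

total_reactive = sum(sero_reactive.values())
total_positive = sum(rna_positive.values())

# Overall viremia among sero-reactive patients.
overall = percent(total_positive, total_reactive)

# Share of each sex among the RNA-positive patients.
share_by_sex = {sex: percent(n, total_positive) for sex, n in rna_positive.items()}

# Derived illustration: RNA positivity within each sex.
rate_within_sex = {
    sex: percent(rna_positive[sex], sero_reactive[sex]) for sex in rna_positive
}

print(total_reactive, total_positive, overall)
print(share_by_sex)
print(rate_within_sex)
```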

Distribution of HCV Genotype
A total of 598 partial core gene sequences (405 bp) were amplified and sequenced for HCV genotyping and phylogenetic analyses. Sequences were aligned with reference sequences using the NCBI genotyping tool, which revealed that 87.79% (n = 525) of the patients in our study population were infected with HCV genotype 3 and 12.21% (n = 73) were infected with HCV genotype 1 (Table 2). The circulating HCV subtypes detected in this region were 1a, 1b, 3a, 3b, 3g, and 3i, with subtype 3a being predominant (78.26%, n = 468), followed by subtype 1b (12.04%, n = 72) and 3b (8.70%, n = 52) (Table 2), while subtypes 3g, 3i, and 1a were found in only three (0.50%), two (0.33%), and one (0.17%) patient(s), respectively. Analysis of the distribution of HCV genotypes in each TC individually revealed that subtype 3a was detected at a significantly high frequency in all 10 TCs (TC1-TC10) (Table 2). Subtype 3a was also predominant in rural areas (80%) and among poor people (80.66%). Population-based DNA sequencing data from the partial core region of the HCV genome showed no multiple HCV genotype infections in any of the β-thalassemia patients in this study population. HCV sequences reported in this study were deposited in GenBank with accession numbers as mentioned in the Materials and Methods section.
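As a consistency check on the subtype counts reported above, the following Python sketch aggregates subtypes into genotypes and recomputes each subtype's share. The helper name genotype_totals is illustrative; the only assumption is that the leading digit of a subtype label is its genotype.

```python
from collections import Counter

# Subtype counts reported in the text.
subtype_counts = {"3a": 468, "1b": 72, "3b": 52, "3g": 3, "3i": 2, "1a": 1}


def genotype_totals(counts):
    """Sum subtype counts by genotype (leading digit of the subtype label)."""
    totals = Counter()
    for subtype, n in counts.items():
        totals[subtype[0]] += n
    return dict(totals)


totals = genotype_totals(subtype_counts)
n_total = sum(subtype_counts.values())
shares = {s: round(100 * n / n_total, 2) for s, n in subtype_counts.items()}

print(totals, n_total)
print(shares)
```

The subtype counts sum to the 598 genotyped patients, and the genotype totals match the 525 and 73 patients reported for genotypes 3 and 1.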

HCV Spontaneous Clearance
Out of 598 HCV RNA-positive patients, 485 patients (309 males and 176 females) could be followed up for two years from the day they tested positive. It was observed that female patients spontaneously cleared HCV more efficiently than male patients (Table 3) and that the 11-15 years age group cleared HCV infection more efficiently than other age groups (Table 3). According to the CMH test, female patients in the 11-15 years age group cleared the virus more efficiently than other groups (Table 3).

Figure 3.
The HCV genotype 1b maximum clade tree was created using the first 300 bp of the core genomic region (corresponding to nt 342 to nt 642 of strain H77, accession number NC_004102). The Bayesian MCMC BEAST software package version 1.10.4 was used to conduct the study. The tree was constructed using reference sequences from China: EU081423, EU081424, EU081407, EU081379, DQ777809; Japan: AB077724, AB062255, D16694, FJ607115, FJ607117, FJ607117, FJ607091, LC368325; Thailand: GQ913859, HQ229043, HQ229114, GQ913860; Vietnam: DQ155498, DQ155495, MH191929, AB301749; Bangladesh: JQ668313, JQ668314; Saudi Arabia: KC143884, KC143890; Myanmar: FJ607133; USA: EU660388; Russia: MN026684; Israel: MT632111; India: KF181665, KF181670, KF181668, KF181661. The prefix "Thal" was used to encode local sequences.

Discussion
HCV infection is a very common transfusion-transmitted infection (TTI) in multi-transfused β-thalassemic individuals, causing serious health complications in this high-risk group. The epidemiology of HCV infection in thalassemic individuals, especially the HCV genotypes and their geographic distribution, is not well reported from India or other countries. Determination of HCV genotypes and subtypes is essential to identify the source of HCV infection in a specified population, and it is also required for clinical management, therapeutic intervention, and development of an effective HCV vaccine. This study was carried out over a period of six years (January 2014 to December 2019) on 917 HCV sero-reactive β-thalassemic patient samples collected from a total of 10 transfusion centers in West Bengal, where patients regularly received blood from their respective district TCs (Figure 1a). The prevalence of active HCV infection (state of viremia) in some of these transfusion centers was very high (Table 1). These findings suggest that TCs could be hotspots for HCV transmission. It can also be postulated that serological tests such as Tri-Dot assays and ELISA, which are commonly used for donor blood screening in blood banks, are insufficient to detect active HCV infection in blood donated during the seronegative "window" period. Hence, HCV-infected blood is being transfused to thalassemic recipients. Therefore, the maintenance of strict transfusion protocols using more sensitive NAT-based RT-PCR screening methods is essential to check donor blood for probable infections before it is used for transfusion and to prevent the transmission of HCV in this high-risk group.
Data from this study revealed that β-thalassemic patients in the age group of 11-15 years had a higher percentage of HCV clearance (Table 3). From these findings, we concluded that the probability of HCV clearance is higher in teenage patients (11-15 years) than in adult patients (>25 years). This may be because serum ferritin levels increase in older thalassemic patients due to repeated blood transfusions, which facilitates HCV replication [25] and leads to faster progression to liver cirrhosis and HCC [26].
It has been previously reported that women can clear acute HCV infection at a higher rate than men [27]. Data from this study also provide evidence for the first time that, among HCV-infected multi-transfused β-thalassemic Indian patients, females were more likely to spontaneously clear the virus than males (Table 3) and, in addition, had a lower chance of HCV RNA positivity than males (Table 1). This sex-based difference in the HCV clearance rate is still unresolved, but there are a few plausible explanations. Firstly, in most viral infections, basal immune responses are higher in females than in males, and low doses of estrogen have been suggested to modulate antiviral innate and adaptive immune responses [28]. Secondly, women have a slower rate of liver disease progression, including liver cirrhosis and HCC [29]; thus, it is not surprising that the burden of HCV disease complications is lower and the rate of infection clearance is higher in thalassemic females than in thalassemic males. In addition to these factors, several other yet-to-be-determined factors, such as menstrual cycles and genetic and immunological factors, may play a key role in favoring HCV clearance in thalassemic females. It has been found that β-thalassemia patients experience menarche at 12-17 years depending on iron chelation therapy (ICT) [30], which is broadly consistent with our finding (Table 3) that female patients in the age group 11-15 years cleared the infection more efficiently. Studying the effects of the menstrual cycle in thalassemia-major women in India is challenging due to a lack of awareness about puberty and menstrual cycles in the Indian female population. Additionally, it is extremely difficult to collect data, as women often hesitate to disclose such information publicly.
In this study, six HCV subtypes were observed: 1a, 1b, 3a, 3b, 3g, and 3i (Figure 1b,c). Subtypes 3g and 3i have not been previously reported from this region, and this is the first time that these two HCV subtypes were found circulating in β-thalassemic patients of West Bengal. Another interesting finding is the predominance of HCV genotype 3 (especially subtype 3a) among β-thalassemic patients, even though the transfusion centers where these patients received blood transfusions were all located in different districts. A similar pattern of HCV genotype 3 (subtype 3a) predominance among β-thalassemic patients in Iran [31] has been previously reported. HCV subtype 3a is mainly distributed through central, middle-east, and south-east Asian countries [32], which comprise the 'Global Thalassemia belt' [33]. In the South American subcontinent, thalassemia cases can be found in Brazil [34] and, surprisingly, HCV genotype 3a is present at a relatively high rate [35] there as well. Although the predominance of HCV genotype 3 (63.85%) in the Indian population has been reported earlier [36], our study reports that the prevalence of HCV genotype 3 among the Indian β-thalassemic patients in this study is even higher (87.79%). Several factors influence HCV replication, viz., host immunity, age, hepatocellular damage, and whether the infection is acute or chronic, and these factors might provide a replicative advantage to one HCV genotype over other genotypes in a specific patient population [37]. Although a detailed analysis is necessary to prove this, from our observations it can be preliminarily speculated that HCV genotype 3a has a replication advantage over other HCV genotypes during the natural course of infection in the β-thalassemic Indian population. HCV genotype 3 is also associated with more aggressive hepatic steatosis and liver cancer [38][39][40], which may pose a serious threat to the lives of β-thalassemia patients and further reduce their lifespan due to additional complications such as iron overload, splenomegaly, etc.
Phylogeographic analysis of HCV strains assists in monitoring the virus for molecular tracing, ancestral studies, genetic assessments, and clinical and therapeutic intervention. Hence, phylogeographic analysis was performed with the two most commonly found HCV subtypes among β-thalassemia patients in West Bengal, 3a and 1b, to obtain information about their hierarchical relationships and genetic evolution. The data suggest that the majority of the HCV 3a isolates from thalassemia patients in West Bengal have sequence similarities with HCV strains from Pakistan, Sri Lanka, and Thailand (Figure 2), whereas the majority of the HCV 1b isolates of this region had sequence similarities to HCV strains from Vietnam, Thailand, Russia, and China (Figure 3), suggesting that the circulating HCV 3a and 1b strains may be traced back to neighboring countries and the wider region. However, we could not perform phylogeographic analyses of HCV subtypes 1a, 3i, and 3g, as we did not isolate sufficient numbers of these strains from this study population to undertake any such analytical study.
Although this is the first detailed molecular epidemiological study of HCV infection among thalassemia patients in India, there were a few limitations. Firstly, 80% of the study population belonged to rural areas with poor economic backgrounds. There was a significant lack of awareness regarding thalassemia, how it is inherited, and the risk of TTIs. This study also included patients from districts with ethnic tribal populations [41,42]. Due to illiteracy and lack of awareness, a few of our patients lost their medical records and were unwilling to comply with medical professionals. These factors need to be addressed to further strengthen such studies. Furthermore, NGS-based whole-genome data of HCV could have enhanced the evolutionary aspect of this epidemiological study [43]. Hence, a combination of harm reduction measures, extensive sampling, and deep sequencing of HCV will help us better understand HCV pathogenesis and spread.

Conclusions
This study is one of the first multi-district, detailed, and thorough analyses of HCV viremia and genotypic distribution among β-thalassemia patients of Eastern India. Our study indicated that male thalassemia patients are more prone to chronic HCV infection than females, and that those in rural areas and economically weaker sections are also at a higher risk of HCV infection. Thalassemia patients between the ages of 11-15 years and female β-thalassemic patients had a higher probability of spontaneously clearing the virus. Two major genotypes with six subtypes are circulating within this study population, and the most prevalent HCV strains were 3a and 1b, which share sequence similarity with strains from neighboring countries such as China, Pakistan, Myanmar, and Bangladesh, suggesting their probable migration into the Indian subcontinent. This investigation identified two HCV subtypes (3g and 3i) that are uncommon in this region. The prevalence of HCV infection is extremely high among Indian thalassemic patients. The introduction of more sensitive NAT-based tests for HCV detection in donor blood is urgently required to overcome this situation.

Funding:
This work was supported by the Department of Science and Technology, Govt. of West Bengal; grant number 758(sanc.)/ST/P/S&T/9G-8/2014, dated 27.11.2014. All instrument facilities were provided by the Indian Council of Medical Research (ICMR-NICED).

Institutional Review Board Statement:
Ethical clearance for this study was obtained from the Institutional Ethics Committee of the National Institute of Cholera and Enteric Diseases (Indian Council of Medical Research), approval number A-1/2013-IEC.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Table 1.
Distribution of HCV viremia among various demographic and socio-economic groups of β-thalassemia patients.

Table 2.
Gender-wise HCV genotype distribution, demographic data, and socio-economic status of research participants with β-thalassemia.

Table 3.
Spontaneous HCV clearance in β-thalassemia patients of various age and sex groups.