Genotypic and Epidemiologic Profiles of Giardia duodenalis in Four Brazilian Biogeographic Regions

Human infections with gut protozoan parasites are neglected and not targeted by specific control initiatives, leading to a knowledge gap concerning their regional diversity and epidemiology. The present study aims to explore Giardia duodenalis genetic diversity and assess the epidemiologic scenario of subclinical infections in different Brazilian biogeographic regions. Cross-sectional surveys (n = 1334 subjects) were conducted in four municipalities in order to obtain fecal samples and socioenvironmental data. Microscopy of non-diarrheal feces and nucleotide sequencing of a β-giardin gene fragment were performed. From a total of 51 samples that could be sequenced, 27 (52.9%) β-giardin sequences were characterized as assemblage A and 24 (47.1%) as assemblage B. In the Amazon, assemblage B was the most frequently detected, predominantly BIII, and with two novel sub-assemblages. Assemblage A predominated in the extra-Amazon region, with five novel sub-assemblages. Prevalence reached 17.8% (64/360) in the Amazon, 8.8% (48/544) in the Atlantic Forest, 7.4% (22/299) in Cerrado and 2.3% (3/131) in the Semiarid. People living in poverty and extreme poverty presented significantly higher positivity rates. In conclusion, subclinical giardiasis is endemic in Brazilian communities in different biogeographic regions, presenting high genetic diversity and a heterogeneous genotypic distribution.


Introduction
Giardia duodenalis is a cosmopolitan, flagellated gut protozoan parasite with higher prevalence in developing countries, mainly in regions with poor sanitation and inappropriate drinking water supply [1]. G. duodenalis trophozoites inhabit the small intestine and, during the course of infection, a variable proportion of cells encyst and are shed in feces. Cysts can persist in the environment and contaminate water and food, constituting the parasite infective stage. In industrialized regions, G. duodenalis causes outbreaks of diarrheal diseases which have water contaminated with fecal material as their main source [2,3].
The Global Enteric Multicenter Study (GEMS) assessed the role of G. duodenalis as an etiologic agent of diarrheal diseases in children living in African and Asian developing countries in a 3-year prospective, age-stratified, multicentric and matched case-control study, demonstrating that G. duodenalis more frequently infected controls than children aged 12-59 months with diarrhea [4]. Similar results were obtained in Cambodia, where higher G. duodenalis positivity rates were found among controls when compared with children with diarrhea [5]. Thus, in endemic areas in developing countries, there is growing evidence that giardiasis is a subclinical and chronic infection, with the parasite being detected in non-diarrheal feces and frequently associated with protein-caloric malnutrition [6,7]. These findings were reported in the Etiology, Risk Factors, and Interactions of Enteric Infections and Malnutrition and the Consequences for Child Health and Development Project (MAL-ED), a multisite birth cohort study [8,9]. Therefore, in vulnerable communities of developing regions, G. duodenalis, through complex pathophysiological mechanisms, impairs the absorption of nutrients and affects both the nutritional status and physical development of children in a process that does not depend on the presence of diarrhea [10,11].
G. duodenalis infects a wide range of mammalian species and presents considerable intraspecific genetic variation, with eight distinct assemblages (A to H). Assemblages A and B were identified mainly in humans, C and D in wild and domestic canines, E in ruminants and domestic pigs, F in cats, G in mice and rats and H in seals [12,13]. Giardiasis is a potentially zoonotic infection, and cross-host transmission has been well documented [14]. The degree of genetic divergence between assemblages A and B has led to the proposition of two distinct species [15,16]. In Rio de Janeiro, Brazil, assemblage E was detected in human infections [17].
Brazil occupies most of the South American continent and has great biogeographic and climatic diversity. Rainfall regimes vary widely, from rain forests to semiarid areas, and are associated with different water management strategies, creating distinct regional scenarios for the epidemiology of water-borne infections.
Infections with gut protozoan parasites are not targeted by specific intestinal parasite control strategies, which are based on the mass administration of anthelmintic drugs [18]. This has led to a knowledge gap concerning the prevalence, distribution and factors associated with subclinical giardiasis in many regions, as well as concerning parasite genetic diversity. The present study aims to describe the parasite genetic heterogeneity and the epidemiological scenario of subclinical G. duodenalis infection in an urban Amazonian area and three extra-Amazonian Brazilian biogeographic regions.

Study Design
Cross-sectional surveys (n = 1334; Table 1) were performed in order to obtain sociodemographic, anthropometric and sanitation data as well as fecal samples for parasitological and molecular analyses (Figure 1). For logistical reasons in the field work, only one fecal sample was obtained from each person. The interviews were conducted face-to-face by the research team in domiciles. All fecal samples were non-diarrheal stools from asymptomatic subjects. Weight (n = 844) and height (n = 742) were obtained from individuals between 0 and 14 years of age. Weight was obtained with a portable electronic scale, to the nearest 100 g. Children were barefoot and wore minimal clothing. Infants aged less than 12 months were weighed in their mothers' arms. Height or length was measured using an anthropometer to the nearest 0.1 cm. The nutritional parameters height-for-age Z-scores (HAZ) and weight-for-age Z-scores (WAZ) were calculated with the Nutrition module of the Epi Info™ v.3.5.1 software to verify the presence of protein-energy malnutrition characterized by stunting (HAZ < −2) and low weight (WAZ < −2). Extreme poverty was defined as a monthly family per capita income below BRL 125, which corresponds to USD 25 (considering the exchange rate of USD 1 = BRL 5). Poverty was defined as a monthly family per capita income between BRL 125 and BRL 250. The researchers gathered information about the site of defecation, i.e., whether the family had a latrine inside the house or whether family members practiced open defecation in the peridomestic environment. The final destination of the feces was also characterized as adequate when the feces went to closed septic tanks and inadequate when they were deposited in the ground, in rudimentary holes or directly into a waterbody.
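The income and anthropometric definitions above can be expressed as simple classification rules. The following Python sketch is illustrative only and is not the Epi Info™ procedure used in the study: the function names are hypothetical, and the handling of incomes exactly at BRL 125 or BRL 250 is an assumption, since the text does not state how boundary values were treated.

```python
from typing import Optional

# Thresholds taken from the text (monthly family per capita income, in BRL).
EXTREME_POVERTY_BRL = 125.0
POVERTY_BRL = 250.0
# Z-score cutoff from the text: HAZ < -2 (stunting), WAZ < -2 (low weight).
Z_CUTOFF = -2.0


def income_category(per_capita_brl: float) -> str:
    """Classify monthly family per capita income.

    Assumption: values from 125 up to and including 250 count as "poverty".
    """
    if per_capita_brl < EXTREME_POVERTY_BRL:
        return "extreme poverty"
    if per_capita_brl <= POVERTY_BRL:
        return "poverty"
    return "above poverty line"


def is_stunted(haz: Optional[float]) -> Optional[bool]:
    """Return True when HAZ < -2; None when height was not measured."""
    return None if haz is None else haz < Z_CUTOFF


def is_low_weight(waz: Optional[float]) -> Optional[bool]:
    """Return True when WAZ < -2; None when weight was not measured."""
    return None if waz is None else waz < Z_CUTOFF


if __name__ == "__main__":
    # Hypothetical child record, for illustration only.
    print(income_category(110.0), is_stunted(-2.4), is_low_weight(-1.1))
```

Returning `None` for missing measurements keeps unmeasured children out of both the malnourished and the well-nourished groups, which matters here because weight and height were available for different numbers of subjects.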

Socio-Environmental Characteristics of the Biogeographic Regions
Brazil has great social and environmental diversity. Many regions share poor access to drinking water and inadequate sewage systems. The study was conducted in four localities whose sociodemographic, environmental and climatic differences are presented in Table 1: São João do Piauí and Teresina, in the state of Piauí; Bagre, in the state of Pará; and Cachoeiras de Macacu, in the state of Rio de Janeiro (see map in Figure 2).

Parasitological Examinations
Fecal samples were collected in plastic bottles without preservatives and sent to the field laboratory to be examined through light microscopy using the Ritchie and saturated glucose solution flotation techniques. For the Ritchie method, fecal suspensions were homogenized, filtered through folded gauze and centrifuged at 2500 rpm for 1 min. Sediment was resuspended with neutral detergent and water and centrifuged at 2500 rpm for 1 min. Sediment was examined through light microscopy. For the saturated glucose solution flotation technique, fecal material subjected to a hyperosmolar sucrose solution was collected on the surface with the aid of a loop and examined on a microscope slide. Two experienced parasitologists examined all samples.

DNA Extraction, PCR Amplification and Nucleotide Sequencing of β-Giardin Encoding Gene Fragment
After parasitological examinations, 137 G. duodenalis-positive fecal samples were obtained. Since the success rate in the amplification and sequencing of parasitological material of fecal origin is usually low, and to enable a greater number of sequences, PCR of a pool of microscopically negative samples was also performed (n = 42), usually from inhabitants who lived in the homes of positive subjects. Thus, a total of 179 fecal samples were submitted for PCR. Genomic DNA was extracted from 200 µL of the sedimented fecal material using the ZR Fungal/Bacterial DNA MiniPrep™ kit (ZymoResearch, Irvine, CA, USA).
PCR was performed using the Platinum Taq DNA Polymerase kit (Invitrogen, Waltham, MA, USA) with a final volume of 50 µL, and targeted a 753 bp region of the β-giardin locus of G. duodenalis, as described [19]. The PCR conditions were: 1X PCR Buffer, 1.5 mM MgCl2, 0.05 mM dNTP, 10 pmol of each primer, 2.5 U of Taq polymerase and ~40 ng of template DNA. Amplification parameters included an initial denaturation at 94 °C for 5 min followed by 35 cycles of amplification comprising denaturation (94 °C for 30 s), annealing (65 °C for 30 s) and extension (72 °C for 30 s), and a final extension at 72 °C for 5 min. The PCR products were purified with polyethylene glycol (PEG). Capillary electrophoresis was performed in an ABI3730 automated DNA sequencer (Applied Biosystems) in PDTIS/Fiocruz Genomic Platform RPT01A.
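The cycling program described above can be written down as data, which makes the programmed hold time easy to check. The Python sketch below is illustrative only: the `Step` class and the variable names are hypothetical, and the total excludes ramping time between temperatures, which depends on the thermocycler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int    # hold temperature in degrees Celsius
    seconds: int   # hold time in seconds


# Values transcribed from the amplification parameters in the text.
INITIAL = Step("initial denaturation", 94, 5 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 65, 30),
    Step("extension", 72, 30),
)
N_CYCLES = 35
FINAL = Step("final extension", 72, 5 * 60)


def programmed_seconds() -> int:
    """Sum of programmed hold times (ramp times are not included)."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


if __name__ == "__main__":
    total = programmed_seconds()
    print(f"{total} s = {total / 60:.1f} min")  # 3750 s = 62.5 min
```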

Sequence Data Analysis
The sequences were edited and analyzed using the BioEdit v.7.2.5 software [20]. The Basic Local Alignment Search Tool (BLAST-NCBI https://www.ncbi.nlm.nih.gov accessed on 20 August 2021) was used to verify similarity with G. duodenalis sequences. All sequences generated were deposited in the GenBank database under the accession numbers MW679411-MW679461. To determine G. duodenalis assemblages (genotypes), the sequences were aligned in BioEdit v.7.2.5 with 55 orthologous G. duodenalis reference sequences retrieved from GenBank. Sequences with degenerate bases were not included. Further details of reference strains can be found in Supplementary Table S1. The most appropriate substitution model was estimated using the Bayesian Information Criterion (BIC) in MEGA v.7 software [21]. Maximum likelihood (ML) and neighbor joining (NJ) genetic trees were constructed in MEGA v.7 software using a Tamura-Nei model (1000 bootstrap replicates). The median-joining (MJ) haplotype network based on distance criteria was constructed using the Network v.10.1.0.0 software (Fluxus Technology Ltd., www.fluxusengineering.com) [22]. The DNA Sequence Polymorphism (DnaSP) v.5.10.01 software was used for editing the files [23]. To evaluate the intraspecific genetic diversity of G. duodenalis, diversity indexes were determined for each population pair using Arlequin v.3.5.2.2 software (http://cmpg.unibe.ch/software/arlequin35 accessed on 20 August 2021) [24]. The populations were grouped considering assemblage, geographic origin and Brazilian regions.
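For readers unfamiliar with diversity indexes, haplotype (gene) diversity H is commonly estimated with Nei's unbiased formula, H = n/(n − 1) × (1 − Σ pᵢ²), where n is the number of sequences and pᵢ is the frequency of haplotype i. The Python sketch below illustrates that estimator on a hypothetical toy sample. It is assumed, not confirmed by the text, that the H values reported in this study follow this definition, and the sketch does not reproduce the sampling variance that accompanies the reported values.

```python
from collections import Counter
from typing import Iterable


def haplotype_diversity(haplotypes: Iterable[str]) -> float:
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Hypothetical labels: four sequences, three distinct haplotypes.
    toy_sample = ["h1", "h1", "h2", "h3"]
    print(round(haplotype_diversity(toy_sample), 3))  # 0.833
```

H is 0 when every sequence carries the same haplotype and approaches 1 when every sequence is distinct, which is why very small reference groups with all-unique sequences can reach H = 1.000.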

Statistical Analysis
G. duodenalis positivity rates were described in different groups defined by sociodemographic characteristics and nutritional status. Prevalence ratios (PRs) and their respective 95% confidence intervals (CIs) were calculated. The statistical significance of the differences between the positivity rates was assessed by Fisher's exact test. Associations were considered statistically significant when p < 0.05. Statistical analyses were performed with Epi Info 2000 ® (CDC, Atlanta, GA, USA).
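The calculations named above can be sketched in plain Python. This is not the Epi Info procedure used in the study: the function names are hypothetical, the confidence interval uses the common log-scale (Katz) approximation, which is an assumption about the method, and the two-sided Fisher p-value sums the probabilities of all tables no more likely than the observed one. The example compares Bagre (64/360) with São João do Piauí (3/131), counts taken from the text; the choice of this pair is purely illustrative.

```python
import math
from typing import Tuple


def prevalence_ratio(a: int, n1: int, c: int, n0: int,
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Prevalence ratio (group 1 vs. group 0) with a log-scale 95% CI.

    a/n1 and c/n0 are positives/examined in each group.
    """
    pr = (a / n1) / (c / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(pr) - z * se)
    upper = math.exp(math.log(pr) + z * se)
    return pr, lower, upper


def fisher_exact_two_sided(a: int, n1: int, c: int, n0: int) -> float:
    """Two-sided Fisher's exact p-value for a 2x2 table with fixed margins."""
    positives = a + c
    denom = math.comb(n1 + n0, positives)

    def prob(k: int) -> float:
        # Hypergeometric probability of k positives in group 1.
        return math.comb(n1, k) * math.comb(n0, positives - k) / denom

    p_obs = prob(a)
    k_min = max(0, positives - n0)
    k_max = min(n1, positives)
    total = sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    pr, lower, upper = prevalence_ratio(64, 360, 3, 131)
    p_value = fisher_exact_two_sided(64, 360, 3, 131)
    print(f"PR = {pr:.2f} (95% CI {lower:.2f}-{upper:.2f}), p = {p_value:.2g}")
```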

Ethics
The study was approved by the Research Ethics Committee (license CAAE 12125713.5.0000.5248) of the Oswaldo Cruz Institute, Fiocruz.

Genetic Diversity of Giardia duodenalis
Of the 179 fecal samples submitted to PCR/sequencing, 51 were successfully genotyped using the β-giardin locus. Overlapping peaks were not observed in the nucleotide sequences. Twenty-seven (52.9%) sequences were characterized as assemblage A and twenty-four (47.1%) as assemblage B (Table 2).
The MJ haplotype network (Figure 3) showed that the G. duodenalis sequences were grouped by assemblages, as expected. Assemblages A and B presented a star-like shape, including the sequences obtained in this study as a central and dominant haplotype (except the novel haplotypes). ML and NJ phylogenetic trees (Figure 4) also demonstrated that the G. duodenalis sequences were grouped by assemblages. The main difference between the two trees was in the NJ tree: assemblage D shared a common ancestor with assemblages B and E.
Concerning the molecular diversity indexes, in general, assemblage B revealed greater intraspecific diversity when compared with assemblage A (H = 0.921 ± 0.028 vs. H = 0.854 ± 0.029) (Supplementary Table S2). In Brazil, assemblage A showed greater intraspecific diversity (H = 0.879 ± 0.037) when compared with Europe and North America (H = 0.822 ± 0.096 and 0.666 ± 0.204, respectively). In contrast, assemblage B in Brazil showed lower intraspecific diversity when compared with North America, Asia and Europe (H = 0.918 ± 0.033 vs. 1.000 ± 0.500, 1.000 ± 0.126 and 1.000 ± 0.272). Assemblage A from São João do Piauí showed lower diversity when compared with the other biomes in the present study. Assemblage B from the Amazon biome showed a lower diversity (H = 0.757 ± 0.086) when compared with reference sequences from the same biome (H = 0.991 ± 0.025) (Supplementary Table S2).
Table 3 shows positivity rates (by microscopy) in different groups defined by sociodemographic characteristics and nutritional status and the association of giardiasis with other intestinal protozoa (coinfection with Entamoeba histolytica/E. dispar and with Entamoeba coli). The overall positivity rate was 137/1334 (10.3%). The frequency was significantly higher in Bagre, in the Amazon region, reaching 64/360 (17.8%) and lower in São João do Piauí, in the Caatinga (3/131 (2.3%)).
In Bagre, Teresina and Cachoeiras de Macacu, the age groups of 3 to 6 years old and 7 to 15 years old had the highest rates of positivity, but infants and toddlers up to 2 years old were also frequently infected in Bagre and Cachoeiras de Macacu. Giardiasis was significantly more frequent among people living in poor and extremely poor families in Bagre, Teresina and Cachoeiras de Macacu. People living in scenarios of inappropriate disposal of feces and open defecation also presented significantly higher positivity. Individuals positive for Entamoeba histolytica/E. dispar or E. coli were infected with G. duodenalis at significantly higher frequencies.
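As a quick arithmetic check, the regional positivity rates and the overall rate can be recomputed from the counts reported in the text. The Python sketch below uses only those published counts; the dictionary layout and the function name are illustrative choices.

```python
# (positives, examined) per locality, as reported in the text.
COUNTS = {
    "Bagre (Amazon)": (64, 360),
    "Cachoeiras de Macacu (Atlantic Forest)": (48, 544),
    "Teresina (Cerrado)": (22, 299),
    "São João do Piauí (Semiarid)": (3, 131),
}


def percent(positives: int, examined: int) -> float:
    """Positivity rate as a percentage, rounded to one decimal place."""
    return round(100 * positives / examined, 1)


RATES = {place: percent(pos, n) for place, (pos, n) in COUNTS.items()}
TOTAL_POSITIVES = sum(pos for pos, _ in COUNTS.values())
TOTAL_EXAMINED = sum(n for _, n in COUNTS.values())

if __name__ == "__main__":
    for place, rate in RATES.items():
        print(f"{place}: {rate}%")
    print(f"Overall: {TOTAL_POSITIVES}/{TOTAL_EXAMINED} "
          f"({percent(TOTAL_POSITIVES, TOTAL_EXAMINED)}%)")
```

The regional numerators and denominators sum to the overall 137/1334, so the four surveys account for every subject in the pooled rate.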

Discussion
The present study explored regional differences in the genotypic composition and epidemiologic profile of G. duodenalis in different Brazilian biogeographic regions. Regarding the detection frequencies of the main assemblages, some regional differences could be observed, even considering the relatively small proportion of positive samples in which partial sequencing of the β-giardin gene was achieved. It is important to note that the large amount of bacterial DNA in fecal samples, as well as inhibitory substances in the material, usually reduces the success of PCR amplification.
In the Amazon region, there was a predominance of assemblage B, and in the extra-Amazonian area, assemblage A was more frequently detected. A similar genotypic profile was described in a previous work when we compared another Amazon region (the Rio Negro basin) with different extra-Amazonian regions [25,26]. A high detection rate of assemblage B was also reported in the Colombian Amazon [27]. In the present study, besides differences in genotypic composition, the Amazon region presented a substantially higher prevalence of G. duodenalis subclinical infection. In São Paulo (southeastern Brazil), a predominance of assemblage A was reported in low-income families [28]; this predominance was also demonstrated in environmental samples recovered from sewage in the same region [29]. In Rio de Janeiro, an increase in the proportion of infections by assemblage B in recent years has been suggested, with this genotype being associated with HIV coinfection and more severe symptoms, influenced by the degree of immunosuppression presented by patients [30,31]. A prospective study carried out in Fortaleza, northeastern Brazil, suggests that the parasite burden of G. duodenalis assemblage B infections is higher, with greater shedding of cysts and, consequently, a greater potential for spreading [32]. Outside of the Amazon, the predominance of assemblage A was also demonstrated in southern and northeastern Brazil [33,34].
Taken together, these data may point to putative differences in the epidemiologic profile of G. duodenalis assemblages A and B. However, prospective studies with large sample sizes and quantitative assessments of the parasite load, response to treatment and the clinical and nutritional impact of giardiasis have not yet been carried out in order to assess clear clinical and epidemiological differences between the main genotypes of G. duodenalis. The present study did not include subjects with diarrhea, and infections were identified in subclinical conditions. In Saudi Arabia, children infected with assemblage B were predominantly symptomatic, whereas asymptomatic participants harbored assemblages AI and AII [35]. Conversely, in Egypt, it was demonstrated that iron deficiency anemia and intestinal symptoms were mainly associated with assemblage A [36].
Despite the lack of robust data on the clinical and epidemiological differences between G. duodenalis assemblages A and B, the genetic variation between them is well established. Complete G. duodenalis genomes revealed substantial phylogenetic divergence between the two main genotypes infecting humans and demonstrated that: (i) the average amino acid identity across 4300 orthologous proteins does not exceed 78% and (ii) the full-genome-derived amino acid identity between enzootic assemblage E and assemblages A and B is 90% and 81%, respectively [31]. Our phylogenetic inferences based on partial sequencing of the β-giardin gene also demonstrated marked divergence between assemblages A and B in Brazil. In the ML analysis, assemblage E sequences obtained from GenBank were more closely related to our samples characterized as assemblage A, while the NJ analysis placed assemblage E closer to assemblage B. In both phylogenetic trees, assemblage A was more closely related to enzootic assemblages C and F than to assemblage B.
The comparison of some socio-environmental characteristics of the four regions assessed in the present study reveals that Bagre in the Amazon region, the locality with the highest prevalence of G. duodenalis infection, has the lowest human development index and the highest proportion of people living in poverty and extreme poverty. Historically, in the Amazon, the process of demographic concentration of populations of Amerindian descent has favored the spread of parasitic and fecal-borne diseases. The Amazon is the region of the country with the highest prevalence of soil-transmitted helminthiases, incidence of diarrheal diseases and diarrhea-related mortality. The riverside character of Bagre should also be considered, with its close proximity and greater contact of the population with waterbodies in a region of abundant rainfall and poor sanitation, favoring the spread of water-borne infections. This contrasts with the semiarid climate and very low rainfall in São João do Piauí, where the prevalence of giardiasis was lower. Intermediate positivity rates were observed in Teresina and Cachoeiras de Macacu, in the Cerrado and Atlantic Forest biogeographic regions, respectively. In addition, our data demonstrated that G. duodenalis positivity is strongly influenced by the living conditions of the studied subjects, as the positivity rate was higher among children within the poorest families in Bagre, Teresina and Cachoeiras de Macacu and among those with inadequate disposal of feces in Teresina. The detected association of giardiasis with other gut protozoan parasites, such as E. histolytica/E. dispar and Entamoeba coli, reinforces the vulnerability of the poorest families to fecal-borne infections.
In conclusion, subclinical giardiasis, identified in subjects without diarrhea, is frequent in resource-poor Brazilian communities, with prevalence and genotype distribution varying geographically. The data suggest the need to improve control strategies, including better access to diagnosis and treatment.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All nucleotide sequencing data have been submitted to GenBank as shown in the methods section. The data that support the findings of this study are available on request from the corresponding author, Carvalho-Costa FA. The data are not publicly available because they contain information that could compromise the privacy of research participants.