Multilocus Genotyping of Giardia duodenalis in Mostly Asymptomatic Indigenous People from the Tapirapé Tribe, Brazilian Amazon

Little information is available on the occurrence and genetic variability of the diarrhoea-causing enteric protozoan parasite Giardia duodenalis in indigenous communities in Brazil. This cross-sectional epidemiological survey describes the frequency, genotypes, and risk associations for this pathogen in Tapirapé people (Brazilian Amazon) at four sampling campaigns during 2008–2009. Microscopy was used as a screening test, and molecular (PCR and Sanger sequencing) assays targeting the small subunit ribosomal RNA, the glutamate dehydrogenase, the beta-giardin, and the triosephosphate isomerase genes as confirmatory/genotyping methods. Associations between G. duodenalis and sociodemographic and clinical variables were investigated using Chi-squared test and univariable/multivariable logistic regression models. Overall, 574 individuals belonging to six tribes participated in the study, with G. duodenalis prevalence rates varying from 13.5–21.7%. The infection was positively linked to younger age and tribe. Infected children <15 years old reported more frequent gastrointestinal symptoms compared to adults. Assemblage B accounted for three out of four G. duodenalis infections and showed a high genetic diversity. No association between assemblage and age or occurrence of diarrhoea was demonstrated. These data indicate that the most likely source of infection was anthropic and that different pathways (e.g., drinking water) may be involved in the transmission of the parasite.


Introduction
The flagellated Giardia duodenalis (syn. G. intestinalis, G. lamblia) is a cosmopolitan protozoan parasite that inhabits the gastrointestinal tract of humans and other vertebrate animals. Giardiasis is the most reported intestinal protozoan infection globally, with an estimated 280 million symptomatic cases every year [1]. Asymptomatic infections are even more frequent, both in developing [2,3] and developed [4] countries. Indeed, large epidemiological case-control studies conducted in high-prevalence settings have demonstrated that G. duodenalis infection was significantly more common in asymptomatic controls than in cases with diarrhoea [5][6][7]. Host immune status and level of nutrition seem to be key factors in the control of the infection or its progression to active disease [8], although the genotype of the parasite may also play a role in the health/disease balance of the host [9]. When present, clinical manifestations associated with G. duodenalis infection may include self-limiting acute diarrhoea, persistent diarrhoea, epigastric pain, nausea, and vomiting [10]. Long-term sequelae, including childhood growth retardation and cognitive impairment, have also been recognised [11,12]. Contrary to severe infections by other diarrhoea-causing protozoan parasites such as Cryptosporidium spp. or Entamoeba histolytica, giardiasis is rarely fatal and is better considered as a debilitating condition.
Transmission of G. duodenalis is through the faecal-oral route, either directly via contact with infected humans or animals, or indirectly via ingestion of contaminated food or water. Waterborne transmission is likely the most common source of human infections in resource-poor settings with little or no access to safe drinking water and insufficient sanitary facilities [3]. Because of its strong bond with poverty and elevated socioeconomic impact, giardiasis (together with cryptosporidiosis) joined the Neglected Diseases Initiative launched by the World Health Organisation in 2004 [13].
Giardia duodenalis exhibits a considerable degree of genetic heterogeneity, allowing the differentiation of eight (A-H) lineages or assemblages with marked differences in host specificity and range [14]. These genetic variants likely represent cryptic species [15]. Assemblages A and B cause most human infections, but they can also infect other mammalian hosts and are, therefore, considered potentially zoonotic. Assemblages C and D occur mainly in canids, assemblage E in domestic and wild ungulates, assemblage F in cats, assemblage G in rodents, and assemblage H in marine pinnipeds. Human infections by assemblages C-F have been sporadically reported, particularly in children and immunocompromised individuals [14].
A recent review on the epidemiological situation of G. duodenalis in Brazil has revealed that this protozoan parasite represents a public health concern in the country, with prevalence rates up to 78% in Minas Gerais State and 70% in São Paulo State in 1998 [16]. Available molecular data in the country have evidenced marked differences in the geographical segregation of G. duodenalis assemblages circulating in human populations (Table S1), domestic and wildlife animal species (Table S2), surface waters (Table S3), and fresh produce (Table S4), likely reflecting disparities in infection sources and transmission pathways. Indeed, contaminated surface waters and having contact with domestic (mainly dog) animals were considered as probable sources of human infections [16]. Despite this relative abundance of epidemiological data, giardiasis has been poorly studied in Brazilian indigenous people, partially due to the geographical isolation and difficulty in accessing these fragile communities. Thus, G. duodenalis infections have been documented by conventional (microscopy) methods in the range of 7-47% in the Parakanã indigenous people in the eastern Amazon region [17], in indigenous communities in the municipality of São Gabriel da Cachoeira, Amazonas State [18], in native Brazilian children in the Xingu Indian Reservation, Mato Grosso State [19], in the Maxakali and Xukuru-Kariri indigenous communities, Minas Gerais State [20,21], and in the Terena indigenous people, Mato Grosso do Sul State [22]. However, no information is currently available on the G. duodenalis assemblages and sub-assemblages circulating in native Brazilian people. This molecular-based epidemiological survey aims at investigating the genetic diversity of G. duodenalis and assessing potential risk and/or protective factors associated with the infection in indigenous people from the Tapirapé tribe living in the Brazilian Amazon.

Study Population
In this study, a total of 574 individuals (male/female ratio: 0.96; age range: 0.1-88 years old, median: 14.0 years old) of the Tapirapé ethnicity living in six independent tribes (population range: 40-263 inhabitants, standard deviation: 83.3) were censed and invited to participate in four consecutive sampling campaigns between July 2008 and January 2010, in both dry and wet seasons. Overall, 98% (564/574) of the censed individuals participated in at least one sampling campaign. Participation rates ranged from 40% to 93% depending on the tribe and sampling campaign (Table 1). A total of 141 individuals participated in all four sampling campaigns, 201 in three sampling campaigns, 136 in two sampling campaigns, and 86 in a single sampling campaign. The distribution of the participating individuals according to sex, age group, and tribe of origin is also summarised in Table 1. Females (mean: 53.6%, SD: 1.0) participated in the survey more often than males (mean: 46.4%, SD: 1.0). Adults (>15 years old) were the largest group in the surveyed population (mean: 24.6%, SD: 3.2), with children 5 to 14 years of age (mean: 39.4%, SD: 1.9) and children ≤5 years old accounting, on average, for 14.6% (SD: 2.0) of the investigated individuals.
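The participation figures above can be cross-checked with simple arithmetic. The short Python sketch below uses only the counts stated in this section; the variable names are illustrative and not part of the study.

```python
# Number of individuals grouped by how many sampling campaigns they joined
# (counts as reported in the text above).
participants_by_campaigns = {4: 141, 3: 201, 2: 136, 1: 86}
censed = 574

# Individuals who joined at least one campaign.
participated = sum(participants_by_campaigns.values())
participation_rate = 100 * participated / censed

print(f"{participated}/{censed} participated ({participation_rate:.1f}%)")
```

The four groups add up to the 564 participants reported, which corresponds to the stated overall participation of about 98%.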

Prevalence of G. duodenalis
Microscopy-based prevalence rates for G. duodenalis in the Tapirapé community varied from 13.5% (55/407) in the dry season of 2009 to 21.7% (83/382) in the rainy season of 2010 (Table 2). Over the four sampling campaigns, 35.1% (198/564) of individuals tested positive at least once. The occurrence of the parasite was influenced by seasonality (22% rainy versus 17% dry season, Chi-squared test p = 0.022) but not by the sampling period (year) (Chi-squared test p = 0.126). Subsequent newly diagnosed infections were also more likely to occur in the rainy season (odds ratio: 2.29, 95% CI 1.46-3.68, p = 0.0001), although this is dependent on the number of samples analysed. G. duodenalis infections were more commonly identified in children aged 0-4 years old. During the period of study, G. duodenalis prevalence varied greatly within and among the six tribes investigated, but tribe 5 presented the highest infection rates in all sampling campaigns.
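As an illustration of how the prevalence figures above follow from the raw counts, the sketch below recomputes the proportions and attaches a Wilson score interval to each. The confidence intervals are not reported in the study; they are an added assumption-based illustration of sampling uncertainty, and the function name is hypothetical.

```python
import math

def wilson_interval(positives, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = positives / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Counts taken from the text; labels are descriptive only.
counts = {
    "dry season 2009": (55, 407),
    "rainy season 2010": (83, 382),
    "positive at least once": (198, 564),
}

for label, (pos, n) in counts.items():
    low, high = wilson_interval(pos, n)
    print(f"{label}: {100 * pos / n:.1f}% "
          f"(95% CI {100 * low:.1f}-{100 * high:.1f}%)")
```

The Wilson interval is used here instead of the simpler normal approximation because it behaves better for proportions far from 50%; any other standard interval would serve the same illustrative purpose.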
A total of 43, 17, and 4 individuals tested positive for G. duodenalis in two, three, or all four sampling campaigns, respectively (Table 3), although this was dependent on the number of samples. In all cases, children younger than 15 years of age accounted for 50.0% to 88.2% of the subjects where the parasite was detected in two or more sampling campaigns. When considering observations with repeated samples, 61.7% of observations were always negative, 4.2% always positive, and 34.1% discontinuously positive. Repeated G. duodenalis infections were more frequently detected in the wet season (odds ratio: 1.60, 95% CI 1.12-2.29, p = 0.0075) in members of tribe 1 (range: 50.0-58.1%) and, to a lesser extent, in tribe 5 (range: 18.6-50.0%).
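The odds ratios quoted in this section summarise 2x2 contingency tables. A minimal sketch of one common way to obtain an odds ratio with a Wald 95% confidence interval is given below. The counts are hypothetical and are NOT taken from the study, and the study itself may have derived its estimates differently (e.g., from logistic regression models).

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed and positive     b: exposed and negative
    c: unexposed and positive   d: unexposed and negative
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (not study data).
odds_ratio, lower, upper = odds_ratio_wald(20, 80, 10, 90)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1, as for both odds ratios reported above, indicates an association at the 5% significance level.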

Molecular Characterisation of G. duodenalis
The genetic diversity within G. duodenalis was investigated in a subset of 70 stool samples from 65 individuals with a positive result for this parasite by conventional microscopy. Five individuals provided stool samples positive to this parasite at two different sampling periods. The presence of the parasite was confirmed by qPCR in 97% (68/70) of these samples. Generated cycle threshold (Ct) values ranged from 18.2 to 35.4 (median: 27.4; SD: 3.7).

Intra-Assemblage Genetic Diversity
Tables 5-7, Table S5 show the genetic diversity of the gdh, bg, and tpi representative, partial sequences generated in the present study. These Tables provide information for each sequence including stretch, single nucleotide polymorphisms (SNPs), and GenBank accession number. Assemblage/sub-assemblage assignment was conducted by direct comparison of the sequencing results obtained at the three loci investigated. Sequences presenting double peak positions that could not be unequivocally assigned to a given assemblage/sub-assemblage were reported as ambiguous sequences.
A total of 63 sequences were successfully characterised at the gdh locus (Table 5). All 17 assemblage A sequences were unequivocally identified as sub-assemblage AII. Of them, seven sequences were 100% identical to reference sequence L40510. The remaining 10 sequences differed by 1-6 SNPs from L40510. BIII sequences showed a high degree of genetic diversity, with 21/24 of the sequences assigned to this sub-assemblage corresponding to distinct genotypes (genetic variants) of the parasite. These sequences differed by 4-13 SNPs from reference sequence AF069059, most of them associated with ambiguous (double peak) positions. Similarly, most (20/22) sequences identified as ambiguous BIII/BIV sequences were distinct from each other, differing by 9-17 SNPs from reference sequence L40508. Virtually all SNPs detected in BIII/BIV sequences corresponded to double peaks at single nucleotide positions.
At the bg locus, a total of 55 sequences were fully characterised (Table 6). Out of the 14 assemblage A sequences, two belonged to AII and five to AIII. All AII and AIII sequences were identical to reference sequences AY072723 and AY072724, respectively. Five sequences were considered mixed AII + AIII infections based on the presence of two double peak (C415Y and T423Y) positions and taking sequence AY072723 as reference. Two additional sequences corresponded to AII + B and AIII + B mixed infections, differing by 32 and 38 SNPs from reference sequence AY072727, respectively. Except for one, all the detected SNPs corresponded to clear double peak positions. Compared to the gdh locus, a lower (but still substantial) degree of genetic variability was observed within the 41 sequences assigned to assemblage B at the bg locus. All of them differed by 1-6 SNPs from reference sequence AY072727. A genetic variant showing two transitional mutations at positions C165T and A183G was the genotype most frequently detected.
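The SNP labels used above combine a reference base, a position, and the observed call, where IUPAC ambiguity codes (such as Y for C or T) denote double peaks in the chromatogram. The sketch below shows one way such labels could be parsed and classified. It is an illustration only: the function name and output layout are hypothetical, and it is not the procedure used by the authors.

```python
import re

# Standard IUPAC two-base ambiguity codes.
IUPAC = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}
PURINES, PYRIMIDINES = set("AG"), set("CT")

def describe_snp(label):
    """Parse a label such as 'C415Y' into reference base, position, and kind."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGTRYSWKM])", label)
    if match is None:
        raise ValueError(f"unrecognised SNP label: {label}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if ref == alt:
        raise ValueError(f"no change in label: {label}")
    if alt in IUPAC:
        # Ambiguity code: two bases were read at this position (double peak).
        return {"ref": ref, "pos": pos, "kind": "double peak", "bases": IUPAC[alt]}
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    pair = {ref, alt}
    is_transition = pair <= PURINES or pair <= PYRIMIDINES
    kind = "transition" if is_transition else "transversion"
    return {"ref": ref, "pos": pos, "kind": kind, "bases": alt}

for label in ["C415Y", "T423Y", "C165T", "A183G"]:
    print(label, describe_snp(label))
```

Under this reading, C415Y and T423Y are double peaks involving C and T, whereas C165T and A183G are transitions, in agreement with the description in the text.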
The distribution of single point mutations and double peaks differed substantially among sub-assemblages and loci. At the gdh locus, hotspot sites accumulated 57.7% of all SNPs detected in BIII sequences, but this figure increased to 72.9% in BIII/BIV sequences. Double peaks accounted for 37.8% of the SNPs detected in BIII sequences, but for 67.8% of the ambiguous BIII/BIV sequences (Figure 2A). At the tpi locus, hotspot sites accumulated 18.2% of all SNPs detected in BIII sequences, but this figure increased to 54.5% in BIII/BIV sequences (Figure 2B). Finally, at the bg locus, hotspot sites clustered 78.4% of the SNPs detected in assemblage B sequences, of which 58.1% corresponded to double peaks (Figure 2C).

Risk Association Analysis
Comparing G. duodenalis Negative/Ever Positive
Overall, 55% of individuals who tested positive for G. duodenalis were female, the median age was 10 years old, with 49% <10 years old. A total of 55% were from tribe 1, followed by 15% from tribe 5. The most frequent clinical signs were normal stool appearance (92%), abdominal pain (53%), and stool consistency type 2 (45%). Overall, 58% did not report hand washing, 84% reported eating with hands, and 18% did not report washing fresh produce. Sanitation was predominantly defecation in the woods (70%) and open defecation near households (21.7%).
Microscopy examination also revealed that 53% of individuals were coinfected with Endolimax nana, 48% with Entamoeba coli, 18% with Chilomastix mesnili, 16% with Ancylostoma spp., and 11% with Blastocystis sp.
Children <15 years old more frequently reported vomiting, abdominal pain, and abnormal (mucous, bloody, mucous-bloody) faecal appearance compared to adults. However, only differences in abdominal pain appeared significant (Chi-squared test, p = 0.016). There were no differences in age or symptoms between assemblages A and B (Chi-squared test: age group, p = 0.552; faecal consistency, p = 0.732; abdominal pain, p = 1; vomiting, p = 0.953).

Discussion
This survey presents new insights into the epidemiology of G. duodenalis in Amazonian indigenous communities. The main contributions of the study include the demonstration that (i) giardiasis is a common finding (13-22%) in apparently healthy Tapirapé people, mainly affecting children in the age group of 0-9 years old; (ii) assemblage B was responsible for nearly 70% of the mostly asymptomatic infections detected; and (iii) a high degree of genetic heterogeneity was observed within assemblage B (but not assemblage A) sequences, regardless of the molecular marker used.
Several epidemiological studies conducted in endemic areas worldwide have shown that G. duodenalis infections do not seem to correlate positively with diarrhoea [23,24], demonstrating that asymptomatic giardiasis is the rule rather than the exception in these settings. This fact would explain why giardiasis is systematically absent in global burden estimations of diarrhoeal disease [25]. This seems to be also the case of the present study, where G. duodenalis infections were detected similarly in asymptomatic individuals (33.8%) and individuals presenting with diarrhoea or other gastrointestinal manifestations (35.3%). Taken together, this information supports the hypothesis that some enteric protist species (e.g., Blastocystis sp., Dientamoeba fragilis, G. duodenalis) might in fact be protective against disease [26]. This is an attractive possibility implying that these agents are indeed acting as pathobionts (that is, microorganisms that normally live as harmless symbionts but under certain circumstances can be pathogenic) forming part of the host eukaryome.
We have shown in our study that G. duodenalis infection was strongly related to younger age and tribe (with tribes 1 and 5 having a higher association) and to seasonality. This may be due to external factors associated with indirect transmission pathways of the infection (e.g., source of drinking water, consumption of contaminated fresh produce, swimming in contaminated surface waters, defecation on the open ground near households, and high density of companion or domestic animals) or increased risk of reinfection within the tribe from other infected members through direct person-to-person contact. Contact with faecally contaminated water and produce may be more likely in the rainy season. Children <15 years old with giardiasis reported more frequently vomiting, abdominal pain, and presence of mucus/blood in faeces compared to adults, although observed differences did not reach statistical significance. Young children with an immature immune system may be at higher risk of infections and probably more severe disease episodes. Thus, older adults may have acquired immunity after a previous infection. Indeed, it has been shown that levels of intestinal inflammation caused by G. duodenalis infection decrease with subsequent infections [27,28]. This implies that there is acquired protection against the severity of giardiasis but not from reinfection [29]. In this regard, it should be noted that the composition and abundance of the host's microbiota have also been suggested to play an important role in the outcome of the infection [30].
Giardiasis was also strongly dependent on the number of samples taken, even considering that conventional microscopy (a method that is largely known to be of limited diagnostic sensitivity) was the screening method for the initial detection of G. duodenalis in the present survey. This suggests that possible reinfections or chronic infections with intermittent positivity may be more common than initially anticipated. Reinfection may be more pronounced in the rainy season. In addition, no evident differences between individuals continuously positive/discontinuously positive to G. duodenalis were found. However, we cannot exclude a bias in those presenting for sampling, although this is unlikely to be a major factor due to the lack of symptoms in most cases.
Regarding coinfections, the presence of G. duodenalis was not associated with any other enteric parasite species, except possibly E. nana. These results may be biased by the relatively small number of positive samples detected for certain pathogens and should, therefore, be interpreted with caution. Similarly, a counter-intuitive positive association between G. duodenalis and washing fresh produce was found. This result may be the consequence of the potential confounding effect of other variables not considered here, such as the manipulation of fresh produce or the use of contaminated washing water. The latter possibility would support the relevance of waterborne transmission for human giardiasis.
Molecular sequence analyses of the three loci used here for genotyping purposes also revealed interesting data. There were no differences in age between individuals infected either by the assemblage A or the assemblage B of G. duodenalis. Regarding age-related patterns in the distribution of G. duodenalis assemblages, our results are in contrast with those previously obtained in surveys targeting clinical populations. For instance, children have been shown to be more commonly infected by assemblage B (83%, 44/53) than adults (52%, 22/42) in patients of all age groups in Spain [31]. Moreover, in that country, assemblage B was significantly more prevalent than assemblage A in asymptomatic outpatient children, but not in individuals of older age [32].
Remarkably, no association between the occurrence of diarrhoea (or any other gastrointestinal manifestation) and the G. duodenalis assemblage involved in the infection was found in the investigated population. This result corroborates that observed in children under 5 years of age (n = 222) recruited under the Global Enteric Multicentre Study (GEMS) in Mozambique [33]. However, it should be noted that other surveys have shown different, even contradictory, results. For instance, assemblage A was more prevalent than assemblage B in Bangladeshi people (n = 343) [34], in Turkish clinical patients (n = 44) [35], and in Spanish outpatient children (n = 43) [32]. The opposite trend was reported in asymptomatic infected individuals (n = 18) in the Netherlands [36].
Genotyping data generated here demonstrated that assemblage B was responsible for three out of four G. duodenalis infections in the Tapirapé people, a proportion similar to that (78%) described in paediatric populations in the Amazonas State [37]. Of note, assemblage A tends to be the predominant G. duodenalis genetic variant circulating in humans in Brazil (Table S1). These facts may be indicative of differences in sources of infection, transmission pathways, or even geographical segregation patterns of the parasite in the country. The lack of non-human, host-specific assemblages C-F seems to suggest that companion, production, and free-living animal species are not significant contributors to giardiasis in the surveyed population. This is in spite of the fact that swine and poultry were reared in all seven tribes, and that domestic dog and cat densities were also high. In addition, cattle (but not sheep) farming was also frequent in the proximity of them. Taken together, these data indicate that human giardiasis is mainly of anthropic nature among the Tapirapé people. The extent and accuracy of this statement should be corroborated in future molecular epidemiological studies including animal and environmental (water) samples.
This study also confirms the high genetic variability within G. duodenalis assemblage B (but not assemblage A) reported frequently in similar molecular epidemiological surveys conducted in endemic areas globally [38,39] including Brazil [40,41]. This finding was particularly evident at the gdh and tpi loci, for which most of the generated BIII (78-87%), BIV (100%), and BIII/BIV (90-92%) sequences corresponded to distinct genotypes of the parasite. Sequences unmistakably assigned to BIII and BIV at the gdh/tpi loci tended to vary only in one to six positions (hotspots) either as mutations or ambiguous (double peak) sites. In these sets of hotspots, the proportion of sites involving double peaks in BIII sequences varied from 38% at the gdh locus to 18% at the tpi locus. Interestingly, these percentages increased in both cases to 55-68% in ambiguous BIII/BIV sequences, explaining why these isolates were difficult to allocate to a given sub-assemblage. Two independent mechanisms have been proposed to explain the presence of ambiguous (double peak) positions. The first one involves the occurrence of true mixed infections (e.g., BIII + BIV) and would fit well with an epidemiological scenario characterised by high infection and reinfection rates as the one described in the present study. The second one would be associated with the occurrence of genetic recombination. Evidence for the latter possibility comes from independent investigations demonstrating low levels of allelic sequence heterozygosity (implying a genetic homogenisation mechanism) within assemblage A [42] and, to a lesser extent, within assemblage B [43]. Additional evidence of genetic recombination events has been demonstrated within assemblage B in single (trophozoite and cyst) cells [44] and within sub-assemblages BIII and BIV at the genetic population level [45].
The results obtained in the present study may be biased by certain design and methodological constraints. For instance, the initial screening of G. duodenalis was based on conventional microscopy, so the true prevalence of the infection is likely to be underestimated. In addition, there may be a response bias as people may be more or less inclined to return to the study if they had a negative or positive test result. Interestingly, the positivity rate increased with the number of tests performed, suggesting that over time people were likely to have had a giardiasis episode, that they may have had a false-negative result at microscopy examination, or that there was an inherent response bias in that people who were likely to be positive would return for testing. Limitations associated with the main dataset may arise from the combination of period-specific data, although most of the independent variables considered (e.g., demographics, access to safe drinking water, and sanitary conditions) were not expected to change over time. As our analyses used the first negative test result, we could not further explore the effect of seasonality in the multivariable analysis. However, we have already shown in the descriptive data that seasonality is associated with infections and repeated infections. Lack of association between G. duodenalis genetic variants and occurrence of clinical symptoms may be influenced by the fact that other diarrhoea-causing agents (including viral and bacterial pathogens) were not assessed. In addition, suspected mixed infections were not further investigated by cloning of PCR amplicons or next-generation sequencing, methods with high sensitivity able to detect genetic variants of the parasite that are underrepresented in the population pool, and that are otherwise undetectable using conventional PCR methods and Sanger sequencing.
Finally, the typing scheme used in the present study may lack enough phylogenetic resolution to correctly differentiate between sub-assemblage BIII and BIV sequences. This issue has been highlighted in recent molecular studies for assemblage B and assemblage A sequences [46,47]. This important point emphasises the need to identify new markers and to develop novel methods for MLST purposes.

Study Area
Brazil extends over 8,511,965 km² and includes 724 indigenous lands (ILs) covering a total area of 1,173,770 km² and accounting for 14% of the country's territory [48]. Most ILs are concentrated in the Legal Amazon, representing 23% of the Amazon territory [49]. The indigenous people from the Tapirapé ethnicity live in the Serra do Urubu Branco region, Mato Grosso State, a region of tropical forest with typical Amazonian flora and fauna interspersed with clean and closed fields. The Tapirapé exploit this environment alternating agriculture, hunting, gathering, and fishing according to the time of year [49,50]. Farming villages have traditionally been in the vicinity of dense forests on high, non-flooding lands. Currently, the Tapirapé ethnic group is made up of approximately 700 individuals living in six tribes with maximum and minimum distances from the main tribe of 70 km and 10 km, respectively. The main tribe is in the municipality of Confresa, Mato Grosso State (Figure 3). Tapirapé people interact frequently with individuals from other ethnic tribes at social events, hunting parties, and other activities.

Sampling and Data Collection
This is a prospective, cross-sectional epidemiological study including four sampling periods covering two dry (July 2008 and July 2009) and two rainy (January 2009 and January 2010) seasons. After obtaining the chief's ('cacique') consent for permission to survey, all members of the tribe were informed about the aim of the project and invited to provide a single stool sample at each of the four scheduled sampling periods. Designated persons in each household were given polystyrene plastic flasks for each member of the household and stool samples were collected on the following day.
Individual standardised questionnaires were completed by a member of our research team in face-to-face interviews with designated persons at sample collection, who provided the requested information for each member of his/her household. Questions included demographics (gender, age, village of origin), clinical manifestations (vomit, abdominal pain), hand and vegetable washing, source of drinking water, use of water treatment, defecation place, and contact with domestic animals and livestock. Provided stool samples were visually inspected for consistency and the presence of mucus or blood. Each participant was assigned a unique distinctive code through the whole period of study, which was used to identify his/her stool sample(s) and associated epidemiological questionnaire(s).

Microscopy Examination
Stool samples were kept at 4 °C before microscopy examination, usually within 48 h of collection. A conventional flotation method using sucrose solution (specific gravity: 1.2 g/cm³) was conducted in all stool samples as previously described [51]. Two additional techniques were performed: spontaneous sedimentation [52] and centrifugal sedimentation in formalin-ether [53]. A sample was considered G. duodenalis-positive if cysts of the parasite were detected by at least one of the three methods used. Aliquots of faecal positive samples were stored at −20 °C for downstream molecular analyses. Any other enteric parasite (including helminthic and protist) species found during microscopy observation were also identified and recorded.

DNA Extraction and Purification
Positive stool samples were defrosted and G. duodenalis cysts concentrated and purified using the Faust method [54]. Obtained supernatants were subjected to three freeze-thaw cycles to facilitate the mechanical breakage of the cyst wall [55]. Genomic DNA was extracted from the processed supernatants (ca. 200 µL) using the PureLink Genomic DNA Mini Kit (Thermo Fisher Scientific, Waltham, MA, USA) according to the manufacturer's instructions. Extracted and purified DNA samples in molecular grade water (200 µL) were kept at −20 °C and shipped to the Spanish National Centre for Microbiology (Health Institute Carlos III) in Majadahonda (Spain) for downstream genotyping analyses.

Molecular Confirmation of G. duodenalis
Confirmation of G. duodenalis infection was achieved using a real-time PCR (qPCR) method targeting a 62-bp region of the gene encoding the small subunit ribosomal RNA (SSU rRNA) of the parasite [56]. Amplification reactions (25 µL) consisted of 3 µL of template DNA, 0.5 µM of each primer Gd-80F and Gd-127R, 0.4 µM of the probe (Table S12), and 12.5 µL TaqMan® Gene Expression Master Mix (Applied Biosystems, Foster City, CA, USA). Detection of parasitic DNA was performed on a Corbett Rotor-Gene™ 6000 real-time PCR system (Qiagen, Hilden, Germany) using an amplification protocol consisting of an initial hold step of 2 min at 55 °C and 15 min at 95 °C, followed by 45 cycles of 15 s at 95 °C and 1 min at 60 °C. Water (no template) and genomic DNA (positive) controls were included in each PCR run.
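As an illustration of how the final concentrations stated above translate into pipetting volumes, the sketch below applies the dilution relation C1·V1 = C2·V2. The 10 µM working stock concentration and the use of water to make up the remaining volume are assumptions made for the example; neither is stated in the protocol.

```python
def stock_volume_ul(final_um: float, total_ul: float, stock_um: float) -> float:
    """Volume of stock needed to reach a final concentration (C1*V1 = C2*V2)."""
    return final_um * total_ul / stock_um


def qpcr_mix(total_ul: float = 25.0,
             template_ul: float = 3.0,
             master_mix_ul: float = 12.5,
             primer_final_um: float = 0.5,
             probe_final_um: float = 0.4,
             stock_um: float = 10.0) -> dict:
    """Per-reaction volumes in µL.

    stock_um (10 µM working stocks) is an assumed value, not from the paper.
    """
    primer_ul = stock_volume_ul(primer_final_um, total_ul, stock_um)
    probe_ul = stock_volume_ul(probe_final_um, total_ul, stock_um)
    # Assumption: water makes up whatever volume remains.
    water_ul = total_ul - template_ul - master_mix_ul - 2 * primer_ul - probe_ul
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    return {
        "template": template_ul,
        "master_mix": master_mix_ul,
        "forward_primer": primer_ul,
        "reverse_primer": primer_ul,
        "probe": probe_ul,
        "water": water_ul,
    }
```

With the assumed 10 µM stocks, each primer comes to 1.25 µL, the probe to 1 µL, and water to 6 µL per 25 µL reaction.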

Molecular Characterisation of G. duodenalis
Giardia duodenalis isolates with a qPCR-positive result were re-assessed by sequence-based multi-locus genotyping of the genes encoding the glutamate dehydrogenase (gdh), beta-giardin (bg), and triosephosphate isomerase (tpi) proteins of the parasite. A semi-nested PCR was used to amplify a ~432-bp fragment of the gdh gene [57]. PCR reaction mixtures (25 µL) included 5 µL of template DNA and 0.5 µM of the primer pairs GDHeF/GDHiR in the primary reaction and GDHiF/GDHiR in the secondary reaction (Table S12). Both amplification protocols consisted of an initial denaturation step at 95 °C for 3 min, followed by 35 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 1 min, with a final extension of 72 °C for 7 min.
A nested PCR was used to amplify a ~511-bp fragment of the bg gene [58]. PCR reaction mixtures (25 µL) consisted of 3 µL of template DNA and 0.4 µM of the primer sets G7_F/G759_R in the primary reaction and G99_F/G609_R in the secondary reaction (Table S12). The primary PCR reaction was carried out with the following amplification conditions: one step of 95 °C for 7 min, followed by 35 cycles of 95 °C for 30 s, 65 °C for 30 s, and 72 °C for 1 min, with a final extension of 72 °C for 7 min. The conditions for the secondary PCR were identical to the primary PCR except that the annealing temperature was 55 °C.
A nested PCR was used to amplify a ~530-bp fragment of the tpi gene [59]. PCR reaction mixtures (50 µL) included 2-2.5 µL of template DNA and 0.2 µM of the primer pairs AL3543/AL3546 in the primary reaction and AL3544/AL3545 in the secondary reaction (Table S12). Both amplification protocols consisted of an initial denaturation step at 94 °C for 5 min, followed by 35 cycles of 94 °C for 45 s, 50 °C for 45 s, and 72 °C for 1 min, with a final extension of 72 °C for 10 min.
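For comparison, the three cycling programmes described above can be encoded as data and their programmed hold times summed. This is an illustrative sketch: the class and field names are hypothetical, and the totals exclude temperature ramping, so real run times on a thermal cycler would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Programme:
    """One PCR round; all durations in seconds (names are illustrative)."""
    initial_denaturation_s: int
    cycles: int
    denature_s: int
    anneal_s: int
    extend_s: int
    final_extension_s: int

    def hold_time_s(self) -> int:
        # Sum of programmed holds only; ramping between steps is ignored.
        per_cycle = self.denature_s + self.anneal_s + self.extend_s
        return (self.initial_denaturation_s
                + self.cycles * per_cycle
                + self.final_extension_s)


# Durations transcribed from the protocols in the text. The bg secondary
# round differs from the primary only in annealing temperature, so its
# duration is the same.
PROGRAMMES = {
    "gdh": Programme(3 * 60, 35, 30, 30, 60, 7 * 60),
    "bg": Programme(7 * 60, 35, 30, 30, 60, 7 * 60),
    "tpi": Programme(5 * 60, 35, 45, 45, 60, 10 * 60),
}

for locus, programme in PROGRAMMES.items():
    print(f"{locus}: {programme.hold_time_s() / 60:.1f} min per round")
```

Because each assay is nested or semi-nested, the per-round figure applies twice per sample.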
The semi-nested and nested PCR protocols described above were conducted on a 2720 Thermal Cycler (Applied Biosystems). Reaction mixes always included 2.5 units of MyTAQ™ DNA polymerase (Bioline GmbH, Luckenwalde, Germany), and 5× MyTAQ™ reaction buffer containing 5 mM dNTPs and 15 mM MgCl₂. Laboratory-confirmed positive and negative DNA samples for each parasite species investigated were routinely used as controls and included in each round of PCR. PCR amplicons were visualised on 2% D5 agarose gels (Conda, Madrid, Spain) stained with Pronasafe nucleic acid staining solution (Conda). Positive PCR products were directly sequenced in both directions using appropriate internal primer sets (Table S12). DNA sequencing was conducted by capillary electrophoresis using BigDye® Terminator chemistry (Applied Biosystems) on an ABI PRISM 3130 Genetic Analyser.

Sequence and Phylogenetic Analyses
Raw sequencing data in both forward and reverse directions were viewed using the Chromas Lite version 2.1 sequence analysis program (https://technelysium.com.au/wp/chromas/ (accessed on 1 February 2021)). The Basic Local Alignment Search Tool (BLAST) (http://blast.ncbi.nlm.nih.gov/Blast.cgi (accessed on 1 February 2021)) was used to compare nucleotide sequences with sequences retrieved from the NCBI GenBank database. Generated DNA consensus sequences were aligned to appropriate reference sequences using the free MEGA 6 software [60] for species confirmation and assemblage/sub-assemblage identification.
To estimate the phylogenetic relationships among the identified Giardia-positive samples, gdh sequences generated in this study and human- and animal-derived homologous sequences retrieved from GenBank, mostly from Brazil, were aligned using Clustal X and adjusted manually with GeneDoc [61,62]. Maximum parsimony (MP) trees were inferred with PAUP version 4.0b10 using a heuristic search in 1000 replicates, 500 bootstrap replicates, random stepwise addition starting trees (with random addition sequences), and tree bisection and reconnection branch swapping [63]. MrBayes v3.1.2 was used to perform Bayesian analyses with four independent Markov chain runs for 1,000,000 Metropolis-coupled MCMC generations, sampling a tree every 100th generation [64]. The first 25% of trees represented burn-in and the remaining trees were used to calculate Bayesian posterior probability. The GTR + I + G substitution model was used. The gdh sequence of G. ardeae was used as the outgroup. References are cited in the supplementary materials.
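The Bayesian sampling settings above imply a fixed number of retained trees per run, which the sketch below works out. It is a back-of-the-envelope calculation only: the function name is hypothetical, and the count ignores the starting tree that MrBayes also records, so actual file contents may differ by one.

```python
def mcmc_sample_counts(generations: int, sample_every: int,
                       burnin_fraction: float) -> tuple:
    """Return (sampled, discarded as burn-in, retained) trees for one run."""
    sampled = generations // sample_every
    burnin = int(sampled * burnin_fraction)
    return sampled, burnin, sampled - burnin


# Settings reported in the text: 1,000,000 generations, one tree every
# 100th generation, first 25% discarded as burn-in.
sampled, burnin, retained = mcmc_sample_counts(1_000_000, 100, 0.25)
print(sampled, burnin, retained)  # 10000 2500 7500
```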

Statistical Analysis
We investigated factors (public health features, clinical symptoms, coinfection with other pathogens) associated with a positive G. duodenalis result. The main dataset was constructed with data from a single sampling point per participant (observation) out of the four available: if the participant ever tested positive for G. duodenalis, we used data from the sampling point of the first positive result; otherwise, we used data from the participant's first sampling point in chronological order.
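The record-selection rule above can be sketched as follows. The dictionary keys (`campaign`, `giardia_positive`) are hypothetical names chosen for illustration and do not come from the study dataset.

```python
def select_record(records: list) -> dict:
    """Pick the one record per participant used in the main dataset.

    Each record is a dict with hypothetical keys:
      'campaign'         -> sampling campaign number (1 to 4)
      'giardia_positive' -> bool, result at that campaign
    """
    if not records:
        raise ValueError("participant has no records")
    ordered = sorted(records, key=lambda r: r["campaign"])
    for record in ordered:
        if record["giardia_positive"]:
            return record      # first positive sampling point
    return ordered[0]          # never positive: first sampling point


# Example: negative at campaign 1, positive at campaigns 3 and 4.
history = [
    {"campaign": 3, "giardia_positive": True},
    {"campaign": 1, "giardia_positive": False},
    {"campaign": 4, "giardia_positive": True},
]
print(select_record(history)["campaign"])  # 3
```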
We conducted Chi-squared tests (p < 0.05) to compare characteristics of cases and non-cases, and we calculated crude odds ratios (OR) with 95% confidence intervals (CI) to investigate the crude association between independent variables and a G. duodenalis-positive result. We constructed multivariable logistic regression models to assess the association between G. duodenalis and (i) public health features and clinical signs or (ii) coinfection with other intestinal pathogens, adjusted by age, tribe, and the number of samples. Additionally, we considered the serial results of G. duodenalis for observations with at least two samples. We conducted similar analyses by comparing those continuously negative versus continuously positive, and those discontinuously positive versus continuously positive.
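For a single binary exposure, the crude odds ratio and the Chi-squared test described above reduce to calculations on a 2×2 table. The sketch below is illustrative and makes two assumptions not stated in the text: the confidence interval uses the Wald (log-OR) method, and the Chi-squared statistic is Pearson's without continuity correction. The study's analyses were run in R, whose defaults may differ. The counts in the example are invented.

```python
import math


def crude_odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple:
    """OR and Wald 95% CI for a 2x2 table.

    a: exposed cases      b: exposed non-cases
    c: unexposed cases    d: unexposed non-cases
    """
    if min(a, b, c, d) == 0:
        raise ValueError("Wald interval is undefined with a zero cell")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson Chi-squared statistic (no continuity correction) and p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts, for illustration only.
print(crude_odds_ratio(20, 10, 10, 20))
print(chi_squared_2x2(20, 10, 10, 20))
```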
Univariable analyses were conducted on all available observations, but observations with missing values were removed from multivariable analyses. All the independent variables were included in the analyses and we used the stepwise backward selection method, removing successively the least significant variable and using Akaike information criterion (AIC) and Bayesian information criterion (BIC) to construct the best fit model. Analyses were performed in R (package stats).
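The selection procedure above can be outlined in code. This is a simplified sketch and not the authors' R workflow: it drops, at each step, the variable whose removal lowers the AIC most (rather than the least significant variable), and it takes the model log-likelihood from a caller-supplied function, `fit_loglik`, which is a hypothetical placeholder for fitting a logistic regression on the chosen variables.

```python
import math
from typing import Callable, FrozenSet, Iterable, Tuple


def aic(loglik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * loglik


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * loglik


def backward_select(variables: Iterable[str],
                    fit_loglik: Callable[[FrozenSet[str]], float]
                    ) -> Tuple[FrozenSet[str], float]:
    """Backward elimination driven by AIC.

    fit_loglik is a hypothetical callback returning the maximised
    log-likelihood of a model with the given variables plus an intercept.
    """
    current = frozenset(variables)
    best = aic(fit_loglik(current), len(current) + 1)  # +1 for the intercept
    while current:
        # AIC of every model obtained by dropping exactly one variable.
        trials = [(aic(fit_loglik(current - {v}), len(current)), v)
                  for v in sorted(current)]
        trial_aic, dropped = min(trials)
        if trial_aic >= best:
            break              # no removal improves the fit criterion
        current, best = current - {dropped}, trial_aic
    return current, best
```

BIC could be substituted for AIC in the same loop; because its penalty grows with sample size, it generally favours smaller models.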

Ethics Approval
This study was approved by the National Research Ethics Commission (CONEP), Ministry of Health (Brazil), under reference number 120/2008.

Conclusions
This microscopy-based survey demonstrates that symptomatic and asymptomatic giardiasis are common in indigenous people from the Brazilian Amazon. Children under 15 years of age were particularly exposed to the infection, suggesting that acquired immunity plays a role in modulating the frequency and virulence of the disease. G. duodenalis infection rates varied widely among the surveyed tribes and sampling periods, suggesting that different pathways may be involved in the transmission of the parasite. Molecular sequence data indicated that the most likely source of infection was anthropic. The distribution of assemblages was independent of the occurrence of clinical manifestations, indicating that the genotype of the parasite was not associated with the outcome of the infection. Assemblage B accounted for nearly 75% of the infections detected and showed a high genetic diversity that impaired the correct identification of sub-assemblages BIII and BIV. This diversity was mainly associated with the presence of ambiguous positions (double peaks) at the chromatogram level, suggesting that coinfections and/or genetic recombination events were taking place, at unknown rates, in the investigated population. Further molecular epidemiological studies targeting animal (including domestic and wildlife) and environmental (drinking water) samples are needed to elucidate the transmission dynamics of G. duodenalis in this Brazilian geographical region.
Supplementary Materials: The following are available online at https://www.mdpi.com/2076-0817/10/2/206/s1, Table S1: Prevalence and molecular diversity of Giardia duodenalis in humans in Brazil, Table S2: Prevalence and molecular diversity of Giardia duodenalis in domestic and wildlife animal species in Brazil, Table S3: Prevalence and molecular diversity of Giardia duodenalis in water samples in Brazil, Table S4: Prevalence and molecular diversity of Giardia duodenalis in fresh produce in Brazil, Table S5: Full dataset showing the molecular diversity of G. duodenalis at the gdh, bg, and tpi molecular markers, Table S6: Intra-assemblage B single nucleotide polymorphisms distribution and classification among G. duodenalis sequences at the gdh, bg and tpi loci. Hotspots for SNPs are identified and the summarised comparisons of frequencies between the hotspot and non-hotspot sites are highlighted with darker shades, Table S7: Univariable analysis comparing discontinuously G. duodenalis-positive results versus continuously G. duodenalis-positive results. p-values marked in bold indicate numbers that are significant on the 95% confidence limit, Table S8: Multivariable analysis comparing discontinuously G. duodenalis-positive results versus always G. duodenalis-positive results. p-values marked in bold indicate numbers that are significant on the 95% confidence limit, Table S9: Univariable analysis comparing always G. duodenalis-negative results versus always G. duodenalis-positive results. p-values marked in bold indicate numbers that are significant on the 95% confidence limit, Table S10: Multivariable analysis comparing always G. duodenalis-negative results versus always G. duodenalis-positive results. p-values marked in bold indicate numbers that are significant on the 95% confidence limit, Table S11: Multivariable analysis comparing always G. duodenalis-negative results versus always G. duodenalis-positive results and considering the presence of coinfections. p-values marked in bold indicate numbers that are significant on the 95% confidence limit, Table S12: Oligonucleotides used for the molecular identification and characterisation of G. duodenalis in the present study, Figure S1: Maximum parsimony phylogenetic dendrogram based on bg sequences of G. duodenalis. Numbers on nodes indicate the bootstrap/posterior probability values. GenBank accession numbers for all sequences used for the phylogenetic analysis were embedded in the tree, Figure S2: Maximum parsimony phylogenetic dendrogram based on tpi sequences of G. duodenalis. Numbers on nodes indicate the bootstrap/posterior probability values. GenBank accession numbers for all sequences used for the phylogenetic analysis were embedded in the tree.