Genetic Variants of Glucose-6-Phosphate Dehydrogenase and Their Associated Enzyme Activity: A Systematic Review and Meta-Analysis

Low glucose-6-phosphate dehydrogenase enzyme (G6PD) activity is a key determinant of drug-induced haemolysis. More than 230 clinically relevant genetic variants have been described. We investigated the variation in G6PD activity within and between different genetic variants. In this systematic review, individual patient data from studies reporting G6PD activity measured by spectrophotometry and the corresponding G6PD genotype were pooled (PROSPERO: CRD42020207448). G6PD activity was converted into percent normal activity applying study-specific definitions of 100%. In total, 4320 individuals from 17 studies across 10 countries were included, of whom 1738 (40.2%) had one of the 24 confirmed G6PD mutations, and 61 observations (3.5%) were identified as outliers. The median activity of the hemi-/homozygotes with A- (c.202G>A/c.376A>G) was 29.0% (range: 1.7% to 76.6%), 10.2% (range: 0.0% to 32.5%) for Mahidol, 16.9% (range: 3.3% to 21.3%) for Mediterranean, 9.0% (range: 2.9% to 23.2%) for Vanua Lava, and 7.5% (range: 0.0% to 18.3%) for Viangchan. The median activity in heterozygotes was 72.1% (range: 16.4% to 127.1%) for A- (c.202G>A/c.376A>G), 54.5% (range: 0.0% to 112.8%) for Mahidol, 37.9% (range: 20.7% to 80.5%) for Mediterranean, 53.8% (range: 10.9% to 82.5%) for Vanua Lava, and 52.3% (range: 4.8% to 78.6%) for Viangchan. A total of 99.5% of hemi-/homozygotes with the Mahidol mutation and 100% of those with the Mediterranean, Vanua Lava, and Viangchan mutations had <30% activity. For A- (c.202G>A/c.376A>G), 55% of hemi-/homozygotes had <30% activity. The G6PD activity for each variant spanned the current classification thresholds used to define clinically relevant categories of enzymatic deficiency.


Introduction
Plasmodium vivax has become the predominant cause of malaria outside of sub-Saharan Africa, causing between 4 and 14 million clinical cases annually [1,2]. The control and elimination of P. vivax is confounded by the parasite's ability to form dormant liver stages (hypnozoites), which are not effectively eliminated by the schizontocidal drugs used to clear blood-stage infections [3]. Untreated P. vivax hypnozoites can reactivate weeks to months after the primary infection, causing recurrent episodes of malaria and ongoing transmission of the parasite [4]. The timely elimination of the parasite requires a radical cure, a combination of schizontocidal and hypnozoitocidal drugs, to kill the blood and liver stages of the parasite [5]. The only available class of drugs with hypnozoitocidal activity are the 8-aminoquinoline compounds (primaquine and tafenoquine), which cause severe haemolysis in individuals with glucose-6-phosphate dehydrogenase (G6PD) deficiency. G6PD deficiency (G6PDd) is a common inherited enzyme disorder, with a prevalence of 1% to 35% in malaria-endemic countries [6].
Exposure to several drugs and compounds can cause oxidative stress and induce haemolysis in G6PD-deficient individuals; these include 8-aminoquinoline agents, dapsone, ciprofloxacin, henna, and fava beans [7]. The risk of severe haemolysis following 8-aminoquinoline treatment is particularly relevant to the radical cure of patients with P. vivax malaria, and so the WHO recommends testing for G6PDd prior to administration of the antimalarial drugs [8]. The reference standard for diagnosing G6PD deficiency is quantitative UV spectrophotometry, for which several commercial kits are available [9,10]; however, spectrophotometry is not suitable for testing at the point of care [11,12]. In practice, routine diagnosis of G6PD deficiency is often unavailable or limited to more readily available and cheaper qualitative tests [13]. Tafenoquine is an 8-aminoquinoline drug that can be administered as a single dose; however, its recent licensing and roll-out requires quantitative G6PD testing prior to use, to identify patients with both intermediate (<70% normal activity) and severe (<30% normal activity) G6PD deficiency, in whom the drug is contraindicated [14]. Several new quantitative diagnostics have been developed to provide point-of-care testing for routine use [15,16]. G6PD deficiency is caused by one or more mutations in the G6PD gene, located on the X chromosome. Hence, males are hemizygous for the gene and phenotypically are either G6PD normal (G6PDn) or G6PDd, whereas females can be homozygous for the G6PD gene, conferring normal or deficient activity, or heterozygous, with activities that range from almost no activity to close to normal G6PD activities, with the majority clustering around the 50% activity threshold. A special case is compound heterozygous females who harbour two distinct G6PD variants on their two X chromosomes, both conferring low G6PD activities, similar to homozygous and hemizygous individuals. 
The G6PD gene was first cloned in 1986 [17], with subsequent studies identifying more than 230 mutations associated with reduced enzyme activity [7,18,19]. The majority of these arise from missense mutations, which cause substitution of a single amino acid [7]. Many G6PD mutations are rare, with limited observations from the few individuals reported in the literature. The large number of clinically relevant G6PD genotypes identified to date results in a wide range of phenotypes that are commonly characterised according to their residual G6PD enzymatic activity, but also according to other biochemical properties, such as electrophoretic mobility, thermal stability, and the Michaelis constant (Km). In 1971, Yoshida et al. proposed a classification of G6PD variants observed in hemizygous-mutated males according to five classes: I: severe enzyme deficiency, with chronic non-spherocytic anaemia; II: severe deficiency, with residual activity <10%; III: moderate-to-mild G6PD activity, with residual activity 10-60%; IV: very mild-to-no deficiency, with 60-100% residual activity; and V: increased G6PD activity, with >200% residual enzyme activity [20,21]. This classification system has been in use for the past 50 years and is incorrectly known as the "WHO classification". To date, the correlation between G6PD genotype and phenotype remains poorly characterized for many variants. In 2022, the World Health Organization Global Malaria Programme convened a technical consultation to propose a revised classification scheme for G6PD variants, spurred on by overlapping reports of G6PD activity in Classes II and III, and scarcity of reports in Class V [22].
To characterise the relationship between genotype and phenotype, and the associated variability, and to investigate its implications for classification of severity of G6PD variants, we undertook a systematic review and meta-analysis of the existing quantitative measurements of G6PD activity, in individuals with a known G6PD genotype.

MEDLINE (PubMed), Web of Science Core Collection (Clarivate), and SCOPUS were searched using standardized search terms (File S1; PROSPERO 2020 CRD42020207448). Studies were included for screening if they involved quantitative measurement of G6PD activity (using quantitative UV spectrophotometry at a wavelength of 340 nm) at a steady state (no haemolytic crisis within the previous 4 months) and molecular diagnosis of a G6PD variant known to be of clinical relevance. Each identified abstract was screened by at least two authors independently and a third author consulted for any disagreements (D.P., B.L., A.W.S., and A.S.). Full texts of relevant articles were then screened. Studies were excluded if they included only individuals with other known haematological conditions, newborns, or fewer than 20 G6PD normal males (with the exception of 2 studies reporting a robust definition of 100% G6PD activity), or if they did not provide sufficient information on laboratory procedures. Studies published before 2005 were excluded due to unavailability of individual-level datasets. The corresponding authors of the relevant studies were contacted at least twice and invited to provide published and unpublished individual patient data (IPD). Reference lists of the identified articles were screened for further relevant studies. Data confidentiality agreements were signed, and formal approval obtained, as required by the affiliated institutions of the corresponding authors.
The absolute values of the spectrophotometry results vary significantly between different laboratories [11]. Prior to pooling of G6PD activity observations from the different studies and settings, all measurements were converted from either U/g Hb or U/10^12 red blood cells (RBC) to a % normal activity, using a study-specific definition of 'normal' (100%) G6PD activity. In most cases, this represented an adjusted male median (AMM) [9], either calculated from the included data (where datasets included ≥20 G6PD normal individuals) or using pre-defined values reported for each study. Whenever AMM or data from G6PD normal individuals were unavailable, an alternative definition of 100% activity was used, provided this was derived from the same study population in the same laboratory. To reflect the variability present in the genotyping methodology, individuals for whom no variant was confirmed were classed as either 'wild-type' (sequencing studies) or 'no confirmed mutation' (SNP-typing studies). Since it was not possible to discriminate between these two scenarios, IPD from these individuals were not analysed further. One study measured G6PD activity alongside 6-phosphogluconate dehydrogenase (6PGD) activity and reported the results as a ratio of the two enzymes' activities. As the G6PD/6PGD ratio results exhibited a different dynamic range to G6PD activity alone, these data were not normalised to a fraction of normal activity, were excluded from the main analysis, and are presented separately.
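The normalisation step can be illustrated with a minimal sketch. The source does not spell out how the AMM was computed; the version below assumes the commonly cited definition in which males at or below 10% of the overall male median are dropped before the median is recalculated. The function names, the `cutoff_fraction` parameter, and the example activities are hypothetical.

```python
from statistics import median


def adjusted_male_median(male_activities, cutoff_fraction=0.10):
    """Median of male activities after dropping severely deficient males.

    Assumption (not stated in the text): males with activity at or below
    `cutoff_fraction` of the overall male median are excluded first.
    """
    overall = median(male_activities)
    retained = [a for a in male_activities if a > cutoff_fraction * overall]
    return median(retained)


def percent_normal(activity, reference):
    """Express an absolute activity (e.g. U/g Hb) as % of the study's 100% value."""
    return 100.0 * activity / reference


# Hypothetical male activities (U/g Hb) from a single study.
males = [0.4, 0.6, 7.8, 8.0, 8.2, 8.4, 8.6, 9.0]
amm = adjusted_male_median(males)

# Any individual from the same study is then scaled against that reference.
example_percent = percent_normal(2.49, amm)
```

Because the reference is computed per study, a value expressed as % normal is only comparable across studies to the extent that the assays themselves are equivalent.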
IPD were excluded if participants had a confirmed Plasmodium spp. infection, were less than one year of age, or zygosity was not defined. IPD were pooled and the median activity (in % of normal activity, or G6PD/6PGD ratio), interquartile range, and total range were calculated for each variant for homo-/hemizygotes and heterozygotes, separately. All variants for which data were available were included to indicate the breadth of the mutations present; however, since the available data for several variants was limited, variants were classed as data-rich (n ≥ 30 hemi-/homozygous deficient individuals) or data-poor (n < 30) [23]. G6PD activity estimates represent either a single point estimate (n = 1), a mean of two observations (n = 2), or the median of all G6PD spectrophotometry measurements (n ≥ 3). G6PD/6PGD ratio results were analysed separately, and excluded from analyses involving diagnostic thresholds (30%, 60%, 70%, and 80%) established for use with G6PD activity readings alone.
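As an illustration of the summary rules described above, the following sketch applies the n = 1, n = 2, and n ≥ 3 conventions and the data-rich threshold of 30 hemi-/homozygous individuals. The function and constant names are hypothetical and are not taken from the authors' analysis code.

```python
from statistics import mean, median

DATA_RICH_MIN = 30  # minimum number of hemi-/homozygous individuals (from the text)


def activity_estimate(values):
    """Single value for n = 1, mean for n = 2, median for n >= 3."""
    if not values:
        raise ValueError("no observations")
    if len(values) == 1:
        return values[0]
    if len(values) == 2:
        return mean(values)
    return median(values)


def data_class(n_hemi_homo):
    """Label a variant by the number of hemi-/homozygous observations."""
    return "data-rich" if n_hemi_homo >= DATA_RICH_MIN else "data-poor"
```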
To highlight the presence of extreme measurements, which may reflect procedural errors, outliers were defined for data-rich variants. Outliers were defined per variant and separately for observations reported in U/gHb and those where the G6PD/6PGD ratio was defined. Any measurement that had an activity greater than 1.5× the interquartile range (IQR) above the median measurement for the respective variant (including all observations) was defined as an outlier [24]. Outlier measurements were retained, to illustrate the breadth of variability in measurements, but excluded from the estimates of G6PD activity, reported for each variant, and analyses involving clinical diagnostic thresholds. Differences in median readings were compared using the Kruskal-Wallis test with pairwise Wilcoxon post-tests, with Bonferroni correction to account for multiple comparisons. All analyses were performed using R version 4.0.3 [25].
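The outlier rule stated above (an activity more than 1.5 × IQR above the variant median) can be sketched as follows. The quartile convention is an assumption, since the text does not specify one; the "inclusive" method is used here. The function names and the example values are hypothetical.

```python
from statistics import median, quantiles


def upper_outlier_cutoff(values, k=1.5):
    """Return median + k * IQR for one variant's measurements.

    Assumption: quartiles use the 'inclusive' interpolation method.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values) + k * (q3 - q1)


def split_outliers(values, k=1.5):
    """Split measurements into (kept, flagged) using the upper cutoff only."""
    cutoff = upper_outlier_cutoff(values, k)
    kept = [v for v in values if v <= cutoff]
    flagged = [v for v in values if v > cutoff]
    return kept, flagged


# Hypothetical % normal activities for hemizygous males with one variant.
activities = [5.0, 8.0, 10.0, 12.0, 15.0, 160.0]
kept, flagged = split_outliers(activities)
```

Only high values are flagged under this rule, which matches the wording of the definition; low extremes are retained.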
To assess the risk of bias attributable to the study design and/or testing procedures, the QUADAS-2 tool [26] was modified (Supplementary File S2) and applied to all studies. To assess whether a given study contributed a higher-than-average proportion of outlier measurements, the proportion of outliers in each study was compared to the overall dataset using chi-squared testing.
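A hedged sketch of the outlier-proportion comparison is given below. It assumes a 2 × 2 layout (one study versus all remaining studies, outlier versus non-outlier) and uses the Pearson statistic without continuity correction; the text does not state which layout or correction was used, and R's `chisq.test` applies a continuity correction to 2 × 2 tables by default. The counts in the example are hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 df, no continuity correction).

    Table layout: [[a, b], [c, d]], e.g. rows = study / other studies,
    columns = outliers / non-outliers. Returns (statistic, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 10 of 50 measurements flagged in one study,
# 20 of 950 flagged in the remaining studies.
stat, p_value = chi2_2x2(10, 40, 20, 930)
```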

Characteristics of the Pooled Database
A total of 838 papers were screened by title and abstract; 153 of these were included and the full text screened, and a further 10 papers were added from reference lists and author contact. Of these, 53 were identified as relevant (Figure 1); however, data from 36 papers were unavailable due to no author response or no permissions to share the IPD. Overall, data were available from 17 studies published between 2009 and 2021, conducted across 10 countries: 11 studies in Asia, 3 studies in the Americas, 2 studies in Africa, and 1 study in the Middle East (Figure 1, Supplementary Material Table S1) [27-43]. Individual-level data from 4320 participants were available, of whom 564 (13%) individuals were excluded due to an age less than one year or unknown age, and 251 (6%) individuals aged above one year were excluded due to confirmed malaria infection, along with two (0.05%) females for whom it was unknown whether they were hetero- or homozygous [35,44]. Among the remaining 3503 individuals (81.1%), no clinically relevant G6PD variant was identified in 1765 (41%) individuals and these were therefore excluded from further analysis.
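The participant flow reported above can be tallied in a few lines. All counts are taken from the text; the variable names and the `pct` helper are hypothetical.

```python
total = 4320              # individuals with individual-level data
excluded_age = 564        # aged under one year or age unknown
excluded_malaria = 251    # confirmed malaria infection
excluded_zygosity = 2     # females with undefined zygosity

remaining = total - excluded_age - excluded_malaria - excluded_zygosity

no_variant = 1765         # no clinically relevant variant identified
with_variant = remaining - no_variant


def pct(part, whole=total):
    """Percentage of the pooled dataset, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)
```

The 1738 individuals left after these exclusions correspond to the 40.2% of the pooled dataset carrying a confirmed G6PD mutation.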
Similar variability was observed for variants analysed as a ratio of the G6PD/6PGD activity. In total, 35 hemi-/homozygotes were included for both the Canton and Kaiping variants, with both variants exhibiting the same median G6PD/6PGD ratio of 0.4 (range: 0.1 to 0.7). The corresponding ratio was 0.3 (range: 0.1 to 0.8) for 25 individuals with the Gaohe variant (Table 3, Figure 3). Heterozygous individuals with these variants exhibited median G6PD/6PGD ratios of 1.5 (range: 0.4 to 2.2, n = 36) for the Canton variant, 1.6 (range: 0.5 to 2.0, n = 31) for Kaiping, and 1.1 (range: 0.5 to 2.3, n = 24) for Gaohe.
* Estimates for n = 1 are the single G6PD activity measurement; for n = 2, these are the mean of the 2 measurements; for n ≥ 3, these are the median of the included measurements. Estimates indicated in Columns 3-5 were calculated after the exclusion of outliers.
Table 3. Median G6PD activity and variability of the variants investigated using the G6PD/6PGD ratio method.

Data-Poor Variants
A total of 15 variants had fewer than 30 observations in hemi-/homozygous individuals. Across these variants, G6PD activity followed the expected trends, with 60 hemi-/homozygous individuals falling below 30% activity. Overall, 14 data-poor variants had >1 observation, with similar levels of variability observed as for the data-rich variants. For example, for hemi-/homozygous individuals with the Orissa variant, activity varied from 3.8% to 59.6%, with a similar spread for the hemi-/homozygous individuals with Quing Yuan or Chinese-4 (20.0% to 46.4%). For eight compound heterozygous females, with one of seven combinations of G6PD mutations, all but one individual exhibited G6PD activity below 30%.
In 86 heterozygous females with a single, data-poor variant, enzyme activity ranged from 20% to 80% normal activity. Although not directly comparable to measurements expressed as a % normal activity, the same trends were observed for data-poor variants included using the G6PD/6PGD ratio method (Figure 4).

Assessment of Study Quality and Risk of Bias
All included studies were assessed using a modified form of the QUADAS-2 tool (Supplementary Files S2 and S3) to examine the risk of bias towards the aims of this meta-analysis arising from the study design and/or sample collection and testing procedures [26]. The assessment was divided into four domains: patient selection, genotyping methods, spectrophotometry methods, and flow and timing. The included papers comprised a heterogeneous mix of recruitment methods and study populations. Risk of bias due to patient selection was deemed high in seven studies, and unclear in two studies, due primarily to purposive selection of individuals with known G6PD deficiency or from a specific ethnic group, or to convenience sampling. A total of 13 out of 17 studies purposefully selected participants for genotyping based on prior phenotypic testing: only genotype-deficient individuals (or those less than 60% activity for example) or only a subset of G6PD normal individuals, resulting in a significant risk of bias towards the lower G6PD activity range. Genotyping methodology primarily introduced bias in the form of the selection of G6PD variants included in SNP-typing methods, which provided logistical benefits but risked missing variants not included in the SNP-typing panel. While difficult to assess retrospectively, reported spectrophotometry methodologies were deemed appropriate in all included studies. The primary risk of bias identified in the spectrophotometry methodology was the absence of replicate measurements in 10 out of 17 included studies. This was deemed a high risk of bias due to the documented potential for considerable inter-replicate variability in G6PD spectrophotometry [11]. Finally, when considering flow and timing, the primary source of bias identified was the exclusion of certain participants (e.g., G6PD normal individuals) from the final study samples, due to the wide range of study objectives and methodologies represented. All studies appeared to employ appropriate timing and storage of blood samples.
In addition to the above assessment, all included studies were assessed based on the proportion of outlier measurements that contributed to the data-rich variants. Two studies [31,39] (one contributing 50 A- individuals, the other contributing 22 Mahidol individuals) exhibited a significantly higher proportion of outlier measurements than other studies (p < 0.01). A sensitivity analysis was performed excluding all data from these studies, with little-to-no effect upon our overall findings: A- median G6PD activity 25.9% (range: 1.7% to 64.4%), with 64% of hemi-/homozygous individuals below 30% activity; Mahidol 10.2% (range: 0.0% to 32.5%), with 99.5% of hemi-/homozygous individuals below 30% activity.

Discussion
Our study highlights significant variation in G6PD activity for individuals with the same G6PD variant, apparently irrespective of the phenotypic method used. Whilst hemi-/homozygous-deficient individuals with the Mahidol, Mediterranean, Vanua Lava, and Viangchan variants had similar median enzyme activities (p > 0.05), their activities were significantly lower than activities of individuals with the A- (c.202G>A/c.376A>G) variant. Variation in G6PD activity was greatest for the A- variant (c.202G>A/c.376A>G), ranging from almost 0% to >100% across the six studies, even among hemi- and homozygous individuals. Enzyme activity varied least for the Mediterranean and Mahidol variants (ranging from 0% to 20% across three and five studies, respectively). For most variants, the observed G6PD activity distributions spanned the 10% threshold separating Class II from Class III in the 1971 classification of variant severity [21], supporting recent proposals for revised classes [22].
While considerable, this variability rarely resulted in a confirmed hemi-/homozygous individual crossing the 30% clinical threshold for severe deficiency. Individuals were categorised according to commonly used diagnostic thresholds at 30% and 70% G6PD activity. Almost all hemi-/homozygous individuals with the Mahidol, Mediterranean, Vanua Lava, and Viangchan variants were severely deficient (<30% activity, Table 4), with 29% of hemi-/homozygous individuals with data-poor variants (45/154) also falling below this line. At the same time, however, only 55% of hemi-/homozygous individuals with the A- (c.202G>A/c.376A>G) variant had <30% activity, and 3% had >70% activity without meeting the criteria for outliers. As expected, enzyme activity varied significantly in heterozygous females for all variants, ranging from very low activities to levels that would generally be categorised as normal, a reflection of lyonization [7].
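The threshold-based categorisation can be sketched as below, using the 30% and 70% cut-offs named in the text. The category labels, function names, and handling of values exactly at a threshold (treated as belonging to the higher category) are assumptions for illustration.

```python
def deficiency_category(percent_activity, severe=30.0, intermediate=70.0):
    """Assign a category from % normal G6PD activity.

    Assumption: thresholds are strict upper bounds, so exactly 30% is
    'intermediate' and exactly 70% is 'normal'.
    """
    if percent_activity < severe:
        return "severe"
    if percent_activity < intermediate:
        return "intermediate"
    return "normal"


def proportion_below(activities, threshold=30.0):
    """Fraction of measurements strictly below a diagnostic threshold."""
    return sum(a < threshold for a in activities) / len(activities)
```

Applied to the reported medians for the A- variant, 29.0% (hemi-/homozygotes) falls in the severe category and 72.1% (heterozygotes) falls above the 70% threshold, which illustrates how close both medians sit to a decision boundary.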
Our findings highlight the substantial proportion of heterozygous individuals with activities between 60% and 80%, and this was apparent for all variants assessed. Out of all non-compound heterozygous females included, 29% had activities between 60% and 80%; hence, a relatively small change in assay precision or decision-making regarding treatment thresholds will have a large impact upon the number of individuals eligible for treatment. However, data on the haemolytic risk associated with G6PD activity in this range are limited, and this risk is likely to differ according to residual enzyme activity, the associated variant, and degree of oxidative stress.
Several factors may contribute to the observed variability in G6PD activity, including both laboratory and biological factors. First, infancy is associated with elevated G6PD activity [47][48][49], and early reports suggest concurrent malaria infection may transiently increase G6PD activity [35,44]. To minimise the effect of these factors, malaria-positive individuals and infants below one year of age were excluded from our analysis. Second, although G6PD activities were normalised for each study, it is likely that some of the observed variability could be due to lab procedures or errors in data recording, demonstrated, for example, by the occurrence of two hemizygous males with the Mahidol variant who recorded a G6PD activity of more than 150% of normal and were classified as outliers. Though unlikely, we cannot exclude the possibility that these individuals had Klinefelter syndrome, with an additional X chromosome [50]. Third, most assay protocols provided with commercial test kits do not require the removal of white blood cells from the sample prior to testing, despite extreme leucocytosis being known to influence G6PD spectrophotometry [51]. Since leucocyte counts were not measured in most participants, this may have contributed to the observed variability. Fourth, two studies from 1975 and 2004 suggest potential diurnal variations in G6PD activity [52,53], although from relatively small sample sizes, and this may also contribute to the variation in observed G6PD activity. Finally, further variability may stem from sources such as unreported infection, recent haemolytic events, and undiagnosed haematological conditions, resulting in elevated reticulocytosis.
Our study has a number of limitations. First, retrospective analysis of spectrophotometry data cannot exclude poorly standardized and/or executed laboratory procedures [11]. To address this, all studies were assessed for quality control measures, and for the data-rich variants, extreme values were excluded. Additionally, the proportion of outliers contributed by each study was quantified and compared to the proportion of outliers in the complete dataset. Finally, a sensitivity analysis, excluding one dataset with a significantly higher proportion of outliers, did not alter the overall results significantly. This approach penalises small sample size studies, which may potentially introduce additional biases. Second, a total of 14 out of the 17 studies included in our analysis predominantly genotyped individuals below a pre-defined G6PD activity threshold. Accordingly, the derived G6PD activity distributions are likely to be skewed towards the lower end of the G6PD activity spectrum. This is particularly apparent when considering the activity range for heterozygous females with the Viangchan variant (Figure 2), where all observations were from studies in which only females with less than 60% or less than 80% activity were genotyped. Third, some variants were reported predominantly from a single study or from limited geographic areas. For example, almost 90% of all observations for the Mediterranean variant were reported by Reading et al. [37], and all observations of heterozygous females with the A-variant were derived from either the USA or Uganda [32,34,36,39]. Fourth, reported activities in U/gHb were normalized to percent activity to allow for a direct comparison of readings between studies, assuming equivalency between assays, which may not be the case for all assays included [10].
In conclusion, our results highlight marked variability in enzyme activity among individuals with the same G6PD variant. G6PD activity distributions spanned the widely used thresholds to demarcate classes for G6PD deficiency, supporting the updated classification schema recently proposed during a WHO-convened meeting of international experts [22]. In order to define the severity of deficiency associated with a given variant, genotyping should be performed, not only for individuals with a phenotypic activity below a pre-defined threshold, but also phenotypically normal individuals. Further studies are required to determine the association between the G6PD enzyme activity, genetic variant, and risk of severe haemolysis following an 8-aminoquinoline drug, to inform the diagnostic and clinical implications of this heterogeneity.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pathogens11091045/s1, Figure S1: G6PD activity distributions (% Normal) for data-rich variants (A-, Mahidol, Mediterranean, Vanua Lava and Viangchan) per study; Table S1: Characteristics of Included Studies; Table S2: Outliers identified per study; File S1: Literature search protocols; File S2: Modified QUADAS tool; File S3: QUADAS Results.
For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.