Comparison of a New Multiplex Immunoassay for Measurement of Ferritin, Soluble Transferrin Receptor, Retinol-Binding Protein, C-Reactive Protein and α1-Acid-glycoprotein Concentrations against a Widely-Used s-ELISA Method

Recently, a multiplex ELISA (Quansys Biosciences) was developed that measures ferritin, soluble transferrin receptor (sTfR), retinol-binding protein (RBP), C-reactive protein (CRP), α1-acid glycoprotein (AGP), thyroglobulin, and histidine-rich protein 2. Our primary aim was to conduct a method-comparison study to compare five biomarkers (ferritin, sTfR, RBP, CRP, and AGP) measured with the Quansys assay and a widely-used s-ELISA (VitMin Lab, Willstaett, Germany) with use of serum samples from 180 women and children from Burkina Faso, Cambodia, and Malaysia. Bias and concordance were used to describe the agreement in values measured by the two methods. We observed poor overall agreement between the methods, both with regard to biomarker concentrations and deficiency prevalence estimates. Several measurements were outside of the limit of detection with use of the Quansys ELISA (total n = 42 for ferritin, n = 2 for sTfR, n = 0 for AGP, n = 5 for CRP, n = 22 for RBP), limiting our ability to interpret assay findings. Although the Quansys ELISA has great potential to simplify laboratory analysis of key nutritional and inflammation biomarkers, there are some weaknesses in the procedures. Overall, we found poor comparability of results between methods. Besides addressing procedural issues, additional validation of the Quansys against a gold standard method is warranted for future research.


Introduction
Micronutrient deficiencies are thought to be widespread globally but particularly affect certain subgroups in low- and middle-income countries, such as young children and women of child-bearing age [1]. The measurement of biomarkers of nutrients such as iron, vitamin A, iodine, folate, vitamin B12, and zinc is an essential component of assessing nutritional status at the population level.
Such biomarker analyses are done in a laboratory, either in the country where samples were collected or internationally. In either case, acceptable precision and accuracy of laboratory analysis is of primary importance in obtaining reliable estimates of nutritional status. In addition, affordability is an important criterion for selecting laboratory tests to be done in a survey or study, in particular in resource-constrained settings. Other factors include sample throughput, complexity of the method, technical expertise required for operation and maintenance, and quantity of blood sample available for analysis.
In large micronutrient surveys conducted in low- and middle-income countries, identifying a suitable laboratory to measure selected biomarkers with adequate quality and at affordable cost is often not straightforward. When biological samples cannot be exported, stakeholders must identify an in-country laboratory with sufficient expertise, and it can be challenging to find one that has the needed capacity at affordable cost. When exporting samples is possible, affordability is often still a challenge.
In recent years, the VitMin Lab in Germany (led by Dr. Juergen Erhardt) has automated and scaled up an in-house sandwich ELISA (s-ELISA) [2], enabling the analysis of large numbers of samples at a very competitive cost. This laboratory offers the analysis of ferritin, soluble transferrin receptor (sTfR), retinol-binding protein (RBP), C-reactive protein (CRP), and α1-acid-glycoprotein (AGP). It has demonstrated high performance in external quality control schemes [2]. Because this laboratory offers outstanding quality at a low price, many groups working in the field of nutrition rely heavily on it for analysis of these biomarkers. These biomarkers are well suited to assessing iron and vitamin A status (ferritin, sTfR, RBP) and adjusting for their response to inflammatory triggers [3]. This strong performance and low price have: (a) resulted in high demand for the services of the single person operating the VitMin Lab; and (b) forced the laboratory to restrict the number of samples accepted per project to a somewhat arbitrary limit of 3000 in order to cope with the demand. Repeated attempts by the laboratory to establish the methodology in other laboratories for analysing large numbers of samples have not yet been successful [4]. In addition, to our knowledge, the full range of these biomarkers cannot currently be measured on a single commercially-available clinical analyser, further complicating the issue. Several models are available that measure all but either AGP or RBP, necessitating the use of several methods or analysers. Thus, while the VitMin Lab is a viable and reliable solution at present, it has the drawbacks of: (a) dependency on a single laboratory; (b) limitations in the number of samples that can be analysed; and (c) the need for permission and resources to export samples from the country of origin to Germany. Therefore, there is a need to identify alternative options.
PATH, in collaboration with Quansys (Quansys Biosciences, Logan, UT, USA), has developed a multiplex ELISA (Q-Plex™ Human Micronutrient Assay) that measures the same five biomarkers as those measured by the VitMin Lab, as well as thyroglobulin and histidine-rich protein 2 (HRP2; a biomarker to detect recent or current malaria parasitemia). The multiplex platform is relatively easy to use and requires less sophisticated laboratory knowledge and skills than other methods, which would make it suitable for use in different contexts, including those in low- and middle-income countries. Additionally, although the multiplex reader is relatively expensive, the cost per sample of the disposable microplates is similar to that of the VitMin Lab.
As a primary goal, we conducted a method-comparison study to compare five biomarkers (ferritin, sTfR, RBP, CRP, and AGP) measured by the VitMin Lab s-ELISA and by the Quansys assay in serum samples from women and children from three different regions. As a secondary goal, we compared HRP2 results from the Quansys assay against qualitative malaria test results in a sub-sample of children to determine the assay's diagnostic ability to identify children with malaria.

Description of the Gold Standard Malaria Test
Qualitative malaria diagnosis was conducted by detecting the histidine-rich protein 2 (HRP2) antigen of Plasmodium falciparum using a commercial kit (SD BIOLINE Malaria Ag P.f, Standard Diagnostics, Gyeonggi-do, Republic of Korea) and was considered as the gold standard method (in the method comparison to the Quansys s-ELISA method) for detecting malarial infection that occurred in the last month [6].

Studied Population
The origin of the 180 serum samples used in the primary method-comparison analysis is summarized in Table 1.

Blood Collection Procedures
In Burkina Faso, a capillary blood sample was collected from the finger into silica-coated blood collection tubes (Microvette 300, Sarstedt, Nümbrecht, Germany) and stored and transported cold until later centrifugation on the same day at the district health centre in Fada-N'Gourma (East Region of Burkina Faso). Samples were stored frozen (−20 °C) until shipment to the VitMin lab (July 2016) and use on the Quansys platform (June 2017).
In Cambodia, a 3-h fasting venous blood sample was collected from women in the morning into a 3.

Data and Statistical Analysis
Biomarker concentrations were reported as mean ± SD or median (IQR) depending on the distribution (normal or skewed, respectively). Commonly-used cut-offs were used to define nutritional deficiencies, as follows: ferritin <12 µg/L for children and <15 µg/L for women [7], sTfR >8.3 mg/L [8], and RBP <0.7 µmol/L [9]. Acute inflammation was defined as CRP >5 mg/L and chronic inflammation as AGP >1 g/L [10]. HRP2 >0.92 µg/L was used to indicate a positive malaria diagnosis in the Quansys ELISA (personal communication, A. Tyler, Quansys). All values were unadjusted (i.e., no adjustments were made for inflammation or other factors).
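As an illustration of how such cut-offs translate into prevalence estimates, the following minimal sketch applies a subset of the cut-offs above (ferritin, RBP, and CRP). The function names and the group labels "child" and "woman" are hypothetical and are not part of either assay's software.

```python
# Cut-off values are taken from the text; all names here are illustrative assumptions.
FERRITIN_CUTOFF_UG_L = {"child": 12.0, "woman": 15.0}
RBP_CUTOFF_UMOL_L = 0.7
CRP_CUTOFF_MG_L = 5.0


def iron_deficient(ferritin_ug_l, group):
    """True if ferritin is below the group-specific cut-off."""
    return ferritin_ug_l < FERRITIN_CUTOFF_UG_L[group]


def vitamin_a_deficient(rbp_umol_l):
    """True if RBP is below 0.7 umol/L."""
    return rbp_umol_l < RBP_CUTOFF_UMOL_L


def acute_inflammation(crp_mg_l):
    """True if CRP is above 5 mg/L."""
    return crp_mg_l > CRP_CUTOFF_MG_L


def prevalence_percent(flags):
    """Percentage of True values in a non-empty sequence of booleans."""
    flags = list(flags)
    if not flags:
        raise ValueError("at least one observation is required")
    return 100.0 * sum(flags) / len(flags)
```

The same ferritin value can therefore be classified differently for a child and a woman, which is why the group label is passed explicitly.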
Bland and Altman's bias and limits of agreement (95% CI) were used to describe the agreement in values measured by the two methods [11,12]. Bias was defined as the difference in means between the two measures. Limit of agreement plots were also generated for visual interpretation. Lin's concordance coefficient was calculated as a measure of reproducibility between the two methods [13]. Concordance measures the departure of the measured values from the 45° line of perfect concordance. We also estimated linear trend equations in the form y = ax + b for each population group to compare the two methods. Two-sided p-values < 0.05 indicated statistical significance. Stata software version SE/13.1 for Mac (Stata Corp., College Station, TX, USA) was used for analyses.
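For readers unfamiliar with these statistics, the sketch below shows one common way to compute the Bland-Altman bias with 95% limits of agreement and Lin's concordance coefficient for paired measurements. It is not the Stata code used in this study. The function names, the direction of the difference (second method minus first), and the use of 1.96 standard deviations for the limits are assumptions made for illustration.

```python
from statistics import mean, stdev


def bland_altman(x, y):
    """Return (bias, lower_limit, upper_limit) for paired values.

    Differences are taken as y - x (an assumed convention); the limits
    of agreement are bias +/- 1.96 * SD of the differences.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    diffs = [b - a for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def lin_concordance(x, y):
    """Lin's concordance correlation coefficient (variances divided by n)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)
```

A constant offset between two otherwise identical methods gives a non-zero bias with zero-width limits of agreement, and a concordance below 1 even though the correlation is perfect. This is why concordance, rather than correlation alone, is used to judge interchangeability.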

Quality Control
Quality control data for the s-ELISA are described as overall coefficients of variation (CV, %) of the different analytes for each population group (with expected concentrations). Calibration curves were adjusted with a control sample, measured in 10 wells with Biorad Liquichek controls in 3 different concentrations in 6 wells on each plate. CVs for samples from the three countries were <3.2% for ferritin, <4.3% for sTfR, <4.0% for RBP, <6.2% for CRP and <10.1% for AGP. The Quansys software provides quality control data for each measured plate (Table 2).
Table 2. Quality control data for the Quansys enzyme-linked immunosorbent assay (ELISA) for each of the five assays conducted (total n = 180).
The numbers of samples that were outside of the LOD range in the Quansys ELISA were: n = 35/179 (20%) for ferritin, n = 2/180 (<1%) for sTfR, n = 0/180 (0%) for AGP, n = 7/180 (4%) for CRP, and n = 4/180 (2%) for RBP. One sample failed the s-ELISA analysis for ferritin in the Cambodian population; thus, only n = 59 samples were available for comparison in this group.

Characteristics of the Women and Children Included in the Analysis
A total of 180 individuals were included in the analysis from three countries: Burkina Faso, Cambodia, and Malaysia. Table 3 presents the mean or median concentrations of each nutritional biomarker and the deficiency prevalence rates of women and children included in the method-comparison analyses.
Ferritin: Prevalence of iron deficiency based on low ferritin concentrations (<12 and <15 µg/L for children and women, respectively) varied between the s-ELISA and Quansys ELISAs in Burkina Faso children (17-32%), Cambodian women (0-23%), and Malaysian women (6-18%). Prevalence also varied between the Quansys ELISAs in Burkina Faso and Malaysia regardless of whether or not measurements outside of the limit of detection (LOD) were excluded or included (17-22% and 6-10%, respectively).
RBP: Prevalence of vitamin A deficiency based on RBP concentrations (<0.7 µmol/L) varied between the two methods in Burkina Faso children. No differences in vitamin A deficiency were observed in Cambodian and Malaysian women in either the s-ELISA or Quansys ELISA (2% and 0% prevalence, respectively).
Table 3. Nutritional biomarkers and deficiency prevalence of the women and children based on s-ELISA and Quansys measurements both within and outside of the limit of detection (LOD) range 1.

[Table 3 body not recovered; surviving column headers: Burkina Faso Children, Cambodian.]
1 Values are median (interquartile range; IQR) unless otherwise indicated. AGP, α1-acid-glycoprotein; CRP, C-reactive protein; LOD, limit of detection; RBP, retinol-binding protein; SD, standard deviation; sTfR, soluble transferrin receptor.
2 Excludes values of measurement for the Quansys ELISA that were outside of the LOD range.
3 Includes all values of measurement for the Quansys ELISA; values outside of the LOD range were included in the analysis at the lowest or highest value of the LOD (e.g., a ferritin value measured as <1.5 µg/L on the Quansys ELISA was included in the analysis as 1.5 µg/L).
4 Ferritin <12 µg/L for children and <15 µg/L for women.
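The two analysis sets described in footnotes 2 and 3 can be sketched as follows. This is an illustration only: it assumes that out-of-range readings are exported as text with a leading "<" or ">" (for example "<1.5"), which may not match the actual Q-View export format, and the function names are hypothetical.

```python
def parse_reading(reading):
    """Return (value, within_lod) for one exported reading.

    A reading such as '<1.5' or '>500' is assumed to mark a value outside
    the LOD range; it is substituted with the LOD value itself.
    """
    text = str(reading).strip()
    if text.startswith(("<", ">")):
        return float(text[1:]), False
    return float(text), True


def analysis_sets(readings):
    """Return (within_lod_only, all_with_substitution) as two lists."""
    parsed = [parse_reading(r) for r in readings]
    within_only = [value for value, ok in parsed if ok]
    all_values = [value for value, _ in parsed]
    return within_only, all_values
```

Keeping both lists makes it straightforward to report each statistic twice, once excluding and once including the substituted values.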

Trend Estimates for Each Analyte for Each Population Group and the Pooled Population
We estimated linear trend equations in the form (y = ax + b; y representing the Quansys method) for each population group for the comparison between each of the two methods (Table 4). Equations were calculated for each analyte using all samples within the LOD range in the Quansys ELISA, with the exception of ferritin, for which we calculated two trend equations for (1) samples within the LOD range; and (2) all samples within and outside of the LOD range.
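A trend equation of the form y = ax + b can be obtained by ordinary least squares, as in the minimal sketch below. The text does not state which regression method was used, so ordinary least squares is an assumption here, and `fit_line` is a hypothetical name.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b).

    x holds the reference-method values and y the Quansys values,
    following the convention stated in the text.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b = my - a * mx
    return a, b
```

A slope near 1 with an intercept near 0 would indicate that the two methods agree on average, whereas a slope well above 1 indicates that the y-axis method reads proportionally higher.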

Method-Comparisons between the Two Methods for Each Analyte
Bias, limits of agreement, and correlation coefficients of ferritin, sTfR, AGP, CRP, and RBP concentrations comparing the s-ELISA and the Quansys ELISA kit are described in Table 5. Results are shown for all samples, as well as only for those within the LOD range in the Quansys ELISA.
Ferritin: Agreement between the two methods was poor in the Burkina Faso, Cambodian and Malaysian populations (bias ranged from 14.7 to 29.3 µg/L, and concordance ranged from 0.40 to 0.62). Bias did not improve when the samples outside of the LOD range were excluded.
sTfR: Agreement between the two methods was poor in the Burkina Faso children (bias of 26.5 mg/L). Conversely, bias was low in the Cambodian and Malaysian women (5.1 and 0.1 mg/L, respectively). Concordance ranged from 0.09 to 0.62 in the Burkina Faso children and Cambodian women, but was higher (0.78) in the Malaysian women. Bias did not improve when samples outside of the LOD range were excluded.
AGP: Bias between the two methods ranged from 0.1 to 0.2 g/L across the three populations. Concordance ranged from 0.41 to 0.85. No samples were outside of the LOD range on the Quansys assay.
CRP: Bias between the two methods ranged from 1.4 to 3.1 mg/L across the three populations. Concordance was relatively good in all three populations and ranged from 0.74 to 0.79. Bias did not change when the samples outside of the LOD range were excluded.
RBP: Bias between the two methods ranged from −0.04 to 0.6 µmol/L across the three populations. Concordance ranged from 0.38 to 0.83. Bias did not improve when the samples outside of the LOD range were excluded (although only 2 samples were excluded for this reason).
Figures 1-5 present the bias and limit of agreement plots for each of the five analytes in each population group for samples within the LOD range (excludes all values of measurement for the Quansys ELISA that were outside of the LOD range).
Table 5. Bias, limits of agreement, and correlation coefficients of ferritin, soluble transferrin receptor (sTfR), α1-acid glycoprotein (AGP), C-reactive protein (CRP), and retinol-binding protein (RBP) concentrations comparing the s-ELISA and the Quansys ELISA kit.

Malaria Testing
The Quansys ELISA measures HRP2 as a biomarker of malarial infection. We compared HRP2 concentrations in 60 Burkina Faso children with completed qualitative testing of malaria parasitemia (Table 6). Considering the qualitative method as the reference, the sensitivity and specificity of the HRP2 biomarker measured on the Quansys ELISA to identify children with malaria were as follows: 72% sensitivity (true-positive rate) and 80% specificity (true-negative rate). The resulting Kappa coefficient was 0.680 (95% CI 0.488, 0.872), which is considered to indicate 'good agreement'.
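Sensitivity, specificity, and Cohen's Kappa are all derived from the same 2 × 2 table of test result against reference result. The sketch below shows the standard formulas. The function name is hypothetical, and the counts in the usage note are invented for illustration; they are not the counts from Table 6.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return (sensitivity, specificity, kappa) from 2x2 counts.

    tp/fn: reference-positive cases the test called positive/negative.
    tn/fp: reference-negative cases the test called negative/positive.
    """
    n = tp + fp + fn + tn
    if n == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("both reference classes must be represented")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa
```

With invented counts of 40 true positives, 10 false positives, 10 false negatives, and 40 true negatives, the function returns a sensitivity and specificity of 0.8 each and a Kappa of 0.6.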

Discussion
This study compares a newly established method (Quansys ELISA) with s-ELISA results from a single laboratory (VitMin lab) that has been widely used in recent years in surveys assessing vitamin A and iron deficiencies. As such, this study is not a method validation per se, but rather an evaluation to determine whether results from a newly established method can be used interchangeably with those from an established lab. This is an important distinction to keep in mind, as validation studies have been conducted for both the newly established method and the established approach [2,5]. Further, it is noteworthy that although the Quansys ELISA platform is on the market, the producer of the platform continues to improve its analytical performance. As such, the results presented here are of a somewhat transient nature.
Results from our comparison almost consistently indicate poor overall agreement of the two methods, both with regard to absolute biomarker concentrations and deficiency prevalence estimates. For most of the analytes, there is a relatively wide scattering between the two methods, as can be seen from the Bland-Altman plots. Moreover, as can be seen in Figures 1-5 in the results section, there is also a systematic shift for several biomarkers, in particular for ferritin, sTfR, and CRP, and to some extent RBP. Such bias may be a result of different affinity of the antibodies in the two methods [14] and, if it proved to be constant, could justifiably be adjusted for, in particular if absolute true values could be established to adjust to. However, this shift is not consistent across the samples from the three countries, such that no factor-adjustment to render the two methods more comparable can be proposed from this work without reservation. Further, in comparison with a recently published report comparing the Quansys method to other methods based on samples from pregnant women in Niger [5], the shift varies again, despite some rough qualitative agreement (above or below the line of equality): the slopes (y = Quansys, x = comparing method) are 1.88x in that work vs. 2.08x for pooled ferritin in our work, 1.70x vs. 2.48x for sTfR, 0.67x vs. 0.99x for CRP, and 0.47x vs. 0.63x for AGP; for RBP, this qualitative trend does not hold true: 0.84x vs. 1.12x.
The observed lack of concordance may be attributable in part to sample preparation. In 2017, the Quansys assay was validated using a Nigerian cohort of heparinized plasma samples [15]. The fact that serum and capillary samples were used in this study could account for some of the observed assay differences. Further studies are warranted to better understand the impact of sample collection methods on both the s-ELISA and the Quansys multiplex assay.
With regard to malaria parasitemia, 80% of the cases were truly positive and 72% were truly negative when using the on-site rapid diagnostic test kit as the reference method. This results in a Kappa coefficient indicating good agreement. However, Kappa statistics are used primarily for inter-observer comparability and, as such, tend to overestimate the comparability of biochemical methods.
In terms of handling the Quansys ELISA, our impression is that the platform is relatively easy to use, with minimal additional training needed for trained lab technicians. Instructions on assay preparation and completion are well-described in the assay handbook (available online and inside the kit). The kit contains the 96-well plate and all required reagents (calibrator, detection mix, substrate, sample diluents, and wash buffer). The Q-View™ software (Quansys Biosciences, Logan, UT, USA) is user-friendly and tutorials are available online to provide the user with additional guidance if needed. Despite the simple handling, we faced some challenges, in particular around the platform's automatic calculation of the LOD: calibration is done for each new plate, and a built-in algorithm fits a curve to the measured concentrations of the calibrators. Particularly for ferritin, this led to difficulty in interpreting deficiency status on two plates, where the fitted curve resulted in estimations of ferritin concentrations below the threshold for defining iron deficiency in children (two plates produced n = 14/40 and n = 9/40 ferritin values defined as <15.16 µg/L and <14.13 µg/L, respectively; both below or just around the cut-off for iron deficiency of <12 µg/L and <15 µg/L, see Table 2). For these n = 23 samples, we could not confirm whether children had a ferritin concentration below the 12 µg/L cut-off for deficiency. Through discussions with technical staff of Quansys, we identified options to 'manually' adjust the calibration curve and lower the LOD. However, we decided against presenting results of such manual adjustments, since this comparison was meant to reflect routine analyses and not a research context. Instead, we present the results including and excluding samples where these challenges were faced. As previously mentioned, Quansys is working on improving the calibration curve and introducing quality control samples to reduce such issues.
As observed in several of the Bland-Altman plots, agreement appeared to be poorer between values at the higher end; this was especially the case for ferritin (see Figure 1). We recognize that most assays are developed with the goal of achieving high accuracy of measurement near the threshold that defines deficiency (for ferritin, ~12-15 µg/L). This is intended to optimize diagnostic accuracy with use of ferritin measurements. However, this is likely achieved at the cost of decreased accuracy at higher values (e.g., ferritin > 150 µg/L). We considered whether we should exclude higher values of each analyte because of the poor agreement observed among values at the higher end. However, it was challenging to define, other than arbitrarily, what these higher values should be. This is also because the higher end threshold would likely vary from assay to assay, and would also depend on the controls used in each assay (which differ from lot to lot). Without a consensus on what higher end thresholds should be, and on whether those values should be excluded for the purposes of a method-comparison, we decided not to exclude any values. However, we raise this issue as one that requires further attention and solution, and note that the agreement between the methods in our study would likely have been higher if we had excluded those higher (potentially inaccurate) values.
The strength of our comparison is that we had samples available over a wide concentration range for most analytes, from both women and children, and from both Asia and Africa. Although this is not an exhaustive representation, the samples provide a certain heterogeneity in their origins. One possible limitation is that samples were analysed at different time points after collection and, during that time, were stored under different conditions. This may have affected the analytes (concentration or chemical form) and thereby affected comparability. All analytes used in this work are proteins and are considered relatively stable for an extended period of time if kept frozen at −80 °C. For example, ferritin is very stable and serum can be frozen at −20 to −70 °C for several years without affecting sample quality [16]. Samples from Malaysia and Cambodia followed this standard, but this was not the case for Burkina Faso, where samples were stored at −20 °C for about 2 months and from then onwards at −40 °C. Further, the time between the two analyses was less than one year for samples from Malaysia and Burkina Faso and two years for those from Cambodia; thus, we believe degradation of samples poses a very minor risk. Moreover, we suspect that degradation was not an issue, as the Quansys concentrations tend to be higher than those obtained from the VitMin lab. The other important limitation is that this study is not a method performance validation against a gold standard, but merely compares one newly-developed method with an established method. Thus, it would be inappropriate to say that one method is wrong and the other is right; both may have shifts from the true concentrations of the analytes. Yet, the VitMin Lab method has undergone several critical assessments and mostly yielded good scores [2,14]; in addition, it is so widely used that it is almost setting a benchmark for new methods to be compared against.
In our prior work, we measured serum ferritin concentrations using four different immunoassays, including Erhardt's s-ELISA, in Cambodian women (n = 420) and Congolese children (n = 226) [14]. We observed differences in mean serum ferritin concentrations across assays, which were likely an inherent reflection of the different ferritin isoforms, antibodies, and calibrators used by these assays and labs [14]. Despite the differences in ferritin concentrations, iron deficiency prevalence was similar (and low) across the different ferritin methods [14]. Similarly, we suspect that some of the observed differences in ferritin concentrations are likely due to the different analytical (isoforms and antibodies) and calibration techniques between the two methods.
In conclusion, we think it would be important to identify and publish the upper thresholds for both methods used here, so that users can easily understand where the methods' limitations are in terms of accurate measurement, although we recognize this is difficult without the use of standard controls across all methods. With regard to the Quansys ELISA, although it has great potential to simplify laboratory analysis of ferritin, sTfR, RBP, CRP and AGP, there are still some weaknesses in the procedures that will need to be eliminated for routine use, in particular the automated calibration curves. In our use, this sometimes led to measured values below the established thresholds defining deficiency, in particular with ferritin. It would further be helpful if the test kits included certified quality control materials that would enable the user to spot inconsistencies relatively easily in routine use. An important caveat is that the systematic shift between the two methods is not constant for a given analyte across the samples from the three populations in our study. More research is needed to establish adjustment factors if the comparability cannot be improved by changing the concentration of the capture antibodies.
In its present form, the Quansys platform may be of interest to laboratories that have methods in place for analysing these analytes at small-scale and are looking for higher throughput options. In such a context, the laboratory may be able to cross-compare results on a regular basis and detect inconsistencies originating from the Quansys platform. For routine use in settings where such quality control options are not easily available, we conclude from our data that the risk of obtaining biased results is currently too high.