Diagnostic Efficiency of Three Fully Automated Serology Assays and Their Correlation with a Novel Surrogate Virus Neutralization Test in Symptomatic and Asymptomatic SARS-CoV-2 Individuals

To support the deployment of serology assays for population screening during the COVID-19 pandemic, we compared the performance of three fully automated SARS-CoV-2 IgG assays: Mindray CL-900i® (target: spike [S] and nucleocapsid [N]), BioMérieux VIDAS®3 (target: receptor-binding domain [RBD]) and Diasorin LIAISON®XL (target: S1 and S2 subunits). A total of 111 SARS-CoV-2 RT-PCR-positive samples collected at ≥21 days post symptom onset and 127 pre-pandemic control samples were included. Diagnostic performance was assessed in correlation to RT-PCR and a surrogate virus neutralization test (sVNT). Moreover, cross-reactivity with other viral antibodies was investigated. Compared to RT-PCR, LIAISON®XL showed the highest overall specificity (100%), followed by VIDAS®3 (98.4%) and CL-900i® (95.3%). The highest sensitivity was demonstrated by CL-900i® (90.1%), followed by VIDAS®3 (88.3%) and LIAISON®XL (85.6%). The sensitivity of all assays was higher in symptomatic patients (91.1–98.2%) than in asymptomatic patients (78.4–80.4%). In correlation to sVNT, all assays showed excellent sensitivities (92.2–96.1%). In addition, VIDAS®3 demonstrated the best correlation (r = 0.75) with the sVNT. The present study provides insights into the performance of three fully automated assays, which could help diagnostic laboratories choose a particular assay according to the intended use.


Introduction
The COVID-19 pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1,2], was first reported in December 2019 in Wuhan, China [3,4]. The virus has spread rapidly and become a major public health concern, resulting in a total of 80,818,467 confirmed cases and 1,766,847 deaths as of 27 December 2020 [5].
Although molecular detection techniques have played an important role in testing and contact tracing efforts, virus elimination is perhaps no longer feasible due to the extensive and insidious spread of the virus. Thus, further diagnostic methods are needed to guide the most efficient use of public health measures. The gradual lifting of restrictions and control measures will require active surveillance to allow early detection of new cases or clusters, along with retrospective contact tracing and quarantine, most likely combined with physical distancing measures and augmented protection of those at higher risk. Serology testing is ideally suited for this purpose as it can inform the need for contact tracing, investigation of asymptomatic and other undocumented infections, accurate determination of the infection fatality rate, assessment of herd immunity, and the level and duration of protective immunity in the population at large and in specific groups [6], which remains a key knowledge gap in COVID-19 research.
Laboratories and companies are racing to produce reliable and versatile serological tests that can detect SARS-CoV-2 infection with sufficient specificity and sensitivity [6]. The required performance of a serological test will depend on the purpose of testing. Numerous commercial serological tests have been developed and introduced into the market [7,8]. However, due to the need for their rapid development and implementation, in many countries, the normally stringent regulatory criteria have not been applied to many of them [6]. Thus, persistent concerns remain regarding the accuracy and reliability of the currently available SARS-CoV-2 immunoassays.
Serological tests typically detect antibodies against spike protein (S) and/or nucleoprotein (N) since these are the most immunogenic proteins of SARS-CoV-2 [9]. Recently, it has been shown that antibodies directed against the S1 subunit of the SARS-CoV-2 S protein, specifically against the receptor-binding domain (RBD), strongly correlate with virus neutralization [9]. Thus, the likelihood of predicting protective antibody responses increases when using either the S1 antigens or the RBD in the assay. The specificity of antibody tests in detecting antibodies against SARS-CoV-2 might be hampered by the presence of antibodies against other circulating coronaviruses in the population [10], and thus, testing for cross-reactivity is crucial. When selecting an appropriate antibody test for a specific aim, it is necessary to develop a broad understanding of antibody specificities, kinetics, and functions [11]. The lack of knowledge of antibody kinetics in emerging viral infections during an outbreak is always a challenge for the validation of serological tests. Recent studies have shown that seroconversion rates reach as high as 100% after 10–14 days, and that antibody levels correlate with clinical severity [9,12,13]. This is in concordance with reports on Middle East Respiratory Syndrome coronavirus (MERS-CoV) infection, in which the antibody response varies according to disease severity, with mild and asymptomatic infections resulting in weaker immune responses [12]. Thus, sufficient samples from persons with mild and asymptomatic disease should be included in validation studies for useful interpretation and extrapolation of results to population screening.
In the present study, we aimed to evaluate the performance of three commercially available automated analyzers for the detection of anti-SARS-CoV-2 IgG antibodies using samples collected from symptomatic and asymptomatic RT-PCR-confirmed cases. In addition, for the first time, we assessed the performance of the three commercial automated assays in correlation to a surrogate virus neutralization test (sVNT). The CL-900i® detects anti-S and anti-N antibodies, LIAISON®XL (Diasorin) detects anti-S1 and anti-S2 antibodies, and VIDAS®3 (bioMérieux) detects antibodies directed against the RBD of the S1 subunit. We assessed the sensitivity, specificity, and Cohen's Kappa, and estimated the positive and negative agreement values of the three automated assays in correlation to the gold standard RT-PCR and the sVNT. We also performed a concordance assessment among the assays. The strength of this study lies in the diversity of the sample population characteristics, with ~89% of the total population of Qatar being expatriates from over 150 countries [13–16].
All specimens used in the study were acquired in a hospital setting or professional laboratory for routine testing and shipped on ice packs to our laboratory. According to CDC recommendations, all samples were stored in a refrigerator at 4 ± 2 °C for up to 72 h after collection if a delay in shipping or processing was expected. Samples were centrifuged at 2500 rpm for 10 min to facilitate plasma/cell phase separation. The resulting upper plasma layer was extracted and either tested fresh or aliquoted to minimize future freeze-thaw cycles and stored at −80 °C for later analyses. Frozen samples were thawed on ice before analysis.
Sensitivity was determined using sera collected from 111 RT-PCR-confirmed SARS-CoV-2 patients with different COVID-19 clinical outcomes. A Qiagen RNA extraction kit was used to extract RNA from nasopharyngeal swab specimens. The extracted RNA was tested for SARS-CoV-2 using the SuperscriptIII OneStep RT-PCR kit (Cat No. 12594100, ThermoFisher, Waltham, MA, USA). Each sample was tested using three sets of primers: one set targeting the E gene for screening and the other two sets targeting the RdRp gene for confirmation, as described in [26]. Cycle threshold (CT) values below 32 were considered positive. All samples were collected ≥21 days after symptom onset. Clinical records of the patients were reviewed to determine disease severity, and the patients were categorized as (a) symptomatic (n = 56) or (b) asymptomatic (n = 51). All specimens were stored at −80 °C until use.

Neutralization Assay (sVNT)
The SARS-CoV-2 surrogate virus neutralization test (sVNT) was used as a reference in this study (Cat. No. L00847, GenScript, NJ, USA) [32,33] for detecting neutralizing antibodies. This assay was developed by GenScript® Biotech and is now commercially available in 96-well microplate format for large-scale serological screening for neutralizing antibodies targeting the RBD of the S1 subunit. Moreover, this assay demonstrated a high correlation with the pseudovirus neutralization test (pVNT, R² = 0.84) and the complete virus-neutralization test (cVNT, R² = 0.85) [33]. Validation of the sVNT showed a specificity of 99.9% and a sensitivity of 95.0–100% [33]. In this study, all SARS-CoV-2 RT-PCR-positive plasma samples were tested for neutralizing antibodies against the RBD protein using the sVNT. According to the manufacturer's instructions, a result of ≥20% signal inhibition was considered positive (neutralizing antibodies were detected), and <20% signal inhibition was considered negative (neutralizing antibodies were not detected).
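The positivity rule for the sVNT can be written as a one-line classifier. The following is a minimal sketch in Python; the function name and the example inhibition values are hypothetical and chosen only to illustrate the ≥20% cut-off quoted above.

```python
# Manufacturer's cut-off quoted in the text: >= 20% signal inhibition is positive.
SVNT_CUTOFF_PERCENT = 20.0


def classify_svnt(inhibition_percent: float) -> str:
    """Return 'positive' when signal inhibition is at or above the cut-off."""
    if inhibition_percent >= SVNT_CUTOFF_PERCENT:
        return "positive"
    return "negative"


# Hypothetical inhibition readings, for illustration only.
for value in (5.0, 19.9, 20.0, 85.3):
    print(f"{value:5.1f}% inhibition -> {classify_svnt(value)}")
```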

Statistical Analysis
The diagnostic assessment of the three automated analyzers with RT-PCR for SARS-CoV-2 resulted in three cross-tabulations for each COVID-19 patient group versus the control group. Using RT-PCR as the reference standard, overall percent agreement, sensitivity, specificity, and Cohen's Kappa statistic were calculated to assess the performance of each assay. Informed by literature, borderline results were considered positive [3,34].
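As an illustration of how these metrics follow from a 2 × 2 cross-tabulation, the sketch below computes sensitivity, specificity, overall percent agreement, and Cohen's Kappa in Python. The class and function names are hypothetical, and the example counts (100 of 111 positives detected, 6 of 127 controls reactive) are back-calculated here from the reported CL-900i® sensitivity (90.1%) and specificity (95.3%); they are an assumption for illustration, not a table taken from the study. The calculations in the study itself were performed in GraphPad Prism.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts of an index assay against the reference standard."""
    tp: int  # reference positive, assay positive
    fn: int  # reference positive, assay negative
    fp: int  # reference negative, assay positive
    tn: int  # reference negative, assay negative

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.fp + self.tn


def is_positive(call: str) -> bool:
    """Borderline results are counted as positive, as stated in the text."""
    return call.lower() in {"positive", "borderline"}


def sensitivity(t: TwoByTwo) -> float:
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    return t.tn / (t.tn + t.fp)


def overall_agreement(t: TwoByTwo) -> float:
    return (t.tp + t.tn) / t.total


def cohen_kappa(t: TwoByTwo) -> float:
    """Agreement beyond chance: (p_observed - p_expected) / (1 - p_expected)."""
    n = t.total
    p_observed = (t.tp + t.tn) / n
    p_both_positive = ((t.tp + t.fp) / n) * ((t.tp + t.fn) / n)
    p_both_negative = ((t.fn + t.tn) / n) * ((t.fp + t.tn) / n)
    p_expected = p_both_positive + p_both_negative
    return (p_observed - p_expected) / (1 - p_expected)


# Illustrative counts back-calculated from the reported percentages (assumption).
example = TwoByTwo(tp=100, fn=11, fp=6, tn=121)
print(f"sensitivity: {sensitivity(example):.3f}")
print(f"specificity: {specificity(example):.3f}")
print(f"agreement:   {overall_agreement(example):.3f}")
print(f"kappa:       {cohen_kappa(example):.3f}")
```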
Receiver operating characteristic (ROC) curve analysis was conducted to study the diagnostic performance of each assay. The area under the curve (AUC) was estimated. Statistically, the larger the AUC, the more accurate a tool can be considered in its overall performance. An AUC of 0.9-1.0 is considered excellent, 0.8-0.9 very good, 0.7-0.8 good, 0.6-0.7 sufficient, 0.5-0.6 bad, and less than 0.5 not useful [35]. The cut-off values for optimal sensitivity and specificity were determined by calculating Youden's index J (J = sensitivity + specificity − 1). The Youden index J represents the point on the curve at which the distance to the diagonal line (line of equality) is maximal [36].
Using the GenScript sVNT as the reference standard, the sensitivity of each automated analyzer was also calculated. Concordance analysis between the three automated assays along with the sVNT was conducted and resulted in 20 test combinations. The concordance measures include overall, positive, and negative percent agreement, as well as Cohen's Kappa statistic. The latter is a standard and robust metric that estimates the level of agreement (beyond chance) between two diagnostic tests. Ranging between 0 and 1, a Kappa value <0.40 denotes poor agreement, 0.40-0.59 denotes fair agreement, 0.60-0.74 denotes good agreement, and ≥0.75 denotes excellent agreement [37]. The significance level was set at 5%, and a 95% confidence interval (CI) was reported for each metric. The Pearson correlation coefficient (r) was calculated. For absolute values of Pearson's r, 0-0.19 is denoted as very weak, 0.2-0.39 as weak, 0.40-0.59 as moderate, 0.6-0.79 as strong, and 0.8-1 as very strong correlation [38]. All calculations were performed using GraphPad Prism Version 8.2.1.
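The ROC procedure described above can be sketched in a few lines of plain Python. In this sketch, a sample is called positive when its assay value is at or above the candidate cut-off, the AUC is computed as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count one half), and the optimal cut-off is the one that maximizes J. The function names and the toy data are hypothetical and are not study measurements.

```python
def roc_points(values, labels):
    """Return (cut-off, sensitivity, specificity) for every observed value.

    labels: 1 for reference-positive samples, 0 for reference-negative samples.
    A sample is called positive when its value is >= the cut-off.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if not y and v < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def auc(values, labels):
    """AUC as P(positive value > negative value), ties counted as 0.5."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(values, labels):
    """Return (cut-off, J) maximizing J = sensitivity + specificity - 1."""
    cutoff, sens, spec = max(roc_points(values, labels), key=lambda p: p[1] + p[2] - 1)
    return cutoff, sens + spec - 1


# Toy assay values (arbitrary units), for illustration only.
toy_values = [0.2, 0.4, 0.5, 0.9, 1.1, 0.8, 1.5, 2.3, 3.1, 4.0]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print("AUC:", auc(toy_values, toy_labels))
print("Youden cut-off and J:", youden_cutoff(toy_values, toy_labels))
```

On real data, a dedicated statistics package would normally be used; the point of the sketch is only that J trades sensitivity against specificity at each candidate cut-off.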
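The interpretation bands quoted above for Cohen's Kappa [37] and Pearson's r [38] can be expressed as simple lookups. The sketch below, with hypothetical function names, also includes a plain-Python Pearson r so that the example is self-contained; the two r values passed to the classifier are the CL-900i®/sVNT and VIDAS®3/sVNT coefficients reported in this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r: float) -> str:
    """Band for |r| following the ranges quoted from [38]."""
    a = abs(r)
    if a < 0.2:
        return "very weak"
    if a < 0.4:
        return "weak"
    if a < 0.6:
        return "moderate"
    if a < 0.8:
        return "strong"
    return "very strong"


def kappa_agreement(kappa: float) -> str:
    """Band for Cohen's Kappa following the ranges quoted from [37]."""
    if kappa < 0.40:
        return "poor"
    if kappa < 0.60:
        return "fair"
    if kappa < 0.75:
        return "good"
    return "excellent"


# Pearson's r values reported in this study for two assay/sVNT pairs.
for pair, r in (("CL-900i/sVNT", 0.5678), ("VIDAS 3/sVNT", 0.7535)):
    print(f"{pair}: r = {r} -> {correlation_strength(r)}")
```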
Microorganisms 2021, 9, x FOR PEER REVIEW

Figure 1.
Sensitivity of each assay in samples collected ≥21 days post symptom onset using RT-PCR as a reference test. Data are presented for 111 RT-PCR-confirmed SARS-CoV-2 samples categorized into: symptomatic (n = 56); asymptomatic (n = 51); and unclassified (n = 4); run on each automated assay: VIDAS®3, CL-900i®, and LIAISON®XL. The chi-square test was used to detect the presence of a statistically significant difference in the sensitivity of each assay between the symptomatic and asymptomatic samples. * p < 0.05, *** p < 0.0001.
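The comparison of sensitivity between symptomatic and asymptomatic samples amounts to a chi-square test on a 2 × 2 table. The following Python sketch uses the Pearson chi-square statistic without continuity correction, which is an assumption, since the exact variant used in GraphPad Prism is not stated. The counts (55 of 56 symptomatic and 41 of 51 asymptomatic samples positive) are illustrative values consistent with the upper ends of the reported sensitivity ranges and are not attributed to a specific assay.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout:
                    assay positive   assay negative
        group 1           a                b
        group 2           c                d
    Returns (statistic, p_value).
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Illustrative counts (assumption): symptomatic 55/56, asymptomatic 41/51.
stat, p = chi_square_2x2(55, 1, 41, 10)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```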

RT-PCR
The overall distribution of the values generated by each automated analyzer against the cut-offs (dashed lines) is shown in Figure 2. As depicted in the figure, only CL-900i® showed a significant difference between the symptomatic and asymptomatic samples (p = 0.0063), suggesting that CL-900i® could be used in the future as a semi-quantitative assay by performing an in-point titration curve. The continuous lines represent the median and confidence interval (CI) for each group. One-way analysis of variance (ANOVA) was used to compare the differences between groups.

Evaluation of Potential Cross-Reactivity with Other Viruses
The specificity of each automated analyzer in relation to sample cross-reactivity with antibodies against various viruses is summarized in Table 3. Of the 127 pre-pandemic control samples, eight serum samples cross-reacted: two samples cross-reacted with VIDAS®3, while the remaining six samples cross-reacted with CL-900i®. In the other-coronaviruses subgroup, CL-900i® demonstrated the lowest specificity at 66.7% (95% CI: 43.

Concordance Assessment among the SARS-CoV-2 IgG Automated Assays and the GenScript sVNT Test
The tests' agreements were studied in a pairwise fashion by applying inter-rater agreement statistics (Cohen's Kappa statistic, k).
Figure 4. Sensitivity of each assay on samples collected ≥21 days post symptom onset in patients with SARS-CoV-2 RT-PCR-confirmed infection using the sVNT as a reference test. Data are presented for 111 RT-PCR-confirmed SARS-CoV-2-positive samples categorized as: overall (n = 111); symptomatic (n = 56); and asymptomatic (n = 51); run on each automated assay: VIDAS®3, CL-900i® SARS-CoV-2, and LIAISON®XL. The chi-square test was used to detect the presence of a statistically significant difference in the sensitivity of each assay between the symptomatic and asymptomatic samples. * p < 0.05, ** p < 0.001.
A pairwise correlational analysis of the numerical values obtained by each automated IgG assay against the percentage inhibition obtained by sVNT was performed. As depicted in the correlation plots (Figure 5), all automated assays showed a moderate to strong correlation with the sVNT, with Pearson's r ranging from 0.5678 for CL-900i®/sVNT to 0.7535 for VIDAS®3/sVNT. Thus, VIDAS®3 demonstrated the best correlation with sVNT among all three automated IgG assays.

Discussion
The present study evaluated and compared the performance of three fully automated analyzers for the detection of SARS-CoV-2 IgG antibodies: CL-900i®, VIDAS®3, and LIAISON®XL. The sensitivity was evaluated using 111 samples collected from SARS-CoV-2 RT-PCR-positive symptomatic and asymptomatic patients. The specificity was evaluated using 127 pre-pandemic control samples. To assess the diagnostic performance of the three automated assays, RT-PCR was used as a reference test. In addition, for the first time, the performance of the three automated assays was assessed in correlation to an sVNT, which has recently been shown to correlate well with the conventional virus neutralization test (cVNT), the current gold standard for the detection of neutralizing antibodies [39]. In this study, convalescent plasma samples (collected ≥21 days post symptom onset or positive PCR test) were used for the evaluation. Several studies have shown that most SARS-CoV-2 antibody assays exhibit variable performance during the early phases of infection, but that concordance improves after day 14 of symptom onset, when the IgG seroconversion rate reaches 90% [12,40,41]. According to recent data on COVID-19 serology testing, the performance of serological tests was found to stabilize ≥21 days after symptom onset [42]. Moreover, previous studies have shown that convalescent COVID-19 patients have higher neutralization activity [40,43]. Hence, these convalescent samples are expected to provide a more accurate evaluation of the selected assays.
In the present study, we observed variable performance for the three automated assays. Among them, CL-900i® demonstrated the best overall performance in detecting SARS-CoV-2 IgG antibodies. The overall performance of the three assays was comparable to that of other detection methods such as Abbott Architect and Roche Cobas 6800, which were reported to have sensitivities of 93.5% and 95.2%, respectively, after 21 days of symptom onset, similar to CL-900i®, which showed the highest sensitivity (90.1%) (Table 2). Tang et al. reported a sensitivity of 89.4% for the Roche assay [41], which was comparable to the sensitivity obtained with VIDAS®3 (88.3%). Another study on VIDAS®3 reported a sensitivity of 86.7% [30], similar to our findings. Further, among the three automated assays, LIAISON®XL demonstrated the lowest sensitivity (85.6%) and failed to detect SARS-CoV-2-specific antibodies in three samples that were detected by both CL-900i® and VIDAS®3.
It is important to note that our COVID-19 cohort comprised both asymptomatic and symptomatic patients, and that most of the symptomatic cases were mild and non-hospitalized. Our study demonstrated that the sensitivity was higher in symptomatic patients than in asymptomatic patients (Table 2), which is in concordance with other studies reporting a stronger humoral immune response in severe COVID-19 patients compared to non-severe cases [43,44]. Notably, among the 111 samples collected from SARS-CoV-2 RT-PCR-positive patients, nine were negative by all three assays, of which eight were collected from asymptomatic patients and one was from a pauci-symptomatic patient. These patients may have developed a weak antibody response that was below the detection limit of the assays; thus, further investigation with other highly sensitive assays is needed. A false-positive PCR result or a high CT value (above 30 cycles) is also a plausible explanation when an RT-PCR-positive COVID-19 participant has no detectable antibodies. False-positive PCR results due to cross-contamination or purely technical artifacts have been regularly documented even in the most highly regarded laboratories [45–47].
To assess the specificity of the automated assays, we compiled pre-COVID-19 pandemic plasma samples obtained before the first appearance of the SARS-CoV-2 virus. Among the three assays, LIAISON®XL showed the highest specificity (100%), similar to a previous study from the United States that reported a 99.9% specificity [48]. However, the sensitivity of LIAISON®XL in our study using RT-PCR as the reference test was much lower (85.6%) than the one reported by the aforementioned study (100% by day 17 post symptom onset) [48]. Overall, the specificity of all three analyzers was excellent, ranging from 95.3% to 100%. This is similar to what has been reported for other automated assays such as Abbott Architect™ i2000 (95.1%) and Elecsys® Anti-SARS-CoV-2 (99.98%) [35,49,50]. This could be due to the fact that both assays (Abbott Architect and Roche cobas™) are N protein-based, and the N protein is conserved among coronaviruses, leading to cross-reactivity.
The variability in assay performance does not seem to depend on the detection method of each assay. CL-900i®, which is a CLIA-based assay, demonstrated the best performance, whereas LIAISON®XL, which is also a CLIA-based assay, showed the lowest performance among all assays (Tables 1 and 2 and Figure 1). Instead, this heterogeneity in assay performance most likely depends on the type of targeted antigen. The three automated assays were all based on different antigen components (Table 1). This is noteworthy, as antibody responses against each of these antigens may develop with variable kinetics, which remains a subject for further investigation. Our study showed higher specificities in assays targeting the S protein of SARS-CoV-2 (VIDAS®3 and LIAISON®XL) compared to the one targeting both S and N proteins (CL-900i®). This is because the N protein is relatively small and more conserved than the S protein among human coronaviruses, which could cause false-positive results through cross-reactivity [51,52]. Therefore, although targeting both S and N proteins improved the sensitivity of CL-900i®, it decreased the specificity by causing cross-reaction with other coronaviruses.
To determine which assay best correlates with neutralizing antibodies, the GenScript sVNT was used. This newly described VNT has recently been shown to perform excellently in correlation to cVNT, the current gold standard for detecting neutralizing antibodies [33]. While cVNT provides the recognized benchmark, it is not practical for large-scale implementation because it requires a live pathogen, high biosecurity containment, and highly trained personnel to perform the labor-intensive procedures. The sVNT, on the other hand, was designed to detect total neutralizing antibodies in an isotype- and species-independent manner without requiring a live virus or high biosecurity containment, thus making the test immediately accessible to the global community [33]. In the present study, VIDAS®3 demonstrated the best correlation with the sVNT in detecting IgG antibodies with neutralizing activity against SARS-CoV-2 (Figure 5), which was expected since both assays target the RBD of the S1 protein. This suggests that VIDAS®3 could be used for detecting IgG antibodies that correlate with protective immunity. Moreover, concordance assessment among the automated IgG assays and the sVNT showed a high overall percent agreement; nevertheless, a variation in the positive and negative percent agreements was observed (Table 4).
In the present study, using sVNT as a reference test, all three automated assays demonstrated a sensitivity above 90%, with the highest overall sensitivity estimated at 96.1% for CL-900i® (Figure 4). Recently, Abbott Architect was reported to have a sensitivity of 80.5% using the microneutralization test (MNT) as a reference method [53]. The variation in sensitivity values reported in most studies on the currently available commercial automated analyzers could be due in part to the variation in the time of sample collection. The sensitivity of serological tests is usually lower at early stages of infection (<7 days), and the performance starts to stabilize ≥21 days after symptom onset [8,42]. In correlation to the sVNT, LIAISON®XL had a 92.2% positive percent agreement and a 100% negative percent agreement (Table 4). This is in concordance with another study reporting positive and negative percent agreements of 94.4% and 97.8%, respectively, for LIAISON®XL using MNT as the reference method [54].
In the present study, adoption of lower cut-off values, as determined by the ROC curve analysis (Figure 3), improved the sensitivities of all assays without affecting the specificity. Thus, lower cut-off values may be used to improve the detection of SARS-CoV-2 IgG antibodies by the three assays. Other studies have also suggested using a lower cut-off for LIAISON®XL (8.76 AU/mL and 9 AU/mL) [54,55]. Whether a cut-off value that favors sensitivity is preferable to one that favors specificity at the expense of sensitivity depends on the disease prevalence. For screening purposes, higher thresholds may be desirable, whereas for diagnostic purposes in high-prevalence settings, lower thresholds are preferred. Therefore, it is recommended that each laboratory establish its own cut-off values to improve the clinical performance and avoid false-negative results.
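The dependence on prevalence can be made concrete with positive and negative predictive values derived from Bayes' rule. The Python sketch below uses the sensitivity (90.1%) and specificity (95.3%) reported here for CL-900i®; the function name and the two prevalence values (1% and 20%) are hypothetical and chosen only to contrast a low-prevalence screening setting with a high-prevalence one.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported CL-900i performance; prevalence values are hypothetical.
for prevalence in (0.01, 0.20):
    ppv, npv = predictive_values(0.901, 0.953, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

At low prevalence, most positive calls are false positives even with a specificity above 95%, which is why a more specific (higher) threshold may be desirable for screening.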
Although serological assays do not replace molecular tests in diagnosing active infection, they serve as an essential tool to accurately estimate the seroprevalence of SARS-CoV-2 infection in the general population and to quantify the level of herd immunity [56]. This could help ease the restrictions on human mobility and interactions without provoking a significant resurgence of transmission and mortality. In addition, serological tests will also help in assessing the potential effectiveness of vaccine trials and antibody-mediated therapies [33,53].
Our study has several limitations. The RT-PCR-confirmed SARS-CoV-2 samples were collected at ≥21 days post symptom onset. Thus, the results obtained for the diagnostic efficiency could have been different if samples from earlier time points (<21 days) had been available. In addition, technical problems such as insufficient sample volume may have affected the results. However, since all samples were drawn in duplicate, we were able to continue, notably by using multiple aliquots that were kept at −80 °C for later use.

Conclusions
In conclusion, the three evaluated automated assays, CL-900i® SARS-CoV-2 IgG (Mindray, China), VIDAS®3 SARS-CoV-2 IgG (bioMérieux, France), and LIAISON®XL SARS-CoV-2 IgG (Diasorin, Italy), demonstrated high overall sensitivity and specificity for the detection of IgG antibodies against SARS-CoV-2. Among the three automated assays, CL-900i® demonstrated the best diagnostic performance. In addition, VIDAS®3 correlated best with the neutralization test and thus could serve as a tool for detecting a protective IgG threshold, particularly in vaccinated populations.