Evaluation of a Broad Panel of SARS-CoV-2 Serological Tests for Diagnostic Use

Serological testing is crucial for detecting previous infection and for monitoring convalescent and vaccine-induced immunity. During the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, numerous assay platforms have been developed and marketed for clinical use. Several studies recently compared the clinical performance of a limited number of serological tests, but a broad comparative evaluation is currently missing. In this study, a panel of 161 sera from SARS-CoV-2-infected, seasonal CoV-infected and SARS-CoV-2-naïve subjects was used to evaluate 16 ELISA/ECLIA-based and 16 LFA-based tests. Specificities of all ELISA/ECLIA-based assays were acceptable and generally in agreement with the providers’ specifications, but sensitivities were lower than specified. Results of the LFAs were less accurate than those of the ELISAs, albeit with some exceptions. We found a sporadic unequal immune response to different antigens and thus recommend the use of a nucleocapsid protein (N)- and spike protein (S)-based test combination when maximal sensitivity is necessary. Finally, the quality of the immune response in terms of neutralization should be tested using S-based IgG tests.


Introduction
Serological testing is the standard diagnostic technique to detect past infection with various pathogens and is widely used to monitor vaccine-induced immunity. Immunoassays such as enzyme-linked immunosorbent assay (ELISA), electrochemiluminescence immunoassay (ECLIA), lateral flow rapid test (LFA), immuno-PCR (iPCR) or recombinant immunofluorescence assay (rIFA) provide methods for the detection of antigen-specific immunoglobulins (Ig), each with different strengths and weaknesses [1].
To date, several unknowns remain with respect to the significance of antibody titers in the progression of coronavirus disease 2019 (COVID-19) in the ongoing SARS-CoV-2 pandemic [2]. Whilst neutralizing antibodies can reduce viral loads, antibodies directed against viral surface proteins may be disadvantageous due to mechanisms like antibody-dependent enhancement (ADE) [3,4]. Thus, careful analysis of antibody titers and their correlation with other disease parameters will help to elucidate the importance of antibodies in COVID-19 (Glueck et al., in press [5]). Along these lines, serological testing is of utmost importance in monitoring B-cell responses to SARS-CoV-2 vaccines. Depending on the antigen delivered by a vaccine, several tests using different antigens are necessary to quantify the magnitude and longevity of vaccine-induced antibody titers. Moreover, serological assays are crucial in identifying plasma donors for therapeutic interventions and potentially in monitoring levels of therapeutic antibodies in plasma during passive immunization approaches [6].
Here, we compare 13 ELISA/ECLIA assays from six different commercial providers and three in-house tests for detection and quantification of SARS-CoV-2-directed antibodies of different isotypes. Furthermore, we tested 16 LFA-based assays, which can be used in point-of-care (PoC) rapid testing or in a two-step confirmatory test scenario. As a measure of diagnostic accuracy, we determined sensitivity and specificity of the assays with respect to defined negative and positive sera. We include platform-dependent assays and easy-to-perform benchtop tests and point out advantages and disadvantages of both approaches.

Materials and Methods
Serum samples of pseudonymized donors were selected from a serum panel that we recently used to validate an in-house SARS-CoV-2 ELISA [7]. The panel consisted of five sample groups. For calculation of the test specificity, we analyzed 60 serum samples that were collected before the SARS-CoV-2 outbreak (SARS-CoV-2 naïve group). This negative panel included 31 sera from patients who had tested PCR-positive for seasonal CoV and had confirmed reactivity against the corresponding N-antigen (seasonal CoV group). A total of 101 sera taken during March and April 2020 from hospitalized COVID-19 patients with PCR-proven SARS-CoV-2 infection were used to calculate sensitivity for three different time points after a positive SARS-CoV-2 reverse transcription polymerase chain reaction (RT-PCR) result: group 1 (0-5 days post PCR-positivity [dppp]; n = 30), group 2 (6-10 dppp; n = 30) and group 3 (>10 dppp; n = 41). This procedure was approved by the Ethical Committee of the University of Regensburg (ref. no. 20-1854-101).
Using this serum panel, we evaluated one ECLIA, 12 commercial ELISAs, three in-house ELISAs and two LFAs (Table 1). The selection of tests was based on availability in Germany as of 1st May 2020. These assays were based on recombinant viral proteins, used either the N-antigen or the S-antigen, and detected different isotypes (total Ig: n = 2, IgG: n = 6, IgA: n = 5, IgM: n = 3). All LFAs detected IgG and IgM simultaneously. For fourteen additional LFAs, we used a reduced panel of 19 sera due to limited availability of tests at that time. The samples were chosen randomly within the five different sample groups. All tests were carried out according to the manufacturers' instructions. Our recently published ELISA protocol [7] was modified by using a commercial substrate solution (3,3′,5,5′-tetramethylbenzidine substrate, Mikrogen). The ELISAs were analyzed using a microplate reader (Model 680, Bio-Rad, Hercules, CA, USA) and the ECLIA was performed on a cobas e 801 analytical unit (Roche, Mannheim, Germany). The cutoff values and limits were calculated according to the manufacturers' specifications. For subsequent comparative analyses, the data were normalized according to Equation (1). Here, a (minimal value) and b (maximal value) were set to 0 and 0.89 for negative values, 0.9 and 1.1 for borderline results and 1.01 and 10 for positive results, respectively. Max(x) and min(x) were set separately for negative, borderline and positive values according to the manufacturers' specifications. For better interpretation and comparison of the LFA results, we introduced a semiquantitative score based on the color intensity of the bands (0 for negative, 1 to 3 for positive), which was assigned by the experimenter upon visual inspection. According to the manufacturers' recommendations, any weak shade of color was counted as positive.
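Equation (1) itself is not reproduced in this text. As an illustration only, the following sketch assumes that it is a standard min-max rescaling, which maps a raw value x from the raw range [min(x), max(x)] of its result category onto the target range [a, b] given above. The function names and the raw ranges in the usage example are hypothetical; the target ranges are copied from the text as stated.

```python
# Assumed form of Equation (1): x_norm = a + (x - min(x)) * (b - a) / (max(x) - min(x))
# Target ranges [a, b] per result category, as stated in the Methods text.
TARGET_RANGES = {
    "negative": (0.0, 0.89),
    "borderline": (0.9, 1.1),
    "positive": (1.01, 10.0),
}


def rescale(x, raw_min, raw_max, a, b):
    """Min-max rescaling of x from [raw_min, raw_max] onto [a, b]."""
    if raw_max == raw_min:
        raise ValueError("raw_min and raw_max must differ")
    return a + (x - raw_min) * (b - a) / (raw_max - raw_min)


def normalize(x, category, raw_min, raw_max):
    """Rescale a raw assay value within its category (negative/borderline/positive)."""
    a, b = TARGET_RANGES[category]
    return rescale(x, raw_min, raw_max, a, b)


if __name__ == "__main__":
    # Hypothetical assay whose positive raw values span 1.1 to 100 (manufacturer limits).
    print(normalize(50.0, "positive", raw_min=1.1, raw_max=100.0))
```

Rescaling each category separately keeps the negative, borderline and positive calls of every assay unchanged while placing assays with different dynamic ranges on a common scale.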
For calculation of the specificity and sensitivity, borderline results (based on the manufacturers' recommendations) were regarded as negative. For correlation of assay results with virus neutralization titers (IC50), we used the recently published data of the 22 samples analyzed in a SARS-CoV-2 virus neutralization assay [7].
Experimental data were evaluated and plotted using GraphPad Prism (GraphPad Prism version 9.0.2 for Windows, GraphPad Software, San Diego, CA, USA). Confidence intervals were calculated using the Wilson score method [8].
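The accuracy calculation described above can be sketched as follows. This is a minimal illustration, not the GraphPad Prism workflow used in the study: the helper names are hypothetical, the 95% confidence level is an assumption (the level is not stated in the text), and the breakdown of calls in the usage example is invented for demonstration.

```python
import math


def is_positive(call):
    """Borderline results are regarded as negative, as described in the Methods."""
    return call == "positive"


def sensitivity(calls_infected):
    """Fraction of sera from infected donors that are called positive."""
    return sum(is_positive(c) for c in calls_infected) / len(calls_infected)


def specificity(calls_naive):
    """Fraction of sera from naive donors that are not called positive."""
    return sum(not is_positive(c) for c in calls_naive) / len(calls_naive)


def wilson_interval(successes, n, z=1.959963984540054):
    """Wilson score interval for a proportion (z defaults to a two-sided 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical negative panel of 60 sera: 54 negative, 2 borderline, 4 positive calls.
    naive_calls = ["negative"] * 54 + ["borderline"] * 2 + ["positive"] * 4
    print(specificity(naive_calls), wilson_interval(56, 60))
```

Unlike the simple normal approximation, the Wilson interval stays within [0, 1] and remains informative when the observed proportion is 100%, which matters here because several assays reached a specificity of 100% on 60 sera.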

Results
In this cross-sectional study, a panel of 161 sera from 161 donors was analyzed with 16 ELISA/ECLIA assays and two LFAs (Figure 1a,b). Because the signal-to-cutoff (S/CO) values of the assays span different dynamic ranges (Roche over 3 logs; ELISAs over 1-2 logs), the data were normalized.

Determination of Specificity of the Analysed Assays
Based on the results of the SARS-CoV-2 negative panel (n = 60), we calculated specificities for all assays (Figure 1a, Table 2). The specificities ranged between 93.3% (Euroimmun: IgA) and 100% (Roche; Epitope: IgG, IgM; Virotech: IgA; Serion: IgG, IgA; UKR: IgG, IgM; NvM01: IgG, IgM; NvM03: IgG) and generally deviated only slightly from the manufacturers' information. Relative to the number of assays per isotype, IgG tests showed fewer false positive results (n = 5) than IgM (n = 4) and IgA (n = 7) tests. There was no evidence of generally cross-reactive sera, since only three (N7, N17, N23) of the 13 sera with false positive results tested positive repeatedly with assays from different providers. Within the seasonal CoV group (n = 31), four assays (Bio-Rad, Virotech: IgG, Mikrogen: IgA, NvM03: IgM) each produced a single false positive result (S1, S15, S3, S22).

Notes to Table 2. Sensitivity (manufacturer): d = days post symptom onset. Sensitivity (as determined): d = days post positive RT-PCR. n = 89: 12 RT-qPCR-positive sera that showed positive results in fewer than two assays were excluded. Low/high p: values result from adjustment of the assay cutoff to low/high prevalence.

Determination of the Sensitivity of the Assays
The sensitivities (Table 2) ranged between 36.7% (Virotech IgM) and 87.8% (Euroimmun IgG, IgA; UKR IgG; NvM03 IgG), depending on the isotype and the time point after positive SARS-CoV-2 RT-qPCR. In general, IgG-based assays showed higher sensitivities (50%-87.8%). As expected, and in accordance with previous findings, the proportion of sera that tested positive increased at the later time points [9]. The tests for IgM and IgA did not show such a clear trend; their sensitivities increased or decreased across the time intervals. The pan-Ig assays showed their highest sensitivities at 6-10 dppp (Bio-Rad: 86.7%) or at >10 dppp (Roche: 80.5%). However, when the 12 sera that showed positive results in fewer than two assays were excluded (n = 89 vs. n = 101), sensitivities increased and ranged between 45.8% (Virotech IgM) and 96.3% (Bio-Rad, NvM01 IgM).
Following the recommendations of the Centers for Disease Control and Prevention to adapt tests to the prevalence of the target population [10], Virotech proposes two cutoffs for the IgG test indices: a high-prevalence cutoff for better sensitivity, and a low-prevalence cutoff for better specificity in populations with a predicted lower infection rate. With the low-prevalence cutoff, the specificity within our panel improved from 96.7% to 98.3% (one fewer false positive), whereas with the high-prevalence cutoff sensitivity rose by up to 13 percentage points (four additional positives in group 1 and two additional positives each in groups 2 and 3).
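The rationale for prevalence-adapted cutoffs can be illustrated with predictive values, which follow from sensitivity, specificity and prevalence by Bayes' theorem. The sketch below is an illustration only and not part of the study's analysis: the sensitivity of 80% and the prevalence of 1% are hypothetical, and sensitivity is held fixed for simplicity although it also changes with the cutoff. Only the two specificities (96.7% and 98.3%) are taken from the text.

```python
def positive_predictive_value(sens, spec, prevalence):
    """Probability that a positive result is a true positive."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sens, spec, prevalence):
    """Probability that a negative result is a true negative."""
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Hypothetical low-prevalence population (1%) and hypothetical sensitivity (80%).
    for spec in (0.967, 0.983):
        ppv = positive_predictive_value(0.80, spec, 0.01)
        print(f"specificity {spec:.3f}: PPV {ppv:.3f}")
```

At low prevalence most positive results come from the large uninfected majority, so even a small gain in specificity raises the positive predictive value noticeably; at high prevalence, missed infections dominate and the more sensitive cutoff becomes preferable.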

Agreement between the Different Serological Assays
We then calculated the overall percent agreement for all test combinations by comparing the results of the positive panel (Figure 2). The pan-Ig and IgG tests showed the best agreement (79-97%), with overall better agreement between tests using the same antigen (N: 84-94%, S: 83-97%). The agreement for IgA and IgM tests ranged from 71-93% and 61-93%, respectively.
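Overall percent agreement between two assays is the share of sera on which both assays give the same qualitative call. A minimal sketch of the pairwise calculation is shown below; the function names, assay labels and calls are hypothetical and serve only to show the mechanics.

```python
from itertools import combinations


def percent_agreement(calls_a, calls_b):
    """Overall percent agreement: share of sera with identical calls in both assays."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("both assays need calls for the same, non-empty set of sera")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100.0 * matches / len(calls_a)


def pairwise_agreement(results):
    """Percent agreement for every pair of assays in a {assay name: calls} mapping."""
    return {
        (first, second): percent_agreement(results[first], results[second])
        for first, second in combinations(results, 2)
    }


if __name__ == "__main__":
    # Hypothetical calls (True = positive) of three assays on the same four sera.
    results = {
        "assay_A": [True, True, False, True],
        "assay_B": [True, False, False, True],
        "assay_C": [True, True, True, True],
    }
    for pair, value in pairwise_agreement(results).items():
        print(pair, value)
```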

Correlation with Neutralisation Titer
Next, we correlated the assays' results of a smaller panel of sera (n = 22) with previously determined SARS-CoV-2 virus neutralization titers (IC50, Figure 1c). Here, LFAs were excluded due to their semiquantitative scoring. R² values ranged from 0.28 (Roche) to 0.87 (UKR: IgG, Serion: IgG). In general, the IgG tests showed the best correlation, with higher values for the tests based on the S-antigen. The lower correlation coefficients of the pan-Ig tests (Roche and Bio-Rad) may be attributed to the simultaneous measurement of IgM and IgA, which showed a generally lower correlation with virus neutralization.
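The text does not state which regression model or data transformation underlies the reported R² values. As a hedged illustration, the sketch below assumes a simple linear regression, for which R² equals the squared Pearson correlation coefficient; the function name and the example values are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x.

    For one predictor this equals the squared Pearson correlation coefficient.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must not be constant")
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical normalized assay values and neutralization titers for five sera.
    assay_values = [1.2, 2.5, 3.1, 4.8, 6.0]
    titers = [20.0, 45.0, 50.0, 90.0, 110.0]
    print(r_squared(assay_values, titers))
```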

Evaluation of an Additional Panel of LFAs
Finally, we used a panel of 19 sera to evaluate 14 additional LFAs from 14 providers (Figure 1d). The tests were selected based on their availability by 1st May 2020. Three naïve sera, four seasonal CoV sera and four samples for each time point were selected randomly, and for every sample we set the antibody status according to our findings with the ELISAs for each antigen. A sample was considered positive if more than 50% of the assays for the corresponding antigen showed a positive result, unclear if exactly 50% were reactive, and negative if less than 50% were positive. Due to the low number of samples, sensitivity and specificity were not determined. Within the negative samples (n = 7), three LFAs produced false positive results: ZECEN for two samples (N12, S15), and Boson (N6) and Dialab (S15) for one sample each. Interestingly, these sera also tested false positive in several ELISAs. This may be attributed to cross-reactive antibodies against the antigens used in these tests, or to matrix effects. Within the positive serum panel (n = 12), six LFAs showed insufficient performance with varying proportions of false negative results (ZECEN, Chemtron, RapiGEN, Türklab, AmonMed, Affimedix).
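The majority rule used above to assign a reference antibody status per sample and antigen can be written compactly. The sketch below is illustrative; the function name is hypothetical, and the comparison uses integer arithmetic so that the 50% boundary is evaluated exactly.

```python
def reference_status(elisa_calls):
    """Reference status of one serum from the ELISA calls for one antigen.

    elisa_calls: list of booleans (True = positive result in that ELISA).
    Returns "positive" (>50% positive), "unclear" (exactly 50%) or "negative" (<50%).
    """
    if not elisa_calls:
        raise ValueError("at least one ELISA result is required")
    positives = sum(1 for call in elisa_calls if call)
    total = len(elisa_calls)
    if 2 * positives > total:
        return "positive"
    if 2 * positives == total:
        return "unclear"
    return "negative"


if __name__ == "__main__":
    # Hypothetical serum tested with four ELISAs for the same antigen.
    print(reference_status([True, True, False, True]))
```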

Discussion
Recently, a number of studies have compared and validated smaller panels (n = 1-7) of SARS-CoV-2 serological tests for diagnostic use [9,11-25]. In the present study, we intended to systematically cross-compare a large number of SARS-CoV-2 serological tests, and thus analyzed a broad panel of SARS-CoV-2 ECLIA-, ELISA-, and LFA-based tests that were available in May 2020, using a relatively large number of sera from patients with COVID-19 and from controls. In general, we found comparable performance of the tests in terms of specificity, which was in accordance with the manufacturers' specifications. Sensitivities were generally lower than specified by the manufacturers. Presumably, this was due to a number of sera with reactivity below the cutoff in all assays. When sera with reactivity in fewer than two tests (n = 12) were removed, sensitivities were comparable to the manufacturers' specifications. A more heterogeneous picture emerged when comparing the 14 additional LFAs, although a smaller panel of test sera was used. Thus, we conclude that LFA tests should be carefully validated before being used for screening of larger cohorts. Further, verification and quantitative analysis should be performed using other assay formats. A suitable application of LFAs may be decentralized monitoring of vaccine-induced immunity, where simple PoC tests are advantageous. In such cases, assays using whole blood may be preferable to serum tests for reasons of feasibility, but equivalent performance should be ensured. However, a clear limitation of LFAs is the merely semiquantitative test result.
When correlating the results of the different tests with the real-virus neutralizing capacity of the sera, we could show superiority of S-based tests. This is to be expected, as the spike protein, and its receptor-binding domain (RBD) in particular, is the main target of neutralizing antibodies, which compete with the receptor on the cell surface. Thus, for determination of the potentially protective quality of antibodies, an S-based test should be preferred. Furthermore, IgG reactivities correlate best with neutralization, in accordance with recent findings from others [26].
Serological tests can help to discriminate (the rate of) breakthrough infections after SARS-CoV-2 vaccination. The preferred test system here depends on the molecular composition of the applied vaccine. For example, the antigen of the currently applied RNA vaccines (e.g., Comirnaty BNT162b2, by BioNTech/Pfizer) is the S-protein and, thus, an active or past SARS-CoV-2 infection after vaccination can be detected by an N-based test. In regions with high seroconversion rates, pre-vaccination screening for already existing antibodies may save scarce vaccine resources.
For some samples, we found an immune response biased towards one of the tested antigens. Such a constellation may be even more frequent in samples taken at later time points after infection, since the stability of the humoral immune response may differ between the antigens. Therefore, in a scenario where maximal sensitivity is necessary, e.g., in epidemiological studies in areas of low prevalence, a combination of S- and N-based assays with high sensitivity is recommended for screening. Subsequently, deviating, low and borderline signals may be confirmed by additional testing with highly specific serological tests or by using alternative test systems like real virus or pseudotype-based neutralization assays or immunofluorescence.
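The effect of combining an S-based and an N-based assay for screening can be sketched with a simple "either test positive" rule. This is an illustration under stated assumptions: the combination rule, the function names and the calls below are hypothetical and do not reproduce data from this study.

```python
def combine_or(s_calls, n_calls):
    """Call a serum positive if the S-based OR the N-based assay is positive."""
    if len(s_calls) != len(n_calls):
        raise ValueError("both assays need calls for the same sera")
    return [s or n for s, n in zip(s_calls, n_calls)]


def positive_rate(calls):
    """Share of positive calls; equals sensitivity when all sera are from infected donors."""
    return sum(calls) / len(calls)


if __name__ == "__main__":
    # Hypothetical calls for five sera from infected donors (True = positive).
    s_calls = [True, True, False, True, False]
    n_calls = [True, False, True, True, False]
    combined = combine_or(s_calls, n_calls)
    print(positive_rate(s_calls), positive_rate(n_calls), positive_rate(combined))
```

With this rule the combined sensitivity can never be lower than that of either single assay, but false positives of both assays also add up, which is why confirmation of deviating, low and borderline signals with highly specific tests is advisable.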
Furthermore, in epidemiological follow-up studies aiming to quantify the longevity of the humoral immune response, pan-Ig assays should be avoided, since the kinetics of the different isotypes cannot be distinguished. In such studies, a pan-Ig test may be used for screening of positive cases, and confirmatory, quantifying and qualifying (isotype-specific response) measurements can be performed subsequently.
As a limitation of this study, mostly sera from hospitalized subjects were used due to limited availability of outpatient material. This may have biased the determination of the accuracy of the tests. Currently, the published datasets (including ours) lack independent evaluation with standard serum panels. Very recently, such reference material has been made available from the National Institute for Biological Standards and Control (NIBSC) [27]. A helpful extension of such standard sample material would be additional panels of reference material, including a number of weakly reactive sera to allow for evaluation of the precision of different tests. Along these lines, frequent independent round robin test studies are indispensable to ensure diagnostic accuracy and precision of laboratory testing.

Conclusions
In addition to nucleic acid and antigen testing for detection of acute infection, serological testing is able to detect past infections and is thus essential for epidemiological surveillance and for analysis of transmission patterns and patient contacts; it can also help to identify asymptomatic cases. To this end, accuracy and precision of serological tests should be determined, and the specifications and characteristics of the test used should be appropriate for the scenario addressed. In our study, we found superior performance of ELISAs compared with LFAs, a better correlation of S-based tests with virus neutralization, and fewer false positive results as well as better sensitivities of IgG tests compared with IgM and IgA tests. In our collection of sera, we identified a few samples with an immune reaction biased towards the N- or S-antigen. To avoid false negative testing of such cases, a combination of tests against both antigens is recommended.
Finally, information about protective antibody levels in convalescent and postvaccination sera against different emerging mutant strains is currently growing [28,29]. Serological tests with the potential to discriminate the status of protection against these novel strains will be indispensable to safeguard high-risk groups through the upcoming period of genetic drift of SARS-CoV-2, and to provide guidance for adapted vaccine strategies.

Informed Consent Statement:
Since the present study is retrospective, informed consent was not required.

Conflicts of Interest:
The authors declare no conflict of interest.