Introduction

SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), has now spread worldwide [1, 2]. SARS-CoV-2 infection may be asymptomatic or may cause respiratory symptoms and other serious complications [3,4,5]. Identifying individuals with a serological response to SARS-CoV-2 provides important complementary information by estimating the fraction of individuals who have previously been infected [6, 7].

Many serological assays are currently available, and choosing the best assay may be very challenging for laboratories. Our aim in this article is to report independent evaluations that provide an overview of the clinical performance of 30 serological assays in symptomatic patients.

Methods

Thirteen AP-HP virology laboratories, located across the Paris region, were involved in the virological diagnosis of COVID-19 and in the evaluation of diagnostic assays (Supplementary Table 1).

Patients and sera

Between March and May 2020, 2594 sera were collected from symptomatic adults (not immunocompromised) previously diagnosed with COVID-19 by rRT-PCR on a respiratory sample [8]. These patients were attending COVID-19-specific consultations, hospitalization, or emergency units of Paris public hospitals. Symptoms were either severe or mild, but none of the samples were collected from asymptomatic patients as these patients were not referred to hospitals.

We stratified our analysis into 3 periods according to the time interval between onset of symptoms and serum collection:

  • 0–9 days after onset of symptoms (N = 581/2594 (22.4%));

  • 10–14 days after onset of symptoms (N = 581/2594 (22.4%));

  • > 14 days after onset of symptoms (median 22 days) (N = 1432/2594 (55.2%)).
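The stratification above can be sketched as a small helper. This is an illustration only: the function name and the period labels are hypothetical, and it assumes that the third period covers samples collected more than 14 days after onset, as stated in the figure captions, and that day 14 itself belongs to the 10–14 day period.

```python
def stratify_by_onset(days_since_onset: int) -> str:
    """Return the analysis period for a serum sample.

    Assumption: day 14 falls in the 10-14 day period and the last
    period starts at day 15 (more than 14 days after onset).
    """
    if days_since_onset < 0:
        raise ValueError("serum collection cannot precede onset of symptoms")
    if days_since_onset <= 9:
        return "0-9 days"
    if days_since_onset <= 14:
        return "10-14 days"
    return ">14 days"


# Example: a serum collected on day 22 (the median of the last period)
print(stratify_by_onset(22))
```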

A total of 1996 serum samples expected to be negative for SARS-CoV-2, as collected before the COVID-19 outbreak in France, were also tested to assess specificity. This panel included 665/1996 (33.3%) “potentially interfering sera” collected from patients with acute or chronic viral, bacterial, or malaria infections. Others were named “unselected pre-pandemic sera”.

Samples were not shared between laboratories. They were stored at −20 °C until testing and, when tested by multiple methods, were tested within the same freeze/thaw cycle.

Ethics

This work was a retrospective non-interventional study. Reclassification of biological remnants into research material was approved by the Institutional Review Board of all the Assistance-Publique-Hôpitaux-de-Paris University Hospitals participating in the study. According to the French Public Health Code (CSPArtL.1121-1.1), such protocols are exempted from individual informed consent because of the retrospective chart review design and the absence of identifying images or personal/clinical details that could compromise anonymity.

Rapid tests for qualitative detection of anti-SARS-CoV-2 antibodies (RDTs)

A total of 17 qualitative membrane-based immunoassays (CE-IVD approved) were performed according to the manufacturers’ instructions (Supplementary Table 1). All rely on lateral flow immunochromatography and are interpreted by visual inspection, except the Finecare assay, which uses a fluorescent detection conjugate with a dedicated reader. For analysis, a test was considered positive regardless of the intensity of the band.

Automated and manual ELISA/CLIA assays

A total of 13 immunoassays (IAs; CE-IVD approved) were performed according to the manufacturers’ instructions (Supplementary Table 1). For analysis, all equivocal results were considered positive.
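The rule of counting equivocal results as positive can be sketched as follows. This is a minimal illustration under stated assumptions: the result labels ("positive", "equivocal", "negative") and the function name are hypothetical and do not reflect the output format of any particular analyzer.

```python
def to_binary(result: str) -> bool:
    """Map a qualitative assay result to positive (True) or negative (False).

    Assumption: results are reported with the hypothetical labels
    "positive", "equivocal", or "negative"; equivocal counts as positive.
    """
    normalized = result.strip().lower()
    if normalized in {"positive", "equivocal"}:
        return True
    if normalized == "negative":
        return False
    raise ValueError(f"unexpected assay result: {result!r}")
```

Counting equivocal results as positive can only raise the number of positives, so it favors sensitivity at the possible expense of specificity.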

Statistics

Antibody response was assessed in a stratified analysis according to the time interval between the onset of symptoms and the date of sample collection. Sensitivity and specificity of each assay were calculated with their respective 95% confidence interval (95% CI). We compared qualitative serology results in different contexts with Pearson’s chi-squared tests (considered significant if p < 0.05).
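As an illustration of these calculations, the sketch below computes a proportion (sensitivity or specificity) with a 95% CI and a Pearson chi-squared test on a 2 × 2 table, using only the Python standard library. Several points are assumptions: the interval method is not specified here, so the Wilson score interval is used as one common choice; the chi-squared test is computed without continuity correction; and the function names and all counts in the example are hypothetical, not study data.

```python
import math


def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, z = 1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must contain at least one count")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical example: 90 true positives among 100 rRT-PCR-confirmed sera
low, high = wilson_ci(90, 100)
print(f"sensitivity 90.0% (95% CI {100 * low:.1f}-{100 * high:.1f})")

# Hypothetical example: false positives in two panels of negative sera
# rows = panel, columns = (false positive, true negative)
stat, p_value = chi2_2x2(20, 80, 5, 95)
print(f"chi-squared = {stat:.2f}, significant at 0.05: {p_value < 0.05}")
```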

Results

Each assay was evaluated with 50 to 1364 different sera (1571 (60.6%) samples were tested with both RDTs and IAs). Concerning specificity, false positive results were more frequent with “potentially interfering samples” (13.4%) than with unselected pre-pandemic sera (4.4%) (p < 0.001) (Table 1). Results for specificity and sensitivity for each assay are shown in Fig. 1 and Supplemental Table 2a for global results (IgG + IgM/IgA or TAb), Fig. 2 and Supplemental Table 3a for IgG results, and Fig. 3 and Supplemental Table 4a for IgM/IgA results.

Table 1 Detailed results for false positive results. Samples mentioned in lines 1–9 were collected from patients with another infectious disease. Respiratory infections (coronavirus, influenza…) were assessed by multiplex PCR on a respiratory sample at least 2 weeks before serum collection. Samples mentioned in line 10 were collected from patients having potentially interfering agents in their serum (rheumatoid factor or monoclonal IgG or IgM peak)
Fig. 1

Global performances of immunoassays: rapid tests for qualitative detection of anti-SARS-CoV-2 antibodies (RDTs) (white background) and automated/manual ELISA/CLIA assays (IAs) (gray background). All error bars represent 95% confidence intervals. The number of samples tested (N) is specified for each assay. a Specificity (black line represents the minimum expected specificity (98%) according to French recommendations [9]). b Percentage of samples testing positive according to time after onset of symptoms: 0–9 days, 10–14 days, and more than 14 days after onset of symptoms (median 22 days) (black line represents the minimum expected sensitivity (90%) according to French recommendations [9]). Areas where no data are shown correspond to assays that only detect IgG

Fig. 2

IgG results of immunoassays: rapid tests for qualitative detection of anti-SARS-CoV-2 antibodies (RDTs) (white background) and automated/manual ELISA/CLIA assays (IAs) (gray background). All error bars represent 95% confidence intervals. The number of samples tested (N) is specified for each assay. a Specificity (black line represents the minimum expected specificity (98%) according to French recommendations [9]). b Percentage of samples testing positive according to time after onset of symptoms: 0–9 days, 10–14 days, and more than 14 days after onset of symptoms (median 22 days) (black line represents the minimum expected sensitivity (90%) according to French recommendations [9]). Areas where no data are shown correspond to assays that only detect TAb

Fig. 3

IgM/IgA results of immunoassays: rapid tests for qualitative detection of anti-SARS-CoV-2 antibodies (RDTs) (white background) and automated/manual ELISA/CLIA assays (IAs) (gray background). All error bars represent 95% confidence intervals. The number of samples tested (N) is specified for each assay. a Specificity (black line represents the minimum expected specificity (98%) according to French recommendations [9]). b Percentage of samples testing positive according to time after onset of symptoms: 0–9 days, 10–14 days, and more than 14 days after onset of symptoms (median 22 days) (black line represents the minimum expected sensitivity (90%) according to French recommendations [9]). Areas where no data are shown correspond to assays that detect neither IgM nor IgA

Results of rapid tests for qualitative detection of anti-SARS-CoV-2 antibodies (RDTs)

RDTs achieved between 77.4 and 100.0% TAb specificity, with only 6 RDTs meeting the French recommendation of > 98% [9]. By 15 days after onset of symptoms, most RDTs (12/17) reached the expected sensitivity of > 90% [9]. Only 4 RDTs met both sensitivity and specificity criteria: Finecare, NADAL, AAZ, and Orientgene.

Results of automated and manual ELISA/CLIA assays

Considering TAb or IgG (for assays detecting IgG only), assays achieved between 58.8 and 100.0% specificity, with only 5/13 IAs meeting the French recommendation (> 98%) [9]. By 15 days after onset of symptoms, 9/13 IAs reached the expected sensitivity of > 90% [9]. Only 3 IAs met both sensitivity and specificity criteria: Biorad (TAb), Elecsys anti-SARS-CoV-2 (TAb), and Abbott Architect (IgG).

Serology results depending on disease severity and age of patients

For 783/2494 (31.4%) patients (samples collected between day 0 and day 91 after onset of symptoms), information on the need for hospitalization was available: 318/783 (40.6%) required hospitalization and 465/783 (59.4%) were discharged from the emergency unit or consultation with mild disease. For these patients, results of IgG + IgM or TAb are reported in Table 2 (a serum was considered positive if at least one assay was positive). More than 14 days after onset of symptoms, 84.8% of non-hospitalized patients had positive serology, compared with 95.5% of hospitalized patients (p < 0.05).
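The combination rule used here (a serum is positive if at least one assay is positive) can be sketched as follows. The function name and the input layout, one list of per-assay boolean results per serum, are hypothetical, and the example values are not study data.

```python
def percent_positive(sera: list[list[bool]]) -> float:
    """Percentage of sera that are positive with at least one assay.

    Each inner list holds the per-assay results for one serum.
    A serum with no positive result in its list counts as negative.
    """
    if not sera:
        raise ValueError("at least one serum is required")
    positives = sum(any(assay_results) for assay_results in sera)
    return 100 * positives / len(sera)


# Hypothetical group of four sera, each tested with one or two assays
group = [[False, True], [False, False], [True, True], [False]]
print(percent_positive(group))
```

With this rule, a serum tested with several assays has more opportunities to be classified as positive than a serum tested with a single assay, which should be kept in mind when comparing groups.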

Table 2 Percentage of positive sera in hospitalized and non-hospitalized patients. A serum was considered positive if at least one assay was positive

Concerning patient age, the difference was significant for serology performed more than 14 days after onset of symptoms: 94.0% of older patients (> 50 years old) had positive serology, compared with 86.5% of younger patients (< 50 years old) (p < 0.05) (Supplemental Table 5).

Discussion

Although several assays reach the minimum expected specificity of 98%, confidence intervals should not be overlooked, as several of them are quite wide. Our RDT results suggest that IgG detection is more specific for SARS-CoV-2 infection (33/1996; 1.7%), whereas IgM from other infections or patient background may introduce specificity concerns (54/1996; 2.7%). Consequently, we suggest that any RDT result positive for IgM only be investigated as soon as possible with nasopharyngeal rRT-PCR and/or subsequent serological sampling to look for IgG seroconversion. Previous works evaluating a panel of many serologic assays reported a range from 95 to 100% for sensitivity and specificity [10,11,12,13,14,15,16,17]. Concerning specificity, we included a large number of “potentially interfering sera” compared with these studies, which usually use samples collected from healthy blood donors. Our aim was to identify potential cross-reactivities, and indeed, false positive results, both for RDTs and IAs, were more frequent with “potentially interfering samples” than with unselected pre-pandemic sera. This certainly explains why the specificity reported in our study is lower than in previous publications; it would certainly have been higher if only sera collected from healthy individuals had been used.

The main limitation of serology is that sensitivity is low before 10 days after onset of symptoms. Indeed, in our evaluation, even at 10–14 days after onset of symptoms, only 6/17 RDTs and 6/13 IAs reached 90% sensitivity. However, in line with previous reports, most assays in our study, both RDTs and IAs, achieved sensitivity higher than 90% for sera collected more than 15 days after onset of symptoms [10,11,12,13,14,15,16,17]. Although some current reports show that neutralizing antibodies decline in convalescent individuals within 2 to 5 months after infection, identifying patients with a history of SARS-CoV-2 infection might be particularly useful to spare vaccine doses: it was recently described that such patients developed high titers of neutralizing antibodies after receiving only one vaccine dose, with titers similar to or higher than those of uninfected individuals who received two vaccine doses [18,19,20,21] (https://doi.org/10.1101/2021.01.29.21250653).

Our study has some limitations. Selection of sera was based on samples collected from symptomatic patients with a positive rRT-PCR on upper respiratory tract specimens. More studies are needed to address whether asymptomatic patients, or patients with chest imaging compatible with COVID-19 but negative rRT-PCR, have a different antibody response that could influence assay performance. RDTs, which are intended to be used as point-of-care devices and therefore with capillary blood, were evaluated only with serum samples in our study. Again, additional investigations are needed to provide information on which of these assays are reliable enough to be used in clinical practice. Despite the overall large number of samples, several assays were underpowered for true sensitivity and specificity assessment, and the wide confidence intervals may reflect sample size in this study more than assay performance. As our evaluation was implemented in the first weeks of the French outbreak, most samples were collected during the acute phase of the illness; long-term follow-up and later collection of samples will soon allow assessment of the long-term persistence of specific antibodies.
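The effect of sample size on interval width can be illustrated with a short worked calculation. It is a sketch under stated assumptions: it uses the normal-approximation (Wald) half-width, which may differ from the interval method used in this study and is only approximate for proportions close to 100%; it takes an observed specificity of 98% (the recommended minimum) as an example value; and it uses N = 50 and N = 1364, the smallest and largest numbers of sera tested per assay.

```latex
\[
  \text{half-width} \approx z \sqrt{\frac{\hat{p}\,(1-\hat{p})}{N}}, \qquad z = 1.96
\]
\[
  N = 50:\quad 1.96 \sqrt{\frac{0.98 \times 0.02}{50}} \approx 0.039
  \qquad\qquad
  N = 1364:\quad 1.96 \sqrt{\frac{0.98 \times 0.02}{1364}} \approx 0.0074
\]
```

Under these assumptions, the same observed specificity carries an uncertainty of roughly ±3.9 percentage points with 50 sera but only about ±0.7 percentage points with 1364 sera, so an assay evaluated on a small panel cannot be reliably placed above or below the 98% threshold.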

Overall, our findings provide reassurance that several RDTs/IAs are suitable to detect specific antibodies against SARS-CoV-2 with high levels of sensitivity and specificity more than 14 days after onset of symptoms, but also highlight that assays should be assessed before implementation to ensure analytical capabilities are as needed for the clinical purpose.