Review Over a 3-Year Period of European Union Proficiency Tests for Detection of Staphylococcal Enterotoxins in Food Matrices

Staphylococcal food poisoning outbreaks are a major cause of foodborne illnesses in Europe and their notifications have been mandatory since 2005. Even though the European regulation on microbiological criteria for food defines a criterion on staphylococcal enterotoxin (SE) only in cheese and dairy products, European Food Safety Authority (EFSA) data show that various types of food matrices are involved in staphylococcal food poisoning outbreaks. The European Screening Method (ESM) of the European Union Reference Laboratory for Coagulase Positive Staphylococci (EURL CPS) was validated in 2011 for SE detection in food matrices and is currently the official method used for screening purposes in Europe. In this context, the EURL CPS organizes annual Inter-Laboratory Proficiency Testing Trials (ILPTs) to evaluate the competency of the European countries’ National Reference Laboratories (NRLs) to analyse SE content in food matrices. A total of 31 NRLs representing 93% of European countries participated in these ILPTs. Eight food matrices were used for the ILPTs over the period 2013–2015, including cheese, freeze-dried cheese, tuna, mackerel, roasted chicken, ready-to-eat food, milk, and pastry. Food samples were spiked with four SE types (i.e., SEA, SEC, SED, and SEE) at various concentrations. Homogeneity and stability studies showed that ILPT samples were both homogeneous and stable. The analysis of results obtained by participants for a total of 155 blank and 620 contaminated samples allowed for evaluation of the trueness (>98%) and specificity (100%) of ESM. Further to the validation study of ESM carried out in 2011, these three ILPTs allowed for the assessment of the proficiency of the NRL network and the performance of ESM on a large variety of food matrices and samples. The ILPT design presented here will be helpful for the organization of ILPTs on SE detection by NRLs or other expert laboratories.

In fact, SEs produced by coagulase-positive staphylococci (CPS), mainly Staphylococcus aureus, have superantigenic and emetic activities, leading to toxic shock syndrome and staphylococcal food poisoning [6,7]. They are active in nanogram to microgram quantities and resist environmental conditions, such as high or low temperature and pH, that easily kill bacteria. Moreover, SEs are resistant to proteolytic enzymes, hence retaining their activity in the digestive tract after ingestion.
The concentrations for each tested SE were selected based on those measured during the investigation of staphylococcal food poisoning outbreaks (SFPOs) and during routine and official control analyses performed in our laboratory. In addition, a naturally contaminated cheese was also used in the ILPT organized in 2013.
This article will describe the experimental design of European ILPTs, including sample preparation, homogeneity, and stability studies. The results obtained by participants and the performance of the EURL network over three years of SE detection in food matrices will be discussed.

Results and Discussion
Since 2001, the EURL for CPS has been accredited (accreditation scope no. 1-2246 available at www.cofrac.fr) for SE detection in food products, according to Standard NF EN ISO CEI 17025 [23]. This accreditation covers the use of ESM. Proficiency tests were performed according to the specifications of EN ISO IEC 17043 and ISO Guide 43 [24].

Proficiency Test Items and NRL Network Participation
Epidemiological data were investigated in order to determine the main food matrices, toxin types, and contamination levels involved in SFPOs [1-6,18]. Thus, the capacity of the NRL network to detect SEs in food was assessed in eight matrices covering five food categories: ready-to-eat food, meat, milk products, pastry, and fish. SEA, SEC, SED, and SEE, identified in several food poisoning outbreaks in Europe, were selected for sample contamination. In the three ILPTs, a blank and two spiking levels were used (Section 2.2), each level being applied to the five food categories. Over three years, this ILPT scheme provided a valuable data set that complements the data obtained during the validation study of ESM, which used a dialysis concentration step and both the Vidas® SET2 and RIDASCREEN® SET Total kits [21,22].
Out of 29 European member states and associated countries, 28, 27, and 27 participated in the ILPTs dedicated to SE detection using ESM in food matrices organized in 2013, 2014, and 2015, respectively. Overall, 31 NRLs participated in the ILPTs, corresponding to an NRL participation rate of at least 86%.

Homogeneity and Stability Studies
As discussed in Section 4.4, quantitative criteria have been used for the assessment of homogeneity data. Depending on the concentrations of toxins used for sample preparation and on the resulting raw data, three contamination levels were distinguished, as follows: Blank level, representing unspiked samples; Level 1, representing samples contaminated at very low SE concentration associated with low raw data (test value (TV) or absorbance unit (AU) < 0.9); Level 2, representing samples contaminated at low SE concentration associated with higher raw data (TV or AU > 0.9).

Homogeneity Study
Based on qualitative criteria, SEs were not detected in 100% of the blank samples, and SEs were detected in 100% of the contaminated samples, regardless of the assay used (Tables 1 and 2). Therefore, these samples were considered to be homogeneous for a qualitative analysis.
For the Vidas SET2 kit, mean values ranged from 0.52 to 0.76 TV and from 0.98 to 1.25 TV for levels 1 and 2, respectively. For the Ridascreen SET Total kit, mean values ranged from 0.28 to 0.88 AU and from 1.23 to 2.74 AU for levels 1 and 2, respectively. In addition, the relative standard deviation (RSD) calculated for each food type/contamination level combination was less than 15%. Samples were therefore considered to be homogeneous.

Stability Study
Stability tests were performed after receiving all participants' data in order to cover the entire ILPT analysis period; i.e., 11, 9, and 8 weeks after sample dispatch for ILPT 2013, 2014, and 2015, respectively. Based on qualitative criteria, SEs were not detected in 100% of the blank samples (data not shown), and SEs were detected in 100% of the contaminated samples, regardless of the assay used (Figure 1). Thus, these samples were considered to be stable during each ILPT analysis period for a qualitative determination.

Figure 1. Results of the stability study. Comparison between data obtained after ILPT period (six replicates) and the assigned value obtained during the homogeneity study (n = 20).
The six stability raw data obtained by each detection assay were compared to the assigned values (Equations 1 and 2).
$$\frac{TV_{n,\,\mathrm{stability}}}{TV_{\mathrm{assigned}}} \times 100 \quad \text{for the Vidas SET2 assay} \tag{1}$$
where $TV_{n,\,\mathrm{stability}}$ is the test value of the nth replicate and $TV_{\mathrm{assigned}}$ is the mean value obtained in the homogeneity study, which was thus considered as the assigned value (Table 1).
$$\frac{AU_{n,\,\mathrm{stability}}}{AU_{\mathrm{assigned}}} \times 100 \quad \text{for the Ridascreen SET Total assay} \tag{2}$$
where $AU_{n,\,\mathrm{stability}}$ is the value (in absorbance units) of the nth replicate and $AU_{\mathrm{assigned}}$ is the mean value obtained in the homogeneity study, which was thus considered as the assigned value (Table 2).
For the Vidas SET2 assay, regardless of the food matrix, all replicate values were included in the interval of ±25%, except for one replicate of mackerel at level 1 (56%, ILPT 2014) and one replicate of dessert cream at level 1 (69%, ILPT 2014).
For the Ridascreen SET Total assay, most values were included in the interval of ±25%, except for ready-to-eat foods (RTE) at level 1, milk at levels 1 and 2, and replicate values of dessert cream at level 1, which were included in the interval of ±40%.
This quantitative assessment confirmed the qualitative results and the stability of samples over the ILPT analysis period.
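The comparison of stability replicates with the assigned value can be illustrated with a minimal Python sketch. The numerical values below are hypothetical, and reading the "interval of ±25%" as a ratio between 75% and 125% of the assigned value is an assumption of this sketch rather than a rule stated in the text.

```python
def stability_ratios(replicates, assigned):
    """Express each stability replicate as a percentage of the assigned value."""
    if assigned <= 0:
        raise ValueError("assigned value must be positive")
    return [100.0 * value / assigned for value in replicates]


def within_interval(ratios, tolerance_pct=25.0):
    """Flag ratios lying within 100% +/- tolerance_pct (assumed reading of the interval)."""
    return [abs(ratio - 100.0) <= tolerance_pct for ratio in ratios]


# Hypothetical Vidas SET2 test values for the six stability replicates
replicates = [0.60, 0.58, 0.66, 0.70, 0.52, 0.36]
assigned = 0.64  # hypothetical mean of the 20 homogeneity values

ratios = stability_ratios(replicates, assigned)
flags = within_interval(ratios)

for ratio, ok in zip(ratios, flags):
    print(f"{ratio:6.1f}%  {'within' if ok else 'outside'} the interval")
```

The same two functions apply unchanged to Ridascreen SET Total readings, since only the ratio to the assigned value is used.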

EURL Network Results over a 3-Year Period
Most laboratories analyzed the ILPT samples according to ESM and returned their results before the deadline (except for one NRL during ILPT 2014). A few results were rejected when (i) the applicable version of ESM was not correctly performed; (ii) the NRL used neither of the two validated detection kits; or (iii) the deadline for analysis and/or for sending the results was not met.
After performing the extraction and dialysis concentration step, detection could be performed using Vidas SET2 only, Ridascreen SET Total only, or both kits. In this section, results obtained with each detection kit are assessed separately because the measurements and instruments used for the two kits are not comparable. Thus, the ability of the EURL network to perform each detection method can be evaluated.
For NRLs performing ESM with the Vidas SET2 detection kit, only 22 (3.6%) of the 615 samples sent to the NRLs over three years were rejected for data processing (Table 3). For NRLs performing ESM with the Ridascreen SET Total detection kit, only 6 (1.8%) of the 339 samples were rejected (Table 4). For the blank and each contaminated level, only a few data were rejected (≤4.4% for Vidas SET2 and ≤3.0% for Ridascreen SET Total). These observations indicated that the data obtained in the three ILPTs were considered to be significant for EURL network evaluation. Tables 5 and 6 show the results obtained by the CPS NRLs for each food type/contamination level over three years.

For the blank level, 123 and 68 samples were analyzed using the Vidas SET2 and Ridascreen SET Total kits, respectively. No negative deviation was obtained by NRLs, regardless of the detection kit used. Thus, ESM was considered as specific for SE in food matrices (100%).
Overall, and taking into account the performance criteria on the qualitative results (specificity, sensitivity, and trueness), the EURL network for CPS obtained satisfactory results.
During the 2013 ILPT, 3.9% of results were rejected due to a deviation from ESM or non-compliance with the organizers' instructions. In addition, 1.7% of results were positive and/or negative deviations. During the 2015 ILPT, no deviation from the method or from the organizers' instructions was reported (Table 7), indicating the efficiency of the measures implemented by NRLs after each ILPT. Participants were able to describe any difficulties they encountered during testing and/or to add comments and observations. As a consequence, NRLs suggested corrective actions to improve their reliability, through technical exchanges on ESM steps or by organizing training sessions, and these corrective actions were assessed by the EURL. This task is part of the reference activities requested by DG SANTE.

Conclusions
Even though the European regulation on microbiological criteria for food has set an SE criterion only for cheese and dairy products, EFSA reported that various types of food matrices were involved in SFPOs. In this context, EURL network competency was evaluated for the first time through three ILPTs on a large panel of food matrices likely to be the source of SFPOs. A total of 31 NRLs participated in these ILPTs and analysed eight food matrices spiked with four types of SE (SEA, SEC, SED, and SEE) at different concentrations. Data assessment showed clear progress in EURL network proficiency: the rate of discrepancies decreased from 1.7% (ILPT 2013) to 1.0% (ILPT 2014), and finally to 0.0% (ILPT 2015).
The ILPT design presented in this work, which covers a large panel of matrices, different SE types and concentrations, and the associated homogeneity and stability studies, should be helpful for NRLs and other PT providers when organizing their own ILPTs.

Toxins
Highly purified freeze-dried SEs were purchased from Toxin Technology, Sarasota, FL, USA (batch no. 120794 A for SEA, no. 113094C2 for SEC2, and no. 70595E for SEE) and were rehydrated according to the manufacturer's instructions to obtain stock solutions. Briefly, 1 mL of osmosis water was added to 1 mg of SE powder in order to obtain a theoretical concentration equal to 1 mg·mL⁻¹. Purity was checked for each toxin using SDS-PAGE analysis.

Preparation of the Proficiency Test Items
The three ILPTs were performed on eight matrices, including tuna, mackerel, ready-to-eat food (pie, Quiche Lorraine), dessert cream (crème brûlée, pastry), roasted chicken, and liquid semi-skimmed milk purchased from a retail store.
The absence of detectable SEs in a 25 g test portion was checked before sample contamination.

Preparation of Blank and Contaminated Batches
Uncontaminated blank samples were homogenized and dispensed into flasks in order to obtain 25 ± 0.1 g test portions.
Sample contamination was performed as follows. After homogenisation, 25 ± 0.1 g test portions were prepared in flasks and spiked separately by adding 500 µL of SE solution in PBS-BSA-Azide to each flask to obtain the target concentration (Table 8).
In order to prevent any cross contamination, the sample set for each food/contamination level combination was prepared and contaminated separately. After their preparation, all samples were stored at −18 °C until homogeneity tests and shipment to participating laboratories.
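The arithmetic linking the stock solution, the 500 µL spiking volume, and the 25 g test portion can be sketched in Python as follows. The target concentration used here (0.1 ng of SE per g of food) is purely hypothetical, since the actual targets are those listed in Table 8, and the calculation assumes that the theoretical stock concentration of 1 mg·mL⁻¹ is exact.

```python
STOCK_NG_PER_ML = 1_000_000.0  # 1 mg/mL theoretical stock, expressed in ng/mL
PORTION_G = 25.0               # test portion mass stated in the text
SPIKE_VOLUME_ML = 0.5          # 500 µL of SE solution per flask


def spiking_solution_conc(target_ng_per_g, portion_g=PORTION_G,
                          spike_ml=SPIKE_VOLUME_ML):
    """Concentration (ng/mL) the spiking solution needs to reach the target."""
    return target_ng_per_g * portion_g / spike_ml


def dilution_factor(spike_ng_per_ml, stock_ng_per_ml=STOCK_NG_PER_ML):
    """Overall dilution of the stock required to prepare the spiking solution."""
    return stock_ng_per_ml / spike_ng_per_ml


target = 0.1  # hypothetical target concentration, ng SE per g of food
conc = spiking_solution_conc(target)
factor = dilution_factor(conc)
print(f"spiking solution: {conc:.2f} ng/mL, stock dilution: 1:{factor:,.0f}")
```

In practice such a large dilution would be reached through serial dilutions, which this sketch does not model.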

Identification of the Proficiency Test Items
The EURL guaranteed full confidentiality with regard to the identity of the participants in the ILPTs.
In accordance with the internal ILPT Quality Manual, a random encrypted coding encompassing all samples was used. The samples were coded randomly and independently of the laboratories' codes to prevent any collusion between participants over the results. The distribution of the samples among the laboratories was also randomized.
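A random coding of this kind could look like the following Python sketch. It is an illustration only, not the procedure of the ILPT Quality Manual: the laboratory codes, item labels, and four-digit code range are hypothetical, and Python's `random` module is not a cryptographic generator (`random.SystemRandom` could be substituted if that matters).

```python
import random


def code_samples(labs, items, rng):
    """Give every (lab, item) pair a unique random code and shuffle each parcel."""
    pairs = [(lab, item) for lab in labs for item in items]
    # Unique 4-digit codes drawn without replacement (hypothetical format)
    codes = rng.sample(range(1000, 10000), k=len(pairs))
    key = dict(zip(codes, pairs))  # decoding key, kept by the organizer only
    parcels = {lab: [] for lab in labs}
    for code, (lab, _item) in key.items():
        parcels[lab].append(code)
    for lab in parcels:
        rng.shuffle(parcels[lab])  # order within a parcel reveals nothing
    return key, parcels


rng = random.Random()  # pass a seed only to reproduce a coding for audit purposes
labs = ["NRL-01", "NRL-02"]                           # hypothetical lab codes
items = ["tuna/blank", "tuna/level1", "tuna/level2"]  # hypothetical PT items
key, parcels = code_samples(labs, items, rng)
print(parcels)
```

Because each laboratory receives different codes for the same item, two participants cannot align their results by comparing sample labels.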

European Screening Method for SE Detection in Food Matrices
This method includes an extraction-concentration step by dialysis and a detection step carried out using the Vidas SET2 (bioMérieux®, Marcy l'Etoile, France) or Ridascreen SET Total kit (R-Biopharm® AG, Darmstadt, Germany), which are able to simultaneously detect SEA, SEB, SEC, SED, and SEE in food matrices [13].
Briefly, 25 ± 0.1 g of sample was mixed in 40 mL of distilled water at 38 ± 2 °C, using an Ultra Turrax homogenizer (T25-basic, Staufen, Germany), and was shaken at room temperature for at least 30 min for toxin diffusion. In the case of liquid product, no distilled water was added.
Then, the pH of the slurry was adjusted to between 3.5 and 4.0 with HCl (Merck, Darmstadt, Germany) to precipitate caseins (in the case of dairy products) and the slurry was centrifuged at 10,000 × g at 4 °C for 15 min. The aqueous supernatant was sampled, adjusted to pH 7.5 ± 0.1 with NaOH (Merck), and centrifuged as above. The supernatant was filtered through glass wool and concentrated on a dialysis membrane with a molecular weight cut-off (MWCO) of 6000-8000 Da (Spectrum Laboratories Inc., Rancho Dominguez, CA, USA) against 30% (w/w) polyethylene glycol 20,000 (Merck, Darmstadt, Germany) overnight at 4 °C. The concentrated protein extract was recovered and adjusted to a final weight of 5.0 to 5.5 g using phosphate buffered saline (PBS: 145 mmol·L⁻¹/10 mmol·L⁻¹ NaCl/Na₂HPO₄, pH = 7.3 ± 0.2).
SE detection was performed from the extract using the two qualitative commercial assays (Vidas® SET2 and/or RIDASCREEN® SET Total). Table 9. Over three years, NRLs received eight food matrices spiked with various SE types at different concentration levels. Homogeneity tests were performed before sample shipment and stability tests after receiving participants' data.

ILPT Design over 3 Years
Food matrices, spiking level, and number of replicates sent to the NRLs are reported in Table 8.

Data Processing
Vidas SET2 and Ridascreen SET Total are qualitative staphylococcal enterotoxin assays that should be used as primary screening tools. These assays detect the presence or absence of five SEs (SEA to SEE) but cannot identify the SE type detected in the food extract. Therefore, results obtained by participants were interpreted as "SE not detected" if the raw data were below the positive threshold, and as "SE detected" if the raw data were above the positive threshold. However, to assess the results of the homogeneity study, the EURL applied additional quantitative criteria (see Section 4.4.1) to better control the quality of the samples during the ILPT period.
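The qualitative interpretation rule can be written as a short Python sketch. The threshold value used in the example is hypothetical (each kit defines its own positive threshold), and treating a reading exactly equal to the threshold as "SE detected" is an assumption of this sketch, since the text only covers values below or above it.

```python
def interpret(raw_value, positive_threshold):
    """Return the qualitative result for one raw reading (TV or AU)."""
    if raw_value >= positive_threshold:  # equality handling is an assumption
        return "SE detected"
    return "SE not detected"


threshold = 0.20  # hypothetical positive threshold, for illustration only
readings = [0.05, 0.64, 1.10]  # hypothetical raw data
results = [interpret(value, threshold) for value in readings]
print(results)
```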

Homogeneity Study
According to the EN ISO 13528 Standard [25], the homogeneity study was performed on 20 flasks randomly sampled for each matrix/contamination level combination. Each sample was analyzed once on the same day according to ESM using both the Vidas® SET2 and RIDASCREEN® SET Total kits. For blank samples, 100% of the obtained results must be below the positive threshold of the detection assay. For contaminated samples, 100% of the obtained results must be above the positive threshold of the detection kits and the relative standard deviation (RSD) must be less than or equal to 15%. It should be noted that this RSD was calculated using the mean and the standard deviation of the 20 values.
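As an illustration of these acceptance criteria, the following Python sketch checks one batch of 20 readings. The readings and the positive threshold are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%), using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def is_homogeneous(values, positive_threshold, contaminated, max_rsd=15.0):
    """Apply the qualitative criterion, plus the RSD criterion for spiked batches."""
    if not contaminated:
        return all(v < positive_threshold for v in values)
    all_positive = all(v > positive_threshold for v in values)
    return all_positive and rsd_percent(values) <= max_rsd


threshold = 0.20                                   # hypothetical positive threshold
blank = [0.02] * 20                                # hypothetical blank readings
level1 = [0.60, 0.64, 0.68, 0.62, 0.66] * 4        # hypothetical level 1 readings
scattered = [0.30, 1.20] * 10                      # hypothetical heterogeneous batch

print(is_homogeneous(blank, threshold, contaminated=False))
print(is_homogeneous(level1, threshold, contaminated=True), round(rsd_percent(level1), 1))
print(is_homogeneous(scattered, threshold, contaminated=True))
```

The third batch passes the qualitative criterion (all readings positive) but fails on RSD, which is the situation the additional quantitative criterion is meant to catch.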

Stability Study
A stability test was performed after receiving all participants' data. Six samples of each matrix/level were randomly selected and analyzed according to ESM. For the blank level, 100% of the obtained results must be below the positive threshold. For spiked levels, 100% of the obtained results must be above the positive threshold for the detection kits.

Assessment of Participants' Data
Only data obtained by participants using ESM were accepted and assessed. Results obtained by participants were interpreted as "SE not detected" or "SE detected" and compared with the expected results.
Accuracy of qualitative results was assessed according to the three following criteria.
Specificity: ability to obtain a negative response for a sample known not to contain any analyte,
$$\mathrm{Specificity} = \frac{N^{-}}{N^{-}_{\mathrm{expected}}} \times 100 \tag{3}$$
where $N^{-}$ is the number of negative samples and $N^{-}_{\mathrm{expected}}$ is the number of samples expected to be negative.
Sensitivity: ability to obtain a positive response for a sample known by the organizer to contain SE,
$$\mathrm{Sensitivity} = \frac{N^{+}}{N^{+}_{\mathrm{expected}}} \times 100 \tag{4}$$
where $N^{+}$ is the number of positive samples and $N^{+}_{\mathrm{expected}}$ is the number of samples expected to be positive.
Trueness: ability to obtain a positive response for a sample known to contain SE and to obtain a negative response for a sample known to contain no analyte,
$$\mathrm{Trueness} = \frac{N}{N_{\mathrm{expected}}} \times 100 \tag{5}$$
where $N$ is the number of samples correctly identified to be positive or negative and $N_{\mathrm{expected}}$ is the total number of samples.
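The three criteria can be computed from paired expected and reported results, as in the minimal Python sketch below. The data set is hypothetical, and the counts in the numerators are taken here as the correctly identified samples within each expected class, which is one reading of the definitions above.

```python
def performance(results):
    """Compute specificity, sensitivity, and trueness (in %) from result pairs.

    results: iterable of (expected_positive, reported_positive) booleans.
    """
    results = list(results)
    n_neg_expected = sum(1 for expected, _ in results if not expected)
    n_pos_expected = sum(1 for expected, _ in results if expected)
    if n_neg_expected == 0 or n_pos_expected == 0:
        raise ValueError("need at least one blank and one contaminated sample")
    n_neg = sum(1 for expected, reported in results if not expected and not reported)
    n_pos = sum(1 for expected, reported in results if expected and reported)
    return {
        "specificity": 100.0 * n_neg / n_neg_expected,
        "sensitivity": 100.0 * n_pos / n_pos_expected,
        "trueness": 100.0 * (n_neg + n_pos) / len(results),
    }


# Hypothetical data: 10 blanks all reported negative,
# 40 contaminated samples of which one was missed.
data = [(False, False)] * 10 + [(True, True)] * 39 + [(True, False)]
scores = performance(data)
print(scores)
```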