Metabolomics Approach on Non-Targeted Screening of 50 PPCPs in Lettuce and Maize

The metabolomics approach has proved promising for non-targeted screening of unknown and unexpected (U&U) contaminants in foods, but data analysis is often its bottleneck. In this study, a novel metabolomics analytical method was developed that seeks marker compounds for 50 pharmaceutical and personal care products (PPCPs) spiked as U&U contaminants into lettuce and maize matrices, based on ultrahigh-performance liquid chromatography-tandem mass spectrometry (UHPLC-MS/MS) output. Three concentration groups (20, 50 and 100 ng mL−1) were designed to simulate the control and experimental groups of traditional metabolomics analysis, and multivariate and univariate analyses were used to discover marker compounds. In multivariate analysis, each concentration group was clearly separated from the other two groups in principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) plots, making it possible to discern marker compounds among groups. The S-plot, permutation test and variable importance in projection (VIP) in OPLS-DA were used to screen and identify marker compounds, which then underwent pairwise t-test and fold change judgement in univariate analysis. The results indicate that marker compounds representing all 50 PPCPs were discovered in both plant matrices, demonstrating the practicability of the metabolomics approach for non-targeted screening of various U&U PPCPs in plant-derived foods. The limits of detection (LODs) for the 50 PPCPs were calculated to be 0.4~2.0 µg kg−1 and 0.3~2.1 µg kg−1 in lettuce and maize matrices, respectively.


Introduction
Pharmaceutical and personal care product (PPCP) contamination in animal-derived foods has attracted worldwide attention, and different countries and organizations have issued a series of formal regulatory documents on the maximum residue limits (MRLs) of PPCPs [1][2][3][4]. However, PPCP contamination in plant-derived foods has not been fully addressed [5]. Previous studies [6][7][8][9][10][11][12][13][14] indicate that some plant-derived foods (e.g., corn, barley, pea, wheat, carrot, potato, cucumber and lettuce) can easily absorb PPCPs from soil fertilized with animal manure, which contains several kinds of commonly used antibiotics, e.g., tetracyclines, quinolones, sulfonamides and β-lactams, with total concentrations in the plants ranging from the µg kg−1 to the mg kg−1 level [9,15,16,17,18]. Because evaluation standards for PPCPs in plant-derived foods are lacking, it is hard to judge directly whether the residue concentrations of PPCPs can induce adverse effects on human health. The regulatory files on MRLs of PPCPs in animal-derived foods [2,4] propose a concentration of 10 µg kg−1 as the safety threshold for most PPCPs; by analogy, PPCP concentrations above 10 µg kg−1 in plant-derived foods can be inferred to pose a food safety risk. Therefore, the top priority is to develop reliable analytical methods for the investigation of PPCP residues in plant-derived foods.

hydrochloric acid (HCl) and C18 powder (Sinopharm Chemical Reagent Co., Ltd., Shanghai, China); methanol and acetonitrile (HPLC grade, Merck, Darmstadt, Germany); formic acid (HPLC grade, Shanghai ANPEL Laboratory Technologies Inc., Shanghai, China); filter membrane (0.22 µm, Agilent Technologies, Singapore, MI, USA); ultrapure water (Milli-Q ultrapure water system, Merck, Darmstadt, Germany); ciprofloxacin-d8 hydrochloride solution (100 µg mL−1 in methanol, First Standard, Ridgewood, NY, USA).
Analytical standard compounds for 50 PPCPs (purity > 98.3%) were obtained from First Standard (Ridgewood, NY, USA), Sigma (Alexandria, VA, USA), TRC (Toronto, ON, Canada) and Dr. Ehrenstorfer (Augsburg, Germany). More details on the 50 PPCPs are shown in Table 1.

Solution Preparation
Each of the 50 PPCPs was prepared separately in methanol at 100 µg mL−1; 1 mL of each solution was withdrawn, and the aliquots were combined and further diluted with methanol to obtain a mixed solution containing each PPCP at 1 µg mL−1. A 100 ng mL−1 ciprofloxacin-d8 methanol solution was prepared by diluting its 100 µg mL−1 solution. A 0.1 mol L−1 Na2EDTA-McIlvaine buffer solution was prepared by dissolving Na2HPO4 (5.5 g), citric acid (12.9 g) and Na2EDTA (37.2 g) in 1 L pure water and adjusting to pH 4.0 with 0.1 mol L−1 HCl or NaOH solution.
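The dilutions described above follow from C1 × V1 = C2 × V2. The sketch below works through that arithmetic; the final volume of the mixed solution is not stated in the text and is inferred here, and the function name is illustrative.

```python
def final_volume_ml(stock_conc, aliquot_ml, target_conc):
    """Volume an aliquot must be diluted to, from C1*V1 = C2*V2.

    stock_conc and target_conc must share the same concentration unit.
    """
    return stock_conc * aliquot_ml / target_conc


# Mixed standard: 1 mL of each 100 ug/mL stock, target 1 ug/mL per compound.
mixed_volume_ml = final_volume_ml(100.0, 1.0, 1.0)

# The 50 combined 1 mL aliquots already occupy 50 mL (inferred, not stated).
combined_aliquots_ml = 50 * 1.0
methanol_to_add_ml = mixed_volume_ml - combined_aliquots_ml

# Internal standard: 100 ug/mL (100,000 ng/mL) down to 100 ng/mL.
dilution_factor = 100_000.0 / 100.0
```

Under these assumptions the mixed solution has a final volume of 100 mL, and the ciprofloxacin-d8 working solution is a 1000-fold dilution of its stock.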

Sample Preparation and Pretreatment Process
(a) The lettuce sample was cut into small pieces and ground into a batter with a tissue homogenizer; (b) 2.0, 5.0 and 10.0 g of lettuce batter were placed in 50 mL polypropylene centrifuge tubes and spiked with 20, 50 and 100 µL, respectively, of the mixed solution of 50 PPCPs (1 µg mL−1). To calibrate the recovery during sample pretreatment, ciprofloxacin-d8 methanol solution (0.5 mL, 100 ng mL−1) was added as a recovery internal standard, as adopted in previous studies [30][31][32]; (c) 5 mL of Na2EDTA-McIlvaine buffer solution (0.1 mol L−1) was added to the tube and vortexed for 1 min, then 20 mL of 1% (V/V) formic acid/acetonitrile solution was added and stirred for 1 min. After the solution had stood for 10 min, an extraction salt package (10.0 g Na2SO4 + 2.0 g NaCl) was added for salting-out stratification, followed by centrifugation at 4500 r min−1 for 5 min; (d) all the supernatant was transferred to a new 50 mL polypropylene centrifuge tube, 100 mg of C18 powder was added, and the mixture was vortexed for 1 min and centrifuged at 4500 r min−1 for 3 min. The solution was then transferred to another 50 mL centrifuge tube, dried under N2 with a nitrogen blowing apparatus (N-EVAP-112, Organomation, Berlin, MA, USA), redissolved in 1 mL of 40% (V/V) methanol 0.1% formic acid/water solution and vortexed for 1 min; (e) after filtration through a 0.22 µm filter membrane, sample solutions of the 50 PPCPs at theoretical concentrations of 20, 50 and 100 ng mL−1 were obtained. Each concentration experiment was repeated nine times.
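The relation between the spiking design and the three theoretical concentrations can be checked with a short calculation. The sketch below assumes complete recovery and uses illustrative function names; it relies only on the masses and volumes given above (µL × µg mL−1 gives ng).

```python
def spike_level_ug_per_kg(spike_volume_ul, spike_conc_ug_per_ml, sample_mass_g):
    """Spiking level in the sample; uL x ug/mL = ng, and ng/g equals ug/kg."""
    spiked_mass_ng = spike_volume_ul * spike_conc_ug_per_ml
    return spiked_mass_ng / sample_mass_g


def extract_conc_ng_per_ml(spike_volume_ul, spike_conc_ug_per_ml, final_volume_ml=1.0):
    """Theoretical concentration in the redissolved extract, assuming 100% recovery."""
    return spike_volume_ul * spike_conc_ug_per_ml / final_volume_ml


# (sample mass in g, spike volume in uL) for the three groups in the text
design = [(2.0, 20.0), (5.0, 50.0), (10.0, 100.0)]

levels = [spike_level_ug_per_kg(vol, 1.0, mass) for mass, vol in design]
concs = [extract_conc_ng_per_ml(vol, 1.0) for _, vol in design]
```

All three groups correspond to the same spiking level of 10 µg kg−1 in the sample; they differ only in the amount of sample taken, which yields 20, 50 and 100 ng mL−1 in the final 1 mL extract.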

Analytical Method
The 50 PPCPs and ciprofloxacin-d8 were analyzed on a quadrupole/electrostatic field orbitrap LC-MS/MS system (Q Exactive Plus, Thermo Fisher Scientific Inc., Waltham, MA, USA) in the positive mode of the electrospray ionization (ESI) source. Components in the sample solution were separated on an Accucore RP-MS column (100 × 2.1 mm, 2.6 µm particle diameter, Thermo Fisher Scientific Inc., Waltham, MA, USA), with an injection volume of 10 µL. Mobile phases A and B were 0.1% (V/V) formic acid/water and 0.1% (V/V) formic acid/methanol, respectively, at a flow rate of 0.3 mL min−1. Given the matrix complexity of lettuce and maize, some impurities might not be eluted from the LC-MS/MS system within a relatively short program designed only for the 50 PPCPs (the last target PPCP eluted at 738 s in this study), which could disrupt the elution and analysis of the next sample. Therefore, a longer elution program was designed as follows: the gradient started at 5% B and was held for 2 min, increased to 30% B in 1 min and was held for 7 min, increased to 90% B in 1 min and was held for 25 min, and finally decreased to 5% B in 1 min, followed by equilibration for 16 min. The oven temperature was set at 40 °C. Other parameter settings were as follows: heating and capillary temperature 320 °C; lens and spray voltage 50 and 3200 V, respectively; auxiliary and sheath gas N2, with flow rates of 10 and 40 arb, respectively; scan mode: full-scan/data-dependent two-stage scanning; MS parameters: full-scan resolution 70,000, maximum dwell time 100 ms, AGC target 1 × 10⁶, m/z scan range 100~1000; MS/MS parameters: resolution 17,500, maximum dwell time 50 ms, AGC target 2 × 10⁵.
LC-MS/MS output results for the 50 PPCPs and ciprofloxacin-d8 were analyzed with Trace Finder 3.3 software under the following screening conditions: (a) for the primary parent ion, signal-to-noise ratio 5.0, response intensity threshold 10,000 and mass error 5 ppm; (b) for secondary fragment ions, minimum number of matching ions 1, response intensity threshold 10,000 and mass error 5 ppm. Ciprofloxacin-d8 was quantified against a standard curve, based on the peak area of its primary parent ion, for recovery calculation.
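The elution program can be laid out as a timetable to check its total length. This is a sketch only: it assumes that "a duration of 7 min" and "holding on 25 min" denote isocratic holds at 30% B and 90% B, and the segment labels are illustrative.

```python
# (segment label, duration in min, %B at the end of the segment)
GRADIENT = [
    ("hold at 5% B", 2, 5),
    ("ramp to 30% B", 1, 30),
    ("hold at 30% B", 7, 30),
    ("ramp to 90% B", 1, 90),
    ("hold at 90% B", 25, 90),
    ("return to 5% B", 1, 5),
    ("re-equilibrate at 5% B", 16, 5),
]


def timetable(segments, start_percent_b=5):
    """Return (time in min, %B) breakpoints for a list of gradient segments."""
    elapsed = 0
    points = [(0, start_percent_b)]
    for _label, duration, end_percent_b in segments:
        elapsed += duration
        points.append((elapsed, end_percent_b))
    return points


breakpoints = timetable(GRADIENT)
total_run_min = breakpoints[-1][0]
last_target_elution_min = 738 / 60  # last target PPCP, from the text
```

Under this reading one injection takes 53 min, whereas the last target PPCP elutes at about 12.3 min, so most of the program serves to flush matrix components and re-equilibrate the column.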

Metabolomics Data Processing
LC-MS/MS was operated in full scan mode, and the RAW-formatted output files were converted to mzXML-formatted files with the ProteoWizard software [35]. The converted files can be uploaded to the Workflow4Metabolomics (W4M) platform (https://workflow4metabolomics.usegalaxy.fr/, accessed on 20 November 2021) for metabolomics analysis [36]. After peak detection, alignment and retention time calibration, plus data normalization, centralization, scaling and transformation on the W4M platform, a data matrix was obtained with variables and samples as the abscissa and ordinate, respectively [36,37]. Each variable carries a series of information, e.g., molecular weight and retention time, and every marker compound corresponds to a unique variable; the search for marker compounds is therefore a search for eligible variables. Multivariate statistical analysis, including principal component analysis (PCA) [38][39][40] and orthogonal partial least squares discriminant analysis (OPLS-DA) [41,42], was performed in SIMCA 14.1 software [43] after importing the data matrix. A permutation test with 200 iterations was employed to judge over-fitting of the OPLS-DA model [43,44]. Marker compound candidates were further screened by the absolute value of variable confidence in the S-plot [45] and by the variable importance in projection (VIP) [43,44,46], with thresholds of above 0.9 and above 1, respectively. Eligible marker compound candidates were obtained in this way from both the 20 and 100 ng mL−1 groups, and only candidates common to both groups, i.e., those present at significantly low and high concentrations in the 20 and 100 ng mL−1 groups, respectively, were further investigated by pairwise t-test [47][48][49] in SPSS Statistics V17.0 software and by fold change judgement in the univariate analysis. Univariate analysis is simple and intuitive, and was used to quickly investigate the differences of marker compound candidates among groups. To verify the identity of the marker compounds representing the 50 PPCPs more rapidly, we directly compared their precise molecular weight (<5 ppm in absolute value of error), retention time and adduct structure with those of the 50 authentic PPCPs (Table 1).
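The identity check against authentic standards reduces to a mass error in ppm plus a retention time comparison. The following is a minimal sketch; the retention time tolerance and the observed values are hypothetical, and only the 5 ppm limit and the ciprofloxacin-d8 reference values come from the text.

```python
def mass_error_ppm(observed, reference):
    """Relative mass error of an observed value against a reference, in ppm."""
    return (observed - reference) / reference * 1e6


def matches_standard(observed_mz, reference_mz, observed_rt_min, reference_rt_min,
                     ppm_tolerance=5.0, rt_tolerance_min=0.2):
    """True if both mass and retention time agree with the authentic standard.

    rt_tolerance_min is a hypothetical value; the text gives no RT tolerance.
    """
    mass_ok = abs(mass_error_ppm(observed_mz, reference_mz)) < ppm_tolerance
    rt_ok = abs(observed_rt_min - reference_rt_min) <= rt_tolerance_min
    return mass_ok and rt_ok


# Reference values for ciprofloxacin-d8 are from the text; observed values are made up.
error = mass_error_ppm(340.19200, 340.19132)
accepted = matches_standard(340.19200, 340.19132, 6.75, 6.73)
rejected = matches_standard(340.19500, 340.19132, 6.75, 6.73)
```

In this example the first feature deviates by about 2 ppm and is accepted, while the second deviates by more than 10 ppm and is rejected.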

Data Preprocessing
As indicated in Figure 1, which shows only the 0~900 s portion of the total ion chromatograms during which all 50 PPCPs were eluted, obvious differences in peak intensity were already observed among the three concentration groups, implying that marker compounds can be sought among groups. Variables whose relative standard deviation (RSD) of peak intensity exceeded 30% in the QC and three concentration groups were filtered out as invalid [50], leaving a final 6512 × 39 data matrix for further analysis.
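The RSD filter can be sketched as follows. This is an illustration rather than the W4M implementation: it assumes a variable is kept only if its RSD is at most 30% in every group, and the variable names, sample names and intensities are invented.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample standard deviation / mean) in percent."""
    average = mean(values)
    if average == 0:
        return float("inf")
    return stdev(values) / average * 100.0


def filter_by_rsd(matrix, groups, max_rsd=30.0):
    """Keep variables whose RSD does not exceed max_rsd in every group.

    matrix: {variable: {sample: intensity}}; groups: {group: [sample, ...]}.
    """
    kept = []
    for variable, row in matrix.items():
        group_rsds = [rsd_percent([row[s] for s in samples])
                      for samples in groups.values()]
        if all(value <= max_rsd for value in group_rsds):
            kept.append(variable)
    return kept


groups = {"QC": ["qc1", "qc2", "qc3"], "20": ["a1", "a2", "a3"]}
matrix = {
    "M340T404": {"qc1": 100.0, "qc2": 110.0, "qc3": 90.0,
                 "a1": 50.0, "a2": 55.0, "a3": 45.0},
    "M123T060": {"qc1": 100.0, "qc2": 10.0, "qc3": 200.0,
                 "a1": 50.0, "a2": 55.0, "a3": 45.0},
}
valid_variables = filter_by_rsd(matrix, groups)
```

Here the first variable has an RSD of 10% in both groups and is kept, while the second is removed because its QC intensities are too scattered.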

PCA Analysis
As Taguchi [51] pointed out, PCA can classify sample groups naturally and eliminate extreme data without knowing their categories; PCA can thus be used in metabolomics to assess data quality and to identify outliers [38][39][40]. As indicated in Figure 2, no extreme data or outliers were observed. Samples at the same concentration clustered together, indicating good classification of the groups. The obvious separation among the three concentration groups indicates major discrepancies between them, paving the way to seek marker compounds from different groups.
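To illustrate what a PCA score plot conveys, the sketch below computes first-component scores for a toy data set by power iteration in plain Python. It is not the SIMCA procedure used in the study; the data are invented, only mean-centering is applied, and power iteration is used here purely to keep the example self-contained.

```python
import math


def center(rows):
    """Subtract the column mean from every column (rows = samples)."""
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_component(rows, iterations=200):
    """Leading principal axis and sample scores via power iteration.

    Note: the start vector must not be orthogonal to the leading axis.
    """
    x = center(rows)
    p = len(x[0])
    axis = [float(j + 1) for j in range(p)]
    length = math.sqrt(sum(c * c for c in axis))
    axis = [c / length for c in axis]
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, axis)) for row in x]
        # Multiply by X^T X, i.e. the unnormalized covariance matrix.
        w = [sum(s * row[j] for s, row in zip(scores, x)) for j in range(p)]
        length = math.sqrt(sum(c * c for c in w))
        if length == 0:
            break
        axis = [c / length for c in w]
    scores = [sum(a * b for a, b in zip(row, axis)) for row in x]
    return axis, scores


# Two toy variables measured in two replicates of each of three groups.
toy = [[20.0, 41.0], [21.0, 40.0],
       [50.0, 99.0], [51.0, 101.0],
       [100.0, 201.0], [99.0, 199.0]]
axis, scores = first_component(toy)
```

Replicates of the same group receive similar scores and the three groups occupy distinct score ranges, which is the pattern a score plot with well-separated groups displays.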

OPLS-DA Analysis
Theoretically, the peak intensities of variables ought to increase with concentration, i.e., the 20 and 100 ng mL−1 groups should present the minimum and maximum peak intensities, respectively. In practice this may not hold because of discrepancies in sample recoveries. Previous studies [30][31][32] proposed deuterated antibiotics as recovery internal standards to correct losses of PPCPs during sample preparation. Accordingly, ciprofloxacin-d8 (parent ion m/z 340.19132; fragment ions m/z 296.20156, 253.15933 and 239.14367; retention time 6.73 min) was employed here to eliminate the peak intensity errors of variables induced by disparate recoveries of PPCPs during the pretreatment process. As shown in Table S1 (Supplementary Materials), the recoveries of ciprofloxacin-d8 were calculated to be 80.1~85.9%, 80.3~86.2% and 81.6~87.7% in the 20, 50 and 100 ng mL−1 groups, respectively, based on ciprofloxacin-d8 standard curve solutions (100, 50, 25, 10 and 5 ng mL−1) prepared in blank lettuce extract solution. The recoveries of ciprofloxacin-d8 were then all calibrated to 100% by multiplying by a corresponding calibration coefficient, and the same coefficient was applied to the peak intensities of ciprofloxacin-d8 and of all the variables.
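The recovery calibration amounts to scaling every peak intensity in a sample by one coefficient. The sketch below assumes that the coefficient is 100 divided by the measured ciprofloxacin-d8 recovery of that sample; the variable names and intensities are invented.

```python
def calibration_coefficient(recovery_percent):
    """Factor that brings the internal-standard recovery to 100% (assumed form)."""
    if recovery_percent <= 0:
        raise ValueError("recovery must be positive")
    return 100.0 / recovery_percent


def correct_intensities(intensities, recovery_percent):
    """Scale all peak intensities of one sample by its calibration coefficient."""
    k = calibration_coefficient(recovery_percent)
    return {name: value * k for name, value in intensities.items()}


# One hypothetical sample with an internal-standard recovery of 80%.
sample = {"ciprofloxacin-d8": 8.0e5, "M332T400": 4.0e5}
corrected = correct_intensities(sample, 80.0)
```

A sample with 80% recovery is scaled by 1.25, so samples with different losses during pretreatment become comparable before multivariate analysis.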

As shown in Figure 3, two camps are separated along the first principal component axis. One camp represents the specific concentration group (green part) and the other represents the remaining two groups (blue part), indicating the existence of variables with significant differences between the two camps. Each point in the S-plots (Figure 4) represents a variable; the farther a point lies from the origin along the X- and Y-axes, the greater the contribution and the higher the confidence level of that variable to the difference. Therefore, the points at the two ends of the 'S' can be deemed the most differentiating components. In the S-plot analysis, an absolute value of confidence > 0.9 has been proposed to screen variables as marker compound candidates [45]; candidates at significantly low and high concentrations should be searched for at the right and left ends of the S-plots in Figure 4a,b, respectively.

R²Y and Q² are common parameters describing the interpretation level of the model in the Y-axis direction and the prediction level of the model [52,53], respectively. If R²Y and Q² are both close (or equal) to 1, the OPLS-DA models are not susceptible to over-fitting. As can be seen from Figure 5, the R²Y and Q² values were no less than 0.991, indicating good reliability and predictability and no over-fitting for all OPLS-DA models. The VIP > 1 criterion was then applied to further screen marker compounds. Eventually, marker compounds representing all 50 PPCPs were screened out, as shown in Table 2. Negligible concentrations (<0.1 ng mL−1) of the 50 PPCPs were found in the blank lettuce extract solution by the metabolomics analysis, which rules out interference from inherent (rather than spiked) PPCP residues in the lettuce matrix when seeking marker compounds.
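The screening logic (S-plot confidence, VIP, then overlap between the 20 and 100 ng mL−1 models) can be expressed as two set operations. This is a sketch with invented variable names and values; in the study the confidence and VIP values come from the SIMCA OPLS-DA models.

```python
def screen_candidates(variables, min_abs_confidence=0.9, min_vip=1.0):
    """Variables passing both thresholds.

    variables: {name: (s_plot_confidence, vip)} for one OPLS-DA model.
    """
    return {name for name, (confidence, vip) in variables.items()
            if abs(confidence) > min_abs_confidence and vip > min_vip}


def shared_candidates(low_group, high_group):
    """Candidates that pass the screen in both the low and high group models."""
    return sorted(screen_candidates(low_group) & screen_candidates(high_group))


# Hypothetical (confidence, VIP) pairs from the 20 and 100 ng/mL models.
low_model = {"M332T400": (0.95, 1.8), "M254T320": (0.97, 2.1),
             "M180T090": (0.40, 1.5), "M500T700": (-0.93, 0.6)}
high_model = {"M332T400": (-0.96, 1.7), "M254T320": (-0.85, 1.9),
              "M180T090": (0.92, 1.2)}
candidates = shared_candidates(low_model, high_model)
```

Only the variable that passes both thresholds in both models is carried forward to the univariate analysis.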

Univariate Analysis
After multivariate analysis, a pairwise t-test [47][48][49] was first employed to examine whether marker compounds from a specific concentration group differed significantly in peak intensity from those in the other two groups. The pairwise t-test was used to calculate p values between each two concentration groups, and the p < 0.05 observed in this study confirmed significant differences among groups. Previous studies [29,54] also adopted a fold change of concentration > 2 to discern variables with high contrast among groups as marker compounds. Herein, the marker compounds representing the 50 PPCPs all presented fold change values above 2, supporting the validity of the marker compounds obtained with our analytical strategy.
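The two univariate criteria can be sketched for a single marker compound as below. The exact t-test variant run in SPSS is not specified in the text, so Welch's unequal-variance statistic is used here as a stand-in; the intensities are invented, and converting the statistic to a p value requires the t distribution from a statistics package.

```python
import math
from statistics import mean, variance


def fold_change(high, low):
    """Ratio of the mean intensity in the high group to that in the low group."""
    return mean(high) / mean(low)


def welch_t(a, b):
    """Welch's t statistic for two independent groups (stand-in for the SPSS test)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical peak intensities of one marker in the 100 and 20 ng/mL groups.
high_group = [100.0, 110.0, 90.0]
low_group = [20.0, 22.0, 18.0]

fc = fold_change(high_group, low_group)
t_stat = welch_t(high_group, low_group)
passes_fold_change = fc > 2
```

For these toy values the fold change is 5 and the t statistic is large, so the marker would pass both criteria once the corresponding p value is confirmed to be below 0.05.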
The limits of detection (LODs) for the 50 PPCPs were also considered here. First, a 2.0 g blank lettuce sample was used to prepare an extract solution (1 mL) by the same pretreatment described above. Then, a 20 ng mL−1 PPCPs solution was obtained by diluting their mixed methanol solution (20 µL, 1 µg mL−1) with 1 mL of blank lettuce extract solution. The experiment was repeated in septuplicate to obtain seven samples, which underwent the same metabolomics analysis to obtain the peak intensities of the 50 PPCPs. For each PPCP, the average peak intensity of the seven samples was taken to correspond to the 20 ng mL−1 concentration level; the concentration (unit: ng mL−1) of each PPCP in a sample was therefore calculated as its own peak intensity × 20/average peak intensity, and the standard deviation of the seven concentrations was then determined. According to the method proposed by the US Environmental Protection Agency [55], the LOD values for the 50 PPCPs were calculated to be 0.4~2.0 µg kg−1, as shown in Table 2.

Note: a two VIP values from the 100 and 20 ng mL−1 groups, respectively; b two-group coordinate values from the 100 and 20 ng mL−1 groups, respectively; c mass error (ppm) = (extracted molecular weight from W4M platform − extracted molecular weight from LC-MS/MS) × 10⁶/extracted molecular weight from LC-MS/MS.
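The LOD calculation can be sketched as follows, assuming the EPA method detection limit form (Student's t value × standard deviation of the replicates) and a conversion from ng mL−1 in the 1 mL extract to µg kg−1 in the 2.0 g sample. Both the conversion and the example intensities are assumptions made for illustration.

```python
from statistics import mean, stdev

# One-sided Student's t at 99% confidence for 6 degrees of freedom (7 replicates).
T_99_DF6 = 3.143


def replicate_concentrations(intensities, spiked_ng_per_ml=20.0):
    """Concentration of each replicate: own intensity x 20 / average intensity."""
    average = mean(intensities)
    return [value * spiked_ng_per_ml / average for value in intensities]


def lod_ug_per_kg(intensities, spiked_ng_per_ml=20.0, extract_ml=1.0, sample_g=2.0):
    """Detection limit from replicate scatter, expressed per mass of sample.

    The ng/mL to ug/kg conversion (extract volume / sample mass) is an assumption.
    """
    concentrations = replicate_concentrations(intensities, spiked_ng_per_ml)
    limit_ng_per_ml = T_99_DF6 * stdev(concentrations)
    return limit_ng_per_ml * extract_ml / sample_g  # ng/g equals ug/kg


# Hypothetical peak intensities of one PPCP in the seven replicates.
replicates = [95.0, 100.0, 105.0, 100.0, 100.0, 98.0, 102.0]
lod = lod_ug_per_kg(replicates)
```

With these invented intensities the result is about 1.0 µg kg−1; the value depends only on the relative scatter of the seven replicates, not on their absolute intensity.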

Method Applicability in Maize Matrix
Maize, the primary food crop in China, has been shown to absorb PPCPs easily from the soil [19]; it was therefore selected as a second plant matrix, different from vegetables, to investigate the applicability of the developed metabolomics-based screening method. A maize sample was purchased from the local market and ground into a powder. It then underwent the same pretreatment process described above, with the 50 PPCPs likewise spiked at 10 µg kg−1. Ciprofloxacin-d8 methanol solution (0.5 mL, 100 ng mL−1) was added for recovery calibration, with the results shown in Table S2. The same metabolomics analysis was performed, as indicated in Figures S1-S5 (Supplementary Materials). Marker compounds representing the 50 PPCPs were also discovered (Table S3), demonstrating the good applicability of the metabolomics analytical method to non-targeted screening of various PPCP residues in different plant matrices. As can be seen from Table S3, the LOD values for the 50 PPCPs in the maize matrix were calculated to be 0.3~2.1 µg kg−1.

Real Sample Test
We collected lettuce and maize samples from six administrative districts of Dalian City (Zhongshan, Xigang, Shahekou, Gaoxin, Ganjingzi and Jinpu), with two sampling points in each district. A total of 12 fresh lettuce samples were purchased from the local farmer's market and immediately delivered to the laboratory for testing. The same process was applied to the maize samples. After pretreatment and metabolomics analysis, only one lettuce sample, from Jinpu District, was found to contain enrofloxacin, at a content of 17.4 µg kg−1. No PPCPs were detected in the other samples.
Although the detection rate of PPCPs across all samples was only 1/24, and seemingly only one district was affected by PPCP contamination, the results show that the proposed method is capable of screening PPCPs in plant-derived foods. These spot-check results are also a warning that a PPCP-induced safety risk in plant-derived foods is emerging.
Previous studies have successfully applied metabolomics-based non-targeted screening methods to pesticide residues in plant matrices, e.g., orange juice [28] and tea [29], which supports the feasibility of screening PPCP residues in plant-derived foods. Because the analytes differ, the reported methods may not be fully applicable to our study. Herein, we first treated the spiked contaminants as marker compounds and then implemented a marker compound-seeking metabolomics strategy to accomplish the non-targeted screening of contaminants in plant-derived foods, which is the biggest difference from previous studies [24,28,29]. Although only 50 PPCPs and two plant matrices were considered here, the developed method still has wide applicability because these PPCPs are representative and lettuce and maize are universally consumed.
Extensive use of PPCPs in livestock farming raises the risk that these compounds end up in soil where animal waste is used as fertilizer [9,56], which leads to the uptake of PPCPs from the soil by plant-derived foods [57][58][59][60][61][62][63][64]. Compared with other plants, leafy vegetables generally show higher detection rates and concentrations of PPCPs [60,64] and therefore deserve more attention with regard to food safety risk. Although no official documents explicitly specify the MRLs of PPCPs in plant-derived foods, their safety thresholds can still be deduced from the corresponding MRLs in animal-derived foods [1][2][3][4]. In contrast to the large number of analytical methods for PPCPs in animal-derived foods [65][66][67][68][69], methods for PPCP detection in plant matrices are in short supply. To cope better with complicated PPCP contamination in plants, the top priority is to develop a high-throughput screening method that can accurately, rapidly and comprehensively determine which PPCPs are present in the foods. With this in mind, we developed this novel metabolomics-based analytical method to achieve non-targeted screening of PPCPs in plant-derived foods.

Conclusions
The newly developed metabolomics analytical method was successfully applied to non-targeted screening of 50 PPCP residues in lettuce and maize matrices. We intentionally designed three concentration groups of PPCPs (20, 50 and 100 ng mL−1) to simulate the experimental and control groups of traditional metabolomics analytical procedures and to search for marker compounds representing the 50 PPCPs. The metabolomics analysis process involves less manual intervention, a more concise workflow and higher screening efficiency. It is worth mentioning that this is the first metabolomics analytical strategy implemented for non-targeted screening of PPCPs in plant-derived foods through seeking marker compounds. Given the lack of binding legal documents on MRLs of PPCPs in plant matrices, together with the constant development and application of new PPCPs in animal husbandry, legal rules setting MRLs of PPCPs in plant-derived foods are urgently needed; otherwise the problem may evolve into a serious food safety issue. To date, plant uptake from PPCP-contaminated soil is a known source of PPCP residues in plant-derived foods. It is not yet clear whether other routes can also lead to the accumulation of PPCPs in these foods, which would increase the complexity of PPCP contamination and the risk of human exposure to PPCPs via the food chain. Therefore, we advocate early attention to this issue to help defuse the potential crisis.