HPLC-HRMS Global Metabolomics Approach for the Diagnosis of “Olive Quick Decline Syndrome” Markers in Olive Trees Leaves

Olive quick decline syndrome (OQDS) is a multifactorial disease affecting olive plants. The onset of this economically devastating disease has been associated with a Gram-negative plant pathogen called Xylella fastidiosa (Xf). Liquid chromatography separation coupled to high-resolution mass spectrometry detection is one of the most widely applied technologies in metabolomics, as it provides a blend of rapid, sensitive, and selective qualitative and quantitative analyses with the ability to identify metabolites. The purpose of this work is the development of a global metabolomics mass spectrometry assay able to identify OQDS molecular markers that could discriminate between healthy (HP) and infected (OP) olive tree leaves. Results obtained via multivariate analysis through an HPLC-ESI HRMS platform (LTQ-Orbitrap from Thermo Scientific) show a clear separation between HP and OP samples. Among the differentially expressed metabolites, 18 different organic compounds highly expressed in the OP group were annotated; results obtained by this metabolomic approach could be used as a fast and reliable method for the biochemical characterization of OQDS and to develop targeted MS approaches for OQDS detection by foliage analysis.


Introduction
The year 2013 will be recorded as the annus horribilis for olive trees of the Salento Peninsula in south-eastern Italy [1]. Compared to other Italian regions, olive cultivation in Puglia (also known as Apulia) is more widespread, with a correspondingly strong economic impact. For nearly a decade in the province of Lecce and the southern zones of the Brindisi and Taranto provinces [2], olive trees have been affected by a progressive disease that begins with foliage desiccation and quickly ends with the death of the whole tree. This condition of unknown etiology was named "olive quick decline syndrome" (OQDS, known in Italy also as CoDiRO, Complesso del Disseccamento Rapido dell'Olivo) [3,4]. OQDS is a disease characterized by leaf yellowing and desiccation of twigs and small branches, which eventually results in tree death. In the early stages, symptoms appear on the apical parts of trees; they subsequently extend to the rest of the crown, which acquires a burned-like aspect in the last stages of disease. To counter OQDS and promote new growth, heavy pruning has always been applied as an agronomical control approach, but this strategy has proven ineffective [5]. The onset of this economically devastating disease has been associated with Xylella fastidiosa (Xf) infection. Xf is a formidable Gram-negative pathogen that has already caused enormous damage in the United States and South America [6]. These bacteria form biofilms inside xylem vessels, leading to their occlusion, which eventually impairs water and mineral salt uptake [7] in the apical regions of the plant organism. To avoid an uncontrolled spread of this pathogen, the European Union has introduced mandatory containment measures, mostly via the eradication of plants carrying the infected phenotypes and of all the plants growing in their surroundings, creating a buffer zone to protect the rest of the Italian Peninsula [8].
This bacterium has been proven to be responsible for the development of some other economically significant diseases, including Pierce's disease of grapevine [9], leaf scorch of almond, oleander, and coffee [10], citrus variegated chlorosis, and other diseases of crop, forest, and landscape plants [11]. In the case of olive, damage may be aggravated by the presence of fungi of different genera, Phaeoacremonium and Phaeomoniella in particular, but also Pleurostomophora and Neofusicoccum, which colonize and necrotize sapwood [2]. In order to develop less drastic strategies for containing the disease spread, researchers have studied bacterial homeostasis and its interaction with plants, investigating the role of different small molecules, such as minerals and phenolics [12-14]. Recently, Baldassarre et al. proposed a nanotechnological tool to counter infection exploiting CaCO3 nanocarriers [15]. To date, however, except for quarantine and eradication, there are no ways to fight this infection.
An early disease detection method also able to characterize the biochemistry of the infection and of the plant response would certainly be an efficient weapon to counter Xylella's spread. Unfortunately, to date, it is possible to detect the presence of Xf in olive trees only by serological (ELISA) and molecular (PCR) methods [16,17]. Both methods are sensitive, but they cannot be directly used to characterize the complex phenomena occurring during infection.
Metabolomics is one of the most powerful tools for characterizing molecular profiles and their correlation with physiological or pathological conditions in biological or natural samples [18]. The main goal of this work is to propose an untargeted metabolomics approach combined with an appropriate analytical method to identify OQDS molecular markers able to discriminate between healthy and infected trees. This global metabolomics approach is potentially able to measure the levels of thousands of metabolites in a single analysis and is well suited to our purposes thanks to the comprehensive information it can provide on the composition of a metabolite pool [19]. Besides the possibility of characterizing the biological phenomena occurring during the infection, the outcomes of this type of investigation could also be used to identify a panel of biomarkers that could then be exploited by an analytical laboratory through targeted metabolomics approaches. These methods require a priori knowledge of the metabolites of interest but are cheaper, faster, and more accurate than untargeted ones [20].

Results
Principal component analysis was performed to identify the factors which account for the largest fraction of variance in the dataset; the results are shown in Figure 1. The score plot shows the distribution of the sample points in the PC1 vs. PC2 plane, which accounts for a significant fraction of the overall variance of the dataset (50.1%).
The results of the univariate statistical analysis on the full set of features extracted from the raw data substantially confirmed the marked difference between healthy and diseased samples. Out of the overall set of 3086 features, 1770 showed a significantly different concentration in healthy and diseased samples (p < 0.01 after Bonferroni correction). The two-dimensional map of the identified features is presented in Figure 2. In order to prioritize the annotation phase, we decided to focus on the features showing the highest contrast between the two sample classes, classifying as potential healthy-infected discrimination markers all the significant features whose median intensity changes from the first to the fourth quartile between the healthy and infected sample classes.
With this criterion, out of the 1770 significant features, 118 were classified as infection state markers and 37 as healthy state markers. Their position in the mz/rt plane is also highlighted in Figure 2; Tables S1 and S2 report, respectively, infected and healthy discriminating features.
HPLC-HRMS and tandem MS analysis allowed for the identification of key molecules that are expressed in a significantly different manner between the groups of healthy and sick leaves. These features could then be used as putative markers to discriminate the HP and OP samples.
Among all the discriminating features, we annotated 16 molecules by means of the HPLC-HRMS and tandem HRMS approach; two additional molecules reported to be expressed in Olea europaea leaves (O.e Marker) [21] were listed as well. The details of chemical identification are reported in Table 1, and Figure 3 shows some examples of tandem MS experiments reporting the fragmentation pathways of some annotated molecules. Their intensity profiles are presented as boxplots in Figures 4 and 5. Boxplots show the relative abundance of features in the different classes of samples (infected, healthy, and healthy desiccated), allowing for interpreting and inferring their distribution related to the sample state of health.
In Figure 5, the boxplot reports the features considered as O.e markers [21].

Discussion
To date, there are no literature reports of the use of an HPLC-HRMS untargeted method to discover early infection markers for Xylella fastidiosa-related OQDS. However, several untargeted metabolomics approaches have been developed for the characterization and valorization of different Olea europaea cultivars and origins [22,23] and their oils [24,25].

HPLC-HRMS Analytical Method Development
The proposed chromatographic method was developed with the aim of obtaining the best separation of analytes in the shortest possible time, so as to minimize the inter-run retention time deviation of the molecules and thus also the peak-picking errors. The HPLC method chosen consisted of a multi-step gradient which was empirically time-optimized during the experimental design to better separate compounds ranging from highly hydrophilic ones, such as organic acids and glycosylated metabolites, up to lipophilic phytosterols. MS acquisition was performed in the negative ion mode as it is less affected by background noise than the positive ion mode [26]. An example of TIC chromatograms is reported in Figure S1.

Sample Two-Dimensional Clustering
In Figure 1a, three different sample clusters are clearly visible. The tightest cluster is composed of QC samples (violet) and its low spread confirms the low variability of the analytical pipeline. As expected, infected samples (red dots) show a higher variability, but they are otherwise a well-separated group. The largest and most diverse cluster is composed of healthy and desiccated samples, which are only partially separated in this bivariate projection. Interestingly, desiccated samples are positioned far away from the diseased group, indicating that desiccation alone is not responsible for the metabolic separation between healthy and diseased samples. It is worth noting that healthy and diseased samples are separated along the direction of largest variance (PC1, 49%), which indicates that their difference is the most important factor affecting the overall variability of the dataset.
As discussed in the introduction, regionality might also play a role in sample clustering, acting as a potential confounder. For this purpose, the association of the samples to their specific origin is shown in Figure 1b. Healthy and desiccated samples from Puglia and Liguria are clearly distributed homogeneously inside the "healthy" cluster. These data suggest that the origin of samples (Liguria vs. Puglia) also does not play a prominent role in driving the observed differentiation between healthy and diseased samples. QC samples, obtained by pooling equal aliquots of every single sample, show a very condensed grouping in the PCA score plot. Their variance on both PCA axes is so narrow that we can conclude that there is no evident analytical drift that could affect our analytical method.
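The statement that a principal component "accounts for" a fraction of the variance can be made concrete with a small sketch. The function below is an illustration only, not the authors' R workflow: `first_pc` is a hypothetical helper that centers a samples-by-features matrix, builds its covariance matrix, and finds the first component by power iteration. A numerical library would normally be used for a matrix with thousands of features.

```python
import math

def first_pc(rows, n_iter=500):
    """Return (direction, explained_fraction) for the first principal component.

    rows: list of samples, each a list of feature intensities.
    Hypothetical helper for illustration; uses plain power iteration.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p)
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total = sum(cov[i][i] for i in range(p))  # total variance = trace
    if total == 0:
        raise ValueError("all features are constant; PCA is undefined")
    v = [1.0 / math.sqrt(p)] * p  # starting vector
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue / total

# Toy data: spread is four times larger along the first axis than the second
toy = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
direction, fraction = first_pc(toy)
```

In the toy matrix the first axis carries 80% of the total variance, which is the same kind of quantity reported for PC1 in the score plot.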

Features Annotation
Most of the molecules annotated belong to the class of plant secondary metabolites, such as 12-hydroxy jasmonate sulfate, a molecule discovered in Arabidopsis thaliana which derives from jasmonic acid, which has a signaling role mediating diverse developmental processes and plant defense responses. The potential finding of this molecule suggests that the infected plant builds a defense response against the stress condition [27].
Nodakenin is a coumarin compound first found in the root of Angelica gigas and, as a coumarin, its function is to discourage herbivores and insect species from eating the plant [28]. Together with nodakenin, another coumarin, decuroside, has been annotated by MS/MS. Physalin, first discovered in Physalis alkekengi, is a phytosteroidal molecule [29]; this molecule has pronounced antimicrobial and antibacterial effects [30] and its presence can be attributed to the plant's stress condition. Another annotated compound, marchantin A, a macrocyclic bis-benzyl ether isolated from Marchantia emarginata, is known for its antiprotozoal activity, suggesting that the plant may also defend itself from concurrent protozoal infections [31]. 6'-O-beta-D-glucopyranosyloleuropein is a molecule expressed in the Oleaceae plant family and belongs to the class of secoiridoid diglucosides [32], similar to the widely known secoiridoid glucoside oleuropein, which has also been annotated as a proof of consistency for this work since it is a highly species-specific marker for Olea europaea, together with rutin.
On the other hand, we annotated some features that are mostly abundant in healthy samples, such as the acridone alkaloid grandisine III isolated from Citrus grandis [37] and isomurrayazoline, a carbazole alkaloid found in Murraya koenigii [38]. Figure 4 shows abundance boxplots of the annotated features in each class. Most of the features identified appear to be up-regulated in infected samples, and it is interesting to notice the intensity change in each class; desiccation seems to play a role in increasing the signal of a given feature, probably by raising its concentration as the amount of water is reduced. This phenomenon is, however, not as intense as what happens in the infected class.

Feature Abundance Class-Related Variability
Regionality is a known factor in differentiating the metabolome of Olea europaea trees and leaves [21]; in this case, it seems to play a role for some features, such as FT2071 and FT1133, for which we can observe a different intensity distribution in healthy samples from Liguria and Puglia. These features, in addition to being reliable infection markers, can also be used as regionality markers, which strengthens this global metabolomic approach. Oleuropein (FT1775) shows a similar abundance distribution in healthy and desiccated classes except for some outliers which are, however, comparable with the two classes; in the case of rutin (FT2149), there is an appreciable origin-related variability in healthy samples (Figure 5).

Chemicals and Materials
Ammonium acetate, formic acid LC-MS grade, catechin, and galangin were purchased from Sigma Aldrich (Milan, Italy). Acetonitrile and methanol solvents were purchased from VWR Italia (Milan, Italy) and were used without any further purification treatment. All the aqueous solutions were prepared by using ultrapure water (Merck Millipore MilliQ™, Darmstadt, Germany).

Sampling Protocol of Plant Samples
Plant extracts considered in our investigation belonged to four distinct groups: "Puglia infected", "Liguria healthy", "Puglia healthy", and "Puglia desiccated". It was impossible to collect Liguria infected samples since, to date, there is no evidence of Xf infection in any Italian region other than Puglia. Puglia infected samples were harvested from plants infected by Xylella fastidiosa; their infection state was confirmed by an antibody ELISA kit test. In total, 17 healthy samples (HP), including 6 desiccated samples (DHP), and 15 samples with OQDS symptoms (OP) were collected directly from olive trees. The two groups of healthy samples were taken from healthy olive trees grown in Pornassio (Liguria region, Italy) and Salento (Puglia region, Italy). Finally, desiccated samples were collected from a desiccated branch of a healthy olive tree; Table 2 reports the state, origin, and the number of samples. This specific experimental design was chosen to compare the effect of the infection with potential confounding factors such as the farming locations (Puglia or Liguria regions, Italy) or the sample dryness, since infected samples are in desiccated form.

Quality Controls Setting
Quality controls (QCs) are used to assess and ensure that the analytical method performs appropriately and meets the criteria defined a priori. In our case, the QC sample was a pooled sample prepared by mixing a small aliquot (20 µL) of each extracted sample under analysis in a 10-mL tube. In this way, the pooled QC represents both the matrix and the metabolite composition of the samples under analysis.
The frequency of QC injections [39] was set according to earlier publications, in particular [40]. HP and OP samples were injected in randomized run order in the same batch. A QC injection was performed every 5 samples. Additionally, at the beginning of the analysis, 10 consecutive injections of QC samples were performed to prime the column.
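The injection scheme described above can be sketched as a short script. This is an illustrative reconstruction, not the authors' actual sequence file: the sample identifiers, the seed, and the function name `build_run_order` are hypothetical, and the sketch assumes one pooled-QC injection after each block of five sample injections.

```python
import random

def build_run_order(sample_ids, qc_every=5, n_priming_qc=10, seed=0):
    """Return an injection list: priming QCs, then randomized samples
    with one pooled-QC injection after every `qc_every` samples."""
    rng = random.Random(seed)          # fixed seed keeps the order reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)              # randomize the run order within the batch

    run_order = ["QC"] * n_priming_qc  # consecutive QC injections to prime the column
    for i, sample in enumerate(shuffled, start=1):
        run_order.append(sample)
        if i % qc_every == 0:
            run_order.append("QC")
    return run_order

# Hypothetical labels for the 17 healthy (HP) and 15 symptomatic (OP) samples
samples = [f"HP{i:02d}" for i in range(1, 18)] + [f"OP{i:02d}" for i in range(1, 16)]
order = build_run_order(samples)
```

Randomizing the sample order keeps any slow instrumental drift from aligning with the sample class, while the regularly spaced QC injections make such drift visible.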

Sample Harvesting and Preparation
An amount of 600 g of leaves was harvested from each tree; 10 of them were sampled and immediately shock-frozen with liquid nitrogen to block all metabolic processes and transferred to the laboratory for the extraction.
It is fundamental to quench the metabolism as soon as possible, and shock freezing using liquid nitrogen is the most common and efficient method to inactivate the metabolism and preserve all the metabolites.

Extraction Protocol of Plant Samples and QCs Sample Generation
Leaves were flash frozen in liquid nitrogen and then manually ground with a pestle and a mortar (pre-cooled and filled with liquid nitrogen). An amount of 300 mg of fine powder was extracted with 1.2 mL of 70% aqueous methanol in 1.5 mL Eppendorf tubes, sonicated for 15 min, and centrifuged at 13,680× g for 20 min. An amount of 500 µL of the supernatant was transferred to new Eppendorf tubes and the solvent was evaporated under a stream of nitrogen. The addition of 1 mL of water/acetonitrile 50:50 (v/v) to the dry extract was followed by sonication (15 min) and centrifugation (13,680× g, 20 min).
Before conducting the analyses, each sample was spiked with two different internal standards: catechin (289.0790 m/z negative ion mode), and galangin (269.0528 m/z negative ion mode). The use of two different internal standards (eluting from the analytical column at different retention times) is a fundamental condition for obtaining good results during the data alignment process. All injections were performed in the same batch.
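As a rough worked example, the volumes in the extraction protocol can be turned into a leaf-equivalent concentration. The calculation below is only a sketch: it assumes complete recovery at each step and no volume change on mixing, and it uses the 10 µL injection volume reported in the chromatographic parameters. The variable names are illustrative.

```python
# Quantities taken from the extraction protocol
powder_mg = 300.0          # leaf powder extracted
extraction_ml = 1.2        # 70% aqueous methanol
aliquot_ml = 0.5           # supernatant taken to dryness
reconstitution_ml = 1.0    # water/acetonitrile 50:50 (v/v)
injection_ul = 10.0        # injection volume

# Leaf-equivalent concentration in the first extract (about 250 mg/mL)
extract_conc = powder_mg / extraction_ml
# Leaf equivalent carried over in the dried aliquot (about 125 mg)
aliquot_mg = extract_conc * aliquot_ml
# Leaf-equivalent concentration after reconstitution (about 125 mg/mL)
final_conc = aliquot_mg / reconstitution_ml
# Leaf equivalent loaded on the column per injection (about 1.25 mg)
on_column_mg = final_conc * injection_ul / 1000.0
```

Under these assumptions each injection corresponds to roughly 1.25 mg of leaf powder.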

HPLC-ESI-LTQ Orbitrap Parameters
Analyses were performed on an HPLC-ESI HRMS system. The instrument setup for all the analyses consisted of a Dionex Ultimate 3000 HPLC system equipped with a solvent vacuum degasser (Thermo Scientific, Milan, Italy) coupled with a high-resolution mass spectrometer, LTQ-Orbitrap (Thermo Scientific, Milan, Italy), through an electrospray ionization (ESI) interface.
Chromatograms were recorded using Thermo Xcalibur 3.0 software (Rev. SP1 1160). A Gemini NX-C18 column from Phenomenex (Gemini NX-C18, 2.0 × 150 mm, 3.0 µm, 110 Å, Phenomenex, Bologna, Italy) was used to obtain chromatographic separation of the extracts. Mobile phase was ammonium acetate 0.005 M in ultrapure water (C) and acetonitrile (B). Gradient elution was set as follows: linear gradient from 5 to 20% B in 8 min, then 20% B held for 4 min, linear gradient from 20 to 30% B from 12 to 20 min, then 100% B reached in 2 min; in 1 min the percentage of acetonitrile returns to the initial condition (5%), which is held for 10 min (from 23 to 33 min) to ensure correct equilibration of the column. The flow rate was set at 200 µL/min. Global run time was 33 min. The column temperature was set at 25 °C. Sample injection volume was 10 µL.
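The gradient program can be written down as a table of breakpoints, which makes the timing easy to check. The sketch below transcribes the times and percentages from the description above; the names `GRADIENT` and `percent_b` are illustrative and do not correspond to any instrument-control software.

```python
# (time in min, % B) breakpoints transcribed from the gradient description
GRADIENT = [(0, 5), (8, 20), (12, 20), (20, 30), (22, 100), (23, 5), (33, 5)]

def percent_b(t_min, program=GRADIENT):
    """Linearly interpolate the acetonitrile percentage (% B) at time t_min."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time outside the programmed run")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

# The last breakpoint gives the global run time in minutes
run_time_min = GRADIENT[-1][0]
```

For instance, `percent_b(16)` falls halfway along the 20 to 30% B ramp, and the last breakpoint reproduces the stated 33 min run time.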
The HRMS system operated in the negative ionization mode. ESI tuning parameters were set as follows: capillary voltage was −13 V (ESI-); tube lens was set at −36 V (ESI-); source voltage was set to 3.5 kV (ESI-); sheath gas and aux gas flow rate were, respectively, 35 and 20 arbitrary units in both methods; spray current was set at 0.05 µA; capillary temperature was 270 °C during all the analyses. The mass spectrometer operated in full-scan mode in the range 100-1200 m/z (ESI-), with a resolution of 30,000 in FTMS mode. Tandem mass (MS/MS) experiments were automatically performed in the range 100-1200 m/z (ESI-), using the automatic dependent scan function. Collision energy was set at 30 (arbitrary units) for all the MS acquisitions. All spectra were acquired in centroid mode. Xcalibur 3.0 software (Rev. SP1 1160, Thermo Scientific, Bremen, Germany) was used both for acquisition and for elaboration and calculation.

Data Processing and Statistical Analysis
Raw LC-MS data files were converted into mzXML using the open source software ProteoWizard-MSConvert [41].
Data pre-processing was performed with XCMS [42]. Parameters applied for the processing were set as follows: centWave for feature detection (peakwidth = c(20, 80), prefilter = c(3, 50000), ppm = 10); retention time correction was performed with obiwarp (binSize = 0.6); peak matching across the samples (minFraction = 0.8, bw = 40). Before statistical analysis, missing peaks were imputed by applying the FillMissingPeaks algorithm available in XCMS. Statistical analyses were performed in R on the matrix constructed by extracting the maximum value of the intensity measured on each feature chromatographic peak (maxo). Log transformation was used to correct for the expected heteroskedasticity of metabolomics data.
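The reason a log transformation helps with heteroskedasticity can be seen in a toy example. The sketch below is an illustration in Python rather than the R code used in the study; the intensity values are invented, and base 10 is an assumption since the text does not state the logarithm base.

```python
import math
from statistics import stdev

def log_transform(intensities):
    """Base-10 log of strictly positive peak intensities (base is an assumption)."""
    return [math.log10(x) for x in intensities]

# Two invented features with the same 10% relative spread at different intensity levels
low = [90.0, 100.0, 110.0]
high = [9000.0, 10000.0, 11000.0]

# On the raw scale the spread grows with intensity (ratio of about 100)
raw_ratio = stdev(high) / stdev(low)
# On the log scale both features have the same spread (gap close to zero)
log_gap = abs(stdev(log_transform(high)) - stdev(log_transform(low)))
```

Noise that scales with signal becomes roughly constant after the transformation, so intense and weak features contribute on a comparable footing to the statistics.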
Due to the strong differences in the samples belonging to the different classes, no sample normalization was performed. To pinpoint the features showing the strongest contrast, a multi-step strategy was applied:

1. The Kruskal-Wallis test was applied to identify the features showing a significant difference between HP and OP samples (p < 0.01 after Bonferroni correction).
2. Significant features were then ranked on the basis of their median intensity in the two sample classes.
3. The potential list of infection biomarkers was selected: (a) by considering the features present in the top quartile of the ranked list for OP and in the lower quartile of the HP list; (b) by considering the features present in the top quartile of the ranked list for HP and in the lower quartile of the OP list.
Tables S1 and S2 report the discriminating features found together with their averaged m/z ratio and retention times.
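The selection strategy described above can be sketched in a few lines. This is an interpretation in Python, not the authors' R script, and it rests on several assumptions: the Bonferroni correction divides the significance level by the number of tested features, the quartiles are computed on the per-class medians of the significant features only, and all function names and toy data are hypothetical. For two groups the Kruskal-Wallis statistic follows a chi-square distribution with one degree of freedom, so its p-value can be obtained from `math.erfc`.

```python
import math
from statistics import median, quantiles

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1   # mean of positions i+1 .. j+1
        i = j + 1
    return ranks

def kruskal_two_groups(a, b):
    """Kruskal-Wallis H test for two groups; returns (H, p-value)."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    ra, rb = sum(ranks[:len(a)]), sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (ra ** 2 / len(a) + rb ** 2 / len(b)) - 3 * (n + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:                  # all values identical
        return 0.0, 1.0
    h = max(h, 0.0) / correction         # tie-corrected statistic
    return h, math.erfc(math.sqrt(h / 2))  # chi-square survival function, 1 d.o.f.

def select_markers(features, hp, op, alpha=0.01):
    """Return (infection_markers, healthy_markers) from log-intensity dicts."""
    threshold = alpha / len(features)    # Bonferroni (assumed over all features)
    sig = [f for f in features if kruskal_two_groups(hp[f], op[f])[1] < threshold]
    hp_med = {f: median(hp[f]) for f in sig}
    op_med = {f: median(op[f]) for f in sig}
    hp_q1, _, hp_q3 = quantiles(hp_med.values(), n=4)
    op_q1, _, op_q3 = quantiles(op_med.values(), n=4)
    infected = [f for f in sig if op_med[f] >= op_q3 and hp_med[f] <= hp_q1]
    healthy = [f for f in sig if hp_med[f] >= hp_q3 and op_med[f] <= op_q1]
    return infected, healthy

# Invented toy data: ten samples per class, base level per feature plus a small offset
levels = {"F1": (1, 9), "F2": (2, 8), "F3": (3, 7), "F4": (4, 6), "F5": (6, 4),
          "F6": (7, 3), "F7": (8, 2), "F8": (9, 1), "F9": (5, 5)}
hp = {f: [lv[0] + 0.01 * i for i in range(10)] for f, lv in levels.items()}
op = {f: [lv[1] + 0.01 * i for i in range(10)] for f, lv in levels.items()}
infected_markers, healthy_markers = select_markers(list(levels), hp, op)
```

In the toy data, feature F9 has identical values in both classes and is filtered out by the significance test; the remaining features are then split by the quartile criterion, so only those that are both very high in one class and very low in the other are retained.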

Metabolites Identification
Identification of single metabolites was achieved by MS/MS spectrum matching with available online databases such as MoNA and UNPD [43], and the MetFrag online tool [44]. When a tandem MS experiment was not available, we attempted molecular recognition from the accurate m/z signal, which was converted to a putative molecular formula by means of Xcalibur Qual browser 3.0 software (Rev. SP1 1160, Thermo Scientific, Bremen, Germany) or the Metlin [45,46] online database. Proposed formulas were ranked by their ∆-ppm (0 to 6 ppm max); other applied constraints were the allowed element counts: C: 0 to 50, O: 0 to 30, N: 0 to 10, H: 0 to 100, S: 0 to 5, P: 0 to 5; and RDB equivalent: −1 to 30 (for Xcalibur Qual browser only). All analytes detected by our untargeted approach are classified as per the MSI (Metabolomics Standards Initiative) guidelines [47].
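The formula constraints listed above can be illustrated with a small filter. This is a sketch under stated assumptions, not the Xcalibur or Metlin algorithm: the function names are hypothetical, the ring-plus-double-bond (RDB) value is computed on the neutral formula with phosphorus treated as trivalent, and the measured m/z is an invented reading. The theoretical value 539.1770 is the [M-H]- ion of oleuropein (C25H32O13).

```python
# Element count limits quoted in the text
ELEMENT_LIMITS = {"C": (0, 50), "O": (0, 30), "N": (0, 10),
                  "H": (0, 100), "S": (0, 5), "P": (0, 5)}

def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def rdb_equivalents(formula):
    """Ring-plus-double-bond equivalents of a neutral CHNOPS formula.
    O and S do not contribute; P is treated like N (assumption)."""
    c, h = formula.get("C", 0), formula.get("H", 0)
    n, p = formula.get("N", 0), formula.get("P", 0)
    return c - h / 2 + (n + p) / 2 + 1

def is_plausible(formula, measured_mz, theoretical_mz,
                 max_ppm=6.0, rdb_range=(-1.0, 30.0)):
    """Apply the element, mass-accuracy, and RDB constraints to one candidate."""
    if any(element not in ELEMENT_LIMITS for element in formula):
        return False
    for element, (low, high) in ELEMENT_LIMITS.items():
        if not low <= formula.get(element, 0) <= high:
            return False
    if abs(ppm_error(measured_mz, theoretical_mz)) > max_ppm:
        return False
    return rdb_range[0] <= rdb_equivalents(formula) <= rdb_range[1]

oleuropein = {"C": 25, "H": 32, "O": 13}
theoretical = 539.1770   # [M-H]- of C25H32O13
measured = 539.1781      # invented reading, about 2 ppm away
accepted = is_plausible(oleuropein, measured, theoretical)
```

A reading about 2 ppm from the theoretical value passes the 6 ppm window, whereas a reading 0.01 m/z units away at this mass (about 18.5 ppm) would be rejected.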

Conclusions
An untargeted metabolomics approach was applied to olive leaf samples with the aim of understanding the main differences between healthy plants and plants with OQDS-like symptoms. To this end, we followed an extraction procedure with aqueous methanol and developed a simple, accurate, high-resolution mass-based analysis method that could detect the broadest range of metabolites.
Results of multivariate analysis show a clustering of two pools of samples (HP vs. OP) based on two principal components (PC1 and PC2). Notably, the addition of Ligurian samples with different regionality factors and of desiccated samples proved the robustness of the method which is still capable of clustering healthy and infected samples on the first PCA axis. In addition, eighteen different organic compounds, among which 14 were highly expressed in the OP group, were annotated.
These results should pave the way for a targeted and feasible analytical approach aimed at the detection of early infection state-related molecules for all the research and routine laboratories that cannot afford a global metabolomics instrumentation setup. However, a truly comprehensive analysis of the plant metabolite pool is not easily feasible due to the large number of primary and secondary metabolites in any given plant species. Each analytical technology has advantages and limitations, and no single one can cover the whole metabolome due to the chemical diversity of metabolites and their broad dynamic range in cellular abundance [48]. Consequently, different extraction techniques and combinations of analytical methods should be employed in attempts to achieve adequate metabolite coverage [49].
In the future, the primary goal is to obtain a larger number of features by implementing diverse extraction methods and by merging the MS results with those of other techniques more oriented to identification and characterization, such as NMR spectroscopy.
More in-depth and more accurate knowledge of the olive metabolome and its infection-related differences can be gained by repeating the analysis on olive trees sampled in different years, to verify whether the data obtained from samples belonging to a single vintage remain unchanged.

Conflicts of Interest:
All authors, in the interests of transparency, declare that they do not have competing financial and/or non-financial interests in relation to the present work.