An Efficient Workflow for Quality Control Marker Screening and Metabolite Discovery in Dietary Herbs by LC-Orbitrap-MS/MS and Chemometric Methods: A Case Study of Chrysanthemum Flowers
Foods 2024, 13, 1008

LC-MS is widely utilized in identifying and tracing plant-derived food varieties but quality control markers screening and accurate identification remain challenging. The adulteration and confusion of Chrysanthemum flowers highlight the need for robust quality control markers. This study established an efficient workflow by integrating UHPLC-Orbitrap-MS/MS with Compound Discoverer and chemometrics. This workflow enabled the systematic screening of 21 markers from 10,540 molecular features, which effectively discriminated Chrysanthemum flowers of different species and cultivars. The workflow incorporated targeted and untargeted methods by employing diagnostic product ions, fragmentation patterns, mzCloud, mzVault, and in-house databases to identify 206 compounds in the flowers, including 17 screened markers. This approach improved identification accuracy by reducing false positives, eliminating in-source fragmentation interference, and incorporating partial verification utilizing our established compound bank. Practically, this workflow can be instrumental in quality control, geolocation determination, and varietal tracing of Chrysanthemum flowers, offering prospective use in other plant-derived foods.


Introduction
The rapidity and sensitivity of liquid chromatography-tandem mass spectrometry (LC-MS/MS) have made it a significant tool for food control [1], and it has attracted global attention for its application in habitat discrimination and authentication [2,3]. However, although a large number of components can be detected simultaneously using this technique, the accuracy of metabolite identification and the screening of quality control (QC) markers remain bottlenecks [4]. Extracting useful and comprehensive insights from the wealth of information provided by LC-MS remains a challenge [4]. Specifically, LC-MS cannot distinguish isomers, suffers from in-source fragmentation (ISF) interference, and relies on inaccurate online database matching of precursor ions [5,6]. Although some product ion spectrum libraries can improve identification reliability, identifying compounds with few and less abundant fragments is still problematic [7]. Moreover, the identification accuracy of LC-MS is often overlooked, resulting in unreliable subsequent analysis. Thus, an efficient and dependable strategy based on in-house and online databases is essential for QC marker screening and metabolite discovery.
Chrysanthemum flowers, including Chrysanthemum indicum (Ye Juhua [YJ] in Chinese) and Chrysanthemum morifolium (Juhua [JH]), are widely used in China as teas, food supplements, herbal medicines, and cosmetic additives, owing to their unique flavor and extensive health benefits [8,9]. They are rich in chlorogenic acid and flavonoids, which have diverse pharmacological activities, including antioxidation, antimicrobial properties, cardiovascular protection, and liver protection [10]. Additionally, Chrysanthemum flowers are abundant sources of anthocyanin, a natural edible pigment offering both nutritional value and pharmacological effects [11,12]. JH has a large diversity of cultivars with regional features [10]. The six most commonly consumed cultivars are Boju (BJ), Chuju (CJ), Huaiju (HuJ), Hangju (HaJ), Gongju (GJ), and Jinsihuangju (JSHJ) (Figure S1). Chemical variations resulting from the species, cultivars, climate, or soil conditions give rise to the distinct qualities, pharmacological effects, medicinal functions, and applications of Chrysanthemum flowers [13]. For example, according to the Chinese Pharmacopoeia, JH and YJ demonstrate divergent medicinal properties and applications, whereas JSHJ is not authorized for medicinal use [14]. Nevertheless, Chrysanthemum flowers are highly susceptible to adulteration because their appearances, colors, and aromas are difficult to tell apart. Hence, the identification of QC markers that can differentiate between various species and cultivars of Chrysanthemum flowers is of utmost importance for preventing adulteration, guaranteeing quality and safety, and facilitating government regulation.
Our previous study revealed the variations in major components between YJ and JH by HPLC [13]. However, the differences among the cultivars were still not well exposed because of the limited sensitivity of HPLC. Previous studies have also applied LC-MS combined with chemometrics to discriminate between the geographic regions or cultivars of the flowers, but these studies were hampered by insufficient sample sizes, limited compound identification, unexplored chemical diversity, and inefficient QC marker screening [15][16][17].
In this study, to address these issues, an efficient and reliable workflow utilizing UHPLC-Orbitrap-MS/MS and chemometric techniques was developed for screening the characteristic markers of the 84 batches of Chrysanthemum flowers from different species and cultivars (Figure 1). From 10,540 features, 1419 credible components were screened and 21 QC markers were revealed to successfully discriminate between the species and cultivars through principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), hierarchical clustering (HC), and analysis of variance (ANOVA). An identification procedure depending on fragmentation patterns, diagnostic product ions (DPIs), an in-house database, and product ion spectrum libraries was devised to discover 206 compounds, which effectively improved the accuracy by reducing false positives and avoiding ISF interference (Figure 1). Over the past decade, our laboratory has actively researched the chemical foundations of ethnic medicine and traditional Chinese medicine, specifically focusing on substances like Schisandraceae plants, floral medicinal materials, and plants used for rheumatoid arthritis. Over 1000 compounds, including many with novel structures, have been isolated and identified. These compounds were used as the standards to validate the identification results, further confirming the strategy's reliability.

Chemicals and Materials
HPLC-grade methanol and formic acid were provided by Merck KGaA (Darmstadt, Germany) and Anpel Laboratory Technologies Inc. (Shanghai, China), respectively. All aqueous solutions were prepared with Watsons Water (Guangzhou, China). Twenty reference standards (Table S1) for fragmentation pattern analysis were obtained, as we described previously [12]. The compound bank we established by systematic isolation of different herbs provided the other 15 compounds for identification validation (Table S1). Chrysanthemum flowers, including 10 batches of BJ, 9 batches of CJ, 12 batches of GJ, 13 batches of HaJ, 12 batches of HuJ, 12 batches of JSHJ, and 12 batches of YJ, were gathered from several Chinese provinces (Table 1). All samples were air-dried, crushed, and stored under vacuum at 4 °C.

Preparation of the Sample Solution
Sample powder (0.25 g) was transferred to a 50-mL flask and extracted using ultrasonication for 30 min with 70% methanol solution (25 mL). Following a 10-min centrifugation at 12,000 rpm, the extract was filtered through a 0.25-µm microporous membrane to obtain the sample solution. The QC sample was created by combining an equal volume of each sample solution. Prior to analysis, the solutions were kept at 4 °C in a dark environment. A QC sample was analyzed at the beginning of the sequence and after every eight samples to evaluate the instrument's stability.
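The QC injection schedule described above can be sketched as follows. This is only an illustration of the stated rule (one QC at the start, then one after every eight samples); the function name `build_sequence` and the sample labels are hypothetical and not taken from the study.

```python
def build_sequence(samples, qc_label="QC", interval=8):
    """Return an injection order with a QC first and after every `interval` samples."""
    sequence = [qc_label]  # QC injection at the beginning of the sequence
    for position, sample in enumerate(samples, start=1):
        sequence.append(sample)
        if position % interval == 0:  # insert a QC after every block of samples
            sequence.append(qc_label)
    return sequence


if __name__ == "__main__":
    # Hypothetical labels for 16 sample solutions
    order = build_sequence([f"S{i}" for i in range(1, 17)])
    print(order)
```

Tight clustering of these repeated QC injections in a PCA score plot is what the authors later use as evidence of instrument stability.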

LC-Orbitrap-MS/MS Analysis
LC-MS analysis was conducted on a Vanquish Flex Binary UHPLC and an Orbitrap Exploris 120 mass spectrometer (Thermo Scientific, Waltham, MA, USA). Data acquisition was carried out using Xcalibur 4.0. Data preprocessing and analysis were performed using Freestyle 1.8 SP1 and Compound Discoverer 3.3. A Thermo Scientific Hypersil GOLD™ Aq-C18 column (20 × 2.1 mm, 1.9 µm) was used for the analysis. Methanol (A) and 0.1% formic acid solution (B) made up the mobile phase. The analytes were separated according to the following gradient elution program: 0-25 min, 20-42% A; 25-45 min, 40-95% A; and 45-50 min, 95% A. An injection volume of 2 µL and a column temperature of 25 °C were employed. The flow rate was 0.3 mL/min.
The instrument was calibrated as instructed by the manufacturer before analysis. The electrospray ionization (ESI) source parameters were optimized as follows: positive ion spray voltage, 3.5 kV; negative ion spray voltage, 3.0 kV; sheath gas flow rate, 50 Arb; auxiliary gas flow rate, 10 Arb; sweep gas flow rate, 0 Arb; ion transfer tube temperature, 325 °C; and vaporizer temperature, 350 °C. Data acquisition included a full scan followed by data-dependent MS/MS data collection (Full-MS/ddMS/MS). The Orbitrap resolution for the full scan was set at 60,000 FWHM with a scan range of 100-1000 Da and an RF lens of 70%. The stepped HCD collision energies for the ddMS/MS scan were 5, 10, and 20 eV with a resolution of 15,000 FWHM.

Data Pretreatment and Processing
The raw data of all samples were imported into Compound Discoverer 3.3, where peak alignment, background subtraction, and mass and retention time (RT) calibration were performed [18]. For the subsequent differential analysis, the data of each cultivar or species were assigned to one group. Statistical analysis was preliminarily performed on every pair of groups after QC correction. Compounds were detected, grouped, and searched in mzCloud and mzVault, and the mass defect was calculated using the processing workflow for traditional Chinese medicines and natural products. A mass tolerance of 5 ppm and a minimum peak intensity of 2 × 10^5 were used for compound detection. Compounds were grouped based on a mass tolerance of 5 ppm, an RT tolerance of 0.1 min, and a peak rating of ≥6 in at least one data file. The mzCloud and mzVault search properties were adjusted according to the manufacturer's instructions. In brief, a mass tolerance of 10 ppm was employed for precursor and fragment ions. The collision energy tolerance was set at ±20%, with a match factor threshold of 60% and a maximum of 10 matching results for each compound.
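To make the grouping tolerances concrete, the following minimal sketch checks whether two features agree within 5 ppm in m/z and 0.1 min in RT. It is not the Compound Discoverer algorithm; the function names and the RT values in the example are assumptions chosen for illustration.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of observed_mz relative to reference_mz, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def within_grouping_tolerance(feature_a, feature_b, mass_tol_ppm=5.0, rt_tol_min=0.1):
    """Each feature is an (m/z, RT in min) tuple; True if both tolerances are met."""
    mz_a, rt_a = feature_a
    mz_b, rt_b = feature_b
    mass_ok = abs(ppm_error(mz_a, mz_b)) <= mass_tol_ppm
    rt_ok = abs(rt_a - rt_b) <= rt_tol_min
    return mass_ok and rt_ok


if __name__ == "__main__":
    # m/z values appear in the text; the RTs are hypothetical
    print(within_grouping_tolerance((539.11578, 20.00), (539.11536, 20.05)))
    print(within_grouping_tolerance((539.11578, 20.00), (539.11578, 20.50)))
```

Note that mass agreement alone cannot separate isomers such as the dicaffeoylquinic acids, which is why the RT condition matters.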

Quality Control Marker Screening and Compound Discovery
Credible compounds were selected from all features using Compound Discoverer with a peak rating threshold and exclusion of compounds without predicted chemical formulas or with rare heteroatoms (Figure 1). Possible differential components were revealed by ANOVA, followed by the Tukey post-hoc test. The key markers were indicated using PCA and PLS-DA with Umetrics SIMCA 14.1, followed by HC with MetaboAnalyst (https://www.metaboanalyst.ca/, accessed on 25 March 2024).
An in-house database was established by summarizing chemical structures, molecular formulas, molecular weights, and CAS numbers of all compounds reported in Chrysanthemum flowers. Databases such as PubMed, SciFinder, Google Scholar, and CNKI were searched to compile this information. Three methods were employed to more accurately and specifically identify the compounds (Figure 1). Method A involved summarizing the fragmentation pattern of the standards and clarifying the DPIs and characteristic fragment ions of each type of compound. The DPIs in MS/MS spectra of the QC sample were retrieved using Freestyle to discover the same type of compound, and their corresponding precursor ions and molecular formulas were then inferred and calculated. The possible structures were determined based on the molecular formulas and fragmentation pattern. Method B involved retrieving precursor ions of each compound in the in-house database and identifying them directionally if they complied with the fragmentation pattern. Method C was based on using Compound Discoverer to match all features' data with the product ion spectrum libraries, including mzCloud and mzVault, and more reliable identification was performed after screening and exclusion. If the previously established compound bank contained the identified compounds, they were used as reference standards and analyzed using the same method to validate identification results by comparing RTs and MS/MS spectra.

Optimization of the Method
Sample analysis was performed in positive ion mode due to its ability to detect more compounds compared with negative ion mode (Figure S2). The use of methanol and 0.1% formic acid in water as the mobile phase improved the shape of the compound peaks. An injection volume of 2 µL ensured response abundance and prevented peak tailing caused by overload. A higher extraction efficiency was obtained when using 70% methanol in water than using 50%, 90%, or 100% methanol. A better compound separation efficiency was achieved at 25 °C than at 15 °C, 30 °C, and 35 °C.

Quality Control Marker Screening
After conducting an analysis using Compound Discoverer, 10,540 molecular features were detected. PCA was performed on all feature data imported into SIMCA-P. The instrument displayed sufficient stability during the sample analysis process, as evidenced by the tightly clustered QC samples (Figure 2A). Chrysanthemum flowers of different species or cultivars tended to separate from each other (Figure 2B), indicating significant differences in their chemical compositions. The LC-Orbitrap-MS analysis method established in this study is superior to the statistical analysis model previously developed to distinguish between Chrysanthemum flowers using HPLC, as confirmed by PLS-DA [13].
The peak rating is a metric used to assess the quality of peaks, calculated based on factors such as peak shape, baseline noise, and signal-to-noise ratio, resulting in a score between 0 and 10. A higher peak rating indicates better quality, reliability, and accuracy of the peak. Therefore, to eliminate interference from the matrix or baseline noise, we filtered out 2474 components from 10,540 features by setting a peak rating threshold of 6. As the components in Chrysanthemum flowers mainly consist of C, H, O, and N [10], further exclusion of components with other elements or unpredicted molecular formulas resulted in 1419 features with high reliability. Following ANOVA with the Tukey post-hoc test, a total of 1316 features with an adjusted p-value of <0.0001 remained.
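The credibility filter described above (peak rating threshold, predicted formula required, only C, H, O, and N allowed) can be sketched as a simple predicate. This is an illustrative reimplementation, not the Compound Discoverer filter itself; the dictionary keys `formula` and `peak_rating` are hypothetical field names.

```python
import re

ALLOWED_ELEMENTS = {"C", "H", "O", "N"}


def elements_in(formula):
    """Return the set of element symbols in a formula such as 'C25H24O12'."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def is_credible(feature, min_peak_rating=6.0):
    """Keep a feature only if it has a predicted CHON-only formula and a good peak."""
    formula = feature.get("formula")
    if not formula:  # no predicted molecular formula
        return False
    if feature["peak_rating"] < min_peak_rating:
        return False
    return elements_in(formula) <= ALLOWED_ELEMENTS


if __name__ == "__main__":
    # Hypothetical feature records
    features = [
        {"formula": "C25H24O12", "peak_rating": 7.2},
        {"formula": "C10H12ClNO", "peak_rating": 8.0},
        {"formula": None, "peak_rating": 9.1},
    ]
    print([is_credible(f) for f in features])
```

The ANOVA and Tukey post-hoc step that follows in the text would then be run only on the features that pass this predicate.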
PLS-DA with these 1316 features indicated that Chrysanthemum flowers of different species or cultivars could still be well discriminated (Figure 2C). Among the 1316 features, 48 had a VIP value of ≥2.0 (Figure S3). Out of these 48, the top 25 features identified by ANOVA could accurately distinguish between Chrysanthemum flowers through HC analysis (Figure 3), which was further validated by PLS-DA (Figure 2D). Detailed information on the 25 features (M1-M25) is shown in Table 2. The peak areas of some features exhibit significant differences not only between groups but also within individual samples of each group, which suggests that environmental factors, such as soil type and climate, significantly impact the chemical composition of Chrysanthemum flowers (Figure S4). In the subsequent compound identification study, it was discovered that M9 and M11 were an ISF product and another adduct form of M10, respectively. Both M14 and M16 were ISF products of M15. Therefore, 21 compounds served as QC markers and 17 of them were identified through later identification. HC and PLS-DA showed that these 21 compounds could also accurately distinguish between Chrysanthemum flowers (Figure S5).
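The marker shortlist and the removal of redundant features can be sketched as below. The VIP cutoff, the top-25 selection, and the redundancy of M9, M11, M14, and M16 come from the text; the function names, the record fields `vip` and `p_adj`, and the ranking by adjusted p-value are assumptions made for illustration.

```python
def shortlist(features, vip_min=2.0, top_n=25):
    """Keep features with VIP >= vip_min, then take the top_n by smallest adjusted p."""
    kept = [f for f in features if f["vip"] >= vip_min]
    kept.sort(key=lambda f: f["p_adj"])
    return kept[:top_n]


def drop_redundant(feature_ids, redundant_to_parent):
    """Remove features that are ISF products or extra adducts of another feature."""
    return [fid for fid in feature_ids if fid not in redundant_to_parent]


# Redundant features reported in the text, mapped to the compound they derive from
REDUNDANT = {"M9": "M10", "M11": "M10", "M14": "M15", "M16": "M15"}

top25 = [f"M{i}" for i in range(1, 26)]
markers = drop_redundant(top25, REDUNDANT)

if __name__ == "__main__":
    print(len(markers))  # 25 features minus 4 redundant ones leaves 21 markers
```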

Identification of Caffeoylquinic Acid
The compounds were identified according to the workflow shown in Figure 1. Eight caffeoylquinic acids (Table S1), along with quinic acid (1), were identified using the reference standards, and their MS/MS spectra exhibited high similarity. For example, 3,5-di-O-caffeoylquinic acid (56) displayed an [M+Na]+ signal at m/z 539.11578. It generated ions at m/z 203.03307 and 185.02072 after the loss of a chlorogenic acid moiety and subsequent elimination of an H2O molecule. Moreover, the cleavage between the caffeoyl group and the quinic acid moiety formed ions at m/z 163.03903 and 377.08405. These ions underwent further loss of H2O, CO, and the caffeoyl group (CA) to generate a series of characteristic peaks, as depicted in Figure 4.
A search using Freestyle 1.8 revealed that compounds 53 and 120 also produced the characteristic caffeoyl DPI (m/z 169.03897), indicating that they were also caffeoylquinic acids. The precursor ions of both compounds, [M+Na]+ at m/z 539.11450 and 539.11536, indicated their molecular formulas as C25H24O12. Thus, these two compounds were tentatively identified as dicaffeoylquinic acid isomers. Similarly, 5-O-caffeoylquinic acid (5) and 1-O-caffeoylquinic acid (28) were identified, with their elution order assigned from calculated log(P) (Clog(P)) values, with higher Clog(P) values indicating longer RTs [19].
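As a worked check of the formula assignment, the theoretical [M+Na]+ m/z of C25H24O12 can be computed from monoisotopic masses and compared with the observed precursor ions. This is a minimal sketch; the helper names are hypothetical, and the 5 ppm window is borrowed from the compound detection tolerance stated in the methods.

```python
import re

# Monoisotopic masses (u) of the relevant isotopes and the electron
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
SODIUM = 22.9897692809
ELECTRON = 0.00054857991


def monoisotopic_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C25H24O12'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def sodium_adduct_mz(formula):
    """Theoretical m/z of the singly charged [M+Na]+ ion."""
    return monoisotopic_mass(formula) + SODIUM - ELECTRON


def ppm_error(observed, theoretical):
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    theoretical = sodium_adduct_mz("C25H24O12")
    for observed in (539.11578, 539.11450, 539.11536):  # values reported in the text
        print(observed, round(ppm_error(observed, theoretical), 2))
```

All three reported precursor ions fall within 5 ppm of the theoretical value, which is consistent with the three compounds sharing the formula C25H24O12 and therefore being isomers.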

The in-house database consisted of 308 compounds. A preliminary screening was conducted to determine the presence of caffeoylquinic acids in the QC sample by searching for the [M+Na]+ or [M+H]+ ions of these compounds using Freestyle. Subsequently, the decision on whether a compound was identified as such was based on whether its fragment ions matched the pattern described above. A total of four caffeoylquinic acids (6, 7, 47, and 93), as well as caffeic acid (25), were identified through this approach.
After importing all the data into Compound Discoverer for data preprocessing and compound identification, 10,540 features were detected. The parameters mzCloud best match and mzVault best match refer to the comparison of the sample's mass spectrum with those in the mzCloud or mzVault database to find the most similar compound structure. Best match values range from 0 to 100, with higher values indicating a higher degree of similarity. The "mzCloud best confidence" is a score calculated for each candidate compound structure, with a higher score indicating a greater confidence in the match between the candidate structure and the actual compound. To reduce false positives, the relatively reliable identifications were obtained by filtering with mzCloud best match score ≥90 and its Confidence ≥60 or mzVault best match score ≥90. In the process of utilizing mzVault for compound identification, no parameter was accessible to evaluate the credibility. The presence of compounds with inadequate fragment ions or weak responses may potentially lead to false positive outcomes accompanied by high matching scores. To ensure the accuracy of the findings, any results with fewer than three matching fragment ions corresponding to compounds in the database were omitted. Finally, one caffeoylquinic acid (134) was identified after excluding the previously identified results.
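The library-match acceptance rule above can be written as a small predicate. This is a sketch under stated assumptions: the argument names are hypothetical rather than Compound Discoverer column names, and the minimum of three matching fragment ions is applied here to both libraries, which is one reading of the text.

```python
def is_reliable_match(mzcloud_score=None, mzcloud_confidence=None,
                      mzvault_score=None, matched_fragments=0):
    """Accept a library hit if it passes the score thresholds and has >= 3 matched fragments."""
    if matched_fragments < 3:  # too few matching fragment ions to trust the hit
        return False
    mzcloud_ok = (
        mzcloud_score is not None
        and mzcloud_confidence is not None
        and mzcloud_score >= 90
        and mzcloud_confidence >= 60
    )
    mzvault_ok = mzvault_score is not None and mzvault_score >= 90
    return mzcloud_ok or mzvault_ok


if __name__ == "__main__":
    # Hypothetical hits
    print(is_reliable_match(mzcloud_score=95, mzcloud_confidence=72, matched_fragments=5))
    print(is_reliable_match(mzvault_score=97, matched_fragments=2))
```

The second example shows why the fragment-count condition matters: a high score built on only two matching ions is rejected.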

Identification of Flavonoids
Chrysanthemum flowers are rich sources of flavonoids, which are primarily flavonoid glycosides. This study identified 12 flavonoids by comparing their RTs and MS spectra with the standards. Their proposed fragmentation patterns are depicted in Figure 5. The glycosidic bond of flavonoids was found to be susceptible to cleavage, leading to the formation of aglycones. Retro-Diels-Alder (RDA) fragmentation consistently occurred in the C-ring of the aglycone, whereas distinct fragmentation patterns were observed for different types of aglycones [20,21]. Thus, flavonoids of the same type were identified using the aglycone precursor ion as a DPI. A comparison of their MS/MS spectra with the corresponding type of flavonoids was conducted to further confirm their classification. The molecular formula and fragments were then utilized to deduce their possible functional groups, and their structures were ultimately identified. For example, by searching for the precursor ion of luteolin aglycone (m/z 287.05501) in the QC sample with Freestyle, five corresponding compounds (32, 44, 45, 78, and 153) were retrieved. Further analysis of their MS/MS spectra revealed that they were indeed luteolin-type flavonoid glycosides, which were subsequently identified when taking into account their molecular formulas. Furthermore, four additional luteolin-type flavonoids (57, 61, 103, and 122) were identified through the in-house database, as well as mzCloud and mzVault, using the method described above. Similarly, apigenin-, luteolin-, kaempferol-, and quercetin-type flavonoids were identified. Moreover, other types of flavonoids, such as luteolin, hesperetin, and naringenin, were also identified through this process, and their fragmentation patterns can be determined by comparing the reported data in the literature with the actual MS/MS spectra. Overall, the workflow established in this study enabled the identification of a total of 109 flavonoids from Chrysanthemum flowers (Table S1).
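The DPI search can be illustrated with a short sketch that scans MS/MS spectra for the luteolin aglycone ion. The search in the study was done in Freestyle; the code below is only an analogous illustration. The spectrum layout (lists of m/z and intensity pairs), the compound labels, the peak values, and the 10 ppm tolerance are all assumptions.

```python
LUTEOLIN_AGLYCONE_MZ = 287.05501  # DPI value given in the text


def contains_dpi(spectrum, dpi_mz, tol_ppm=10.0):
    """True if any fragment in the (m/z, intensity) list matches dpi_mz within tol_ppm."""
    return any(abs(mz - dpi_mz) / dpi_mz * 1e6 <= tol_ppm for mz, _ in spectrum)


def find_dpi_carriers(spectra, dpi_mz, tol_ppm=10.0):
    """Return the labels of all MS/MS spectra that contain the diagnostic product ion."""
    return [label for label, spectrum in spectra.items()
            if contains_dpi(spectrum, dpi_mz, tol_ppm)]


if __name__ == "__main__":
    # Hypothetical MS/MS spectra keyed by an arbitrary feature label
    spectra = {
        "feature_1": [(153.0182, 120.0), (287.0549, 1000.0)],
        "feature_2": [(271.0601, 800.0), (153.0183, 90.0)],
    }
    print(find_dpi_carriers(spectra, LUTEOLIN_AGLYCONE_MZ))
```

A hit from such a search only suggests the compound class; as the text notes, the full MS/MS spectrum and molecular formula are still needed to assign a structure.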

Identification of Other Compounds
Chrysanthemum flowers contain compounds such as terpenes, sesquiterpenes, lignans, and amino acids, in addition to caffeoylquinic acid and flavonoids. However, it is challenging to identify a series of compounds through DPI analysis due to the absence of regularity in the chemical structure or distinctiveness in the fragmentation pattern of these compounds. As a result, we primarily relied on the in-house database in combination with the literature-reported MS data or the mzCloud and mzVault libraries for the identification of these compounds. Given that Chrysanthemum flowers are primarily composed of C, H, O, and N, compounds containing other elements were excluded during the identification of other compounds using mzCloud and mzVault to prevent false positive results. In addition, the complex structures and numerous isomers of terpenes present in the flowers make it challenging to distinguish and identify them based on their highly similar MS spectra. Therefore, when mzCloud and mzVault identified more than two isomers for terpenes, determining their specific structures was difficult and the identification results were deemed unreliable and hence excluded. Finally, 79 compounds other than flavonoids and caffeoylquinic acids were identified using the in-house database, mzCloud, and mzVault (Table S1).

In-Source Fragmentation and Partial Verification of the Identification Results
Precursor ions are commonly generated via compound protonation or deprotonation. Subsequently, these ions acquire sufficient energy from high-energy collisions in the collision cell, leading to their fragmentation into smaller ions, a phenomenon referred to as collision-induced dissociation (CID) [22]. However, in certain cases, sample molecules may undergo fragmentation in the ionization source, thereby producing fragment ions that exhibit the same RT as the target molecules, which is known as ISF [23]. Although ISF has potential benefits, it can pose several challenges, such as decreased detection sensitivity, misannotation of non-target compounds, and the possibility of generating false negative or false positive results [24,25].
In the current investigation, the main problem stemming from ISF was the occurrence of false positives, in which the ISF products of a compound were liable to be misidentified as independent compounds. Unfortunately, the in-house database, mzCloud, mzVault, and Compound Discoverer are unable to distinguish ISF products from genuine compounds. The most effective approach to determine whether a molecular feature is an ISF product is to manually check whether a compound at the same RT can generate a fragment ion corresponding to the precursor ion of the molecular feature, which can be determined from the compound's MS/MS spectrum. For example, the ISF of acacetin-7-O-β-D-rutinoside caused the loss of a rhamnosyl moiety, generating a molecular feature that can be mistaken for an isomer of acacetin-7-O-β-D-glucopyranoside. The MS/MS analysis of acacetin-7-O-β-D-rutinoside indeed demonstrated its susceptibility to losing a rhamnose molecule. Furthermore, ISF leading to the production of aglycones is a typical occurrence in flavonoid glycosides, whereas di-substituted caffeoylquinic acids tend to lose a caffeoyl group, resulting in the formation of mono-substituted caffeoylquinic acids, or lose an H2O molecule. In the current investigation, we utilized the aforementioned methods to exclude 27 ISF products from the identified compounds (Table S2).
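The manual ISF check described above can be sketched as a pairwise comparison: a feature is flagged when a heavier co-eluting feature has an MS/MS fragment matching the lighter feature's precursor m/z. The authors performed this check by inspection; the code is only an illustration. The function name, record fields, tolerances (0.1 min, 10 ppm), RTs, and the approximate [M+H]+ values used for the acacetin glycosides are all assumptions.

```python
def find_isf_candidates(features, rt_tol_min=0.1, mass_tol_ppm=10.0):
    """Map each suspected ISF product to the co-eluting feature that can produce it."""
    flagged = {}
    for child in features:
        for parent in features:
            if parent is child or parent["mz"] <= child["mz"]:
                continue  # an ISF product must be lighter than its source ion
            if abs(parent["rt"] - child["rt"]) > rt_tol_min:
                continue  # ISF products share the RT of the intact compound
            matches = any(
                abs(fragment - child["mz"]) / child["mz"] * 1e6 <= mass_tol_ppm
                for fragment in parent["fragments"]
            )
            if matches:
                flagged[child["id"]] = parent["id"]
                break
    return flagged


if __name__ == "__main__":
    # Illustrative records; m/z values are approximate and RTs are hypothetical
    features = [
        {"id": "rutinoside", "mz": 593.1865, "rt": 30.20, "fragments": [447.1286, 285.0757]},
        {"id": "putative_glucoside", "mz": 447.1287, "rt": 30.21, "fragments": [285.0757]},
        {"id": "other_glucoside", "mz": 447.1287, "rt": 25.00, "fragments": [285.0757]},
    ]
    print(find_isf_candidates(features))
```

In this example only the feature co-eluting with the rutinoside is flagged, while the isobaric feature at a different RT is kept as a possible genuine glucoside.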
Our research group has extensive experience in studying natural products and has isolated nearly 1000 compounds from over 20 plants with a focus on food sources, forming

Figure 1. Workflow of quality control marker screening and compound identification.

Figure 3. Heatmap of HC with the top 25 differential features.

Figure 5. Proposed fragmentation patterns of the 12 flavonoids identified using reference standards.

Table 1. Information on the samples used in this study.
* Genuine producing area of the corresponding species or cultivars.

Table 2. Detailed information on the top 25 molecular features.