An HPLC-DAD Method to Quantify Flavonoids in Sonchus arvensis and Able to Classify the Plant Parts and Their Geographical Area through Principal Component Analysis

A simple and efficient method has been developed for the simultaneous determination of eight flavonoids (orientin, hyperoside, rutin, myricetin, luteolin, quercetin, kaempferol, and apigenin) in Sonchus arvensis by high-performance liquid chromatography diode array detector (HPLC-DAD). This method was utilized to differentiate S. arvensis samples based on the plant parts (leaves, stems, and roots) and the plant's geographical origin. The chromatographic separation was carried out on a reverse-phase C18 column by eluting at a flow rate of 1 mL/min using a gradient with methanol and 0.2% aqueous formic acid. In the optimum conditions, the developed method's system suitability has met the criteria of good separation. The calibration curve shows a linear relationship between the peak area and analyte concentration with a correlation coefficient (r²) > 0.9990. The ranges for the analytes' limits of detection and quantitation were 0.006–0.015 and 0.020–0.052 µg/mL, respectively. Intra-day and inter-day precision expressed in terms of RSD values were <2%, and the accuracy range based on recovery was 97–105%. The stability of all analytes within 48 h was about 2%. By combining HPLC-DAD fingerprint analysis with chemometrics, the developed method can classify S. arvensis samples based on the plant parts and geographical origin.


Introduction
Indonesia is known as one of the countries with the highest levels of biodiversity. The organisms found there include food and medicinal plants, which can be used in dietary supplements and herbal medicines to prevent or treat diseases. Sonchus arvensis, known as perennial sowthistle and as tempuyung in Indonesia, has been used in commercial functional drinks and herbal medicines. In European countries, S. arvensis leaves have been used in salads [1]. This plant is also used in traditional medicine to treat inflammation, coughing, bronchitis, gallstones, and asthma [2]. For medicinal purposes, the leaves are commonly used [3]. S. arvensis tissues are rich in flavonoids such as kaempferol, luteolin-7-O-glucoside, and apigenin-7-O-glucoside. They also contain coumarin (scopoletin), taraxasterol, inositol, and phenolic acids (e.g., cinnamic acid, coumaric acid, and vanillic acid) [4]. In addition, S. arvensis contains essential oil [5], sesquiterpenes [6], and quinic acid esters [7]. These flavonoids and other secondary metabolites play an essential role in the plant's biological activity.
Flavonoids are a large group of plant polyphenols found in many medicinal plants, vegetables, fruits, and crop products, mainly in flowers and leaves. These compounds play an important role in the human diet and have a beneficial effect on human health, showing antioxidant [8], anticancer [9], anti-inflammatory, antiviral, and antimicrobial [10] activities. Given the importance of these biological activities, flavonoids in plant tissues must be identified and quantified using methods that can certify the quality and safety of the plant material containing them.
Several analytical methods have been developed to identify flavonoids in medicinal plants. They are based on thin-layer chromatography, high-performance liquid chromatography (HPLC) with ultraviolet or fluorescence detectors [11], UV-Vis spectrophotometry [12], and capillary electrophoresis [13]. HPLC is one of the most common and efficient chromatographic methods used to separate flavonoids from each other. In particular, silica-based stationary phases are the most commonly used in HPLC procedures for separating flavonoids because they provide high separation efficiency and excellent mechanical stability. In addition, silica stationary phases modified with bonded phases bearing C4-C30 alkyl groups, such as octadecyl (C18), have been widely used in HPLC [14]. These stationary phases combine efficient separation with good selectivity.
In recent years, many studies have been published on the separation of the flavonoids present in medicinal plants, including studies conducted on Lycium barbarum [15], Nelumbo nucifera [16], and Scutellaria species [17]. In general, the flavonoids present in these plants can be separated well by HPLC using different mobile phase compositions. Khan [18] reported the separation of flavonoids such as rutin, kaempferol, quercetin, myricetin, and hyperoside extracted from S. arvensis. Seal [19] detected only catechin, quercetin, and kaempferol in S. arvensis. On the other hand, Hua and Qin [20] reported the presence of luteolin and apigenin in S. arvensis. Both luteolin and apigenin are typical compounds of S. arvensis [21]. Therefore, these two flavonoids should be quantified in S. arvensis. Furthermore, no analytical approach has been reported that enables users to discriminate between the plant parts from which a sample was obtained or to predict the geographical origin of an S. arvensis sample.
This study aimed to develop a method for the simultaneous identification and quantification of eight flavonoids in S. arvensis, namely orientin, hyperoside, rutin, myricetin, luteolin, quercetin, kaempferol, and apigenin. We also applied the developed method to discriminate S. arvensis samples by geographical origin and plant part. To this end, chromatographic fingerprinting was combined with chemometric analysis as a tool for discriminating between the plant parts used to prepare the samples and for identifying the samples' geographical origin.

Plant Materials
Plant samples were collected in May 2018 from two regions, one at low altitude and one at high altitude. The low-altitude samples were collected in Dramaga, Bogor, West Java, whereas the high-altitude samples were collected in Lembang, West Bandung, West Java, Indonesia. We separated and investigated S. arvensis root, stem, and leaf samples. All samples were cut into small pieces, dried at room temperature, and milled to a 60-mesh size. Mr. Taopik Ridwan identified the samples, and the voucher specimen was stored at the Tropical Biopharmaca Research Center of IPB University, Indonesia.

Sample Preparation and Standard Solutions
Accurately weighed powdered samples (1 g) were sonicated with 6 M HCl (5 mL) and TBHQ solution (20 mL) for one hour at room temperature. The resulting extracts were filtered through a 0.45-µm membrane filter before being injected into the HPLC-DAD system. Standard stock solutions of apigenin, kaempferol, luteolin, myricetin, rutin, orientin, hyperoside, and quercetin were prepared in methanol at a concentration of 1000 µg/mL. An appropriate amount of each standard stock solution was mixed and diluted with methanol to obtain working standard solutions of the eight analytes at eight concentrations for constructing the calibration curves.

Chromatography Conditions
An HPLC LC-20 AD equipped with a diode array detector (Shimadzu, Kyoto, Japan) and a Zorbax Eclipse Plus C18 column (150 mm × 4.6 mm i.d., 5-µm particle size) was used to separate the compounds. The mobile phase consisted of methanol (A) and 0.2% formic acid in water (B), with the following gradient elution program: 15-30% A at 0-10 min, 30-50% A at 10-45 min, 50-80% A at 45-50 min, 80-95% A at 50-55 min, and 100% A at 55-60 min. After each run, the chromatographic system was set to 0% B for 5 min and equilibrated for 5 min at a flow rate of 1 mL/min. The column eluate was monitored with the diode array detector. Flavonoids were quantified by comparison with standards at 270 nm for rutin and hyperoside, 330 nm for kaempferol, and 340 nm for orientin, myricetin, luteolin, quercetin, and apigenin. The injection volume for standard and sample solutions was about 10 µL.
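The gradient program above can be read as a piecewise-linear table. The following is a minimal sketch, assuming each stated segment is a linear ramp between its endpoints and that the 55-60 min step is a hold at 100% A; the function names are hypothetical and are not part of any instrument software.

```python
# Gradient segments as (start_min, end_min, start_%A, end_%A), taken from the text.
# Assumptions: every segment is a linear ramp, and 55-60 min is a hold at 100% A.
GRADIENT = [
    (0.0, 10.0, 15.0, 30.0),
    (10.0, 45.0, 30.0, 50.0),
    (45.0, 50.0, 50.0, 80.0),
    (50.0, 55.0, 80.0, 95.0),
    (55.0, 60.0, 100.0, 100.0),
]


def percent_methanol(t_min):
    """Return the programmed % methanol (phase A) at time t_min (minutes)."""
    if not 0.0 <= t_min <= 60.0:
        raise ValueError("time outside the 0-60 min gradient program")
    for start, end, a0, a1 in GRADIENT:
        # Each segment owns its start point; the last one also owns t = 60.
        if start <= t_min < end or t_min == end == 60.0:
            frac = (t_min - start) / (end - start)
            return a0 + frac * (a1 - a0)


def percent_aqueous(t_min):
    """Return the % of 0.2% aqueous formic acid (phase B) at time t_min."""
    return 100.0 - percent_methanol(t_min)


if __name__ == "__main__":
    for t in (0, 10, 27.5, 45, 52.5, 60):
        print(f"{t:5.1f} min: {percent_methanol(t):5.1f}% A")
```

Under these assumptions, the midpoint of the long 10-45 min ramp (27.5 min) corresponds to 40% methanol.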

Analytical Performance
The analytical performance of our approach was evaluated by following the guidelines of the International Conference on Harmonization (ICH) [22]. We determined system suitability, calibration curve linearity, the limit of detection (LOD), the limit of quantitation (LOQ), precision, accuracy, and stability, expressed as the relative standard deviation (RSD) of each analyte level (µg/g) obtained from the characteristic chromatogram peaks. The samples used to determine the analytical performance were S. arvensis leaves.

Data Analysis
Principal component analysis (PCA) was used to classify S. arvensis samples by plant part and geographical origin. For this purpose, we used the HPLC-DAD chromatogram data of S. arvensis. Before being subjected to PCA, the chromatograms were aligned using correlation optimized warping (COW). PCA and COW were performed using the Unscrambler X software version 10.1 from Camo (Oslo, Norway).

Optimization of HPLC-DAD Conditions
To obtain the best separation conditions for quantitative analysis and chromatographic fingerprinting, the effect of the mobile phase composition was investigated, and the detection wavelength was optimized. The resolution of each analyte and the total analysis time were the parameters used to optimize the chromatographic conditions. Representative chromatograms of the standard solutions of flavonoids are shown in Figure 1. The optimal chromatographic separation was achieved using gradient elution with methanol (phase A) and 0.2% formic acid in water (phase B), as described in the Materials and Methods section.
Separations 2021, 8, 12
A satisfactory separation was achieved for all analytes, with the separation factor of each peak greater than 1.5, and the total time for the quantitative flavonoid analysis was 60 min. The wavelengths chosen for analyte quantification (corresponding to the absorption maxima of the analytes) were 270 nm for rutin and hyperoside, 330 nm for kaempferol, and 340 nm for orientin, myricetin, luteolin, quercetin, and apigenin. These wavelengths were chosen because they afforded higher sensitivity and signal intensity than the alternatives for each separated compound.

Evaluation of the Analytical Performance of the Developed Method
An analytical performance test aims to prove that a given analytical procedure provides satisfactory results for its stated intent and purpose. This evaluation provides information on the quality, reliability, and consistency of the analytical results. The parameters evaluated in the analytical performance test include system suitability, linearity, LOD, LOQ, precision, accuracy, and stability.
System suitability was assessed by performing five replicate analyses of a mixed standard solution of orientin, hyperoside, rutin, myricetin, luteolin, quercetin, kaempferol, and apigenin. The relative standard deviation (RSD) values of the retention time, peak area, capacity factor, tailing factor, and theoretical plate number were calculated to evaluate the system suitability (Table 1). The RSD values for all the parameters quantified in the system suitability tests were less than 2%, which indicates that this method is well suited for achieving the target separation of flavonoids.
The calibration curves were constructed at eight concentration levels in the 0.078-10 µg/mL range, with three replicate injections for each concentration level, by plotting the peak areas versus each analyte's concentration (Table 2). The linearity of the calibration curves was assessed based on the value of the correlation coefficient (r²). Good linearity was obtained, with a mean correlation coefficient greater than 0.9995 for all analytes within the concentration range. The LOD (S/N = 3.3) and LOQ (S/N = 10) were in the ranges of 0.006-0.015 and 0.020-0.052 µg/mL, respectively. These results indicate that the proposed analytical method has good sensitivity.
The precision of the method was determined by measuring the intra-day and inter-day repeatability of six individual sample analyses performed every day over three consecutive days. The method precision was expressed in terms of RSD, and the RSD values obtained for the intra-day and inter-day analyses were less than 3%, confirming the good repeatability of the method.
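The RSD and calibration calculations described above can be sketched as follows. This is an illustrative example only: the peak areas are made-up numbers, not data from this study, the function names are hypothetical, and the two-fold dilution series is an assumption that happens to give eight levels spanning 0.078-10 µg/mL.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


def fit_calibration(conc, area):
    """Least-squares line area = slope * conc + intercept, plus r squared."""
    cx, cy = mean(conc), mean(area)
    sxx = sum((x - cx) ** 2 for x in conc)
    syy = sum((y - cy) ** 2 for y in area)
    sxy = sum((x - cx) * (y - cy) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = cy - slope * cx
    r2 = sxy ** 2 / (sxx * syy)
    return slope, intercept, r2


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to get a concentration from a peak area."""
    return (area - intercept) / slope


# Assumed two-fold dilution series: 10, 5, 2.5, ..., 0.078125 ug/mL (eight levels).
levels = [10.0 / 2 ** k for k in range(8)]
# Hypothetical mean peak areas, one per level (illustration only).
areas = [520400.0, 259800.0, 130300.0, 65100.0, 32700.0, 16350.0, 8200.0, 4150.0]

slope, intercept, r2 = fit_calibration(levels, areas)

if __name__ == "__main__":
    print(f"slope = {slope:.1f}, intercept = {intercept:.1f}, r2 = {r2:.6f}")
    print(f"RSD of 98, 100, 102 = {rsd_percent([98.0, 100.0, 102.0]):.2f}%")
```

The same `rsd_percent` helper applies to the system suitability parameters and to the intra-day and inter-day precision data, since each is reported as an RSD over replicate measurements.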
The accuracy of the method was evaluated with a recovery test in which known amounts of flavonoid standard solutions were spiked into a test sample at three different levels. Recoveries were in the 97-105% range. The stability of the analytes in the sample solution was evaluated by analyzing the sample solution after 0, 4, 8, 12, 24, and 48 h of incubation at room temperature. The analytes were stable in the sample solution, with RSD values ranging between 0.7% and 2.0% for all compounds. Analytical data for the LOD, LOQ, precision, accuracy, and stability are reported in Table 2.
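As an illustration of the recovery test, the sketch below uses the conventional spike-recovery definition, (amount found in the spiked sample minus amount found in the unspiked sample) divided by the amount added. The text does not state the exact formula used, so this definition is an assumption, and the numbers are invented for the example.

```python
def recovery_percent(found_spiked, found_unspiked, added):
    """Spike recovery (%) = (found in spiked - found in unspiked) / added * 100."""
    if added <= 0:
        raise ValueError("the spiked amount must be positive")
    return (found_spiked - found_unspiked) / added * 100.0


# Hypothetical example: one analyte spiked at three levels (ug/g).
unspiked = 20.0
spikes = [(10.0, 29.8), (20.0, 40.4), (30.0, 50.9)]  # (added, found in spiked sample)

recoveries = [recovery_percent(found, unspiked, added) for added, found in spikes]

if __name__ == "__main__":
    for (added, _), rec in zip(spikes, recoveries):
        print(f"added {added:4.1f} ug/g -> recovery {rec:.1f}%")
```

Each value can then be compared against the acceptance window chosen for the method.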

Determination of Flavonoid Content in S. arvensis
The HPLC-DAD method described above was applied to the simultaneous determination of flavonoids in six S. arvensis samples, namely roots, stems, and leaves from plants collected in two geographically distinct areas. Each sample was analyzed in five replicates. Figure 2 shows representative chromatograms of S. arvensis extracts, with the peaks of the target analytes labeled. The eight target analytes in S. arvensis are well separated. The flavonoid peaks were identified in two ways: by comparing the retention times of the sample peaks with those of reference compounds eluted under the same conditions, and by spiking the sample with a standard solution of the analytes.
In addition to the eight peaks identified as flavonoids, the chromatograms contain other peaks that have not been identified. These peaks likely correspond to other compounds, such as other flavonoids. We then determined the content of the eight flavonoids using the calibration curves obtained under the optimum HPLC-DAD conditions to examine the differences between samples based on the concentration levels obtained (Table 3).
Table 3. Analytical data of LOD, LOQ, precision, accuracy, and analyte stability for determination of eight flavonoids.
As can be seen in Table 4, flavonoid levels did not differ significantly due to either the geographical origin or the type (roots versus stems versus leaves) of the sample. The amounts of orientin, hyperoside, rutin, myricetin, luteolin, quercetin, kaempferol, and apigenin across all samples fell in the following ranges, respectively: 16.36-24.91, 1.34-39.34, 1.73-46.59, 18.00-313.97, 4.03-91.87, 2.27-31.64, 0.91-1.27, and 123.22-272.34 µg/g. Apigenin was the dominant compound in terms of concentration, whereas kaempferol had the lowest concentration in most of the samples investigated. The concentrations of flavonoids in S. arvensis samples originating from the Bogor area were generally higher than those observed in West Bandung samples.
The flavonoids were also found at very high concentrations in leaf samples, whereas their concentrations were very low in stem samples. However, based on these results, we could not discriminate the samples solely by the amounts of these eight bioactive compounds. We therefore relied on a chemometric method, an approach widely employed to discriminate between plant parts and geographical origins based on the samples' chemical composition.

Clustering of S. arvensis Samples from the Different Geographical Origin and Plant Parts
To distinguish S. arvensis samples by the plant part they originated from and by the geographical origin of the plant, we combined HPLC-DAD fingerprint analysis with a chemometric method. This combination has become one of the most widely used approaches to the classification, authentication, and discrimination of medicinal plants, including the recognition of a plant's geographical origin, the detection of adulteration, and the discrimination of closely related species. The multivariate analysis technique we used in this study was PCA, a well-known unsupervised pattern recognition method. PCA is useful for reducing data and extracting information by finding combinations of variables that describe the data set [23]. The original variables are transformed into new, uncorrelated variables called principal components (PCs) [24].
As variables, we used the HPLC-DAD fingerprint chromatogram data, which were first preprocessed by alignment using correlation optimized warping (COW). COW is a transformation that aligns chromatograms whose x-axes (retention times) are shifted between one sample and another. The COW method aligns the sample chromatogram to a reference chromatogram using dynamic programming, stretching or compressing segments of the sample chromatogram by linear interpolation [25,26]. Segment length and slack size are the two parameters optimized by the automated COW algorithm [27]. The segment length parameter determines the number of segments, while the slack parameter controls how far these segments can be stretched or compressed [26]. When the number of time points in a sample segment differs from that in the reference segment, the sample segment is linearly interpolated so that the two segments have the same length [28]. The optimal slack size and segment length obtained were 70 and 14, respectively.
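The elementary operation inside COW, stretching or compressing one segment by linear interpolation and scoring it by correlation with the reference, can be sketched as follows. This is not the full COW algorithm (there is no dynamic programming over segment boundaries or slack values) and it is not the Unscrambler implementation; the function names are hypothetical.

```python
from math import sqrt


def resample_segment(segment, new_length):
    """Linearly interpolate `segment` onto `new_length` evenly spaced points."""
    if len(segment) < 2 or new_length < 2:
        raise ValueError("need at least two points in both the input and the output")
    scale = (len(segment) - 1) / (new_length - 1)
    out = []
    for i in range(new_length):
        pos = i * scale                            # fractional index in the input
        left = min(int(pos), len(segment) - 2)     # left neighbour, clamped at the end
        frac = pos - left
        out.append(segment[left] * (1 - frac) + segment[left + 1] * frac)
    return out


def pearson(a, b):
    """Pearson correlation between two equal-length signals."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den


if __name__ == "__main__":
    reference = [0.0, 1.0, 4.0, 1.0, 0.0]
    sample = [0.0, 0.5, 1.0, 2.5, 4.0, 2.5, 1.0, 0.5, 0.0]  # same peak, more points
    warped = resample_segment(sample, len(reference))
    print(warped, pearson(warped, reference))
```

In COW proper, a resampling of this kind is evaluated for every allowed end-point shift (within the slack) of every segment, and dynamic programming selects the combination of shifts that maximizes the summed correlation with the reference.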
In this study, PCA was used to group samples based on differences associated with geographical origin and plant part. The score plot of the first two principal components (PC1 and PC2) is the one most commonly used because these two PCs capture most of the variation in the data. Here, the first two PCs accounted for 97% of the total variance (PC1 = 93% and PC2 = 4%) (Figure 3). The PCA score plot shows that all samples can be classified into their respective groups based on their geographical origin and plant part. Based on these results, the HPLC-DAD fingerprint chromatogram data combined with PCA can be reliably used to infer which plant part was used to prepare a sample and the geographical origin of the S. arvensis sample.
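For readers who want to reproduce this kind of score plot outside the Unscrambler, the following is a minimal PCA sketch using NumPy's singular value decomposition. It assumes mean-centering only (no scaling), and the chromatograms are synthetic Gaussian peaks generated for illustration, not data from this study; `pca_scores` and `peak` are hypothetical helper names.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Mean-center X (samples x variables) and return PC scores and explained variance ratios."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # center each variable (time point)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)           # fraction of variance per PC
    scores = U[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]


# Synthetic "fingerprints": two groups that differ in the heights of two peaks.
rng = np.random.default_rng(0)
time = np.linspace(0, 60, 300)


def peak(center, height):
    """Gaussian peak on the shared time axis (illustrative shape only)."""
    return height * np.exp(-0.5 * ((time - center) / 0.5) ** 2)


group_a = [peak(20, 100) + peak(40, 10) + rng.normal(0, 0.5, time.size) for _ in range(3)]
group_b = [peak(20, 20) + peak(40, 60) + rng.normal(0, 0.5, time.size) for _ in range(3)]

scores, explained = pca_scores(group_a + group_b)

if __name__ == "__main__":
    print("explained variance ratio (PC1, PC2):", explained)
    print("PC1 scores:", scores[:, 0])
```

In this toy data set the two groups fall on opposite sides of PC1, which is the same kind of separation that a score plot of aligned chromatograms is meant to reveal.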

Conclusions
In this study, the simultaneous quantification of eight flavonoids in S. arvensis samples originating from different plant parts and geographical areas was achieved using an HPLC-DAD-based method that provided accurate and reliable qualitative and quantitative analysis of these compounds. The flavonoid content in the leaves of S. arvensis was higher than in the plant's roots and stems. A combination of HPLC-DAD fingerprint analysis and the chemometric method enabled us to successfully group the samples into six groups. Therefore, the developed method could be used for the quality control of S. arvensis.
