Quantitative Analysis of Bioactive Compounds in Commercial Teas: Profiling Catechin Alkaloids, Phenolic Acids, and Flavonols Using Targeted Statistical Approaches

Tea is one of the most widely consumed beverages worldwide, and its chemical composition determines its quality and classification. In this study, we developed a quantitative method based on reversed-phase ultra-high-performance liquid chromatography (UHPLC) coupled with diode-array detection (DAD) to quantify 20 principal constituents in 121 tea samples spanning six types. The constituents include alkaloids, catechins, flavonols, and phenolic acids. Our findings indicate that differences in chemical composition among tea types depend predominantly on their processing methods. Notably, green and yellow teas showed higher concentrations of total chemical constituents than the other tea types. Levels of alkaloids, catechins, flavonols, and phenolic acids differed markedly among the tea types. Using random forest analysis, we identified gallocatechin, epigallocatechin gallate, and epicatechin gallate, all major tea catechins, as key biomarkers for tea classification. These results, obtained with the developed and validated method, highlight substantial differences in specific compounds among tea categories. This compositional profile provides a useful reference for the assessment and classification of tea samples.


Introduction
Tea, obtained from the freshly plucked leaves of Camellia sinensis, is a universally consumed beverage [1]. In China, the postharvest processing of Camellia sinensis leaves involves six distinct techniques, leading to the production of six types of tea: black tea, green tea, yellow tea, white tea, oolong tea, and dark tea [2,3]. These processing methods have evolved over thousands of years across various regions of China and are generally classified into five categories based on the extent of endogenous enzymatic reactions: (1) non-fermented tea, such as green tea; (2) lightly fermented tea, including yellow tea and white tea; (3) partially fermented tea, represented by oolong tea; (4) fully fermented tea, exemplified by black tea; and (5) post-fermented tea, wherein exogenous microbial fermentation plays a crucial role in the processing. Figure S1 illustrates the distribution of tea types across China. Despite the extensive research conducted on tea and its chemical composition, flavor profiles, and health benefits, there is still much to explore and understand, particularly in relation to the unique chemical profiles of each tea type and their potential implications for human health. The variations in processing techniques and the involvement of different enzymatic reactions and microbial fermentation in tea production contribute to the unique characteristics and properties of each tea variety. Therefore, a comprehensive investigation into the chemical constituents and sensory attributes of these teas is crucial for establishing a deeper understanding of their distinct qualities and potential health implications.
Tea is rich in chemical constituents, prominently catechins, phenolic acids, flavonols, and alkaloids, which together make up its major bioactive constituents [4,5]. These components contribute to palatability and confer bioactivities such as antioxidant and antibacterial effects. Fermentation, primarily manifesting as enzymatic oxidation, drives the conversion of tea polyphenols into the corresponding oxidation products, and is activated upon the exposure of tea leaves to ambient humidity and oxygen. The diverse chemical constituents and polyphenol oxidases evident within the six tea types stem from the varying extents of fermentation. Hence, a systematic and exhaustive evaluation of the constituents within the six tea varieties is of great importance. Notable efforts have been directed towards investigating functional constituents, especially catechins, purine alkaloids, and flavonol glycosides, within select Chinese tea variants [6–8]. Tea polyphenols, particularly catechins, undergo oxidative transformations during manufacturing, driven by either moist heat or intrinsic polyphenol oxidases alongside microbial oxidases [9]. Consequently, non-fermented green tea characteristically has elevated levels of catechins, with epigallocatechin-3-gallate (EGCG) being particularly prominent. Lightly fermented white and yellow teas exhibit slightly lower catechin levels than green tea, accompanied by the emergence of theaflavins and thearubigins [10]. Within semi-fermented oolong tea, catechin oxidation occurs solely at the leaf periphery, which places oolong tea between green tea and black tea. Catechin oxidation in dark tea is mediated by microorganisms, ultimately culminating in the generation of theabrownin. Nevertheless, a comprehensive chemical profile of Chinese tea is still lacking. A comprehensive and methodical inquiry underpinned by extensive data analysis is pivotal to resolving ambiguities pertaining to the functional constituents of Chinese teas. The standardization of tea quality has profound ramifications for both tea enterprises and regulatory oversight.
The classification method based on sensory evaluation has a drawback: it lacks quantitative assessment indicators, which leads to difficulties in tea authentication. In contrast, international standards concentrate more on the physical and chemical attributes of tea. The absence of quantitative indicators poses challenges to tea quality control for producers, consumers, and regulatory agencies. In recent years, various analytical techniques, including thin-layer chromatography [11,12], high-performance liquid chromatography [13,14], and ultra-performance liquid chromatography-tandem mass spectrometry [15,16], have been utilized to determine the chemical composition of tea. However, most of these studies have focused on only a few chemical markers from a small number of teas. Thus, a comprehensive analysis of tea using a single analytical method is still missing [17]. To meet international standards, there is an urgent need for analyzing the major chemical constituents of different types of processed teas. In this research, we established an efficient and rapid UHPLC-DAD method for the determination of 20 components, including catechins, alkaloids, phenolic acids, and flavonols, in 121 tea samples from six distinct types of tea. Furthermore, we identified the potential essential chemical components critical for classifying tea types. By performing a principal component analysis on the chemical composition of the six different types of tea leaves, we conducted a classification analysis of the tea sample characteristics.

Tea Sample Extraction
The sample extraction method was optimized to enhance the efficiency of extracting key constituents in tea. Tea samples were first dried at 35 °C for 2 h, then crushed into powder and passed through a 40-mesh screen (304 stainless steel sieve, Yongkang Jielong Industrial and Trade Co., Ltd., Jinhua, China). This mesh size was selected to optimize the extraction procedure, achieving high dissolution rates while minimizing material loss. An aliquot of 0.5 g sample powder was weighed into an Erlenmeyer flask and 10 mL of a methanol-dimethyl sulfoxide mixture (50:50, v/v) was added. The mixture was shaken for 15 min at room temperature and centrifuged at 8000 rpm for 15 min at 4 °C. This extraction process was repeated once. The supernatants from the two extractions were combined, diluted to 50 mL with the methanol-dimethyl sulfoxide mixture (50:50, v/v), and stored at −20 °C until analysis.
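The extraction scheme fixes the conversion from an extract concentration to a leaf content, since 0.5 g of powder ends up in 50 mL of extract. The following is a minimal sketch of that arithmetic, assuming the extract concentration is read in mg/L from a calibration curve and that no further dilution is applied before injection; the function and parameter names are illustrative and do not come from the study.

```python
def extract_to_leaf_content(conc_mg_per_l, final_volume_ml=50.0, sample_mass_g=0.5):
    """Convert an extract concentration (mg/L) into a content in tea powder (mg/g).

    Defaults follow the extraction described in the text:
    0.5 g of powder, combined supernatants diluted to 50 mL.
    """
    if final_volume_ml <= 0 or sample_mass_g <= 0:
        raise ValueError("volume and mass must be positive")
    # Mass of analyte present in the whole extract (mg).
    analyte_mass_mg = conc_mg_per_l * (final_volume_ml / 1000.0)
    # Normalise to the mass of powder that was extracted (mg/g).
    return analyte_mass_mg / sample_mass_g


if __name__ == "__main__":
    # Hypothetical reading of 250 mg/L in the extract.
    print(extract_to_leaf_content(250.0))
```

With the default volume and mass, the conversion factor is 0.1, so a hypothetical extract concentration of 250 mg/L corresponds to 25 mg/g of tea powder.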

Development of UHPLC-DAD Analytical Method
In this study, a UHPLC-DAD system was employed for the analysis. The instrument was an Ultimate 3000 UHPLC (Thermo Fisher Scientific, Milan, Italy). Prior to UHPLC analysis, the extract was filtered through a 0.22 µm microporous membrane, and 1 µL of the filtered extract was injected into the UHPLC system. Chromatographic separation was performed using a reversed-phase column (Merck Lichrospher RP-18, 100 mm × 2.1 mm, 2 µm, Hessian, Germany). Mobile phases A and B were 0.1% formic acid and acetonitrile, respectively [18]. The gradient elution procedure was: 0 min, 93% A; 12 min, 80% A; 16 min, 50% A; 20 min, 93% A. The analysis duration for each sample was 20 min, inclusive of a 4 min column equilibration period. The column temperature and flow rate were maintained at 30 °C and 0.3 mL min⁻¹, respectively. For detection, two wavelengths, 280 and 340 nm, were compared using the DAD integrated into the UHPLC system. The developed UHPLC method was validated according to ICH guidelines [18] to ensure that it met current regulatory standards.
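The gradient program can be read as a table of set points. The sketch below computes the mobile-phase composition at any time in the run, assuming linear ramps between the programmed points; the text does not state the ramp shape, so this is an assumption, and the function names are illustrative.

```python
# Gradient set points from the text: (time in min, % mobile phase A).
GRADIENT_PROGRAM = [(0.0, 93.0), (12.0, 80.0), (16.0, 50.0), (20.0, 93.0)]


def percent_a(t_min, program=GRADIENT_PROGRAM):
    """Return % A at time t_min, assuming linear ramps between set points."""
    if not program[0][0] <= t_min <= program[-1][0]:
        raise ValueError("time is outside the programmed run")
    # Walk over consecutive pairs of set points and interpolate within the
    # segment that contains t_min.
    for (t0, a0), (t1, a1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)


def percent_b(t_min, program=GRADIENT_PROGRAM):
    """Return % B (acetonitrile); A and B always sum to 100%."""
    return 100.0 - percent_a(t_min, program)


if __name__ == "__main__":
    for t in (0, 6, 12, 14, 16, 20):
        print(f"{t:>2} min: {percent_a(t):5.1f}% A, {percent_b(t):5.1f}% B")
```

Under this assumption, the acetonitrile fraction rises slowly from 7% to 20% over the first 12 min and then more steeply to 50% at 16 min, before the system returns to the starting composition over the last 4 min.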

Statistical Analysis
Data are presented as mean ± standard deviation and range (min-max). Differences among groups (>2 groups) were evaluated using ANOVA followed by the Tukey post hoc test in SPSS software (SPSS for Windows, Release 19.0, SPSS Inc., Chicago, IL, USA). Different lowercase letters indicate significant differences (p < 0.05). To identify potential chemical biomarkers for the classification of teas, a random forest (RF) algorithm was implemented with the "randomForest" package [19] in R software (Version 3.5.3, https://www.r-project.org/, accessed on 6 July 2021). RF achieves classification by constructing a series of decision trees and aggregating the predictions of all trees. Although RF lacks the solid statistical foundation of traditional chemometrics, it offers several advantages, including excellent predictive capability and balanced handling of all variables even in cases of overfitting [20,21]. Principal component analysis (PCA) was used to examine patterns in the composition data and to highlight similarities and dissimilarities in the phytochemical contents of the tea products.
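To make the group comparison concrete, the sketch below computes the one-way ANOVA F statistic from between-group and within-group sums of squares. It is an illustration in Python rather than the SPSS procedure used in the study; the numbers are hypothetical, and the p-value and the Tukey post hoc step are left to a statistics package because they require the F and studentized range distributions.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of group name -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (tea types)
    n = len(all_values)    # total number of samples

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of the samples around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical contents (mg/g) for three tea types, for illustration only.
    toy = {"GT": [5.0, 6.0, 7.0], "OT": [2.0, 3.0, 4.0], "WT": [1.0, 2.0, 3.0]}
    print(one_way_anova_f(toy))
```

A large F value indicates that the spread of the group means is large relative to the spread within groups, which is the condition under which the post hoc test assigns different letters to tea types.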

Development and Validation of a UHPLC-DAD Analytical Method
The structures of the 20 compounds are displayed in Figure S2. To improve precision, we compared the absorption peaks of these compounds at two wavelengths, 280 and 340 nm (Figure 1B). All components exhibited higher sensitivity and less interference at 280 nm, indicating that this wavelength is suitable for the simultaneous determination of the chosen compounds. We then verified the reliability of the method. Table 1 shows the favorable linearity of all analytes, with R² > 0.999. The relative standard deviations (RSD) were within the range of 0.01-0.31% for intraday assays and 0.42-3.31% for interday assays. Table 1 also shows that the limit of detection (LOD) of the analytes was between 0.03 and 2.73 mg/L and that the average recovery rate ranged from 93.61% to 106.25%. These results indicate that the proposed method was sensitive, precise, and accurate. An analysis of the main chemical components in the six types of tea revealed that green tea contained the highest levels of chemical components (Figure S3).
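The validation figures quoted above follow from standard definitions. The sketch below shows how RSD, spike recovery, and the coefficient of determination of a linear calibration are commonly computed; the formulas are the usual textbook ones and are assumed rather than taken from the paper, and all numbers in the example are hypothetical.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (%) of replicate measurements."""
    return 100.0 * stdev(values) / mean(values)


def recovery_percent(found_spiked, found_unspiked, amount_added):
    """Spike recovery (%): share of the added amount that is measured back."""
    return 100.0 * (found_spiked - found_unspiked) / amount_added


def r_squared(x, y):
    """Coefficient of determination of a least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    return sxy * sxy / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical replicate peak areas (n = 6).
    print(rsd_percent([100.2, 100.1, 99.9, 100.0, 100.3, 99.8]))
    # Hypothetical spike: 10 mg/L added to a sample that contained 5 mg/L.
    print(recovery_percent(14.8, 5.0, 10.0))
    # Hypothetical calibration: concentration (mg/L) against peak area.
    print(r_squared([1, 5, 10, 50, 100], [2.1, 10.0, 20.3, 99.8, 200.5]))
```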

Comparison of Alkaloid Levels in Six Different Types of Chinese Teas
Our method detected three alkaloids, namely theophylline (THEO), theobromine (TB), and caffeine (CAF). CAF was the predominant alkaloid in all six types of tea, followed by TB and THEO, indicating that tea processing methods may have had little impact on the rank order of the alkaloids. Nevertheless, the tea types differed in their alkaloid contents. For instance, YT and OT exhibited the highest and lowest TB levels, respectively, while DT showed significantly higher levels of THEO than the other tea types (p < 0.05, Table 2). Moreover, GT displayed significantly higher levels of CAF than OT and WT, and YT exhibited markedly higher TB levels than the other tea types (p < 0.05, Table 2). These observations suggest that processing methods may have differentially affected the contents of individual tea alkaloids, which are known to possess various health-promoting effects.

Dynamic Changes in Catechins in Six Different Types of Tea
Table 3 illustrates the marked effect of tea processing methods on the composition ratio of tea catechins. Gallocatechin (GC) levels were markedly higher in OT and GT than in the other tea varieties (p < 0.05, Table 3). Lower epigallocatechin (EGC) levels were detected in DT and BT, while catechin (C) levels were relatively higher in WT and GT (p < 0.05). Notably, WT exhibited significantly higher epicatechin (EC) levels than the other teas (p < 0.05). The epigallocatechin gallate (EGCG) content was higher in the GT and YT groups and lower in the DT group (p < 0.05). Furthermore, the gallocatechin gallate (GCG) levels in the GT group were significantly higher than those in the WT, DT, and BT groups, and the epicatechin gallate (ECG) levels were also highest in GT (p < 0.05). Investigating changes in tea catechins can aid in regulating tea quality during thermal processing.
Table 3 notes. Tea type: OT, oolong tea; GT, green tea; WT, white tea; DT, dark tea; YT, yellow tea; BT, black tea. Note: Data expressed as mean ± standard deviation and range (min-max). a, b, c, d, e Values with different letters indicate significant differences (p < 0.05) compared to GT, YT, DT, WT, and OT samples using ANOVA and Tukey post hoc test. The results were reported in mg/g to indicate the concentration of compounds in 1 g of tea leaves after conversion.

Dynamic Changes in Flavonols in Six Different Types of Teas
Tea processing shifted the composition ratio of tea flavonols. As shown in Table 4, OT and WT exhibited a relatively lower proportion of rutin (RUT) than the other tea varieties. WT also contained the lowest level of kaempferol (KAE) among all tea types. While BT, GT, and DT showed a higher proportion of quercetin (QUE), YT had the lowest concentration of QUE. However, the differences in flavonol levels among the six tea varieties were not statistically significant (Table 4).
Table 4 notes. Tea type: OT, oolong tea; GT, green tea; WT, white tea; DT, dark tea; YT, yellow tea; BT, black tea. Note: Data expressed as mean ± standard deviation and range (min-max). ANOVA with Tukey post hoc tests detected no significant differences in the levels of the four flavonols among the six types of tea. The results were reported in mg/g to indicate the concentration of compounds in 1 g of tea leaves after conversion.

Dynamic Changes in Phenolic Acids in Six Different Types of Tea
In this study, six phenolic acids, namely gallic acid (GA), coumaric acid (COU), chlorogenic acid (CHL), ferulic acid (FER), sinapic acid (SIN), and caffeic acid (CAA), were analyzed. As presented in Table 5, the composition ratio of these components was significantly influenced by tea processing procedures, which classified teas into distinct chemotypes. It is worth noting that CAA was not detected (reported as nd). DT, BT, WT, and OT were dominated by GA, while YT was dominated by CHL. OT and GT each contained two major phenolic acids.
Table 5 notes. Tea type: OT, oolong tea; GT, green tea; WT, white tea; DT, dark tea; YT, yellow tea; BT, black tea. Note: Data expressed as mean ± standard deviation and range (min-max). a, b, c, d, e Values with different letters indicate significant differences (p < 0.05) compared to GT, YT, DT, WT, and OT samples using ANOVA and Tukey post hoc test. The results were reported in mg/g to indicate the concentration of compounds in 1 g of tea leaves after conversion.
DT contained a notable quantity of GA, accounting for over 90% of the total phenolic acids, a proportion significantly higher than in the other tea types (p < 0.05, Table 5). A substantial proportion of FER, almost a quarter of the total phenolic acids, was identified in OT and GT (Table 5). Furthermore, the proportion of CHL in GT and YT exceeded that found in the other tea types.

Identification of Potential Biomarkers for Tea Classification Using Random Forests
In this study, a random forest (RF) classifier was employed to distinguish the different types of tea and to identify possible biomarkers based on the 19 detected chemical components (Figure 2A). The RF classifier achieved accuracies of 78.57% for BT, 86.21% for GT, 85.71% for OT, 82.35% for BT, 50.00% for WT, and 57.14% for YT (Figure 2B). The overall accuracy of the classifier was 79.34% using the 19 identified chemical components. Although the accuracy was still low for some classes, especially WT and YT, performance might be improved by increasing the number of samples. In addition, GC, EGCG, and ECG, as main components of tea catechins, were identified as important biomarkers for tea classification (Figure 2C). Moreover, tea catechins responded differently to thermal processing.
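Per-class and overall accuracies of this kind are read off a confusion matrix. The sketch below shows the computation on a small hypothetical two-class matrix; the counts are invented for illustration and are not the study's data. As a consistency check on the reported figure, 96 correct assignments out of 121 samples would give 79.34%; the count of 96 is inferred from the percentage and is not stated in the text.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class that was assigned to the correct class.

    `confusion[true_label][predicted_label]` holds a sample count.
    """
    return {
        true_label: row.get(true_label, 0) / sum(row.values())
        for true_label, row in confusion.items()
    }


def overall_accuracy(confusion):
    """Fraction of all samples that were assigned to the correct class."""
    correct = sum(row.get(label, 0) for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


if __name__ == "__main__":
    # Hypothetical counts for two classes, for illustration only.
    toy = {
        "A": {"A": 8, "B": 2},
        "B": {"A": 1, "B": 9},
    }
    print(per_class_accuracy(toy))
    print(overall_accuracy(toy))
    # Inferred check: 96 of 121 samples correct.
    print(round(100 * 96 / 121, 2))
```

Because the overall accuracy weights each class by its sample count, classes with few samples, such as those with the lowest per-class accuracies here, have a limited effect on the overall figure.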

Principal Component Analysis
Principal component analysis (PCA) was performed on the chemical composition data of the six types of tea (Figure 2D). The first principal component (PC1) explained 42.59% of the total variance, the second (PC2) explained 19.14%, and the third (PC3) explained 13.82%. Together, PC1, PC2, and PC3 explained a cumulative variance of 75.54%. The results show that the tea samples of the six varieties could be classified into three categories based on their fermentation levels. Green tea and yellow tea (non-fermented and lightly fermented tea, respectively) were grouped together. Oolong tea (semi-fermented tea) formed a separate group. Black tea and dark tea (fully fermented and post-fermented tea, respectively) were grouped together.
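The cumulative variance is the running sum of the per-component shares. The short sketch below reproduces it from the three percentages quoted above; summing the rounded values gives 75.55%, which differs from the reported 75.54% by 0.01, a gap that is presumably due to rounding of the individual components.

```python
from itertools import accumulate

# Explained variance (%) of PC1, PC2, and PC3 as quoted in the text.
explained = [42.59, 19.14, 13.82]

# Running total after each component, rounded to two decimals.
cumulative = [round(value, 2) for value in accumulate(explained)]

if __name__ == "__main__":
    for index, (share, total) in enumerate(zip(explained, cumulative), start=1):
        print(f"PC{index}: {share:.2f}% (cumulative {total:.2f}%)")
```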

Discussion
Currently, there is a lack of quantitative methods for classifying tea categories in accordance with tea standards. This study aimed to introduce a quantitative and objective approach to identifying tea categories, thereby establishing a scientific foundation for the development of chemical classification methods for tea. Previous research has demonstrated the efficacy of HPLC-DAD in identifying the distinctive constituents of Laoshan green tea (GT) harvested during both summer and autumn, providing accurate determinations of tea leaves across both seasons [22]. Expanding on this foundation, our objective was to formulate and validate a UHPLC-DAD methodology to concurrently quantify 19 key components, encompassing alkaloids, catechins, flavonols, and phenolic acids, within six differently processed teas. Additionally, we endeavored to pinpoint significant biomarkers for tea classification.
Tea, as a globally consumed beverage, can be classified into six main groups based on the degree of fermentation: green tea (GT), yellow tea (YT), white tea (WT), oolong tea (OT), black tea (BT), and dark tea (DT), in increasing order of fermentation. The results obtained from our analysis revealed distinct differences in both tea categories and specific compounds as fermentation levels changed. PCA based on these specific compounds also showed good classification results for tea classes with different fermentation levels. These insights contribute to the broader knowledge base surrounding tea, its fermentation process, and its potential implications for health and flavor profiles.
In the present investigation, caffeine was observed to comprise the largest proportion of alkaloids in all six tea types, especially in green tea (GT), which is in accordance with the findings of Boros et al., who extracted caffeine from a variety of teas and reported similar results [23,24]. Interestingly, our results indicated that the processing method for tea leaves had minimal influence on the proportion of alkaloid composition. However, it is worth noting that different alkaloids may react differently to the processing method due to the effects of steeping time, temperature, pH value, and picking time, which all contribute to the chemical composition of tea leaves [25–28].
Catechins are a major class of flavonoids found in green tea (GT), and it has been observed that green tea contains more catechins than black or oolong teas [29]. The current study also found that green tea has the highest amount of catechins among the tea varieties. Previous studies have indicated that unfermented green tea has EGCG as its main component [29,30], while fermented green tea has GC as its main component with less EGCG [2,5]. Our study confirms that green tea has the highest amount of EGCG, with significantly higher levels than other tea types except for yellow tea. Interestingly, our findings show that DT has the lowest amount of EGCG, which may be attributed to the conversion of catechins to theaflavins or thearubigins during the fermentation process in black tea, leading to a significant reduction in total catechins (EGCG, ECG, and EGC) [31–33]. Furthermore, different processing methods have been found to affect the ratio of catechin composition in tea leaves, possibly due to the gradual decrease in catechin content and increase in gallic acid content during tea fermentation [5,34]. Another study reported that thermal processing affects the eight catechins in tea differently [35].
Because tea flavonols are potent antioxidants that have been shown to protect against cancer and cardiovascular disease [36–38], it is of interest to understand the levels of flavonols in different teas and how processing methods affect flavonol composition. We identified RUT, MYR, QUE, and KAE as the major flavonols in tea leaves, in particular RUT and MYR. Our study found that black tea (BT) had lower total flavonol content than the other teas, which is consistent with the findings of Selim et al. [39,40]. We also found that the gallic acid (GA) content was the highest in black tea (BT) among the 121 tea samples; previous reports indicated that gallic acid content increases as fermentation deepens [41,42].
Previous studies classified black, green, white, yellow, dark, and oolong teas by UV spectroscopy [43]; Ding et al. classified tea quality levels based on CLPSO-SVM using near-infrared spectra [44]. To further explore potential biomarkers for tea classification, we identified GC, EGCG, and ECG as important biomarkers using UHPLC-DAD quantification combined with RF calculations. Previously, Wang et al. identified physicochemical components such as catechin and caffeine in yellow tea (YT) using quantitative descriptive analysis (QDA) and partial least squares regression (PLSR). RF, although lacking a comprehensive theoretical foundation comparable to traditional chemometrics, offers advantages such as strong predictive capabilities and balanced handling of all variables even in cases of overfitting [20,21]. Additionally, RF's ability to handle high-dimensional data made it a valuable tool for predicting tea types. Previous studies have also demonstrated RF's predictive performance in tea sample analysis. For example, Zheng et al. [45] reported that RF outperformed other methods, such as PCA and SVM, in predicting unknown tea samples. Similarly, Xu et al. [46] used RF based on fused signals to achieve the best performance in predicting the concentrations of chemical components in tea. Our study builds upon this knowledge and showcases the potential of RF in further advancing tea classification methodologies.
In conclusion, we have developed a UHPLC-DAD analytical method to simultaneously determine a total of 19 major components in tea, including alkaloids, catechins, flavonols, and phenolic acids. The method has undergone methodological validation, demonstrating sensitivity, stability, and good repeatability in content determination. Significant differences in these components have been observed among the six types of tea studied, with green tea (GT) and yellow tea (YT) exhibiting higher total chemical content than the other teas. These observations suggest that the varying degrees of fermentation in the six major tea categories may influence the composition of alkaloids, catechins, flavonols, and phenolic acids in tea leaves. Furthermore, GC, EGCG, and ECG, serving as principal constituents of catechins in tea, have been identified as important biomarkers for tea classification. The results of PCA reveal the possibility of categorizing these six tea types into three groups based on their fermentation levels. In future work, we plan to establish a tea composition database and develop a standard analytical method for the evaluation and classification of teas.

Figure 2. Tea classification based on 19 chemical components using random forests. (A) Random forest classification; (B) accuracy of random forest classification; (C) potential biomarkers identified

Table 1. Validation parameters for the UHPLC-DAD method proposed in this study (n = 6).

Table 2. Comparative analysis of alkaloid levels in six different types of teas (n = 121), including caffeine (CAF), theophylline (THEO), and theobromine (TB). OT, oolong tea; GT, green tea; WT, white tea; DT, dark tea; YT, yellow tea; BT, black tea. Note: Data expressed as mean ± standard deviation and range (min-max). a, b, c Values with different letters indicate significant differences (p < 0.05) compared to GT, YT, and DT samples using ANOVA and Tukey post hoc test. The results were reported in mg/g to indicate the concentration of compounds in 1 g of tea leaves after conversion.