Article

Assessment of Tenderness and Anthocyanin Content in Zijuan Tea Fresh Leaves Using Near-Infrared Spectroscopy Fused with Visual Features

1 Tea Research Institute, Shandong Academy of Agricultural Sciences, Jinan 250033, China
2 Shandong Guohe Industrial Technology Institute Co., Ltd., Jinan 250014, China
3 College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Foods 2025, 14(17), 2938; https://doi.org/10.3390/foods14172938
Submission received: 14 July 2025 / Revised: 10 August 2025 / Accepted: 20 August 2025 / Published: 22 August 2025

Abstract

Focusing on the characteristic tea resource Zijuan tea, this study addresses the difficulty of grading on production lines and the complexity of quality evaluation. On the basis of the fusion of near-infrared (NIR) spectroscopy and visual features, a novel method is proposed for classifying different tenderness levels and quantitatively assessing key anthocyanin components in Zijuan tea fresh leaves. First, NIR spectra and visual feature data were collected, and anthocyanin components were quantitatively analyzed using UHPLC-Q-Exactive/MS. Then, four preprocessing techniques and three wavelength selection methods were applied to both individual and fused datasets. Tenderness classification models were developed using Particle Swarm Optimization–Support Vector Machine (PSO-SVM), Random Forest (RF), and Convolutional Neural Networks (CNNs). Additionally, prediction models for key anthocyanin content were established using linear Partial Least Squares Regression (PLSR), nonlinear Support Vector Regression (SVR) and RF. The results revealed significant differences in NIR spectral characteristics across different tenderness levels. Model combinations such as TEX + Medfilt + RF and NIR + Medfilt + CNN achieved 100% accuracy in both training and testing sets, demonstrating robust classification performance. The optimal models for predicting key anthocyanin contents also exhibited excellent predictive accuracy, enabling the rapid and nondestructive detection of six major anthocyanin components. This study provides a reliable and efficient method for intelligent tenderness classification and the rapid, nondestructive detection of key anthocyanin compounds in Zijuan tea, holding promising potential for quality control and raw material grading in the specialty tea industry.

1. Introduction

The high-quality development of the tea industry relies on precise control over tea quality [1]. For Zijuan tea, a unique tea cultivar, tenderness level and intrinsic components, particularly anthocyanins, are key determinants of quality. Owing to its high anthocyanin content, Zijuan tea offers both distinctive health benefits and flavor, attracting strong consumer attention [2]. Chen et al. conducted a systematic analysis of characteristic metabolites in Zijuan tea from different regions in Yunnan Province, revealing significant differences in the content of tea polyphenols, amino acids, catechins, caffeine, sugars, and anthocyanins [3].
Traditional methods for determining the tenderness of fresh tea leaves use visual and morphological cues such as leaf color and shape and rely heavily on manual expertise. These sensory-based approaches are highly subjective, inefficient, and poorly suited to the demands of large-scale production. Anthocyanin content, in turn, is typically measured using chemical techniques such as high-performance liquid chromatography (HPLC), which, although highly accurate, are time-consuming, costly, complex, and inherently destructive, making them unsuitable for real-time online monitoring [4]. As a result, there is an urgent need within the tea industry for rapid, nondestructive, and accurate detection technologies to support digitalization and intelligent upgrading. Near-infrared spectroscopy (NIRS), which leverages the absorption characteristics of molecular bonds in the near-infrared region, enables the fast acquisition of multi-component information from samples and has therefore been widely applied in the detection of agricultural product composition [5,6,7]. The advantages of NIRS lie in its nondestructive nature and high efficiency, which enable the rapid quantitative analysis of components such as anthocyanins. However, NIRS is susceptible to interference from the physical properties of samples, such as color and texture, and single-source spectral data often struggle to capture the complex characteristics of fresh leaves [8]. Visual features such as color and texture, by contrast, contain intuitive information related to leaf tenderness. Zijuan tea leaves of different tenderness levels exhibit noticeable color differences (due to variations in chlorophyll and anthocyanin accumulation) and changes in surface characteristics (such as leaf folding and vein density), reflecting their developmental stages.
Integrating visual features with NIR spectroscopy allows for the fusion of macroscopic morphological and microscopic compositional information, compensating for the limitations of each individual technique and providing a more comprehensive basis for both tenderness classification and component detection in fresh tea leaves.
At present, the integration of multi-source information is gaining momentum in the field of agricultural product detection. However, research specifically focused on Zijuan tea fresh leaves remains limited. As shown in Table 1, existing tea quality detection studies are primarily concentrated on conventional tea types such as green tea and black tea [9,10,11,12,13,14,15]. Owing to its unique varietal characteristics, particularly its high anthocyanin content and purplish leaf coloration, Zijuan tea exhibits distinct visual features and spectral response patterns compared to ordinary tea leaves [16]. Therefore, targeted fusion-based modeling research for this tea variety is urgently needed. In addition, key challenges lie in the efficient extraction of visual features, such as optimizing color space conversion and selecting informative texture descriptors, and in constructing suitable models that integrate spectral and visual data. Developing such fusion models to enable accurate detection of both leaf tenderness and anthocyanin content remains a critical task.
This study is based on the unique characteristics of the Zijuan tea variety and innovatively integrates near-infrared spectroscopy with visual features such as color and texture of fresh Zijuan tea leaves, aiming to establish models for tenderness classification and key anthocyanin content detection. Four preprocessing methods were applied to eliminate environmental noise and electronic signal interference in the NIR spectra, and three wavelength selection techniques were implemented to extract informative spectral bands and further optimize model performance. Finally, both linear and nonlinear machine learning models, along with a deep learning Convolutional Neural Network (CNN) model, were developed to achieve the accurate classification of leaf tenderness and prediction of key anthocyanin components. This research seeks to overcome the limitations of traditional detection methods by providing a technological foundation for the intelligent grading and rapid quality evaluation of Zijuan tea fresh leaves, promoting the transition of the tea industry toward greater precision and intelligence while also enriching the theoretical framework for multi-source data fusion in the detection of specialty tea resources.

2. Materials and Methods

2.1. Sample Preparation

The Zijuan fresh leaves used in this study were collected from the tea plantation of the Tea Research Institute, Chinese Academy of Agricultural Sciences (Hangzhou, Zhejiang, China). Based on different developmental stages, the fresh leaves were categorized and harvested as one bud with one leaf, one bud with two leaves, one bud with three leaves, one bud with four leaves, and fully mature leaves. A portion of the freshly harvested leaves was sealed in airtight bags and transported to the laboratory for near-infrared (NIR) spectral scanning, while another portion was rapidly frozen in liquid nitrogen (to prevent metabolite degradation) and transported on dry ice for anthocyanin content analysis. The brief flowchart of this study is shown in Figure 1.

2.2. Determination of Key Physicochemical Components in Zijuan Tea Fresh Leaves

2.2.1. Anthocyanin Extraction

The collected Zijuan tea fresh leaves were completely freeze-dried using a vacuum freeze dryer. Before this operation, the sample chamber was installed, and the sealing gasket was checked to ensure it was properly in place without any air leaks. The machine was pre-cooled for 30 min with the refrigeration temperature set to −30 °C. Mesh bags containing samples of different tenderness levels, which had been labeled accordingly, were placed inside the sample chamber. After sealing, vacuum was applied until the pressure dropped below 50 mTorr. The samples were freeze-dried completely over a period of three days. Finally, the dried samples were removed and ground into powder.
Anthocyanins were extracted using an acidified ethanol ultrasonic-assisted method. Briefly, 0.5 g of Zijuan tea freeze-dried powder was weighed and mixed with 10 mL of 1% (w/v) citric acid–60% (v/v) ethanol solution (pH 2.0). The mixture was vortexed for 1 min to ensure thorough wetting, and then ultrasonic extraction was performed in an ice bath protected from light at 200 W for 10 min while maintaining the temperature below 25 °C. The extract was subsequently centrifuged at 8000 rpm and 4 °C for 10 min. The supernatant was transferred to a 25 mL brown volumetric flask. The residue was washed with 5 mL of the same acidified ethanol solution and centrifuged again, and then the supernatants were combined and brought to volume with the extraction solvent to 25 mL. The solution was filtered through a 0.45 μm nylon membrane filter, and the filtrate was immediately used for analysis.

2.2.2. UHPLC-Q-Exactive/MS Analysis

Anthocyanin analysis was performed using the UHPLC-Q-Exactive/MS system. The chromatographic conditions were as follows: an Accucore™ C18 column (2.1 × 100 mm, 2.6 μm) was used. A binary gradient elution was applied at a flow rate of 0.3 mL/min. The mobile phases consisted of 0.1% formic acid in water (A) and 0.1% formic acid in acetonitrile (B). The gradient program was set as follows: 0 min, 5% B; 0–10 min, increased to 20% B; 10–15 min, increased to 30% B; 15–18 min, increased to 95% B and held for 2 min; 20 min, returned to 5% B and held for 2 min. The column temperature was maintained at 35 °C, and the injection volume was 2 μL.
The mass spectrometry conditions were set as follows: HESI-II heated electrospray ionization source; positive ion mode; resolution of 70,000 (Full MS) and 17,500 (dd-MS2); spray voltage of 3.8 kV; capillary temperature at 320 °C; auxiliary gas temperature set to 350 °C with a flow rate of 10 L/min; scan range (m/z) from 150 to 1500.

2.2.3. Standard Solution Preparation

Standard compounds including cyanidin-3-O-rutinoside, cyanidin-3-O-galactoside, cyanidin-3-O-glucoside, peonidin-3-O-galactoside, peonidin, peonidin-3-O-glucoside, malvin, delphinidin-3-O-galactoside, delphinidin-3-O-glucoside, pelargonidin-3-O-galactoside, pelargonidin-3-O-glucoside, pelargonidin, cyanidin, delphinidin, petunidin, petunidin-3-O-glucoside, and malvin-3-O-glucoside with a purity of no less than 98% were purchased from Sigma. The standards were dissolved in a solvent consisting of 2% formic acid in methanol–water solution (v:v, 50:50) to prepare stock solutions at a concentration of 1 mg/mL. Subsequently, volumes of 0.5, 1.0, 2.5, 10.0, 20.0, and 50.0 μL were pipetted to prepare mixed standard solutions of varying concentrations. The stock solutions were stored at −20 °C. Data acquisition was performed on the standard solutions using the above-described mass spectrometry method to establish an anthocyanin standard database. Calibration curves were generated for quantitative analysis of anthocyanins in samples.

2.2.4. Quantification of Anthocyanins in Zijuan Tea Fresh Leaves

The contents of major anthocyanins in tea leaf samples were calculated on the basis of calibration curves constructed from standard compounds.
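The calibration step can be illustrated with a minimal sketch, assuming a linear response between peak area and concentration. The standard concentrations and peak areas below are hypothetical, and the conversion to mg/g assumes that the 25 mL extract from 0.5 g of powder (Section 2.2.1) was injected without further dilution; neither the regression form nor the conversion is stated in the text.

```python
def fit_calibration(concentrations, peak_areas):
    """Least-squares line: peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_a = sum(peak_areas) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (a - mean_a)
              for c, a in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_a - slope * mean_c
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to read a concentration from a peak area."""
    return (area - intercept) / slope


def content_mg_per_g(conc_ug_per_ml, volume_ml=25.0, mass_g=0.5):
    """Convert extract concentration to leaf content (assumes no extra dilution)."""
    return conc_ug_per_ml * volume_ml / mass_g / 1000.0


# Hypothetical standards (ug/mL) with an idealised linear detector response.
std_conc = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
std_area = [2.0 * c + 1.0 for c in std_conc]

slope, intercept = fit_calibration(std_conc, std_area)
sample_conc = concentration_from_area(21.0, slope, intercept)
sample_content = content_mg_per_g(sample_conc)
```

One calibration line would be fitted per anthocyanin standard, and sample peak areas falling outside the calibrated range should be treated with caution.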

2.3. Near-Infrared Spectral Acquisition and Spectral Preprocessing

Near-infrared spectral data were collected using an IAS3100 near-infrared spectrometer from Wuxi Intelligent Analysis Service Co., Ltd., Wuxi, China. Diffuse reflectance spectral scanning was performed on freshly harvested Zijuan tea leaves at room temperature. The laboratory relative humidity was maintained at approximately 60%, and the temperature was controlled around 20 °C. Tea samples were thoroughly compacted in Petri dishes to ensure uniformity. After preheating the instrument for 30 min, spectral data were acquired. Each sample was scanned 10 times, with the sample thoroughly mixed before each scan. The average of the 10 scans was taken as the final spectral data for a specific sample. In total, 100 spectral datasets were collected for building classification and prediction models. The spectral detection range was 900–1700 nm, with 801 wavelength points.
Near-infrared spectral data often contain interference signals such as baseline drift and low-frequency noise caused by environmental lighting and instrument dark current. In this study, four preprocessing algorithms were applied: Savitzky-Golay (S-G) first derivative filtering was used to eliminate baseline drift and enhance peak boundary recognition, with a window width of 11 and a polynomial order of 3 [17,18]; Standard Normal Variate (SNV) transformation was performed to remove baseline drift and spectral intensity variations through centering and scaling [19,20]; one-dimensional median filtering (Medfilt), a nonlinear signal processing method, was used to replace original spectral values with local median values within a sliding window to eliminate noise caused by instrument fluctuations and outliers from sudden environmental changes [21,22]; and normalization (Normaliz) was utilized to scale each spectrum to unit length by dividing by its Euclidean norm to eliminate variations due to differences in optical path length [23].

2.4. Extraction of Visual Color and Texture Features

A small image processing program developed using the GUI module of MATLAB software was used to extract color and texture features of Zijuan tea fresh leaves at different tenderness levels [24]. The color features included R, G, B, H, S, V, 2G-R-B, R/G, and hab*, while the texture features comprised smoothness (r), standard deviation (δ), entropy (e), uniformity (U), and mean gray level (m).
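As an illustration of how such descriptors are commonly computed, the sketch below derives several of the color indices from mean R, G, and B values and the five texture statistics from a gray-level histogram. The formulas follow the usual histogram-based definitions (smoothness as 1 − 1/(1 + σ²) with normalized variance, uniformity as the sum of squared probabilities), which is an assumption because the text does not give them; hab* is omitted because it requires a CIELAB conversion, and the function names are hypothetical.

```python
import colorsys
import math


def color_features(r, g, b):
    """Color indices from mean 8-bit R, G, B values (g must be non-zero)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return {"R": r, "G": g, "B": b, "H": h, "S": s, "V": v,
            "2G-R-B": 2 * g - r - b, "R/G": r / g}


def texture_features(gray_pixels, levels=256):
    """Histogram-based texture statistics of a flat list of gray levels."""
    n = len(gray_pixels)
    hist = [0] * levels
    for p in gray_pixels:
        hist[p] += 1
    prob = [c / n for c in hist]
    mean = sum(z * p for z, p in enumerate(prob))
    var = sum((z - mean) ** 2 * p for z, p in enumerate(prob))
    var_norm = var / (levels - 1) ** 2  # variance scaled to [0, 1]
    return {
        "m": mean,                              # mean gray level
        "sigma": math.sqrt(var),                # standard deviation
        "r": 1.0 - 1.0 / (1.0 + var_norm),      # smoothness
        "U": sum(p * p for p in prob),          # uniformity
        "e": -sum(p * math.log2(p) for p in prob if p > 0),  # entropy
    }


# Hypothetical inputs: one mean color and a tiny two-level gray patch.
leaf_color = color_features(200, 100, 50)
leaf_texture = texture_features([0] * 8 + [255] * 8)
```

A perfectly uniform patch gives zero smoothness and entropy and a uniformity of 1, whereas the two-level patch above gives an entropy of 1 bit.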

2.5. Data Fusion Strategies

Three data analysis methods were proposed to provide effective quality assessment and intuitive data support for Zijuan tea fresh leaves: (1) a discrimination model and anthocyanin quantitative detection model based on near-infrared spectroscopy; (2) a discrimination model and anthocyanin quantitative detection model based on color and texture features; and (3) a discrimination model and anthocyanin quantitative detection model based on data fusion. The three data fusion strategies are shown in Table 2. As shown in Figure 1, the data fusion method directly concatenates near-infrared spectral information with color and texture features. The fused data is then subjected to preprocessing and wavelength selection, followed by dimensionality reduction using PCA to establish models for determining the tenderness of Zijuan tea fresh leaves and predicting the content of key anthocyanins.
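The concatenation and PCA steps of the fusion strategy can be sketched as below. The example is a simplified stand-in written with the Python standard library: PCA is computed by power iteration with deflation rather than by the routine used in the study, the preprocessing and wavelength selection steps are left out, and the tiny data set is hypothetical. How the spectral and visual blocks were scaled relative to each other before fusion is not described in the text, so no scaling is applied here.

```python
import math


def fuse(nir_rows, tex_rows):
    """Low-level fusion: append the visual features to each spectrum."""
    return [list(n) + list(t) for n, t in zip(nir_rows, tex_rows)]


def pca_scores(X, k, iters=200):
    """First k components and scores via power iteration with deflation.

    Requires k to be no larger than the rank of the centred data.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    resid = [[row[j] - means[j] for j in range(p)] for row in X]
    components, scores = [], []
    for _pc in range(k):
        v = [1.0 / (j + 1) for j in range(p)]  # deterministic start vector
        for _ in range(iters):
            t = [sum(a * b for a, b in zip(row, v)) for row in resid]
            w = [sum(resid[i][j] * t[i] for i in range(n)) for j in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        t = [sum(a * b for a, b in zip(row, v)) for row in resid]
        components.append(v)
        scores.append(t)
        # Remove the extracted component before searching for the next one.
        resid = [[resid[i][j] - t[i] * v[j] for j in range(p)]
                 for i in range(n)]
    return components, scores


# Hypothetical data: 4 samples, 3 spectral variables, 2 visual features.
nir = [[1.0, 0.1, 0.2], [2.0, 0.1, 0.1], [3.0, 0.2, 0.2], [4.0, 0.2, 0.1]]
tex = [[0.5, 0.1], [0.4, 0.2], [0.6, 0.1], [0.5, 0.3]]
fused = fuse(nir, tex)
components, scores = pca_scores(fused, k=2)
```

With real data, each fused row would hold the spectral variables followed by the color and texture features, and the retained scores would serve as the model inputs.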

2.6. Establishment of Tenderness Classification Models for Zijuan Tea Fresh Leaves

In this study, tenderness classification models for Zijuan tea fresh leaves were developed using the machine learning methods Particle Swarm Optimization–Support Vector Machine (PSO-SVM) [25] and Random Forest (RF) [26], together with a deep learning Convolutional Neural Network (CNN) [27]. PSO-SVM combines the ability of SVM to handle high-dimensional, small-sample data with the automatic parameter optimization of PSO, and can efficiently construct tenderness classification models with strong discriminative ability and good generalization performance. The high precision, robustness, automatic feature importance assessment, and relatively low parameter-tuning requirements of RF make it a powerful tool for building discriminative models. CNN is also well suited to discriminative tasks: it can automatically learn spectral features that are difficult to describe or quantify by eye, as well as subtle visual features related to tenderness. Prior to modeling, the data were preprocessed and subjected to Principal Component Analysis (PCA) [28]. The number of principal components (PCs) used as model input was the one that gave the minimum root mean square error in the training set. The Kennard-Stone (K-S) algorithm [29] was used to divide the dataset into training and prediction sets at a ratio of 4:1, resulting in 80 samples for training and 20 samples for prediction.
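The Kennard-Stone split can be sketched as follows, assuming Euclidean distances between sample feature vectors. Whether the study applied the algorithm to raw spectra or to PCA scores is not stated, and the one-dimensional toy data are hypothetical.

```python
import math


def kennard_stone(X, n_train):
    """Return (train_indices, test_indices) chosen by the Kennard-Stone rule.

    X is a list of feature vectors and n_train must be at least 2.
    """
    n = len(X)
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    # Start with the two most distant samples.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda ij: dist[ij[0]][ij[1]])
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in (i0, j0)]
    while len(selected) < n_train:
        # Add the sample whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Hypothetical 1-D samples; a 4:1 split of 5 samples keeps 4 for training.
samples = [[0.0], [1.0], [2.5], [3.0], [10.0]]
train_idx, test_idx = kennard_stone(samples, n_train=4)
```

Because the rule keeps adding the sample farthest from those already chosen, the training set spans the feature space and the held-out samples tend to lie inside it. For the 100 spectra in this study, a 4:1 split corresponds to `n_train=80`.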

2.7. Development of Prediction Models for Key Anthocyanin Content in Zijuan Tea Fresh Leaves

Despite the preprocessing steps, the near-infrared spectral data of Zijuan tea fresh leaves still contained a large amount of redundant information, which can lead to model overfitting or underfitting. To address this issue, this study employed three feature wavelength selection algorithms to significantly reduce the number of spectral wavelengths: Competitive Adaptive Reweighted Sampling (CARS) [30], Bootstrapping Soft Shrinkage (BOSS) [31], and Successive Projections Algorithm (SPA) [32].
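Of the three algorithms, SPA is the simplest to illustrate. The sketch below shows only its projection phase, in which each new wavelength is the one least collinear with the wavelength chosen in the previous step. The complete algorithm also compares candidate chains by validation error, which is omitted here. The function name, starting column, and toy matrix are hypothetical.

```python
def spa_select(X, start, n_select):
    """Projection phase of SPA on a samples-by-wavelengths matrix X.

    n_select must not exceed the rank of X, otherwise a selected
    column may collapse to zero and the projection is undefined.
    """
    n_rows, n_cols = len(X), len(X[0])
    # Work on a column-wise copy so the projections do not alter X.
    cols = [[X[i][j] for i in range(n_rows)] for j in range(n_cols)]
    selected = [start]
    for _ in range(n_select - 1):
        ref = cols[selected[-1]]
        ref_sq = sum(v * v for v in ref)
        best, best_norm = None, -1.0
        for j in range(n_cols):
            if j in selected:
                continue
            # Project column j onto the subspace orthogonal to ref.
            coef = sum(a * b for a, b in zip(cols[j], ref)) / ref_sq
            cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm = sum(v * v for v in cols[j])
            if norm > best_norm:
                best, best_norm = j, norm
        selected.append(best)
    return selected


# Hypothetical matrix: 3 samples, 4 wavelengths; column 1 duplicates column 0.
X = [[1.0, 2.0, 0.0, 0.1],
     [0.0, 0.0, 1.0, 0.1],
     [0.0, 0.0, 0.0, 3.0]]
chosen = spa_select(X, start=0, n_select=3)
```

In this toy case the collinear column 1 is never chosen, which is the behavior that makes SPA useful for strongly correlated neighboring NIR bands.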
After feature wavelength selection, Principal Component Analysis (PCA) was applied to the spectral data for dimensionality reduction. This method transforms the selected feature wavelengths into a smaller number of principal components, retaining the main spectral information while reducing the model input size and improving detection efficiency.
Prediction models for key anthocyanin content in Zijuan tea fresh leaves were developed using three traditional machine learning algorithms: Partial Least Squares Regression (PLSR) [20], nonlinear Support Vector Regression (SVR) [33], and Random Forest (RF) [34]. PLSR effectively handles spectral collinearity, avoiding the ill-conditioned solutions inherent in traditional linear regression. SVR accurately captures the nonlinear response between anthocyanins and spectral data by mapping the data to a high-dimensional space using the RBF or polynomial kernel function. RF demonstrates exceptional robustness to noisy spectral data (e.g., scattering interference) by reducing variance through bootstrap aggregation of hundreds of decision trees. Five-fold cross-validation was performed on the training set to optimize the number of decision trees for RF, the penalty parameter for SVR, and the optimal number of principal components for PLSR. Model performance was evaluated using the calibration set correlation coefficient (Rc), prediction set correlation coefficient (Rp), root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), and relative percentage deviation (RPD) as the final metrics.
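The evaluation metrics can be computed as in the following sketch. It assumes the definition of RPD that is common in chemometrics, namely the standard deviation of the reference values divided by the RMSE of the same set; the text does not state its formula, so this is an assumption. The reference and predicted values are hypothetical.

```python
import math
import statistics


def pearson_r(y_true, y_pred):
    """Correlation coefficient between reference and predicted values."""
    mt = statistics.mean(y_true)
    mp = statistics.mean(y_pred)
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    st = sum((a - mt) ** 2 for a in y_true)
    sp = sum((b - mp) ** 2 for b in y_pred)
    return cov / math.sqrt(st * sp)


def rmse(y_true, y_pred):
    """Root mean square error (RMSEC or RMSEP, depending on the set)."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def rpd(y_true, y_pred):
    """Assumed definition: SD of reference values divided by RMSE."""
    return statistics.stdev(y_true) / rmse(y_true, y_pred)


# Hypothetical prediction-set values (mg/g).
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
rp = pearson_r(reference, predicted)
rmsep = rmse(reference, predicted)
rpd_value = rpd(reference, predicted)
```

Applying the same functions to the calibration set gives Rc and RMSEC, and a large gap between the calibration and prediction values is the usual sign of overfitting.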

2.8. Data Analysis Software

Normality tests, homogeneity of variance tests and nonparametric tests in this study were performed using IBM SPSS Statistics 27 software. Discrimination and prediction models were developed using MATLAB 2019a. All plots were created using Origin 2022 software.

3. Results and Analysis

3.1. Analysis of Anthocyanin Content Differences in Zijuan Tea Fresh Leaves

In this study, a total of 16 anthocyanins were measured in Zijuan tea fresh leaves. Through statistical analysis, six anthocyanins with relatively high contents were classified as key anthocyanin components, while ten with lower contents were classified as trace anthocyanin components. The Kolmogorov-Smirnov (K-S) test was performed on the anthocyanin components at different tenderness levels, revealing that none followed a normal distribution. Homogeneity of variance tests showed significance levels below 0.05, indicating unequal variances. Therefore, the Kruskal–Wallis test for independent samples was used to analyze differences among tenderness levels. The differences in anthocyanin content across tenderness levels are shown in Figure 2, with tenderness grades defined from one bud one leaf to fully mature leaf as Level 1 through Level 5, respectively. As shown in Table 3, the total anthocyanin content peaked at Level 4, while fully mature leaves (Level 5) had the lowest anthocyanin content. Figure 2a indicates that only a few adjacent tenderness levels showed no significant differences (p > 0.05) in component content, while most key anthocyanin components exhibited significant or highly significant differences (p < 0.05) across tenderness levels. The key anthocyanin components—delphinidin-3-O-galactoside, delphinidin-3-O-glucoside, cyanidin-3,5-O-diglucoside, cyanidin-3-O-glucoside, petunidin, and cyanidin—had content ranges of 0.895–1.651 mg/g, 0.513–0.910 mg/g, 2.058–2.695 mg/g, 5.009–6.116 mg/g, 5.768–8.923 mg/g, and 0.179–0.789 mg/g, respectively. Figure 2b presents the distribution of trace anthocyanin components in Zijuan tea at different tenderness levels. The detailed content of trace anthocyanins is shown in Table S1. The contents of cyanidin-3-O-rutinoside and pelargonidin-3,5-O-diglucoside generally increased with tenderness level. 
The contents of peonidin-3-O-galactoside, peonidin-3-O-glucoside, malvidin-3-O-glucoside, delphinidin, pelargonidin, peonidin, and malvidin first decreased and then increased as the tenderness level increased, reaching their highest values at Level 5. This indicates that trace anthocyanin components constitute the largest proportion in fully mature leaves.

3.2. Analysis of Visual Color and Texture Features

The Kruskal–Wallis test for independent samples was used to analyze differences in color and texture features among different tenderness levels, and violin plots were drawn. As depicted in Figure 3, there were generally no significant differences (p > 0.05) in color and texture features from Level 1 to Level 4, while Level 5 showed highly significant differences (p < 0.001) compared to Levels 1–4. Leaf maturity progressed as the tenderness level increased from Level 1 to Level 5. Anthocyanins, the core source of Zijuan tea’s purple color, give the leaves a rich and uniform purple hue when present in high concentrations. Leaf color transitioned from light purple to deep purple and then to green, with the highest anthocyanin content observed at Level 4. At Level 5, the R, G, B, S, L, 2G-R-B, and mean gray level (m) were at their highest values, whereas R/G, standard deviation (δ), hue (H), and smoothness (r) were at their lowest.

3.3. PCA Cluster Analysis

To visually demonstrate the differences in anthocyanin content among different tenderness levels of Zijuan tea fresh leaves, PCA cluster analysis was performed on key anthocyanin components (Figure 4c) and trace components (Figure 4d). Clear clustering trends were observed among different tenderness levels, as shown in Figure 4c,d, indicating significant differences in both key and trace anthocyanin components. For the key anthocyanins, the first three principal components accounted for 62.9% (PC1), 20.7% (PC2), and 12.0% (PC3) of the variance, totaling 95.6% of the overall information. For the trace anthocyanins, PC1, PC2, and PC3 accounted for 77.3%, 16.7%, and 4.7% of the variance, respectively, covering 98.7% of the overall information. The 3D PCA clustering effectively captured the comprehensive information of the samples.
Figure 4a and Figure 4b show the PCA clustering of near-infrared spectra and color–texture features, respectively, for Zijuan tea fresh leaves at different tenderness levels. As seen in Figure 4a, Level 5 samples are distinctly clustered together, while Levels 1 to 4 are difficult to separate. Figure 4b reveals that PCA clustering based on color and texture features alone is insufficient to discriminate between different tenderness levels of Zijuan tea. Therefore, this study aimed to develop machine learning models to accurately classify the tenderness levels of Zijuan tea fresh leaves, providing a basis for selecting high-quality tea raw materials.

3.4. Preprocessing Near-Infrared Spectra of Zijuan Tea Fresh Leaves and Establishment of Tenderness Classification Models

The raw spectra are shown in Figure 5a. Due to significant noise in the 1650–1700 nm range, this segment was removed, retaining only the 900–1650 nm range and yielding 751 spectral features. The spectra after preprocessing by four methods—Savitzky-Golay (S-G), Standard Normal Variate (SNV), median filtering (Medfilt), and normalization (Normaliz)—are shown in Figure 5b–e. Figure 5f displays the average spectra of Zijuan tea fresh leaves at five tenderness levels. It can be observed that the absorbance decreases progressively from Level 1 to Level 5. As the leaf position moves lower on the plant, the leaves become more mature and darker in color; the high pigment content and dense structure inhibit diffuse reflectance. The more difficult the diffuse reflectance is, the lower the absorbance. The near-infrared spectra of Zijuan tea fresh leaves reflect their rich internal chemical composition. The absorption peak near 970–990 nm is related to the first overtone of O–H bonds in the molecules, mainly originating from moisture and free or weakly hydrogen-bonded hydroxyl groups (such as polyphenols and free water). Meanwhile, the absorption around 1150 nm is primarily associated with the first overtone of C–H bonds and combination vibrations of C–H bending. A strong absorption peak appears near 1390–1420 nm, related to the second overtone of O–H groups, which is one of the strongest water absorption peaks in the NIR region. The 1500–1600 nm region corresponds to the second overtone of C–H and combination bands of O–H, mainly characterizing complex organic compounds.
Tenderness classification models for Zijuan tea fresh leaves were developed using the machine learning methods PSO-SVM, Random Forest (RF) and deep learning CNN. Table 4, Table 5 and Table 6 present the accuracy of these classification models. Interestingly, among all models, Medfilt preprocessing consistently demonstrated excellent classification performance. For near-infrared spectroscopy (NIR), the PSO-SVM model with Medfilt preprocessing used 16 principal components (PCs) and achieved calibration and prediction accuracies of 96.50% and 90%, respectively; the RF model with 11 PCs achieved 100% and 95% accuracy on calibration and prediction sets, respectively; while the CNN model with 11 PCs achieved 100% accuracy on both training and prediction sets. For color and texture features (TEX), the PSO-SVM model with Medfilt preprocessing (10 PCs) reached 100% and 90% calibration and prediction accuracy; the RF model (10 PCs) achieved 100% accuracy on both sets; whereas the CNN model (10 PCs) obtained 97.5% calibration accuracy and 85% prediction accuracy. For the fused data (NIR + TEX), the PSO-SVM model with Medfilt preprocessing (9 PCs) achieved 100% calibration accuracy and 85% prediction accuracy; the RF and CNN models (25 PCs) both reached 100% calibration accuracy and 95% prediction accuracy. In summary, the TEX + Medfilt + RF and NIR + Medfilt + CNN combinations were the best-performing models for tenderness classification of Zijuan tea fresh leaves. The confusion matrix is shown in Figure 6. The data fusion of NIR and TEX did not significantly improve model performance, while it increased the input dimensionality.
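The reported accuracies follow directly from a confusion matrix. As a minimal sketch, assuming a hypothetical prediction set of 20 leaves with four per level (the per-level counts are not given in the text), a single misclassification yields the 95% accuracy reported for several models.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true levels, columns are predicted levels."""
    index = {lab: k for k, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


levels = [1, 2, 3, 4, 5]
# Hypothetical prediction set: 4 leaves per level, 20 in total.
y_true = [lvl for lvl in levels for _ in range(4)]
y_pred = list(y_true)
y_pred[13] = 3  # one Level 4 leaf predicted as Level 3

cm = confusion_matrix(y_true, y_pred, levels)
acc = accuracy(cm)
```

With 20 prediction samples, each misclassified leaf lowers the accuracy by five percentage points, so small differences between models in the prediction set correspond to only one or two leaves.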

3.5. Analysis of Near-Infrared Spectra of Zijuan Tea Fresh Leaves and Optimization of Data Fusion Preprocessing

Although the spectral trends among different samples are similar, there are subtle differences in absorbance that are difficult to detect with the naked eye. Therefore, a nonlinear Support Vector Regression (SVR) algorithm was employed to develop prediction models for key anthocyanin components. The number of spectral bands remained unchanged after preprocessing; however, to reduce model input dimensionality, Principal Component Analysis (PCA) was applied to the preprocessed spectra before modeling. The optimal number of principal components (PCs) was selected based on the smallest root mean square error of calibration (RMSEC) in the training set, and these PCs were used as input features for the model.
The evaluation metrics of the prediction models for the six key anthocyanin components and total anthocyanins, established using near-infrared spectroscopy (NIR), color–texture features (TEX), and fused NIR + TEX data, are presented in Tables 7–13. For the delphinidin-3-O-galactoside prediction model, the best preprocessing method for all three data types was Medfilt. Both NIR + Medfilt and NIR + TEX + Medfilt showed high accuracy but exhibited overfitting; therefore, the optimal model combination was TEX + Medfilt, which used the smallest number of principal components (PCs) and achieved Rc = 0.98, Rp = 0.98, and RPD = 4.687. In the delphinidin-3-O-glucoside model, SNV, S-G, and Medfilt were the best preprocessing methods for NIR, TEX, and NIR + TEX, respectively. Since NIR + SNV and NIR + TEX + Medfilt showed overfitting, the optimal combination was TEX + S-G with only nine PCs, achieving Rc = 0.94, Rp = 0.96, and RPD = 3.718. For the cyanidin-3,5-O-diglucoside model, fused data showed better performance, with NIR + TEX + Medfilt as the optimal combination, achieving Rc = 0.92, Rp = 0.93, and RPD = 2.736. In the cyanidin-3-O-glucoside model, the best prediction performance was obtained using Medfilt preprocessing on single NIR data, with Rc = 0.95, Rp = 0.95, and RPD = 3.267; other models exhibited overfitting. For the petunidin model, all preprocessed models showed overfitting, and the best combination was NIR + Medfilt; feature band selection was therefore applied to this model to improve prediction performance (Section 3.6). In the cyanidin model, TEX + Normaliz showed underfitting, and the best prediction combination was NIR + Medfilt, with Rc = 0.93, Rp = 0.90, and RPD = 2.303. For the total anthocyanins model, TEX + S-G was the optimal combination, achieving Rc = 0.97, Rp = 0.96, and RPD = 3.280.

3.6. Near-Infrared Spectral Feature Band Selection

After data fusion and preprocessing optimization, the prediction accuracy of the best model combination for petunidin was unsatisfactory. Therefore, three feature band selection algorithms (CARS, BOSS, and SPA) were applied to further improve the model’s predictive performance (Table 14). As shown in Table 14, the CARS algorithm selected 52 feature bands from the 751 near-infrared spectral bands. After PCA dimensionality reduction, only 11 principal components were retained. The resulting model improved the correlation coefficients of both the training and prediction sets by 0.02, with the RPD value increasing from 2.465 to 2.584, indicating only a slight improvement in model accuracy. The BOSS algorithm selected 20 feature bands, and after PCA, the input features were reduced to eight. While the training set correlation coefficient increased by 0.02 compared to the original model, the prediction set correlation coefficient decreased by 0.02, exacerbating overfitting. The RPD value also dropped significantly. The SPA algorithm selected 37 feature bands (Figure 7), which were reduced to five principal components by PCA. The model’s training set correlation coefficient remained unchanged, but the prediction set correlation coefficient increased by 0.04, significantly alleviating overfitting. The RPD value rose from 2.465 to 2.888, indicating stronger predictive performance. Therefore, for the key anthocyanin petunidin, the optimal combination was NIR + Medfilt + SPA, achieving Rc = 0.97, Rp = 0.95, and an RPD value of 2.888.

3.7. Optimization of Prediction Models for Key Anthocyanin Components

The nonlinear Support Vector Regression (SVR) model can effectively fit complex nonlinear relationships by using the kernel trick to find a function such that most data points lie within an ε-insensitive margin around it while minimizing the loss caused by points outside this margin (support vectors) [35]. Given the complex linear and nonlinear relationships in near-infrared spectra, this study also established linear Partial Least Squares Regression (PLSR) and nonlinear Random Forest (RF) models to comprehensively optimize the prediction of key anthocyanin components in Zijuan tea fresh leaves (Table 15). The PLSR models for total anthocyanins and key components showed signs of underfitting, whereas the RF models improved training accuracy compared to SVR but decreased prediction accuracy, intensifying model overfitting. Figure 8 shows scatter plots of actual versus predicted values for key anthocyanin components. According to model performance metrics, the best models were TEX + Medfilt + SVR for delphinidin-3-O-galactoside (Rc = 0.98, Rp = 0.98, RPD = 4.687); TEX + S-G + SVR for delphinidin-3-O-glucoside (Rc = 0.94, Rp = 0.96, RPD = 3.718) and cyanidin-3,5-O-diglucoside (Rc = 0.96, Rp = 0.94, RPD = 2.755); NIR + Medfilt + SVR for cyanidin-3-O-glucoside (Rc = 0.95, Rp = 0.95, RPD = 3.267) and cyanidin (Rc = 0.93, Rp = 0.90, RPD = 2.304); NIR + Medfilt + SPA + SVR for petunidin (Rc = 0.97, Rp = 0.95, RPD = 2.888); and TEX + S-G + SVR for total anthocyanins (Rc = 0.97, Rp = 0.96, RPD = 3.280).
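The ε-insensitive margin described above can be made concrete with the loss that SVR minimizes. The sketch below uses the standard ε-insensitive loss, in which residuals inside the margin cost nothing and residuals outside it are penalized linearly. The function name `epsilon_insensitive_loss`, the value ε = 0.1, and the example numbers are hypothetical; the SVR settings used in this study are not stated in this section.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample epsilon-insensitive loss: max(0, |y - f(x)| - epsilon)."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]


# Hypothetical reference and predicted values (not data from this study)
reference = [1.0, 2.0, 3.0]
predicted = [1.05, 2.5, 2.0]
losses = epsilon_insensitive_loss(reference, predicted, epsilon=0.1)
for t, p, loss in zip(reference, predicted, losses):
    inside = "inside" if loss == 0.0 else "outside"
    print(f"y = {t}, prediction = {p}: {inside} the margin, loss = {loss:.2f}")
```

Only the samples with a non-zero loss pull on the fitted function, which is one reason SVR can be less sensitive to small measurement noise than a squared-error fit.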

4. Conclusions

This study employed near-infrared spectroscopy combined with visual features, integrating spectral preprocessing and variable selection techniques to develop models for Zijuan tea fresh leaf tenderness classification and key anthocyanin content prediction. From the results, the main conclusions can be drawn as follows:
  1. The differences in anthocyanin content among tenderness levels of Zijuan tea fresh leaves were systematically analyzed, revealing significant or highly significant variation in key anthocyanin components across tenderness grades. Color and texture features showed no significant differences from one bud one leaf to one bud four leaves, while fully mature leaves differed highly significantly from these younger stages. PCA clustering effectively distinguished anthocyanin content among different tenderness levels.
  2. The best tenderness classification model combinations were TEX + Medfilt + RF and NIR + Medfilt + CNN, both achieving 100% accuracy on the training and prediction sets.
  3. Prediction models for key anthocyanin components were established, enabling quantitative prediction of six major anthocyanins and total anthocyanins. The optimal models were delphinidin-3-O-galactoside with TEX + Medfilt + SVR (Rc = 0.98, Rp = 0.98, RPD = 4.687); delphinidin-3-O-glucoside with TEX + S-G + SVR (Rc = 0.94, Rp = 0.96, RPD = 3.718); cyanidin-3,5-O-diglucoside with TEX + S-G + SVR (Rc = 0.96, Rp = 0.94, RPD = 2.755); cyanidin-3-O-glucoside with NIR + Medfilt + SVR (Rc = 0.95, Rp = 0.95, RPD = 3.267); petunidin with NIR + Medfilt + SPA + SVR (Rc = 0.97, Rp = 0.95, RPD = 2.888); cyanidin with NIR + Medfilt + SVR (Rc = 0.93, Rp = 0.90, RPD = 2.304); and total anthocyanins with TEX + S-G + SVR (Rc = 0.97, Rp = 0.96, RPD = 3.280). All models demonstrated excellent predictive performance.

This research provides an important theoretical foundation for mechanized harvesting, intelligent grading, and quality evaluation of the specialty Zijuan tea resource, laying the groundwork for the digitalization and intelligent upgrading of the specialty tea industry.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/foods14172938/s1, Table S1: The content of trace anthocyanins in different tenderness grades.

Author Contributions

S.C.: data curation and writing—original draft. F.D.: methodology, software, and conceptualization. M.G.: investigation, formal analysis, and writing—review and editing. C.D.: conceptualization, methodology, funding acquisition, resources, and supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Innovation Project of SAAS (CXGC2025A02), the Agricultural Science and Technology Research Project of Jinan City (GG202415), the Key R&D Projects in Zhejiang Province (2023C02043), and the Technology System of Modern Agricultural Industry in Shandong Province (SDAIT19).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Conflicts of Interest

Author Shuya Chen was employed by the company Shandong Guohe Industrial Technology Institute. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wang, S.; Altaner, C.; Feng, L.; Liu, P.; Song, Z.; Li, L.; Gui, A.; Wang, X.; Ning, J.; Zheng, P. A review: Integration of NIRS and chemometric methods for tea quality control-principles, spectral preprocessing methods, machine learning algorithms, research progress, and future directions. Food Res. Int. 2025, 205, 115870.
  2. Zou, C.; Li, R.-Y.; Chen, J.-X.; Wang, F.; Gao, Y.; Fu, Y.-Q.; Xu, Y.-Q.; Yin, J.-F. Zijuan tea-based kombucha: Physicochemical, sensorial, and antioxidant profile. Food Chem. 2021, 363, 130322.
  3. Chen, Y.; Yang, J.; Meng, Q.; Tong, H. Non-volatile metabolites profiling analysis reveals the tea flavor of “Zijuan” in different tea plantations. Food Chem. 2023, 412, 135534.
  4. Dai, F.; Shi, J.; Yang, C.; Li, Y.; Zhao, Y.; Liu, Z.; An, T.; Li, X.; Yan, P.; Dong, C. Detection of anthocyanin content in fresh Zijuan tea leaves based on hyperspectral imaging. Food Control 2023, 152, 109839.
  5. Guo, M.; Chen, Z.; Ding, Z.; Wang, D.; Qi, D.; Lu, M.; Wang, M.; Dong, C. Traceability of Rizhao green tea origin based on multispectral data fusion strategy and chemometrics. Food Chem. X 2025, 27, 102346.
  6. Zhong, K.; Hu, X.; Li, Y.; Tang, L.; Sun, X.; Li, X.; Zhang, J.; Meng, Y.; Ma, R.; Wang, S.; et al. A colorimetric and NIR fluorescent probe for ultrafast detecting bisulfite and organic amines and its applications in food, imaging, and monitoring fish freshness. Food Chem. 2024, 438, 137987.
  7. Liang, J.; Guo, J.; Xia, H.; Ma, C.; Qiao, X. A black tea quality testing method for scale production using CV and NIRS with TCN for spectral feature extraction. Food Chem. 2025, 464 Pt 1, 141567.
  8. Zhang, J.; Zhang, Y.; Zhou, G.; Li, C.; Wen, L.; Li, W. Adulteration identification strategy for Acanthopanax Senticosus based on data fusion of portable mass spectrometry and near-infrared spectroscopy. Food Chem. 2025, 491, 145239.
  9. Wu, L.; Xu, Q.; Su, C.; Yin, X.; Huo, X.; Zhao, X.; Zhou, Y.; Huang, J. Classification of quality grading of Anji white tea using hyperspectral imaging and data fusion techniques. J. Food Compos. Anal. 2025, 142, 107563.
  10. Wang, Z.; Han, Y.; Zhang, L.; Ye, Y.; Wei, L.; Li, L. The utilization of a data fusion approach to investigate fingerprint profiles of dark tea from China’s different altitudes. Food Chem. X 2024, 22, 101447.
  11. Chen, Y.; Guo, M.; Chen, K.; Jiang, X.; Ding, Z.; Zhang, H.; Lu, M.; Qi, D.; Dong, C. Predictive models for sensory score and physicochemical composition of Yuezhou Longjing tea using near-infrared spectroscopy and data fusion. Talanta 2024, 273, 125892.
  12. Zhang, L.; Zhang, Z.; Hou, Y.Z.; Ocholi, S.S.; Wang, L.; Fu, Z.; Liu, C.; Zhang, Z.; Han, L. Evaluation of changes in chemical composition and antioxidant activities from vine tea at different harvest times based on LC-MS, GC-MS, and data fusion algorithms. Food Chem. X 2025, 27, 102363.
  13. Li, L.; Chen, Y.; Dong, S.; Shen, J.; Cao, S.; Cui, Q.; Song, Y.; Ning, J. Rapid and comprehensive grade evaluation of Keemun black tea using efficient multidimensional data fusion. Food Chem. X 2023, 20, 100924.
  14. Chen, Q.; Sun, C.; Ouyang, Q.; Wang, Y.; Liu, A.; Li, H.; Zhao, J. Classification of different varieties of Oolong tea using novel artificial sensing tools and data fusion. LWT-Food Sci. Technol. 2015, 60, 781–787.
  15. Yin, Y.; Li, J.; Ling, C.; Zhang, S.; Liu, C.; Sun, X.; Wu, J. Fusing spectral and image information for characterization of black tea grade based on hyperspectral technology. LWT 2023, 185, 115150.
  16. Tan, L.; Zhang, P.; Cui, D.; Yang, X.; Zhang, D.; Yang, Y.; Chen, W.; Tang, D.; Tang, Q.; Li, P. Multi-omics analysis revealed anthocyanin accumulation differences in purple tea plants ‘Ziyan’, ‘Zijuan’ and their dark-purple hybrid. Sci. Hortic. 2023, 321, 112275.
  17. Shi, X.; Song, J.; Wang, H.; Lv, X.; Zhu, Y.; Zhang, W.; Bu, W.; Zeng, L. Improving soil organic matter estimation accuracy by combining optimal spectral preprocessing and feature selection methods based on pXRF and vis-NIR data fusion. Geoderma 2023, 430, 116301.
  18. Ndao, M.L.; Youness, G.; Niang, N.; Saporta, G. Improving predictive maintenance: Evaluating the impact of preprocessing and model complexity on the effectiveness of eXplainable Artificial Intelligence methods. Eng. Appl. Artif. Intell. 2025, 144, 110144.
  19. Bi, Y.; Yuan, K.; Xiao, W.; Wu, J.; Shi, C.; Xia, J.; Chu, G.; Zhang, G.; Zhou, G. A local pre-processing method for near-infrared spectra, combined with spectral segmentation and standard normal variate transformation. Anal. Chim. Acta 2016, 909, 30–40.
  20. Jiang, X.; Cao, X.; Liu, Q.; Wang, F.; Fan, S.; Yan, L.; Wei, Y.; Chen, Y.; Yang, G.; Xu, B.; et al. Prediction of multi-task physicochemical indices based on hyperspectral imaging and analysis of the relationship between physicochemical composition and sensory quality of tea. Food Res. Int. 2025, 211, 116455.
  21. Guo, M.; Zhao, Y.; Chen, Y.; Chen, Z.; Chen, Y.; Qi, D.; Lu, M.; Dong, C. Determination of quality differences and origin tracing of green tea from different latitudes based on TG-FTIR and machine learning. Food Res. Int. 2025, 203, 115853.
  22. Guhdar, M.; Mohammed, A.O.; Mstafa, R.J. Advanced deep learning framework for ECG arrhythmia classification using 1D-CNN with attention mechanism. Knowl.-Based Syst. 2025, 315, 113301.
  23. Mirikharaji, Z.; Abhishek, K.; Bissoto, A.; Barata, C.; Avila, S.; Valle, E.; Celebi, M.E.; Hamarneh, G. A survey on deep learning for skin lesion segmentation. Med. Image Anal. 2023, 88, 102863.
  24. Sheng, C.; Lu, M.; Zhang, J.; Zhao, W.; Jiang, Y.; Li, T.; Wang, Y.; Ning, J. Metabolomics and electronic-tongue analysis reveal differences in color and taste quality of large-leaf yellow tea under different roasting methods. Food Chem. X 2024, 23, 101721.
  25. Wang, Z.; Ahmad, W.; Zhao, S.; Zhu, A.; Huo, S.; Chen, Q. Temporal analysis of non-targeted flavor component variation in green tea storage and rapid prediction of storage duration. Food Chem. 2025, 464 Pt 3, 141898.
  26. Zhang, Y.; Chen, X.; Chen, D.; Zhu, L.; Wang, G.; Chen, Z. Machine learning-based classification and prediction of typical Chinese green tea taste profiles. Food Res. Int. 2025, 203, 115796.
  27. Shen, S.; Ren, N.; Zheng, H.; Xue, X.; Ye, Y.; Liu, T.; Zhang, Q.; Yu, G. Rapid and real time detection of black tea rolling quality by using an inexpensive machine vison system. Food Res. Int. 2025, 205, 115983.
  28. Ma, C.; Wang, Q.; Tian, D.; Yuan, W.; Tang, X.; Deng, X.; Liu, Y.; Gao, C.; Fan, G.; Xiao, X.; et al. HS-SPME-GC-MS combined with relative odor activity value identify the key aroma components of flowery and fruity aroma in different types of GABA tea. Food Chem. X 2024, 24, 101965.
  29. Chen, K.; Gao, Y.; Zhang, Z.; Zhu, Y.; Cui, Z.; Lin, Y.; Yan, Y.; Shen, J.; Chen, W.; Zhu, J.; et al. Intelligent non-destructive biosensing of fresh tea leaves by portable spectroscopy and enhanced transformer. Food Control 2025, 178, 111527.
  30. Zan, J.; Li, H.; Cai, L.; Wu, C.; Fan, Z.; Sun, T. Detection of tea seed oil adulteration based on near-infrared and Raman spectra information fusion. LWT 2024, 213, 117064.
  31. Liu, L.; Zareef, M.; Wang, Z.; Li, H.; Chen, Q.; Ouyang, Q. Monitoring chlorophyll changes during Tencha processing using portable near-infrared spectroscopy. Food Chem. 2023, 412, 135505.
  32. Zhou, X.; Wu, X.; Wu, B. Comparative study of indirect and direct feature extraction algorithms in classifying tea varieties using near-infrared spectroscopy. Curr. Res. Food Sci. 2025, 10, 101065.
  33. Song, Y.; Yi, W.; Liu, Y.; Zhang, C.; Wang, Y.; Ning, J. A robust deep learning model for predicting green tea moisture content during fixation using near-infrared spectroscopy: Integration of multi-scale feature fusion and attention mechanisms. Food Res. Int. 2025, 203, 115874.
  34. Deng, X.; Liu, Z.; Zhan, Y.; Ni, K.; Zhang, Y.; Ma, W.; Shao, S.; Lv, X.; Yuan, Y.; Rogers, K.M. Predictive geographical authentication of green tea with protected designation of origin using a random forest model. Food Control 2020, 107, 106807.
  35. Li, M.; Jin, X.; Ma, W.; Du, S.; Wang, Y.; Ji, Z.; Qi, H.; Zhao, X. A Gastrodia elata green tea pulsed light sterilization model based on cost-benefit and GA-SVR algorithm. Food Control 2025, 177, 111445.
Figure 1. Research methodology flowchart (p value: ns p > 0.05; * p < 0.05; *** p < 0.001).
Figure 2. Content of anthocyanin components in Zijuan tea fresh leaves of varying tenderness levels: (a) six key anthocyanin components; (b) ten minor anthocyanin components (p value: ns p > 0.05; * p < 0.05; ** p < 0.01; *** p < 0.001).
Figure 3. Violin plots of color and texture features in Zijuan tea fresh leaves across tenderness levels: (a) color features and correlation test; (b) texture features and correlation test (p value: ns p > 0.05; * p < 0.05; ** p < 0.01; *** p < 0.001).
Figure 4. PCA of Zijuan tea fresh leaves across tenderness levels. (a) NIR spectra PCA clustering; (b) color–texture feature PCA; (c) key anthocyanin component PCA; (d) minor anthocyanin component PCA.
Figure 5. NIR spectra of Zijuan tea fresh leaves: (a) raw spectra; (b) S-G preprocessed spectra; (c) SNV preprocessed spectra; (d) Medfilt preprocessed spectra; (e) Normaliz preprocessed spectra; (f) group mean spectra across tenderness levels.
Figure 6. Confusion matrix diagrams for the optimal Zijuan tea tenderness discrimination model. (a) Training set of TEX + Medfilt + RF; (b) prediction set of TEX + Medfilt + RF; (c) training set of NIR + Medfilt + CNN; (d) prediction set of NIR + Medfilt + CNN.
Figure 7. Waveband selection plot for NIR using SPA.
Figure 8. Scatter plots of actual versus predicted values for total anthocyanins and key anthocyanin components: (a) delphinidin-3-O-galactoside by TEX + Medfilt + SVR; (b) delphinidin-3-O-glucoside by TEX + S-G + SVR; (c) cyanidin-3,5-O-diglucoside by TEX + S-G + SVR; (d) cyanidin-3-O-glucoside by NIR + Medfilt + SVR; (e) petunidin by NIR + Medfilt + SPA + SVR; (f) cyanidin by NIR + Medfilt + SVR; (g) total anthocyanins by TEX + S-G + SVR.
Table 1. The application of nondestructive testing technology in tea grading and quality evaluation.

| Tea | Research Target | Technology | Machine Learning Algorithm | Reference |
|---|---|---|---|---|
| Vine tea | Chemical composition (myricetin, dihydromyricetin, quercetin, kaempferol, and quercitrin) | LC-MS, GC-MS | PCA, RF | [12] |
| Dark tea | Classification of black tea at different altitudes | HPLC, DAD, ELSD | HPLC, HPLC-DAD, HPLC-ELSD | [10] |
| Keemun black tea | Classification of black tea, quantitative prediction of the concentrations of chemical components (GA, CAFF, EGC, C, EGCG, EC, GCG, total catechins) | micro-NIR, CV, CAS | SVM, LS-SVM, ELM, PLS-DA | [13] |
| Oolong tea | Variety classification | Gustative sensor system, CSA | PCA, LDA, CA, ANN | [14] |
| Black tea | Grade evaluation | Hyperspectral | PLS-DA, SVM, PNN | [15] |
| White tea | Grade evaluation, chemical composition (catechins, tea polyphenols, and free amino acids), sensory evaluation | Color texture feature, hyperspectral | SVM, KNN | [9] |

Table 2. Abbreviations of different data fusion methods and corresponding instructions.

| Abbreviation | Instructions |
|---|---|
| NIR | Spectral data from 900 to 1700 nm |
| TEX | Nine color feature factors and five texture feature factors |
| NIR + TEX | The fusion data of NIR and TEX |

Table 3. Distribution of total anthocyanin content across tenderness levels.

| Level | Range (mg/g) | Mean | STD | CV |
|---|---|---|---|---|
| Level 1 | 17.498–18.303 | 17.931 | 0.211 | 0.012 |
| Level 2 | 18.215–18.998 | 18.668 | 0.241 | 0.013 |
| Level 3 | 19.477–20.120 | 19.783 | 0.180 | 0.009 |
| Level 4 | 19.665–20.325 | 19.963 | 0.217 | 0.011 |
| Level 5 | 14.965–15.843 | 15.377 | 0.262 | 0.017 |

Table 4. Performance metrics of the tenderness level prediction model for Zijuan tea fresh leaves implementing PSO-SVM algorithm.

| Model | Data Fusion Method | Preprocessing Method | PCs | Calibration Set Result | Calibration Set CCR | Prediction Set Result | Prediction Set CCR |
|---|---|---|---|---|---|---|---|
| PSO-SVM | NIR | Raw | 12 | 74/80 | 92.50% | 16/20 | 80.00% |
| | | S-G | 11 | 74/80 | 92.50% | 16/20 | 80.00% |
| | | SNV | 10 | 65/80 | 81.50% | 15/20 | 75.00% |
| | | Medfilt | 16 | 77/80 | 96.50% | 18/20 | 90.00% |
| | | Normaliz | 11 | 71/80 | 88.75% | 14/20 | 70.00% |
| | TEX | Raw | 10 | 80/80 | 100.00% | 9/20 | 45.00% |
| | | S-G | 10 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | SNV | 8 | 58/80 | 72.50% | 8/20 | 40.00% |
| | | Medfilt | 10 | 80/80 | 100.00% | 18/20 | 90.00% |
| | | Normaliz | 18 | 62/80 | 77.50% | 8/20 | 40.00% |
| | NIR + TEX | Raw | 18 | 20/80 | 25.00% | 0/20 | 0.00% |
| | | S-G | 19 | 78/80 | 97.50% | 13/20 | 65.00% |
| | | SNV | 21 | 66/80 | 82.50% | 12/20 | 60.00% |
| | | Medfilt | 9 | 80/80 | 100.00% | 17/20 | 85.00% |
| | | Normaliz | 19 | 62/80 | 77.50% | 10/20 | 50.00% |

Table 5. Performance metrics of the tenderness level prediction model for Zijuan tea fresh leaves implementing RF algorithm.

| Model | Data Fusion Method | Preprocessing Method | PCs | Calibration Set Result | Calibration Set CCR | Prediction Set Result | Prediction Set CCR |
|---|---|---|---|---|---|---|---|
| RF | NIR | Raw | 12 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | S-G | 10 | 80/80 | 100.00% | 18/20 | 90.00% |
| | | SNV | 11 | 80/80 | 100.00% | 19/20 | 95.00% |
| | | Medfilt | 11 | 80/80 | 100.00% | 19/20 | 95.00% |
| | | Normaliz | 11 | 80/80 | 100.00% | 16/20 | 80.00% |
| | TEX | Raw | 10 | 80/80 | 100.00% | 17/20 | 85.00% |
| | | S-G | 12 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | SNV | 8 | 80/80 | 100.00% | 12/20 | 60.00% |
| | | Medfilt | 10 | 80/80 | 100.00% | 20/20 | 100.00% |
| | | Normaliz | 8 | 80/80 | 100.00% | 11/20 | 55.00% |
| | NIR + TEX | Raw | 19 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | S-G | 15 | 80/80 | 100.00% | 14/20 | 70.00% |
| | | SNV | 21 | 80/80 | 100.00% | 16/20 | 80.00% |
| | | Medfilt | 9 | 80/80 | 100.00% | 17/20 | 85.00% |
| | | Normaliz | 19 | 80/80 | 100.00% | 13/20 | 65.00% |

Table 6. Performance metrics of the tenderness level prediction model for Zijuan tea fresh leaves implementing CNN algorithm.

| Model | Data Fusion Method | Preprocessing Method | PCs | Calibration Set Result | Calibration Set CCR | Prediction Set Result | Prediction Set CCR |
|---|---|---|---|---|---|---|---|
| CNN | NIR | Raw | 12 | 80/80 | 100.00% | 17/20 | 85.00% |
| | | S-G | 10 | 80/80 | 100.00% | 19/20 | 95.00% |
| | | SNV | 14 | 80/80 | 100.00% | 19/20 | 95.00% |
| | | Medfilt | 11 | 80/80 | 100.00% | 20/20 | 100.00% |
| | | Normaliz | 11 | 80/80 | 100.00% | 13/20 | 65.00% |
| | TEX | Raw | 10 | 79/80 | 98.75% | 12/20 | 60.00% |
| | | S-G | 12 | 80/80 | 100.00% | 9/20 | 45.00% |
| | | SNV | 8 | 80/80 | 100.00% | 9/20 | 45.00% |
| | | Medfilt | 10 | 78/80 | 97.50% | 17/20 | 85.00% |
| | | Normaliz | 8 | 80/80 | 100.00% | 10/20 | 50.00% |
| | NIR + TEX | Raw | 21 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | S-G | 15 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | SNV | 21 | 80/80 | 100.00% | 15/20 | 75.00% |
| | | Medfilt | 25 | 80/80 | 100.00% | 19/20 | 95.00% |
| | | Normaliz | 19 | 80/80 | 100.00% | 13/20 | 65.00% |

Table 7. Model performance metrics for delphinidin-3-O-galactoside under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Delphinidin-3-O-galactoside | NIR | Raw | 16 | 0.99 | 0.01 | 0.77 | 0.10 | 1.441 |
| | | S-G | 17 | 0.97 | 0.07 | 0.80 | 0.10 | 1.528 |
| | | SNV | 15 | 0.97 | 0.07 | 0.79 | 0.12 | 1.540 |
| | | Medfilt | 10 | 0.99 | 0.03 | 0.95 | 0.08 | 2.776 |
| | | Normaliz | 16 | 0.91 | 0.11 | 0.85 | 0.12 | 1.804 |
| | TEX | Raw | 5 | 0.94 | 0.09 | 0.90 | 0.12 | 2.214 |
| | | S-G | 5 | 0.97 | 0.06 | 0.97 | 0.08 | 4.072 |
| | | SNV | 8 | 0.99 | 0.04 | 0.90 | 0.10 | 2.244 |
| | | Medfilt | 5 | 0.98 | 0.06 | 0.98 | 0.06 | 4.687 |
| | | Normaliz | 6 | 0.97 | 0.07 | 0.92 | 0.09 | 2.588 |
| | NIR + TEX | Raw | 24 | 0.98 | 0.06 | 0.91 | 0.13 | 2.032 |
| | | S-G | 16 | 0.98 | 0.06 | 0.90 | 0.12 | 2.287 |
| | | SNV | 21 | 0.97 | 0.07 | 0.89 | 0.12 | 2.145 |
| | | Medfilt | 7 | 0.97 | 0.06 | 0.96 | 0.08 | 3.368 |
| | | Normaliz | 26 | 0.99 | 0.01 | 0.89 | 0.12 | 1.957 |

Table 8. Model performance metrics for delphinidin-3-O-glucoside under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Delphinidin-3-O-glucoside | NIR | Raw | 16 | 0.99 | 0.02 | 0.51 | 0.10 | 1.111 |
| | | S-G | 13 | 0.99 | 0.02 | 0.51 | 0.10 | 1.111 |
| | | SNV | 13 | 0.96 | 0.04 | 0.89 | 0.05 | 2.158 |
| | | Medfilt | 9 | 0.99 | 0.01 | 0.88 | 0.06 | 2.082 |
| | | Normaliz | 13 | 0.98 | 0.03 | 0.71 | 0.09 | 1.421 |
| | TEX | Raw | 5 | 0.87 | 0.07 | 0.82 | 0.10 | 1.525 |
| | | S-G | 9 | 0.94 | 0.05 | 0.96 | 0.05 | 3.718 |
| | | SNV | 8 | 0.96 | 0.04 | 0.88 | 0.06 | 2.109 |
| | | Medfilt | 7 | 0.97 | 0.03 | 0.95 | 0.05 | 2.992 |
| | | Normaliz | 6 | 0.93 | 0.06 | 0.86 | 0.07 | 1.973 |
| | NIR + TEX | Raw | 18 | 0.99 | 0.02 | 0.82 | 0.10 | 1.545 |
| | | S-G | 15 | 0.98 | 0.03 | 0.87 | 0.08 | 1.948 |
| | | SNV | 17 | 0.94 | 0.05 | 0.78 | 0.09 | 1.570 |
| | | Medfilt | 8 | 0.98 | 0.03 | 0.93 | 0.06 | 2.616 |
| | | Normaliz | 17 | 0.98 | 0.03 | 0.82 | 0.09 | 1.636 |

Table 9. Model performance metrics for cyanidin-3,5-O-diglucoside under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Cyanidin-3,5-O-diglucoside | NIR | Raw | 9 | 0.97 | 0.05 | 0.75 | 0.08 | 1.328 |
| | | S-G | 9 | 0.97 | 0.05 | 0.77 | 0.07 | 1.471 |
| | | SNV | 7 | 0.94 | 0.07 | 0.80 | 0.08 | 1.680 |
| | | Medfilt | 6 | 0.97 | 0.05 | 0.91 | 0.07 | 2.410 |
| | | Normaliz | 7 | 0.96 | 0.05 | 0.85 | 0.08 | 1.823 |
| | TEX | Raw | 7 | 0.97 | 0.04 | 0.91 | 0.09 | 2.294 |
| | | S-G | 10 | 0.96 | 0.05 | 0.94 | 0.07 | 2.755 |
| | | SNV | 7 | 0.90 | 0.08 | 0.84 | 0.10 | 1.628 |
| | | Medfilt | 7 | 0.89 | 0.08 | 0.87 | 0.09 | 2.026 |
| | | Normaliz | 7 | 0.92 | 0.08 | 0.84 | 0.10 | 1.657 |
| | NIR + TEX | Raw | 17 | 0.95 | 0.06 | 0.92 | 0.08 | 2.510 |
| | | S-G | 14 | 0.99 | 0.03 | 0.92 | 0.09 | 2.439 |
| | | SNV | 15 | 0.93 | 0.07 | 0.91 | 0.07 | 2.442 |
| | | Medfilt | 7 | 0.92 | 0.07 | 0.93 | 0.07 | 2.736 |
| | | Normaliz | 16 | 0.97 | 0.05 | 0.90 | 0.07 | 2.242 |

Table 10. Model performance metrics for cyanidin-3-O-glucoside under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Cyanidin-3-O-glucoside | NIR | Raw | 6 | 0.94 | 0.11 | 0.91 | 0.10 | 2.396 |
| | | S-G | 11 | 0.95 | 0.10 | 0.88 | 0.11 | 2.157 |
| | | SNV | 6 | 0.92 | 0.13 | 0.91 | 0.10 | 2.214 |
| | | Medfilt | 4 | 0.95 | 0.10 | 0.95 | 0.08 | 3.267 |
| | | Normaliz | 11 | 0.95 | 0.11 | 0.94 | 0.09 | 2.969 |
| | TEX | Raw | 10 | 0.91 | 0.12 | 0.85 | 0.16 | 1.877 |
| | | S-G | 10 | 0.97 | 0.07 | 0.94 | 0.09 | 2.866 |
| | | SNV | 8 | 0.95 | 0.09 | 0.85 | 0.15 | 1.739 |
| | | Medfilt | 10 | 0.94 | 0.11 | 0.90 | 0.12 | 2.319 |
| | | Normaliz | 7 | 0.97 | 0.07 | 0.84 | 0.16 | 1.795 |
| | NIR + TEX | Raw | 16 | 0.96 | 0.08 | 0.93 | 0.12 | 2.405 |
| | | S-G | 14 | 0.99 | 0.02 | 0.94 | 0.11 | 2.865 |
| | | SNV | 23 | 0.98 | 0.07 | 0.89 | 0.13 | 2.120 |
| | | Medfilt | 9 | 0.98 | 0.06 | 0.94 | 0.11 | 2.815 |
| | | Normaliz | 24 | 0.99 | 0.02 | 0.90 | 0.12 | 2.227 |

Table 11. Model performance metrics for petunidin under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Petunidin | NIR | Raw | 9 | 0.97 | 0.25 | 0.76 | 0.54 | 1.525 |
| | | S-G | 10 | 0.99 | 0.17 | 0.81 | 0.50 | 1.648 |
| | | SNV | 8 | 0.98 | 0.18 | 0.87 | 0.45 | 2.040 |
| | | Medfilt | 8 | 0.97 | 0.23 | 0.91 | 0.42 | 2.465 |
| | | Normaliz | 9 | 0.98 | 0.21 | 0.86 | 0.48 | 2.000 |
| | TEX | Raw | 6 | 0.83 | 0.56 | 0.51 | 0.89 | 1.082 |
| | | S-G | 10 | 0.98 | 0.20 | 0.89 | 0.53 | 2.023 |
| | | SNV | 7 | 0.81 | 0.62 | 0.56 | 0.81 | 1.090 |
| | | Medfilt | 10 | 0.99 | 0.04 | 0.84 | 0.62 | 1.706 |
| | | Normaliz | 8 | 0.91 | 0.43 | 0.55 | 0.83 | 1.190 |
| | NIR + TEX | Raw | 16 | 0.95 | 0.31 | 0.74 | 0.64 | 1.385 |
| | | S-G | 21 | 0.99 | 0.16 | 0.79 | 0.65 | 1.528 |
| | | SNV | 15 | 0.96 | 0.31 | 0.77 | 0.61 | 1.545 |
| | | Medfilt | 12 | 0.92 | 0.39 | 0.75 | 0.69 | 1.495 |
| | | Normaliz | 14 | 0.92 | 0.41 | 0.75 | 0.65 | 1.387 |

Table 12. Model performance metrics for cyanidin under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Cyanidin | NIR | Raw | 7 | 0.75 | 0.11 | 0.61 | 0.08 | 1.246 |
| | | S-G | 12 | 0.96 | 0.05 | 0.68 | 0.07 | 1.320 |
| | | SNV | 8 | 0.81 | 0.10 | 0.70 | 0.09 | 1.359 |
| | | Medfilt | 4 | 0.93 | 0.05 | 0.90 | 0.07 | 2.304 |
| | | Normaliz | 13 | 0.78 | 0.10 | 0.41 | 0.12 | 1.103 |
| | TEX | Raw | 5 | 0.69 | 0.11 | 0.40 | 0.15 | 1.055 |
| | | S-G | 13 | 0.94 | 0.05 | 0.50 | 0.15 | 1.149 |
| | | SNV | 6 | 0.74 | 0.11 | 0.68 | 0.10 | 1.333 |
| | | Medfilt | 10 | 0.70 | 0.11 | 0.64 | 0.12 | 1.168 |
| | | Normaliz | 7 | 0.83 | 0.09 | 0.88 | 0.07 | 2.012 |
| | NIR + TEX | Raw | 7 | 0.80 | 0.09 | 0.41 | 0.14 | 0.987 |
| | | S-G | 8 | 0.99 | 0.02 | 0.73 | 0.10 | 1.488 |
| | | SNV | 8 | 0.81 | 0.09 | 0.62 | 0.11 | 1.285 |
| | | Medfilt | 7 | 0.85 | 0.08 | 0.66 | 0.11 | 1.272 |
| | | Normaliz | 8 | 0.99 | 0.01 | 0.87 | 0.08 | 1.857 |

Table 13. Model performance metrics for total anthocyanins under triple-data-fusion strategies and fourfold preprocessing methods.

| Component | Data Fusion Method | Preprocessing Method | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Total anthocyanins | NIR | Raw | 10 | 0.97 | 0.34 | 0.65 | 0.74 | 1.245 |
| | | S-G | 10 | 0.99 | 0.22 | 0.72 | 0.64 | 1.434 |
| | | SNV | 8 | 0.94 | 0.50 | 0.87 | 0.53 | 1.980 |
| | | Medfilt | 6 | 0.98 | 0.26 | 0.95 | 0.43 | 2.979 |
| | | Normaliz | 10 | 0.99 | 0.22 | 0.85 | 0.60 | 1.860 |
| | TEX | Raw | 5 | 0.91 | 0.55 | 0.82 | 0.84 | 1.694 |
| | | S-G | 10 | 0.97 | 0.28 | 0.96 | 0.49 | 3.280 |
| | | SNV | 6 | 0.88 | 0.68 | 0.76 | 0.81 | 1.505 |
| | | Medfilt | 5 | 0.96 | 0.37 | 0.94 | 0.51 | 2.749 |
| | | Normaliz | 6 | 0.91 | 0.60 | 0.82 | 0.69 | 1.753 |
| | NIR + TEX | Raw | 23 | 0.99 | 0.08 | 0.91 | 0.76 | 1.818 |
| | | S-G | 20 | 0.99 | 0.04 | 0.85 | 0.89 | 1.639 |
| | | SNV | 24 | 0.99 | 0.12 | 0.89 | 0.66 | 1.945 |
| | | Medfilt | 11 | 0.94 | 0.46 | 0.86 | 0.75 | 1.949 |
| | | Normaliz | 24 | 0.99 | 0.22 | 0.87 | 0.67 | 1.769 |

Table 14. Model performance metrics for petunidin waveband selection.

| Component | Optimal Combination | Method | Variable Number | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|---|
| Petunidin | NIR + Medfilt | Raw | 751 | 8 | 0.97 | 0.23 | 0.91 | 0.42 | 2.465 |
| | | CARS | 52 | 11 | 0.99 | 0.04 | 0.93 | 0.40 | 2.584 |
| | | BOSS | 20 | 8 | 0.99 | 0.03 | 0.89 | 0.49 | 1.995 |
| | | SPA | 37 | 5 | 0.97 | 0.25 | 0.95 | 0.34 | 2.888 |

Table 15. Performance metrics for linear versus nonlinear predictive models of total anthocyanin content and key components.

| Component | Optimal Combination | Model | PCs | Rc | RMSEC (mg/g) | Rp | RMSEP (mg/g) | RPD |
|---|---|---|---|---|---|---|---|---|
| Delphinidin-3-O-galactoside | TEX + Medfilt | PLSR | 5 | 0.89 | 0.12 | 0.95 | 0.08 | 3.135 |
| | | SVR | 5 | 0.98 | 0.06 | 0.98 | 0.06 | 4.687 |
| | | RF | 5 | 0.99 | 0.06 | 0.97 | 0.09 | 3.726 |
| Delphinidin-3-O-glucoside | TEX + S-G | PLSR | 9 | 0.92 | 0.05 | 0.96 | 0.05 | 3.567 |
| | | SVR | 9 | 0.94 | 0.05 | 0.96 | 0.05 | 3.718 |
| | | RF | 9 | 0.98 | 0.04 | 0.96 | 0.07 | 2.901 |
| Cyanidin-3,5-O-diglucoside | TEX + S-G | PLSR | 10 | 0.83 | 0.09 | 0.94 | 0.07 | 2.850 |
| | | SVR | 10 | 0.96 | 0.05 | 0.94 | 0.07 | 2.755 |
| | | RF | 10 | 0.98 | 0.05 | 0.94 | 0.11 | 2.438 |
| Cyanidin-3-O-glucoside | NIR + Medfilt | PLSR | 4 | 0.93 | 0.11 | 0.93 | 0.10 | 2.629 |
| | | SVR | 4 | 0.95 | 0.10 | 0.95 | 0.08 | 3.267 |
| | | RF | 4 | 0.98 | 0.07 | 0.93 | 0.11 | 3.532 |
| Petunidin | NIR + Medfilt + SPA | PLSR | 5 | 0.47 | 0.90 | 0.72 | 0.71 | 0.958 |
| | | SVR | 5 | 0.97 | 0.25 | 0.95 | 0.34 | 2.888 |
| | | RF | 5 | 0.98 | 0.30 | 0.92 | 0.44 | 2.573 |
| Cyanidin | NIR + Medfilt | PLSR | 4 | 0.15 | 0.15 | 0.55 | 0.13 | 0.334 |
| | | SVR | 4 | 0.93 | 0.05 | 0.90 | 0.07 | 2.304 |
| | | RF | 4 | 0.96 | 0.05 | 0.65 | 0.11 | 1.893 |
| Total anthocyanins | TEX + S-G | PLSR | 10 | 0.89 | 0.57 | 0.94 | 0.55 | 2.732 |
| | | SVR | 10 | 0.97 | 0.28 | 0.96 | 0.49 | 3.280 |
| | | RF | 10 | 0.99 | 0.37 | 0.95 | 0.85 | 2.455 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, S.; Dai, F.; Guo, M.; Dong, C. Assessment of Tenderness and Anthocyanin Content in Zijuan Tea Fresh Leaves Using Near-Infrared Spectroscopy Fused with Visual Features. Foods 2025, 14, 2938. https://doi.org/10.3390/foods14172938
